This session provides a brief introduction to different methods for capturing bulk data from online sources or via agreement with the holders of data collections, including Application Programming Interfaces (APIs). We will address issues of data provenance, exceptions to copyright for text and data mining, and discuss good practice in managing and working with data that others have created.
- Data collection methods
- Introduction to working with APIs
- Data brokerage
- Provenance and integrity
- Assessing intellectual property, copyright and Data Protection issues
- Documentation of collection methods
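Documenting how and when data was captured can be built into the capture step itself. The sketch below, in Python, shows one minimal way to bundle records retrieved from an API with provenance metadata; the endpoint URL, the records, and the `wrap_with_provenance` helper are all illustrative assumptions, not part of any real service.

```python
import json
from datetime import datetime, timezone

def wrap_with_provenance(records, source_url, method="API"):
    """Bundle captured records with provenance metadata so the
    source and collection method are documented alongside the data."""
    return {
        "source": source_url,            # where the data came from
        "method": method,                # how it was captured
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "record_count": len(records),
        "records": records,
    }

# Pretend these came back from a (hypothetical) API endpoint
captured = [{"id": 1, "title": "Letter, 1854"},
            {"id": 2, "title": "Diary, 1855"}]
dataset = wrap_with_provenance(captured, "https://api.example.org/items")
print(json.dumps(dataset, indent=2))
```

Storing this wrapper as a sidecar file next to the raw data keeps the documentation of collection methods from drifting away from the dataset it describes.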
The impact of well-crafted data visualisations has been well documented historically. Florence Nightingale famously used charts to make her case for hospital hygiene in the Crimean War, while Dr John Snow’s bar charts of cholera deaths in London helped convince the authorities of the water-borne nature of the disease. However, as information designer Alberto Cairo notes, charts can also lie. This introductory Basics session presents the principles of data visualisation for researchers who are new to working with quantitative data.
- Principles and good practice in data visualisation
- Basic introduction to quantitative methods of data analysis
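One principle the session touches on, that charts can lie, often comes down to scaling: a bar chart whose bars are not drawn from a common zero baseline exaggerates differences. The text-based sketch below (a toy illustration, not part of any charting library; the weekly death counts are invented, not Snow's data) scales every bar against the largest value so relative magnitudes stay honest.

```python
def ascii_bar_chart(data, width=40):
    """Render counts as a text bar chart. Bars are scaled against the
    largest value from a zero baseline, so relative sizes are honest."""
    peak = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / peak)
        lines.append(f"{label:>12} | {bar} {value}")
    return "\n".join(lines)

# Cholera deaths per week (illustrative numbers only)
deaths = {"Week 1": 5, "Week 2": 21, "Week 3": 43, "Week 4": 12}
print(ascii_bar_chart(deaths))
```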
This CDH Basics session explores the lifecycle of a digital research project across the stages of design, data capture, transformation and analysis, presentation, and preservation. It introduces tactics for embedding ethical research principles and practices at each stage of the research process.
- Introduction to the digital project life cycle
- Ethics by design and EDI-informed data processing
- Data and metadata - definitions
- Basics of data curation (good practice in file naming, version control)
- Understanding files and folders
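Good practice in file naming, one of the curation basics listed above, can be captured in a small convention: ISO date first (so files sort chronologically), lowercase with underscores instead of spaces, and a zero-padded version number. The helper below is a sketch of one such convention, not a prescribed standard; the project and description names are invented.

```python
from datetime import date

def versioned_filename(project, description, version, ext="csv"):
    """Build a file name following a common curation convention:
    ISO date first, lowercase, underscores, zero-padded version."""
    stamp = date.today().isoformat()           # e.g. 2024-05-01
    safe = "_".join(description.lower().split())  # no spaces, no case drift
    return f"{stamp}_{project}_{safe}_v{version:02d}.{ext}"

# e.g. "2024-05-01_corpus_cleaned_letters_v03.csv"
print(versioned_filename("corpus", "Cleaned Letters", 3))
```

For anything beyond simple numbered versions, a version-control system such as Git is the more robust answer, but a consistent naming scheme still helps anyone browsing the files and folders directly.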
Ensuring long-term access to digital data is often a difficult task: both hardware and code decay much more rapidly than many other means of information storage. Digital data created in the 1980s is frequently unreadable, whereas books and manuscripts written in the 980s are still legible. This session explores good practice in data preservation and software sustainability and looks at what you need to do to ensure that the data you don’t want to keep is destroyed.
- Data and code sustainability
- Retention, archiving and re-use
- Data destruction
- Recap on the project life-cycle
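A standard tool for the preservation work described above is the fixity check: store a checksum alongside each archived file, and re-compute it later to confirm the bits have not silently decayed. The sketch below uses Python's standard `hashlib`; the sample bytes are invented for illustration.

```python
import hashlib

def fixity_checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Compute a checksum to store alongside an archived file.
    Re-computing it later verifies the bytes are unchanged."""
    return hashlib.new(algorithm, data).hexdigest()

# At deposit time: record the checksum with the file's metadata
original = b"example archival payload"        # illustrative bytes
stored = fixity_checksum(original)

# At audit time: a matching checksum means no bit rot
assert fixity_checksum(original) == stored
```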
Data which you have captured rather than created yourself is likely to need cleaning up before you can use it effectively. This short session will introduce you to the basic principles of creating structured datasets and walk you through some case studies in data cleaning with OpenRefine, a powerful open source tool for working with messy data.
- Structuring your data
- Cleaning messy textual data with OpenRefine
- Batch processing file names
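One of OpenRefine's best-known cleaning features is "fingerprint" clustering, which groups values that differ only in case, accents, spacing, or word order. The sketch below reproduces the idea of that key in plain Python so the mechanism is visible; it is an approximation for illustration, not OpenRefine's exact implementation.

```python
import unicodedata

def fingerprint(value: str) -> str:
    """Approximate OpenRefine's 'fingerprint' clustering key:
    strip accents, lowercase, split on non-alphanumerics,
    then sort and de-duplicate the tokens."""
    value = unicodedata.normalize("NFKD", value)
    value = "".join(c for c in value if not unicodedata.combining(c))
    tokens = "".join(c if c.isalnum() else " " for c in value.lower()).split()
    return " ".join(sorted(set(tokens)))

# Three messy variants of the same name collapse to one key
messy = ["Café Central", "central cafe", "  Cafe   Central "]
print({fingerprint(v) for v in messy})  # a single-element set
```

Values that share a fingerprint are candidates for merging into one canonical form, which is exactly the workflow OpenRefine's clustering dialog walks you through on messy textual data.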