Training provider: Cambridge Digital Humanities

Theme: The Library as Data

5 matching courses


The Library as Data: Digital Text Markup and TEI new Wed 23 Oct 2019   11:00 Finished

Text encoding, or the addition of semantic meaning to text, is a core activity in digital humanities, covering everything from linguistic analysis of novels to quantitative research on manuscript collections. In this session we will take a look at the fundamentals of text encoding – why we might want to do it, and why we need to think carefully about our approaches. We will also introduce the TEI (Text Encoding Initiative), the most commonly used standard for markup in the digital humanities, and look at some common research applications through examples.

Correspondence collections are a unique window into the social networks of prominent historical figures. With the digitisation and encoding of personal letters, researchers have at their disposal a wealth of relational data, which can be studied using social network analysis.

This session will introduce and demonstrate foundational concepts, methods and tools in social network analysis using datasets prepared from the Darwin Correspondence collection. Topics covered will include:

  • Explanation of the encoding procedures and rationale following the Text Encoding Initiative guidelines
  • Preparation and transformation of .xml files for analysis with an open source data wrangler
  • Rendering of network visualisations using an open source SNA tool

No prior knowledge of programming is required. Instructions on software to install will be sent out before the session.
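As a rough illustration of the workflow in the topics above (TEI-encoded letters transformed into relational data for network analysis), here is a minimal Python sketch. It is not the session's method: the session uses an open source data wrangler and SNA tool, whereas this sketch swaps in the Python standard library. It assumes each letter records its sender and recipient in a TEI `correspDesc` element; the sample letters, the names "Person A" to "Person C", and the helper functions are hypothetical and not taken from the Darwin Correspondence data.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def make_sample_letter(sender, receiver):
    """Build a tiny, hypothetical TEI letter header (illustration only)."""
    return (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><profileDesc>'
        "<correspDesc>"
        f'<correspAction type="sent"><persName>{sender}</persName></correspAction>'
        f'<correspAction type="received"><persName>{receiver}</persName></correspAction>'
        "</correspDesc>"
        "</profileDesc></teiHeader></TEI>"
    )


# Hand-written sample data; not drawn from any real collection.
SAMPLE_LETTERS = [
    make_sample_letter("Person A", "Person B"),
    make_sample_letter("Person A", "Person B"),
    make_sample_letter("Person B", "Person C"),
]


def correspondents(tei_xml):
    """Return (sender, receiver) for one letter; either may be None."""
    root = ET.fromstring(tei_xml)

    def name(action_type):
        path = f".//tei:correspAction[@type='{action_type}']/tei:persName"
        element = root.find(path, TEI_NS)
        if element is None or not element.text:
            return None
        return element.text.strip()

    return name("sent"), name("received")


def edge_list(letters):
    """Count letters per (sender, receiver) pair, skipping incomplete ones."""
    edges = Counter()
    for letter in letters:
        sender, receiver = correspondents(letter)
        if sender and receiver:
            edges[(sender, receiver)] += 1
    return edges


if __name__ == "__main__":
    # Print a weighted edge list in a simple CSV layout.
    print("Source,Target,Weight")
    for (sender, receiver), weight in edge_list(SAMPLE_LETTERS).most_common():
        print(f"{sender},{receiver},{weight}")
```

A weighted edge list of this kind (one row per sender and recipient pair) is a common input format for network visualisation tools.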

The Library as Data: An overview new Wed 16 Oct 2019   11:00 Finished

Is the "digital library" more than a virtual rendering of the bookshelf or filing cabinet? Does the transformation of books into bytes and manuscripts into pixels change the way we create and share knowledge? This session introduces a conceptual toolkit for understanding the library collection in the digital age, and provides a guide to key methods for accessing, transforming and analysing the contents as data. Using the rich collections of Cambridge University Library as a starting point, we will explore:

  • Relations between digital and material texts and artefacts
  • Definitions of data and metadata
  • Methods for accessing data in bulk from digital collections
  • Understanding file formats and standards

The session will also provide an overview of the content in the rest of the term’s Library as Data programme, and introduce our annual call for applications to the Machine Reading the Archive Projects mentoring scheme.

We are currently reformatting our Learning programme for remote teaching. This will require some rescheduling, so bookings will reopen and new sessions will be created for online courses as soon as possible. In the interim, we encourage you to register your interest so that you are notified of the new schedule. We hope to run many of our courses online, but this depends on staff availability and resources, so we may have to postpone or cancel some sessions.

This session focusses on providing photography skills for those undertaking archival research. Dr Oliver Dunn has experience spanning more than 10 years digitising written and printed historical sources for major university research projects in the humanities and social sciences. The focus is very much on low-tech approaches and small budgets. We’ll consider best uses of smartphones, digital cameras and tripods.

Recent advances in machine learning are allowing computer vision and humanities researchers to develop new tools and methods for exploring digital image collections. Neural network models are now able to match, differentiate and classify images at scale in ways which would have been impossible a few years ago. This session introduces the IIIF image data framework, which has been developed by a consortium of the world’s leading research libraries and image repositories, and demonstrates a range of different machine learning-based methods for exploring digital image collections. We will also discuss some of the ethical challenges of applying computer vision algorithms to cultural and historical image collections. Topics covered will include:

  • Unlocking image collections with the IIIF image data framework
  • Machine Learning: a very short introduction
  • Working with images at scale: ethical and methodological challenges
  • Applying computer vision methods to digital collections
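To make the first topic more concrete, the sketch below shows how an image request can be assembled under the IIIF Image API, whose URLs follow the pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. This is an illustrative sketch, not material from the session: the server address `https://example.org/iiif`, the identifier `MS-0001/page 1` and the function name `iiif_image_url` are all hypothetical.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL from its path segments.

    The identifier is percent-encoded so that characters such as "/"
    inside it are not mistaken for path separators.
    """
    encoded_id = quote(identifier, safe="")
    return (
        f"{base.rstrip('/')}/{encoded_id}/"
        f"{region}/{size}/{rotation}/{quality}.{fmt}"
    )


if __name__ == "__main__":
    # Hypothetical server and identifier, used only for illustration.
    base = "https://example.org/iiif"
    image_id = "MS-0001/page 1"

    # The whole image at the largest size the server offers.
    print(iiif_image_url(base, image_id))

    # A thumbnail scaled to fit within 200 x 200 pixels.
    print(iiif_image_url(base, image_id, size="!200,200"))
```

Because every image is addressed by the same URL pattern, requests for thumbnails or cropped regions can be generated programmatically across a whole collection, which is one reason the framework suits work with images at scale.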