Practices, Standards, and Arrangements
This section enumerates and defines Livingstone Online's methodological practices, data production standards, and hosting and backup arrangements.
Data Production Standards
Hosting, Site Setup, and Backup Arrangements
Appendix 1: File Naming
Appendix 2: MODS Elements
Introduction
Livingstone Online is a data-driven project. We promote access to and use of our entire digital collection. We work in a transparent and accountable manner. We seek to ensure the long-term digital preservation of our core data. As a result, we rely on a set of well-defined practices, standards, and arrangements to realize these objectives.
Methodological Practices
Code of conduct. The Livingstone Online Code sets out ideals that guide our work and determine our day-to-day collaboration with individuals and institutions around the world.
Open access and use. By design, we have made nearly all of Livingstone Online freely accessible to the public at large. We encourage the broad non-commercial use of our materials and, whenever possible, try to secure Creative Commons licenses for the materials we publish in an effort to promote educational dissemination.
"Dr. Livingstone's Steam Launch Ma Robert," 1858. Copyright Wellcome Library, London. Creative Commons Attribution 4.0
User-friendly design. The Livingstone Online interface seeks to create a user-friendly experience. We have invested significant effort in developing a site that is easy to navigate, intuitive, and aesthetically enriching. We have also conducted extensive testing and have collaborated with a variety of end-users in an attempt to ensure that our site is as glitch- and bug-free as possible.
Quality control. Our image and transcription data and metadata undergo rigorous quality control. Each element of our core data passes through at least four stages of review, and often many more, prior to online publication. Based on our quality control practices, we conservatively estimate our data production error rate to be about 5%, although in most cases the error rate is probably much lower.
Documentation. Our project is the result of a sustained, decade-long collaboration among a variety of entities. We strive to make all of our work as open as possible, as befits a publicly funded project. In this spirit, we offer rich project histories that record every stage of project development via narrative text, images, and downloadable project documents. We encourage other interested parties to study and learn from our efforts as part of our mission to facilitate digital humanities knowledge transfer.
"The Livingstone Mid Africa Film Corporation, I Presume?" Punch, June 1933, p.323. Copyright National Library of Scotland. Creative Commons Share-alike 2.5 UK: Scotland
Data Production Standards
File naming. The base file name of each item in our digital collection consists of only a "liv" prefix plus a unique six-digit item number. The item number has no meaning, and we make it a practice not to put any bibliographic information into our file names. Rather, we keep all such information in our metadata files. See Appendix 1: File Naming for more detail on how we use file names.
Images. When possible, we request 8-bit TIFF images at 600 dpi created to the 6.0 TIFF specification from our collaborating institutions, and we provide the institutions with a clear set of imaging guidelines. However, we have collected our core image data from an array of institutions over a ten-year period (2004-present). As a result, variations in image capture methods and specifications have been inevitable, due to differing institutional protocols and ongoing methodological changes in fields such as library science, imaging science, and the digital humanities. Our usual practice is to crop images to one image per manuscript page, if images do not already conform to this format. When we crop images, we always retain an archival backup copy of the uncropped image.
Gary Li, Sharon Messenger (in reflection), and Caroline Overy outside the Royal Society for Arts Library, 2007. Copyright Livingstone Online (Sharon Messenger, photographer). Creative Commons Attribution-NonCommercial 3.0 Unported
Transcriptions. We have produced all our transcriptions in XML in conformance with the TEI P5 encoding guidelines. We have recorded our custom use of these guidelines in a special TEI document called an ODD (One Document Does-it-all). We use this ODD to generate both an HTML-based encoding manual to direct all our transcription efforts and an RNG schema to ensure that all our XML-based transcriptions are valid within the scope of our TEI customization.
Bonus: Download our complete TEI transcription files (497 files).
Double-Bonus: Download PDF reading copies of our TEI transcription files.
Triple-Bonus: Download our complete TEI transcription materials, which include our coding manual, transcription templates, ODD, and RNG schema.
Metadata. We build detailed metadata for each item in our digital collection in an XML file created according to Version 3 of the Metadata Object Description Schema (MODS). See Appendix 2: MODS Elements for more detail on how we use the MODS format.
Bonus: Download our complete MODS files, which include additional metadata not available elsewhere in our site.
Index of Culpeper's Complete Herbal (London 1815), by Richard Evans. Copyright Livingstone Online (Gary Li, photographer). May not be reproduced without the express written consent of the National Trust for Scotland, on behalf of the Scottish National Memorial to David Livingstone Trust (David Livingstone Centre).
Hosting, Site Setup, and Backup Arrangements
Site hosting. The University of Maryland Libraries host Livingstone Online in an Islandora framework. The Islandora framework combines a front-end Drupal content management system (version 7.34 for Livingstone Online) with a back-end Fedora digital asset management system. Thanks to this arrangement, Livingstone Online is integrated within the overall collection of the Libraries and so can be preserved and maintained as part of that larger collection.
Development access. An agreement between the University of Maryland Libraries and the Livingstone Online project team sets out the responsibilities of the Libraries in hosting the site and defines the basis on which the project team can access and develop all the site's content.
Site setup. Livingstone Online is built and deployed using a number of different tools. At the heart of our system, we use Docker to deploy code and dependencies, and Git to store and manage code and configuration. We share our code and configuration in a number of GitHub repositories. These repositories fall into three categories:
- Docker related - used to build Docker image(s) that are deployed to the server (prefixed with docker-);
- Code related - used by the site to implement functionality (prefixed with livingstone_online_); and
- Unrelated - not related to site setup (for instance, MODS files, TEI files, etc.).
Docker images are built automatically when changes are made to the repositories identified above. This is performed by the Docker Hub service. After Docker images are built they are automatically deployed by the Docker auto deploy application running on the site server. More information about the site setup is available here.
Site versions. Our project team uses three versions of the site for development: dev, stage, and production. Dev and stage provide iterations of the site where our programmers can experiment with design and test changes to code, while production provides the public-facing version of the site.
Data and site backup. We employ multiple strategies to back up Livingstone Online's data. The University of Maryland Libraries create nightly incremental backups of the site using Commvault data protection systems, and deleted files are retained for fourteen days after deletion. All Livingstone Online data is also duplicated to tape storage held in the UMD Libraries secure server room. Code, including TEI and MODS files, is versioned in GitHub. The site's Drupal database is backed up automatically to our production server on a daily basis using Drupal's Backup and Migrate module, while the whole Drupal files directory is also synced via a cron job to an external server. Finally, the project director maintains local backups of all core project data on a series of computers, external hard drives, and remote servers.
Appendix 1: File Naming
Each item in our collection receives only a simple base file name:
Images receive an additional four-digit segment that identifies the specific image in the item sequence:
MODS and TEI files have the relevant acronyms added to the base file name:
Consequently, the base file names enable the easy association and organization of all files related to an item, while additions to this name and/or file suffixes allow differentiation among the files:
liv_000017 – base file name
liv_000017_0001.jpg – JPEG image file
liv_000017_0001.tif – TIFF image file
liv_000017_0001.tif.md5 – MD5 data integrity verification file
liv_000017_0001.tif.txt – TXT Dublin Core and image capture and processing metadata file
liv_000017_0001.tif.xmp – XMP sidecar metadata file
liv_000017_MODS.xml – XML MODS metadata file
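The naming convention above lends itself to a mechanical check. The following Python sketch is one possible way to split a file name into its parts; the regular expression is inferred from the examples in this appendix rather than taken from any official project specification, and the names `FILE_NAME`, `ParsedName`, and `parse_file_name` are hypothetical. It also assumes that TEI files follow the same pattern as MODS files (a `_TEI` segment), which the list above does not show explicitly.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the examples in this appendix (an assumption,
# not an official project specification).
FILE_NAME = re.compile(
    r"^(?P<base>liv_(?P<item>\d{6}))"      # base file name, e.g. liv_000017
    r"(?:_(?P<segment>\d{4}|MODS|TEI))?"   # image sequence number or metadata acronym
    r"(?P<suffix>(?:\.[A-Za-z0-9]+)*)$"    # zero or more suffixes, e.g. .tif.md5
)


class ParsedName(NamedTuple):
    base: str               # shared by all files that belong to one item
    item_number: int        # the six-digit item number as an integer
    segment: Optional[str]  # "0001", "MODS", "TEI", or None for a bare base name
    suffix: str             # everything from the first dot onward, or ""


def parse_file_name(name: str) -> Optional[ParsedName]:
    """Split a file name into its parts, or return None if it does not match."""
    match = FILE_NAME.match(name)
    if match is None:
        return None
    return ParsedName(
        base=match["base"],
        item_number=int(match["item"]),
        segment=match["segment"],
        suffix=match["suffix"],
    )


if __name__ == "__main__":
    for example in ("liv_000017_0001.tif.md5", "liv_000017_MODS.xml", "notes.txt"):
        print(example, "->", parse_file_name(example))
```

Because every related file shares the same `base` value, grouping a directory listing by that field is enough to gather the images, checksums, and metadata files for a single item.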
Appendix 2: MODS Elements
Our MODS files include the following elements:
identifier – the base file name for the given item; the genre and number of the item as set out in the Clendennen and Cunningham catalogues of Livingstone documents (1979, 1985); where relevant, the shelfmark of a copy of the item held by the National Library of Scotland
titleInfo.title – the title of the item with and without the date
name.namePart and name.description – the name and birth and death dates of the creator(s) and, if a letter, the authority name of the addressee(s); the authority name of the repository as set out, whenever possible, in the Library of Congress Name Authority File (NAF); a short set of biographical facts related to the addressee
genre – the genre of the item as drawn from the Getty Research Institute's Art and Architecture Thesaurus Online
originInfo.dateCreated – the date of the item in written day-month-year form; the date as expressed according to the ISO 8601 format
originInfo.place.placeTerm – the place where an item was created or composed, as specified by Livingstone himself or, if not specified, then as supplied by the Livingstone Online team based on contextual inference; the authority name in the Library of Congress Name Authority File (NAF) of the place where an item was created or composed
subject.cartographics – the approximate latitude and longitude of the place where an item was created or composed
physicalDescription.note and physicalDescription.extent – physical details relating to an item, including whether it takes the form of a manuscript, photocopy, typescript, newspaper item, or other printed format; the page length of the item; the size of the item in millimeters
location.shelfLocator – the repository shelfmark
accessCondition – the terms by which an item is available for use and reuse
relatedItem.identifier – the bibliographical details or URL for any previous full or partial publications of the item
The MODS metadata also forms the basis for the derivative XMP metadata that we add to the image header of each image and for the standalone XMP and TXT Dublin Core metadata that we produce for each image.
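To give a sense of how such a record can be read programmatically, here is a minimal Python sketch that uses only the standard library. The sample record is invented for illustration and is not taken from our files; the attribute choices in it (for example `type="local"` on the identifier and `encoding="iso8601"` on the second date) are assumptions about how a MODS Version 3 record might express the elements listed above, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# MODS Version 3 namespace.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample record for illustration only (an assumption, not project data).
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <identifier type="local">liv_000017</identifier>
  <titleInfo><title>Sample letter, 1 January 1860</title></titleInfo>
  <originInfo>
    <dateCreated>1 January 1860</dateCreated>
    <dateCreated encoding="iso8601">1860-01-01</dateCreated>
  </originInfo>
  <accessCondition>Sample terms of use</accessCondition>
</mods>"""


def summarize(mods_xml: str) -> dict:
    """Pull a few of the elements described above out of a MODS record."""
    root = ET.fromstring(mods_xml)

    def text(path: str):
        # Return the stripped text of the first matching element, or None.
        node = root.find(path, MODS_NS)
        return node.text.strip() if node is not None and node.text else None

    return {
        "identifier": text("mods:identifier"),
        "title": text("mods:titleInfo/mods:title"),
        "date_iso": text("mods:originInfo/mods:dateCreated[@encoding='iso8601']"),
        "access": text("mods:accessCondition"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Selecting the date by its `encoding` attribute separates the machine-readable ISO 8601 form from the written day-month-year form, which otherwise share the same element name.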