This section enumerates and defines Livingstone Online's methodological practices, data production standards, and hosting and backup arrangements.
Introduction
Livingstone Online is a data-driven project. We promote access to and use of our entire digital collection. We work in a transparent and accountable manner. We seek to ensure the long-term digital preservation of our core data. As a result, we rely on a set of well-defined practices, standards, and arrangements to realize these objectives.
Methodological Practices
Code of conduct. The Livingstone Online Code sets out ideals that guide our work and determine our day-to-day collaboration with individuals and institutions around the world.
Open access and use. By design, we have made nearly all of Livingstone Online freely accessible to the public at large. We encourage the broad non-commercial use of our materials and, whenever possible, try to secure Creative Commons licenses for the materials we publish in an effort to promote educational dissemination.
User-friendly design. The Livingstone Online interface seeks to create a user-friendly experience. We have invested significant effort in developing a site that is easy to navigate, intuitive, and aesthetically enriching. We have also conducted extensive testing and have collaborated with a variety of end-users in an attempt to ensure that our site is as glitch- and bug-free as possible.
"Dr. Livingstone's Steam Launch Ma Robert," 1858. Copyright Wellcome Library, London. Creative Commons Attribution 4.0
Quality control. Our image and transcription data and metadata undergo rigorous quality control. Each element of our core data passes through at least four stages of review, and often many more, prior to online publication. Based on our quality control practices, we conservatively estimate our data production error rate to be about 5%, although in most cases the error rate is probably much lower.
Documentation. Our project is the result of a sustained, decade-long collaboration among a variety of entities. We strive to make all of our work as open as possible, as befits a publicly funded project. In this spirit, we offer rich project histories that record every stage of project development via narrative text, images, and downloadable project documents. We encourage other interested parties to study and learn from our efforts as part of our mission to facilitate digital humanities knowledge transfer.
Credit. Livingstone Online is a collaborative project to which a wide range of individuals have made contributions of varying sorts. We have developed a complex credit model in order to recognize these contributions. Our model consists of several components:
- Our staff page identifies and cites the roles of the core members of the Livingstone Online initiative as a whole.
- Our project team pages list the participants responsible for developing Livingstone Online's editions and other projects.
- Our main acknowledgments page enumerates the many individuals who are not Livingstone Online core or project team members, but who nonetheless have made important contributions to our work. There are also acknowledgment pages for our critical editions of Livingstone's Final Manuscripts, 1870 and 1871 Field Diaries, and Letter from Bambarre.
- Our project histories describe and define collaborator contributions in more detail (see sub-section on "Documentation," above).
Livingstone Online thus aspires to digital humanities best practice through the combination of these strategies and seeks to foster a collaborative environment where the work of contributors is not only valued, but also recognized in a manner that does justice to their contributions and, ultimately, benefits both the contributors and the project as a whole.
"The Livingstone Mid Africa Film Corporation, I Presume?" Punch, June 1933, p.323. Copyright National Library of Scotland. Creative Commons Share-alike 2.5 UK: Scotland
Attribution. Our project has a standardized system for the attribution of individual site essays. As relevant, we identify first, second, and third authors as well as editors, peer-reviewing editors, and other such individuals. By default, the individuals who appear in the byline of a given piece are first authors unless otherwise specified: "Megan Ward" or "Megan Ward and Adrian S. Wisnicki." We use "with" to signal one or more second authors and "also with" to signal one or more third authors: "Ashanka Kumari, with Adrian S. Wisnicki, also with Keith Knox and Megan Ward." If necessary, parenthetical information may be added to distinguish between roles: "Adrian S. Wisnicki, with Megan Ward (authors); Justin Livingstone (peer-reviewing editor)."
Data Production Standards
File naming. The base file name of each item in our digital collection consists of only a "liv" prefix plus a unique six-digit item number. The item number has no meaning, and we make it a practice not to put any bibliographic information into our file names. Rather, we keep all such information in our metadata files. See Appendix 1: File Naming for more detail on how we use file names.
Images. When possible, we request 8-bit TIFF images at 600 dpi created to the 6.0 TIFF specification from our collaborating institutions, and we provide the institutions with a clear set of imaging (and permission) guidelines. However, we have collected our core image data from an array of institutions over a period of more than ten years (2004-present). As a result, variations in image capture methods and specifications have been inevitable, due to differing institutional protocols and ongoing methodological changes in fields such as library science, imaging science, and the digital humanities.
Our usual practice is to crop images to one image per manuscript page, if images do not already conform to this format. However, we do not normally crop out any contextual elements that may appear in images, such as color charts or rulers, in the interests of giving users as much information as possible. Also, when we crop images, we always retain an archival backup copy of the uncropped image.
Gary Li, Sharon Messenger (in reflection), and Caroline Overy outside the Royal Society for Arts Library, 2007. Copyright Livingstone Online (Sharon Messenger, photographer). Creative Commons Attribution-NonCommercial 3.0 Unported
Transcriptions. We have produced all our transcriptions in XML in conformance with the TEI P5 encoding guidelines. We have recorded our custom use of these guidelines in a special TEI document called an ODD (One Document Does-it-all). We use this ODD to generate both an HTML-based encoding manual to direct all our transcription efforts and an RNG schema to ensure that all our XML-based transcriptions are valid within the scope of our TEI customization.
Bonus: Download our complete TEI transcription files (780 files).
Double-Bonus: Download PDF reading copies (823 files, including 50 "extra bonus" HTML annotated reading copies) of our TEI transcription files.
Triple-Bonus: Download our complete TEI transcription materials, which include our coding manual, transcription templates, ODD, and RNG schema.
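As a rough illustration of what a first-pass check on a downloaded transcription might look like, the Python sketch below tests whether a file is well-formed XML with a TEI root element and a teiHeader. It is not our validation workflow and does not validate against the project's RNG schema; full validation would require a RELAX NG validator, which is not shown here. The function name and the sample document are hypothetical and were invented for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # standard TEI P5 namespace


def looks_like_tei(xml_text: str) -> bool:
    """Return True if the text is well-formed XML with a TEI root and a teiHeader child.

    This is only a sanity check; it does NOT validate against an RNG schema.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML
    if root.tag != f"{{{TEI_NS}}}TEI":
        return False  # root element is not <TEI> in the TEI namespace
    return root.find(f"{{{TEI_NS}}}teiHeader") is not None


# Hypothetical minimal document, invented for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body><p>Sample text.</p></body></text>
</TEI>"""

if __name__ == "__main__":
    print(looks_like_tei(SAMPLE))
```

A check of this kind can catch truncated or mislabeled files before they reach a schema validator, but it says nothing about conformance to the project's TEI customization.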
Metadata. We build detailed metadata for each item in our digital collection in an XML file created according to Version 3 of the Metadata Object Description Schema (MODS). See Appendix 2: MODS Elements for more detail on how we use the MODS format.
Bonus: Download our complete MODS files (3032 records), which include additional metadata not available elsewhere in our site.
Index of Culpeper's Complete Herbal (London 1815), by Richard Evans. Copyright Livingstone Online (Gary Li, photographer). May not be reproduced without the express written consent of the National Trust for Scotland, on behalf of the Scottish National Memorial to David Livingstone Trust (David Livingstone Centre).
Hosting, Site Setup, and Backup Arrangements
Site hosting. The University of Maryland Libraries host Livingstone Online in an Islandora framework (version 7.1.7). The Islandora framework combines a front-end Drupal content management system (version 7.56) with a back-end Fedora digital asset management system (version 3.6.2). Thanks to this arrangement, Livingstone Online is integrated within the overall collection of the Libraries and so can be preserved and maintained as part of that larger collection.
Development access. An agreement between the University of Maryland Libraries and the Livingstone Online project team sets out the responsibilities of the Libraries in hosting the site and defines the basis on which the project team can access and develop all the site's content.
Site Setup. Livingstone Online is built and deployed using a number of different tools. At the heart of our system, we use Docker (version 17.06) for deployment of code and dependencies, and Git for storing and managing code and configuration. We have a number of GitHub repositories where we share our code and configuration. These repositories fall into three categories:
- Docker related - used to build Docker image(s) that are deployed to the server (prefixed with docker-);
- Code related - used by the site to implement functionality (prefixed with livingstone_online_); and
- Unrelated - not related to site setup (for instance, MODS files, TEI files, etc.).
Docker images are built automatically when changes are made to the repositories identified above. This is performed by the Docker Hub service. After Docker images are built they are automatically deployed by the Docker auto deploy application running on the site server. More information about the site setup is available here.
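The prefix convention for the three repository categories can be expressed as a simple rule. The Python sketch below is illustrative only: the function name and the category labels it returns are assumptions, and the repository names in the usage lines are hypothetical rather than names of actual project repositories.

```python
def classify_repository(name: str) -> str:
    """Sort a repository name into one of the three categories by its prefix."""
    if name.startswith("docker-"):
        return "docker"      # used to build Docker images deployed to the server
    if name.startswith("livingstone_online_"):
        return "code"        # used by the site to implement functionality
    return "unrelated"       # not related to site setup (e.g. MODS or TEI files)


if __name__ == "__main__":
    # Hypothetical repository names, for illustration only.
    for repo in ("docker-example", "livingstone_online_example", "example-data"):
        print(repo, "->", classify_repository(repo))
```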
Site versions. Our project team uses two online versions of the site for development: stage and production. Stage provides iterations of the site where our programmers can experiment with design and test changes to code for review by project staff, while production provides the public-facing version of the site. Programmers also work with local versions of the site on their own computers.
Data and site backup. We employ multiple strategies to back up Livingstone Online's data. The University of Maryland Libraries create nightly incremental backups of the site using Commvault data protection systems, and deleted files are retained for fourteen days after deletion. All Livingstone Online data is also duplicated to tape storage held in the UMD Libraries secure server room. Code, including TEI and MODS files, is versioned in GitHub. The site's Drupal database is backed up automatically to our production server on a daily basis using Drupal's Backup and Migrate module, while the whole Drupal files directory is also synced via a cron job to an external server. Finally, the project director maintains local backups of all core project data on a series of computers, external hard drives, and remote servers.
Appendix 1: File Naming
Each item in our collection receives only a simple base file name, for example liv_000017.
Images receive an additional four-digit segment that identifies the specific image in the item sequence, for example liv_000017_0001.
MODS and TEI files have the relevant acronyms added to the base file name, for example liv_000017_MODS and liv_000017_TEI.
Consequently, the base file names enable the easy association and organization of all files related to an item, while additions to this name and/or file suffixes allow differentiation among the files:
liv_000017 – base file name
liv_000017_0001.jpg – JPEG image file
liv_000017_0001.tif – TIFF image file
liv_000017_0001.tif.md5 – MD5 data integrity verification file
liv_000017_0001.tif.txt – TXT Dublin Core and image capture and processing metadata file
liv_000017_0001.tif.xmp – XMP sidecar metadata file
liv_000017_MODS.xml – XML MODS metadata file
liv_000017_TEI.xml – XML TEI P5 transcription file
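The scheme listed above is regular enough to parse mechanically. The following Python sketch is illustrative only and is not part of the Livingstone Online codebase. The regular expression is inferred from the examples in this appendix rather than taken from an official specification, and the function name and the keys of the returned dictionary are assumptions.

```python
import re
from typing import Optional

# Pattern inferred from the examples above (an assumption, not an official spec):
#   "liv_" + six-digit item number
#   optional "_NNNN" image sequence number, or "_MODS" / "_TEI"
#   optional stacked suffixes such as ".tif.md5"
FILE_NAME = re.compile(
    r"^liv_(?P<item>\d{6})"
    r"(?:_(?P<part>\d{4}|MODS|TEI))?"
    r"(?P<suffix>(?:\.[A-Za-z0-9]+)*)$"
)


def parse_file_name(name: str) -> Optional[dict]:
    """Split a file name into base name, image number, file kind, and suffix."""
    match = FILE_NAME.match(name)
    if match is None:
        return None  # does not follow the naming scheme
    part = match.group("part")
    return {
        "base": f"liv_{match.group('item')}",
        "image": part if part and part.isdigit() else None,
        "kind": part if part in ("MODS", "TEI") else None,
        "suffix": match.group("suffix"),
    }


if __name__ == "__main__":
    for example in ("liv_000017", "liv_000017_0001.tif.md5", "liv_000017_MODS.xml"):
        print(example, "->", parse_file_name(example))
```

Because every file shares the same base name, grouping a directory listing by the "base" value is enough to collect all images, metadata, and transcription files for a single item.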
Return to primary section on File Naming.
Appendix 2: MODS Elements
Our MODS files include the following elements:
identifier – the base file name for the given item; the genre and number of the item as set out in the Clendennen and Cunningham catalogues of Livingstone documents (1979, 1985); where relevant, the shelfmark of a copy of the item held by the National Library of Scotland
titleInfo.title – the title of the item with and without the date
name.namePart and name.description – the name and birth and death dates of the creator(s) and, if a letter, the authority name of the addressee(s); the authority name of the repository as set out, whenever possible, in the Library of Congress Name Authority File (NAF); a short set of biographical facts related to the addressee
genre – the genre of the item as drawn from the Getty Research Institute's Art and Architecture Thesaurus Online
originInfo.dateCreated – the date of the item in written day-month-year form; the date as expressed according to the ISO 8601 format
originInfo.place.placeTerm – the place where an item was created or composed, as specified by Livingstone himself or, if not specified, then as supplied by the Livingstone Online team based on contextual inference; the authority name in the Library of Congress Name Authority File (NAF) of the place where an item was created or composed
subject.cartographics – the approximate latitude and longitude of the place where an item was created or composed
physicalDescription.note and physicalDescription.extent – physical details relating to an item, including whether it takes the form of a manuscript, photocopy, typescript, newspaper item, or other printed format; the page length of the item; the size of the item in millimeters
location.shelfLocator – the repository shelfmark
accessCondition – the terms by which an item is available for use and reuse
relatedItem.identifier – the bibliographical details or URL for any previous full or partial publications of the item
The MODS metadata also forms the basis for the derivative XMP metadata that we add to the image header of each image and for the standalone XMP and TXT Dublin Core metadata that we produce for each image.
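For readers who want to work with the downloadable MODS files programmatically, the Python sketch below shows one way to pull a few of the elements listed above out of a record using only the standard library. It is a sketch under stated assumptions, not a description of our own tooling: the sample record is hypothetical, its values are invented, and the attribute usage (for instance encoding="iso8601" on dateCreated) follows general MODS Version 3 practice rather than the confirmed layout of our files. The function name is likewise an assumption.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}  # MODS Version 3 namespace

# Hypothetical record with invented values, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <identifier>liv_000017</identifier>
  <titleInfo><title>Sample title</title></titleInfo>
  <originInfo>
    <dateCreated>1 January 1860</dateCreated>
    <dateCreated encoding="iso8601">1860-01-01</dateCreated>
  </originInfo>
  <location><shelfLocator>Sample shelfmark</shelfLocator></location>
</mods>"""


def summarize_mods(xml_text: str) -> dict:
    """Extract a handful of fields from a MODS record; missing fields become None."""
    root = ET.fromstring(xml_text)

    def first(path: str):
        element = root.find(path, MODS_NS)
        return element.text if element is not None else None

    return {
        "identifier": first("mods:identifier"),
        "title": first("mods:titleInfo/mods:title"),
        # Select the machine-readable date rather than the written form.
        "date_iso": first("mods:originInfo/mods:dateCreated[@encoding='iso8601']"),
        "shelfmark": first("mods:location/mods:shelfLocator"),
    }


if __name__ == "__main__":
    print(summarize_mods(SAMPLE))
```

Returning None for absent elements keeps the function usable across records that omit optional fields such as the shelfmark.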
Return to primary section on Metadata.