Long-Term Preservation of Digital Documents: Principles and Practices

October 14, 2010

Key to our culture is that we can disseminate information, and then maintain and access it over time. While we are rapidly advancing from vulnerable physical solutions to superior, digital media, preserving and using data over the long term involves complicated research challenges and organization efforts. Uwe Borghoff and his coauthors address the problem of storing, reading, and using digital data for periods longer than 50 years. They briefly describe several markup and document description languages like TIFF, PDF, HTML, and XML, explain the most important techniques such as migration and emulation, and present the OAIS (Open Archival Information System) Reference Model. To complement this background information on the technology issues, the authors present the most relevant international preservation projects, such as the Dublin Core Metadata Initiative, and experiences from sample projects run by the Cornell University Library and the National Library of the Netherlands. A rated survey list of available systems and tools completes the book. With this broad overview, the authors address librarians who preserve our digital heritage, computer scientists who develop technologies that access data, and information managers engaged with the social and methodological requirements of long-term information access.
Uwe Borghoff is a full professor of computer science at the University of the Armed Forces (UniBwM), Munich, Germany. Prior to this, he worked at Xerox Research Centre Europe in Grenoble, France, where he led the coordination technologies group. Peter Rödig (UniBwM) is developing methods for long-term preservation of digital data. Rel...
Format: Paperback
Dimensions: 274 pages
Published: October 14, 2010
Publisher: Springer-Verlag/Sci-Tech/Trade
Language: English

ISBN-10: 3642070175

ISBN-13: 9783642070174


Table of Contents
(new sections and chapters w.r.t. the German version of this book are printed in bold face)

Part I: Approaches to Long-Term Preservation (approx. 155 pages)

1 Long-term Preservation of Digital Documents (approx. 23 pages)
  1.1 Blessing and Curse of Digital Documents
  1.2 Challenges, Terms, Concepts
  1.3 Preserving Byte Streams
  1.4 Technical Approaches to Long-term Preservation
  1.5 Legal and Social Issues
2 The OAIS Reference Model and the DSEP Process Model (approx. 13 pages)
  2.1 The OAIS Reference Model
    2.1.1 Background Information
    2.1.2 Information Model
    2.1.3 Process Model
  2.2 The DSEP Process Model for Libraries
3 Migration (approx. 23 pages)
  3.1 Migration: Notions and Goals
  3.2 Migration as a Means for Long-term Preservation
    3.2.1 Data Formats as Migration Targets
    3.2.2 Migration via Changing Media
    3.2.3 Migration via Changing Logical Structure
  3.3 Preservation Processes in Migration Approaches
  3.4 Migration: Pros and Cons
4 Emulation (approx. 27 pages)
  4.1 Emulation: Notions and Goals
  4.2 Emulation as a Means for Long-term Preservation
    4.2.1 What exactly means Emulation?
    4.2.2 Variants of Emulation
    4.2.3 Exploiting Virtual Machines
  4.3 Preservation Processes in Emulation Approaches
  4.4 Emulation: Pros and Cons
5 Document Markup (approx. 25 pages)
  5.1 An Example
  5.2 Different Forms of Markup
    5.2.1 Procedural, Structural, Semantic Markup
    5.2.2 Embedded Markup Considered Harmful
    5.2.3 Levels of Markup
  5.3 Exploiting Markup for Long-term Preservation
    5.3.1 Requirements for Long-term Preservation
    5.3.2 Bibliographic Requirements
  5.4 Persistence is a Virtue
    5.4.1 Uniform Resource Identifier, -Name, -Locator
    5.4.2 Referencing Documents
    5.4.3 Handles and Digital Object Identifiers
    5.4.4 Summary
6 Standard Markup Languages (approx. 26 pages)
  6.1 Standards for Syntactic Document Markup
    6.1.1 Tagged Image File Format (TIFF)
    6.1.2 Portable Document Format (PDF)
    6.1.3 HyperText Markup Language (HTML)
    6.1.4 eXtensible Markup Language (XML)
  6.2 Standards for Semantic Document Markup
    6.2.1 Resource Description Framework (RDF)
    6.2.2 Topic Maps
    6.2.3 Ontologies
  6.3 Vision: The Semantic Web
7 Discussion (approx. 11 pages)
  7.1 Why do You Need to Act NOW?
  7.2 What do We Know already, What Remains to be Done?
  7.3 Facing Reality
  7.4 A Combined Approach

Part II: Recent Preservation Initiatives (approx. 157 pages)
(Projects are subject to change)

8 Markup: Current Research and Development (approx. 50 pages)
  8.1 The Dublin Core Metadata Initiative (DCMI)
  8.2 The Metadata Encoding and Transmission Standard (METS)
  8.3 The Victorian Electronic Records Strategy (VERS)
  8.3 The Text Encoding Initiative (TEI)
  8.4 The Research Libraries Group (RLG)
  8.5 The Pandora Project
9 Migration: Current Research and Development (approx. 50 pages)
  9.1 Migration in the VERS Project
  9.2 Preserving the Whole
  9.3 Risk Management of Digital Information
  9.4 Database Migration
    9.4.1 Motivation
    9.4.2 Overview of the Architecture
    9.4.3 Detaching Digital Objects from Physical Media
    9.4.4 Services of Database Management Systems
    9.4.5 Experiments
    9.4.6 Discussion
10 Emulation: Current Research and Development (approx. 16 pages)
  10.1 Emulation Experiments by Rothenberg
  10.2 Universal Virtual Computer (UVC)
11 Digital Archiving Systems for Long-Term Preservation (approx. 40 pages)
  11.1 Assessment Methodology
  11.2 Market Survey
    11.2.1 EPrints (University of Southampton)
    11.2.2 DSpace (MIT)
    11.2.3 DIAS (IBM/National Library of the Netherlands)
    11.2.4 Fedora (Cornell University, The University of Virginia)

List of Figures
List of Tables
References
Index

From the reviews:

"The book examines the problem of storing, reading and using digital documents for periods longer than 50 years. ... is user friendly and well structured with a pleasant layout. It is accessible to different user groups, and targets a broad audience including archivists, records managers, librarians, computer scientists and information managers - anyone interested in the long-term preservation of digital documents. ... this book can be recommended as necessary reading for all students and professionals in the field of information management." (Tony Rodrigues, Online Information Review, Vol. 31 (6), 2008)