Proof of Experience:
National Library of the Netherlands
Royal Dutch Library - Koninklijke Bibliotheek (KB) - preserves culture with IBM Digital Information Archiving Solution.
Customer Background
Founded in 1798, the Royal Dutch Library strives to preserve the country’s cultural heritage.
The library started its engagement in the area of long-term digital preservation together with IBM in the mid-90s and is hence one of the pioneers in long-term digital information preservation.
For the Royal Dutch Library, preserving the country’s cultural heritage and collecting and maintaining all publications issued in the Netherlands are imperative. The library archives a vast volume of materials electronically, using CD-ROMs, diskettes and magneto-optical disks. It is also responsible for preserving these materials and providing public access to them.
Business challenge
The National Library of the Netherlands is faced with the problem of preserving large amounts of digital documents for the long term. These documents come from two sources: from media published directly in digital form and from digitizing paper documents. Anticipating several hundred terabytes of digital content, the library recognized it was time for a scalable, reliable digital media management solution. Up to that time, no solution readily addressed both large-volume, durable storage and the long-term preservation requirements, so this project could not rely on out-of-the-box solutions alone.
Solution
In 2000, the KB and IBM started building an electronic deposit system. IBM provided a complete, high-quality digital media solution, with IBM’s Digital Information Archiving System (DIAS) as the technical core of the e-Deposit infrastructure for the Royal Dutch Library. IBM Content Manager provides a scalable, robust foundation for digital content preservation and retrieval. IBM Tivoli Storage Manager centralizes and automates backup and archival services for all of the solution’s storage media. IBM WebSphere software provided an efficient, dynamic e-business development environment for the content-loading application. The solution has been operational since 2003, and IBM will provide 10 years of maintenance.
Additionally, a joint study to further research in the area of long-term digital preservation was initiated.
Benefits
- KB became one of the first libraries to have an electronic deposit with long-term preservation capabilities
- Capability of managing hundreds of thousands of electronic documents annually
- Management of various content formats, from print-based documents to digitized images
- Permanent access to digital information within an international context
- Access to research information for 100 concurrent external users—members of the public who want to do research
- Faster and direct access to information for users
- Supported storage and retrieval of content for the next 100 years
- An affordable, integrated software and hardware solution as well as business support services from one IT provider
- Sustainable solution and flexibility through the application of open standards and hardware and software components
- Development of best practices in the area of long-term digital preservation
IBM/KB Long-term Preservation Study
Introducing the IBM KB Long-term Preservation Report Series
The National Library of the Netherlands (KB, Koninklijke Bibliotheek) is faced with the problem of preserving large amounts of digital documents for the long term. These documents come from two sources: from media published directly in digital form and from digitizing paper documents. In 2000, the KB and IBM started building an electronic deposit system (the “Digital Information Archiving System”, or DIAS), the technical core of the infrastructure for the e-Deposit for the Netherlands.
From the beginning it was clear that this project could not rely on out-of-the-box solutions alone because up to that time no solution readily addressed both the aspects of large volume and durable storage as well as the long-term preservation requirements. So an IBM / KB Long-term Preservation Study (LTP Study) was initiated as part of the overall project of developing a deposit system.
The primary objective of the LTP Study was to investigate the functionality required for the long-term preservation (hundreds of years) of the digital information stored in DIAS. This study has resulted in six reports: one overview report and five specific reports, each addressing an important aspect of long-term preservation in its own right.
Titles of the IBM / KB Long-term Preservation Study Reports Series:
Number 1: The Long-Term Preservation Study of the DNEP Project - an Overview of the Results
This report explains the reasons and objectives behind defining the LTP Study as part of the overall project to implement an electronic deposit system. It also provides a quick and general overview of all the study results, which are then elaborated on in more detail in the other published reports.
Number 2: Authenticity in a Digital Environment
Authenticity acquires a new meaning in a digital context. Normally objects are physical and their physical characteristics are the main source for defining authenticity. Moreover, authenticity is not a single concept, but involves different aspects that can be associated with an object:
- A traceable path from the object’s origin to its current ownership;
- Measures and techniques for safeguarding against and/or recognizing modifications;
- Techniques for establishing the use of original materials.
The problem with digital objects is that they are in fact just conceptual objects: a digital object has to be interpreted (rendered) by executing it in a specific IT infrastructure (hardware and software). This report focuses on defining a framework in which we can define what is actually meant when one speaks of an authentic digital object.
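As an illustration of the second aspect listed above (recognizing modifications), a checksum recorded at deposit time can later be compared with a freshly computed one. This is only a minimal sketch and is not taken from the report: the function names `fingerprint` and `is_unmodified` are hypothetical, and SHA-256 is an assumed choice of hash. It covers only the bit-level identity of a stored file, not the rendering question discussed above.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the stored bytes (assumed hash choice)."""
    return hashlib.sha256(content).hexdigest()


def is_unmodified(content: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded at deposit."""
    return fingerprint(content) == recorded_digest


# Hypothetical usage: record the digest at deposit, verify it later.
deposited = b"deposit copy of a publication"
recorded = fingerprint(deposited)

print(is_unmodified(deposited, recorded))         # unchanged bytes
print(is_unmodified(deposited + b"!", recorded))  # altered bytes
```

A matching digest shows that the bytes are unchanged; it says nothing about whether the object can still be rendered as originally intended.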
Number 3: Preservation Requirements in a Deposit System
The initial DIAS release only provides basic functionality for preserving and rendering the stored digital objects for the long term. One of the primary responsibilities of the LTP Study is to define the functional requirements of the Preservation Subsystem, which is scheduled for development later. This report identifies requirements of the DIAS Preservation Subsystem so as to provide the services and functions for monitoring the technical environment associated with the digital objects stored in DIAS.
The Preservation Subsystem can be summarized by the following three objectives:
- Identifying digital objects that are in danger of becoming inaccessible because of changes in technology;
- Implementing the activities associated with technical preservation;
- Supplying the requisite technical metadata in order to generate / validate the environments needed during digital object delivery.
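The first objective can be pictured as a lookup of each object's format against the rendering environments that are still available. The sketch below is an assumption-laden illustration, not the DIAS design: the classes `DigitalObject` and `Environment`, the function `objects_at_risk`, and the sample formats are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalObject:
    object_id: str
    file_format: str  # technical metadata, e.g. "PDF 1.3" (hypothetical value)


@dataclass(frozen=True)
class Environment:
    name: str
    supported_formats: frozenset
    available: bool  # False once the hardware/software is no longer obtainable


def objects_at_risk(objects, environments):
    """Return objects whose format no available environment can render."""
    renderable = set()
    for env in environments:
        if env.available:
            renderable |= env.supported_formats
    return [obj for obj in objects if obj.file_format not in renderable]


# Hypothetical holdings and environments.
holdings = [
    DigitalObject("obj-1", "PDF 1.3"),
    DigitalObject("obj-2", "WordPerfect 5.1"),
]
environments = [
    Environment("current viewer", frozenset({"PDF 1.3"}), available=True),
    Environment("legacy suite", frozenset({"WordPerfect 5.1"}), available=False),
]

for obj in objects_at_risk(holdings, environments):
    print(obj.object_id, obj.file_format)
```

In this toy data only `obj-2` is reported, because the single environment that renders its format is marked unavailable. Such a report would then trigger the technical preservation activities named in the second objective.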
Number 4: The UVC: a Method for Preserving Digital Documents – Proof of Concept
At IBM Research in Almaden, Raymond Lorie was already working on a combined emulation / migration approach, called the Universal Virtual Computer (UVC), for preserving a certain class of digital objects.
The main idea consists of archiving, along with the data file, a program P that decodes the data and returns the information to a future client based on a logical view. The logical view of the data is simple and self-contained enough to be interpreted without any specific software or hardware. Program P is written for a Universal Virtual Computer (UVC) that is general, yet basic enough to continue to be relevant in the future. Given the simplicity of the UVC, it will be relatively easy to write an emulator of the UVC in the future on a real machine of that time. The emulated machine will run program P and return all data in an easy-to-understand logical view.
The LTP Study conducted a proof of concept with the KB to test the UVC approach in a library environment. The PDF format was selected because it is the primary data format for electronic publications to be stored in DIAS.
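The division of labour in this approach (an archived decoder program plus a small emulator written later) can be sketched with a toy example. Everything here is an assumption made for illustration: the seven-instruction machine, the run-length-encoded data format, and the names `PROGRAM_P` and `run_emulator` are hypothetical and do not reproduce the actual UVC instruction set or the PDF decoder used in the proof of concept.

```python
# Program P, archived together with the data. It decodes a run-length
# encoding (count, value, count, value, ...) into plain bytes, which
# stand in for the "logical view". Instruction set is hypothetical.
PROGRAM_P = [
    ("READ", 0),      # 0: r0 = next input byte (run length), or -1 at end
    ("JNEG", 0, 7),   # 1: end of input, so jump to HALT
    ("READ", 1),      # 2: r1 = next input byte (the value to repeat)
    ("JZERO", 0, 0),  # 3: run finished, so fetch the next pair
    ("EMIT", 1),      # 4: append r1 to the output
    ("DEC", 0),       # 5: r0 -= 1
    ("JMP", 3),       # 6: loop
    ("HALT",),        # 7: stop
]


def run_emulator(program, data: bytes) -> bytes:
    """Emulator a future client could write for the toy machine.

    Assumes well-formed input (complete count/value pairs).
    """
    regs = [0, 0]
    pos = 0  # read position in the archived data
    pc = 0   # program counter
    out = bytearray()
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "READ":
            if pos < len(data):
                regs[args[0]] = data[pos]
                pos += 1
            else:
                regs[args[0]] = -1
        elif op == "EMIT":
            out.append(regs[args[0]])
        elif op == "DEC":
            regs[args[0]] -= 1
        elif op == "JMP":
            pc = args[0]
        elif op == "JZERO":
            if regs[args[0]] == 0:
                pc = args[1]
        elif op == "JNEG":
            if regs[args[0]] < 0:
                pc = args[1]
        elif op == "HALT":
            return bytes(out)
        else:
            raise ValueError(f"unknown instruction: {op}")


archived = bytes([3, ord("a"), 1, ord("b"), 2, ord("c")])
print(run_emulator(PROGRAM_P, archived))  # b'aaabcc'
```

The point of the design is that only `run_emulator` has to be rewritten for a future machine. The archived program and data stay untouched, and the knowledge of the data format travels with the archive inside program P.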
Number 5: Managing Media Migration in a Deposit System
Storage technology obsolescence makes media migration a necessity. Data has to be copied from one storage medium to another on a regular basis. However, the fact that storage technology becomes obsolete is not the only trigger for rewriting previously stored digital objects. All storage media degrade over time and have to be rewritten either on the same medium (refreshing) or on another medium (migration).
Ordinarily, media refreshment / migration would be a straightforward process. However, the large amounts of storage associated with an electronic deposit system introduce certain volume-specific requirements. Most electronic deposit systems define their storage capacity needs in several TeraBytes (10^12 bytes). Take a deposit system with 100 TeraBytes of information stored on tape, for example. Let’s assume that you want to migrate all this information to an optical storage medium. Current optical storage media have a capacity of around 5 GigaBytes and a write speed of around 4 MegaBytes/second. A quick calculation shows that a complete migration to optical storage would take at least 289 days (100 TeraBytes / 4 MegaBytes per second = 25 million seconds)!
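The back-of-the-envelope estimate above can be reproduced with a few lines of code. This is a sketch that counts pure write time only (no media handling, verification or read time), so it gives a lower bound. The function names and the `parallel_writers` parameter are hypothetical additions for illustration, and decimal units (1 TeraByte = 10^12 bytes) are assumed.

```python
import math

TB = 10**12  # decimal units, as assumed in the lead-in
GB = 10**9
MB = 10**6
SECONDS_PER_DAY = 24 * 60 * 60


def migration_days(total_bytes, write_bytes_per_second, parallel_writers=1):
    """Lower bound on migration time in days, counting write time only."""
    if write_bytes_per_second <= 0 or parallel_writers < 1:
        raise ValueError("write speed and writer count must be positive")
    seconds = total_bytes / (write_bytes_per_second * parallel_writers)
    return seconds / SECONDS_PER_DAY


def media_needed(total_bytes, capacity_bytes):
    """Number of target media required to hold the full volume."""
    return math.ceil(total_bytes / capacity_bytes)


print(round(migration_days(100 * TB, 4 * MB), 1))      # 289.4
print(media_needed(100 * TB, 5 * GB))                  # 20000
print(round(migration_days(100 * TB, 4 * MB, 10), 1))  # 28.9
```

With a single writer the write time alone is about 289 days, spread over 20,000 optical media. Ten writers working in parallel (an assumed figure) would bring the bound down to about 29 days.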
This report describes the actions to be taken to manage media migration / refreshment effectively within an electronic deposit system, focussing specifically on the media migration issues within DIAS. Potential additional capacity required for media migration might be created by redundancy and parallelism.
Number 6: Archiving Web Publications
More and more web publications are becoming a primary source of information and will thus be stored as digital objects in DIAS. Web publications have specific characteristics and requirements that DIAS must meet if they are to be archived successfully.
This report investigates the issues and requirements introduced by archiving Web publications and their potential impact on DIAS.
