
NDK context

by Jan Hutař last modified 2011-12-13 00:08

The Creation of the National Digital Library Project

In February 2010, the National Library of the CR (NLCR), with the Moravian Library (ML) as its partner, applied for the ‘Creation of the National Digital Library’ (NDL, Czech initials: NDK) project. The application was submitted under Call 07 of the Integrated Operational Programme, ‘Electronisation of Public Administration’, and the project was approved in June 2010. It is one of the cornerstones of the eCulture concept, through which the culture sector contributes significantly to fulfilling the aims of Smart Administration.

The project is financed with approximately CZK 255 million from the EU Integrated Operational Programme and co-financed with approximately CZK 45 million from the budget of the MC CR. The 85% contribution from the ERDF structural fund, CZK 254,946,300, is complemented by 15% co-financing from the state budget, CZK 44,990,700. Total eligible public expenditure is thus CZK 299,937,000.

Thanks to their right to a complete legal deposit copy, the NLCR and ML have preserved in their collections the majority of the monographs, periodicals and other types of documents published in the CR (Bohemica in the narrow sense) and a great number of documents related to the CR published abroad (Bohemica in the broad sense), and they administer abundant historical collections. Since 2000, they have also cooperated on archiving the Czech web. They thus hold extensive and unique material that has singular cultural value and, in the context of Smart Administration, above all factual value.

The NDL project has three main aims:

1. The digitisation of a significant part of the Bohemica of the 19th–21st centuries, i.e. books issued in the Czech Republic, written in Czech or discussing the Czech Republic. By the end of 2019, we will have digitised more than 50 million pages in total, or approximately 300,000 volumes. Digitisation is not limited to the duration of the project: it will continue intensively after the project's completion in 2014, not only until 2019 within the required sustainability period but also in subsequent years.

2. The long-term preservation of documents in a reliable digital repository, which will provide a space for the safe placement of already digitised documents as well as the digital documents created or acquired in the NDL project and within other projects.

3. Making digital documents accessible in a uniform, user-friendly interface that allows a high degree of personalisation. The digitised documents as well as paid online databases will be accessible from a single place.

The NDL System  

The NDL system comprises four basic subsystems and a number of ancillary subsystems. All of the subsystems will be built on the basic technological infrastructure, including the servers, deposit storage, network and basic drivers. This infrastructure will be created by expanding the existing technological environment already in operation at the NL CR and ML.

The basic subsystems are:

  • Digitisation Subsystem
  • Long-Term Preservation of Digital Documents Subsystem (LTP Subsystem)
  • Transformation and Consistency Control Subsystem (Transformation Module)
  • Information and Document Access Subsystem (simultaneous access applications + central access applications)

The system also includes the technical and administrative tools that interconnect the subsystems and integrate them with the existing applications (the integration layer).

The role of the individual subsystems within the NDL system and their interconnections are shown in the following diagram.

[Figure: NDK_new (diagram of the NDL system)]

Legend to the picture:

– Digitisation, Long-Term Archiving and the Application for Central Access are modules for which ready-made solutions exist (the most suitable solution will be chosen in the selection process on the basis of the requirements)

– The Transformation Module is a component that does not yet exist and will have to be developed within the NDL project (or, where a usable product is available, its functionality will have to be completed by additional programming)

– The systems that already exist (Aleph, Registr digitalizace /Register of Digitisation/, URN:NBN resolver, Kramerius, WebArchiv, Manuscriptorium) must be integrated in the NDL system and connected with the newly built systems and modules

MC = master copy

UC = user copy

PSP = producer submission package – a package of data and metadata from the process of digitisation or provided from external sources

SIP = submission information package – a package of data and metadata entering the LTP system

DIP = dissemination information package – a package of data and metadata exiting the LTP system

AIP = archival information package – a package of data and metadata in the archives

DB = database

Types of lines:

– solid line – flows of the packages (data + metadata)

– dashed line – flows of metadata, assignment of identifiers

– dotted line – entry checks

– blue line – user copies of the Manuscriptorium + WebArchiv projects (they are handled outside the NDL system; only the archiving of the MC from these projects is ensured)

 

The Movement of the Documents through the NDL System

(Basic Description of the Picture)

  • A document selected for digitisation is processed in the digitisation subsystem using the ‘digitisation workflow’ tools. The metadata are retrieved from the Aleph library system by reading the barcode and are transferred into the digitisation workflow and into the Register of Digitisation (hereinafter RD). The digitisation workflow assigns the documents identifiers (URN:NBN), which are further adjusted by the Resolver URN:NBN application.
  • The output of the digitisation process is the PSP data package, which contains both the data and metadata for access and the data and metadata intended for archiving. This package is placed in the shared workspace.
  • In the workspace, the PSP packages are further processed by the transformation module. The metadata are checked, and by transforming the metadata and the structure of the packages the module creates the SIP1 package for the LTP system and the SIP2 package for the access system. The transformation module also assigns URN:NBN identifiers to external documents that do not come from the digitisation workflow.
  • The SIP1 is further processed in the LTP system; the output is the AIP package, which is preserved in the archiving module of the LTP system.
  • The Kramerius application (at the NL CR and ML) processes the SIP2 package and makes the user copies (UC) accessible. The user copies produced by both digitisation workplaces (Prague, Brno) as well as the user copies of external data will be placed in the Kramerius applications.
  • The digitisation workflow performs a consistency check (whether the UC and MC have reached their target locations) and ensures the deletion of the PSP packages from the workspace.
  • The RD and Resolver URN:NBN systems collect data from the Kramerius applications, and the RD subsequently provides the URL of the UC to the Aleph library catalogue (NL CR, ML), from which it is passed on to the Union Catalogue of the CR (SKC database).
  • The application for central access harvests the data from the access applications (Kramerius, Manuscriptorium, WebArchiv), from which the user also obtains image data and full texts, and possibly an extended description.
  • If user copies need to be replaced, the archival master data are exported through the adjustable DIP into the transformation module and are loaded into the access applications (both metadata and data migration can take place within the LTP system and its workflow for DIP).
  • The end user does not have access to the archival data, only to the UC through representations in the access applications. An end user who still needs the data in high quality can obtain it from the archive on request, manually (through the system administrator).
  • The data from the Manuscriptorium and WebArchiv projects will not be adjusted in the transformation module for the access applications. Here, the entry into the access application takes place ‘independently of the NDL project’, outside the transformation module, before or after the data are entered in the LTP system.
  • The data from other sources intended for archiving and access in the NDL system are saved in the workspace of the transformation module, which ensures their conversion into the SIP1 and SIP2 packages and their dispatch to the LTP and the access applications.
  • The transformation module monitors the flow of the documents from external sources and checks the consistency of the UC and MC between the LTP and the access applications. If everything is in order, it ensures the deletion of the PSP packages of external data from the workspace.
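
The movement of one document through the system can be sketched in code. The following is a minimal illustration in Python, not the NDL implementation: the `Package` class, the `transform` and `run_flow` functions, the dictionaries standing in for the workspace, the LTP archive and the access application, and the sample identifier and payload values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A bundle of data and metadata; `kind` is PSP, SIP1, SIP2 or AIP."""
    kind: str
    urn_nbn: str
    payload: dict = field(default_factory=dict)


def transform(psp: Package) -> tuple[Package, Package]:
    """Hypothetical transformation step: derive SIP1 and SIP2 from a PSP."""
    if psp.kind != "PSP":
        raise ValueError("transform() expects a PSP package")
    sip1 = Package("SIP1", psp.urn_nbn, {"master_copy": psp.payload["master_copy"]})
    sip2 = Package("SIP2", psp.urn_nbn, {"user_copy": psp.payload["user_copy"]})
    return sip1, sip2


def run_flow(psp: Package, workspace: dict, ltp_archive: dict, access_app: dict) -> bool:
    """Move one document through the flow and clean up the workspace."""
    workspace[psp.urn_nbn] = psp
    sip1, sip2 = transform(psp)
    # LTP system: SIP1 becomes an AIP kept in the archiving module.
    ltp_archive[sip1.urn_nbn] = Package("AIP", sip1.urn_nbn, sip1.payload)
    # Access application (for example Kramerius): SIP2 yields the user copy.
    access_app[sip2.urn_nbn] = sip2.payload["user_copy"]
    # Consistency check: delete the PSP only if both MC and UC arrived.
    if psp.urn_nbn in ltp_archive and psp.urn_nbn in access_app:
        del workspace[psp.urn_nbn]
        return True
    return False


workspace, ltp_archive, access_app = {}, {}, {}
psp = Package(
    "PSP",
    "urn:nbn:cz:example-000001",  # made-up identifier, for illustration only
    {"master_copy": "master-image-bytes", "user_copy": "user-image-bytes"},
)
ok = run_flow(psp, workspace, ltp_archive, access_app)
```

After the call, the workspace is empty again, the archive holds an AIP under the document's identifier, and the access application holds the user copy. The point of the sketch is the ordering: the PSP is removed only once both target locations have been checked.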

 

A Description of the Subsystems of the NDL Project

 

The Digitisation Subsystem

The digitisation subsystem ensures the operation of the digitisation workplaces, which are located at two sites (Prague-Hostivař and Brno). At these workplaces, the analogue source materials (paper documents or microfilm) are scanned into digital form, and the data are prepared for long-term preservation and presentation. The input into the digitisation workplace consists of the analogue originals of the future digital objects and the metadata of the individual objects, which are transferred from the library catalogue of the NL CR and ML. The document for scanning is loaned through the library catalogue; the metadata are downloaded from the catalogue after entry in the RD. The RD system keeps information on digitised documents across the entire CR; it retrieves information on the locations of the user copies (URL) from the individual components of the access subsystem and can transfer this information back to the library system.

The scanning of the analogue originals is followed by further adjustments of the scanned images (straightening, trimming etc.) and, above all, by the completion of the information on the document, i.e. the metadata. Finally, a producer submission package (PSP) is created for every document; it includes all of the metadata belonging to the document as well as the actual scanned images (the data).

The PSP package is deleted from the workspace based on a check of the presence of the data in the LTP and in the accessibility application, which is performed by the control module of the digitisation workflow (in the case of external data, which will not go through the digitisation workflow, this check is performed by the transformation module).

 

The LTP Subsystem

The LTP subsystem is a reliable digital repository allowing effective administration, protection and preservation of the data and metadata created in the NDL project and of other data which will become a component of the NDL. The repository ensures both physical data protection (bitstream protection) and logical data protection (preserving usability and comprehensibility for the near and distant future) according to the OAIS standard (ISO 14721:2003). The LTP subsystem will store the archival (master) copies of the digitised documents of the NL CR and ML and of some external suppliers, the archival (master) copies of born-digital documents from web archiving, and the files of other, predominantly born-digital, documents (e-deposit).
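
One common way to support bitstream protection is a fixity check: a digest is recorded when a file is ingested and recomputed later to detect silent corruption. The text does not say which mechanism the NDL repository uses, so the following Python sketch is only an assumption for illustration; the function names and the sample bytes are hypothetical, and SHA-256 is chosen here simply as a widely available digest.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the stored bytes still match the digest recorded at ingest."""
    return sha256_of(data) == recorded_digest


master_copy = b"example master copy bytes"  # stand-in for an archived file
recorded = sha256_of(master_copy)           # would be stored with the AIP

intact = verify_fixity(master_copy, recorded)
damaged = verify_fixity(master_copy + b"\x00", recorded)
```

A check of this kind covers only the physical layer. Logical protection, i.e. keeping the content usable and comprehensible, depends on the metadata and format management described in this section.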

Descriptive (of limited extent), structural, technical and administrative metadata will be stored for each document individually and as a logical entity. All metadata will have to conform to current, generally accepted standards.

 

The Subsystem for Transformations and Consistency Control (Transformation Module)

The LTP subsystem will require SIP information packages on input in a specific, defined form. However, a number of existing digitised documents are saved in other metadata formats and structures arising from previous digitisation work at the NL CR and ML. A similar problem with unsuitable formats of image data and metadata may also occur with documents arriving from external digitisation. If the existing and external documents are to be saved correctly in the LTP subsystem, their formats will have to be transformed into the preferred form. The transformation module must therefore allow the adaptation of various metadata formats and digital document formats. In practice, this will mean individual configuration or custom development for specific sources of documents. Since the access systems also require data in a specific, defined form on input, the transformation module will also ensure the conversion of data from various sources into the SIP format used for making the user copy accessible.
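
The idea of per-source configuration can be illustrated with a small registry of converters, one for each source of documents, all producing the same target shape. This is a sketch under stated assumptions, not the design of the Transformation Module: the source names, the input field names and the target fields are invented for the example.

```python
from typing import Callable, Dict

# Target shape of the metadata; these field names are hypothetical.
TARGET_FIELDS = ("title", "year", "identifier")

Converter = Callable[[dict], dict]
CONVERTERS: Dict[str, Converter] = {}


def converter(source: str):
    """Register a conversion function under the name of a document source."""
    def register(func: Converter) -> Converter:
        CONVERTERS[source] = func
        return func
    return register


@converter("legacy_digitisation")  # hypothetical older in-house format
def from_legacy(record: dict) -> dict:
    return {"title": record["nazev"], "year": int(record["rok"]),
            "identifier": record["id"]}


@converter("external_supplier")  # hypothetical external format
def from_external(record: dict) -> dict:
    return {"title": record["Title"], "year": record["Year"],
            "identifier": record["URN"]}


def to_target(source: str, record: dict) -> dict:
    """Convert a record from a known source and check the result's shape."""
    if source not in CONVERTERS:
        raise KeyError(f"no converter configured for source {source!r}")
    result = CONVERTERS[source](record)
    missing = [name for name in TARGET_FIELDS if name not in result]
    if missing:
        raise ValueError(f"converted record lacks fields: {missing}")
    return result


legacy = to_target("legacy_digitisation",
                   {"nazev": "Example title", "rok": "1900", "id": "doc-1"})
external = to_target("external_supplier",
                     {"Title": "Example title", "Year": 1900, "URN": "doc-2"})
```

Adding a new source then means registering one more converter, while the check in `to_target` stays the same for every source.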

This subsystem, indispensable for the operation of the system as a whole, will be developed by the system integrator of the whole project, necessarily in cooperation with experts from the NL CR and ML. It will have to be further developed and maintained by the employees of both institutions after the project ends.

 

The Subsystem for Making Information and Documents Accessible

The accessibility subsystem will have two layers:

A. The Existing Accessibility Applications – Kramerius, WebArchiv and Manuscriptorium

The access subsystem will be based on three existing applications: Kramerius (installed at the NL CR and at the ML), WebArchiv and Manuscriptorium. The documents being made available will be saved as user copies in the operational storage of these applications. The user copy of a document contains metadata similar to those of its archival master copy, but the document itself is in a more space-saving format sufficient for common display.

B. Application for Central Access

In addition to these applications, an application for central access will be built within the access system; it will give ordinary users easy access to the documents from the various applications in a single, user-friendly interface. Its advantage is that users can search all of the documents and information at once, without having to know in which of the above-mentioned or other applications the relevant information is stored and without having to learn a number of different user interfaces. Through the application for central access, users will not only have the NDL outcomes available but will also be able to search for information on physical documents in the collections of the NL CR, ML and other libraries, and for information from prepaid external electronic resources (the individual aggregators offer resources from their portfolios in various user interfaces, but applications for central access already allow access to all of the resources within a single central index).
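
The principle of a single central index can be shown with a toy example: records harvested from several applications are merged into one index, and a single query returns hits from all of them. This Python sketch is only an illustration of the principle; the record identifiers and titles are invented, and a real central access application would use a full search engine rather than a word index.

```python
from collections import defaultdict


def build_central_index(sources: dict) -> dict:
    """Map each lower-cased title word to (source, record id) pairs."""
    index = defaultdict(set)
    for source_name, records in sources.items():
        for record_id, title in records.items():
            for word in title.lower().split():
                index[word].add((source_name, record_id))
    return index


def search(index: dict, query: str) -> list:
    """Return the sorted hits whose titles contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(set(index.get(word, set())) for word in words))
    return sorted(hits)


# Invented sample records standing in for harvested data.
sources = {
    "Kramerius": {"k1": "Example newspaper 1900"},
    "Manuscriptorium": {"m1": "Example manuscript"},
    "WebArchiv": {"w1": "Example web page"},
}
index = build_central_index(sources)
results = search(index, "example")
```

Here `results` lists one hit from each of the three applications, although the user issued a single query and did not need to know where each record is stored.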

Notice: This text is only tentative information for the general public and is not to be confused with the text of the tender for the project system integrator.
