Digital Object Identifier (DOI)

DOIs provide persistent links to scholarly content, helping users reach the authoritative, published version of the content they are searching for, even when the content changes location or ownership. With about 35 million DOIs registered for publications (as of 2009), the system is well established and consistently used by scientific publishers and organisations.

Through the project STD-DOI, the TIB (German National Library of Science and Technology, Hannover) was established as an agency for data DOIs. Among four data providers, PANGAEA was the first system to use DOIs for automated persistent identification of data sets. A data DOI has the prefix 10.1594, which is assigned to the publication of primary data through the TIB. The suffix, separated from the prefix by a slash, is composed of the acronym of the data system or center and a system-specific part. In a PANGAEA DOI, this part is identical to the internal ID that the relational database management system automatically assigns to a data set during import; thus the uniqueness of each DOI is assured.

A valid Pangaea-DOI has the syntax 10.1594/PANGAEA.738357
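
The composition rule described above (prefix, slash, center acronym, system-specific part) can be sketched in a few lines of Python. This is only an illustration, not PANGAEA's actual implementation: the function names build_pangaea_doi and parse_pangaea_doi are hypothetical, and the sketch assumes that the internal ID is a positive integer and that acronym and ID are joined by a dot, as in the example above.

```python
import re

# Prefix assigned to primary data published through the TIB (from the text).
DATA_DOI_PREFIX = "10.1594"

# Assumption: the suffix is "PANGAEA." followed by the numeric internal ID.
# An optional leading "doi:" label is accepted for convenience.
_PANGAEA_DOI = re.compile(r"^(?:doi:)?10\.1594/PANGAEA\.(\d+)$")


def build_pangaea_doi(internal_id: int) -> str:
    """Compose a DOI string from a data set's internal ID (hypothetical helper)."""
    if internal_id <= 0:
        raise ValueError("internal ID must be a positive integer")
    return f"{DATA_DOI_PREFIX}/PANGAEA.{internal_id}"


def parse_pangaea_doi(doi: str) -> int:
    """Return the internal ID encoded in a PANGAEA DOI, or raise ValueError."""
    match = _PANGAEA_DOI.match(doi.strip())
    if match is None:
        raise ValueError(f"not a PANGAEA data DOI: {doi!r}")
    return int(match.group(1))


if __name__ == "__main__":
    print(build_pangaea_doi(738357))
    print(parse_pangaea_doi("doi:10.1594/PANGAEA.82361"))
```

Because the suffix reuses the database key, building and parsing are exact inverses of each other, which is what makes each DOI unique without any extra bookkeeping.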

Citation and DOI are defined in three steps of the data set publication process:

  • registry status "will be registered", then
  • registry status "registration is in the lead time", with DOI registration in progress for 30 days, followed by
  • registry status "registered" after transfer of the DOI to the DOI registry; the DOI can then be resolved globally, e.g. at
  1. If a data set is imported and its status is set to validated, its internal ID can only be resolved as a preliminary DOI through PANGAEA's own DOI resolver. In the citation, the data set is identified as Dataset #738509
  2. If the data set status is set to published, the internal ID is changed to a globally resolvable technical DOI 4 weeks after the last edit, and the data set receives the status citable. In the citation, the data set is identified as Dataset #738509 (DOI registration in progress), changing after 4 weeks to doi:10.1594/PANGAEA.82361, which can be resolved globally.
  3. On request, the data set can be defined as an official data publication and added to the library catalog of the TIB, see citation.
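
The way the identifier in the citation changes with the data set status can be illustrated with a minimal Python sketch. It is an assumption-laden reading of the steps above, not the code PANGAEA runs: the function name citation_identifier, the status strings, and the use of exactly 28 days for the "4 weeks" lead time are all hypothetical choices made for the example.

```python
from datetime import date, timedelta

# Assumption: the "4 weeks after the last edit" lead time is taken as 28 days.
LEAD_TIME = timedelta(weeks=4)


def citation_identifier(internal_id: int, status: str,
                        last_edit: date, today: date) -> str:
    """Return the identifier shown in the citation (hypothetical helper)."""
    if status == "validated":
        # Only resolvable as a preliminary DOI through the local resolver.
        return f"Dataset #{internal_id}"
    if status == "published":
        if today - last_edit < LEAD_TIME:
            # Lead time still running: the DOI is not yet globally registered.
            return f"Dataset #{internal_id} (DOI registration in progress)"
        # Lead time elapsed: the internal ID becomes a global DOI.
        return f"doi:10.1594/PANGAEA.{internal_id}"
    raise ValueError(f"unknown status: {status!r}")


if __name__ == "__main__":
    edited = date(2010, 3, 1)
    print(citation_identifier(738509, "validated", edited, date(2010, 3, 2)))
    print(citation_identifier(738509, "published", edited, date(2010, 3, 10)))
    print(citation_identifier(738509, "published", edited, date(2010, 3, 29)))
```

Measuring the lead time from the last edit means that any later change to the data set restarts the waiting period in this sketch, which matches the wording "4 weeks after the last edit".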

PangaVista and the DOI resolver of PANGAEA can be used for any registered DOI, including preliminary DOIs of Pangaea.

If a registered data set has to be deleted, the link/DOI to its substitute must be entered in the field other version before deletion.

Prerequisites to become an agent for the registration of scientific primary data

Any data provider interested in assigning DOIs to data may use one of the agents listed below or become a new agent of the data DOI agency TIB. When establishing a data system/center, new agents need to assure the following points, defined through a concept and a data policy:

  • Metadata
    • metadata are mandatory and should follow the standards of the scientific field the data cover (e.g. ISO 19115 for geodata)
    • data sets must be accompanied by a citation, consisting of bibliographic fields according to the STD-DOI application profile
  • Access and availability
    • long-term availability must be assured; stable linking is provided by means of a DOI
    • data must be available online, assuring Open Access for metadata; Open Access to data is highly recommended (access restrictions may apply during a moratorium period); data should be provided under a CC license
    • it is highly recommended that data are machine readable, which gives the data in the repository added value. This means that
      • (1) data are provided in a standard technical format (preferably ASCII and ISO formats)
      • (2) data are organized in a way that further processing of any part of the repository can easily be performed (data model, relational database)
    • a full backup of the data repository must be assured
  • Data review and integrity
    • once registered, data sets are static
    • versioning is allowed; different versions should be related to each other
    • data curation must include an editorial process with proofreading by the author/principal investigator (the author is responsible for the scientific quality of the data!)
    • an external peer-review of data publications is recommended

Links to agents for archiving geoscientific primary data with DOI

