Citation

Work in Progress

Best practice of data citation
PANGAEA publishes data in a manner similar to scientific journal publications, so published data sets should be cited in a similar way. A data citation should contain:
 * the authors (creators)
 * the publication year
 * the dataset title
 * the publisher
 * a unique persistent identifier (e.g. a DOI)

The full data citation of each referenced data set should be included in the reference list of any publication citing the data. For the general structure, we follow the DataCite recommendations:

''Creator (PublicationYear): Title. Publisher (PANGAEA). Identifier (DOI)''

On the landing page of each data set, the suggested citation is displayed at the top, for example:

"Timofeeva, Anna; Smolyanitsky, Vasily; Bessonov, Vladimir; Petrovskiy, Tomash (2020): Special sea ice observations aboard Akademik Fedorov MOSAiC leg 1, 2019-09-25 to 2019-10-20. PANGAEA, https://doi.org/10.1594/PANGAEA.912021"

The citation can be copied or exported in the preferred format using the copy or export buttons below the title. If the data publication is not related to a journal article, the institution can be included as a source of the data. It then appears in the suggested citation:

''Creator (PublicationYear): Title. Institution. Publisher (PANGAEA). Identifier (DOI)''

As an example, see here:

"Burkhardt, Elke (2020): Whale sightings during Polarstern cruise PS95.1 (ANT-XXXI/1.1). Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Bremerhaven, PANGAEA, https://doi.org/10.1594/PANGAEA.924050"
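
The two citation patterns above can be assembled programmatically. The following is a minimal sketch, not an official PANGAEA tool: the function name `format_citation` and its parameters are hypothetical, and the punctuation is inferred from the two example citations shown on this page.

```python
from typing import Optional, Sequence


def format_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    dataset_id: int,
    institution: Optional[str] = None,
    publisher: str = "PANGAEA",
) -> str:
    """Build a citation string following the pattern
    'Creator (PublicationYear): Title. [Institution,] Publisher, DOI'.

    Hypothetical helper: the layout mirrors the examples on this page.
    """
    # Multiple creators are separated by semicolons, as in the examples.
    authors = "; ".join(creators)
    # Avoid a doubled full stop if the title already ends with one.
    clean_title = title.rstrip(".")
    parts = [f"{authors} ({year}): {clean_title}."]
    # The institution is optional and only used when the data
    # publication is not related to a journal article.
    if institution:
        parts.append(f"{institution},")
    parts.append(f"{publisher}, https://doi.org/10.1594/PANGAEA.{dataset_id}")
    return " ".join(parts)


if __name__ == "__main__":
    print(
        format_citation(
            ["Burkhardt, Elke"],
            2020,
            "Whale sightings during Polarstern cruise PS95.1 (ANT-XXXI/1.1)",
            924050,
            institution=(
                "Alfred Wegener Institute, Helmholtz Centre for Polar "
                "and Marine Research, Bremerhaven"
            ),
        )
    )
```

In practice, the suggested citation on the landing page remains the authoritative form; a helper like this is only useful for checking a reference list for consistency.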

Where to refer to the data?
As stated above, the full data citation should be contained in the reference list of any work citing the data.

But where in the text should the data be referenced? This depends on the context: how the data is used, and whether it is reused or original data. Generally, the data can be cited in the methods or results section, or in the data availability section if the journal offers one.

For the latter, a suggested wording would be:

"Data for this study were published open access (Authors, YYYY).", followed by the full citation of the dataset in the list of references.

Data sets "in review"
During the archiving and review process, or when a moratorium is set on the data (for example due to the publication status of a connected manuscript), the data is kept in the status "in review". A dataset "in review" might be modified or even deleted during the review process. During this process, the data is displayed on the website with a preliminary link instead of a registered, persistent DOI. This preliminary link can be recognized by the following format:

https://doi.pangaea.de/10.1594/PANGAEA.XXXXXX  (XXXXXX = DataSetID)

It can only be resolved by the PANGAEA DOI resolver. Once the DOI is registered, it takes the form:

https://doi.org/10.1594/PANGAEA.XXXXXX  (XXXXXX = DataSetID)

Only the second form guarantees persistent access and reference to the data. Citation of any data with the status "in review" should be avoided.
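
As an illustration, the two link forms can be told apart with a simple pattern check before a reference list is submitted. This is a sketch under stated assumptions: the function name `link_form` is hypothetical, the DataSetID is assumed to be purely numeric (as in the examples on this page), and the check only inspects the shape of the link. It does not contact any resolver, so it cannot confirm the actual status of a data set.

```python
import re

# Preliminary form, resolvable only by the PANGAEA DOI resolver.
_PRELIMINARY = re.compile(r"^https://doi\.pangaea\.de/10\.1594/PANGAEA\.(\d+)$")
# Registered form, resolvable via doi.org.
_REGISTERED = re.compile(r"^https://doi\.org/10\.1594/PANGAEA\.(\d+)$")


def link_form(url: str) -> str:
    """Return 'registered', 'preliminary', or 'unknown' for a data set link.

    Hypothetical helper: it classifies the link by its format only.
    """
    url = url.strip()
    if _REGISTERED.match(url):
        return "registered"
    if _PRELIMINARY.match(url):
        return "preliminary"
    return "unknown"


if __name__ == "__main__":
    for candidate in (
        "https://doi.org/10.1594/PANGAEA.912021",
        "https://doi.pangaea.de/10.1594/PANGAEA.912021",
    ):
        print(candidate, "->", link_form(candidate))
```

A link reported as "preliminary" is a signal to wait for the registered DOI before citing the data.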

Publication of data in PANGAEA
After technical review by the curator, import, and approval by the author/PI, a dataset is set to the status "published" and appears as citable on the Internet. Upon publication of a data set, the DOI registration is initiated. This process is finalized after 28 days. During this time, the data set can still be modified. Once the DOI registration is finalized, however, the data set can no longer be changed. Any changes to the dataset would be analogous to an erratum of a journal article.
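
The 28-day window can be worked out with plain date arithmetic. The sketch below assumes that the period is counted in calendar days starting on the publication date; the function names are hypothetical, and the exact cut-off used by PANGAEA is not specified on this page.

```python
from datetime import date, timedelta

# Assumption: the registration period is 28 calendar days from publication.
REGISTRATION_PERIOD = timedelta(days=28)


def doi_finalization_date(published_on: date) -> date:
    """Date on which the DOI registration would be finalized."""
    return published_on + REGISTRATION_PERIOD


def is_still_modifiable(published_on: date, today: date) -> bool:
    """True while the data set is inside the 28-day window."""
    return today < doi_finalization_date(published_on)


if __name__ == "__main__":
    published = date(2020, 3, 1)
    print(doi_finalization_date(published))                   # 2020-03-29
    print(is_still_modifiable(published, date(2020, 3, 15)))  # True
```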

Small adjustments, such as the correction of minor mistakes or typos, can still occur. They are displayed as "Change history" metadata, both on the data set landing page (below the parameter overview) and in the downloaded data set. As an example, see here:

Dataset History Change history: 2020-03-25T13:34:53 – Parameter Ice thickness [m] exchanged with Parameter Thickness of ice accretion [cm], no recalculation of values necessary