Processing levels

PANGAEA Data Processing Levels

PANGAEA data processing levels describe the degree to which the archived version of a dataset has been processed since its creation. Institutes, research and data centers, and research projects commonly define their own data processing levels, usually tailored to their particular data types. These typically describe the gradual transition from raw data to processed data and data products. They do NOT describe data quality or scientific significance.

PANGAEA uses a generic nomenclature, suitable for heterogeneous scientific subjects and data types, to describe the processing level in a machine-readable form. For some data types, the PANGAEA data processing levels may be less suitable than field-specific, specialized nomenclatures.

Authors should suggest processing levels according to the PANGAEA definition during data submission; PANGAEA data editors may add or modify them during data curation. When a single submission contains data of several PANGAEA processing levels (e.g. Binary_object), the highest of those levels is assigned to the dataset (or the submission may be split into several datasets). PANGAEA's data processing levels help make data archived in PANGAEA as FAIR as possible according to the FAIR principles (Wilkinson et al., 2016).
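The assignment rule for mixed submissions can be sketched in Python. This is only an illustration: `ProcessingLevel` and `dataset_level` are hypothetical names, not part of any PANGAEA software, and the sketch assumes that the highest level among the submitted parts is the one assigned to the dataset.

```python
from enum import IntEnum
from typing import Iterable


class ProcessingLevel(IntEnum):
    """Hypothetical encoding of the PANGAEA data processing levels."""
    LEVEL_0 = 0  # raw data without metadata (not part of PANGAEA)
    LEVEL_1 = 1  # raw data with metadata
    LEVEL_2 = 2  # processed data, outliers flagged
    LEVEL_3 = 3  # processed data, outliers removed
    LEVEL_4 = 4  # compilation of multiple data sets


def dataset_level(part_levels: Iterable[int]) -> ProcessingLevel:
    """Return the level assigned to a submission whose parts have mixed levels.

    Assumption: the highest level among the parts is assigned.
    """
    levels = [ProcessingLevel(level) for level in part_levels]
    if not levels:
        raise ValueError("a submission needs at least one part")
    return max(levels)
```

For example, `dataset_level([1, 2, 1])` yields `ProcessingLevel.LEVEL_2`. Splitting the submission into several datasets, the alternative mentioned above, is not modelled here.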
PANGAEA provides the following data processing levels:

Definitions

PANGAEA data processing level 0

Further referred to as "Level 0". Level 0 data are not part of the PANGAEA information system.

  • Raw data (unprocessed)
  • Data measured by an instrument or collected by humans
  • Data from one experiment or one series of experiments (e.g. one research campaign / one measurement)
  • Data without any metadata

PANGAEA data processing level 1

Further referred to as "Level 1".

  • Raw data (unprocessed)
  • Data measured by an instrument or collected by humans
  • Data from one experiment or one series of experiments (e.g. one research campaign / one measurement)
  • Metadata added to data of Level 0

(* Possible to restore level 0 from level 1 data)

PANGAEA data processing level 2

Further referred to as "Level 2".

  • Processed data
  • Data processed by humans
  • Data from one experiment or one series of experiments (e.g. one research campaign / one measurement)
  • Outliers are flagged in the data

(* Possible to restore level 1 from level 2 data)

PANGAEA data processing level 3

Further referred to as "Level 3".

  • Processed data
  • Data processed by humans
  • Data from one experiment or one series of experiments (e.g. one research campaign / one measurement)
  • Outliers are removed from the data

(* Not possible to restore level 1 or level 2 from level 3 data)

PANGAEA data processing level 4

Further referred to as "Level 4".

  • Processed data
  • Data processed by humans
  • Data from multiple series of experiments (e.g. multiple research campaigns / multiple measurements)
  • Compilation of multiple data sets (mix of several level 1/2/3 data sets)
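
The restorability notes attached to Levels 1 to 3 can be collected into a small lookup. The following Python sketch is illustrative only: `STATED_RESTORABILITY` and `can_restore` are hypothetical names, and only the statements made explicitly in the definitions above are encoded. Pairs that the definitions do not mention (for example anything involving Level 4) return `None` instead of a guess.

```python
from typing import Dict, Optional, Tuple

# Key: (level of the archived data, level to be restored).
# Values come directly from the notes in the level definitions.
STATED_RESTORABILITY: Dict[Tuple[int, int], bool] = {
    (1, 0): True,   # Level 0 can be restored from Level 1 data
    (2, 1): True,   # Level 1 can be restored from Level 2 data
    (3, 1): False,  # Level 1 cannot be restored from Level 3 data
    (3, 2): False,  # Level 2 cannot be restored from Level 3 data
}


def can_restore(archived_level: int, target_level: int) -> Optional[bool]:
    """Return True or False if the definitions state it, otherwise None."""
    return STATED_RESTORABILITY.get((archived_level, target_level))
```

With this lookup, `can_restore(2, 1)` is `True`, `can_restore(3, 2)` is `False`, and `can_restore(4, 1)` is `None` because the definitions make no statement about it.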


[Figure: Graphical description of Levels 0, 1, 2, 3 and 4]

Mapping PANGAEA processing levels to processing levels used elsewhere

The figure below illustrates an approximate mapping of PANGAEA data processing levels to the levels used in other repositories or data information systems. We cannot accept any liability for the correctness of this information.


Sources:
[1] Earthdata.nasa.gov (2021): Data Processing Levels | Earthdata. Available at: https://earthdata.nasa.gov/collaborate/open-data-services-and-software/data-information-policy/data-levels [Accessed 23 July 2021].
[2] Nsidc.org (2021): Is it 1B, 2, or 3? Definitions of data processing levels | The Drift. Available at: https://nsidc.org/the-drift/2013/08/is-it-1b-2-or-3-definitions-of-data-processing-levels/ [Accessed 23 July 2021].
[3] Community.wmo.int (2021): Data Access and Use | World Meteorological Organization. Available at: https://community.wmo.int/activity-areas/wmo-space-programme-wsp/data-products [Accessed 23 July 2021].
[4] Peters, K & Höck, H (2021): Data processing levels in the Earth System Sciences. DKRZ. Available at: https://cera-www.dkrz.de/docs/PostprocessingLevelDescriptions_for_CERA.pdf [Accessed 23 July 2021].
[5] National Aeronautics and Space Administration (2016): Vocabulary: Processing Levels for Earth Observing System Standard Data Products. NVS. Available at: https://vocab.nerc.ac.uk/collection/E02/current/ [Accessed 23 July 2021].
[6] British Oceanographic Data Centre (2021): Submission templates. Available at: https://www.bodc.ac.uk/submit_data/submission_templates/ [Accessed 23 July 2021].
[7] Ifremer (2021): Processing levels - Oceanographic Data. En.data.ifremer.fr. Available at: http://en.data.ifremer.fr/All-about-data/Data-management/Processing-levels [Accessed 23 July 2021].
[8] Neonscience.org (2021): Data Processing | NSF NEON | Open Data to Understand our Ecosystems. Available at: https://www.neonscience.org/data-samples/data-management/data-processing [Accessed 23 July 2021].
[9] Immerz, A., Frickenhaus, S., von der Gathen, P., Shupe, M., Morris, S., Nicolaus, M., … Rex, Ma. (2019): MOSAiC Data Policy. Zenodo. http://doi.org/10.5281/zenodo.4537178

Examples

The PANGAEA processing level is listed in the dataset metaheader, in the field [Domain attribute(s)]:



  • see DOI example for level 1: Damaske, Daniel; Becker, Marius (2021): Multibeam bathymetry raw data (Kongsberg EM 122 entire dataset) of RV MARIA S. MERIAN during cruise MSM97/2. PANGAEA, https://doi.org/10.1594/PANGAEA.927786
  • see DOI example for level 2: Wintersteller, Paul; Kammann, Janina; Strack, Anne; Geissler, Wolfram H (2021): Multibeam bathymetry processed data (Kongsberg EM 122 entire dataset, MB-System data format) of RV MARIA S. MERIAN during cruise MSM24. PANGAEA, https://doi.org/10.1594/PANGAEA.928269
  • see DOI example for level 3: Wölfl, Anne-Cathrin; Wheeler, Benjamin; Schade, Martin (2018): AtlantOS data products from multibeam EM122 data: Maria S. Merian cruise MSM54 (North Atlantic). PANGAEA, https://doi.org/10.1594/PANGAEA.896605
  • see DOI example for level 4: Grevemeyer, Ingo; Rüpke, Lars H; Morgan, Jason Phipps; Iyer, Karthik; Devey, Colin W (2020): Compilation of swath-mapping bathymetry from oceanic transform fault systems - a global approach. PANGAEA, https://doi.org/10.1594/PANGAEA.924451


References

  • Wilkinson, M. D. et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 doi:10.1038/sdata.2016.18