Talk:Processing levels

PANGAEA Data Processing Levels
PANGAEA processing levels do NOT describe the quality or the scientific significance of any archived data set.

Processing levels (usually a gradual transition from raw to processed data) are frequently used by institutes, research and data centers, and even individual projects to describe data. Because so many organisations define their own levels, and because the many different data types make categorisation even more difficult, each scheme has a slightly different level structure or definition. PANGAEA uses a generic nomenclature, suitable for heterogeneous scientific subjects and data types, to describe the processing level in machine-readable form. We are aware that the PANGAEA processing levels cannot always be applied to all data published in PANGAEA.

Processing levels according to this definition should be suggested by the authors during data submission and may be added or modified by PANGAEA data editors during data curation. When a submission contains data of several PANGAEA processing levels (e.g. Binary_object), then the higher PANGAEA processing level will be assigned to the dataset (or the submission may be split into several datasets). PANGAEA's processing levels help the data archived in PANGAEA become as FAIR as possible according to the FAIR principles (Wilkinson et al., 2016). PANGAEA provides the following data processing levels:

Level: 0 (NOT PART OF PANGAEA DATABASE)

 * Raw data (unprocessed)
 * Data measured from instrument or collected by scientist
 * Data from a series of experiments (e.g. one research campaign / one measurement)
 * Data without any metadata

Level: 1
(* Possible to restore level 0 from level 1 data)
 * Raw data (unprocessed)
 * Data measured from instrument or collected by scientist
 * Data from a series of experiments (e.g. one research campaign / one measurement)
 * Metadata added to data of Level 0

Level: 2
(* Possible to restore level 1 from level 2 data)
 * Processed data
 * Data which are processed by scientists
 * Data from a series of experiments (e.g. one research campaign / one measurement)
 * Outliers in the data are flagged

Level: 3
(* Not possible to restore level 1 or level 2 from level 3 data)
 * Processed data
 * Data which are processed by scientists
 * Data from a series of experiments (e.g. one research campaign / one measurement)
 * Outliers are removed from the data

Level: 4

 * Processed data
 * Data which are processed by scientists
 * Data from multiple series of experiments (e.g. multiple research campaigns / multiple measurements)
 * Compilation of multiple data sets (mix of several level 1/2/3 data sets)
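
As an illustration of what "machine-readable" could mean here, the levels above can be sketched as a small Python enumeration. This is only a minimal sketch, not PANGAEA's actual implementation: the names `ProcessingLevel`, `RESTORABLE_FROM`, `assign_level` and `can_restore` are hypothetical and chosen for this example. It encodes the restorability notes given for levels 1 to 3 and the rule that a submission containing several levels is assigned the higher one.

```python
from enum import IntEnum


class ProcessingLevel(IntEnum):
    """PANGAEA processing levels 0-4 (member names are illustrative only)."""
    RAW_NO_METADATA = 0    # raw data without metadata; not part of the database
    RAW_WITH_METADATA = 1  # raw data with metadata added
    PROCESSED_FLAGGED = 2  # processed data, outliers flagged
    PROCESSED_CLEANED = 3  # processed data, outliers removed
    COMPILATION = 4        # compilation of several level 1/2/3 data sets


# Which lower level can be restored directly from a given level,
# following the notes on levels 1 to 3. The text gives no such note
# for level 4, so it is deliberately left out (treated as "unknown").
RESTORABLE_FROM = {
    ProcessingLevel.RAW_WITH_METADATA: {ProcessingLevel.RAW_NO_METADATA},
    ProcessingLevel.PROCESSED_FLAGGED: {ProcessingLevel.RAW_WITH_METADATA},
    ProcessingLevel.PROCESSED_CLEANED: set(),
}


def can_restore(source, target):
    """Return True if `target` is listed as restorable from `source`."""
    return target in RESTORABLE_FROM.get(source, set())


def assign_level(levels):
    """Return the level for a submission that mixes several levels.

    Implements the "higher level wins" rule; splitting the submission
    into several datasets is the alternative and is not modelled here.
    """
    levels = list(levels)
    if not levels:
        raise ValueError("a submission needs at least one processing level")
    return max(levels)


if __name__ == "__main__":
    mixed = [ProcessingLevel.RAW_WITH_METADATA, ProcessingLevel.PROCESSED_CLEANED]
    print(assign_level(mixed).name)
    print(can_restore(ProcessingLevel.PROCESSED_FLAGGED,
                      ProcessingLevel.RAW_WITH_METADATA))
```

Using an integer-valued enumeration keeps the levels ordered, so "the higher level" is simply the maximum of the levels present in a submission.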

Mapping PANGAEA processing levels to processing levels used elsewhere
The figure below illustrates an approximate mapping of PANGAEA processing levels to the levels used in other repositories or data information systems. We cannot accept any liability for the correctness of this information.

Source:
[1] https://earthdata.nasa.gov/collaborate/open-data-services-and-software/data-information-policy/data-levels
[2] https://nsidc.org/the-drift/2013/08/is-it-1b-2-or-3-definitions-of-data-processing-levels/
[3] https://community.wmo.int/activity-areas/wmo-space-programme-wsp/data-products
[4] https://cera-www.dkrz.de/docs/PostprocessingLevelDescriptions_for_CERA.pdf
[5] https://vocab.nerc.ac.uk/collection/E02/current/
[6] https://www.bodc.ac.uk/submit_data/submission_templates/
[7] http://en.data.ifremer.fr/All-about-data/Data-management/Processing-levels
[8] https://www.neonscience.org/data-samples/data-management/data-processing
[9] http://doi.org/10.5281/zenodo.4537178