Talk:Data submission

=For Authors/Data Submitters (to become the page title)=

=Editorial Criteria and Processes=
under construction

=Authors Guides/Formatting Guides=
The author guides describe how to prepare your metadata and data for submission to PANGAEA. We recommend that you read this guideline before submitting your data. In addition, we recommend that you familiarize yourself with the PANGAEA publishing style by reading about PANGAEA's scope and by searching for and viewing datasets from your research field.

Furthermore, please be aware that by registering with PANGAEA and submitting data to PANGAEA you accept our Terms of Use (https://www.pangaea.de/about/terms.php).

PANGAEA is an international data publisher, and we therefore expect all data and metadata to be written in English.

Preparing dataset metadata: how to fill in the submission form
All data must be submitted using our online tool (https://www.pangaea.de/submit/). Data sent by any other means will not be processed or passed on. For requests concerning a data submission, use our contact form (https://www.pangaea.de/contact/) or, for an existing submission, write a comment in the field provided for this purpose. PLEASE NOTE: Emails or calls related to data submissions will not be answered or processed due to limited resources. Our system will automatically inform you about the status of your data submission (processing step).

Page 1

 * Title: Give a dataset title that briefly describes what was measured and where. The title must be independent of the manuscript/paper title.
 * Authors: List all authors of the dataset. Give full names, no initials. Author names are case-sensitive; do not write last names in full uppercase (format: Doe, Jane). Enter a correct e-mail address for each author, without duplicates. If an author really has no e-mail address, no-reply@pangaea.de can be entered. Fill in the affiliation field using full names, no abbreviations.
 * Keywords: Give keywords here
 * Abstract: Add a dataset abstract that is independent of the manuscript/paper abstract. The abstract contains a concise, method-oriented description of the observation or measurement, namely what, when, where, why and how the data were collected. It should consist of meaningful running text, in the same format as a paper abstract. We expect more than two sentences; ideally the length should not exceed 5000 characters. Avoid interpretation of the data. For further information see: https://wiki.pangaea.de/wiki/Abstract
 * License: Choose the license for your dataset

Page 2

 * References: Add every relevant reference as a full citation, not just a DOI: the paper/manuscript to which the data belong, and any additional references mentioned in the data, methods or abstract. Also add SOPs and AWI-Registry handles/links.

Page 3

 * Projects: Give projects and awards. Please also add the funder’s DOI (can be found here: https://doi.crossref.org/funderNames?mode=list) in the Project website field.

Page 4

 * Upload: Upload your data files here. Please see below how to prepare your data files.
 * More than 20 files: please ask for an upload link in the comment field on page 5. We will reject the submission if you upload more than 20 files here without being asked to do so.
 * Files larger than 100 MB: please ask for an upload link in the comment field on page 5. Each file must be smaller than 15 GB; however, several files can be uploaded simultaneously.
 * For file uploads please name the files without spaces.
 * File description: You can describe your files here. If you have more than one data table/dataset please ideally provide a title and an abstract for each data table/dataset here.
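
The upload rules on this page can be checked before submitting. The following is a minimal sketch and not a PANGAEA tool: the function name `check_upload` is hypothetical, and reading "100 MB" as 100 × 10^6 bytes is an assumption.

```python
MAX_FILES = 20                      # more files require an upload link
MAX_DIRECT_BYTES = 100 * 1000**2    # assumption: 100 MB read as 10^8 bytes


def check_upload(files):
    """Return a list of problems for (file_name, size_in_bytes) pairs."""
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files: ask for an upload link on page 5")
    for name, size in files:
        # File names must not contain spaces (any whitespace is flagged here).
        if any(ch.isspace() for ch in name):
            problems.append(f"{name!r}: file name contains a space")
        if size > MAX_DIRECT_BYTES:
            problems.append(f"{name!r}: larger than 100 MB, ask for an upload link")
    return problems
```

An empty list means that none of the three rules is violated; file sizes can be obtained with `os.path.getsize`.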

Page 5

 * Comment: Field for requesting an upload link or for any other request/comment to the PANGAEA editors
 * Moratorium: Check this box if you need a moratorium and choose the date. If no date is chosen, the default is 6 months.
 * Terms of Use: Please read our ToU (https://www.pangaea.de/about/terms.php) and accept them

Data and their Metadata
PANGAEA publishes data from earth system research in diverse formats (https://wiki.pangaea.de/wiki/Format). Tabular data are the main focus of PANGAEA and should be prepared as TAB-delimited text files (UTF-8 encoding) or in Excel format. Please check out our best practice manuals and templates (https://wiki.pangaea.de/wiki/Best_practice_manuals_and_templates).

Data-Metadata

 * Campaign: Sampling and measurements are carried out during expeditions, field trips or cruises. In PANGAEA this is called a “Campaign” (https://wiki.pangaea.de/wiki/Campaign). Please use our template Campaign_Event or a topic-specific template. Information that should be provided:
 * Campaign_Label, e.g. cruise number
 * Basis, e.g. ship’s name, station, airplane; leave empty if no basis can be given
 * Begin Date(/Time) in ISO format YYYY-MM-DDThh:mm:ss, UTC
 * End Date(/Time) in ISO format YYYY-MM-DDThh:mm:ss, UTC
 * Responsible Scientist
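
The begin and end times must be given as YYYY-MM-DDThh:mm:ss in UTC. A minimal sketch of how local timestamps could be converted and checked with the Python standard library; the function names are hypothetical and only illustrate the format.

```python
from datetime import datetime, timezone, timedelta

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def to_pangaea_timestamp(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss in UTC."""
    if moment.tzinfo is None:
        raise ValueError("timezone missing: cannot convert to UTC safely")
    return moment.astimezone(timezone.utc).strftime(ISO_FORMAT)


def is_pangaea_timestamp(text):
    """Check that text matches YYYY-MM-DDThh:mm:ss exactly."""
    try:
        parsed = datetime.strptime(text, ISO_FORMAT)
    except ValueError:
        return False
    # strptime accepts values without zero padding, so compare the round trip.
    return parsed.strftime(ISO_FORMAT) == text


# Example: a local time recorded at UTC+2 is shifted to UTC before formatting.
local = datetime(2023, 6, 1, 14, 30, 0, tzinfo=timezone(timedelta(hours=2)))
stamp = to_pangaea_timestamp(local)
```

Refusing naive datetimes is a deliberate choice here: without a known time zone the conversion to UTC would be a guess.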


 * Event: The sampling or measurement site/position for field observations, or the sampling position of organisms/water/media used in experiments. For a detailed description see https://wiki.pangaea.de/wiki/Event. Please use our template Campaign_Event or a topic-specific template. Information that should be provided:
 * Event_Label = station, sample point etc. For data from German research vessels please use the official event labels, which can be checked here: https://www.pangaea.de/expeditions/
 * Latitude and Longitude are mandatory event metadata, specified in decimal degrees, WGS84 (latitude: positive for north, negative for south)
 * Elevation (see https://wiki.pangaea.de/wiki/Geocode)
 * Date/Time of sampling/measurement in ISO format YYYY-MM-DDThh:mm:ss, UTC
 * Method/Device
 * Campaign
 * Any other information
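
Positions recorded as degrees, minutes and seconds have to be converted to signed decimal degrees. A minimal sketch with hypothetical function names; treating west as negative and limiting longitude to -180..180 are assumptions based on common practice, since the text above only states the sign convention for latitude.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    # South is negative; west is assumed negative as well.
    return -value if hemisphere in ("S", "W") else value


def is_valid_position(latitude, longitude):
    """Check that a decimal-degree position lies within the assumed ranges."""
    return -90 <= latitude <= 90 and -180 <= longitude <= 180


latitude = dms_to_decimal(53, 30, 0, "N")    # 53.5
longitude = dms_to_decimal(8, 15, 0, "W")    # -8.25
```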

Data Preparation
Structure of tabular data:
 * In PANGAEA tables, the first column contains the Event label, followed by columns with the 3rd geocode (https://wiki.pangaea.de/wiki/Geocode) and/or sample ID and sample information. These are followed by the columns with the variables/parameters. Each value in a row refers to the event in column 1 and the 3rd geocode in column 2.
 * Tables with different structures should be provided as separate data files/sheets
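
The column order described above can be produced with the Python `csv` module. This is only a sketch: the function name `table_to_tsv`, the event labels and the parameter names are made up for illustration and are not PANGAEA vocabulary.

```python
import csv
import io


def table_to_tsv(header, rows):
    """Serialize a header and data rows as TAB-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Column 1: event label, column 2: 3rd geocode, then parameters with units.
header = ["Event", "Depth [m]", "Sample mass [g]"]
rows = [
    ["Station_A", 0.5, 12.31],
    ["Station_A", 1.5, 11.87],
    ["Station_B", 0.5, ""],       # missing measurement: empty cell
]
text = table_to_tsv(header, rows)
```

To store the result, open the file with `open(path, "w", encoding="utf-8", newline="")` so that the output is UTF-8 encoded, as required for text files.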

Dos:
 * All Parameters/Variables must be written out and provided together with their unit.
 * Please write out species names and do not abbreviate the genus name. Whenever possible, check the spelling of species names in WoRMS or an equivalent taxonomic data provider
 * Use English language only for parameters and any text in the data table
 * Numbers in PANGAEA use a dot as decimal separator and no thousands separator
 * Choose the number of decimal places in a scientifically meaningful way. Do not give an unnecessary or unrealistic number of decimal places; be aware that the number of decimal places represents the precision of your measurement
 * For numeric entries, no special characters are allowed, except PANGAEA Quality Flags (https://wiki.pangaea.de/wiki/Quality_flag)
 * Missing measurements are indicated by an empty cell and are NOT filled with '-', 'n/a', 'NaN', -9999 or '*' etc.
 * Measurements below the detection limit are marked with "<" followed by the detection limit
 * Multiple values within a single cell, separated by '-', '±' or '()' (ranges, values with errors, uncertainties, or alternative values in brackets), should be avoided. Use multiple columns instead.
 * Abbreviations in the data tables must be explained
 * Remove empty lines and columns; those will not be imported.
 * Please provide the primary instrument used to measure each variable/parameter in the following format: "Instrument type, Manufacturer, Model name". If you did not use an instrument, please provide the method used instead, in the following format: "Method type according to Reference et al. (YYYY)". Further details on how to provide measurement devices or methods can be found at https://wiki.pangaea.de/wiki/Method
 * For file uploads please name the files without a space
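
Two of the rules above, empty cells for missing values and a dot as decimal separator, lend themselves to an automatic check. A minimal sketch; `cell_problem` is a hypothetical helper, and the marker list only covers the examples named in this guide.

```python
import re

# Markers named in the guide; compared case-insensitively.
MISSING_MARKERS = {"-", "n/a", "nan", "-9999", "*"}
# Cells made only of digits, separators, a sign and an optional "<".
NUMERIC_LIKE = re.compile(r"[<-]*[\d.,]+")


def cell_problem(cell):
    """Return a short problem description for a cell, or None if it looks fine."""
    value = cell.strip()
    if value == "":
        return None  # an empty cell is the correct way to mark a missing value
    if value.lower() in MISSING_MARKERS:
        return "missing-value marker: leave the cell empty"
    if NUMERIC_LIKE.fullmatch(value) and "," in value:
        return "comma in number: use a dot as decimal separator, no thousands separator"
    return None
```

The check only reports problems and does not rewrite values, because a cell such as `1,234` is ambiguous (decimal comma or thousands separator) and should be resolved by the data submitter.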

Don'ts:
 * Do not use macros or active formulas
 * Do not use formatting, color coding, or returns/line breaks in Excel cells
 * Do not use the notes/comment features of Excel
 * Do not include graphs in your Excel sheets
 * Do not fill cells of missing measurements with '-', 'n/a', 'NaN', -9999 or '*' etc.
 * Do not put multiple values in a single cell
 * Do not mix several tables in one sheet, e.g.: Event 1 Depth 1 Parameter 1 -empty column- Event 2 Depth 2 Parameter 2 -empty column- Event 3 Depth 3 Parameter 3 ... (Fig. Donts_mix_tables)
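
A cell that combines a value and its uncertainty, such as "12.3 ± 0.4", can be split into two columns before submission. A minimal sketch with a hypothetical function name; it handles only the "value ± error" pattern and keeps both parts as text so that the original number of decimal places is preserved.

```python
import re

VALUE_ERROR = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*±\s*(\d+(?:\.\d+)?)\s*")


def split_value_error(cell):
    """Split 'value ± error' into a (value, error) pair of strings."""
    match = VALUE_ERROR.fullmatch(cell)
    if match is None:
        raise ValueError(f"not a 'value ± error' cell: {cell!r}")
    return match.group(1), match.group(2)


value, error = split_value_error("12.3 ± 0.4")
```

The two strings would then go into separate columns, for example one for the parameter and one for its standard deviation, each with its own unit.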

Old comments (2023-06-01)

Resources

 * Karl W. Broman & Kara H. Woo (2018) Data Organization in Spreadsheets, The American Statistician, 72:1, 2-10.


 * Elizabeth T. Borer, Eric W. Seabloom, Matthew B. Jones & Mark Schildhauer (2009) Some Simple Guidelines for Effective Data Management, The Bulletin of the Ecological Society of America, 90: 205-214.


 * Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, et al. (2017) Good enough practices in scientific computing. PLOS Computational Biology 13(6): e1005510.

https://nceas.github.io/datateam-training/reference/

https://nceas.github.io/datateam-training/training/

Submission templates
Good examples! http://www.earthchem.org/data/templates

Examples of data publications
For more information on submissions of frequent types of data see best practice manuals and templates.

The examples below give a first impression of which information is required for specific scientific fields. The export formats may differ slightly. Please keep in mind that the export format is produced dynamically by the relational database behind PANGAEA. It is thus NOT required to provide the data submission in exactly the same technical format; the content is the important part of the data submission.
 * Moorings with trap/current meter
 * Vertical oceanographic profile
 * Horizontal profile/ships track
 * Horizontal distribution of irregularly distributed samples
 * Vertical profile
 * Bulk sediment parameter
 * Core logging, Physical properties
 * Hole logging
 * Mineralogy
 * Grain size
 * Pollen
 * Geochemistry
 * Porewater
 * XRF
 * Horizontal profile
 * Ships track data in general
 * Geophysical profile
 * Reflection seismic
 * Refraction seismic
 * Magnetic
 * Gravimetry
 * Profile versus relative distance
 * Speleothem
 * Coral
 * Time series
 * Radiation
 * Biological measurements
 * Binary object (data files in various binary formats)
 * photos, images, graphics
 * seismic profiles in sgy-format
 * models
 * Maps
 * Experiments