Data submission

Video tutorial: How to publish FAIR Data with PANGAEA

Data submissions and technical requests are administered through a Ticket System (Jira issue and project tracking by Atlassian). For each request or submission, an issue (ticket) is created and tracked through the workflow until it is resolved.

NEW to PANGAEA? Learn how to submit data with our video tutorial in just 5 minutes.

READ FIRST

PANGAEA is a publisher for any kind of data from earth system research and thus has no special format requirements for submissions. Data can be submitted in any format of the author's choice and will be converted to the final import and publication format by the PANGAEA editors. The data provider is kindly requested to keep the following points in mind to minimize the preparatory work for the data publication.

  • Position(s) (latitude/longitude in decimal degree, WGS84) must be provided for every sample, observation and measurement carried out anywhere on earth.
  • If data are supplementary to a publication, the (preliminary) citation with journal title and abstract (different from the dataset abstract) must be added.
  • Maximum size for attachments is 100 MB per file. If you have larger files OR more than 20 files, please request an upload link by writing a comment in your submission. Files uploaded via this link must each be smaller than 15 GB, but several files can be uploaded simultaneously.
  • Ideally, provide titles for all your submitted datasets. A dataset title should not be the same as the title of the related publication, but should reflect what was measured, where and when.
  • If data are related to a project, add the project, project acronym and ideally the AWARD number and funder info.
  • Date/Time must be provided in ISO 8601 format (e.g. 1954-04-07T13:34:11) as Coordinated Universal Time (UTC).
  • All submitted measurement variables ("parameters") should be accompanied by a unit.
  • Please provide the primary instrument used to measure each specific variable/parameter, in the following format: "Instrument type, Manufacturer, Model name". If you did not use an instrument, please provide the method used instead, in the following format: "Method type according to Reference et al. (YYYY)". Further details on how to provide measurement devices or methods can be found here.
  • Abbreviations must be explained.
  • Extended documentation may be added as a plain-text or PDF file.
  • Submit data tables as Excel or tab-delimited text files; specific formats (e.g. shape, netCDF, segy ...) may be added as a zip-archive.
  • ... submit via the PANGAEA ticket system
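The date/time and position requirements above can be checked before submission. The following Python sketch is only an illustration: the function names are hypothetical and not part of any PANGAEA tool. It assumes that local timestamps carry an explicit UTC offset and that positions may need converting from degrees/minutes/seconds.

```python
from datetime import datetime, timezone


def to_utc_iso(timestamp_with_offset: str) -> str:
    """Convert an ISO 8601 timestamp with a UTC offset to UTC (YYYY-MM-DDThh:mm:ss)."""
    parsed = datetime.fromisoformat(timestamp_with_offset)
    if parsed.tzinfo is None:
        # Without an offset the local time zone is unknown, so no safe conversion exists.
        raise ValueError("timestamp needs an explicit UTC offset")
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to decimal degrees; S and W become negative."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


def is_valid_position(latitude: float, longitude: float) -> bool:
    """Check that a decimal-degree position lies within the valid value ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


# Example: a timestamp recorded at UTC+2 becomes the UTC value used in the guideline.
print(to_utc_iso("1954-04-07T15:34:11+02:00"))
print(is_valid_position(dms_to_decimal(65, 7, 24.24, "S"), 170.5))
```

The range check only catches impossible values; it cannot tell whether the coordinates are actually referenced to WGS84, which still has to be confirmed by the data provider.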

Additional recommendations

  • Preferred formats for data tables are TAB-delimited text files (UTF-8 encoding), submitted as a ZIP archive, or Excel files.
  • Several tables with different structures should be provided as different sheets.
  • Several tables with identical structures may be provided in one file (one data set below the other, with the event label in the first column).
  • Parameter names (or PANGAEA parameter ID) with the unit belong in the header line of the data table.
  • Use proper events, e.g. as defined during an expedition (if appropriate).
  • Format for positions (lat/long) should be decimal degree (-65.1234) (S and W are negative, WGS84).
  • Provide additional references as full citations, including the DOI; documentation (README, processing reports etc.) should also be provided as PDF (documents will be stored and linked in the dataset).
  • Numeric parameter columns must contain numbers only (for exceptions, see quality flags). DO NOT USE FORMULAS (Excel): cells containing formulas must be saved as numbers prior to submission.
  • If the result of a scientific analysis is zero, the corresponding field in the data table must be filled with 0 (and not left empty).
  • Fields without data should be left empty (and NOT filled with '-', 'n/a', 'NaN', -9999 or '*' etc.).
  • Multiple values separated by '-', '±', '()' (ranges, values with errors, uncertainties, or alternative values in brackets) within a single cell should be avoided. Instead, multiple columns need to be used.
  • Remove empty lines and columns; those will not be imported.
  • Avoid abbreviations.
  • Avoid redundant information.
  • Use standards as far as they are available.
  • Take care to provide proper geocodes.
  • Use the decimal point.
  • Use the English language.
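Several of the recommendations above (empty cells for missing data, no formulas, decimal point, one value per cell) can be checked automatically before submission. The following Python sketch is an illustration only, not a PANGAEA tool: the function names, the messages and the column names in the example are made up.

```python
import csv
import io
import re
from typing import List, Optional, Tuple

# Placeholders that should not be used for missing values (compared in lower case).
MISSING_PLACEHOLDERS = {"-", "n/a", "nan", "-9999", "*"}

# A single number written with a decimal point, optionally in exponent notation.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def check_numeric_cell(cell: str) -> Optional[str]:
    """Return a short problem description, or None if the cell is acceptable."""
    text = cell.strip()
    if text == "":
        return None  # an empty cell is the requested way to mark "no data"
    if text.lower() in MISSING_PLACEHOLDERS:
        return "placeholder for a missing value; leave the cell empty"
    if text.startswith("="):
        return "formula; save the cell as a number"
    if re.fullmatch(r"[+-]?\d+,\d+", text):
        return "decimal comma; use the decimal point"
    if NUMBER.fullmatch(text):
        return None
    return "not a single number; use separate columns for multiple values"


def find_problems(tsv_text: str, numeric_columns: List[str]) -> List[Tuple[int, str, str]]:
    """Check the named columns of a tab-delimited table; return (line, column, problem)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header line
        for column in numeric_columns:
            issue = check_numeric_cell(row.get(column) or "")
            if issue:
                problems.append((line_number, column, issue))
    return problems


# Example table with illustrative column names: two cells need fixing.
table = "Event\tDepth [m]\tTemperature [deg C]\nST1\t10\t3,5\nST1\t20\tn/a\nST1\t30\t\n"
for problem in find_problems(table, ["Depth [m]", "Temperature [deg C]"]):
    print(problem)
```

The check is deliberately strict: a value such as -9999 is always reported as a placeholder, so a genuine measurement of that value would need manual confirmation.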

Data publication workflow

Workflow overview of a data publication

The workflow for a data publication from source to publication is similar to the submission > review > editorial > publication flow established in scientific literature. The editorial process is coordinated by the editor-in-chief and the data editors. The workflow and communication of each data submission is documented through a Ticket System.

The workflow is primarily an interaction between the (corresponding) author and the editors and consists of 6 steps:

  1. Data submission - the authors submit their data set and a description of their data set (metadata) via the Ticket System. They follow the submission guidelines and the project or institute data policies.
  2. Editorial review - the editor-in-chief checks the datasets with respect to completeness of the metadata and with respect to the validity/format of the data. A request will be sent to the author if mandatory information is missing. Once the submission is complete and the data set is accepted for publication in PANGAEA, the author is informed. Once the publication process can be started, the editor-in-chief assigns the ticket to the editor in charge.
  3. Data import - during a technical review, existing metadata is checked and, if necessary, additional metadata is added by the editor. Data are reformatted to fit the PANGAEA data model. During this step, if necessary, tables are transposed, combined or divided, columns with metadata are added (e.g. official event labels), etc. Relations between metadata and data are established via the editorial system. After import, the editor performs a final check of the data set (in the "browser view"). This check includes, among other things, a validity check of all external links.
  4. Data set proof - the editor sends the data set link to the author(s), requesting a proofread. The DOI is assigned, but not yet registered ("activated"). The data set status is "in review" at this stage.
  5. Corrections - through an iterative process between author and editor, the data set is edited until the final approval by the author.
  6. Publication - the data set status is set to "published"; the DOI will be activated 4 weeks after the final editing and is then part of the official data set citation. Upon request of the author, a password protection may be set for a moratorium period or until the related paper is published. A temporary access link with an expiry date can be granted upon request of the author. Such a link can be used to share the data with individuals or groups, for example for co-authors or anonymous reviewers.

Timing of data publication

If data (a data set or a data collection) are submitted as a supplement to an article, the publication processes of the article and the related data should be synchronized. The data should therefore be submitted to PANGAEA at the time of the paper submission, or ideally even earlier. Password protection for the data can remain in place until the related article is accepted/published.

Costs

The basic operation is covered by public funding, but in order to ensure high quality in processing and archiving new data, PANGAEA receives additional funds. If data are submitted as part of a project for which publication funding is available, PANGAEA would appreciate a financial contribution of 500 € (net) per data submission (e.g. as part of the costs for Open Access publications at the DFG). Other forms of funded collaboration can be negotiated. Please contact us for further information and invoicing.

Best practice manuals and templates

For more information on submissions of frequent types of data see best practice manuals and templates.