Data set

A data set is a collection of data (often from one event) that is organized as a matrix and usually compiled in a scientific context. Data in PANGAEA are organized in predefined data sets that closely resemble the original files uploaded by the author (e.g. one table in one Excel sheet).

The granularity of a data set depends on the type of data and the number of data points, and is primarily the decision of the data author. A data set may contain one or more data series (parameters). Two or more data sets may be grouped into one parent set. Access restrictions can be defined for a complete data set only. Each data set consists of the data and the metadata according to ISO standard fields (ISO 19115). A data set appears on the Internet with a metaheader which contains the information as described below.

In principle, a PANGAEA data set can have an unlimited number of columns and lines (for comparison, Excel 2003: 65,536 lines x 256 columns; Excel 2008: >1 million lines x 16,384 columns). Examples:
 * 17 columns
 * 2,000,000+ lines
 * 22,600,000+ lines (in ASCII: 551 MB; in ASE + index: 2.2 GB; export from IQ + DOI: 1.44 GB) (Fig. 2)
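
The structure described above (a matrix whose columns are parameters, accompanied by metadata, optionally grouped under a parent set) can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names `DataSet` and `ParentSet`, their fields, and the sample parameter names are hypothetical and do not reflect PANGAEA's internal data model or any PANGAEA API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataSet:
    """Hypothetical model of a data set: a data matrix plus metadata."""
    title: str
    parameters: List[str]  # column headers, one per data series
    rows: List[List[float]] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    restricted: bool = False  # access applies to the complete data set only

    def add_row(self, values: List[float]) -> None:
        # Each line must supply exactly one value per parameter (matrix shape).
        if len(values) != len(self.parameters):
            raise ValueError("row length must match the number of parameters")
        self.rows.append(values)

    def series(self, parameter: str) -> List[float]:
        # A data series is one column of the matrix.
        i = self.parameters.index(parameter)
        return [row[i] for row in self.rows]


@dataclass
class ParentSet:
    """Hypothetical grouping of two or more data sets."""
    title: str
    children: List[DataSet] = field(default_factory=list)
```

For example, a data set created with `DataSet("Profile", ["Depth", "Temperature"])` accepts lines such as `[1.0, 4.2]` through `add_row`, and `series("Temperature")` returns that column. The access flag sits on the data set as a whole, mirroring the rule that restrictions cannot be set for individual columns.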

Deleting/Updating PANGAEA data

The DOI of a data set is registered 30 days after its publication (this does not apply to data sets with the status "in review"). Before DOI registration, a data set can be deleted without problems. After registration, it can no longer be deleted. However, a new version of the data set, carrying a new DOI, can be uploaded, and the old version can be "hidden" from the search (it can still be found using its DOI link).
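
The deletion rule above can be expressed as a short Python sketch. It is a reading of the rule under assumptions, not PANGAEA's actual implementation: the function names are hypothetical, any status string other than "in review" is an invented placeholder, and treating day 30 itself as already registered is an assumption.

```python
from datetime import date, timedelta

# From the text: the DOI is registered 30 days after publication.
DOI_DELAY = timedelta(days=30)


def doi_registered(published: date, status: str, today: date) -> bool:
    # Data sets with the status "in review" are not registered.
    if status == "in review":
        return False
    # Assumption: registration counts from day 30 onward, inclusive.
    return today - published >= DOI_DELAY


def can_delete(published: date, status: str, today: date) -> bool:
    # Deletion is only possible before the DOI has been registered.
    return not doi_registered(published, status, today)
```

Once `can_delete` returns `False`, the remaining option described in the text is to upload a new version with a new DOI and hide the old version from the search.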