Data set

A data set is a collection of scientific data, often from a single event, organized in one matrix. Data in Pangaea are organized in predefined data sets that closely resemble the original files uploaded to and exported from the archive.

The granularity of a data set depends on the type of data and the number of data points, and is primarily decided by the data author. In principle, a Pangaea data set can have an unlimited number of columns and lines (for comparison, Excel 2003 is limited to 65,536 lines x 256 columns and Excel 2008 to more than 1 million lines x 16,384 columns). Examples:
 * 507 columns
 * 2,000,000+ lines
 * 22,600,000+ lines (in ASCII: 551 MB; in ASE +index: 2.2 GB; export from IQ +DOI: 1.44 GB) (Fig. 2)

A data set may contain one to many data series. Two to many data sets may be grouped into one parent set. Access rights can be defined for a complete data set only. Each data set consists of the data accompanied by metadata according to ISO standard fields (ISO 19115). A data set appears on the Internet with a metaheader which contains the information as described below.
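
The structure described in this section (one matrix of data series, optional grouping under a parent set, access rights per data set) can be pictured with a small sketch. This is only an illustration and not the Pangaea data model: the class names DataSeries and DataSet, and the fields title, series, children and access, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DataSeries:
    """One column of the matrix: a parameter name plus its values (hypothetical)."""
    parameter: str
    values: List[Optional[float]]


@dataclass
class DataSet:
    """Hypothetical model of one data set: a matrix of data series plus grouping."""
    title: str
    series: List[DataSeries] = field(default_factory=list)
    # Non-empty only for a parent set that groups two or more data sets.
    children: List["DataSet"] = field(default_factory=list)
    # Access is defined for the complete data set, never per column or line.
    access: str = "public"

    def shape(self) -> Tuple[int, int]:
        """Return (lines, columns) of the matrix."""
        n_columns = len(self.series)
        n_lines = max((len(s.values) for s in self.series), default=0)
        return (n_lines, n_columns)
```

For example, a data set built from a depth series and a temperature series with three values each has the shape (3, 2), and two such data sets could be passed as children of a third DataSet that acts as the parent set.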

Deleting/Updating PANGAEA data
A permanent DOI is attached to a data set 30 days after the import (only for data sets with the status "published" or "published & citable"). Before that, a data set can be deleted without problems. Afterwards, it can be deleted only if a new version of the data set, carrying a new DOI, exists.
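
The deletion rule can be summarized as a short decision function. This is a minimal sketch, not code from the archive: the function names, the status strings and the newer_version_exists flag are hypothetical. It also assumes that the DOI counts as attached from day 30 onwards, and that a data set with any other status never receives a DOI and therefore stays deletable.

```python
from datetime import date, timedelta

# From the text: the permanent DOI is attached 30 days after the import.
DOI_DELAY = timedelta(days=30)

# Assumed spelling of the two status values that receive a DOI.
DOI_STATUSES = {"published", "published & citable"}


def has_permanent_doi(import_date: date, status: str, today: date) -> bool:
    """True once a data set with a DOI-eligible status is at least 30 days old."""
    return status in DOI_STATUSES and (today - import_date) >= DOI_DELAY


def can_delete(import_date: date, status: str, today: date,
               newer_version_exists: bool) -> bool:
    """Apply the rule: free deletion before the DOI, afterwards only if a
    new version (carrying its own DOI) exists."""
    if not has_permanent_doi(import_date, status, today):
        return True
    return newer_version_exists
```

With an import on 1 January, a published data set could still be deleted on 15 January, but on 1 March it could only be deleted if a newer version had been imported.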