Qualiservice Data Model

This page describes the Qualiservice metadata model.

As it was created as an extension of the PANGAEA data model, all structures and names are taken from it. The two models have similar applications but differ in the type of data described and in how that data is accessed: whereas PANGAEA presents the data and its metadata openly in its portal ( https://www.pangaea.de/ ), Qualiservice uses this data model only to represent and describe the data (metadata), not to exchange or display the data itself.

We therefore speak here of a Metadata Model for Qualiservice.

The PANGAEA Data Model runs on a relational database (PostgreSQL) and is expressed more technically as an XML-Schema in https://ws.pangaea.de/schemas/pangaea/MetaData.xsd

Main tables
Note: The names used by Qualiservice are shown here alongside the original table and module names, in the form Original name|Qualiservice name.

The metadata model consists of four main modules (Project, Campaign|Study, Event, Dataset|Collection of Data) and supporting tables with supplemental information. The data object metadata (metadata about single interviews or cases) are organized in Data Series|Micrometadata (this micrometadata was referred to as Interview-Metadata in the first phase of the Qualiservice project; see Betancort & Haake, 2014).
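The naming convention above can be illustrated with a small lookup table. This is only a sketch: the name pairs are taken from this page, while the dictionary QUALISERVICE_NAMES and the helper display_label are hypothetical names introduced for illustration and are not part of the PANGAEA schema.

```python
# Original PANGAEA names mapped to the names used by Qualiservice.
# The pairs come from this page; the identifiers are illustrative only.
QUALISERVICE_NAMES = {
    "Project": "Project",
    "Campaign": "Study",
    "Event": "Event",
    "Dataset": "Collection of Data",
    "Data Series": "Micrometadata",
}


def display_label(original_name: str) -> str:
    """Return the label in the form Original name|Qualiservice name.

    Names that Qualiservice uses unchanged are returned without the bar.
    """
    qualiservice_name = QUALISERVICE_NAMES[original_name]
    if qualiservice_name == original_name:
        return original_name
    return f"{original_name}|{qualiservice_name}"
```

For example, display_label("Campaign") yields the heading form used further down this page, while display_label("Event") returns the name unchanged.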

As the PANGAEA data model is generic, it could be reused by Qualiservice, increasing the interoperability and findability of the data collections shared and archived by Qualiservice: "The hierarchy of the four main tables follows the steps in science for gathering analytical data: within a PROJECT different CAMPAIGNs are executed to get samples for investigations or to make measurements at distinct locations (EVENT). The result of the investigations are analytical data, organized in Data Series, grouped in DATASETS" In the case of Qualiservice, this last point about Data Series and Datasets refers only to MICROMETADATA grouped in COLLECTIONS OF DATA.
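The hierarchy described in the quotation can be sketched as nested record types. This is a minimal illustration, not the PANGAEA schema: all class and field names are assumptions, and each dataset is nested under a single event for brevity, although a dataset may draw on several data collection events.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataSeries:
    """Data Series|Micrometadata: metadata about one interview or case."""
    label: str


@dataclass
class Dataset:
    """Dataset|Collection of Data: groups the micrometadata entries."""
    title: str
    data_series: List[DataSeries] = field(default_factory=list)


@dataclass
class Event:
    """Event by which data was collected, transformed or analyzed."""
    label: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Campaign:
    """Campaign|Study: executed within a project."""
    name: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Project:
    """Uppermost level of the hierarchy."""
    name: str
    campaigns: List[Campaign] = field(default_factory=list)


def count_data_series(project: Project) -> int:
    """Walk Project > Campaign > Event > Dataset and count the data series."""
    return sum(
        len(dataset.data_series)
        for campaign in project.campaigns
        for event in campaign.events
        for dataset in event.datasets
    )


# Hypothetical example content, invented purely for illustration.
example_project = Project(
    name="Example project",
    campaigns=[
        Campaign(
            name="Example study",
            events=[
                Event(
                    label="Interview round 1",
                    datasets=[
                        Dataset(
                            title="Example collection of data",
                            data_series=[
                                DataSeries("Interview 01"),
                                DataSeries("Interview 02"),
                            ],
                        )
                    ],
                )
            ],
        )
    ],
)
```

Walking the structure from the top, as count_data_series does, mirrors the order in which the four main tables are described in the sections that follow.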

Project
The PROJECT table is the uppermost level in the data model, used to define big research projects like Collaborative Research Centres or Clusters of Excellence.

Details of the project framework and its funding are included.

Fields
Required fields for the project definition are (mandatory in bold):

Campaign|Study
This module includes the study metadata.

Fields
Required fields for the campaign definition are (mandatory in bold):

Event
This table includes information about the events by which the data was collected, transformed (transcription, anonymization) or analyzed.

Fields
Required fields for the event definition are (mandatory in bold):

Dataset
A dataset describes a collection of data (from one or several data collection events) whose metadata are organized in a data frame (matrix) and which is mostly put together in a scientific context. The collections of data follow the archival principles of provenance and integrity as far as possible, grouping the data according to the entity by which they were created or the activity from which they result.

If the data collector or data provider submits the data to Qualiservice in their own logical order, this structure should be maintained to reflect the context and structure in which they were created, used or transferred. Thus, some data collections will be arranged in geographical groups, others in temporal groups, some in methodological groups or in administrative groups (e.g. data from the whole study or from a single round).

The dataset is the central entity of the model and is therefore associated with a persistent identifier (Digital Object Identifier or DOI) for unique identification, citation, and long-term location of the data.

A parent set bundles two or more related datasets for a certain reason, e.g. to make them citable through a single citation (the supplement to a publication consists of more than one set, for example in the case of a dissertation), or because a number of datasets are defined by the PI as a citable entity. The children are either independent (for example in the case of time series) or dependent (if the usability or comprehensiveness of the individual datasets is only ensured by supplying all datasets as a package).
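The distinction between independent and dependent children can be sketched as follows. This is only an illustration under stated assumptions: the class names ChildDataset and ParentSet, the flag children_independent and the method required_for are hypothetical and do not come from the PANGAEA schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChildDataset:
    """A dataset that is bundled into a parent set (illustrative only)."""
    title: str


@dataclass
class ParentSet:
    """Bundles two or more related datasets into one citable entity."""
    title: str
    children: List[ChildDataset]
    # True: each child is usable on its own (e.g. time series).
    # False: children are only usable when supplied together as a package.
    children_independent: bool = True

    def __post_init__(self) -> None:
        if len(self.children) < 2:
            raise ValueError("a parent set bundles two or more related datasets")

    def required_for(self, child: ChildDataset) -> List[ChildDataset]:
        """Return the datasets that must be supplied for `child` to be usable."""
        if child not in self.children:
            raise ValueError("dataset is not part of this parent set")
        if self.children_independent:
            return [child]
        return list(self.children)
```

With independent children, required_for returns only the requested dataset; with dependent children it returns the whole package.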

Mappings
Mapping note: the dataset is mapped to the Study Unit element of DDI 3.2 instead of Archive, because Archive offers no way to link the archived dataset with its corresponding metadata about data collection and processing event(s). In addition, the analysis unit and the universe of a given collection of data could not be associated with its specific dataset. Nevertheless, the archival characteristics (access information, status or classification of the curation and reuse level, completeness...) of a collection of data or dataset published by Qualiservice can be expressed partly via the Collection element in DDI 3.2.

A parent set is expressed as a SubGroup (collection of datasets) to bundle all datasets of a study/round/version together.

Fields
Required fields for the dataset definition are (mandatory in bold):