Term catalogue
The PANGAEA terminology catalogue (synonym: feature catalogue) is a thesaurus-like “construction kit” that is integrated into PANGAEA's internal relational data management system and enables the use of controlled vocabularies in PANGAEA. It consists of relationships, concepts and terminologies that can be associated with various components of the PANGAEA data model, such as data set parameters and their units, methods and devices, or event locations.
The use of terminologies enriches the semantic annotation of parameters and metadata: term concepts, including synonyms and hierarchies, are added (“is a broader term of”); terms from different terminologies are mapped to each other (“is same as”); and the harmonization and consistency of archived data are improved (e.g. by avoiding duplicates). This makes the data more reliable and interoperable and facilitates comprehensive data retrieval. Search benefits in three ways: data sets are automatically assigned to PANGAEA “topics”, users can search for synonyms or broader terms of the concepts used in data sets, and search results can be filtered ("faceting") by means of the broader-term relations. As a result, data sets can be found even if the search terms do not exactly match the metadata of a data set.
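The effect of synonyms and broader-term relations on search can be illustrated with a minimal Python sketch. It is not PANGAEA's implementation: the `Term` record, the helper functions and the three example terms are all hypothetical, and they only show how a query can match a data set whose metadata uses a different but related term.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical term record; not PANGAEA's actual schema."""
    name: str
    synonyms: set[str] = field(default_factory=set)
    broader: set[str] = field(default_factory=set)  # names of direct broader terms


def build_index(terms):
    """Map every lower-cased name and synonym to its canonical term name."""
    index = {}
    for term in terms:
        index[term.name.lower()] = term.name
        for synonym in term.synonyms:
            index[synonym.lower()] = term.name
    return index


def broader_closure(name, by_name):
    """Return all broader terms of `name`, following the relation transitively."""
    seen = set()
    stack = [name]
    while stack:
        term = by_name.get(stack.pop())
        if term is None:
            continue  # unknown term: nothing to follow
        for parent in term.broader:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def matches(query, dataset_terms, terms):
    """True if the query resolves (by name or synonym) to a term used in the
    data set, or to a broader term of one of the terms used in the data set."""
    by_name = {t.name: t for t in terms}
    target = build_index(terms).get(query.lower())
    if target is None:
        return False
    return any(
        used == target or target in broader_closure(used, by_name)
        for used in dataset_terms
    )


# Illustrative terms only; they are not taken from the PANGAEA catalogue.
TERMS = [
    Term("Chemistry"),
    Term("Nutrients", broader={"Chemistry"}),
    Term("Nitrate", synonyms={"NO3"}, broader={"Nutrients"}),
]
```

In this sketch a data set annotated with "Nitrate" is found by the queries "NO3" (synonym) and "nutrients" (broader term), whereas a data set annotated only with "Nutrients" is not found by the narrower query "Nitrate".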
The PANGAEA terminology catalogue comprises distinct PANGAEA terminologies (e.g. “Keywords, PANGAEA”, “Methods and Devices, PANGAEA”) that structure terms for which no suitable external terminology service is available. In addition, PANGAEA contains terms from a number of external terminologies, which are listed below. To address limitations in mapping metadata to other formats, the terminology catalogue allows PANGAEA's own terms to be mapped to those of external controlled vocabularies (using an “equivalent to” relation). With WoRMS and ChEBI, PANGAEA has for some time maintained a bidirectional workflow that includes the submission of new species names and regular downloads.
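A mapping of this kind can be pictured as a small lookup table, sketched below in Python. The `Mapping` record, the `to_external` helper, the vocabulary label and the identifier `EXT:0001` are placeholders invented for illustration; they do not reflect PANGAEA's data model or real external identifiers.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    """Hypothetical mapping record linking a local term to an external one."""
    local_term: str
    relation: str
    vocabulary: str
    external_id: str


# Placeholder data: the vocabulary label and identifier are not real.
MAPPINGS = [
    Mapping("Example term", "equivalent to", "ExampleVocabulary", "EXT:0001"),
]


def to_external(local_term: str, vocabulary: str,
                mappings=MAPPINGS) -> Optional[str]:
    """Return the external identifier mapped to `local_term` by an
    "equivalent to" relation in `vocabulary`, or None if there is none."""
    for m in mappings:
        if (m.local_term == local_term
                and m.vocabulary == vocabulary
                and m.relation == "equivalent to"):
            return m.external_id
    return None
```

When metadata are exported to another format, a lookup like this would allow the external identifier to be written where a mapping exists, and the local term to be kept otherwise.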
List of implemented terminology services
Taxonomy: World Register of Marine Species (WoRMS), including information on algae from AlgaeBase, which is redistributed by WoRMS with permission; Integrated Taxonomic Information System (ITIS)
Measurements: Quantities, Units, Dimensions and Types (QUDT)
Environmental features: The Environment Ontology (EnvO)
Chemistry: Chemical Entities of Biological Interest (ChEBI; bidirectional workflow), PubChem
Current activities
Work is currently underway to map PANGAEA method/device terms to terms from the BODC terminology.
References
British Oceanographic Data Centre (2023) The NERC Vocabulary Server, Natural Environment Research Council. https://vocab.nerc.ac.uk
Buttigieg PL, Morrison N, Smith B, et al. (2013) The environment ontology: contextualising biological and biomedical entities. Journal of Biomedical Semantics;4:43. https://doi.org/10.1186/2041-1480-4-43
FAIRsharing.org (2023) QUDT: Quantities, Units, Dimensions and Types. https://doi.org/10.25504/FAIRsharing.d3pqw7
Guiry MD & Guiry GM (2023) AlgaeBase. World-wide electronic publication, National University of Ireland, Galway. https://www.algaebase.org
Hastings J, Owen G, Dekker A, et al. (2016) ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Research;44(D1):D1214–D1219. https://doi.org/10.1093/nar/gkv1031
Integrated Taxonomic Information System (ITIS) (2023) www.itis.gov, CC0, https://doi.org/10.5066/F7KH0KBK
Kim S, Chen J, Cheng T, et al. (2023) PubChem 2023 update. Nucleic Acids Research;51(D1):D1373–D1380. https://doi.org/10.1093/nar/gkac956
WoRMS Editorial Board (2023) World Register of Marine Species. Available from https://www.marinespecies.org at VLIZ. https://doi.org/10.14284/170