Event-Campaign-Merge

From PANGAEA Wiki

Modernizing the PANGAEA Event Data Structure

To better support the evolving needs of the global research community across all disciplines, we are implementing a significant update to our database architecture. The centerpiece of this modernization is the "Event-Campaign Merge," which changes the metadata structure for data collection events from a rigid two-level hierarchy into a flexible, multi-level tree structure.

Why is the structure changing?

Until now, PANGAEA was built on a fixed two-level hierarchy (Campaign ⇒ Event). While effective for marine expeditions, this was often too restrictive for other fields of science. The new structure allows for:

  • Interdisciplinary Support: Better representation for Social Sciences (e.g., using "Study" instead of "Campaign") and Lab Experiments.
  • Greater Flexibility: The process of data collection (events) can now be better organized into multiple nested levels, making it easier to represent complex sampling or long-term monitoring.
  • Reduced Redundancy: Streamlining how event metadata is stored and retrieved.

Key Improvements at a Glance:

  • Hierarchical Events: Technically, Campaigns are being integrated as a specific type of "Event." This allows for a more natural "parent-child" relationship between different stages of data collection. A campaign event is the parent of several generic events describing the sampling or data collection.
  • Enhanced Metadata: Events can now directly store information regarding the "Basis" (e.g., a research vessel or a specific instrument) and "Responsible Staff / Principal Investigators" (formerly only available as "chief scientists" in campaigns).
  • Advanced Geolocation: We are introducing a "Location 2" field complementing the already existing "Latitude/Longitude 2" fields. This allows us to accurately map trajectories or transect events by defining both start and end points.
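
The parent-child relationship described above can be pictured with a small in-memory model. This is only an illustrative sketch, not PANGAEA's actual database schema or API: the class name Event, the event_type tag, and all field names and sample values are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Event:
    """Hypothetical model of one node in the event tree (not PANGAEA's schema)."""
    label: str
    event_type: str = "generic"          # assumed tag; "campaign" marks a campaign event
    basis: Optional[str] = None          # e.g. a research vessel or an instrument
    responsible: List[str] = field(default_factory=list)
    start: Optional[Tuple[float, float]] = None   # (latitude, longitude)
    end: Optional[Tuple[float, float]] = None     # (latitude 2, longitude 2)
    parent: Optional["Event"] = field(default=None, repr=False, compare=False)
    children: List["Event"] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: "Event") -> "Event":
        """Attach a child event and return it, so calls can be nested."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Labels from the root event down to this event."""
        labels = []
        node: Optional[Event] = self
        while node is not None:
            labels.append(node.label)
            node = node.parent
        return list(reversed(labels))

    def is_transect(self) -> bool:
        """True when distinct start and end points are both defined."""
        return self.start is not None and self.end is not None and self.start != self.end


# Made-up sample data: a campaign event with two levels of nested events.
campaign = Event("EXAMPLE-CRUISE", event_type="campaign", basis="Example Vessel")
station = campaign.add_child(Event("EXAMPLE-CRUISE_1", start=(54.0, 8.0)))
track = station.add_child(
    Event("EXAMPLE-CRUISE_1-1", start=(54.0, 8.0), end=(54.5, 8.5))
)

print(track.path())
print(station.is_transect(), track.is_transect())
```

In this sketch a campaign is simply an event with a different type tag, so any number of nested levels can be represented with the same structure.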

What to Expect During the Transition

We are entering a transition phase of a few weeks as we migrate our datasets and infrastructure. Depending on how you interact with PANGAEA, this may affect you differently.

For General Web Users and Website Visitors

The impact on the user interface will be minimal. You may notice some minor inconsistencies in labels or metadata display (e.g., how a "Campaign" is titled or nested). However, the search functionality and data access will remain fully operational.

Please be aware that PANGAEA cannot control how event and campaign metadata is displayed on external websites or third-party platforms, such as the Marine Data Portal. These portals use their own code and logic to interpret our data structures. Until these external providers update their systems to align with our new architecture, there may be discrepancies in how our metadata appears on their sites.

The same applies to PANGAEA's own Expeditions web page, which also needs to be adapted but has lower priority. During the transition, newly added campaigns that exist only in the new database structure may not appear there.

For Technical Users, API Consumers, and Data Harvesters

If you rely on PANGAEA’s XML metadata, OAI-PMH, or other API services, please take note of the following critical technical details:

  • Schema Validation: While we have prioritized backwards compatibility, the transition involves changing XML element names (campaign names get labels). During this migration, records will not validate against our XML schema because they will contain a mix of old, redundant elements and new elements. We will separately announce when the XML files exposed by our APIs no longer contain redundant compatibility elements.
  • Harvesting Recommendations: If you are harvesting PANGAEA’s own metadata schema, we recommend temporarily pausing the retrieval of updates during this transition phase to avoid processing inconsistent records.
  • Code and Script Adaptation: API users should prepare their scripts to interpret the new, more flexible hierarchical structure. Ensure that your parsers are resilient and do not fail when encountering new or unknown XML elements. Please adapt to new element names as soon as possible, as the old, redundant elements will be removed.
  • OAI-PMH Consumers: Please verify that your harvesting pipelines can handle these schema shifts without breaking, particularly during the period when element names are in a state of flux.
  • pangaeapy / pangaear Users: These libraries should continue to work without problems for searching and downloading data. We will later adapt pangaeapy to better reflect the new event-campaign structure. pangaear is a third-party product; to our knowledge it is not affected by the event-campaign merge.
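
As an illustration of the parser resilience recommended above, the following minimal sketch reads a value from a preferred element name, falls back to a legacy name, and ignores anything it does not recognize. The element names ("label", "name", "somethingNew"), the namespace, and the sample records are placeholders invented for this example; they are not the actual names in PANGAEA's metadata schema.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Optional


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def first_text(root: ET.Element, candidates: Iterable[str]) -> Optional[str]:
    """Return the text of the first non-empty element matching a candidate name.

    Candidates are tried in priority order, so listing the new name before
    the legacy one prefers the new element when both are present.
    Elements with other names are simply skipped.
    """
    for name in candidates:
        for elem in root.iter():
            if not isinstance(elem.tag, str):
                continue
            text = (elem.text or "").strip()
            if local_name(elem.tag) == name and text:
                return text
    return None


# Placeholder records: one mixing old and new elements, one with only the old one.
MIXED = """<record xmlns="urn:example:metadata">
  <name>OLD-VALUE</name>
  <label>NEW-VALUE</label>
  <somethingNew>not known to this script</somethingNew>
</record>"""

LEGACY_ONLY = """<record xmlns="urn:example:metadata">
  <name>OLD-VALUE</name>
</record>"""

PREFERRED = ["label", "name"]  # hypothetical new name first, legacy name second

mixed_value = first_text(ET.fromstring(MIXED), PREFERRED)
legacy_value = first_text(ET.fromstring(LEGACY_ONLY), PREFERRED)
missing_value = first_text(ET.fromstring("<record/>"), PREFERRED)

print(mixed_value, legacy_value, missing_value)
```

Looking elements up by name rather than by position, and returning None instead of raising when nothing matches, lets a script of this kind keep running while records contain a mix of old and new elements.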

These changes represent a major step forward in making PANGAEA more versatile and future-proof.