Intern:Station lists

Standard operating procedure (SOP) for upload of station lists in PANGAEA:

= Station lists in PANGAEA =
Station lists (or event lists) show all the sampling events per cruise of a research vessel. PANGAEA archives the station lists of the large German research vessels and links research data to these sampling events in a common, standardized format and in one place (Figure 1). In the pilot phase of the DAM data management project, the workflows for creating station lists for PANGAEA were revised and optimized; they are documented here. Some parts of these workflows are still under discussion or may be revised again when new standards apply or the underlying data changes. Station list archiving is part of expedition archiving. Cruise lists and related information can be accessed via https://www.pangaea.de/expeditions/ or, for each of the research vessels, via the following links:
 * Cruise list of the research vessel POLARSTERN


 * Cruise list of the research vessel SONNE
 * Cruise list of the research vessel MERIAN
 * Cruise list of the research vessel METEOR
 * Cruise list of the research vessel POSEIDON

= DShip Action Log =
The DShip system is used on the German research vessels to document stations and device usage. Sampling actions are logged manually, including date/time, latitude/longitude and information about the device. The resulting log file is an “Action Log”. Currently, Action Logs differ between ships in column names, comments, best practices, etc. We use these Action Logs to create the station lists for PANGAEA. In DShip, a log entry is made for every action performed with a device, whereas in PANGAEA we condense this information into one event. The event covers the start and, where applicable, the end of a device usage. That is, if an Action Log contains several lines with the same event label, we pick the start and end information of the device usage to create an event with that label. The time of the event in PANGAEA should always cover the exact time span of sampling, i.e. data acquisition. In DShip, the action “station start” may refer to setup work that does not need to be included in the event time span in PANGAEA. Note: The event should cover the time span in which data could have been obtained.

Action Log Download

 * The Action Log is the table from which you will create the event list / station list
 * For POLARSTERN and HEINCKE access it via the AWI DShip
 * For MERIAN, METEOR and SONNE access it via the BSH DShip
 * User name is “dsu06”, password is “Mintaka”
 * For POSEIDON access it via the BSH DShip as well
 * User name is “dship”, password is “underWay”
 * This is how you get to the download file
 * Choose vessel (left of the page)
 * Click on “ActionLogExtraction”
 * Select "Load user template" at the top and enter as user: mrehage, then select PANGAEA_default if you want to use the semi-automatic processing (Python script) later on. (Old, still valid for Meteor: Under “Report” choose “Events”)
 * Click on small globe symbol and choose the cruise
 * Click on “Next” and follow the steps until step 3, “Format settings” (if you did not use the user template, you additionally have to select the column "Device"; otherwise, the defaults can be used)
 * Click on “Next” and insert a file name “events-cruise” (example: “events-HE538”)
 * Insert a (dummy) user name (remember for download!)
 * Click on “Send order”
 * Click on the “home” button in the upper left corner
 * Click on “Extraction Download”
 * Insert your previously used user name again (upper left corner)
 * Download the file “events-HE538.dat” in the corresponding folder
 * If you want to work with the script later on, you may put the file into a designated folder that will be your working directory (see Get a Python development environment running on your computer)
 * In the case of GEOMAR cruises, the station lists may be imported from DShip into OSIS first and then exported from OSIS to PANGAEA.

Example for making a PANGAEA event out of Device Operations

 * The sampling time span for the bucket, in which data can be collected, is probably very short, so we take the action “in the water” as the source of Date/Time, Elevation, Latitude and Longitude for the event (Figure 2).
 * If the time span for sampling is longer, Latitude, Longitude, Depth (m) and date/time of the first action of the event (e.g. “profile start”) become Latitude, Longitude, Elevation [m] and Date/Time of the event, and those of the last action of the event (e.g. “profile end”) become Latitude2, Longitude2, Elevation2 and Date/Time2.
 * Meaningful start and end data for the events cannot always be extracted easily from the Action Log. Keep in mind that the event can be changed afterwards, for example when data linked to that event is submitted.
 * Devices as logged in DShip might also have different names or abbreviations than in PANGAEA (example: Both the DShip devices “Seismic Source” and “Seismic Towed Receiver” are “Seismic” (SEIS) in PANGAEA).
 * Note: the methods in PANGAEA do not reflect the same granularity as the devices listed in the DShip Action Logs. The pairing of DShip device names with PANGAEA method names, and the actions considered when creating events from the DShip export, are listed in this spreadsheet, which is explained in detail in Mapping table.
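
The condensation described in this list can be sketched in a few lines of Python. This is only a minimal illustration, not the script used for the semi-automatic approach. The row keys (`event`, `action`, `datetime`, `lat`, `lon`, `depth`), the event label and all values are assumptions made up for the example; real Action Logs use ship-specific column names.

```python
def build_event(rows, start_action, end_action=None):
    """Condense the Action Log rows of one event label into a single event.

    rows: chronologically ordered dicts that share one event label
          (the keys used here are assumed, not the real DShip column names).
    start_action, end_action: action names, e.g. taken from the mapping table.
    """

    def first_row_with(action):
        # First row that logs the requested action, or None if it is absent.
        return next((row for row in rows if row["action"] == action), None)

    start = first_row_with(start_action)
    if start is None:
        raise ValueError(f"action {start_action!r} is not logged for this event")

    event = {
        "Event label": start["event"],
        "Date/Time": start["datetime"],
        "Latitude": start["lat"],
        "Longitude": start["lon"],
        # The depth value is copied unchanged; sign conventions are out of scope here.
        "Elevation [m]": start["depth"],
    }

    end = first_row_with(end_action) if end_action else None
    if end is not None:
        event.update({
            "Date/Time2": end["datetime"],
            "Latitude2": end["lat"],
            "Longitude2": end["lon"],
            "Elevation2": end["depth"],
        })
    return event


# Hypothetical rows for illustration only.
profile_rows = [
    {"event": "XX000_1-1", "action": "station start",
     "datetime": "2020-01-01T08:00:00", "lat": 54.0, "lon": 7.0, "depth": 30.0},
    {"event": "XX000_1-1", "action": "profile start",
     "datetime": "2020-01-01T08:10:00", "lat": 54.1, "lon": 7.1, "depth": 31.0},
    {"event": "XX000_1-1", "action": "profile end",
     "datetime": "2020-01-01T09:40:00", "lat": 54.3, "lon": 7.4, "depth": 35.0},
]

long_event = build_event(profile_rows, "profile start", "profile end")
short_event = build_event(profile_rows, "station start")
```

With only a start action (as for the bucket), the event has no Latitude2, Longitude2, Elevation2 or Date/Time2 entries; with a start and an end action, the “station start” row is ignored and the event spans “profile start” to “profile end”.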

= Creating and importing station lists =

Cruises/campaigns in Jira and 4D
Every cruise gets an "Expedition Archiving" ticket in Jira, including a sub-task for the station list. This will be assigned by the curator responsible for distribution, who will also create the campaigns in 4D. See also the documentation on expedition archiving in Jira. In case the responsible curator is not available, but a station list needs to be imported urgently, please consider the following:
 * Check this wiki documentation on how to create expedition archiving tickets, including station list subtasks
 * If the campaign does not yet exist in 4D then import it and consider the following regarding related metadata:
 * Date/Time and other information may differ between sources
 * The most reliable source is the cruise planning (see Additional information and links)
 * Alternatively, you may have a look at the dates in DShip
 * The BSH pages with cruise plans tend not to be up to date in this case

4D Import table
The import table for station lists is basically the Event import form. Figure 4 shows an example of an import table.

Mapping table

 * This table is for matching the device names from DShip with the device names used in PANGAEA
 * It also sets the preference for which actions to choose for start and end of an event per device
 * It needs to be updated continually; new devices need to be inserted and actions need to be chosen
 * We can create versions of this spreadsheet or different ones per ship
 * Access it here
 * Due to differences in complexity and granularity, we currently use this separate spreadsheet for the mapping of devices used during the MOSAiC expedition. The two spreadsheets should be merged at some point.
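
To make the role of the mapping table concrete, here is a minimal sketch of such a lookup in Python. Only the pairing of “Seismic Source” and “Seismic Towed Receiver” with SEIS comes from this page; the dictionary layout, the start/end action names and the function name `map_devices` are assumptions for illustration and do not reproduce the actual spreadsheet.

```python
# Assumed in-memory form of the mapping table: DShip device name ->
# PANGAEA method plus the preferred start/end actions for that device.
MAPPING = {
    "Seismic Source": {
        "method": "SEIS", "start": "profile start", "end": "profile end"},
    "Seismic Towed Receiver": {
        "method": "SEIS", "start": "profile start", "end": "profile end"},
}


def map_devices(device_names, mapping=MAPPING):
    """Split DShip device names into mapped methods and unmapped devices."""
    mapped = {}
    missing = []
    for name in device_names:
        entry = mapping.get(name)
        if entry is None:
            # Collect each unknown device once so the table can be updated.
            if name not in missing:
                missing.append(name)
        else:
            mapped[name] = entry["method"]
    return mapped, missing


# "Hypothetical Device" stands for a device not yet in the mapping table.
mapped, missing = map_devices(
    ["Seismic Source", "Hypothetical Device", "Seismic Towed Receiver",
     "Hypothetical Device"])
```

The `missing` list corresponds to the devices that still have to be added to the mapping table before the station list can be completed.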

Manual approach

 * Open the .dat file (Action Log) with a text editor or spreadsheet program of your choice
 * Remove unnecessary columns (varies from ship to ship)
 * Sort by “Device” and then by “date time”
 * Caution! Sometimes the actions we want to extract are not logged, and others are logged instead. Sometimes actions are missing (“station end”, for example)
 * Which information to take for the event needs to be decided on a case-by-case basis
 * Now go from station to station / event to event and remove the rows we do not need to create PANGAEA events
 * Special case: Underway and track-events
 * Have mapping table on hand (see information on mapping table)
 * You may create a table with the 4D Import table header to put your event information into
 * Some columns need formatting: Date/Time (YYYY-MM-DDThh:mm:ss)
 * Coordinates should always be given in decimal degrees; if not, they need to be converted
 * Comments need to be evaluated manually and carefully
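
The two formatting steps in this list (Date/Time and decimal coordinates) can be done with the Python standard library. The sketch below is an illustration under assumptions: the input timestamp pattern `%Y/%m/%d %H:%M:%S` and the degrees/minutes/hemisphere input are made up for the example, since the Action Logs differ from ship to ship, and the function names are hypothetical.

```python
from datetime import datetime


def to_pangaea_datetime(text, input_format="%Y/%m/%d %H:%M:%S"):
    """Convert a timestamp string to YYYY-MM-DDThh:mm:ss.

    input_format is an assumption; adapt it to the Action Log at hand.
    """
    return datetime.strptime(text, input_format).strftime("%Y-%m-%dT%H:%M:%S")


def to_decimal_degrees(degrees, minutes, hemisphere):
    """Convert degrees and decimal minutes to signed decimal degrees.

    Southern latitudes and western longitudes become negative.
    """
    value = degrees + minutes / 60.0
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return round(value, 6)


example_time = to_pangaea_datetime("2019/09/01 12:30:05")
example_lat = to_decimal_degrees(54, 30.0, "N")
example_lon = to_decimal_degrees(7, 15.0, "W")
```

`strptime` raises a `ValueError` if a timestamp does not match the expected pattern, which helps to spot rows that need manual attention.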

Semi-automatic approach

 * All scripts and example files used in the following chapter can be found in this folder (access to PANGAEA Google Drive folder needed)
 * Roland Koppe (AWI) wrote the script featuring the import of information from SENSOR
 * This information needs to be included for most/all/some research vessels in the future
 * As there is no designated place to insert the information in the event and most Action Logs do not yet feature information on SENSOR (Link, ID, etc) nor are all devices recorded in SENSOR, Marianne Kunkel modified the script so it runs for the other vessels without problems
 * However, we have to keep an eye on this and reintroduce it whenever necessary
 * Judith Weber (MARUM) extended it further to include coordinate conversion and the creation of an output file listing devices that are missing in the mapping table

AWI JupyterHub

 * If you have an AWI account, you do not need to install a Python environment
 * Go to https://gitlab.awi.de/software-engineering/de.pangaea.tools.eventlist/
 * Download “dship_sensor_event_list_for_pangaea_manually_ipynb”
 * Go to https://jupyterhub.awi.de/ and log in with your user account
 * Upload the Python script "HE538_dship_event_list_jupyter.ipynb"
 * Upload DShip Action Log
 * Run the script

Spyder

 * If you do not have an AWI account or want to use Spyder instead of Jupyter Notebook, you need to install a development environment for Python
 * Go to https://www.anaconda.com/
 * Download latest version (“Python 3.7 version” (64-Bit Graphical Installer)) and install it
 * Open “Anaconda” and then open “Spyder” or “Jupyter Notebook” (in browser): these are two programming environments that provide nice user interfaces
 * If you want to use Spyder as a programming environment, then do the following:
 * Open Spyder and insert the example file “HE538_dship_event_list_spyder.py”
 * The working directory (the folder in which you work with the script; files can be read from there and will be saved there) is automatically configured
 * If you want to use Jupyter Notebook as a programming environment, then do the following:
 * Open Jupyter Notebook in Anaconda and upload the file “HE538_dship_event_list_jupyter.ipynb” (it does not yet contain the coordinate recalculation and mapping check)
 * This also connects automatically to your home/user directory
 * The file to use in Jupyter Notebook is basically the same as the one for Spyder; it has fewer comments and is formatted as a notebook

Keep in mind

 * If DShip entries are problematic or missing, or if actions other than those listed in our mapping spreadsheet were logged, the script takes the first action of a kind
 * Comments need to be evaluated manually
 * Sometimes the comment column contains information about SENSOR ID, Recovery, etc. This information can be imported via a column to the events' attributes
 * Special case: Underway and track-events
 * For continuous measurements throughout the entire cruise, underway and track events are often created
 * Devices/methods of underway events are typically thermosalinograph or weather station
 * Sometimes there are several underway- or track-events with one device, because the device may have been turned off and on again
 * The master tracks of the ships are also connected to the track-events
 * “Underway” should always be written with a capital initial letter in the PANGAEA events
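
The fallback rule at the top of this list could look roughly like the following. This is one possible reading of the rule, not the logic of the actual script: the function name `pick_action`, the row key `action` and the idea of returning a flag for manual review are assumptions made for this sketch.

```python
def pick_action(rows, preferred_actions):
    """Return (row, used_fallback) for the rows of one event.

    rows: chronologically ordered Action Log rows of one event (list of dicts).
    preferred_actions: action names from the mapping table, best first.
    """
    if not rows:
        raise ValueError("event has no logged actions")
    for action in preferred_actions:
        for row in rows:
            if row["action"] == action:
                return row, False
    # None of the preferred actions was logged: fall back to the first row
    # and flag the event so that it can be checked manually.
    return rows[0], True


# Hypothetical rows for illustration only.
rows = [
    {"action": "station start"},
    {"action": "in the water"},
]
preferred_hit = pick_action(rows, ["in the water"])
fallback_hit = pick_action(rows, ["profile start"])
```

Returning the flag alongside the row makes it easy to list all events where the preferred action was not found, so that these can be reviewed together with the comments.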

Find area/location

 * The station lists, and subsequently the events in PANGAEA, need to include the location of each event.
 * There are several ways to find the location, besides checking it manually

Semi-automatically using a script

 * At the moment, the location is determined from the event/station coordinates by another script ("Gazzeteer_2.py") by Roland Koppe
 * Before you run the script add a blank column named "Area" in the output file from before ("events-campaign-pangaea-final.txt")
 * This script and the underlying database file for the locations can be found in this sub-folder.
 * The latter file ("Limits_of_oceans_and_seas.tab") defines the areas based on the coordinates given in your output file from before.
 * This database needs to be downloaded and added to the folder in which you work
 * The Python module gdal needs to be installed first. It is recommended to do this in a new, dedicated environment.
 * If you use Anaconda development environment, open "Anaconda prompt".
 * Enter "conda create --name gdaltest" to create the environment "gdaltest" (or whatever name you prefer)
 * Enter "conda activate gdaltest"
 * Enter "conda install gdal"
 * There will be questions during the installation process that you may answer with "Y" + Enter
 * You will need to install the module pandas in this environment as well: Enter "conda install pandas"
 * When the installations are finished, it is recommended to restart your computer
 * Now open Anaconda and choose "gdaltest" from the "Applications on" dropdown menu at the top of the window
 * Choose "Install" under the Spyder icon and install Spyder in the new environment
 * Now open the script "Gazzeteer_2.py", adapt the campaign and file name and run it.
 * This will create an area column with all corresponding locations (for example ["Ocean name", "name of sea strait etc"])
 * You will need to edit this column and choose the most fitting location (smallest granularity); multiple locations are possible (for example for Underway-Events) and are separated by a ";"
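
Adding the blank "Area" column before running the script can also be done programmatically. The sketch below assumes that the output file is tab-separated with one header row, which this page does not state explicitly; the function name `add_area_column` and the sample content are hypothetical.

```python
import csv
import io


def add_area_column(text, column="Area"):
    """Append an empty column to tab-separated text (header row expected).

    Returns the text unchanged if it is empty or the column already exists.
    """
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    if not rows or column in rows[0]:
        return text
    output = io.StringIO()
    writer = csv.writer(output, delimiter="\t", lineterminator="\n")
    writer.writerow(rows[0] + [column])
    for row in rows[1:]:
        # Every data row gets an empty cell that is filled in later.
        writer.writerow(row + [""])
    return output.getvalue()


# Hypothetical file content for illustration only.
sample = "Event label\tLatitude\tLongitude\nXX000_1-1\t54.5\t7.25\n"
with_area = add_area_column(sample)
```

To apply this to a real file, read its content into a string, pass it to `add_area_column` and write the result back, for example with `pathlib.Path.read_text` and `pathlib.Path.write_text`.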

Semi-automatically using PanTool

 * Find area (location of the event): the import file should also contain the location of the event.
 * For this, the import file needs to be opened in PanTool
 * A database of coordinates and locations, from which the location is calculated, needs to be downloaded and read by PanTool first. It can be found here
 * Within PanTool: Special tools --> Find area --> open the event import file --> Area database --> choose the downloaded database (see above) --> execute. This produces a file in the folder you are working in that includes the area column

Import of the station list to PANGAEA

 * Import of the "events-campaign-pangaea-final.txt" file via Import --> Events in 4D
 * Each event of the station list needs to be linked to the corresponding cruise
 * This happens automatically when the column “Campaign” is filled in the import file
 * This is especially important for the planned automated upload of station lists to the PANGAEA homepage, so please keep that in mind

= Station lists online =
 * After the station lists / event lists have been imported into 4D, they automatically appear on the PANGAEA homepage under “expeditions”. PANGAEA holds extensive information about cruises, summary reports etc. You can find out more under expedition archiving
 * You can access campaign and station lists via human-readable (“speaking”) URLs
 * For a ship/basis: https://www.pangaea.de/expeditions/bybasis/Polarstern
 * The name of the ship has to be UTF-8-URL-encoded
 * If the ship’s name includes special characters, they need to be encoded accordingly: https://www.pangaea.de/expeditions/bybasis/Meteor%20(1964)
 * You can also access all campaigns per project acronym (shows all campaigns connected to the project and all campaigns that are linked to a dataset with that project): https://www.pangaea.de/expeditions/byproject/IODP
 * You can also access the station list for a cruise/campaign: https://www.pangaea.de/expeditions/events/PS114
 * The default format is HTML; you can download your list as a tab file by appending “?format=textfile” to the URL

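
The URL patterns listed above can be built with Python's standard `urllib.parse.quote`. This is a small sketch: the helper names are hypothetical, and leaving parentheses unencoded (`safe="()"`) is an assumption chosen to reproduce the “Meteor%20(1964)” example above.

```python
from urllib.parse import quote

BASE_URL = "https://www.pangaea.de/expeditions"


def basis_url(ship_name):
    """URL of the campaign list for a ship/basis (name is URL-encoded)."""
    # Parentheses are kept as-is to match the example on this page.
    return f"{BASE_URL}/bybasis/{quote(ship_name, safe='()')}"


def project_url(acronym):
    """URL of all campaigns connected to a project acronym."""
    return f"{BASE_URL}/byproject/{quote(acronym)}"


def events_url(campaign, textfile=False):
    """URL of the station list of a campaign, optionally as a tab file."""
    url = f"{BASE_URL}/events/{quote(campaign)}"
    return url + "?format=textfile" if textfile else url


meteor_url = basis_url("Meteor (1964)")
ps114_text_url = events_url("PS114", textfile=True)
```

`quote` encodes non-ASCII characters as UTF-8 percent sequences, so ship names with special characters are handled in the same way.
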
= Additional information and links =
 * Need more information on past cruises (DOD ID), chief scientists etc.?
 * information on past cruises
 * information on planned cruises
 * For Meteor, Merian, Sonne you can find more information here
 * Cruise planning and more for Polarstern, Heincke and smaller vessels
 * Seadata is the official page for Cruise Summary Reports (links to DOD ID); it is not complete, therefore we do not link to them yet
 * Current official page of the German research vessels