Split2Events

Split2Events is a software tool to split one file with data from several events into several files, one for each event. The resulting folder of files can be imported automatically with the mass import (Massenimport) routine of 4D. It can also be useful for configuring a complex import file outside 4D. Split2Events can also extract a list of unknown parameters and create a parameter import file.

Find the current version and reference of Split2Events at doi:10.1594/PANGAEA.835398

Find the source code on GitHub.

Contact: Rainer Sieger. The software is provided as freeware under the GNU General Public License (GPLv3) and is freely distributed without warranty by the Alfred Wegener Institute, Helmholtz Center for Polar and Marine Research, Bremerhaven.


Installation

Download the current version of Split2Events to your computer.

Windows

Double-click Split2Events_Win.exe and follow the instructions.

OS X

Open the downloaded dmg file with a double-click. Drag and drop the file Split2Events.app onto the application folder icon.

Linux

Uncompress the archive to your user bin directory. Double-click Split2Events.sh.

General

  • Empty columns and lines which contain GEOCODE only will be removed.
  • If data set metadata is included, 4D will analyze and format it during sequential import.
  • When defining data set titles and filename, the placeholder $E may be used to add individual event labels.
  • Keywords are not added by Split2Events; they may be set manually in 4D prior to import.
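The first rule above (dropping empty columns and lines that contain only GEOCODE values) can be illustrated with a minimal Python sketch. It is not the Split2Events implementation. The function name `drop_empty` and the `geocode_cols` argument, which marks the columns that do not count as data, are assumptions made for this example.

```python
def drop_empty(header, rows, geocode_cols):
    """Remove rows that hold no data outside the geocode columns,
    then remove columns that are empty in all remaining rows.

    header       -- list of column names
    rows         -- list of rows, each a list of strings as long as header
    geocode_cols -- set of column indexes that do not count as data
                    (hypothetical argument, e.g. event label and depth)
    """
    # Keep a row only if at least one non-geocode cell has content.
    kept_rows = [
        row for row in rows
        if any(cell.strip() for i, cell in enumerate(row) if i not in geocode_cols)
    ]
    # Keep a column only if at least one remaining row has content in it.
    kept_cols = [
        i for i in range(len(header))
        if any(row[i].strip() for row in kept_rows)
    ]
    new_header = [header[i] for i in kept_cols]
    new_rows = [[row[i] for i in kept_cols] for row in kept_rows]
    return new_header, new_rows
```

With an event label column and a depth column marked as geocode columns, a line that has a depth but no measured values is dropped, and a parameter column without any values disappears from the output.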

Just split

  1. Start Split2Events and drag/drop the file to the program window.
  2. Choose the Create import file(s)... Tool (F5).
  3. The Mandatory tab shows the entries of the last session.
  4. Click on New if a new file collection is processed.
  5. Go to the Options tab and check split file to events.
  6. Click OK. An import file is written for each event, stored in a new folder.
  7. Use 4D (Import/Analytical data/Open folder) for sequential import.
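What the split produces can be sketched in a few lines of Python. This is only an illustration under several assumptions: the input is tab-delimited, has one header line, and carries the event label in the first column. The function name `split_to_events` and the output naming scheme (`<event label>.txt`) are hypothetical; the real tool takes the file name from the Export filename setting.

```python
import csv
import os


def split_to_events(in_path, out_dir, event_column=0, encoding="utf-8"):
    """Write one tab-delimited file per event label found in event_column."""
    with open(in_path, newline="", encoding=encoding) as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)
        groups = {}  # event label -> list of rows, in order of first appearance
        for row in reader:
            if not row:
                continue  # skip blank lines
            groups.setdefault(row[event_column], []).append(row)

    os.makedirs(out_dir, exist_ok=True)
    written = []
    for label, rows in groups.items():
        # Assumption: the label is safe to use as a file name.
        out_path = os.path.join(out_dir, f"{label}.txt")
        with open(out_path, "w", newline="", encoding=encoding) as f:
            writer = csv.writer(f, delimiter="\t", lineterminator="\n")
            writer.writerow(header)
            writer.writerows(rows)
        written.append(out_path)
    return written
```

Each output file repeats the header line, so every file in the new folder can be read on its own.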

Creating import file(s) with metadata

If metadata (PI, title, references, comment, etc.) is included in import files, 4D will analyze and format it during import.

  1. Start Split2Events and drag/drop the file to the program window.
  2. Choose the Create import file(s)... Tool (F5).
  3. The Mandatory tab shows the entries of the last session.
  4. Click on New if a new file collection is processed.
  5. On the Mandatory tab fill out the fields as required.
  6. On the Optional tab add information as required.
  7. On the Options tab check split file to events and write metadata to data import file.

Creating import file(s) using a metadata template file

  1. Open an import file and choose the option use metadata file; find parameter by position on the Options tab.
  2. Create a template of a metadata file by clicking the button Create metadata template. The metadata file will be created with the extension _metadata.txt.
  3. Open this file with an editor (e.g. drag the file onto an open Excel window) and modify it as appropriate. Each line contains the information for one parameter.
  • Parameter name as given in the data file. If the ID is given in the data file this entry is empty.
  • Parameter ID, in this mode the ID is mostly “unknown”. Only the ID of the GEOCODE is given automatically. You have to fill in the right ID for the parameter.
  • PI ID as provided through the PI field of Split2Events. The ID 999999 will be replaced by @PP@Event label@ (e.g. @PP@PS2742-5@)
  • Method ID; use ID=43 if not_given. If left empty, the default method will be used. The ID 999999 will be replaced by @PM@Event label@ (e.g. @PM@PS2742-5@)
  • Comment of data series. The entry 999999 will be replaced by @PC@Event label@ (e.g. @PC@PS2742-5@)
  • Format as suggested from the precision of the numeric values of the related parameter.
  • Factor if a recalculation is required, e.g. may be used to convert units.
  • Fill empty cells with: the characters to be used to fill empty cells. The data entry @is empty@ will always be replaced by an empty string.
  • Range min and Range max define the valid range of values of the parameter (e.g. for water temperature set this to -5 to 50). If a value is outside the given range, the value will be marked with a flag.
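How the Factor and Fill empty cells settings of one metadata line might act on a single data cell is sketched below in Python. This is an illustration, not the program's code. The function name `apply_metadata` and the `decimals` argument (a stand-in for the Format column, whose exact syntax is not described here) are assumptions, as is the reading that @is empty@ refers to an entry in the data file.

```python
def apply_metadata(value, factor=None, fill="", decimals=None):
    """Apply the factor and fill settings of one metadata line to one cell."""
    text = value.strip()
    # The special data entry always becomes an empty string.
    if text == "@is empty@":
        return ""
    # Empty cells receive the fill characters.
    if text == "":
        return fill
    if factor is None:
        return text
    try:
        number = float(text) * factor
    except ValueError:
        return text  # non-numeric entries are passed through unchanged
    if decimals is None:
        return str(number)
    return f"{number:.{decimals}f}"
```

For example, a factor of 1000 converts a value given in metres to millimetres, while a fill string such as -999 makes gaps in the data explicit.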

Creating import file(s) using a metadata template file and find parameter IDs automatically

  1. Prepare a local list of all parameters called ParameterDB.pdb. For this step use Tools->Refresh parameter database or Tools->Merge new parameters to parameter database. The location of the parameter database can be set beforehand with File->General options.... Otherwise the default location and name will be used.
  2. Open the data file.
  3. Create a metadata template file with the option use metadata file; find parameter by name. Split2Events identifies the ID for each parameter by using the parameter database. If a parameter is unknown, the ID is set to unknown Parameter ID. If write parameter import file on the Options tab was checked, a list of the unknown parameters is written to imp_Parameters.txt. After completing this file create an issue (http://issues.pangaea.de) and upload it to the issue.
  4. Continue with step 1.
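The lookup by name described in step 3 can be pictured with the following Python sketch. The format of ParameterDB.pdb is not described on this page, so the parameter database is represented here by a plain dictionary from parameter name to ID. The function name `resolve_parameter_ids` and the sample IDs in the usage note are hypothetical.

```python
def resolve_parameter_ids(names, parameter_db):
    """Look up each parameter name; collect the names that are not found.

    names        -- parameter names as given in the data file header
    parameter_db -- dict mapping parameter name to ID (stand-in for the
                    local parameter database, whose format is assumed)
    """
    resolved = []  # (name, ID or "unknown") in input order
    unknown = []   # each unknown name once, for the parameter import file
    for name in names:
        key = name.strip()
        pid = parameter_db.get(key)
        if pid is None:
            resolved.append((key, "unknown"))
            if key not in unknown:
                unknown.append(key)
        else:
            resolved.append((key, str(pid)))
    return resolved, unknown
```

Called with a database such as `{"Temperature, water": 1, "Salinity": 2}` (made-up IDs), a header containing an extra name returns that name in the second list, which corresponds to the list written to the parameter import file.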

Reference and detailed description

-- General options tab --

  • Encoding of input files. Choose the encoding of your data file.
    • Windows - Latin-1 (ISO 8859-1)
    • OS X - Excel: Apple Roman
    • OS X - LibreOffice: UTF-8
    • Linux - UTF-8
  • File extension of metadata file. The spreadsheet program Calc of LibreOffice needs the file extension "csv" to open a table.
  • Create additional option lines in metadata template file. When creating a metadata template file Split2Events adds a "DataSet comment" block at the end of the template.
  • Parameter database.
    • Show "Parameter import file created message". Some users do not like the message that the parameter import file has been created, so this dialog can be switched off.
    • Show menu "Merge new parameters to the parameter database". This menu item can be switched off if it is not wanted. Split2Events always gets the whole parameter table.
    • Refresh parameter database when the program is started. Useful for occasional users.
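Why the encoding setting matters can be shown with a short Python sketch that reads the same kind of file with the encodings listed above. The mapping keys and the function name `read_lines` are assumptions for this example. The codec names `latin-1`, `mac_roman` and `utf-8` are the standard Python names for ISO 8859-1, Apple Roman and UTF-8.

```python
# Encoding per platform and spreadsheet program, as listed above.
ENCODINGS = {
    "Windows": "latin-1",             # Latin-1 (ISO 8859-1)
    "OS X - Excel": "mac_roman",      # Apple Roman
    "OS X - LibreOffice": "utf-8",
    "Linux": "utf-8",
}


def read_lines(path, source):
    """Read a data file using the encoding that matches its origin."""
    with open(path, encoding=ENCODINGS[source]) as f:
        return f.read().splitlines()
```

Reading a file with the wrong entry does not necessarily fail. Characters outside ASCII, such as umlauts in comments or parameter names, may simply come out garbled, which is why the encoding should match the program that wrote the file.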

-- Mandatory tab --

  • PANGAEA staff ID of principal investigator. This entry sets the ID for the PI of data in the parameter part. The ID 999999 will be replaced by @PP@Event label@ (e.g. @PP@PS2742-5@). With the PanTool function Search and replace many strings the PI of data of this event can be set easily.
  • PANGAEA staff ID of author(s). This entry sets a list of IDs for the authors of the datasets. The ID 999999 will be replaced by @A@Event label@ (e.g. @A@PS2742-5@).
  • PANGAEA institution ID of source. This entry sets the ID for the source (related to the PANGAEA institutions table) of data. The ID 999999 will be replaced by @S@Event label@ (e.g. @S@PS2742-5@).
  • PANGAEA reference ID(s). This entry sets a list of IDs for the references of the dataset in the metadata part. The ID 999999 will be replaced by @R@Event label@ (e.g. @R@PS2742-5@).
  • PANGAEA project ID(s). This entry sets a list of IDs for the projects of the dataset in the metadata part. The ID 999999 will be replaced by @Pro@Event label@ (e.g. @Pro@PS2742-5@).
  • Dataset title. This entry sets the citation of the dataset in the metadata part. The placeholder $E will be replaced by the event label. The placeholder $@ will be replaced by the string behind the “@” given in the event label. The ID 999999 will be replaced by @D@Event label@ (e.g. @D@PS2742-5@).
  • Topologic type. Menu to select the topologic type of a dataset.
  • Status. Menu to select the status of a dataset.
  • Access rights. Sets the access rights of a dataset to unrestricted (default), signup required, access rights needed, or @L@Event label@.
  • Export filename. This entry sets the export filename. The placeholder $E will be replaced by the event label. The placeholder $@ will be replaced by the string behind the “@” given in the event label.
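The placeholders $E and $@ used for the dataset title and export filename can be pictured with this small Python sketch. It rests on an assumption that this page does not confirm: if the event label contains an "@", $E is taken to be the part before it and $@ the part behind it. The function name `expand_placeholders` is hypothetical.

```python
def expand_placeholders(template, event_label):
    """Replace $E and $@ in a title or filename template.

    Assumption: for a label such as "PS2742-5@XRF", $E becomes "PS2742-5"
    and $@ becomes "XRF". Without an "@", $@ becomes an empty string.
    """
    base, _, suffix = event_label.partition("@")
    return template.replace("$E", base).replace("$@", suffix)
```

For example, the template `Sediment data of core $E ($@)` with the label `PS2742-5@XRF` yields `Sediment data of core PS2742-5 (XRF)`.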

-- Optional tab --

  • Dataset comment (optional). This entry sets the dataset comment. The placeholder $E will be replaced by the event label. The placeholder $@ will be replaced by the string behind the '@' given in the event label. The ID 999999 will be replaced by @C@Event label@ (e.g. @C@PS2742-5@).
  • PANGAEA reference ID(s) to be used as further details (optional). This entry sets the reference(s) to be used as further details link(s). 999999 will be replaced by @FR@Event label@ (e.g. @FR@PS2742-5@).
  • PANGAEA dataset ID(s) to be used as further details (optional). This entry sets the dataset(s) to be used as further details link(s). 999999 will be replaced by @FD@Event label@ (e.g. @FD@PS2742-5@).
  • PANGAEA reference ID(s) to be used as other version (optional). This entry sets the reference(s) to be used as other version link(s). 999999 will be replaced by @OR@Event label@ (e.g. @OR@PS2742-5@).
  • PANGAEA dataset ID(s) to be used as other version (optional). This entry sets the dataset(s) to be used as other version link(s). 999999 will be replaced by @OD@Event label@ (e.g. @OD@PS2742-5@).
  • PANGAEA reference ID(s) to be used as source data set (optional). This entry sets the reference(s) to be used as source data set link(s). 999999 will be replaced by @SR@Event label@ (e.g. @SR@PS2742-5@).
  • PANGAEA dataset ID(s) to be used as source data set (optional). This entry sets the dataset(s) to be used as source data set link(s). 999999 will be replaced by @SD@Event label@ (e.g. @SD@PS2742-5@).
  • User(s) of validated datasets. This entry sets the ID(s) for the user of data. The ID 999999 will be replaced by @U@Event label@ (e.g. @U@PS2742-5@).
  • Add to an existing parent (optional). This entry sets the ID of the parent to which the new dataset will be added.
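The 999999 convention used on the Mandatory and Optional tabs follows one pattern: the entry is replaced by a token built from a field-specific prefix and the event label. The Python sketch below collects the prefixes listed above. The dictionary keys and the function name `resolve_entry` are hypothetical names chosen for this example.

```python
# Token prefixes as documented for each field; the keys are illustrative names.
TOKEN_PREFIX = {
    "pi": "PP",
    "authors": "A",
    "source": "S",
    "references": "R",
    "projects": "Pro",
    "title": "D",
    "comment": "C",
    "further_details_reference": "FR",
    "further_details_dataset": "FD",
    "other_version_reference": "OR",
    "other_version_dataset": "OD",
    "source_reference": "SR",
    "source_dataset": "SD",
    "users": "U",
}


def resolve_entry(field, entry, event_label):
    """Return the token for a 999999 entry, otherwise the entry unchanged."""
    if entry.strip() == "999999":
        return f"@{TOKEN_PREFIX[field]}@{event_label}@"
    return entry
```

The resulting tokens, e.g. `@PP@PS2742-5@`, are unique per field and event, so they can later be replaced by the real IDs with a search-and-replace step.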

-- Options tab --

  • Data import file options
    • split file to events: Split2Events splits a file to events only if this option is set. Uncheck it for creating surface datasets.
    • write metadata to data import file: The metadata part of the import file contains all import options. This is needed for importing a huge number of files. But it can also be useful to configure a complex import file outside 4D.
    • make filename unique: Needed if you split more than one file with the same events.
    • marks files with 1, 2 or 3 lines: Useful for separating data when cores and surfaces are mixed.
    • use name of input file for $E or ID=999999.
    • overwrite existing datasets: @I@Event label@ will be written.
  • Handle out of range values
    • ignore range settings (default)
    • remove values
    • mark as bad (add '/' to the value)
    • mark as questionable (add '?' to the value)
  • Metadata file options
    • use auto metadata file: Sets the default method and a suitable format for each parameter automatically.
    • use metadata file; find parameter by position: When the program splits a file to events, a metadata file is loaded. The parameters in the metadata file must be in the same order as in the data file. A metadata template can be created with Create metadata template.
    • use metadata file; find parameter by name: When the program splits a file to events, a metadata file is loaded. The program finds the right parameter automatically by its name. A metadata template can be created with Create metadata template.
  • Create metadata template options
    • write parameter import file: If the option use metadata file; find parameter by name is selected and the user creates a metadata template, Split2Events finds missing parameters and writes them into imp_Parameter_<timestamp>.txt. After completing this file create an issue (http://issues.pangaea.de) and upload it to the issue.

  • Parameter database. To create a metadata template and find missing parameters, the program needs a list of all known parameters defined in PANGAEA. Use Tools->Create parameter database or Tools->Merge new parameters to parameter database to create a parameter database. The name of the parameter database can be set beforehand with File->General options.... Browse for it.
  • Buttons
    • Create metadata template. Pressing this button creates a metadata template file. This depends on the metadata file options and the create metadata template options. If use metadata file; find parameter by name is selected a parameter database is needed.
    • New. Resets all settings.
    • Save. Saves a project file manually.
    • Load. Loads a project file. Overwrites all given settings!
    • OK. Starts the import file(s) creation procedure.
    • Cancel. Closes the settings dialog. No settings will be changed.
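The four "Handle out of range values" options listed above can be summarised in a short Python sketch. It is an illustration only. The function name `handle_out_of_range` and the mode names are hypothetical, and placing the '/' or '?' in front of the value is an assumption, since the option text only says that the character is added to the value.

```python
def handle_out_of_range(value, range_min, range_max, mode="ignore"):
    """Apply one of the out-of-range options to a single value.

    mode -- "ignore" (default), "remove", "bad" or "questionable"
    """
    if mode == "ignore":
        return value
    try:
        number = float(value)
    except ValueError:
        return value  # non-numeric entries are left untouched
    if range_min <= number <= range_max:
        return value
    if mode == "remove":
        return ""
    if mode == "bad":
        return "/" + value  # assumption: the flag precedes the value
    if mode == "questionable":
        return "?" + value  # assumption: the flag precedes the value
    raise ValueError(f"unknown mode: {mode}")
```

With the water temperature range of -5 to 50 used as an example in the metadata file description, a value of 60 stays `60` when range settings are ignored and becomes `?60` when marked as questionable.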

Contact: Dr. Rainer Sieger, Alfred Wegener Institute, Bremerhaven