Talk:Parameter

=Proposal: Internal page=



The parameter table contains all parameters with unit and ID, grouped by categories (Parameter group, see also discussion page).

When defining new parameters, please keep in mind:
 * 1) First check for existing parameters by using the 4D-client or the Parameter Dictionary. When using the 4D-client, also use the search functions Parameter contains (with a part of the parameter name) or Abbreviation is equal to (with the parameter abbreviation). E.g., when looking for 0.063-0.032 mm fraction [%], the search Parameter contains 0.063 mm lists all parameters containing 0.063 mm. When looking for Fe %, the search Abbreviation is equal to Fe lists all iron parameters with the abbreviation Fe (not Fe2+, Fe peak area etc.).
 * 2) Avoid duplicate definitions of parameters at any time! If the parameter already exists with a unit different from the one needed and the data can easily be converted, a new parameter should NOT be defined and the data must be converted prior to or during import. (It is one of the major challenges of Pangaea to deliver data in a consistent format, which also means using standard units as far as they are defined in science.)
 * 3) Do not define parameters with user-specific 'qualifiers', e.g. in species names something like Thalassiosira sp. F. In this case the data should be linked to the parameter Thalassiosira sp. and the data series comment should contain sp. F (see also import; for the use of abbreviations in taxonomic names see Taxon).
 * 4) Do not define parameters containing two different species, e.g. Convallina logani/dawsoni. Instead use Convallina logani and add "including Convallina dawsoni" to the data series comment. Likewise, do not define any other mixed parameters.
 * 5) Clearly separate parameters from methods or any other specifications. Methods are defined in the method table; the relation between a data series and the method or data-specific comments is set during the import.
 * 6) New parameters are defined by the data librarian. Please use the PARAMETER IMPORT FORM for new definitions and submit it in Excel or text/zipped format via the ticket system. There are two ways to submit a parameter request: 1. For general parameter requests, create an issue in the project PANGAEA Data Archiving & Publication with the Issue Type Parameter Submission. 2. For a parameter request related to an existing issue, open the drop-down menu More Actions in the main data submission issue, go to Create Sub-Task and choose the Issue Type New Parameter.
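
The two search modes named in item 1 differ in how strictly they match. A minimal Python sketch of that difference, assuming a small in-memory list of made-up parameter records; the 4D-client is not scripted this way, and the names `Parameter`, `name_contains` and `abbreviation_equals` are purely illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    name: str
    abbreviation: str
    unit: str


# Hypothetical sample records, not an extract of the real parameter table.
PARAMETERS = [
    Parameter("Iron", "Fe", "%"),
    Parameter("Iron", "Fe", "mg/kg"),
    Parameter("Iron 2+", "Fe2+", "µmol/l"),
    Parameter("Iron, peak area", "Fe peak area", "cps"),
    Parameter("Size fraction 0.063-0.032 mm", "63-32 µm", "%"),
]


def name_contains(params, text):
    """'Parameter contains': case-insensitive substring match on the full name."""
    needle = text.lower()
    return [p for p in params if needle in p.name.lower()]


def abbreviation_equals(params, abbreviation):
    """'Abbreviation is equal to': exact match on the abbreviation."""
    return [p for p in params if p.abbreviation == abbreviation]


# Exact match: finds the two Fe entries, but not Fe2+ or Fe peak area.
fe_hits = abbreviation_equals(PARAMETERS, "Fe")

# Substring match: finds every name that contains the fragment.
fraction_hits = name_contains(PARAMETERS, "0.063")
```

The exact match keeps the result list short when the abbreviation is known, while the substring match is the safer choice for spotting an existing parameter under a slightly different name.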

Requests to change a parameter definition must also be submitted as a ticket; use the general parameter request Parameter Submission.

A nearly unlimited number of parameters can be used in a data set. An example set containing 550 parameters (columns) is

Field description of the parameter table

 * Parameter Name contains the full name of the parameter. Parameter Name in combination with the unit is unique in Pangaea. Parameter names start with the most important specification, followed by all describing terms in hierarchical order. Example: when dissolved organic carbon is measured, carbon is the main parameter information, followed by organic and the description of its condition, dissolved, which gives the Parameter Name Carbon, organic, dissolved.
 * Abbreviation is the short name of the parameter and is used in the header of data sets. It gives a short form of the parameter name and never contains a comma. Delta notation is given by d; standard deviation and standard error are std dev and std e. Example: the abbreviation of the parameter Carbon, organic, dissolved is DOC.
 * Unit should be given for all numeric parameters and should follow standard use; in particular, an already existing parameter should only be defined with another unit if the values cannot be converted from one unit to the other. The unit does not contain chemical formulas, elements or the sampling milieu; these must be given in the Parameter Name. Text parameters have no unit; delete the unit column in the import form or leave it empty. Example: the parameter Carbon, organic, dissolved has the unit µmol/kg, not µmol C/kg.
 * LowerLimit/UpperLimit can be used to define the numeric range of values in which a parameter is expected to occur. During the import of data, an internal routine checks for outliers and flags them as not valid. For text parameters, delete these columns in the import form or leave them empty.
 * Default format: some predefined formats are offered by a menu but can be edited by hand. The format should follow the general precision of the parameter and is used by the system as the default. The format can be changed during (or after) the import of data on the config card.
 * Default data type of a parameter can be (1) numeric, (2) text, (3) DateTime given in ISO format YYYY-MM-DDThh:mm:ss, (4) binary, (5) Feature, (6) URI, and (7) Event. If a text parameter is defined, no unit, format or min/max values should be given. A field of a text parameter may contain up to 255 characters. Example: Carbon, organic, dissolved [µmol/kg] is a numeric parameter.
 * Default method (DefaultMethodID) is a relational field to the Method table, where a required method has to be defined first. Methods defined in this field are shown by default during the import of data. The default can be changed during the import procedure. Use the ID of the method for import. Example: Carbon, organic, dissolved is calculated, Method ID 50.
 * Reference (ReferenceID) can be given if a parameter was defined through a publication; relational to Reference. Use the ID of the reference for import. The reference is not shown in the dataset, so it is better to use the URL field.
 * URL may contain a link to a more detailed explanation/definition of the parameter, e.g. in Wikipedia or in a paper. You can give a link or a DOI. This definition should be of general use. For species parameters, the URL field is automatically filled with a link to a taxonomic database (e.g. ITIS, WORMS).
 * Comment (Description) may be used for any description that helps other curators to understand the meaning of the parameter. This is an internal info field; its content does not appear in datasets!
 * Keywords may be used to define a certain parameter group for special purposes, projects or users. Keyword-related parameter lists can be produced using the DDI tool to set up a dynamic link; relational to the Thesaurus. This field is not included in the parameter import form; keywords need to be added by hand for each parameter.
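
Several of the field rules above can be written down as simple checks. The following Python sketch is only an illustration: the class `ParameterDefinition`, its field names and the limits 0 to 1000 are assumptions, not the actual columns of the import form or the internal import routine. Only the data type codes 1 (numeric) and 2 (text) are taken from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional

NUMERIC, TEXT = 1, 2  # data type codes as listed in the field description


@dataclass
class ParameterDefinition:
    name: str
    abbreviation: str
    data_type: int = NUMERIC
    unit: Optional[str] = None
    lower_limit: Optional[float] = None
    upper_limit: Optional[float] = None


def check_definition(p: ParameterDefinition) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if "," in p.abbreviation:
        problems.append("abbreviation must not contain a comma")
    if p.data_type == TEXT:
        if p.unit or p.lower_limit is not None or p.upper_limit is not None:
            problems.append("text parameters take no unit or limits")
    elif p.data_type == NUMERIC and not p.unit:
        problems.append("numeric parameters should have a unit")
    return problems


def is_outlier(p: ParameterDefinition, value: float) -> bool:
    """True if the value lies outside the LowerLimit/UpperLimit range."""
    if p.lower_limit is not None and value < p.lower_limit:
        return True
    if p.upper_limit is not None and value > p.upper_limit:
        return True
    return False


# The limits used here are invented for the example.
doc = ParameterDefinition("Carbon, organic, dissolved", "DOC",
                          unit="µmol/kg", lower_limit=0.0, upper_limit=1000.0)

# A deliberately wrong text parameter: comma in abbreviation, unit given.
bad = ParameterDefinition("Comment", "Comm, x", data_type=TEXT, unit="m")
```

Here `check_definition(doc)` reports nothing, `check_definition(bad)` reports two problems, and `is_outlier(doc, -1.0)` is true because the value lies below the lower limit.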

Parameter import file for the Example: Carbon, organic, dissolved [mmol/kg]



Some useful tips for parameter names:

Find missing parameters in an import file with Split2Events

 * 1) Install Split2Events.
 * 2) Prepare a local list of all parameters called ParameterDB.pdb. For this step use Tools->Refresh parameter database or Tools->Merge new parameters to parameter database. The name of the parameter database can be set beforehand under File->General options...
 * 3) Open your import file.
 * 4) Create a metadata file with the option use metadata file; find parameter by name. Split2Events identifies the ID for each parameter by using the parameter database. If a parameter is unknown, the ID is set to unknown. If write parameter import file on the Options tab was checked, a list of the unknown parameters is written to imp_Parameters.txt. Check carefully whether the unknown parameters are really new or whether they already exist with a different spelling. After checking the new parameters and completing the parameter import file, create an issue (http://issues.pangaea.de) and upload the file to it.
 * 5) After the parameters are imported, go to Split2Events and use Tools->Merge new parameters to parameter database.
 * 6) Continue with step 3.
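
The check in step 4 amounts to looking up each column header in the local parameter list and then reviewing the leftovers for spelling variants. A rough Python sketch of that idea follows; it is not how Split2Events is implemented, the dictionary merely stands in for ParameterDB.pdb (whose file format is not described here), and all IDs and function names are made up:

```python
import difflib


def find_unknown(header, parameter_db):
    """Return the header entries that have no ID in the local parameter list."""
    return [column for column in header if column.strip() not in parameter_db]


def suggest_spelling(unknown, parameter_db, cutoff=0.8):
    """Suggest existing parameters whose spelling is close to an unknown entry."""
    return difflib.get_close_matches(unknown, list(parameter_db), n=3, cutoff=cutoff)


# Hypothetical names and IDs standing in for the local parameter database.
parameter_db = {
    "Carbon, organic, dissolved [µmol/kg]": 101,
    "Equivalent dose [mSv]": 102,
}
header = [
    "Equivalent dose [mSv]",
    "Carbon, organic, disolved [µmol/kg]",  # misspelled, so reported as unknown
]

unknown = find_unknown(header, parameter_db)
suggestions = {name: suggest_spelling(name, parameter_db) for name in unknown}
```

An unknown entry with a close suggestion is most likely a spelling variant of an existing parameter and should be corrected in the import file instead of being submitted as a new definition.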

How is a parameter recognized during import?
In principle, parameter definitions should be unique, but an import that uses parameter (short) names may still produce "ambiguous parameter" messages. A parameter can be given in the import file as:
 * ID of parameter
 * Parameter name with unit in square brackets, e.g. Equivalent dose [mSv]
 * Parameter abbreviation with unit, e.g. H [mSv]. Be careful: the combination of abbreviation and unit is not unique. Example: C. wuellerstorfi [%] can be Cibicidoides or Cibicides wuellerstorfi.
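
Why an abbreviation can be ambiguous while an ID or a full name is not is easiest to see in a small lookup. The Python sketch below is an assumption-laden illustration, not the actual import code: the IDs are invented, the lookup order (ID, then name, then abbreviation) is a guess, and columns without a unit in brackets are not handled.

```python
import re
from typing import List, NamedTuple


class Param(NamedTuple):
    id: int
    name: str
    abbreviation: str
    unit: str


# Hypothetical IDs; the two wuellerstorfi entries mirror the example above.
TABLE = [
    Param(1, "Equivalent dose", "H", "mSv"),
    Param(2, "Cibicidoides wuellerstorfi", "C. wuellerstorfi", "%"),
    Param(3, "Cibicides wuellerstorfi", "C. wuellerstorfi", "%"),
]

# Matches "label [unit]" and captures both parts.
HEADER = re.compile(r"^(?P<label>.*?)\s*\[(?P<unit>[^\]]*)\]$")


def resolve(column: str) -> List[Param]:
    """Return all parameters a column header could refer to."""
    column = column.strip()
    if column.isdigit():  # plain parameter ID
        return [p for p in TABLE if p.id == int(column)]
    match = HEADER.match(column)
    if not match:
        return []
    label, unit = match.group("label"), match.group("unit")
    by_name = [p for p in TABLE if p.name == label and p.unit == unit]
    if by_name:  # name plus unit is unique by definition
        return by_name
    # Abbreviation plus unit may match several parameters.
    return [p for p in TABLE if p.abbreviation == label and p.unit == unit]


ambiguous = resolve("C. wuellerstorfi [%]")
```

`ambiguous` holds two candidates, which corresponds to the "ambiguous parameter" case; giving the ID or the full name resolves it.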

Check taxonomic parameters in the Pangaea parameter table: the script (by Robert Huber) matches the Pangaea parameter table with the species catalog of UBIO.

=Proposal: External page=

Data submissions are required to use parameters as defined in the dictionary. Parameters are always accompanied by a unit. New parameters are defined by the data librarian on request.

The parameters related to a dataset are listed in the dataset metaheader.

Definition of parameters
Mandatory fields are underlined; unused columns and lines in import files should be deleted.
 * Parameter Name contains the full name of the parameter. Parameter Name in combination with the unit must be unique in Pangaea. Parameter names start with the most important specification, followed by all describing terms in hierarchical order. Example: when dissolved organic carbon is measured, carbon is the main parameter information, followed by organic and the description of its condition, dissolved, which gives the Parameter Name Carbon, organic, dissolved. Do not use abbreviations in the parameter name. Compare the new parameter name with existing parameters and follow their syntax.
 * Abbreviation is the short name of the parameter and is used in the header of data sets. It gives a short form of the parameter name and never contains a comma. Delta notation is given by d; standard deviation and standard error are std dev and std e. Example: the abbreviation of the parameter Carbon, organic, dissolved is DOC.
 * Unit should be given for all numeric parameters and should follow standard use; in particular, an already existing parameter should only be defined with another unit if the values cannot be converted from one unit to the other. The unit does not contain chemical formulas, elements or the sampling milieu; these must be given in the Parameter Name. Text parameters have no unit; delete the unit column in the import form or leave it empty. Example: the parameter Carbon, organic, dissolved has the unit µmol/kg, not µmol C/kg.
 * LowerLimit/UpperLimit can be used to define the numeric range of values in which a parameter is expected to occur. During the import of data, an internal routine checks for outliers and flags them as not valid. For text parameters, delete these columns in the import form or leave them empty.
 * Default format: some predefined formats are offered by a menu but can be edited by hand. The format should follow the general precision of the parameter and is used by the system as the default. The format can be changed during (or after) the import of data on the config card.
 * Default data type of a parameter can be (1) numeric, (2) text, (3) DateTime given in ISO format YYYY-MM-DDThh:mm:ss, (4) binary, (5) Feature, (6) URI, and (7) Event. If a text parameter is defined, no unit, format or min/max values should be given. A field of a text parameter may contain up to 255 characters. Example: Carbon, organic, dissolved [µmol/kg] is a numeric parameter.
 * Default method (DefaultMethodID) is a relational field to the Method table, where a required method has to be defined first. Methods defined in this field are shown by default during the import of data. The default can be changed during the import procedure. Use the ID of the method for import. Example: Carbon, organic, dissolved is calculated, Method ID 50.
 * Reference (ReferenceID) can be given if a parameter was defined through a publication; relational to Reference. Use the ID of the reference for import. The reference is not shown in the dataset, so it is better to use the URL field.
 * URL may contain a link to a more detailed explanation/definition of the parameter, e.g. in Wikipedia or in a paper. You can give a link or a DOI. This definition should be of general use. For species parameters, the URL field is automatically filled with a link to a taxonomic database (e.g. ITIS, WORMS).
 * Comment (Description) may be used for any description that helps other curators to understand the meaning of the parameter. This is an internal info field; its content does not appear in datasets!
 * Keywords may be used to define a certain parameter group for special purposes, projects or users. Keyword-related parameter lists can be produced using the DDI tool to set up a dynamic link; relational to the Thesaurus. This field is not included in the parameter import form; keywords need to be added by hand for each parameter.
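
When a parameter already exists with a different but convertible unit, the values are converted before import instead of defining a new parameter. A minimal Python sketch for the example parameter, assuming the data arrive in mmol/kg while the existing definition uses µmol/kg; the function name and the factor table are illustrative and not part of any Pangaea tool:

```python
# Conversion factors into the unit of the existing parameter (µmol/kg).
TO_MICROMOL_PER_KG = {
    "µmol/kg": 1.0,
    "mmol/kg": 1_000.0,
    "mol/kg": 1_000_000.0,
}


def to_micromol_per_kg(values, unit):
    """Convert a data series to µmol/kg, or fail if no factor is defined."""
    try:
        factor = TO_MICROMOL_PER_KG[unit]
    except KeyError:
        raise ValueError(f"no conversion defined for unit {unit!r}") from None
    return [value * factor for value in values]


# Submitted values in mmol/kg, converted to the unit of the existing parameter.
converted = to_micromol_per_kg([0.05, 0.25], "mmol/kg")
```

A unit that is missing from the factor table raises an error on purpose: only then is it worth asking whether a new parameter with another unit is really needed.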

=Discussion=

One of the advantages of using a relational database is the consistency of the data sets if they are archived through a fully normalized data model. This implies a well-defined table of parameters used for the description of the data variables. If a user searches for data of a specific parameter, the system must ensure that exactly those data are found and that none are left hidden because of a different definition with a similar meaning. Thus the most important task for the curators is to avoid duplicate parameter definitions.

The parameter table of Pangaea has tens of thousands of parameters defined. Parameters can easily be found with the 4D client by typing in the first letters of an expression or by using a retrieval, e.g. parameter contains carbonate, which shows a list of more than 50 parameters with the word carbonate in their name.

To help users and projects find the parameters relevant to them, parameters are grouped into categories, defined in the Parameter group table, or can be grouped individually by keywords. The parameter dictionary contains several dynamic queries on project-related lists of parameters, which were established using the DDI functionality.

On the question of linking species parameters to taxonomic databases:

On 2015-08-25 at 13:44, PANGAEA Issue Tracker (Uwe Schindler) wrote: [ http://issues.pangaea.de/browse/PDI-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=65162#comment-65162 ] Uwe Schindler commented on PDI-10590:
 * No information such as URIs or the like is attached to the parameters. The parameter definitions (the parameter table itself) will not change in the future either.
 * The feature catalogue has already been attached to the parameter table, but it is not filled automatically. Currently an attempt is being made to attach the legacy entries to several catalogues (via several relations based on name, measurement type, unit): WORMS, ITIS, QUDT, SWEET, CHEBY, ENVO,...
 * Once that is done, new parameters will be easier to define, because the recognition can be used and the parameter name is then generated automatically (according to syntax rules) from the terms found in the external ontologies/catalogues.
 * This will make it easier to offer faceting in the search, because the parameters and their datasets can be sorted into "groups" automatically. It makes little sense to offer full parameter names such as "Globigerina bulloides, d13C" as a facet (there are too many of them), but the user can then group (facet) such datasets in a search by things like "Globigerina" or "Delta 13C Measurements", or similar. Exactly this information is attached to the parameters, so that higher-level groupings can be offered according to the result set.
 * The URL of the parameter will never be changed, because "Globigerina bulloides, d13C" alone already has 2 concepts attached to it. But the URLs are attached to the components (and already are), because that is how they were imported from WORMS, ITIS, QUDT, SWEET, CHEBY, ENVO,... Only the assignment of the parameters to these external imports is still in progress.
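
The faceting idea in this comment can be pictured as counting datasets per attached term instead of per full parameter name. The Python sketch below is only a picture of that idea under invented data: the dataset names, the term assignments and the function `facet_counts` are hypothetical and say nothing about how the search is actually implemented.

```python
from collections import defaultdict

# Hypothetical terms attached to parameters; each parameter carries several.
PARAMETER_TERMS = {
    "Globigerina bulloides, d13C": ["Globigerina", "Delta 13C Measurements"],
    "Globigerina bulloides, d18O": ["Globigerina", "Delta 18O Measurements"],
    "Cibicidoides wuellerstorfi, d13C": ["Cibicidoides", "Delta 13C Measurements"],
}

# Hypothetical datasets and the parameters they contain.
DATASET_PARAMETERS = {
    "dataset-1": ["Globigerina bulloides, d13C", "Globigerina bulloides, d18O"],
    "dataset-2": ["Cibicidoides wuellerstorfi, d13C"],
    "dataset-3": ["Globigerina bulloides, d13C"],
}


def facet_counts(hits):
    """Count, per attached term, how many of the found datasets carry it."""
    datasets_per_term = defaultdict(set)
    for dataset in hits:
        for parameter in DATASET_PARAMETERS.get(dataset, []):
            for term in PARAMETER_TERMS.get(parameter, []):
                datasets_per_term[term].add(dataset)
    return {term: len(found) for term, found in datasets_per_term.items()}


facets = facet_counts(["dataset-1", "dataset-2", "dataset-3"])
```

Three full parameter names collapse into four broader facets here, and the facet "Delta 13C Measurements" covers all three datasets although they contain different species parameters.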