Data Seal of Approval

This page is a copy of the Guide for the Data Seal of Approval Assessment. Pangaea-related answers are added after each question.

This guide is designed to help data managers who want to prepare an assessment of their repository in order to apply for the Data Seal of Approval (DSA). It lists each of the DSA’s guidelines with suggestions of topics for inclusion and discussion. It is neither prescriptive nor exhaustive. Wherever possible, each guideline should be addressed in the assessment by a link to a publicly available statement (preferably in English) that relates to the issues noted below the guideline.

 1.	The data producer deposits the research data in a data repository qualified according to the DSA guidelines. This guideline simply refers to the DSA status of the repository. The status will be one of: “not assessed”, “pending assessment”, “assessed” or “assessed, pending re-assessment”.

 2.	The data producer provides the research data in formats recommended by the data repository. This guideline relates to the level of guidance which the repository gives to the data producer before and at the time of submission to the repository. The response should concentrate on what the repository contributes to make it possible for the data producer to follow this guideline:
 * Does the repository publish a list of preferred formats? Are tools used to check compliance with the official specifications of the formats? What is the repository’s approach towards data that is deposited in non-preferred formats?
 * Are Quality Control checks in place at the repository to ensure that data producers adhere to the preferred formats?
 * Does the repository ask the depositors to provide detailed information about their file formats and the tools and methods by which the files were created?

 3.	The data producer provides the research data together with the metadata requested by the data repository. This guideline relates to the level of guidance which the repository gives to the data producer before and at the time of submission to the repository. The response should concentrate on what the repository contributes to make this possible for the data producer:
 * Are deposit forms which hold resource discovery metadata used?
 * Are there other user friendly ways for users to provide metadata?
 * What kind of Quality Control is in place at the repository to check that the data producer adheres to the request for metadata?
 * Are there tools to create metadata at the level of files?
 * Are metadata elements derived from established metadata standards, registries or conventions? If so, list them and show the level of adherence to those standards.
 * Are these metadata items relevant for the data consumers?
 * What is the repository’s approach if the metadata provided is insufficient for long term archiving?

 4.	The data repository has an explicit mission in the area of digital archiving and promulgates it. This guideline relates to the level of authority which the repository has.
 * Does the repository have a Mission Statement? Does it clearly reference a commissioning authority?
 * Does the repository have a document which outlines the way in which the mission statement is implemented?
 * Does the repository carry out promotional activities?
 * What level of succession planning has taken place in the event of the repository ceasing to exist?

 5.	The data repository uses due diligence to ensure compliance with legal regulations and contracts. This guideline relates to the legal regulations which impact on the repository.
 * What is the legal position of the repository?
 * Does the repository use model contract(s) with data producers?
 * Does the repository use model contract(s) with data consumers?
 * Are the repository’s conditions of use published?
 * Are there measures in place if the conditions are not complied with?
 * How does the repository ensure knowledge of and compliance with national and international laws?

 6.	The data repository applies documented processes and procedures for managing data storage. This guideline relates to the ability of the repository to manage data.
 * Does the repository have a preservation policy?
 * What is the repository’s strategy towards backup / multiple copies?
 * What form of data recovery provisions are in place?
 * Are Risk Management techniques used to inform the strategy?
 * What levels of security are acceptable for the repository?
 * Are there checks on the consistency of the archive?
 * How is deterioration of storage media handled and monitored?

 7.	The data repository has a plan for long-term preservation of its digital assets. This guideline relates to the ability of the repository to provide continued access to data.
 * What provisions are in place to take into account the future obsolescence of file formats?
 * What provisions are in place to ensure long term data usability?

 8.	Archiving takes place according to explicit workflows across the data life cycle. This guideline relates to the levels of procedural documentation for the repository.
 * Does the repository have procedural documentation for archiving data?
 * If so, provide references to:
   * Workflows
   * Decision-making process for archival data transformations
   * Skills of employees
   * Types of data within the repository
   * Selection process
   * Approach towards data that does not fall within the mission
   * Guarding privacy of subjects, etc.
   * Clarity to data producers about handling of the data

 9.	The data repository assumes responsibility from the data producers for access and availability of the digital objects. This guideline relates to the levels of responsibility which the repository takes for its data.
 * What licences / contractual agreements does the repository have with data producers?
 * How does the repository enforce licences with the data producer?
 * Please describe your crisis management.

 10.	The data repository enables the users to utilize the research data and refer to them. This guideline relates to the formats in which the repository provides its data.
 * In what form are data provided to end users? (E.g., are data provided in formats used by the research community?)
 * How do potential users find data? What search facilities are offered? Is OAI harvesting permissible? Is deep searching possible?
 * Does the repository offer Persistent Identifiers?
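
To illustrate the OAI harvesting question above: a harvester typically issues an OAI-PMH `ListRecords` request against the repository's endpoint. The following is only a minimal sketch of how such a request URL could be built. The endpoint `https://repository.example.org/oai` and the function name `list_records_url` are hypothetical placeholders, not the address or tooling of any particular repository; `verb`, `metadataPrefix`, `set` and `from` are standard OAI-PMH request arguments.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    # verb and metadataPrefix are required for a ListRecords request.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective harvesting by set and by datestamp.
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint, used for illustration only.
    print(list_records_url("https://repository.example.org/oai", from_date="2020-01-01"))
```

A repository that permits harvesting would answer such a request with an XML list of metadata records, which is one concrete piece of evidence an assessment could link to.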

 11.	The data repository ensures the integrity of the digital objects and the metadata. This guideline relates to the information contained in the digital objects and metadata and whether it is complete, whether all changes are logged and whether intermediate versions are present in the archive.
 * Does the repository utilise checksums? What type? How are they monitored?
 * How is the availability of data monitored?
 * Does the repository deal with multiple versions of the data? If so, how?
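
As an illustration of the checksum question above, a periodic fixity check could look roughly like the following sketch. It assumes SHA-256 checksums and a manifest that maps file names to the checksum recorded at ingest. Both choices, and the names `sha256_of` and `audit`, are assumptions made for this example and do not describe any repository's actual procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict, root: Path) -> dict:
    """Compare stored checksums with the files currently under root.

    manifest maps a relative file name to the checksum recorded at ingest
    (a hypothetical layout chosen for this sketch).
    """
    report = {}
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file():
            report[relative_name] = "missing"   # file is no longer available
        elif sha256_of(target) != expected:
            report[relative_name] = "changed"   # content differs from ingest
        else:
            report[relative_name] = "ok"
    return report
```

Running such an audit on a schedule and logging the report would address both the "How are they monitored?" and the availability questions in one pass.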

 12.	The data repository ensures the authenticity of the digital objects and the metadata. This guideline refers to the relationship between the original data and that disseminated, and whether or not existing relationships between datasets and/or metadata are maintained.
 * What is the repository’s strategy for changes? Are data producers made aware of this strategy?
 * How is versioning handled?
 * Does the repository maintain provenance data and related audit trails?
 * Does the repository maintain links to metadata and other datasets, and if so how?
 * How are the essential properties of different versions of the same file compared?
 * Does the repository check the identities of depositors?

 13.	The technical infrastructure explicitly supports the tasks and functions described in internationally accepted archival standards like OAIS. This guideline refers to the level of conformance with accepted standards.
 * What standards does the repository use for reference?
 * How is the standard implemented and, if there are significant deviations from the standard, why is that the case?
 * Does the repository have a plan for infrastructural development?

 14.	The data consumer complies with access regulations set by the data repository. This guideline refers to the contribution of the repository in creating legal access agreements which relate to relevant national (and international) legislation and the levels to which the repository informs the data consumer about the access conditions of the repository.
 * Does the repository use End User Licence(s) with data consumers?
 * Do the repository’s holdings have any special requirements?
 * Are contracts provided to grant access to restricted-use (confidential) data?
 * Does the repository make use of special licences, e.g., Creative Commons?
 * Are there measures in place if the conditions are not complied with?

 15.	The data consumer conforms to and agrees with any codes of conduct that are generally accepted in higher education and scientific research for the exchange and proper use of knowledge and information. This guideline refers to the contribution of the repository in informing data users about any relevant codes of conduct.
 * Does the repository need to deal with any relevant codes of conduct?
 * What are the terms of use to which data consumers agree?
 * Are institutional bodies involved?
 * Are there measures in place if these codes are not complied with?

 16.	The data consumer respects the applicable licences of the data repository regarding the use of the research data. This guideline refers to the contribution of the repository in informing data users about the applicable licences.
 * Are there relevant licences in place?
 * Are there measures in place if these licences are not complied with?

Link

 * Data Seal of Approval (DSA)