Dear Fiona Legg and Yukiko Fukasaku,
The Alfred Wegener Institute for Polar and Marine Research (AWI), Germany, has received the draft OECD Recommendation on data access and would like to comment on it. AWI operates a World Data Center and thus represents a part of the scientific community that handles geo-spatial data.
The 50-year transition from printed data to electronic holdings, together with the related technology, has now reached a stage where nearly any amount of data can be archived and made directly accessible through the Internet, so the OECD Recommendation comes at the right time. The answer to the first question, about the usefulness of the recommendation, is clearly 'yes'. It is not just useful; there is now an urgent need for such a recommendation.
In answer to the question of which points are not clear, I would like to focus on the introduction. The quotation '...growing quantities of data are collected...' gives the impression that the data are 'there'. In fact, most data are not archived, described and provided in a sustainable way. We have detailed insight only into the world of geo-spatial data, but because the problem seems to be a fundamental one, the situation may be similar in the other major research fields: at present, we are losing a significant part of our scientific cultural heritage every day, especially data from publicly funded research. Depending on the scientific field, up to 90 percent of the data are lost because they are not archived in a sustainable way in established systems or centers. The introduction should point out this major problem very clearly.
We also suggest adding another point to the 'opportunities and benefits': added scientific value through standardized integration.
To avoid any further discussion about how to 'acknowledge' a data provider, citation in its well-defined and established bibliographic sense should also be recommended for data. As part of the citation, data providers should also use a persistent identifier to ensure reliable long-term access (as already established for 20 million publications).
Furthermore, it should be recommended that information about data holdings (metadata) be distributed as widely as possible to search engines, portals and library catalogs. (The technology and standards are available.) The recommendation 'no data without metadata' should also include 'no metadata without data'. (There are many metadata systems on the Internet that do not point to the actual data and are therefore useless.)
It is also our impression that the OECD Recommendations should be more 'condensed'.
When dealing with data from 'public funding', all participants in the scientific process are responsible. The recommendations should make clear that in particular those who provide the public money, i.e. the funding organizations, need to 'convince' their 'customers' to include a well-established data policy that defines the data flow, the final library and the resources necessary for proper data archiving. In most OECD member countries, such a policy is still absent.
I understand that recommendations from the OECD must be very general. However, from our experience and our view on data, some very specific recommendations have already been discussed in the scientific community and should now be implemented step by step. We add those points for you as background information. Perhaps you will find a way to add parts of them 'between the lines'.
- Give credit to the data provider by establishing data citation and data publication, similar to those used in the conventional publication process.
- Establish peer review for data publications to increase acceptance of and trust in their content, and add an impact factor for the data provider.
- Raise the status of data centers to one similar to that of libraries by ensuring a well-defined policy on organization, duties and responsibilities, and a commitment to long-term access.
- Establish Open Access for data (similar to publications) as an integral and mandatory part of any research project.
With best regards,
hg