How are you managing your research data? How will you document your data and make it discoverable by other researchers? How long must it be retained? Many federal agencies and other funders now require answers to these questions. Making a data management plan before research begins is essential to keeping data usable, preserved, and accessible.
Proper management of research data may increase visibility of research; meet requirements of funding agencies (NSF, NIH); follow policies of journal publishers (Nature, Science, PLoS); address preservation considerations; and reinforce open access and scientific inquiry.
Data management throughout the research cycle includes collecting and processing data, analyzing data and publishing research results, and planning for long-term access and storage. The UCSB Library can assist researchers with data management through services offered by the California Digital Library (CDL). The CDL, along with other institutions, has developed an online tool that offers guidance on creating a data management plan. Use it to generate a plan and submit it with your funding proposals. The data management guidelines website (www.cdlib.org/services/uc3/datamanagement) links to detailed information on the creation, organization, and management of data.
A key component of the data management plan is the naming and organizing of files. To assist with this, the UCSB Library has subscribed to EZID, a service that makes it easy to obtain and manage long-term identifiers for digital content. With the Library's support, research units and departments on campus can set up an account and begin creating persistent identifiers for digital content.
Matt Jones, a researcher with NCEAS, describes how he is using the EZID service:
The EZID service allows us to assign citable, permanent Digital Object Identifiers (DOIs) to scientific data sets in the Knowledge Network for Biocomplexity (KNB), which is a community data repository serving the ecological, environmental, and Earth sciences. The KNB repository will be one of the founding Member Nodes within the DataONE data sharing federation. UCSB is one of three national Coordinating Nodes for DataONE, and thus will be providing an integrated suite of data preservation services to scientists across the world. Our use of global, persistent identifiers will allow the data in these repositories to be accurately cited in scientific papers and other outlets, thereby improving support for open, transparent science.
There are obviously a lot of details surrounding this rollout—we are now working on integrating EZID with our current software platforms, and testing it. We have not yet gone live with the service. When we do, we’ll be assigning up to 150,000 DOIs in the first round to cover our existing historical data sets. Future data sets may be assigned DOIs as they are created.
James Frew, a faculty member at the Bren School of Environmental Science and Management and co-founder of the Federation of Earth Science Information Partners, describes the growing role of data citation:
The Earth science community is becoming increasingly aware of the importance of citing datasets with the same care and precision given to citing scholarly publications. At its January 2012 meeting, the Federation of Earth Science Information Partners adopted guidelines to help the community standardize on formats and techniques for data citation. Robust identifiers play a key role in making these citations credible. The recommended identifier technologies are DOIs and ARKs (Archival Resource Keys), which are both supported by EZID. It’s clear that EZID is becoming the service of choice for managing scientific data identifiers.
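For readers curious about what a persistent identifier looks like in practice, the short Python sketch below turns a DOI or an ARK into a web address that can be placed in a citation. It is an illustration only, not part of EZID: the function name `resolver_url` is hypothetical, the two sample identifiers are placeholders rather than real data sets, and it assumes the public resolvers at doi.org (for DOIs) and n2t.net (for ARKs).

```python
def resolver_url(identifier: str) -> str:
    """Return a web address that resolves a DOI or ARK identifier.

    Hypothetical helper for illustration; assumes the public resolvers
    doi.org (DOIs) and n2t.net (ARKs).
    """
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        # Drop the "doi:" label; the resolver expects only prefix/suffix.
        return "https://doi.org/" + ident[len("doi:"):]
    if lowered.startswith("ark:"):
        # The "ark:" label is part of the identifier and is kept in the URL.
        return "https://n2t.net/" + ident
    raise ValueError(f"unrecognized identifier scheme: {identifier!r}")


if __name__ == "__main__":
    # Placeholder identifiers, not real data sets.
    for sample in ("doi:10.5072/FK2EXAMPLE", "ark:/99999/fk4example"):
        print(sample, "->", resolver_url(sample))
```

Because the identifier stays the same even if the data set moves to a new server, a citation built this way should keep working as long as the identifier's record is maintained.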
For further information, please contact your subject librarian or Janet Martorana (martoran [at] library [dot] ucsb [dot] edu or 893-8724).
