Data Curation and Management

Distinct from, but closely related to, the principle of open access to published research results is the principle of open data – that research data should be shared and publicly accessible.  Open data can increase the impact of your research, make attribution easy, and build your research reputation.  It also allows other scholars to verify the research conclusions, to analyze the data in ways and for purposes that the original researchers might not have considered, and to pull together data from diverse sources leading to new discoveries and outcomes.

How are you managing your research data?  How will you document your data and make it discoverable by other researchers?  How long must it be retained?  These are questions to which many federal agencies and other funders are now requiring answers.  To ensure that your data remains usable, preserved, and accessible, it is essential to make a plan for managing it before you begin your research.

The UCSB Library can assist you with data management issues and offers tools and resources to help with creation, storage, dissemination, and preservation of your data.

Funding Agency Requirements

Increasingly, funding agencies are implementing data management and sharing requirements.  For a list of agencies, see:

Federal Funding Agencies:  Data Management and Sharing Policies
Compiled by the California Digital Library (CDL)

Create a Data Management Plan

Data management throughout the research cycle includes collecting and processing data, analyzing data and publishing research results, and planning for long-term access and storage.  The UCSB Library can assist researchers with data management through services offered by the CDL.

Data Management Plan Tool (DMPTool)
The DMPTool, developed by the CDL and its partners, offers step-by-step instruction and templates for creating, publishing and sharing data management plans that satisfy funding agency mandates.

Preservation and Archiving

University of California Curation Center (UC3)
UC3 is a suite of digital curation services established to assist researchers in sharing, managing, and preserving their digital content. The Center defines digital curation as “the set of policies and practices focused on maintaining and adding value to trusted digital content for use now and into the indefinite future.  Curation encompasses preservation and access, and can be applied to the humanities, social sciences, and sciences.”

  • Merritt
    Merritt is the UC3 repository for datasets and other content (audio, video, texts, etc.).  Use Merritt to provide long-term preservation of digital assets, manage and share your research with others, or meet the data sharing and preservation requirements of a grant-funded project.
  • EZID
    A key component of the data management plan is the naming and organizing of files.  To assist with this, the UCSB Library has subscribed to EZID, a UC3 service that makes it easy to create and manage long-term identifiers (DOIs or ARKs) to provide a stable point of reference to online content (papers, datasets, video clips, etc.).  Metadata can be added and updated to increase discoverability.  Supported by the Library, research units and departments on campus can set up an account and begin creating persistent identifiers for digital content.  For further information, please contact your subject librarian or Stephanie Tulley (stulley [at] library [dot] ucsb [dot] edu or 893-7225).  To apply for an EZID account, complete the request form and submit it to stulley [at] library [dot] ucsb [dot] edu.
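
As an illustration of how a persistent identifier gives a stable point of reference, the following minimal Python sketch turns a DOI or ARK string into a URL that a public resolver can follow.  It assumes the commonly used resolvers doi.org (for DOIs) and n2t.net (for ARKs); the function name resolver_url and the sample identifiers are hypothetical, and the sketch does not call EZID itself.

```python
def resolver_url(identifier: str) -> str:
    """Return a resolvable URL for a DOI or ARK string.

    Assumption: DOIs resolve through https://doi.org/ and ARKs
    through https://n2t.net/. This helper is illustrative only.
    """
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        # Drop the "doi:" scheme label; the resolver expects the bare DOI.
        return "https://doi.org/" + ident[4:]
    if lowered.startswith("ark:"):
        # ARKs keep their "ark:" label when appended to the resolver.
        return "https://n2t.net/" + ident
    raise ValueError(f"not a DOI or ARK: {identifier!r}")


# Hypothetical identifiers, for illustration only.
print(resolver_url("doi:10.1234/EXAMPLE"))
print(resolver_url("ark:/12345/x1"))
```

Because the identifier, not the hosting location, is what gets published, the content can move to a new server without breaking citations as long as the identifier's target is updated.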

Data Repositories

Depositing your data in an archive will facilitate its discovery and preservation.  In addition to UC3’s Merritt, there are other repositories for open data; many of these are discipline-focused.

Citing Data

Why cite data?

  • Get credit for your data and build your reputation.
  • Your data is discoverable and can be attributed to you.
  • Other researchers can find data associated with a publication and explore new ways to use it.

See DataCite for examples of data citation.
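
As a rough illustration, the minimal Python sketch below assembles a data citation from its basic elements in the general order DataCite recommends: creator, publication year, title, optional version, publisher, and identifier.  The function name format_data_citation and all sample values are hypothetical; consult DataCite or your target journal for the exact style required.

```python
def format_data_citation(creators, year, title, publisher, identifier, version=None):
    """Build a citation string: Creator (Year). Title. [Version.] Publisher. Identifier

    Illustrative helper only; the element order follows the general
    DataCite pattern, and punctuation is an assumption.
    """
    authors = "; ".join(creators)
    parts = [f"{authors} ({year})", title]
    if version:
        parts.append(f"Version {version}")
    parts.append(publisher)
    # No trailing period after the identifier, so a URL stays clickable.
    return ". ".join(parts) + ". " + identifier


# Hypothetical dataset, for illustration only.
print(format_data_citation(
    ["Doe, J.", "Roe, R."],
    2012,
    "Example Survey Data",
    "Example Repository",
    "https://doi.org/10.1234/EXAMPLE",
))
```

Including the persistent identifier in the citation is what lets other researchers locate the exact dataset behind a publication.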

Privacy and Confidentiality

The federal government [1] acknowledges that data cannot always be shared for reasons of privacy or confidentiality, and many repositories are able to keep data private.  Under the federal terms and conditions, the following types of material are NOT for sharing:

  • preliminary analyses
  • drafts of scientific papers
  • plans for future research
  • peer reviews
  • communications with colleagues
  • physical objects (e.g., laboratory samples)
  • trade secrets
  • commercial information
  • materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law
  • personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study

[1] Circular No. A-110 - Uniform Administrative Requirements for Grants and Agreements With Institutions of Higher Education, Hospitals, and Other Non-Profit Organizations (11/19/93, as further amended 9/30/99).

Keeping Current

Data Pub
Data Pub is the California Digital Library's blog on data publication, data sharing, data archiving, data citation, open data, open science, and related topics.  California Digital Library staff will post here, and UC faculty and researchers are invited to participate as well - by reading, commenting, tweeting or contributing a post.