Data Sharing

Why Share Data?

Data sharing is the act of providing research data to others, and it constitutes a crucial aspect of open science. It involves providing access to datasets and their associated documentation and code (when applicable). This open exchange of data, documentation, and code allows others to use, study, and build upon the data to advance knowledge and promote transparency in research and decision-making processes.

Numerous funding agencies, publishers, and institutions have established policies mandating research data sharing. Sharing data serves as a means to enhance the reuse of data, promoting greater efficiency and collaboration in research endeavors. While websites, FTP, and email exchange can be used for sharing data informally, researchers should strongly consider using data repositories for sharing research data, as they offer numerous advantages in terms of data preservation, discoverability, security, and compliance with academic and funding requirements.

When writing a Data Management Plan (DMP) or preparing data for sharing, researchers should choose deliberate, reliable approaches and avoid ad hoc provision of data. Doing so helps uphold the research's credibility and transparency, enabling easier replication, collaboration, and scientific progress.

Funders recognize that not all data can be made publicly available, but they expect researchers to explicitly acknowledge any anticipated restrictions and provide a compelling justification for them. Consider ethical issues when dealing with human subjects' data or sensitive data of any kind (e.g., endangered species or protected lands). If access and sharing restrictions apply to your data, address plans to anonymize the data and protect privacy and confidentiality while maximizing data sharing whenever possible.

Data can be shared by depositing it in a data repository, publishing it as a supplement to an academic paper or in a data journal, or providing it to individual researchers upon request. Sharing upon request should be considered the last resort. Avoid adding a generic data availability statement to your DMP, reports, and communications; instead, state precisely how the data will be provided. Taylor & Francis's guide offers advice on writing more effective and compelling data availability statements.

According to the Scholarly Publishing and Academic Resources Coalition (SPARC), open data:

  • Accelerates the pace of discovery. When datasets are openly available, they can be easily accessed and used to create a fuller picture of a given area of inquiry, or analyzed by data mining software that can uncover connections not apparent to those who produced the original data.
  • Grows the economy. Researchers estimate that $3.2 trillion in economic output could be added to global GDP through Open Data across all sectors, with scientific and scholarly data playing an important role.
  • Helps ensure we don’t miss breakthroughs. There are a huge number of ways to use or analyze any given dataset. What seems like noise to one person could be an important discovery to someone else with a different perspective or analytical technique.
  • Improves the integrity of the scientific and scholarly record. When the data that underlies findings is accessible, researchers can check each other’s work and ensure that conclusions are built upon a firm foundation.
  • Is becoming recognized by many in the research community as an important part of the research enterprise of the 21st century. From research funders like the US government to publishers, institutions involved in the research process are beginning to require that, at the very least, the data that underlies publications be made openly accessible.

Recommended Resources

  • Journals - TOP Factor
    • Search for journals and identify their data-sharing requirements.
  • Funders - SPARC Open Data
    • Browse Article and Data Sharing Requirements by Federal Agency.

Archiving

The most secure way to ensure future access to research data is to entrust data files to a stable, trustworthy online data repository. Some repositories accept only data files from a specific discipline or institution (such as a university), while others take any data files associated with the research process. Repositories also typically provide an interface for others to search for and download research data, are indexed by search engines such as Google Dataset Search, and offer curation services to improve data quality. Many repositories also provide secure storage through backups and integrity checks that support long-term access and preservation.
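One common kind of integrity check is a fixity check: recording a checksum for each file and later confirming that it has not changed. The sketch below is only an illustration of that idea, not a description of how any particular repository works. The function names (`sha256_of_file`, `build_manifest`, `changed_files`) and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of_file(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(data_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A researcher could build such a manifest before deposit and keep it alongside the dataset documentation, so that anyone who downloads the files later can confirm they match what was originally shared.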

By depositing data in reputable repositories and archives, researchers contribute to the collective knowledge and safeguard their work against loss or obsolescence. These practices promote scientific integrity, support reproducibility, and advance the overall progress of human knowledge across various fields of study.

Numerous options exist for disseminating your data to a broad audience, with various data repositories offering different access levels and support. Data archives and repositories that employ data experts to provide curation services and ensure the long-term management of your data are instrumental in preserving data for the future.

The UC system is a member organization of Dryad, a general-subject data repository. Dryad is a FAIR-compliant, curated repository that is free for UC researchers. Data deposited in Dryad are made publicly available under a CC0 license, and deposits satisfy data-sharing mandates from publishers and funders.

We strongly advise researchers to start with a trusted repository. Researchers should first determine whether an appropriate, reliable domain repository exists and can house their research data. The websites FAIRsharing.org and re3data.org can help with this determination. If using a domain repository is not possible, researchers should review the Generalist Repository Comparison Chart and consider institutional repositories (when available) as a location to store their data. In summary:

  1. Place your data in a data center or repository specific to your field, or
  2. Deposit your data in a curated and certified interdisciplinary repository like Dryad.
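The order of preference above can be sketched as a small helper. This is a minimal illustration under stated assumptions, not an official decision tool: the function name `repository_options` and its parameters are hypothetical, and listing the curated generalist repository ahead of an institutional one is an assumption, since the guidance does not rank those two.

```python
from typing import List, Optional


def repository_options(
    domain_repository: Optional[str] = None,
    institutional_repository: Optional[str] = None,
    curated_generalist: str = "Dryad",
) -> List[str]:
    """Return repositories to consider, in the order suggested above."""
    # A suitable domain repository is always the first choice.
    if domain_repository:
        return [domain_repository]

    # Otherwise fall back to a curated generalist repository, then an
    # institutional repository if one is available (ordering assumed).
    options = [curated_generalist]
    if institutional_repository:
        options.append(institutional_repository)
    return options
```

For example, calling `repository_options()` with no arguments returns only the curated generalist default, while passing a domain repository name returns that repository alone.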

It's important to note that alternative data-archiving methods, which some publishers may prefer or require, may not guarantee curation or long-term preservation. These methods include:

  • Submitting your data alongside a related publication to a journal publisher.
  • Publishing your data in a data-focused journal.
  • Submitting your data to uncurated general repositories such as Figshare and Zenodo.

While personal or laboratory websites, Electronic Lab Notebooks (ELNs), wikis, and similar tools can assist in collaboration and short-term sharing, they are typically unsuitable for long-term archival and preservation. Likewise, GitHub is not meant for storing data files long-term. It is useful for ongoing projects and well suited to collaboration and tracking contributions; however, it is not designed to preserve materials for the future or to make them easy for others to find. GitHub is a commercial platform owned by Microsoft, and it does not display citations or other usage metrics. It is also a poor choice for projects that combine public and private data and require special data-use agreements.

The Research Data Services (RDS) Department is available to aid researchers in choosing an appropriate repository, data journal, or other data-sharing approach to ensure that their data remains easily discoverable, accessible, and preserved as an integral part of the academic record. When in doubt, consult with our team: rds@library.ucsb.edu.

 

Recommended Resources

  • Dryad - datadryad.org
    • A curated data publishing platform for researchers to share and publish their data. Free for UC affiliates.
  • Registry of Research Data Repositories - re3data.org
    • Offers researchers, funding organizations, libraries, and publishers an overview of existing repositories for research data. 
  • FAIRsharing Databases
    • FAIRsharing.org
      • A registry of knowledge bases and repositories of data and other digital assets.
    • Generalist Repository Comparison Chart 
      • A chart designed to assist researchers in finding a generalist repository should no domain repository be available to preserve their research data. 
  • Graphic Handouts