Data Sharing and JHU Data Archive


Data Sharing: What, Why and How?

Data sharing involves making your research data available to others through a variety of mechanisms. Which data are shared, to whom, and when are issues that your data management plan should address. For example, should both processed and raw data be shared? What legal, ethical and practical considerations exist in selecting research data to share?

What are research data?

Researchers often ask what constitutes their data. Johns Hopkins University defines research data as “records that would be used for the reconstruction and evaluation of reported or otherwise published results” in its policy on access and retention of research data and materials. Examples include laboratory notebooks, numerical raw experimental results and instrumental outputs.

Storing data in use, archiving completed projects

Storing and backing up research data is, of course, critical during research. However, these actions are not sufficient to ensure the data’s future usability for you and your research community. When ending a research project or project phase such as data collection, consider taking time to prepare an archived copy of your research data. Archiving research data is not simply taking stored data out of active use; it requires a few additional steps:

  • protecting data, which requires safeguards and periodic checks of file integrity on storage media
  • documenting data to ensure that data can be used and interpreted in the future, especially by others. This includes organizing the data as an identifiable collection with a stable reference.
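
The periodic file integrity checks mentioned above are commonly done by recording a checksum for each file and re-computing it later. The following is a minimal sketch, assuming SHA-256 checksums and an archive kept in a single folder; the function names (sha256_of, build_manifest, check_manifest) are hypothetical and do not describe how the JHU Data Archive or any particular repository performs its checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch, build_manifest would be run once when the archived copy is prepared, with the result saved alongside the data, and check_manifest would be run on a schedule. An empty list means every recorded file is present and unchanged.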

Archiving research data builds upon the storage process, providing for long-term access to the data and preparing the data for deposit into a data repository if desired (see the figure at right).

 

Advantages to sharing data

While the sharing of research data is expected by some funding agencies, such as NSF, sharing also has many advantages for researchers themselves. In a 2010 UK study on open data, researchers identified the following as benefits to themselves:

 

  • Enhancing visibility of research
  • Increasing the efficiency of research due to reusability and exposure
  • Enabling researchers to ask new research questions and potentially further science
  • Promoting scientific integrity and replication
  • Enhancing collaboration and community-building

Restrictions to sharing research data

Not all research data can or should be shared, for legal, ethical or practical reasons. Your data management plan should address any restrictions on sharing your research data with others. The table below outlines some of the restrictions to consider. Information on Johns Hopkins University policies, including IRB requirements and intellectual property definitions, can be found on the JHU Policies page.

 

Term                    Definition
Privacy                 Information that identifies an individual (e.g., HIPAA, IRB)
Confidentiality         Information that should not be shared (e.g., embargo period, trade secret)
Security                Threats to something and someone through release of data
Intellectual Property   New, intangible creations (e.g., patents, copyright)

Ways to share research data

Scientists can disseminate their data through various solutions, each with pros and cons to consider. As shown in the figure at right, access to and use of your research data will be facilitated by file sharing services or the use of a data archive. However, these solutions may require more effort than sharing the data upon request. A JHU data management consultant can help you assess your options for a sharing solution.

 

 


Data repositories for sharing data

Archived data collections can be more easily shared, whether by direct request or via websites. Or, consider archiving at a data repository to expand the access, discoverability and active management of your data collections. A data repository is a digital system and actively managed service for providing access to data. Repositories vary in their capabilities, but most include the following to varying degrees:

  • Providing a web-accessible interface for discovering and downloading research data collections.
  • Managing preservation of digital objects through measures such as file integrity checking and redundant offsite backups.
  • Assigning identifiers, such as DOIs (digital object identifiers), to give datasets persistent location links and citations similar to those of journal articles.
  • Describing projects and files, and including documentation sufficient for using the collection without contacting the researcher.

We have developed guidance for researchers on Selecting a Repository for Data Deposit. You can also search for repositories for your field on the re3data.org website and contact us for assistance in locating a suitable data repository.

Data citation

Citations for research data are important both for giving researchers proper credit for shared research data and for facilitating references to datasets in publications. One advantage of depositing your data into a data repository or archive is that the data often receive a unique identifier (e.g., a DOI, like those for journal articles) that is permanently associated with the dataset to facilitate proper citation. These data repositories also often create and display a proper data citation so users know exactly how to cite the downloaded data. Although formal data citation formats are still emerging, a number of groups have established guidelines. In general, citations contain a title, author, date, distributor, version and locator/identifier, but other citation elements are possible, such as release date and resource type. Below are examples of data citations from three different data archives:

 

  • ICPSR: Johnston, Lloyd D., Jerald G. Bachman, Patrick M. O’Malley, and John E. Schulenberg. Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2007 [Computer File]. ICPSR22480-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-10-29. doi:10.3886/ICPSR22480.
  • ESIP Federation: Ishikawa, M. 2002. Inventory of Rock Glaciers along the Ghunsa Valley, Kanchanjunga Himal, Eastern Nepal. Boulder, CO: National Snow and Ice Data Center. Digital Media.
  • JHU Data Archive: Zhang, Q., Harman, C. J., and Ball, W.P., 2016. Data associated with An Improved Method for Interpretation of Riverine Concentration-Discharge Relationships Indicates Long-Term Shifts in Reservoir Sediment Trapping. Version 1. Johns Hopkins University Data Archive. http://dx.doi.org/10.7281/T18G8HM0.

For more information on data citation, please see the project website for DataCite and the Digital Curation Centre’s guide on “How to Cite Datasets and Link to Publications”.
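
To illustrate how the citation elements listed above (author, date, title, version, distributor and identifier) fit together, here is a minimal sketch that assembles them into one string. The DataCitation class and its field names are hypothetical, and the layout simply mirrors the JHU Data Archive example above; it is not a formal citation standard, and other archives order and punctuate the elements differently.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the common data citation elements."""
    authors: str
    year: str
    title: str
    version: str
    distributor: str
    identifier: str

    def format(self) -> str:
        # Layout assumed from the JHU Data Archive example:
        # Authors, Year. Title. Version N. Distributor. Identifier.
        return (
            f"{self.authors}, {self.year}. {self.title}. "
            f"Version {self.version}. {self.distributor}. {self.identifier}."
        )


citation = DataCitation(
    authors="Zhang, Q., Harman, C. J., and Ball, W.P.",
    year="2016",
    title=(
        "Data associated with An Improved Method for Interpretation of "
        "Riverine Concentration-Discharge Relationships Indicates "
        "Long-Term Shifts in Reservoir Sediment Trapping"
    ),
    version="1",
    distributor="Johns Hopkins University Data Archive",
    identifier="http://dx.doi.org/10.7281/T18G8HM0",
)
print(citation.format())
```

Keeping the elements as separate fields, rather than storing only the finished string, makes it easier to re-render the same citation in whatever style a journal or repository asks for.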

Archiving Services We Offer

Johns Hopkins University Data Management Services provides archiving services for the Johns Hopkins research community through the JHU Data Archive. While some academic disciplines have established research data repositories, many fields of research do not have easily available options for archiving and sharing data. Our archiving services give researchers the opportunity to share their data outside of original collaborations and beyond the life of a research project.

Characteristics of the JHU Data Archive:

  • Data from any research discipline and with any file format
  • Each dataset given a permanent citation and DOI, facilitating both attribution for authors and linkage to research publications
  • Preservation of research data through regular file integrity checks and retention of multiple copies

If you are interested in archiving your data with the JHU Data Archive, please take a look at our Archiving Process and Contact Us to discuss your research and data access needs. We can help you consider all of your options for data repositories, and whether the JHU Data Archive would be an optimal choice. Archiving services for projects under 1 TB are FREE. For archiving larger datasets, please contact us to discuss fees that may apply.

If considering archiving and sharing data for a grant project, it is important to contact JHU Data Management Services before submitting the proposal. Your archiving choices can be included in the proposal’s Data Management Plan if required by your funder. And we can help you prepare your plan too!