Which NIH grants require data sharing plans?

NIH currently requires data sharing plans to be included for three types of grant proposals:

  1. All grants seeking “$500K or more in direct costs in any one year” should explain how the final research data will be shared, or justify why the data cannot be shared,
  2. Applications that involve developing model organisms are to include plans for sharing and distributing these resources, and
  3. Studies involving large-scale human and non-human genomic data must adhere to the NIH Genomic Data Sharing Policy.

Other solicitations are increasingly asking for sharing statements, particularly if online archives and public access are significant components. We advise checking your solicitation for any required data sharing statements.

Guidelines for writing and locating the statement

Length of statements: The 2003 NIH guidelines suggest a “brief paragraph”; however, several sentences may be required to address the components they specify. Although the guidelines state that statements “would not count towards the application page limit”, a length of a few paragraphs to less than a page is likely the current norm.

For proposals of more than $500K direct costs per year, there are three sections of the proposal that require discussion of your data sharing plans.  For each section, we have compiled some suggested content and “starter questions” to help you draft your statements.

Data Sharing Plan (to follow immediately after the Research Plan Section 1. Letter of Support, PHS 398)

Data Products to be shared (suggested paragraph content)

  • Indicate the data products that will be shared. Optionally, indicate the amount of data to be shared, especially for particularly large datasets. See the table below, which may be useful for jotting down your planned datasets for sharing.
  • Data product examples: transcripts, tables, 3D models, digital audio, geospatial data, etc. Also include in the list any analytic tools being provided, such as algorithms, code, or software.
  • What is the format of the final dataset? Examples include Excel spreadsheets, text records, JPG images, an SQL database, RTF text, MATLAB files, and MS Excel converted to CSV. Specify whether particular tools or software are required to read the data.
  • Optional: What additional documentation will be included to allow others to use the data? Specify whether the data follow a standard metadata format used by your community, and indicate whether explanatory text files, codebooks, or other documentation will be included.

Worksheet for listing shared data.  Copy this table for jotting down those data products from your project description that you will be sharing.

  Data Product | Shared where? | Shared when? | Formats

Data access and policies

How will the data be disseminated and accessed?  Sharing methods include:

  • Researcher shares data upon request (e.g., by disk or email) or makes data accessible through a personal website.
  • Data will be deposited at a data archive. (Name the archive or data center, and mention whether it is NIH-funded or has data access policies and procedures consistent with NIH data sharing policies.)
  • Data will be accessed through a data enclave (a restricted data center with controlled access, e.g., Hopkins Population Center).
  • Identify conditions for accessing data (e.g., requests for access from identified investigators working at institutions with a Federalwide Assurance) and specify policies for data re-use (e.g., signing a data sharing agreement requesting citation and secure use of data with human subject identifiers).
  • Identify when the data will be shared. NIH policy requires “‘the timely release and sharing’ to be no later than the acceptance for publication of the main findings from the final data set.”
  • Explain any reasons for delay of sharing beyond the expectations of your community, such as patent restrictions, collaborator requirements, or proprietary data from private companies.
  • Provide justification for not sharing data, such as the inability at reasonable cost to remove personal identifiers. NIH expects data sharing to follow the norms of your research community, but encourages efforts to broaden the range of data shared and of potential users beyond your field. Consider whether portions of your data may be shared, such as de-identified case examples or curricular materials.

Background and Significance Section (PHS 398 Research Plan Section B)

If the proposal is seeking support for developing a large database that will serve as an important resource for the scientific community, you may wish to make a statement about this in the Significance section of the application. It may be relevant to mention again any archive or repository being used for access.

Human Subjects Section (PHS 398 Research Plan Section E)

If the research involves human subjects and the data are intended to be shared, this section should discuss how the rights and confidentiality of participants would be protected, any potential risks to research participants posed by data sharing and steps taken to address those risks. These may include:

  • A brief statement about how data is secured during the project through access controls (e.g., password-protected network space).
  • Steps taken to de-identify shared datasets and to protect any data or identification codes retained after the project.
  • Reiterating policies and restrictions for accessing data, whether disseminated through an archive or shared directly by the PI.
  • Stating who will be responsible for protecting human subject data during the period that it is shared and archived.

NIH allows requests for funds to prepare, document, and archive the data, in which case relevant information should be included in the budget and budget justification sections.

Additional Resources

NIH Data sharing guidance: http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#inc

NIH Data sharing FAQs (circa 2004): http://grants.nih.gov/grants/policy/data_sharing/data_sharing_faqs.htm
