
Data Management Plan (DMP) Guide

Learn how to write a data management plan!

FAQs and Definitions

Below are some frequently asked questions related to data management plans, terminology, data sharing, federal requirements, and more. 

Do you have a question not covered by the guide? Send us an email!

What is a data management plan (DMP)?
(also called a data management and sharing plan)

A Data Management Plan (DMP) is a written document that outlines how research data will be managed throughout the research process. It should include information about what data is acquired, how it is stored and organized, who may access it, how it is documented, security and safety measures, software and device information, and post-research/long-term plans for sharing and preservation. A DMP is an effective planning tool and can also be used to onboard new students and collaborators.

Data Management Plans submitted as part of a funding proposal are typically limited to two pages, so the contents are more of an outline or 'sketch' than a procedural document.

Do I really have to keep and share all of my data?

The short answer is "no": there are usually limits on which data must be preserved and shared. You should prioritize keeping and sharing data that:

  1. captures a one-time event, cannot be easily replicated, or has long-term value
  2. underlies and validates published research findings

If you're unsure what to keep, contact the DataShare Team for help.

What is "research data"? Is it different from "scientific data"?

This guide uses the terms data, research data, and scientific data interchangeably as they all refer to data that was collected or generated for the purpose of research. 

The Federal government defines ‘research data’ as:

Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This "recorded" material excludes physical objects (e.g., laboratory samples). Research data also do not include:
     (A) Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and
     (B) Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.

source: OMB Circular A-110, "Uniform Administrative Requirements for Grants and Agreements With Institutions of Higher Education, Hospitals, and Other Non-Profit Organizations"

The National Institutes of Health (NIH) uses the term “scientific data” and defines it as:

Data commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.
     Scientific data includes any data needed to validate and replicate research findings.
     Scientific data does not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects such as laboratory specimens.

source: Final NIH Policy for Data Management and Sharing. NOT-OD-21-013

The two definitions share the same focus (validating research findings) but do not set firm limits. NIH has clarified that a relationship to a publication is not the sole determining factor under its policy, an understanding that also aligns with the federal definition:

Even those scientific data not used to support a publication are considered scientific data and within the final DMS Policy’s scope. We understand that a lack of publication does not necessarily mean that the findings are null or negative; however, indicating that scientific data are defined independent of publication is sufficient to cover data underlying null or negative findings.

source: Final NIH Policy for Data Management and Sharing. NOT-OD-21-013

What is metadata?

A simple definition of metadata is that it is "data about data" (how meta!). A better definition is that metadata is structured information that describes data. For a simple example, think of how you would describe a book you liked to someone else: title, subject matter, length, author, date of publication, etc. These are pieces of (meta)data that describe the book (data).

Data has no value unless there is also information about what it is, where it came from, and who made it. When done well, metadata fills in these gaps and makes data discoverable, accessible, and understandable. 
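To make this concrete, here is a minimal sketch of what a metadata record for a dataset might look like and how you could check that the basics (what it is, where it came from, who made it) are filled in. The dataset, the field names, and the list of required fields are all hypothetical and chosen only for illustration; a real data repository will usually specify its own metadata fields.

```python
import json

# Hypothetical metadata record for an illustrative dataset. The field names
# are assumptions for this sketch, not a formal metadata standard.
dataset_metadata = {
    "title": "Soil moisture readings, example field site",
    "creator": "Example Researcher",
    "date_collected": "2023-06-01",
    "description": "Hourly soil moisture measurements from three sensors.",
    "file_format": "CSV",
    "keywords": ["soil moisture", "sensors"],
}

# Assumed minimum set of fields: what the data is, who made it, and when.
REQUIRED_FIELDS = ("title", "creator", "date_collected", "description")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Print the record as structured text that can travel alongside the data files.
print(json.dumps(dataset_metadata, indent=2))
print("Missing fields:", missing_fields(dataset_metadata))
```

Saving a record like this next to the data files is one simple way to keep the description attached to the data it describes.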

What is a data repository and why should I use one?

Data repositories are devoted to sharing and keeping data accessible, safe, and secure. They use special software, metadata, workflows, and networks to meet these goals. Data repositories also help guarantee authenticity by providing control mechanisms and change logs. They are the best choice for research data sharing, distribution, and preservation.

Data repositories often have limits and restrictions governing which data they accept. Most have rules covering data formats and size limits, and require that data be documented. Some accept data from any research area, while others will only accept research from specific domains (such as "biology" or "social science"). The latter are known as disciplinary data repositories. Another type of specialized repository is the institutional data repository, which focuses on collecting the outputs of a select group, such as a university or federal agency.

The Data Sharing portion of the guide provides more information and resources for locating data repositories.
