Below are some frequently asked questions related to data management plans, terminology, data sharing, federal requirements, and more.
Do you have a question not covered by the guide? Send us an email!
A data management plan (DMP) is a written document that describes the data you expect to acquire or generate during the course of a research project, how you will manage, describe, analyze, and store those data, and what mechanisms you will use at the end of your project to share and preserve your data.
Data management plans required by funding agencies are usually "abstracts" of planned data management activities because of length limits (usually two pages).
A: In general you are required to retain, share, and make accessible data that validates your research findings. You should also consider preserving/sharing data that:
Please refer to the Requirements by Agency section of the guide for more specific guidance on what you are required to retain and share.
A: This guide uses the terms "data" and "research data" interchangeably. The definition of research data used the most often by federal agencies is adapted from OMB Circular A-110:
NOTE: NIH and NASA do not consider summary statistics, tables, charts, and similar items, or laboratory notebooks, to be "research data." Although these items are important documents that may contain data, they are not research data according to these agencies.
NOTE: Preserving and sharing samples, artifacts, and specimens is very important for some types of research. Although these items are not "data" as defined by the government, you may still include this information in your grant proposal, and some agencies do want it included in a data management plan.
A: Metadata, commonly called "data about data", is information that describes data. Good metadata enables others to understand and reuse data that they themselves did not create. A minimum amount of metadata should be agreed upon and implemented before starting data collection. Data collection and documentation are easier if you know what you need to collect and how to record it. This also helps maintain data consistency and quality.
There are many different ways to record and share metadata. Some of the most common methods are:
Let's say that you wish to read the book Harry Potter and the Sorcerer's Stone. How would you locate this book at the library? You'd search for it. You'd probably do a search for the book's title or for its author, both of which will let you locate the book's library record which contains descriptive information about the book, including its location in the library.
All of the information in the library record is metadata. It describes the book.
Library book metadata is valuable because it describes the book in a variety of ways, which lets you search for, locate, and identify particular books.
Research data metadata describes how the data was created, recorded, generated, analyzed, etc. The added context and details let others understand and reuse the data.
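As a rough illustration of the idea above, research data metadata can be kept as a small structured record stored alongside the data. The sketch below is only one possible approach: the field names, the example values, and the `missing_fields` helper are all hypothetical and are not drawn from any particular metadata standard or agency requirement.

```python
import json

# Hypothetical minimal metadata record. The field names and values are
# made up for illustration and do not follow any specific standard.
metadata = {
    "title": "Stream temperature readings, 2023 field season",
    "creator": "Example Researcher",
    "date_collected": "2023-06-01/2023-08-31",
    "method": "Hourly readings from a submerged temperature logger",
    "variables": {
        "timestamp": "ISO 8601 date and time (UTC)",
        "temp_c": "Water temperature in degrees Celsius",
    },
}

# Assumed "minimum amount of metadata" agreed on before data collection.
REQUIRED = ["title", "creator", "date_collected", "method", "variables"]


def missing_fields(record, required):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


# Serialize the record so it can be saved next to the data file.
metadata_json = json.dumps(metadata, indent=2)
print(metadata_json)
print("Missing fields:", missing_fields(metadata, REQUIRED))
```

Checking a record against an agreed list of required fields before collection begins is one simple way to keep documentation consistent across a project.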
A: Data repositories are devoted to keeping data accessible, safe, and secure. They use special software, metadata, workflows, and networks to meet these goals. Data repositories also help guarantee authenticity by providing control mechanisms and change logs. For these reasons, they are usually the best choice for research data sharing, distribution, and preservation.
Data repositories often have limits and restrictions governing which data they accept. Most have rules covering data formats and size limits, and require that data be documented. Some accept data from any research area, while others will only accept research from specific domains (such as "biology" or "social science"). The latter are known as disciplinary data repositories. Another type of specialized repository is the institutional data repository, which focuses on collecting the outputs of a select group such as a university or federal agency.
The Data Sharing portion of the guide provides more information and resources for locating data repositories.
A: Machine-readable data is data which can be read and processed by a computer. By comparison, human-readable data can only be read (and understood) by a human. It is important to understand that charts, graphs, and most tables are not machine-readable, but the data they were generated from probably is.
Examples of human-readable data include books, PDFs, representations of data (charts, graphs, tables, etc.), and datasets which have not been structured to be read by computers.
Examples of machine-readable data include data which has been encoded with a mark-up language (HTML, XML, etc.), datasets which have been structured to be read by computers, and data which is encoded for machine processing and is not human-readable.
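To make the distinction concrete, the sketch below shows a program reading the same small dataset from two structured forms: a comma-separated table and an XML document. The sample values, the column and attribute names, and the two helper functions are invented for illustration. A chart or a scanned PDF of the same numbers could not be processed this way without extra work.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same made-up readings in two structured, machine-readable forms.
csv_text = "site,temp_c\nA,12.5\nB,14.0\n"
xml_text = (
    "<readings>"
    "<reading site='A' temp_c='12.5'/>"
    "<reading site='B' temp_c='14.0'/>"
    "</readings>"
)


def temps_from_csv(text):
    """Parse CSV text into a {site: temperature} dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["site"]: float(row["temp_c"]) for row in reader}


def temps_from_xml(text):
    """Parse XML text into a {site: temperature} dictionary."""
    root = ET.fromstring(text)
    return {r.get("site"): float(r.get("temp_c")) for r in root.findall("reading")}


csv_temps = temps_from_csv(csv_text)
xml_temps = temps_from_xml(xml_text)

# Because the data is structured, a computer can process it directly.
mean = sum(csv_temps.values()) / len(csv_temps)
print(csv_temps, xml_temps, mean)
```

Both parsers recover identical values, which is the practical meaning of machine-readable here: the structure, not the file type alone, is what lets software use the data.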
A: Making data digitally accessible is part of making data machine-readable. There is no clear definition of this term but it is generally understood that: