All data sets published on DataShare must be accompanied by documentation. At minimum, your data documentation should address:
The simplest and most flexible way to document your data is through a README file - a text document that acts as a 'user manual' for your dataset. README files are most often plain text (.txt, .md) or PDF files. It's possible to insert a codebook or data dictionary into a README, but it may not always be practical.
Traditionally, a codebook defines the variables of a data set, while a data dictionary defines the variables and also provides extra details, such as information on the origin of the data, its relationship to other data, and how to use it. However, the two terms are largely interchangeable, so use whichever you prefer.
These documents are important because they explain attributes that cannot be read from the data itself, such as what each column of a spreadsheet represents, how null and zero values differ, and what encoded data means. For example, a column titled "date" tells you only that its values are dates, not why the date matters. Likewise, a value of "1" could be a quantity, or it might mean "yes", "question 1", or a variety of other things.
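As an illustration of the points above, here is a minimal sketch of a data dictionary written out as a CSV file with Python's standard library. The dataset, the variable names (`date`, `q1`, `visits`), and the column headings are all hypothetical and chosen only for the example; they are not a DataShare requirement or template. Each row records what a variable represents, how its codes should be read, and how a missing value differs from zero.

```python
import csv
import io

# Hypothetical data dictionary for an imagined survey dataset.
# Every variable name and description here is an assumption for illustration.
DATA_DICTIONARY = [
    {
        "variable": "date",
        "description": "Date the survey response was submitted",
        "type": "date (YYYY-MM-DD)",
        "allowed_values": "",
        "missing_value": "blank = submission date not recorded",
    },
    {
        "variable": "q1",
        "description": "Response to question 1 (hypothetical yes/no question)",
        "type": "integer code",
        "allowed_values": "1 = yes; 0 = no",
        "missing_value": "blank = question skipped",
    },
    {
        "variable": "visits",
        "description": "Number of visits in the past year",
        "type": "integer count",
        "allowed_values": "0 or greater",
        "missing_value": "blank = not reported (0 means no visits)",
    },
]


def render_data_dictionary(entries):
    """Return the data dictionary as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(entries[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the CSV; in practice you might save it next to the data files.
    print(render_data_dictionary(DATA_DICTIONARY))
```

Keeping the data dictionary in a separate CSV file like this is one option when there are too many variables to list comfortably inside the README itself.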