Learn how to establish file-naming conventions, best practices for formatting names, and what's in a name (elements to include in names and how to keep things sorted). Formatting redux offers a deep dive into why the recommendations on this page work.
A file-naming convention is a set of rules that govern how files and folders are named. Establishing a file-naming convention ensures compatibility and comprehension when working alone and in teams.
Format dates consistently (e.g., YYYY-MM-DD).
Mark sequence, version, or status with a suffix such as _01, _v1, _draft, _final, etc.
These practices will ensure the highest level of compatibility across systems, software, and applications.
Using letter case to separate words (camelCase) makes it hard to search for files.
This section is a more in-depth look at how file and folder names help us organize and locate digital content. Topics covered include what elements to include in names and more details about how to get file names to sort well.
File names help you identify what is inside a file and how it differs from similar files. You should name files based on the elements that matter most, as these examples show:
tax form.pdf
completed tax form.pdf
2023 completed tax form.pdf
2023 tax form amended 2024-08.pdf
Folder names are categorical and should describe groups of files. Simple or broad categories work best, and you should name folders based on important elements such as:
📁 my pets
📁 photos
📁 vet visit notes
📁 photos 2023
Relationship: 📁 my pets → 📁 photos → 📁 2017, 📁 2021, 📁 2022
As the relationship example shows, folder names inherit topic and name associations from the folders above them. Making good use of these relationships is especially important in a browse file system (see Systems and theory). File and folder names can also share name elements to make relationships more explicit and to remind users which folder they are working in (e.g., 📁 my pets → 📁 photos → 📁 2017 → 🖼 2017 halloween pets.jpg).
The following practices will keep your files in order, literally.
Format dates as YYYY-MM-DD or YYYYMMDD.
Pad numbers with leading zeros: file_001.tif, file_003.tif, and file_021.tif will sort sequentially, but file_1.tif, file_21.tif, and file_3.tif will not, because numbers in file names are read as characters, not numerical values.
Put the most important element first. If the topic matters most, lead with it (cats_01, cats_02, dogs_01, dogs_02); however, if the series number is more important, that should come first (01_cats, 01_dogs, 02_cats, 02_dogs).
A deep dive into why the naming recommendations on this page work.
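To see why leading zeros and year-first dates sort well, the short Python sketch below sorts a few names the way a plain text sort would. The helper `pad_sequence_number` and the vet-visit file names are hypothetical, made up for illustration; file managers may apply their own sorting rules, so treat this as a demonstration of character-by-character sorting rather than of any particular system.

```python
import os
import re
from datetime import date


def pad_sequence_number(name, width=3):
    """Zero-pad the number at the end of a file name's stem (hypothetical helper)."""
    stem, ext = os.path.splitext(name)
    return re.sub(r"\d+$", lambda m: m.group().zfill(width), stem) + ext


unpadded = ["file_1.tif", "file_21.tif", "file_3.tif"]

# A text sort compares names one character at a time, so "21" lands before "3".
text_order = sorted(unpadded)
# With leading zeros, text order and numeric order agree.
padded_order = sorted(pad_sequence_number(n) for n in unpadded)

print(text_order)    # ['file_1.tif', 'file_21.tif', 'file_3.tif']
print(padded_order)  # ['file_001.tif', 'file_003.tif', 'file_021.tif']

# The same logic applies to dates: year-first names sort chronologically,
# month-first names do not.
visits = [date(2024, 1, 15), date(2022, 12, 25), date(2023, 3, 2)]
iso_names = sorted(f"{d.isoformat()}_vet-visit.txt" for d in visits)
us_names = sorted(f"{d.strftime('%m-%d-%Y')}_vet-visit.txt" for d in visits)

print(iso_names)  # oldest visit first
print(us_names)   # ordered by month, so the years are out of sequence
```

The padding width of 3 is an arbitrary choice here; pick a width that covers the largest number you expect in the series.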
When computer memory and storage were more restricted and expensive, the symbols (characters) used to interact with them were minimal. Since the majority of computer users during this time (1950s-1980s) were English speakers, and the Latin alphabet was small (26 characters), it became the dominant character set, and today's computers have inherited this legacy.
While most systems and programs now use Unicode encodings such as UTF-8 or UTF-16, which support multiple alphabets and many thousands of characters, the limited Latin alphabet character sets are still in use and restrict compatibility. Special characters such as arrows (→), superscripts (²), emojis (😮), percent signs (%), and most punctuation marks should be avoided for this reason and because some of them may function as reserved characters (see the section below).
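One rough way to spot characters outside the basic Latin (ASCII) set is sketched below in Python. The function name `non_ascii_characters` and the sample file names are hypothetical. Note that this check alone is not enough: punctuation such as the percent sign is plain ASCII and passes it, even though the advice above is to avoid it.

```python
def non_ascii_characters(name):
    """Return the characters in `name` that fall outside basic ASCII."""
    return [ch for ch in name if not ch.isascii()]


samples = ["card-01.jpg", "area_m².txt", "pets → 2017.jpg", "50%_done.txt"]
report = {name: non_ascii_characters(name) for name in samples}

for name, found in report.items():
    # ascii() escapes special characters so the report prints on any terminal.
    print(ascii(name), "->", len(found), "non-ASCII character(s)")
```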
On the command line and in many scripts, a space marks the end of one command or argument and the start of the next, so filenames with spaces are cut off at the first space unless extra steps are taken (such as quoting the name). This is why spaces are not recommended even though they're the easiest way to separate name elements. However, spaces are often preferred by people with low vision or dyslexia because they're easier to read, and they can be used safely in most modern systems. User preference is often the biggest factor in deciding whether to use or replace them.
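The splitting behaviour can be illustrated with Python's standard `shlex` module, which follows POSIX shell rules for breaking a command line into pieces. The `open` command here is only a stand-in for any program that takes a file name, and other shells may differ in detail.

```python
import shlex

filename = "2023 completed tax form.pdf"

# Without quoting, the shell-style split treats each word as a separate argument.
unquoted = shlex.split("open " + filename)
print(unquoted)  # ['open', '2023', 'completed', 'tax', 'form.pdf']

# The "extra step": quoting keeps the whole name together as one argument.
quoted_command = "open " + shlex.quote(filename)
print(quoted_command)               # open '2023 completed tax form.pdf'
print(shlex.split(quoted_command))  # ['open', '2023 completed tax form.pdf']
```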
The best replacements for spaces are hyphens (-) or underscores (_) since some systems treat all three characters (hyphen, underscore, and space) the same for search purposes. Spaces are also not supported in URLs (hyperlinks), where they must be encoded as %20, so hyphens should be used when creating webpages. CamelCase, where letter case is used to separate words and name elements, is common but is not the best replacement for a space since letter case isn't considered in searches and it can be hard to read.
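As a minimal sketch, the Python below swaps spaces for a chosen separator and shows what a URL does to a name that still contains spaces. The helper `replace_spaces` is hypothetical; `quote` comes from the standard `urllib.parse` module.

```python
from urllib.parse import quote


def replace_spaces(name, separator="-"):
    """Collapse each run of whitespace into one separator (hypothetical helper)."""
    return separator.join(name.split())


original = "2023 completed tax form.pdf"

print(replace_spaces(original))       # 2023-completed-tax-form.pdf
print(replace_spaces(original, "_"))  # 2023_completed_tax_form.pdf

# Left as-is, each space has to be percent-encoded before it can appear in a URL.
print(quote(original))                # 2023%20completed%20tax%20form.pdf
```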
Reserved characters are characters that cannot be used in file and folder names because they have been assigned a specific function by the operating system. Some of the most common reserved characters are: backslash (\), forward slash (/), colon (:), asterisk (*), quotation mark ("), brackets ([ ]), period (.), and less than (<) and greater than (>) signs. Including these characters in filenames limits compatibility between systems. For example, Mac OS 9 and above supports a number of characters that are reserved characters for Windows (such as / \ : * ? < > |). This causes errors when downloading, sharing, or moving files across systems so it is best to avoid using these characters.
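A simple check for the Windows reserved characters named above might look like the following Python sketch. The set contains the characters from the Windows example in the text plus the quotation mark; the function name `reserved_characters` is hypothetical, and the check is meant for a single file or folder name, not a full path (where slashes and the drive colon are expected).

```python
# Reserved on Windows, per the example above: / \ : * ? < > |
# The quotation mark is included as well, since Windows also reserves it.
WINDOWS_RESERVED = set('/\\:*?<>|"')


def reserved_characters(name):
    """Return the reserved characters found in one file or folder name."""
    return sorted(set(name) & WINDOWS_RESERVED)


print(reserved_characters("notes: draft?.txt"))  # [':', '?']
print(reserved_characters("notes_draft.txt"))    # []
```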
There is a hard limit on the number of characters in a file path. A file path is the route or description of where a file is located on your hard drive and folder structure. It includes the drive name, the names of all higher-level folders, and the file name.
For example, the file path to an image named "card-01" stored in your downloads folder might look like:
C:\Users\[account name]\Downloads\card-01.jpg
This is a short file path, only 45 characters long as written, but complex folder structures and/or long file names can easily reach the limit of 260 characters on Windows; on macOS, each file or folder name is limited to 255 characters. This is why short file and folder names are desirable: exceeding the limit can prevent you from opening or finding your files and folders.
The limit (MAX_PATH) in Windows can be overridden, but it is very risky to do so as it can cause file storage (hard drive) and file read/write (software) errors. Use at your own risk.
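To check whether a path is approaching the Windows limit, a rough Python sketch is shown below. The function `exceeds_limit` and the deeply nested example path are hypothetical, and the comparison is a simplification: it only counts characters against the 260 figure quoted above and does not account for how a particular program or file system measures paths.

```python
WINDOWS_MAX_PATH = 260  # the Windows limit quoted in the text


def exceeds_limit(path, limit=WINDOWS_MAX_PATH):
    """Return True when the full path has more characters than `limit`."""
    return len(path) > limit


short_path = r"C:\Users\[account name]\Downloads\card-01.jpg"

# A made-up path nested 15 folders deep, each with a long name.
deep_path = "C:\\" + "\\".join(["a_very_long_folder_name"] * 15) + "\\card-01.jpg"

print(exceeds_limit(short_path))  # False
print(exceeds_limit(deep_path))   # True
```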
Consultations are also available by request.
Megan O'Donnell, Research Data Services Lead
Heather Campbell, Head of Metadata Services