Which characters are invalid for an MS-DOS filename?
I'm writing a filename I/O procedure in x86-16 assembly language. It takes eight characters (I don't need to support long filenames) from the keyboard and prints them to an on-screen text input field.
At the moment I'm allowing numbers, upper/lower-case letters, underscores, and hyphens.
I'd like to allow all legal symbols, but I can't find an official list of banned characters. Common sense tells me that slashes are illegal, but if I had to guess, I would say that the plus character is legal. (edit: It's not!)
I'm already ignoring the period character since my code automatically handles appending the period and file extension.
54 Answers
A concise summary can be found on Wikipedia:
Legal characters for DOS filenames include the following:
- Upper case letters A–Z
- Numbers 0–9
- Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the filename; also, filenames with spaces in them must be enclosed in quotes to be used on a DOS command line, and if the DOS command is built programmatically, the filename must be enclosed in quadruple quotes when viewed as a variable within the program building the DOS command)
- ! # $ % & ' ( ) - @ ^ _ ` { } ~
- Values 128–255 (though if NLS services are active in DOS, some characters interpreted as lowercase are invalid and unavailable)
This excludes the following ASCII characters:
- " * + , / : ; < = > ? \ [ ] | [9]
- Windows/MS-DOS has no shell escape character
- . (U+002E . full stop) within name and extension fields, except in . and .. entries (see below)
- Lower case letters a–z (stored as A–Z on FAT12/FAT16)
- Control characters 0–31
- Value 127 (DEL) [dubious – discuss]
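The two lists above reduce to a per-byte test. Below is a minimal sketch in C, assuming the input routine checks one key at a time; the names `EXCLUDED` and `dos_char_is_legal` are made up for illustration, and the check ignores the NLS caveat for values 128–255.

```c
#include <string.h>

/* Punctuation the list above excludes (the period is handled separately). */
static const char EXCLUDED[] = "\"*+,/:;<=>?\\[]|";

/* Returns 1 if byte c may appear in a base name or extension
   according to the list above, 0 otherwise. */
int dos_char_is_legal(unsigned char c)
{
    if (c < 32 || c == 127)            /* control characters and DEL */
        return 0;
    if (c == '.')                      /* only allowed as the separator */
        return 0;
    if (c >= 'a' && c <= 'z')          /* caller is expected to upper-case first */
        return 0;
    if (strchr(EXCLUDED, c) != NULL)   /* c is never 0 here, so no false match on the terminator */
        return 0;
    return 1;                          /* A-Z, 0-9, space, listed symbols, 128-255 */
}
```

An input routine would more likely convert a lowercase key to upper case than reject it; the function only reports what the list says about the byte as given.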
And here's what the MS-DOS 6 user guide officially said:
Naming Files and Directories
Every file and directory, except for the root directory on each drive, must have a name. The following list summarizes the rules for naming files and directories. File and directory names:
- Can be up to eight characters long. In addition, you can include an extension up to three characters long.
- Are not case-sensitive. It does not matter whether you use uppercase or lowercase letters when you type them.
- Can contain only the letters A through Z, the numbers 0 through 9, and the following special characters: underscore (_), caret (^), dollar sign ($), tilde (~), exclamation point (!), number sign (#), percent sign (%), ampersand (&), hyphen (-), braces ({}), at sign (@), single quotation mark (`), apostrophe ('), and parentheses (). No other special characters are acceptable.
- Cannot contain spaces, commas, backslashes, or periods (except the period that separates the name from the extension).
- Cannot be identical to the name of another file or subdirectory in the same directory.
- Concise User’s Guide - Microsoft® MS-DOS® 6
- Concise User’s Guide - Microsoft® MS-DOS® 6 - alternate link
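The user guide's rules for the base name (one to eight characters, case-insensitive, a fixed whitelist) can be sketched as a single normalizing check. This is only an illustration in C, not anything taken from the guide; `GUIDE_SPECIALS` and `normalize_base_name` are hypothetical names, and the extension is assumed to be handled elsewhere, as in the question.

```c
#include <string.h>

/* The special characters the MS-DOS 6 guide lists as acceptable. */
static const char GUIDE_SPECIALS[] = "_^$~!#%&-{}@`'()";

/* Upper-cases `in` into `out` (room for 8 characters plus NUL).
   Returns 1 if the name follows the guide's rules, 0 otherwise;
   `out` is not meaningful when 0 is returned. */
int normalize_base_name(const char *in, char out[9])
{
    size_t len = strlen(in);
    size_t i;

    if (len == 0 || len > 8)           /* must have a name, at most eight characters */
        return 0;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];

        if (c >= 'a' && c <= 'z')      /* names are not case-sensitive */
            c = (unsigned char)(c - 'a' + 'A');

        if (!((c >= 'A' && c <= 'Z') ||
              (c >= '0' && c <= '9') ||
              strchr(GUIDE_SPECIALS, c) != NULL))
            return 0;                  /* space, comma, backslash, period, + and so on */

        out[i] = (char)c;
    }
    out[len] = '\0';
    return 1;
}
```

This is stricter than the Wikipedia list: it rejects spaces and values 128–255, because the guide does not mention them as acceptable.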
This is from PC-DOS 7:
The name you assign to a file must meet the following criteria:
- It can contain no more than eight characters.
- It can consist of the letters A through Z, the numbers 0 through 9, and the following special characters:
  - _ underscore
  - ^ caret
  - $ dollar sign
  - ~ tilde
  - ! exclamation point
  - # number sign
  - % percent sign
  - & ampersand
  - - hyphen
  - {} braces
  - @ at sign
  - ` single quote
  - ' apostrophe
  - () parentheses

  Note: No other special characters are acceptable.
- The name cannot contain spaces, commas, backslashes, or periods (except the period that separates the name from the extension).
- The name cannot be one of the following reserved file names: CLOCK$, CON, AUX, COM1, COM2, COM3, COM4, LPT1, LPT2, LPT3, LPT4, NUL, and PRN.
- It cannot be the same name as another file within the directory.
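The reserved-name rule is easy to miss when the check works character by character, since every character in `CON` or `PRN` is individually legal. A minimal sketch of that extra check in C follows; `RESERVED` and `is_reserved_name` are illustrative names, and the function assumes the name has already been upper-cased and carries no extension.

```c
#include <string.h>

/* Reserved file names as listed in the PC-DOS 7 text above. */
static const char *const RESERVED[] = {
    "CLOCK$", "CON", "AUX",
    "COM1", "COM2", "COM3", "COM4",
    "LPT1", "LPT2", "LPT3", "LPT4",
    "NUL", "PRN"
};

/* Returns 1 if upper_name exactly matches a reserved name, 0 otherwise. */
int is_reserved_name(const char *upper_name)
{
    size_t i;

    for (i = 0; i < sizeof RESERVED / sizeof RESERVED[0]; i++) {
        if (strcmp(upper_name, RESERVED[i]) == 0)
            return 1;
    }
    return 0;
}
```

Only exact matches are rejected, so a name such as `CONFIG` passes.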
The first byte of a name must not be 0x20 (space). Short names or extensions are padded with spaces. Special ASCII characters 0x22 ("), 0x2a (*), 0x2b (+), 0x2c (,), 0x2e (.), 0x2f (/), 0x3a (:), 0x3b (;), 0x3c (<), 0x3d (=), 0x3e (>), 0x3f (?), 0x5b ([), 0x5c (\), 0x5d (]), 0x7c (|) are not allowed.
If you're also interested in MS-DOS 5.0 then here it is.
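The padding rule quoted above (short names and extensions padded with spaces, first byte never a space) can be illustrated with a small packing routine. This sketch assumes the usual fixed layout of 8 name bytes followed by 3 extension bytes with no stored period; that layout and the name `pack_83` are assumptions made for the example, not something stated in the quoted text.

```c
#include <string.h>

/* Fills the 11-byte field with base (up to 8 bytes) and ext (up to 3 bytes),
   each padded on the right with spaces. Returns 1 on success, 0 if the
   lengths are out of range or the name would start with a space. */
int pack_83(const char *base, const char *ext, char field[11])
{
    size_t base_len = strlen(base);
    size_t ext_len = strlen(ext);

    if (base_len == 0 || base_len > 8 || ext_len > 3)
        return 0;
    if (base[0] == ' ')                /* first byte must not be 0x20 */
        return 0;

    memset(field, ' ', 11);            /* pad everything with spaces first */
    memcpy(field, base, base_len);     /* name occupies bytes 0..7 */
    memcpy(field + 8, ext, ext_len);   /* extension occupies bytes 8..10 */
    return 1;
}
```

The field is not NUL-terminated, so it should be compared with `memcmp` rather than `strcmp`. The routine does not check individual characters; that is left to a separate test.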
Strictly speaking, as an MS/PC/DR-DOS applications programmer you are supposed to ask the operating system for this information. INT 0x21 with AX=0x6505 returns a pointer to the so-called FCHAR NLS table for your country and code page. This table lists a range of characters and a further set of characters that terminate filenames.
In theory it varies by country and code page. But the fact that it was not formally carried over into the OS/2 Control Program API, and the fact that FreeDOS has one table across all code pages and countries, show that it is largely invariant in practice.
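Once the table's contents are known, the test it implies is "inside the permitted range and not a terminator". The sketch below shows that test in C against a simplified, hypothetical model of the table; `struct fchar_rules` is not the byte layout DOS returns, and fetching the real table through INT 0x21 is left out entirely.

```c
#include <string.h>

/* Hypothetical in-memory model of the information in the FCHAR table.
   This is NOT the actual layout of the table DOS points to. */
struct fchar_rules {
    unsigned char lowest;       /* lowest permitted character value */
    unsigned char highest;      /* highest permitted character value */
    const char *terminators;    /* NUL-terminated set of filename terminators */
};

/* Returns 1 if c is inside the permitted range and is not a terminator. */
int fchar_allows(const struct fchar_rules *rules, unsigned char c)
{
    if (c < rules->lowest || c > rules->highest)
        return 0;
    if (c != 0 && strchr(rules->terminators, c) != NULL)
        return 0;
    return 1;
}
```

A caller would fill the structure from whatever the operating system reports, for example `struct fchar_rules r = { 0x21, 0xFF, ".\"/\\[]:|<>+=;," };`, where those particular values are illustrative only.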
I found this in a manual for MS-DOS 3.3. I'm running 6.22, but it probably still applies. I was wrong about '+' being allowed.
If you just want to validate the filename, you may want to use INT 21H/AH=60H (TRUENAME - CANONICALIZE FILENAME OR PATH) after ensuring that the passed filename doesn't have a colon or backslash (those may be treated as drive letters and directories): the function takes your proposed filename and tries to canonicalize it by uppercasing the letters and checking for invalid characters (it also adds a drive letter/server name and path).
In pseudocode:
    If !(filename contains {"/", "\", ".", ":"})
        Canonicalize filename (INT 21H/AH=60H)
        If !(CF is set)
            Filename is valid; stop
    Filename is not valid
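The same control flow can be written out in C to make the two stages explicit. In this sketch the DOS call is abstracted behind a hypothetical callback, `canonicalize_fn`, which is assumed to return nonzero when the carry flag would be set; `fake_canonicalize_ok` is only a stand-in for trying the logic away from DOS.

```c
#include <string.h>

/* Hypothetical wrapper around INT 21H/AH=60H:
   returns 0 on success, nonzero if the call fails (CF set). */
typedef int (*canonicalize_fn)(const char *name);

/* Returns 1 if name passes the separator pre-check and canonicalization. */
int filename_is_valid(const char *name, canonicalize_fn canonicalize)
{
    /* Reject path and drive syntax before DOS can interpret it. */
    if (strpbrk(name, "/\\.:") != NULL)
        return 0;
    return canonicalize(name) == 0;
}

/* Stand-in used for illustration only: always reports success. */
int fake_canonicalize_ok(const char *name)
{
    (void)name;
    return 0;
}
```

With the stand-in, `filename_is_valid("REPORT", fake_canonicalize_ok)` returns 1, while `"A:REPORT"` and `"FILE.TXT"` are rejected by the pre-check before the callback is reached.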