The rules were simple but led to complicated results. MS-DOS file names were eleven characters long, with an implicit dot between characters eight and nine. Theoretically, spaces were permitted anywhere, but in practice they could appear only at the end of the file name or immediately before the implicit dot.
Wildcard matching was actually very simple. The program passed an eleven-character pattern; each position in the pattern was either a file name character (which had to match exactly) or a question mark (which matched anything). Consider the file "ABCD····TXT", where I've used · to represent a space. This file name would more traditionally be written as ABCD.TXT, but I've written it out in its raw 11-character format to make the matching more obvious. Let's look at some patterns and whether they would match.
|Match|???????????|All eleven positions are wildcards, so of course they match|
|No match||The space in the pattern (position 9) does not match the T in the file name|
|Match||perfect match at |
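The position-by-position comparison described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the actual MS-DOS code; the function name `fcb_match` and the sample patterns are my own, and ordinary spaces stand in for the · characters.

```python
def fcb_match(pattern: str, name: str) -> bool:
    """Compare two raw 11-character names position by position.

    A '?' in the pattern matches anything; every other pattern
    character (including a space) must match exactly.
    """
    if len(pattern) != 11 or len(name) != 11:
        raise ValueError("both arguments must be exactly 11 characters")
    return all(p == "?" or p == n for p, n in zip(pattern, name))


# The file ABCD.TXT in raw form: name padded to 8, then the extension.
raw_name = "ABCD".ljust(8) + "TXT"

# Hypothetical patterns chosen for illustration.
print(fcb_match("?" * 11, raw_name))                 # True
print(fcb_match("ABCD" + "?" * 4 + " " * 3, raw_name))  # False: space vs. T at position 9
print(fcb_match("A" + "?" * 10, raw_name))           # True
```

Note that a space in the pattern is not a wildcard: it is an ordinary character that matches only another space.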
The tricky part is converting the traditional notation with dots and asterisks into the eleven-character pattern. The algorithm used by MS-DOS was the same one used by CP/M, since MS-DOS worked hard at being backwards compatible with CP/M. (You may find some people who call this the FCB matching algorithm, because file names were passed to and from the operating system in a structure called a File Control Block.)
1. Start with eleven spaces and the cursor at position 1.
2. Read a character from the input. If the end of the input is reached, then stop.
3. If the character just read is a dot, then set positions 9, 10, and 11 to spaces, move the cursor to position 9, and go back to step 2.
4. If the character just read is an asterisk, then fill the rest of the pattern (from the cursor onward) with question marks, move the cursor to position 12, and go back to step 2. (Yes, this is past the end of the pattern.)
5. If the cursor is not at position 12, copy the input character to the cursor position and advance the cursor.
6. Go to step 2.
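Here is a rough Python transcription of the steps above, for readers who want to experiment. It follows the prose literally and is not taken from any MS-DOS or CP/M source; the name `parse_pattern` is made up for this sketch, and "the rest of the pattern" is assumed to mean every position from the cursor through position 11.

```python
def parse_pattern(text: str) -> str:
    """Convert dot-and-asterisk notation to a raw 11-character pattern."""
    pattern = [" "] * 11      # start with eleven spaces
    cursor = 1                # 1-based, to match the prose
    for ch in text:           # read characters until the input ends
        if ch == ".":
            pattern[8:11] = [" "] * 3   # blank positions 9, 10, 11
            cursor = 9
        elif ch == "*":
            for pos in range(cursor, 12):   # cursor through position 11
                pattern[pos - 1] = "?"
            cursor = 12                     # one past the end
        elif cursor != 12:
            pattern[cursor - 1] = ch
            cursor += 1
        # otherwise the cursor is at 12 and the character is dropped
    return "".join(pattern)


print(repr(parse_pattern("ABCD.TXT")))  # 'ABCD    TXT'
print(repr(parse_pattern("*.TXT")))     # '????????TXT'
print(repr(parse_pattern("*.*")))       # '???????????'
print(repr(parse_pattern("*ABC")))      # '???????????'
```

The last line shows one of the surprises: once an asterisk has pushed the cursor to position 12, ordinary characters that follow it are silently ignored until a dot moves the cursor back to position 9.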
Let's parse a few patterns using this algorithm, since the results can be surprising. In the diagrams, I'll underline the cursor position.