How a command line is interpreted

Each line of text read from a command script, or from the input when in interactive mode, is parsed into a sequence of commands. This parsing occurs in three stages:

  1. The line is scanned, replacing variable specifications with the contents of those variables. This is variable substitution.

  2. The resultant text after substitution is parsed according to the command syntax and quoting conventions into a compound sequence of commands, which may contain multiple command pipelines separated by conjunctions.

  3. For each command in the sequence to be executed, any redirection specifications in the command are applied immediately before the command is executed.

Variable substitution

Variable substitution replaces all occurrences of a name preceded by a percent character ("%") with a value.

It occurs only once, over the entire command line before any part of it is executed. (This is the same as the 16-bit CMD supplied with IBM OS/2, but different from JP Software's 4OS2.) For example, in

[c:\]set NAME=sometext & echo %NAME%

the variable substitution for the %NAME% sequence occurs before the initial SET command is executed, and the ECHO command will not echo the new value of the NAME environment variable. This can be used to good effect with the ENDLOCAL command.
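The ordering described above can be modelled in a few lines of Python. This is only a sketch of the behaviour, not the interpreter's implementation: the function names substitute and run_line are hypothetical, only the SET and ECHO commands are imitated, and the line is naively split at every "&" with no regard for quoting.

```python
import re

def substitute(line, env):
    # Replace each %NAME% with its value; unknown names are left untouched.
    return re.sub(r"%([^%]+)%", lambda m: env.get(m.group(1), m.group(0)), line)

def run_line(line, env):
    # Stage 1: substitution happens once, over the whole line, up front.
    expanded = substitute(line, env)
    output = []
    # Stage 2 (greatly simplified): split into commands and run each in turn.
    for command in expanded.split("&"):
        words = command.strip().split(None, 1)
        if not words:
            continue
        tail = words[1] if len(words) > 1 else ""
        if words[0].lower() == "set":
            name, _, value = tail.partition("=")
            env[name] = value
        elif words[0].lower() == "echo":
            output.append(tail)
    return output

env = {"NAME": "old"}
print(run_line("set NAME=sometext & echo %NAME%", env))  # ['old']
print(env["NAME"])                                       # sometext
```

In this model ECHO receives the value that NAME had when the line was read, even though SET has changed the variable by the time ECHO runs.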

Normally, the name ends with a second percent character, and is the name of an environment variable. As in the 16-bit CMD supplied with IBM OS/2, environment variable names may contain spaces and other punctuation characters. The second percent character is required in order to mark the end of the variable name. For example, in

[c:\]echo %A %B %C %D

the environment variable names are "A " and "C ". In this example "B " and "D " are in fact simply literal text that is passed on to the command unchanged.
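The way that names are delimited can be illustrated with a short Python sketch. The helper variable_names is hypothetical, and it ignores the special sequences that do not end with a percent character.

```python
import re

def variable_names(text):
    # Each name runs from one percent character up to the next one.
    return re.findall(r"%([^%]+)%", text)

print(variable_names("echo %A %B %C %D"))  # ['A ', 'C ']
```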

The value used to replace the name is that of the environment variable. For example, in

[c:\]echo %PATH%

the arguments passed to the ECHO command will actually be the value of the PATH environment variable at the time that variable substitution occurred.

Note: Environment variable names are case-sensitive.

If an environment variable of the given name does not exist, then if the name is that of an implicit variable the value of that implicit variable is substituted. For example, in

[c:\]echo %_DATE%

the value of the _DATE implicit variable will be used, unless an explicit environment variable called "_DATE" exists (which is not usually the case).

Otherwise if no environment variable of the given name exists, and no implicit variable of the given name exists, no substitution occurs and the text is left unchanged in its original form (including both of the percent characters). For example, in

[c:\]echo %NO_SUCH_VARIABLE%

the argument passed to the ECHO command, assuming that no environment variable with the name "NO_SUCH_VARIABLE" exists, will be the literal text "%NO_SUCH_VARIABLE%". This is useful for the FOR command.

Certain special expansion sequences do not end with a percent character, and override normal environment variable substitution.

A percent character followed immediately by a decimal number indicates a command script parameter replacement. For example, within a command script the command

[c:\]echo %1 %2

will, after expansion, result in the values of the first and second arguments to the command script, from when the script was invoked, being passed to the ECHO command as its arguments. The "%0" expansion sequence is replaced by the filename of the command script itself, as determined when the script was executed.

The following sequences are also special and override normal environment variable substitution. They are replaced with the given values:

%%
A single percent character
%+
The "escape" character (caret, "^")
%=
The simple conjunction (a single ampersand, "&")
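
The substitution rules described in this section can be summarised as a single left-to-right scan. The following Python sketch is a model only, under several assumptions: the function expand and its parameters are hypothetical, a script parameter that was not supplied is assumed to expand to nothing, and a run of several digits is assumed to form one parameter number.

```python
def expand(line, env, args=(), implicit=None):
    implicit = implicit or {}
    special = {"%": "%", "+": "^", "=": "&"}
    out, i = [], 0
    while i < len(line):
        ch = line[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        nxt = line[i + 1] if i + 1 < len(line) else ""
        if nxt in special:
            # %%, %+ and %= override normal variable substitution.
            out.append(special[nxt])
            i += 2
            continue
        if nxt.isdigit():
            # %0, %1, ... are command script parameters.
            j = i + 1
            while j < len(line) and line[j].isdigit():
                j += 1
            n = int(line[i + 1:j])
            out.append(args[n] if n < len(args) else "")  # assumption: empty if missing
            i = j
            continue
        end = line.find("%", i + 1)
        if end == -1:
            # No closing percent character: keep the text as it is.
            out.append(ch)
            i += 1
            continue
        name = line[i + 1:end]
        if name in env:
            out.append(env[name])          # explicit environment variable
        elif name in implicit:
            out.append(implicit[name])     # implicit variable
        else:
            out.append(line[i:end + 1])    # unknown: left unchanged
        i = end + 1
    return "".join(out)

print(expand("echo %1 %2", {}, args=["script.cmd", "a", "b"]))  # echo a b
print(expand("echo 100%% %+ %=", {}))                           # echo 100% ^ &
print(expand("echo %NO_SUCH_VARIABLE%", {}))                    # echo %NO_SUCH_VARIABLE%
```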

Implicit variables

Some variable names have what are, in effect, default, implicit values, which are used when no environment variable of that name exists. They are as follows:

_DATE
The current local date in ISO 8601 form (see note below)
_TIME
The current local time in ISO 8601 form (see note below)
_TIMEZONE
The current local timezone abbreviation (see note below)
_WEEK
The current local week of the year (see note below)
_WDAY
The current local day (Sunday = 0) of the week (see note below)
_YDAY
The current local day (January the 1st = 1) of the year (see note below)
_DAY
The current local day of the week name (see note below)
_YEAR
The (unabbreviated) current local year (see note below)
_MON
The number of the current local month, padded to two digits if less than 10 (see note below)
_MONS
The number of the current local month, padded with a space if less than 10 (see note below)
_MONTH
The current local month name (see note below)
_MDAY
The number of the current local day of the month, padded to two digits if less than 10 (see note below)
_MDAYS
The number of the current local day of the month, padded with a space if less than 10 (see note below)
_HOUR
The current local hour, in the 24 hour system, padded to two digits if less than 10 (see note below)
_MINUTE
The current local minute, padded to two digits if less than 10 (see note below)
_SECOND
The current local second, padded to two digits if less than 10 (see note below)
_ERRORLEVEL
The error level returned by the last command to execute
_INTERACTIVE
The value 0 if the command interpreter is not running in interactive mode, otherwise another value
_OS
The name of the current operating system (i.e. "OS/2")
_OSVER
The version number of the current operating system
_CMDPROC
The name of the command interpreter (i.e. "CMD")
_CMDVER
The version number of the command interpreter
_CODEPAGE
The currently active code page in the command interpreter process
_COUNTRY
The currently active country in the command interpreter process
_PID
The process ID of the command interpreter process
_PPID
The process ID of the command interpreter's parent process
_PTYPE
The type of the command interpreter process ("PM", "AVIO", "FS", or "DT")
_BOOT
The drive letter of the boot volume
_BATCH
The value 0 if a command script is not currently being executed, otherwise another value
_BATCHLINE
The current line number within the current command script, or -1 if no script is being executed
_BATCHNAME
The name of the current command script as determined when it was first executed, or no value
_DISK
The drive letter of the current default disc of the command interpreter process
_CWD
The name of the current default disc and current working directory of the command interpreter process
_CWDS
The name of the current default disc and current working directory of the command interpreter process, with a trailing slash appended
_CWP
The name of the current working directory of the command interpreter process
_CWPS
The name of the current working directory of the command interpreter process, with a trailing slash appended

Note: The current date and time, and current timezone, are derived from the system clock that is maintained by the operating system kernel. On IBM OS/2 this clock runs in UTC. The conversion from the system value to local time is performed according to the value of the TZ environment variable. For proper operation, the 32-bit Command Interpreter requires the TZ environment variable to be in the standard format as specified by the "POSIX 1003.1" international standard (ISO/IEC 9945-1:1990). This format is a superset of the format used by many OS/2 programs. Changing the TZ environment variable with the SET command will change the way that the current local date and time are derived.
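
The padding rules for the date and time variables listed above can be illustrated with a Python sketch. It covers only those variables whose format is fully stated in the list. The function implicit_date_variables is hypothetical, and it takes a local time that has already been converted, so the TZ handling is not modelled.

```python
from datetime import datetime

def implicit_date_variables(now):
    """Build a few of the date and time implicit variables from a local time."""
    return {
        "_YEAR": str(now.year),                 # unabbreviated year
        "_MON": f"{now.month:02d}",             # padded with a zero
        "_MONS": f"{now.month:2d}",             # padded with a space
        "_MDAY": f"{now.day:02d}",
        "_MDAYS": f"{now.day:2d}",
        "_HOUR": f"{now.hour:02d}",             # 24 hour system
        "_MINUTE": f"{now.minute:02d}",
        "_SECOND": f"{now.second:02d}",
        # Python counts Monday as 0; shift so that Sunday is 0.
        "_WDAY": str((now.weekday() + 1) % 7),
    }

values = implicit_date_variables(datetime(2024, 3, 5, 7, 8, 9))
print(values["_MON"], repr(values["_MONS"]), values["_WDAY"])  # 03 ' 3' 2
```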

Quoting conventions

Quotation marks can be used to prevent metacharacters from being recognised. This applies to the metacharacters that form the conjunctions, pipelines, and parenthesised commands described in the command syntax, and to those that are used in redirection specifications. Metacharacters that occur between two quotation marks have no special meaning.

The quotation marks are not stripped before the command tail is passed to the command being executed, however.

Another way to prevent metacharacters from being recognised is to prefix them with the "escape" character, the caret ("^"). The escape character is stripped before the command tail is passed to the command being executed.

The handling of the escape character is slightly different to that of the 16-bit CMD supplied with IBM OS/2, in that the presence of quotation marks does not affect the recognition of the escape character. IBM's documentation barely mentions the escape character, and does not mention at all the fact that the 16-bit CMD supplied with IBM OS/2 will not perform escape character processing on portions of a command line that are enclosed by a pair of quotation marks. The IBM documentation says, without qualification, that the escape character prevents metacharacters from being recognised. Therefore, inasmuch as the escape character is documented at all by IBM, the 32-bit Command Interpreter sticks to what the documentation says, rather than duplicating what the 16-bit CMD supplied with IBM OS/2 actually does.
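
The quoting rules can be modelled with a small scanner. The following Python sketch is illustrative only: the function split_on_ampersand is hypothetical, it recognises only the single "&" conjunction, and it assumes that a quotation mark prefixed with the escape character does not begin or end a quoted portion.

```python
def split_on_ampersand(line):
    parts, current, in_quotes, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if ch == "^" and i + 1 < len(line):
            # The caret is stripped and the next character is taken literally,
            # whether or not it lies between quotation marks.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)        # quotation marks are not stripped
        elif ch == "&" and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts

print(split_on_ampersand('echo "a & b" & echo c'))  # ['echo "a & b" ', ' echo c']
print(split_on_ampersand('echo a ^& b'))            # ['echo a & b']
print(split_on_ampersand('echo "a ^& b"'))          # ['echo "a & b"']
```

The last call shows the difference from the 16-bit CMD described above: the caret is recognised, and stripped, even though it lies between quotation marks.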

Command line syntax

Simple commands

The basic building block of a command line is a simple command. This comprises a command name and an optional command tail that specifies the arguments to be passed to the command, interspersed with redirection specifications. For example, in

[c:\]somecommand tail1 >nul tail2

the command name is "somecommand", the string ">nul" is a redirection specification directing that the standard output of the command be redirected to a file called "nul", and the arguments that are actually passed to the command form the string "tail1  tail2".
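
The separation of a simple command into its name, its tail, and its redirection specifications can be sketched in Python. This is a simplification under stated assumptions: the function parse_simple_command and the pattern REDIRECTION are hypothetical, quotation marks and the escape character are ignored, and any digits immediately before an operator are taken to be a handle number.

```python
import re

# Optional handle number, an operator, then a file name, handle number or "-".
REDIRECTION = re.compile(r"\d*(?:>>|>&|>|<)\s*\S+")

def parse_simple_command(line):
    # The command name is the first word; everything after it is the tail.
    name, _, rest = line.strip().partition(" ")
    redirections = REDIRECTION.findall(rest)
    # Redirection specifications are deleted from the tail, not replaced.
    tail = REDIRECTION.sub("", rest)
    return name, tail, redirections

print(parse_simple_command("somecommand tail1 >nul tail2"))
# ('somecommand', 'tail1  tail2', ['>nul'])
```

Deleting the specification leaves both of the spaces that surrounded it, which is why the tail contains two spaces between "tail1" and "tail2".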

Pipelines

Simple commands may be combined into pipelines by using the "|" conjunction. All simple commands in the pipeline are executed concurrently, with the standard output of each simple command being sent through an (anonymous) pipe to the standard input of the next simple command in the pipeline. For example, in

[c:\]echo y | pause

the standard output of the ECHO command is sent to the standard input of the PAUSE command.

The commands on the left-hand side of the conjunction in a pipeline are executed in a separate child process. Any built-in commands that modify the command interpreter process' current state will modify the state of that child process, not of the parent command interpreter process. It is useless to use a SET command on the left-hand side of a pipeline, for example, because the modifications to the process environment will occur in the child rather than in the parent and be lost. In the pipeline

[c:\]set PATH=\os2 | set PATH

the modification to PATH made in the first SET command is executed in the child process, whereas the second SET command is executed in the parent process.

The error level of the pipeline as a whole is the error level returned by the command at the end of the pipeline. If the standard output of a command is piped through the MORE command, for example, the error level returned by the pipeline as a whole will be the error level returned by the MORE command, not that of the command whose output is piped to it.
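
The two properties just described, that the left-hand side runs against a child's copy of the process state and that the error level comes from the last command, can be modelled in Python. This is a sketch only: run_pipeline, set_path, and show_path are hypothetical names, the stages are run one after another rather than concurrently, and it is assumed that in a longer pipeline every stage except the last runs in a child.

```python
def run_pipeline(stages, env):
    data, level = "", 0
    for index, stage in enumerate(stages):
        last = index == len(stages) - 1
        # Every stage but the last works on a copy of the environment,
        # standing in for the separate child process.
        stage_env = env if last else dict(env)
        data, level = stage(stage_env, data)
    # The error level of the pipeline is that of its final command.
    return data, level

def set_path(env, stdin):
    env["PATH"] = r"\os2"
    return "", 0

def show_path(env, stdin):
    return "PATH=" + env.get("PATH", ""), 0

env = {"PATH": r"c:\bin"}
print(run_pipeline([set_path, show_path], env))  # ('PATH=c:\\bin', 0)
```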

Redirection specifications for individual simple commands will override the redirection through the pipe. For example, in

[c:\]echo y >con | pause

the standard output of the ECHO command is sent to the file named "con", rather than down the pipe to the standard input of the PAUSE command.

Compound commands

Pipelines may be combined into compound commands by the use of the "&", "&&", and "||" conjunctions. Each pipeline is executed in turn, and the error level returned by each pipeline in combination with the conjunction controls what pipeline is executed next, if any.

When pipelines are separated by the "&" conjunction, they are always both executed. For example,

[c:\]echo first & echo second

will always execute both ECHO commands.

When pipelines are separated by the "&&" or "||" conjunctions, whether the second pipeline is executed depends on the error level returned by the first. In the case of the "&&" conjunction, the second pipeline is only executed if the error level is zero; in the case of the "||" conjunction, the second pipeline is only executed if the error level is not zero. For example, in

[c:\]dir somefile && echo Somefile found!

the error level returned by the DIR command controls whether the ECHO command is executed. If DIR returns a zero error level, as it does when the wildcard matches one or more files, the ECHO command is executed. If DIR returns a non-zero error level, as it does when the wildcard does not match any files, the ECHO command is not executed.

The conjunctions have an order of precedence. "&&" has the highest precedence, and "&" has the lowest. Put another way: the effect of "&&" only applies up to the next "||" or "&", and the effect of "||" only applies up to the next "&". For example, in

[c:\]dir somefile && echo Found! & echo Hello

the error level returned by DIR controls whether the first ECHO command is executed, but does not control whether the second ECHO command is executed. (It always will be.)
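
The precedence rules can be modelled by splitting at the lowest-precedence conjunction first. The following Python sketch is not the interpreter's parser: run_compound is a hypothetical function that takes an already tokenised list alternating between pipelines and conjunctions, together with a callback that executes one pipeline and returns its error level. Parentheses are not handled.

```python
def run_compound(tokens, run):
    def split(seq, conjunction):
        groups, current = [], []
        for token in seq:
            if token == conjunction:
                groups.append(current)
                current = []
            else:
                current.append(token)
        groups.append(current)
        return groups

    level = 0
    for sequence in split(tokens, "&"):             # lowest precedence
        for n, alternative in enumerate(split(sequence, "||")):
            if n > 0 and level == 0:
                break                               # "||": only after a failure
            for m, pipeline in enumerate(split(alternative, "&&")):
                if m > 0 and level != 0:
                    break                           # "&&": only after a success
                level = run(pipeline[0])
    return level

executed = []

def fake_run(pipeline):
    # Pretend that DIR fails (no matching files) and everything else succeeds.
    executed.append(pipeline)
    return 1 if pipeline.startswith("dir") else 0

run_compound(["dir somefile", "&&", "echo Found!", "&", "echo Hello"], fake_run)
print(executed)  # ['dir somefile', 'echo Hello']
```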

Unlike the "|" conjunction, the "&", "&&", and "||" conjunctions do not, of themselves, cause secondary command interpreter processes to be started as child processes.

Parenthesised commands

A compound command may be surrounded by parentheses, "(" and ")", to turn it into a simple command, and to override the precedence of the conjunctions. For example, with the command

[c:\]dir file1 && dir file2 & echo Hello

the ECHO command is always executed, but using parentheses

[c:\]dir file1 && (dir file2 & echo Hello)

means that the second DIR command and the ECHO command are treated as if they were a single simple command and are only executed if the first DIR command returns a zero error level.

An entire compound command may be treated as a simple command, including having its input and output redirected and being part of a pipeline. For example, in

[c:\]( ver & date /n ) >nul

the VER and DATE commands are executed as a compound command, and their combined output is redirected to a file named "nul".

A common misconception, based upon false analogy to UNIX "shell" programs, is that parentheses cause commands to be executed in secondary command interpreters. They do not.

Redirection

Redirection specifications affect the file handles that the commands will have available when they execute. Redirection allows the standard input, output, and error handles of a command, along with any other arbitrary file handles, to be redirected elsewhere for the duration of that command.

Redirecting the standard output of a command to a file, for example, means that when the command executes its standard output handle will refer to an open file, and whatever the command writes to its standard output will be written to that file.

A redirection specification can occur anywhere in a simple command. All redirection specifications are deleted from the command line as they are processed, and so are not passed to the command when it is executed.

A redirection specification comprises an operator followed by a name, optionally preceded by a number:

n < filename
Open the given file for reading, and redirect file handle n from it. If n is omitted, the default is handle 0, standard input.

Example: In the following command, the standard input of the MORE command is redirected from the file \CONFIG.SYS.

[c:\]more < \config.sys
n > filename
Open the given file for writing, truncating any existing contents if it already exists and creating a new file if it does not, and redirect file handle n to it. If n is omitted, the default is handle 1, standard output.

Example: In the following command, the standard output of the DIR command is redirected to the file LISTING.TXT, overwriting its existing contents if any, and creating it if necessary.

[c:\]dir > listing.txt
n >> filename
Open the given file for writing, creating a new file if it does not already exist, and redirect file handle n to it, placing the file pointer at the end of the file so that anything written is appended to the existing contents. If n is omitted, the default is handle 1, standard output.

Example: In the following command, the standard output of the TIME command is appended to the end of the file LOGFILE.TXT.

[c:\]time /n >> logfile.txt
n >& m
Make file handle n a duplicate of file handle m. Both handles will then refer to the same file or device. If n is omitted, the default is handle 1, standard output.

Example: In the following command, the standard error of the DIR command is duplicated onto its standard output, resulting in them together being piped to the MORE command.

[c:\]dir 2>&1|more
n <-
Close file handle n. If n is omitted, the default is handle 0, standard input.

Example: In the following command, the standard input of the MORE command is closed.

[c:\]more <-
n >-
Close file handle n. If n is omitted, the default is handle 1, standard output.

Example: In the following command, the standard output of the ECHO command is closed.

[c:\]echo >-

Note: The DosGetMessage() system API function in IBM OS/2 contains a bug, in that it incorrectly holds the handle to the message file open over successive calls to the function. This bug manifests itself quite prominently when closing file handles using the above redirection specifications, since it causes problems when next the command interpreter comes to call that system API function to retrieve the text of some message. This is a bug in the IBM OS/2 system API.

Redirection specifications are applied in left-to-right order, so later redirections of the same handle will countermand earlier ones. When combining two handles, it is important to use the correct order for the intended result.

Example: In the command

[c:\]dir > listfile.txt 2>&1

the standard error and standard output will both refer to the same file, because the standard error is made into a duplicate of the standard output after the standard output has been redirected, whereas in the command

[c:\]dir 2>&1 > listfile.txt

the standard error will refer to wherever the standard output referred before it was redirected to the file, because it is combined with the standard output before the redirection of the standard output is performed.
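
The effect of the left-to-right order can be modelled with a table that maps handle numbers to what they refer to. This Python sketch is illustrative only: apply_redirections and SPEC are hypothetical names, the initial table assumes a text-mode interpreter whose standard handles all refer to "CON", and ">" and ">>" are treated alike because file contents are not modelled.

```python
import re

SPEC = re.compile(r"(\d*)\s*(>>|>&|>|<)\s*(\S+)")

def apply_redirections(specs, table=None):
    if table is None:
        table = {0: "CON", 1: "CON", 2: "CON"}
    table = dict(table)
    for spec in specs:                    # strictly left to right
        number, operator, target = SPEC.fullmatch(spec.strip()).groups()
        default = 0 if operator == "<" else 1
        handle = int(number) if number else default
        if target == "-" and operator in ("<", ">"):
            table.pop(handle, None)       # n<- and n>- close the handle
        elif operator == ">&":
            table[handle] = table[int(target)]  # n becomes a duplicate of m
        else:
            table[handle] = target        # <, > and >> open the named file
    return table

print(apply_redirections(["> listfile.txt", "2>&1"]))
# {0: 'CON', 1: 'listfile.txt', 2: 'listfile.txt'}
print(apply_redirections(["2>&1", "> listfile.txt"]))
# {0: 'CON', 1: 'listfile.txt', 2: 'CON'}
```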

Standard handles

The convention employed by most commands is that the first three file handles, numbered 0, 1, and 2, are the standard input, standard output, and standard error handles for the command. Commands normally read input from their standard input, write output to their standard output, and error messages to their standard error.

The usage of other file handles varies from command to command. Some application programs, especially BBS software, use other, higher numbered, file handles as well as the standard ones.

For text-mode command interpreters such as CMD and TEXTCMD these handles are by default all handles for the "CON" device, and so what is read from standard input comes from the console keyboard and what is written to standard output and standard error goes to the console screen.

For graphical command interpreters such as PMCMD the standard input, output, and error are by default handles to the ends of pipes. Anything sent down the pipe connected to standard output and standard error is displayed by the graphical command interpreter in its command output window.


The 32-bit Command Interpreter is © Copyright Jonathan de Boyne Pollard. "Moral" rights are asserted.