If a program is called with incorrect command-line arguments, it should detect that fact and print a usage statement to the terminal. This document tells you what that usage statement should contain, and describes the conventions we expect you to use. It is not exhaustive; there are fairly standard ways to write even more elaborate usage statements than those that are described here, but the information below will at least get you started writing decent usage statements.
Usage statements are essentially the same for all programming languages. The conventions we describe below are for Unix-based operating systems (Linux and Mac OS X); similar conventions exist for Windows but some details may be different.
If you don't know what a command-line argument is, you are not ready to read this document. Go back to your language tutorial, learn what a command-line argument is, come back here and continue reading.
A usage statement is a printed summary of how to invoke a program from a shell prompt (i.e. how to write a command line). It will include a description of all the possible command-line arguments that the program might take.
A usage statement should always be printed when the user invokes a program with either the wrong number of command-line arguments or a command-line argument whose contents are invalid.
It's important that you explicitly check the number and contents of command-line arguments so as to make sure that they are valid. This process is slightly different for each programming language, so we won't go into it here (see the language tracks for details about this).
To force a program to print its usage message, all you need to do is to type in the program name with no arguments (for programs that take at least one command-line argument). This is a common trick. However, some programs don't have any required command-line arguments (all the arguments are optional; see below). In cases like these, there is usually an optional argument called -help which will cause the usage message to be printed. Some programs use --help instead of -help for this purpose.
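As a rough sketch of the two triggers just described, here is how a Python program might decide whether to print its usage message. The function name should_print_usage and the parameter min_required are hypothetical names chosen for this illustration; they are not part of any standard library.

```python
def should_print_usage(args, min_required):
    """Decide whether the usage message should be printed.

    `args` is the argument list without the program name (e.g. sys.argv[1:]).
    `min_required` is the number of required arguments (an assumption of
    this sketch; a real program knows its own requirements).
    """
    # An explicit request for help always wins.
    if any(arg in ("-help", "--help") for arg in args):
        return True
    # Too few arguments (including none at all) also triggers the message.
    return len(args) < min_required
```

For a program with two required arguments, should_print_usage([], 2) is true, and so is should_print_usage(["-help"], 0) for a program whose arguments are all optional.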
The usage message is normally printed to the terminal. In Unix systems, "printing to the terminal" comes in two flavors. Normal printing to the terminal means printing to "standard output" (called stdout in the C language and similar things in other languages). This is not where you should print usage statements. Usage statements should be printed to "standard error" (called stderr in the C language and similar things in other languages). It's important to always print usage statements to stderr and not to stdout.
You might wonder why this is important, given that printing to both stdout and stderr prints to the same terminal window. The reason is that it is possible to redirect either stdout or stderr independently to a file rather than to a terminal. This is very useful in practice. For instance, you might want to log all of your error messages to a log file, but have the normal output go to the terminal as usual. Or you may want to redirect the non-error output to one file, and all the error messages to another file. Having error messages printed to stderr instead of to stdout makes this easy. If all error messages were printed to stdout, the normal output and error messages would not be easy to separate.
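Here is a minimal sketch of what this looks like in Python, where standard error is available as sys.stderr. The function name print_usage and the arguments input_file and output_file are just illustrative choices.

```python
import sys


def print_usage(progname, stream=None):
    """Write a one-line usage message, to standard error by default."""
    # Look up sys.stderr at call time (not at definition time) so that
    # any redirection in effect when the function is called is respected.
    if stream is None:
        stream = sys.stderr
    print(f"usage: {progname} input_file output_file", file=stream)
```

If a script calls print_usage and is run with 2> errors.log appended to the command line, the usage message ends up in errors.log while anything printed to standard output still appears on the terminal.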
Your usage statement should contain:
The name of the program
Every non-optional command-line argument your program takes
Every optional command-line argument your program takes
Any extra descriptive material that the user should know about
The usage statement should always begin with the literal word "usage" in lower case, followed by a colon and a space, followed by the rest of the usage message. So the usage message starts with "usage: ". If there are no command-line arguments, the usage message will be very simple, e.g.
usage: myprog
where myprog is the name of the program.
Every command-line argument in the usage statement should be a single word, with no spaces. If you want to write an argument as multiple words, join the words together using underscores. So don't write "number of generations" as a command-line argument; write number_of_generations.
Do not surround a command-line argument name with parentheses, square brackets, angle brackets, curly brackets, or quotation marks. For our example, don't write [number_of_generations] or <number_of_generations> or "number_of_generations"; just write number_of_generations. This is important because square brackets and angle brackets have special meanings in usage messages, and the other forms just look ugly.
The word representing a particular command-line argument should be descriptive. It should say what the command-line argument is supposed to represent, at least in general terms. So this would be a bad usage statement:
usage: myprog arg1 arg2
because arg1 and arg2 could mean anything. This would be a better usage message:
usage: myprog input_file output_file
because it suggests that the first command-line argument (input_file) represents the name of an input file, and the second command-line argument (output_file) presumably represents the name of an output file. This is useful information to someone using the program.
Do not separate successive command-line arguments with commas or semicolons; just separate them with single spaces. So don't write
usage: myprog input_file, output_file
Instead, write:
usage: myprog input_file output_file
Don't use symbolic characters for command-line argument names in usage statements. However, there are other common abbreviations you can use. For instance, instead of writing number_of_generations above you might want to write #generations. Don't do this; instead, write ngenerations (or even ngens if you want it to be shorter). Using "n" as the first letter of a command-line argument is a commonly used abbreviation for "number of".
If you have multiple command-line arguments of the same kind, you can number them. So if your program takes three files of the same type, you might write the usage statement as
usage: myprog file1 file2 file3
Make sure you don't put a space before the number! In other words, don't do this:
usage: myprog file 1 file 2 file 3
The reason for this is that it's hard to tell if the 1, 2, and 3 are supposed to be separate command-line arguments of their own.
If you need to explain in detail what a particular command-line argument means, do it on the lines following the first line. Don't feel that you have to cram the entire usage message on one line. For instance, this is bad:
usage: myprog infile (the input file) outfile (the output file)
It should instead look like this:
usage: myprog infile outfile
    infile: the input file
    outfile: the output file
Note how the first line contains an example of the usage, while the subsequent lines explain what the command-line arguments mean in detail. This is a good pattern to follow.
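One way to follow this pattern in code is to build the message from a list of argument names and descriptions, so that the summary line and the explanatory lines can never get out of step. This is only a sketch: the function name format_usage and the four-space indentation of the explanatory lines are choices made for this example, not requirements.

```python
def format_usage(progname, args):
    """Build a usage message: a summary line, then one line per argument.

    `args` is a list of (name, description) pairs in command-line order.
    """
    names = " ".join(name for name, _ in args)
    # rstrip() removes the trailing space when there are no arguments.
    lines = [f"usage: {progname} {names}".rstrip()]
    for name, description in args:
        lines.append(f"    {name}: {description}")
    return "\n".join(lines)
```

Calling format_usage("myprog", [("infile", "the input file"), ("outfile", "the output file")]) gives a three-line message whose first line is the usage summary and whose remaining lines explain each argument.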
Many programs have optional command-line arguments. There are various situations in which this can arise:
There are a few conventions you should follow with optional arguments. Here they are:
Optional arguments should always be surrounded by square brackets. Do not use square brackets for any other purpose in usage messages. Don't use anything but square brackets for this purpose. Square brackets mean that an argument is optional, always.
Optional "flags" (arguments that change the way the program works) should start with a dash. Very often (but not always) they will have a name which is a dash followed by a single letter, which identifies what it is. Here's an example, which is a greatly simplified version of the usage message for the Unix ls program which lists files:
usage: ls [-a] [-F]
This program is called ls and can be called with no arguments, with one optional argument (ls -a or ls -F) or with two optional arguments (ls -a -F). In this particular case, the -a optional argument says to list hidden files as well as normal ones, and -F changes the way the output of the program is formatted.
Note that each optional argument gets its own set of square brackets.
If an optional "flag" argument itself has arguments, put them inside the square brackets. For instance:
usage: chess [-strength r]
    -strength r: playing strength in approximate rating (800-3000)
    (default strength is 2200)
might represent a chess program that can either play with its default strength (when invoked with no command-line arguments) or with some other strength if the optional argument is used. Note the extra explanatory text that goes after the usage statement itself. This program could be invoked like this:
% chess
(where % is the prompt) or like this:
% chess -strength 2600
but not like this:
% chess -strength
because that wouldn't make sense (the strength isn't specified).
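To make the three invocations above concrete, here is a sketch of how the chess example's arguments might be checked in Python. The function name parse_strength is hypothetical, and the sketch assumes that r must be a whole number and that the 800-3000 range is inclusive; the usage message itself doesn't say either of those things explicitly.

```python
DEFAULT_STRENGTH = 2200  # default taken from the usage message above


def parse_strength(args):
    """Return the playing strength requested by `args` (e.g. sys.argv[1:]).

    Raises ValueError if the arguments do not match `chess [-strength r]`,
    so that the caller can print the usage message and exit.
    """
    if not args:
        return DEFAULT_STRENGTH  # "% chess": use the default
    if len(args) != 2 or args[0] != "-strength":
        # Covers "% chess -strength" (missing r) and unknown arguments.
        raise ValueError("expected: [-strength r]")
    strength = int(args[1])  # raises ValueError if r is not an integer
    if not 800 <= strength <= 3000:
        raise ValueError("strength must be between 800 and 3000")
    return strength
```

A caller would wrap parse_strength in a try/except block and print the usage statement to stderr whenever ValueError is raised.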
Use an ellipsis (...) to indicate that any number of arguments can follow. For instance:
usage: average n1 [n2 ...]
    n1, n2, etc.: numbers between 1 and 10
    Maximum length of the list: 100
This usage message indicates that the program (which is called average) takes at least one number (called n1) and perhaps some other numbers, all between 1 and 10. There can be at most 100 numbers. Note how the lines after the usage statement itself were used to explain the meaning of the command-line arguments and also describe constraints on them (that there be no more than 100 numbers, and that each number is between 1 and 10).
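The constraints stated in that usage message translate directly into checks on the argument list. The following is a sketch only: the function name parse_numbers is made up for this example, and it assumes that the numbers may be non-integers and that the bounds 1 and 10 are inclusive.

```python
def parse_numbers(args):
    """Validate the arguments of `average n1 [n2 ...]` and return floats.

    `args` is the argument list without the program name.
    Raises ValueError on any violation so the caller can print the usage.
    """
    # At least one number, at most 100.
    if not 1 <= len(args) <= 100:
        raise ValueError("expected between 1 and 100 numbers")
    numbers = [float(arg) for arg in args]  # ValueError if not a number
    for n in numbers:
        if not 1 <= n <= 10:
            raise ValueError("each number must be between 1 and 10")
    return numbers
```

With the numbers validated, the average itself is just sum(numbers) / len(numbers).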
You shouldn't hard-code the program name into your program's source code. It won't make any difference in how it's displayed, but if you change the name of your program and don't change your usage message, the usage message will be invalid. Instead, you should write your code so that the program name gets inserted into the usage message before printing. In C, the program name is argv[0] (the first element of the argv array) whereas in Python it's sys.argv[0].
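In Python, inserting the program name might look like the sketch below. The function name usage_line is hypothetical, and stripping the directory part with os.path.basename is a design choice of this example (so that ./myprog.py is shown as myprog.py); nothing above requires it.

```python
import os


def usage_line(argv):
    """Build the first line of a usage message from argv[0]."""
    # argv[0] is the name the program was invoked with; keep only the
    # file name so the message does not depend on the directory used.
    progname = os.path.basename(argv[0])
    return f"usage: {progname} input_file output_file"
```

A program would call usage_line(sys.argv) and print the result to stderr. If the script is later renamed, the usage message follows along without any change to the code.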
Some programs (e.g. C track lab 2) use the less-than (<) and greater-than (>) symbols to perform Unix file redirection on the command line. The less-than symbol means that the contents of a file will look to the program like input typed at the terminal, and the greater-than symbol means that output that would normally go to the terminal is redirected to a file instead. If this is done, it is a good idea to indicate this in the usage message (though many, many programs don't do this). For instance:
usage: myprog < input_file > output_file
This says that the program (myprog) takes its input from the file input_file using input file redirection (remember to put the < symbol before the file name!) and puts its output into the file output_file using output file redirection.
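From the program's point of view, nothing special is needed to support this: it just reads standard input and writes standard output, and the shell does the redirecting. Here is a small Python sketch of such a program; the function number_lines and what it does (prefixing each line with its line number) are invented purely for illustration.

```python
import sys


def number_lines(infile, outfile):
    """Copy infile to outfile, prefixing each line with its line number."""
    for lineno, line in enumerate(infile, start=1):
        outfile.write(f"{lineno}: {line}")


def main():
    # With "myprog < input_file > output_file", sys.stdin reads from
    # input_file and sys.stdout writes to output_file.
    number_lines(sys.stdin, sys.stdout)
```

The program never sees the names input_file and output_file, which is one reason it helps to show the redirection in the usage message.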
Note that this also explains why you don't want to use the less-than (<) or greater-than (>) characters as angle brackets around a command-line argument name; a user might think you meant input/output redirection.