The C language has fairly standardized conventions about how to process command-line arguments, which I summarize here. I will also give some advice on the most effective ways to do this.
Here are the conventions:
Optional command-line arguments have a dash (-) before
them.
Optional command-line arguments are identified by placing a dash
(-) before the optional argument's name. For instance, the
"ls" command in Unix will give a long form output if the
command-line argument "-l" is provided (the "%" is
the unix prompt):
% ls -l
total 48
-rw-rw-r-- 1 mvanier cs11 16668 Apr 1 01:23 c_style_guide.html
-rwxr-xr-x 1 mvanier cs11 2296 Apr 1 01:21 c_style_check
-rw-r--r-- 1 mvanier cs11 755 Apr 1 15:46 cmdline_args.html
-rw-r--r-- 1 mvanier cs11 8077 Feb 11 22:23 gdb.html
-rw-rw-r-- 1 mvanier cs11 8290 Sep 25 2001 make.html
In general, if an argument doesn't have a dash in front of it, it's not optional unless it's an argument to another command-line argument (see below). Note that programs in DOS or Windows typically use a forward slash (/) for command-line options. For this course, we will use the dash only (which is the Unix convention).
Optional command-line arguments may be located anywhere in the argument list and in any order.
Don't assume that your user will always put optional arguments before non-optional arguments, or will put optional arguments in a particular order. Doing this invariably leads to very convoluted code which is a pain to read and doesn't work much of the time. I'll show you how to do it the right way later in this page.
Optional command-line arguments may themselves have arguments.
Sometimes an optional argument may have arguments of its own. These arguments don't usually have dashes in front of them, and are often numbers. It's as if you're saying "you may not need to do this optional task at all, but if you do, you'll need to know these other argument values as well". For instance, a sort program may have an optional argument which tells which kind of sort routine to use:
% sort -method bubble words.txt
Here, the argument "bubble" is an argument to the
"-method" option and specifies a bubble sort (a particular kind
of sorting algorithm). If it weren't included, the default might be to use
quicksort (another sorting algorithm):
% sort words.txt
Note that here you omit both the "-method" optional argument
and its argument "bubble". Fortunately for you, none of the
programs in this track have optional arguments that themselves have
arguments, but you will certainly see this and/or have to implement this
eventually.
Exceptions to the conventions
Some programs don't use the dash in front of optional arguments (the
"tar" program is an example; it's often invoked as "tar
xvf <filename>" where "xvf" are optional
arguments). This is not recommended. Some programs allow several
single-letter options to be preceded by a single dash e.g. "ps
-elf" instead of "ps -e -l -f". This is also not
recommended, at least not for the programs in this track.
Command-line arguments are always represented as an array of strings.
This array is called "argv" (for "argument vector") and there is
also an integer called "argc" (for "argument count") which is
the number of command-line arguments, including the program name.
That's why the main function looks like this:
int main(int argc, char *argv[])
Here, argc is declared to be an int, whereas
argv is an array of char *'s (i.e. strings).
Remember that argv[0] is the program's name, so you normally
won't want to use that except in a usage statement (see below).
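To see what argc and argv actually contain, it can help to dump them. The following is only a minimal sketch: the helper name print_args and its FILE * parameter are my own choices for illustration, not part of any standard library.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a standard function): write argc and then
 * one line per entry of argv to the stream 'out', so you can see how
 * the command line was split up.
 */
void print_args(FILE *out, int argc, char *argv[])
{
    int i;

    fprintf(out, "argc = %d\n", argc);

    for (i = 0; i < argc; i++)
    {
        fprintf(out, "argv[%d] = %s\n", i, argv[i]);
    }
}
```

If you call print_args(stdout, argc, argv) at the top of main and run the program with the arguments "-q words.txt", you should see argc reported as 3, with argv[0] holding the program name, argv[1] holding "-q" and argv[2] holding "words.txt".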
Your first task in main is to process the optional argument
values, if any. The standard way to walk through the argv array
is like this:
int i;
int quiet = 0;  /* Value for the "-q" optional argument. */

for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
{
    /*
     * Use the 'strcmp' function (declared in <string.h>) to compare
     * the argv values to a string of your choice (here, it's the
     * optional argument "-q").  When strcmp returns 0, it means that
     * the two strings are identical.
     */

    if (strcmp(argv[i], "-q") == 0)  /* Process optional arguments. */
    {
        quiet = 1;  /* This is used as a boolean value. */
    }
    else
    {
        /* Process non-optional arguments here. */
    }
}
Note that the "-q" optional argument could be located anywhere on the command line and the program would still work. If the optional argument has arguments of its own the code is trickier:
int i;
int opt = 0;
int optarg1 = 0;
int optarg2 = 0;

for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
{
    if (strcmp(argv[i], "-opt") == 0)  /* Process optional arguments. */
    {
        opt = 1;  /* This is used as a boolean value. */

        /*
         * The last argument is argv[argc-1].  Make sure there are
         * enough arguments.
         */

        if (i + 2 <= argc - 1)  /* There are enough arguments in argv. */
        {
            /*
             * Increment 'i' twice so that you don't check these
             * arguments the next time through the loop.
             */

            i++;
            optarg1 = atoi(argv[i]);  /* Convert string to int. */
            i++;
            optarg2 = atoi(argv[i]);  /* Ditto. */
        }
        else
        {
            /* Print usage statement and exit (see below). */
        }
    }
    else
    {
        /* Process non-optional arguments here. */
    }
}
In some cases, command-line processing can get quite hairy. Fortunately for you, the above examples are more than sufficient for the programs in this course.
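One caveat about the atoi calls above: atoi cannot report errors, so an argument like "abc" silently becomes 0, which looks the same as a real 0. If you need to reject bad numbers, one option is the standard strtol function. Here is a minimal sketch of how that might look. The helper name parse_int and its return convention are assumptions of mine, not something the labs require.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the string 's' to an int.
 * Returns 1 and stores the value in '*result' on success.
 * Returns 0 (leaving '*result' alone) if 's' contains no digits,
 * has trailing junk after the number, or is out of range for an int.
 */
int parse_int(const char *s, int *result)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);  /* Base 10; 'end' marks where parsing stopped. */

    if (end == s || *end != '\0')  /* No digits, or trailing junk. */
    {
        return 0;
    }

    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    {
        return 0;  /* Too big or too small to fit in an int. */
    }

    *result = (int) value;
    return 1;
}
```

With this in place, a line like "optarg1 = atoi(argv[i]);" could become a call to parse_int(argv[i], &optarg1), with a usage message printed when it returns 0.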
Your program has to be able to handle the case when invalid command-line arguments are provided to it without crashing (core dumping etc.). The correct way to handle this is to print a usage message to stderr and then exit the program with a non-zero exit status.
For instance, let's say that your program expects exactly three arguments in addition to the program name, and can take one additional optional argument, so that argc must be either 4 or 5. You could write this:
if (argc < 4 || argc > 5)
{
    fprintf(stderr, "usage: %s filename word count [-w]\n", argv[0]);
    exit(1);
}
There are several parts to this:
The usage message: it always starts with the word
"usage", followed by the program name and the names of the
arguments. Argument names should be descriptive if possible, telling what
the arguments refer to, like "filename" above. Argument
names should not contain spaces! Optional arguments are put between
square brackets, like "-w" above. Do not use square brackets
for non-optional arguments! Always print to stderr, not to
stdout, to indicate that the program has been invoked
incorrectly.
The program name: always use argv[0] to refer to the
program name rather than writing it out explicitly. This means that if you
rename the program (which is common) you won't have to re-write the
code.
Exiting the program: use the exit function, which is
defined in the header file <stdlib.h>. Any non-zero
argument to exit (e.g. exit(1)) signals an
unsuccessful completion of the program (a zero argument to exit
(exit(0)) indicates successful completion of the program, but
you rarely need to use exit for this). If you want to be strictly
portable you can use EXIT_FAILURE and EXIT_SUCCESS (which are defined in
<stdlib.h>) instead of 1 and 0 as arguments
to exit.
If you have to write out a usage statement more than once, make it a
separate function called (obviously) usage and pass it the
program name (argv[0]) as an argument. Then call it from
main whenever the program has invalid arguments.
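As a rough sketch of what this might look like, here is a usage function together with a checking function that calls it from two different places. The check_args helper, and the rule that anything starting with a dash other than "-w" is an error, are assumptions I've made for illustration; adapt them to what your program actually accepts.

```c
#include <stdio.h>
#include <string.h>

/* Print the usage message to stderr.  'progname' should be argv[0]. */
void usage(const char *progname)
{
    fprintf(stderr, "usage: %s filename word count [-w]\n", progname);
}

/*
 * Hypothetical helper: return 1 if the arguments look valid, or
 * print the usage message and return 0 if they don't.  Assumes the
 * program takes exactly three non-optional arguments and that "-w"
 * is the only optional argument.
 */
int check_args(int argc, char *argv[])
{
    int i;
    int positional = 0;  /* Number of non-optional arguments seen. */

    for (i = 1; i < argc; i++)
    {
        if (strcmp(argv[i], "-w") == 0)
        {
            continue;  /* The one optional argument we accept. */
        }
        else if (argv[i][0] == '-')
        {
            usage(argv[0]);  /* Unrecognized optional argument. */
            return 0;
        }
        else
        {
            positional++;
        }
    }

    if (positional != 3)
    {
        usage(argv[0]);  /* Wrong number of non-optional arguments. */
        return 0;
    }

    return 1;
}
```

In main you could then write "if (!check_args(argc, argv)) exit(1);" before doing anything else. Keeping the exit call in main rather than inside usage is just one possible design; many programs put it inside usage instead.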
Always print a usage message to stderr if the
program receives incorrect arguments. Failure to do so means an automatic
redo.
Don't assume that optional arguments will be located in any particular place in the argument list.
This was discussed above.
Don't try to process all the command-line arguments in a single pass if it isn't convenient to do so.
I've seen a lot of C code that tied itself in knots trying to process the
entire argument list in one pass. Typically, the code has a dense nest of
if statements to handle every possible combination of arguments
in every possible order. This is completely unnecessary and is simply bad
programming. Most program invocations have very few command-line arguments,
so even if you just process one of them per pass through the argument list
you still won't be wasting much time.
Having said that, the command-line argument processing for lab 3 can be
done in one pass through the argv array with no
difficulty.
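To make the "one argument per pass" idea concrete, here is a minimal sketch. The helper name has_flag is hypothetical, and it only handles optional arguments that don't take arguments of their own.

```c
#include <string.h>

/*
 * Hypothetical helper: make one pass through argv looking for the
 * optional argument 'flag' (e.g. "-q").  Returns 1 if it appears
 * anywhere after argv[0], and 0 otherwise.
 */
int has_flag(int argc, char *argv[], const char *flag)
{
    int i;

    for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
    {
        if (strcmp(argv[i], flag) == 0)
        {
            return 1;
        }
    }

    return 0;
}
```

In main you could then write "int quiet = has_flag(argc, argv, "-q");" and a similar line for each other optional argument. Each call is a separate pass through argv, but the order of the arguments on the command line no longer matters and there are no nested if statements to untangle.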
Don't alter the argv array!
Some programmers do strange manipulations to the argv array
involving pointer arithmetic, moving arguments around, trying to delete
arguments, etc. The usual reason for this is to get rid of the arguments
that have already been processed, particularly optional arguments. It's
really easy to screw up when doing this, and it's almost never
necessary, so don't do it unless there's no other way (and there will always
be another way for all the labs in this track). It's OK to copy some of the
arguments to a separate array and/or separate variables if you need
to.
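Here is one way the copying approach might look, as a sketch only. The helper name collect_positional is hypothetical, and it assumes that every argument starting with a dash is an optional argument with no arguments of its own.

```c
/*
 * Hypothetical helper: copy pointers to the non-optional arguments
 * into the array 'out', which has room for 'max' entries.  The argv
 * array itself is only read, never modified.  Returns the number of
 * arguments copied, or -1 if there were more than 'max' of them.
 */
int collect_positional(int argc, char *argv[], char *out[], int max)
{
    int i;
    int n = 0;

    for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
    {
        if (argv[i][0] == '-')
        {
            continue;  /* Optional argument: leave it out of 'out'. */
        }

        if (n == max)
        {
            return -1;  /* No room left in 'out'. */
        }

        out[n] = argv[i];  /* Copy the pointer, not the characters. */
        n++;
    }

    return n;
}
```

After a call like "n = collect_positional(argc, argv, args, 3);" with args declared as "char *args[3];", the first n entries of args hold the non-optional arguments in their original order, and argv is exactly as it was.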