C track: Using 'gdb' for debugging.

Debugging C programs is often extremely challenging. The direct pointer manipulations permitted by the language give rise to bugs that can't happen in most other computer languages. Often, these bugs manifest themselves in strange ways, such as the program printing interesting messages like "core dump" or "bus error" with no additional information. This is the price you pay for the efficiency and low-level control that the C language provides.

Debugging is a big subject, and we can only scratch the surface here. In general, here are three approaches you can use for debugging:

  1. When you get a bug, put lots of print statements in code likely to have caused the bug so that you can monitor the values of variables which may not be what they should be.

  2. Add lots of assert statements so that when something goes wrong the program halts right away instead of continuing. If you don't know about assert, do "man assert". We will talk more about this later in the course.

  3. Use a debugger to find out where your code went wrong.

These approaches are not mutually exclusive and almost every programmer uses a combination of all three (plus others). The first two methods are pretty self-explanatory. The third needs a bit more explanation, which we provide below. You can also do "man gdb" and/or "info gdb" to get much more information.

GDB basics

GDB stands for GNU Debugger. It is an environment in which you can run a C program under close control (stopping it, stepping through it, and inspecting its variables), which makes it much easier to identify bugs.

To use gdb, do the following:

  1. Compile your program with the -g flag e.g.

        gcc -Wall -Wstrict-prototypes -ansi -pedantic -g myprog.c -o myprog

    (Note that we're using a lot of warning options as well, which are the "-Wall -Wstrict-prototypes -ansi -pedantic" options; these force the compiler to complain if your code isn't ANSI-compliant or if it has other suspicious features. It's a good habit to always use these options.) The "-g" option puts debugging information into the executable. Most importantly, it records which source file and line each piece of machine code came from, along with the names and types of your variables and functions. The text of the source code is not copied into the executable; gdb reads it from the source files on disk, so you can examine the code as the program executes (we'll see how below).

  2. Type gdb myprog (for the example above). This will start the interactive debugger. It's basically an interpreter-like environment in which you can run your program line-by-line and do useful debugging tasks as well.

When in the debugger, you have a choice of lots of commands. Type "help" to get a list of command categories. Here are some of the most important commands:

  1. run: runs the program
  2. where: tells you where you are in the program when you have stopped at some point. Also tells you the calling history of the program up to that point (i.e. which functions have been called to get you where you are).
  3. p <variable>: prints the value of <variable>
  4. break <file>:<line>: causes the program to stop at a particular line in a particular source code file
  5. break <function>: causes the program to stop when entering a particular function.
  6. n: executes the next statement and then stops. If the statement contains a function call, the whole call is executed without stopping inside the called function, so you end up at the next statement in the current function.
  7. s: executes the next statement, possibly entering a new function, and then stops.
  8. l: lists lines in a source code file.
  9. c: continues executing the program.
  10. q: exits (quits) gdb.

Several of these commands have longer names that you can use as well: print for p, next for n, step for s, list for l, continue for c, and quit for q. The where command is also available as backtrace (or bt for short).

For more information about any of these, type help cmdname at the gdb prompt, where cmdname is the name of the command listed above.

Things to try when things go wrong

Let's say that you're running a C program and it core dumps. The error message you get is unlikely to be helpful; it will probably be something like segmentation violation (core dumped). First, let's identify what that cryptic phrase means.

A "segmentation violation" means that your program tried to access memory that it wasn't allowed to. Since Unix is a multitasking operating system, each process lives in its own little world, with its own little hunk of memory that it's allowed to play with. The operating system knows what hunk belongs to your process and what doesn't; if your process tries to access memory that it doesn't have the right to access, then it violates the (memory) segment boundaries and you get a segmentation violation, which (normally) causes your program to abort.

A "core dump" refers to the fact that, on many systems, a "core" file will be "dumped" into the directory from which you ran your program. The file is traditionally called "core" and can be very large (several megabytes or more). That's because it's a dump of what the memory contained when your program crashed. It is possible to use the core file to debug your program, but there are much easier ways to debug, so we won't cover that here. Most Unix shells (i.e. the command interpreter like bash) allow you to put a statement in the initialization file (.bashrc for bash) that restricts the size of core dumps (ideally to zero bytes, in which case no core file is dumped). In bash that statement is "ulimit -c 0"; ask your local Unix guru for more information on other shells.

OK. Now what you need to know is where the segmentation violation occurred. To do this, compile your program with the "-g" option described above, start gdb with "gdb myprog" (where "myprog" is the name of your program), and type "run" at the gdb prompt. Alternatively, you can start gdb with no arguments, type "file myprog" to load your program, and then type "run". [NOTE: if your program needs command-line arguments, you should supply them after "run", e.g. "run arg1 arg2 arg3". Do not repeat the program name; anything after "run" is passed to your program as an argument.] This will run your program until the segmentation violation occurs. Gdb will tell you that the segmentation violation occurred and then wait for your command. It will look something like this:

    Program received signal SIGSEGV, Segmentation fault.
    0x4006cb26 in free () from /lib/libc.so.6

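This means that the segmentation violation (also known as a segmentation fault or segfault for short) occurred in the library function "free". This is weird; does this mean that there is a bug in "free"? Almost certainly not. Instead, your program did something bad that caused "free" to fail, for example by freeing the same pointer twice, by freeing a pointer that did not come from malloc, or by writing past the end of an allocated block and corrupting the allocator's bookkeeping. (Freeing a NULL pointer is not one of these causes; the C standard defines free(NULL) to do nothing.)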

Type "where" and you will get a stack backtrace. This is probably the single most useful thing you can have when something goes wrong. A stack backtrace is a list of function names in your program and associated data. It looks something like this:

    (gdb) where
    #0  0x4006cb26 in free () from /lib/libc.so.6
    #1  0x4006ca0d in free () from /lib/libc.so.6
    #2  0x8048951 in board_updater (array=0x8049bd0, ncells=2) at 1dCA2.c:148
    #3  0x80486be in main (argc=3, argv=0xbffff7b4) at 1dCA2.c:44
    #4  0x40035a52 in __libc_start_main () from /lib/libc.so.6

The stack is a data structure which holds information about functions which have partially finished executing. When a function calls another function, information about the new function being called is "pushed" onto the stack. This includes information such as the arguments to the function, the contents of local variables, etc. This information is referred to as a stack frame. When the function has finished its work, it "pops" its frame off the stack and returns to the previous stack frame, which belongs to the function that called it.

In the above backtrace, we see that the function __libc_start_main called main, which called board_updater, which called free. free appears in two frames (#0 and #1); this most likely means that free called an internal helper function in the C library whose name gdb cannot see, rather than that free literally called itself. The functions __libc_start_main and free are C library functions which you didn't write. main is the good old main function that you write in every C program. What seems to have happened here is that something went wrong in the board_updater function, and gdb even tells you the file and line number of the call (this is the information that the "-g" option added). You should look at that line, and perhaps set a breakpoint there:

    break 1dCA2.c:148

Now, when you run the program again, gdb will stop it on that line, and you will be able to print out the values of any relevant variables before free is called.

There is much, much more to debugging than I have time to go into here, but this should get you started. Reading the gdb info documentation (type "info gdb" at the shell prompt) will be a good place to go for more information, as will asking your Unix guru friends.

Graphical front-ends to gdb

There exist programs which serve as graphical front-ends to gdb. The one we recommend is called ddd (for Data Display Debugger). Learning a graphical debugger can take time, but it's well worth it because it makes it much easier to interact with your program (setting breakpoints, looking at code as it executes, etc.). Describing ddd in detail is beyond the scope of this document, but if you're interested, type "man ddd" at the Unix shell prompt for much more information.