
backtrace of all stack frames, without debugger



I'd like to share a useful debugging trick for Linux, and then ask how to
improve it.

Often (though not always), when a program gets a memory fault (SIGSEGV,
signal 11), all it takes to find the bug is to find where in the program
the memory fault occurred.

Usually you can just let the program dump core, then run gdb on it and do
"where" to get a backtrace showing the exact line the program was running,
and which functions were calling which other functions at the time (these
are the frames on the stack).

Unfortunately, sometimes it's not possible to have a core file. For example,
in a threaded program you sometimes get a bad core or a core of the wrong
thread; sometimes you don't have enough space for a core file; and in a
production environment you can't very well tell your customer: could you
please send me that 100MB file that was generated after the crash, so it
will be easier for me to debug?

Instead, here is a nice debugging trick. Calling debug_signal(SIGSEGV),
defined below, installs a signal handler for SIGSEGV. The handler takes the
program counter from the signal context (which is an undocumented second
parameter to the signal handler in Linux!) and prints out an addr2line(1)
command that finds the source line corresponding to this code address (of
course, you must compile the program with the -g option).

#include <signal.h>
#include <stdio.h>   /* fprintf, sprintf */
#include <stdlib.h>  /* exit */
#include <unistd.h>  /* readlink */

/* returns >=0 on success and buf filled and null-terminated. */
static int
find_current_executable(char *buf,int bufsize){
    int i;
    i=readlink("/proc/self/exe",buf,bufsize);
    if(i>0 && i<bufsize){
        buf[i]='\0'; /* readlink doesn't finish string with null */
        return i+1;
    }
    return -1;
}

static void
debugging_sighandler(int sig, struct sigcontext sc)
{
    /*** the following trick finds the executable's filename in progname */
    char progname[128];
    if(find_current_executable(progname,sizeof(progname)) < 0)
        sprintf(progname,"?unknown-executable?"); /*where is the executable?*/

    /* sc.eip is the saved program counter (32-bit x86 only) */
    fprintf(stderr,
        "Signal %d received. To find where, run\n   addr2line %p -e %s\n",
        sig, (void*) sc.eip, progname);
    exit(1);
}
void
debug_signal(int sig)
{
    signal(sig, (__sighandler_t) debugging_sighandler);
}
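
The second handler argument used above is undocumented, and the eip field
exists only on 32-bit x86. As a rough sketch of the same trick through the
documented interface, assuming glibc on Linux: sigaction(2) with SA_SIGINFO
passes a ucontext_t as the third handler argument, and the saved program
counter can be read from its uc_mcontext. The names context_pc,
fault_reporter and install_fault_reporter are made up for this illustration,
and only x86 and aarch64 are covered.

```c
#define _GNU_SOURCE   /* glibc exposes REG_RIP / REG_EIP only with this */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical helper: saved program counter from the signal context.
 * Returns NULL on architectures this sketch does not know about. */
static void *
context_pc(void *ctx)
{
    ucontext_t *uc = (ucontext_t *) ctx;
#if defined(__x86_64__) && defined(REG_RIP)
    return (void *) uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__) && defined(REG_EIP)
    return (void *) uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__aarch64__)
    return (void *) uc->uc_mcontext.pc;
#else
    (void) uc;
    return NULL;
#endif
}

static void
fault_reporter(int sig, siginfo_t *info, void *ctx)
{
    char msg[128];
    int n;

    (void) info;
    /* snprintf is not formally async-signal-safe; it is used here as a
     * pragmatic choice because the process exits right afterwards. */
    n = snprintf(msg, sizeof(msg), "Signal %d received at pc %p\n",
                 sig, context_pc(ctx));
    if (n > 0) {
        size_t len = (size_t) n < sizeof(msg) ? (size_t) n : sizeof(msg) - 1;
        if (write(STDERR_FILENO, msg, len) < 0) {
            /* nothing useful left to do */
        }
    }
    _exit(1);
}

/* Returns 0 on success, -1 on failure (same as sigaction). */
int
install_fault_reporter(int sig)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = fault_reporter;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}
```

Calling install_fault_reporter(SIGSEGV) early in main() would play the same
role as debug_signal(SIGSEGV); the printed address can be given to addr2line
in the same way.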


Ok, so that's a nice and useful trick. But I want to improve it to show the
code addresses of all the function calls on the stack, not just the last
one. This is important when your program dies in some general library
call, but you want to know where this library function was called from.
I can try reverse-engineering the Linux stack layout, or reading the
appropriate kernel code, but I was wondering if someone already knows how to
do it, or better yet: has ready code which prints out the backtrace from a
live stack.
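
As a possible starting point, and only as a sketch: glibc declares
backtrace() and backtrace_symbols_fd() in <execinfo.h>. The first collects
the return addresses of the live stack, the second writes one line per frame
straight to a file descriptor. The wrapper name dump_backtrace and the depth
limit of 64 are assumptions made for this illustration.

```c
#include <execinfo.h>

#define MAX_FRAMES 64   /* arbitrary depth limit chosen for this sketch */

/* Hypothetical helper: write the return addresses of the live stack to fd,
 * one frame per line. Returns the number of frames captured. */
int
dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

Calling dump_backtrace(2) from debugging_sighandler before it exits should
print each frame's address, which can be passed to addr2line as before. The
listing will also include the handler's own frames, and linking with
-rdynamic lets glibc print function names next to the addresses.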

-- 
Nadav Har'El                        |      Thursday, Sep 13 2001, 26 Elul 5761
nyh@math.technion.ac.il             |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Hardware, n.: The parts of a computer
http://nadav.harel.org.il           |system that can be kicked.
