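This article summarizes what I learned from 100-gdb-tips. Some of the code differs from the original, and I have added notes on problems I ran into while debugging, together with how I solved them.
Debugging a running program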
thread.c
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
void *thread_func(void *p_arg)
{
while (1)
{
printf("%s\n", (char*)p_arg);
sleep(10);
}
}
int main(void)
{
pthread_t t1, t2;
pthread_create(&t1, NULL, thread_func, "Thread 1");
pthread_create(&t2, NULL, thread_func, "Thread 2");
sleep(1000);
return 0;
}
gcc -g -o thread thread.c -lpthread
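Then run ./thread & in the current shell session,
or run ./thread in the foreground and open a separate shell session.
There are two ways to debug a process that is already running. The first is to look up its process ID with ps, for example ps aux | grep thread, and then run gdb thread <pid>.
The second is to start gdb first and then attach to the process with the attach command.
When you no longer want to debug the process, use the "detach" command to release it: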
(gdb) detach
Detaching from program: /data/nan/a, process 10210
(gdb) bt
No stack.
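The attach/detach cycle described above can also be written down as a gdb command file. This is only a sketch: the file name attach.gdb and the PID 12345 are placeholders of mine, not values taken from the sessions in this article.

```gdb
# attach.gdb (hypothetical file name)
# Run with: gdb -x attach.gdb ./thread
# 12345 is a placeholder; substitute the PID reported by ps or pidof.
attach 12345
# List the threads and show where the current one is stopped.
info threads
bt
# Release the process; it keeps running without gdb.
detach
```

Because the commands live in a file, the same inspection can be repeated against another process by changing only the PID.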
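If looking up the process ID every time is tedious, the following script can do it:
# Save as xgdb.sh (and make it executable)
# Usage: xgdb.sh program
prog_bin=$1
running_name=$(basename "$prog_bin")
# pidof -s prints a single PID even if several instances are running
pid=$(pidof -s "$running_name")
gdb -p "$pid"
For example, ./xgdb.sh thread attaches gdb to the running thread process.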
Debugging a child process
Code:
#include <stdio.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
pid_t pid;
pid = fork();
if (pid == 0)
{
printf("child\n");
exit(1);
}
else if (pid > 0)
{
printf("parent\n");
exit(0);
}
else
{
printf("error\n");
}
printf("hello world\n");
return 0;
}
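When debugging a multi-process program, gdb follows the parent process by default. After "set follow-fork-mode child", gdb follows the child process instead.
This command is currently supported on Linux; many other operating systems do not support it, so use it with care.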
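As a rough illustration of following the child, the commands below could be typed at the (gdb) prompt or saved in a command file. I am assuming the listing above is saved as child.c without blank lines and compiled with -g, so that line 11 is the printf("child\n") call; adjust the line number if your file differs.

```gdb
# Follow the child instead of the parent at the next fork().
set follow-fork-mode child
show follow-fork-mode
# Line 11 is assumed to be printf("child\n"); in child.c.
break child.c:11
run
# If the breakpoint is hit, gdb is now stopped inside the child process.
info inferiors
```

With the default detach-on-fork setting, the parent is detached at the fork and runs on freely while the child is debugged.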
Debugging the parent and the child at the same time
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
pid_t pid;
pid = fork();
if (pid < 0)
{
exit(1);
}
else if (pid > 0)
{
printf("Parent\n");
exit(0);
}
printf("Child\n");
return 0;
}
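When debugging a multi-process program, gdb by default follows only the parent; the child runs on its own, outside gdb's control. The session below shows this. Judging by its file name and line numbers, it was recorded with child.c from the previous section, which is why the output is the lowercase "child":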
Starting program: /root/100gdb/a.out
Breakpoint 1, main () at child.c:8
8 pid = fork();
(gdb) n
child
9 if (pid == 0)
(gdb) n
14 else if (pid > 0)
(gdb)
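Stepping over the fork() call on line 8 prints the child's output, which shows that the child process is already running independently.
To debug the parent and the child at the same time, use "set detach-on-fork off" (detach-on-fork is on by default). gdb then keeps both processes under its control, and while you debug one of them the other stays suspended. Using the same program:
After "set detach-on-fork off", run "i inferiors" (i is short for info) to see the process state: both the parent and the child are being debugged by gdb, and the one marked with "*" is the process currently selected. Once the parent has exited, switch to the child with "inferior infno", where infno is the number shown in the Num column.
This command is currently supported on Linux; many other operating systems do not support it, so use it with care.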
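A minimal sketch of the command sequence described above follows. The inferior number 2 is an assumption (it is what gdb normally assigns to the first forked child); check the Num column of "info inferiors" before switching.

```gdb
# Keep both processes under gdb after fork().
set detach-on-fork off
break main
run
# Step over pid = fork(); the child should be held stopped as a new inferior.
next
info inferiors
# Switch to the child; 2 is assumed to be its number in the Num column.
inferior 2
bt
```

Output is deliberately not shown here, since process IDs and the exact layout of "info inferiors" vary between gdb versions.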
In addition, to let the parent and the child run at the same time, use "set schedule-multiple on" (off by default). Using the same code:
Reading symbols from parent...
(gdb) set schedule-multiple on
(gdb) b main
Breakpoint 1 at 0x115d: file parent.c, line 8.
(gdb) r
Starting program: /root/workspace/gdb/parent
Breakpoint 1, main () at parent.c:8
8 pid = fork();
(gdb) n
[Detaching after fork from child process 28513]
Child
9 if (pid < 0)
(gdb) n
13 else if (pid > 0)
(gdb) p pid
$1 = 28513
(gdb) n
15 printf("Parent\n");
(gdb) n
Parent
16 exit(0);
(gdb) q
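The output contains "Child", so the child process did run. Note, however, the line "[Detaching after fork from child process 28513]": detach-on-fork was still on in this session, so the child ran because gdb had detached from it. schedule-multiple only affects processes that gdb still controls, so to observe its effect it has to be combined with "set detach-on-fork off".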
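The following sketch combines the two settings so that both processes stay attached and are resumed together. It lists only the commands to type; I have not included output, because it depends on the gdb version and on timing.

```gdb
# Keep the child attached and let resumption commands run every inferior.
set detach-on-fork off
set schedule-multiple on
show detach-on-fork
show schedule-multiple
break main
run
# Step over fork(), then check that both processes are listed.
next
info inferiors
```

With schedule-multiple off, the same sequence would resume only the threads of the currently selected process.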
Viewing thread information
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
void *thread_func(void *p_arg)
{
while (1)
{
printf("%s\n", (char*)p_arg);
sleep(10);
}
}
int main(void)
{
pthread_t t1, t2;
pthread_create(&t1, NULL, thread_func, "Thread 1");
pthread_create(&t2, NULL, thread_func, "Thread 2");
sleep(1000);
return 0;
}
When debugging a multi-threaded program with gdb, use "i threads" (i is short for info) to list all threads. Taking the program above as an example (running on Linux, x86_64 CPU):
(gdb) b main
Breakpoint 1 at 0x400722: file google.c, line 15.
(gdb) r
Starting program: /root/100gdb/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main () at google.c:15
15 {
(gdb) n
18 pthread_create(&t1, NULL, thread_func, "Thread 1");
(gdb)
[New Thread 0x7ffff77ef700 (LWP 21849)]
19 pthread_create(&t2, NULL, thread_func, "Thread 2");
(gdb)
Thread 1
[New Thread 0x7ffff6fee700 (LWP 21856)]
21 sleep(1000);
(gdb) i thread
Id Target Id Frame
* 1 Thread 0x7ffff7fdd700 (LWP 21836) "a.out" main () at google.c:21
2 Thread 0x7ffff77ef700 (LWP 21849) "a.out" 0x00007ffff78bc30d in nanosleep ()
at ../sysdeps/unix/syscall-template.S:84
3 Thread 0x7ffff6fee700 (LWP 21856) "a.out" clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:81
(gdb)
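The first column (Id) is the unique ID that gdb assigns to each thread: 1, 2, and so on.
The second column (Target Id) is the ID the underlying platform uses for each thread; it may look different on other platforms. On Linux, as here, it looks like: Thread 0x7ffff77ef700 (LWP 21849).
The third column (Frame) shows which function the thread is currently executing.
The thread marked with "*" is the "current thread", which can be thought of as the default thread that gdb commands operate on in a multi-threaded program.
Use "i threads [Id...]" to print only the specified threads, for example: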
(gdb) info threads 1
Id Target Id Frame
* 1 Thread 0x7ffff7da7740 (LWP 28527) "process" main () at process.c:20
(gdb) info threads 1 2
Id Target Id Frame
* 1 Thread 0x7ffff7da7740 (LWP 28527) "process" main () at process.c:20
2 Thread 0x7ffff7da6700 (LWP 28533) "process" 0x00007ffff7e8b670 in __GI___nanosleep (
requested_time=requested_time@entry=0x7ffff7da5ea0, remaining=remaining@entry=0x7ffff7da5ea0)
at ../sysdeps/unix/sysv/linux/nanosleep.c:28
(gdb) info threads 1
Id Target Id Frame
* 1 Thread 0x7ffff7da7740 (LWP 28527) "process" main () at process.c:20
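To change which thread is the current one, gdb provides the "thread" command. The lines below are a small sketch to type at the (gdb) prompt while the program is stopped; the thread numbers 1 and 2 are assumed to match the Id column of the listing above.

```gdb
# Show all threads; the current one is marked with "*".
info threads
# Make thread 2 the current thread and print its stack.
thread 2
bt
# Switch back to the main thread.
thread 1
```

After "thread 2", commands such as bt and frame refer to thread 2 until another thread is selected or a different thread hits a breakpoint.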
Using the $_thread variable
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
int a = 0;
int b = 0;
void *thread1_func(void *p_arg)
{
while (1)
{
a++;
sleep(1);
}
}
void *thread2_func(void *p_arg)
{
while (1)
{
b++;
sleep(1);
}
}
int main(void)
{
pthread_t t1, t2;
pthread_create(&t1, NULL, thread1_func, "Thread 1");
pthread_create(&t2, NULL, thread2_func, "Thread 2");
sleep(1000);
return 0;
}
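gdb 7.2 introduced the $_thread "convenience variable", which holds the number of the thread currently being debugged. It is handy when writing breakpoint commands or command scripts.
First set a watchpoint with "wa a" (wa is short for watch), so that the program stops whenever the value of a changes; then print the thread number in the attached commands block.
When the program stops, the thread number is printed: thread id = 2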
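The watchpoint and its command list described above can be sketched as follows. The breakpoint on main is my addition so that the watchpoint is set while the process is live; the printf format string is an assumption chosen to resemble the "thread id = 2" output quoted in the text.

```gdb
break main
run
# Stop whenever the global variable a changes.
watch a
# With no argument, commands attaches to the most recently set watchpoint.
commands
  printf "thread id=%d\n", $_thread
end
continue
```

Each time a changes, gdb stops, reports the old and new value, and then runs the printf, so the number of the thread that modified a appears next to the watchpoint report.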
Printing the stack of every thread
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
void *thread_func(void *p_arg)
{
while (1)
{
printf("%s\n", (char*)p_arg);
sleep(10);
}
}
int main(void)
{
pthread_t t1, t2;
pthread_create(&t1, NULL, thread_func, "Thread 1");
pthread_create(&t2, NULL, thread_func, "Thread 2");
sleep(1000);
return 0;
}
Breakpoint 1, main () at process.c:14
14 {
(gdb) n
17 pthread_create(&t1, NULL, thread_func, "Thread 1");
(gdb) n
[New Thread 0x7ffff7da6700 (LWP 28549)]
Thread 1
18 pthread_create(&t2, NULL, thread_func, "Thread 2");
(gdb) n
[New Thread 0x7ffff75a5700 (LWP 28550)]
Thread 2
20 sleep(1000);
(gdb) thread apply all bt
Thread 3 (Thread 0x7ffff75a5700 (LWP 28550)):
#0 0x00007ffff7e8b670 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffff75a4ea0, remaining=remaining@entry=0x7ffff75a4ea0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1 0x00007ffff7e8b57a in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#2 0x0000555555555187 in thread_func (p_arg=0x55555555600d) at process.c:10
#3 0x00007ffff7f9e182 in start_thread (arg=<optimized out>) at pthread_create.c:486
#4 0x00007ffff7ec7b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 2 (Thread 0x7ffff7da6700 (LWP 28549)):
#0 0x00007ffff7e8b670 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffff7da5ea0, remaining=remaining@entry=0x7ffff7da5ea0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1 0x00007ffff7e8b57a in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#2 0x0000555555555187 in thread_func (p_arg=0x555555556004) at process.c:10
#3 0x00007ffff7f9e182 in start_thread (arg=<optimized out>) at pthread_create.c:486
#4 0x00007ffff7ec7b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7ffff7da7740 (LWP 28545)):
#0 main () at process.c:20
(gdb) thread apply 1 bt
Thread 1 (Thread 0x7ffff7da7740 (LWP 28545)):
#0 main () at process.c:20
(gdb) thread apply 1-2 bt
Thread 1 (Thread 0x7ffff7da7740 (LWP 28545)):
#0 main () at process.c:20
Thread 2 (Thread 0x7ffff7da6700 (LWP 28549)):
#0 0x00007ffff7e8b670 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffff7da5ea0, remaining=remaining@entry=0x7ffff7da5ea0) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
#1 0x00007ffff7e8b57a in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#2 0x0000555555555187 in thread_func (p_arg=0x555555556004) at process.c:10
#3 0x00007ffff7f9e182 in start_thread (arg=<optimized out>) at pthread_create.c:486
#4 0x00007ffff7ec7b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)
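Use "thread apply all bt" to print the stack of every thread; passing thread IDs instead of all (for example "thread apply 1 bt" or "thread apply 1-2 bt") prints the stacks of only those threads.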
Allowing only one thread to run
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
int a = 0;
int b = 0;
void *thread1_func(void *p_arg)
{
while (1)
{
a++;
sleep(1);
}
}
void *thread2_func(void *p_arg)
{
while (1)
{
b++;
sleep(1);
}
}
int main(void)
{
pthread_t t1, t2;
pthread_create(&t1, NULL, thread1_func, "Thread 1");
pthread_create(&t2, NULL, thread2_func, "Thread 2");
sleep(1000);
return 0;
}
Reading symbols from thread...
(gdb) b thread.c:9
Breakpoint 1 at 0x1161: file thread.c, line 11.
(gdb) r
Starting program: /root/workspace/gdb/thread
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7da6700 (LWP 28631)]
[New Thread 0x7ffff75a5700 (LWP 28632)]
[Switching to Thread 0x7ffff7da6700 (LWP 28631)]
Thread 2 "thread" hit Breakpoint 1, thread1_func (
p_arg=0x555555556004) at thread.c:11
11 a++;
(gdb) p b
$1 = 1
(gdb) n
12 sleep(1);
(gdb)
11 a++;
(gdb)
Thread 2 "thread" hit Breakpoint 1, thread1_func (
p_arg=0x555555556004) at thread.c:11
11 a++;
(gdb) p b
$2 = 5
(gdb)
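thread1_func updates the global variable a, and thread2_func updates the global variable b. I set a breakpoint on the a++ statement in thread1_func. When the breakpoint is first hit, b is 1; after stepping through thread1_func a few times, b has become 5, which shows that thread2_func keeps running while thread1_func is being single-stepped.
To pause the other threads while debugging one thread, use "set scheduler-locking on":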
Reading symbols from thread...
(gdb) b thread
thread.c thread1_func thread2_func
(gdb) b thread.c:9
Breakpoint 1 at 0x1161: file thread.c, line 11.
(gdb) r
Starting program: /root/workspace/gdb/thread
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7da6700 (LWP 28646)]
[New Thread 0x7ffff75a5700 (LWP 28647)]
[Switching to Thread 0x7ffff7da6700 (LWP 28646)]
Thread 2 "thread" hit Breakpoint 1, thread1_func (p_arg=0x555555556004) at thread.c:11
11 a++;
(gdb) set scheduler-locking on
(gdb) p b
$1 = 1
(gdb) n
12 sleep(1);
(gdb) n
11 a++;
(gdb) n
Thread 2 "thread" hit Breakpoint 1, thread1_func (p_arg=0x555555556004) at thread.c:11
11 a++;
(gdb) n
12 sleep(1);
(gdb) n
11 a++;
(gdb) n
Thread 2 "thread" hit Breakpoint 1, thread1_func (p_arg=0x555555556004) at thread.c:11
11 a++;
(gdb) n
12 sleep(1);
(gdb) p a
$2 = 3
(gdb) p b
$3 = 1
After stepping through thread1_func several times, b is still 1, which shows that thread2_func does not run while thread1_func is being single-stepped.
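Besides "on", gdb also accepts "step" and "off" for this setting. The sketch below assumes the program is already running and stopped at the breakpoint in thread1_func; gdb refuses to change the setting before the process has been started.

```gdb
# Show the current mode.
show scheduler-locking
# Lock the scheduler only while single-stepping; continue resumes all threads.
set scheduler-locking step
next
# Restore normal scheduling when finished.
set scheduler-locking off
```

The "step" mode is often a safer choice than "on": with "on", if the thread being stepped waits for a lock held by a paused thread, the debugging session hangs until scheduler locking is turned off.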