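Debugging Tools: gdb
gdb is the GNU debugger, normally used together with gcc. It is one of the essential tools for developing C/C++ programs on Linux, and because it lets you work at the assembly level, it also helps you understand how C++ mechanisms are implemented under the hood. It is a must-have tool for moving beyond the basics.
1. Getting started
1.1 Preparation
Compiled programs usually come in a debug build and a release build. The release build is the one that ships: the compiler applies optimizations to make the code as small and fast as it can. If you debug this build with gdb, you will not see source lines or variable names, and if the binary has also been stripped you will not even see function names, only bare memory addresses. The debug build is the opposite: it is normally compiled without optimization and carries debugging information in the output.
First, compile the code with the -g option of gcc/g++ to produce a debug build. (The -o option names the output file; the default is a.out.)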
g++ -g filename.cpp -o debug
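1.2 Starting a debugging session
There are four main ways to start gdb:
- gdb : (after startup, use the file or attach command to associate the program to debug)
- gdb [program] : (the debug build of the executable)
- gdb [program] [core]: (loads the executable together with a core file)
- gdb [program] [pid]: (debugs a running program; given the process ID pid, gdb attaches automatically)
(Detailed usage is covered in the following sections.)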
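1.3 Common commands
1.3.1 Ctrl+x a
Description:
Toggles the TUI (Text User Interface), which shows the source code in a window above the command prompt and makes the session easier to follow
1.3.2 file [program]
Description:
Loads the executable you want to debug
1.3.3 attach [pid]
Description:
Attaches to a running program for debugging; takes the PID of the program
1.3.4 list [args]
Short form:
l [args]
Description:
Lists part of the program's source code. args mainly takes one of five forms
- LINENUM (common): lists the source around line LINENUM
- FILE:LINENUM: lists the code around line LINENUM in file FILE.
- FUNCTION (common): lists the code around the line where function FUNCTION begins.
- FILE:FUNCTION: lists the code around the line where function FUNCTION begins in file FILE.
- *ADDRESS: lists the code around the line that contains ADDRESS,
Note: this is an address in the code segment, not the runtime address of a variable
Such an address can be obtained with p &main, which prints the address of main in the code segment. If that address is 0x1234, list *0x1234 or l *0x1234 shows the code.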
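1.3.5 break [args...]
Short form:
b [args...]
Description:
Sets a breakpoint in the code. args is mainly made up of [PROBE_MODIFIER] [LOCATION] [thread THREADNUM] [if CONDITION]
- PROBE_MODIFIER may be omitted (rarely used; it is enough to know it exists). It is needed when the breakpoint is placed at a probe point: -probe for a generic probe, -probe-stap for a SystemTap probe
Taking SystemTap as an example: SystemTap is used for kernel analysis. It lets the user place an observation point in kernel code or in a user-space program; when execution reaches that point, a handler written by the user runs, reads the context at the observation point, and performs analysis and statistics.
- LOCATION is the breakpoint location (common): a line number, a function name, or an address (*ADDRESS).
- THREADNUM is a thread number (common), taken from the output of info threads
- CONDITION is a boolean expression that makes this a conditional breakpoint: the program stops only when the condition is true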
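A conditional breakpoint is easiest to get a feel for on a small loop. The function below is a hypothetical practice target and is not part of the original example; the name sum_of_squares and the loop variable i are assumptions made purely for illustration.

```cpp
// Hypothetical practice target for conditional breakpoints.
// Returns 1*1 + 2*2 + ... + n*n (0 when n <= 0).
long sum_of_squares(int n) {
    long total = 0;
    for (int i = 1; i <= n; ++i) {
        // Marked line: a breakpoint here with "if i == 3" stops only on
        // the third iteration instead of on every pass through the loop.
        total += static_cast<long>(i) * i;
    }
    return total;
}
```

Built with -g and called from main, the loop can be stopped on one iteration with a command of the form break LINENUM if i == 3, where LINENUM is wherever the marked line ends up in your file.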
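1.3.6 run [args...]
Short form:
r [args...]
Description:
Starts the program. The arguments may be omitted, and the command is usually issued after the breakpoints have been set with break. Any space-separated arguments that follow are passed to the program being debugged: run $arg1 $arg2 $arg3 is effectively the same as ./program $arg1 $arg2 $arg3
1.3.7 next [num]
Short form:
n [num]
Description:
num defaults to 1; executes the next num lines of source without stepping into function calls
1.3.8 step [num]
Short form:
s [num]
Description:
num defaults to 1; executes the next num lines of source and steps into function calls, which makes it the true "next step"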
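1.3.9 backtrace
Short form:
bt
Description:
Prints every frame of the current call stack; used very often while debugging.
1.3.10 continue [num]
Short form:
c [num]
Description:
Resumes execution until the next breakpoint is hit. If num is given, the breakpoint at which the program is currently stopped is ignored for its next num-1 hits
1.3.11 delete [num]
Short form:
d [num]
Description:
Deletes breakpoints. With no argument, all breakpoints are deleted; num is a breakpoint number as listed by info break
1.3.12 info break
Short form:
i b
Description:
Shows breakpoint information: the number, program address, and source location of every breakpoint
1.3.13 watch [arg]
Short form:
wa [arg]
Description:
Sets a watchpoint (a data breakpoint) on the variable arg; the program stops as soon as its value changes
1.3.14 help [command]
Short form:
h [command]
Description:
Shows help for gdb commands
1.3.15 display arg or {arg1,arg2,...}
Description:
Prints the value of an expression every time the program stops (a very practical debugging technique); used more often than watch.
1.3.16 undisplay arg
Description:
Stops displaying the expression; in gdb, arg is the display number shown by info display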
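To compare watch and display in practice, it helps to have a variable that changes only now and then. The sketch below is a hypothetical target (the function running_max and the variable best are illustrative names, not from the original article).

```cpp
// Hypothetical practice target for watch and display.
// Returns the largest element of values[0..count-1], or 0 if count <= 0.
int running_max(const int* values, int count) {
    if (count <= 0) {
        return 0;
    }
    int best = values[0];
    for (int i = 1; i < count; ++i) {
        if (values[i] > best) {
            best = values[i];  // "watch best" stops each time this store changes best
        }
    }
    return best;
}
```

After stopping inside the function, watch best halts only when a new maximum is stored, whereas display best and display i print both values at every stop, for example after each next.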
1.4 A brief example
#include<stdio.h>
#include<iostream>
#include<string>
using namespace std;
int a(int x,int y){int abc=100;cout<<x<<";"<<y<<endl;};
int b(int x){a(10,30);};
int main(int argc,char ** argv)
{
int a = 10;
b(90);
cout<<"success"<<endl;
getchar();
return 0;
}
# Build a debug version and name the output debug.out
$ g++ -g test.cpp -o debug.out
# Start debugging
$ gdb ./debug.out
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/hugifish/cpp/gdbtest/debug.out...done.
# List the code around main
(gdb) l main
5
6 int a(int x,int y){int abc=100;cout<<x<<";"<<y<<endl;};
7 int b(int x){a(10,30);};
8
9 int main(int argc,char ** argv)
10 {
11 int a = 10;
12 b(90);
13 cout<<"success"<<endl;
14 getchar();
# Set a breakpoint at main
(gdb) b main
Breakpoint 1 at 0x4008c5: file test.cpp, line 11.
# Set a breakpoint at line 13
(gdb) b 13
Breakpoint 2 at 0x4008d6: file test.cpp, line 13.
# Show the current breakpoints
(gdb) i b
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000000004008c5 in main(int, char**) at test.cpp:11
2 breakpoint keep y 0x00000000004008d6 in main(int, char**) at test.cpp:13
# Get the address of main
(gdb) p &main
$1 = (int (*)(int, char **)) 0x4008b6 <main(int, char**)>
# Set another breakpoint at main, this time by address
(gdb) b *0x4008b6
Breakpoint 3 at 0x4008b6: file test.cpp, line 10.
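# Inspect the breakpoints. Note the subtle difference between 1 and 3: breakpoint 1 sits after the function prologue (line 11), while breakpoint 3 sits on the very first instruction of main (line 10)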
(gdb) i b
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000000004008c5 in main(int, char**) at test.cpp:11
2 breakpoint keep y 0x00000000004008d6 in main(int, char**) at test.cpp:13
3 breakpoint keep y 0x00000000004008b6 in main(int, char**) at test.cpp:10
# Delete the third breakpoint
(gdb) d 3
(gdb) info break
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000000004008c5 in main(int, char**) at test.cpp:11
2 breakpoint keep y 0x00000000004008d6 in main(int, char**) at test.cpp:13
# Start the program
(gdb) r
Starting program: /home/hugifish/cpp/gdbtest/./debug.out
Breakpoint 1, main (argc=1, argv=0x7fffffffe4e8) at test.cpp:11
11 int a = 10;
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.3.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
# Go to the next line
(gdb) n
12 b(90);
# Step into the function
(gdb) s
b (x=90) at test.cpp:7
7 int b(int x){a(10,30);};
(gdb) s
a (x=10, y=30) at test.cpp:6
6 int a(int x,int y){int abc=100;cout<<x<<";"<<y<<endl;};
# Inspect the call stack
(gdb) bt
#0 a (x=10, y=30) at test.cpp:6
#1 0x00000000004008b4 in b (x=90) at test.cpp:7
#2 0x00000000004008d6 in main (argc=1, argv=0x7fffffffe4e8) at test.cpp:12
# Run to the next breakpoint
(gdb) c
Continuing.
10;30
Breakpoint 2, main (argc=1, argv=0x7fffffffe4e8) at test.cpp:13
13 cout<<"success"<<endl;
# Quit
(gdb) q
A debugging session is active.
Inferior 1 [process 31601] will be killed.
Quit anyway? (y or n) y
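2. gdb and core dumps
A core dump is produced when a process receives certain fatal signals: the operating system writes the contents of the process's address space, together with information about its run state, to a file on disk. gdb can use this file to reconstruct the scene of the crash, which makes it a very common way of tracking down problems.
2.1 Core dump settings
- Enabling core dumps: run ulimit -c. An output of 0 means core dumps are disabled. Use ulimit -c [num] to set a size limit (in bash, num is in units of KB); a limit that is too small still prevents a usable dump. In a normal production environment, where disks are typically TB-sized or larger, you can simply remove the limit.
ulimit -c unlimited
This only affects the current shell session; the value is reset at the next login
To make it permanent, edit the limits configuration file, /etc/security/limits.conf (the Linux resource limits configuration file)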
#<domain> <type> <item> <value>
* soft core unlimited
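- Core-related configuration files and their settings
/proc/sys/kernel/core_pipe_limit: defines how many crashing processes may be piped to the configured core collection program concurrently; crashes beyond that number have their dumps skipped. 0 means the number of processes captured in parallel is unlimited.
/proc/sys/kernel/core_uses_pid: if this file contains 1, the process ID is appended to the core file name even when core_pattern contains no %p
/proc/sys/kernel/core_pattern: defines the naming pattern of core files
core_pattern specifier | Description |
---|---|
%h | hostname |
%e | executable file name |
%g | real group ID of the dumped process |
%u | real user ID of the dumped process |
%p | process ID of the dumped process |
%t | time of the core dump, as a timestamp in seconds |
%s | number of the signal that caused the core dump |
A common setting is core_%p_%e_%t. If the pattern contains the directory separator "/", the core file is written to the specified directory; otherwise it is written to the current working directory of the process, which is often the directory the program was started from.
1. If you specify an absolute path, make sure the target directory exists
2. It is best to specify where the file is written: if the program changes its working directory with chdir, the default location is no longer the directory the program was started from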
- Applying a custom core format
Temporary change:
Method 1:
echo 1 >/proc/sys/kernel/core_uses_pid
echo 'core_%p_%e_%t' > /proc/sys/kernel/core_pattern
Method 2:
sysctl -w kernel.core_uses_pid=1
sysctl -w kernel.core_pattern=core_%p_%e_%t
Permanent change:
echo 'kernel.core_pattern=core_%p_%e_%t' >>/etc/sysctl.conf
echo 'kernel.core_uses_pid=1'>>/etc/sysctl.conf
sysctl -p # sysctl configures kernel parameters at runtime; -p loads the settings from a sysctl configuration file, /etc/sysctl.conf by default
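The ulimit -c setting described in this section is applied by the shell. As a rough sketch, a program can also raise the same limit for itself through the POSIX getrlimit/setrlimit calls. The helper name raise_core_limit is an assumption made for illustration, and the soft limit can only be raised as far as the hard limit allows.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raises this process's soft core-file size limit
// to its hard limit, similar in spirit to "ulimit -c" in the shell.
// Returns true on success.
bool raise_core_limit() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return false;
    }
    limit.rlim_cur = limit.rlim_max;  // the soft limit may not exceed the hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Calling this early in main is one option when the launch environment cannot be controlled. If the hard limit itself is 0, no core will be written regardless.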
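2.2 Debugging a core with gdb: a simulated case
First, we should know the conditions that can produce a core:
- Out-of-bounds memory access, for example when indexing past the end of an array or string.
- Multiple threads modifying the same memory without synchronization
- Invalid pointer operations, such as using a pointer whose type does not match the layout of the memory it points to.
- Stack overflow
The easiest of these to construct is incorrect use of a pointer.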
#include<iostream>
using namespace std;
void createCore()
{
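char * abc = "i'm cons value"; // points at a string literal stored in read-only memory
abc[10]= 'a'; // writing through the pointer raises SIGSEGV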
}
int main(int argc,char ** argv)
{
createCore();
return 0;
}
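In the code above, a pointer to a string literal is used to modify the literal, which lives in a read-only region. This is an invalid memory operation through a pointer, and it causes a core dump.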
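For contrast, the crash disappears as soon as the write goes to a writable copy instead of the literal itself. The following is only a sketch of that idea; patched_copy is a hypothetical helper name and std::string is just one of several ways to obtain writable storage.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies a string literal into writable storage and
// replaces one character there, leaving the read-only literal untouched.
std::string patched_copy(const char* literal, std::size_t index, char replacement) {
    std::string copy(literal);      // heap-backed, writable copy
    if (index < copy.size()) {      // bounds check avoids an out-of-range write
        copy[index] = replacement;
    }
    return copy;
}
```

Comparing the two versions under gdb makes the difference visible: here the store targets heap memory owned by the string, not the read-only data of the executable.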
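Next we use the debug build and the release build in turn to examine the core.
# Build both versions of the program
$ g++ -g test.cpp -o debug.out
$ g++ test.cpp -o release.out
# Then run the release build, which dumps core
$ ./release.out
# Debug the core with the debug build
$ gdb ./debug.out core_hugi_centos_release.out_1558503618.20005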
Reading symbols from /home/hugifish/cpp/debug.out...done.
warning: core file may not match specified executable file.
[New LWP 20005]
Core was generated by `./release.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400641 in createCore () at test.cpp:7
7 abc[10]= 'a';
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.5.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
(gdb) bt
#0 0x0000000000400641 in createCore () at test.cpp:7
#1 0x000000000040065a in main (argc=1, argv=0x7ffd11839a38) at test.cpp:11
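# The exact line that dumped core is clearly identified
# In real production work, however, the debug build that matches the core may be missing.
# In that case we have to debug with the release build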
$ gdb ./release.out core_hugi_centos_release.out_1558503618.20005
Reading symbols from /home/hugifish/cpp/release.out...(no debugging symbols found)...done.
[New LWP 20005]
Core was generated by `./release.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400641 in createCore() ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.5.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
(gdb) bt
#0 0x0000000000400641 in createCore() ()
#1 0x000000000040065a in main ()
# Because the debug information is missing, gdb cannot tell us which source line failed
# So we first select the faulting frame with f 0
(gdb) f 0
#0 0x0000000000400641 in createCore() ()
# The core dump happened in a function called createCore, but we do not know on which line.
# Use disassemble to get the assembly code of the function
(gdb) disassemble
Dump of assembler code for function _Z10createCorev:
0x000000000040062d <+0>: push %rbp
0x000000000040062e <+1>: mov %rsp,%rbp
0x0000000000400631 <+4>: movq $0x400750,-0x8(%rbp)
0x0000000000400639 <+12>: mov -0x8(%rbp),%rax
0x000000000040063d <+16>: add $0xa,%rax
=> 0x0000000000400641 <+20>: movb $0x61,(%rax)
0x0000000000400644 <+23>: pop %rbp
0x0000000000400645 <+24>: retq
End of assembler dump.
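# The arrow marks the instruction where the core happened
# movb is clearly a store, and it is the last instruction before the function returns
# Matching this against the source, the culprit is the assignment on the last line of the function body
In summary, the most important step in debugging a core is locating where it happened; after that, analyze the code to find out why.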
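3. gdb and multithreading
3.1 gdb commands for multithreaded debugging
Besides the common commands used for single-threaded debugging, multithreaded debugging relies on the following important commands.
gdb command | Short form | Description |
---|---|---|
info threads | i threads | Lists all threads in the current process. gdb assigns every thread an id, starting from 1; the thread marked with * is the one currently being debugged |
thread [id] | t [id] | Switches to the thread with number id, where id is the value in the first column of info threads; with no id, prints the current thread |
break [location] [thread THREADNUM] | b [location] [thread THREADNUM] | Sets a breakpoint at a location; in a multithreaded program the breakpoint applies to all threads unless thread THREADNUM restricts it to one |
thread apply [ids...] [command] | - | Runs the gdb command in each thread of the given id list |
thread apply all command | - | Runs the gdb command in all threads |
set scheduler-locking [off/on/step] | - | Sets the scheduler locking mode for the running program. Normally, when gdb resumes the program with step/continue, all threads run; this setting lets you isolate threads. off (the default in the gdb version used here): no thread is locked and all threads run; on: only the thread being debugged runs; step: only the current thread runs during a step, while continue lets all threads run. |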
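To try the commands in the table above, a program with a handful of identical worker threads is a convenient target. The sketch below is hypothetical (worker, run_workers, and the shared counter are illustrative names) and should be built with -g -pthread.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex counter_mutex;  // protects counter
long counter = 0;          // shared state, handy for "thread apply all print counter"

// Each worker increments the shared counter `iterations` times.
void worker(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex);
        ++counter;  // candidate line for "break LINENUM thread 2"
    }
}

// Starts thread_count workers, waits for them, and returns the final count.
long run_workers(int thread_count, int iterations) {
    counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back(worker, iterations);
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```

One thing worth observing with this target: under set scheduler-locking on, stepping a thread that is waiting for counter_mutex while the frozen thread holds it will appear to hang, because the holder never gets to run. Switching the mode back to off or step releases it.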
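Linux commands useful when debugging multithreaded programs:
Linux command | Description |
---|---|
ps aux \| grep [name] | Shows details of the process called name; its process ID can then be used to attach gdb to the multithreaded program. |
pstree -p [pid] | Shows the relationship between the main thread and its child threads |
pstack [pid] | Prints the stack of each thread in the process |
Things to note when building multithreaded programs with g++.
- Below we use C++11 threads, because threading at the language level is highly portable. To compile as C++11, add the option -std=c++11; alternatively, add alias g++='g++ -std=c++11' to .bashrc so that the C++11 standard is used by default
- Multithreaded programs also need the -pthread option, because g++ does not link the pthread library by default
- The -g option of gcc/g++ produces a debug build, but that build carries no macro information. Replace -g with -ggdb3 and gdb can work with macros. -ggdb3 has three parts: -g adds debugging information, gdb asks for the most expressive information gdb can use, and 3 is the debug level, the level at which macro definitions are included
In gdb, info macro [name] shows where a macro is defined and what its definition looks like, and macro expand [expression] shows how a macro expands.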
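A tiny function-like macro is enough to see what -ggdb3 adds. The macro SQUARE and the function area_of_square below are hypothetical names chosen for this sketch.

```cpp
// Hypothetical macro, present only to give gdb something to inspect.
#define SQUARE(x) ((x) * (x))

// Returns the area of a square with the given side length.
int area_of_square(int side) {
    return SQUARE(side);  // stop here, then try the macro commands
}
```

With the file compiled using -ggdb3 and the program stopped inside area_of_square, info macro SQUARE should report the definition and its location, and macro expand SQUARE(side) should print the expanded expression. With plain -g, gdb typically reports that it has no definition for the macro.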
3.2 A brief multithreaded debugging example
#include<iostream>
#include<thread>
#include<unistd.h>
using namespace std;
void threadFunc1()
{
cout << "The function child thread begin..."<<endl;
while(1)
{
cout << "I come from function fun(): 1"<< endl;
sleep(1);
}
cout << "The function child thread end..."<<endl;
}
void threadFunc2()
{
cout << "The function child thread begin..."<<endl;
while(1)
{
cout<< "I come from function fun(): 2" << endl;
sleep(1);
}
cout << "The function child thread end..."<<endl;
}
int main(int argc,char ** argv)
{
cout << "Main thread begin..." <<endl;
cout.sync_with_stdio(true); // keep the C++ streams synchronized with C stdio
thread t1(threadFunc1);
thread t2(threadFunc2);
t1.join();
t2.join();
cout << "Main thread end..." <<endl;
return 0;
}
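# Compile the code
$ g++ -g -pthread -std=c++11 test.cpp -o debug.out
# Run in the background
$ nohup ./debug.out &
Shell commands can show the process ID, the IDs of the child threads, and the current state of the thread stacks, as shown in the figure below:
We can debug this multithreaded process by attaching gdb to it
$ gdb attach 26302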
attach: No such file or directory.
Attaching to process 26302
Reading symbols from /
... (output omitted)
[New LWP 26304] <== here we can see the two lightweight child threads created by the main thread
[New LWP 26303]
.... (output omitted)
(gdb) info threads <== this command shows which threads are currently running
Id Target Id Frame
3 Thread 0x7f12a88be700 (LWP 26303) "debug.out" 0x00007f12a8983e2d in nanosleep () from /lib64/libc.so.6
2 Thread 0x7f12a80bd700 (LWP 26304) "debug.out" 0x00007f12a8983e2d in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x7f12a98da740 (LWP 26302) "debug.out" 0x00007f12a8c94f47 in pthread_join () from /lib64/libpthread.so.0
(gdb) thread 2 <== switch to thread 2
[Switching to thread 2 (Thread 0x7f12a80bd700 (LWP 26304))]
#0 0x00007f12a8983e2d in nanosleep () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function nanosleep,
which has no line number information.
0x00007f12a8983cc4 in sleep () from /lib64/libc.so.6
(gdb) n <== stepping further shows that this is the threadFunc2 thread
Single stepping until exit from function sleep,
which has no line number information.
threadFunc2 () at test.cpp:15
15 void threadFunc2()
(gdb) n
19 cout << "I come from function fun(): 2" << endl;
# Next we set breakpoints on the entry functions of both threads
(gdb) b threadFunc1
Breakpoint 1 at 0x400f81: file test.cpp, line 8.
(gdb) b threadFunc2
Breakpoint 2 at 0x400fc9: file test.cpp, line 17.
(gdb) i b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000400f81 in threadFunc1() at test.cpp:8
2 breakpoint keep y 0x0000000000400fc9 in threadFunc2() at test.cpp:17
(gdb) info threads
[New Thread 0x7ffff67d1700 (LWP 2380)]
Id Target Id Frame
3 Thread 0x7ffff67d1700 (LWP 2380) "debug.out" 0x00007ffff70d0e71 in clone () from /lib64/libc.so.6
* 2 Thread 0x7ffff6fd2700 (LWP 2379) "debug.out" threadFunc1 () at test.cpp:8
1 Thread 0x7ffff7fec740 (LWP 2378) "debug.out" 0x00007ffff70d0e71 in clone () from /lib64/libc.so.6
(gdb) thread 3 <== switch to the threadFunc2 thread and keep going
[Switching to thread 3 (Thread 0x7ffff67d1700 (LWP 2380))]
#0 0x00007ffff70d0e71 in clone () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function clone,
which has no line number information.
0x00007ffff73a7d10 in start_thread () from /lib64/libpthread.so.0
(gdb) n
Single stepping until exit from function start_thread,
which has no line number information.
[Switching to Thread 0x7ffff6fd2700 (LWP 2379)]
Breakpoint 1, threadFunc1 () at test.cpp:8
8 cout << "The function child thread begin..." << endl;
(gdb) n
[Switching to Thread 0x7ffff67d1700 (LWP 2380)]
Breakpoint 2, threadFunc2 () at test.cpp:17
17 cout << "The function child thread begin..." << endl;
(gdb)
The function child thread begin...The function child thread begin...
I come from function fun(): 1
19 cout << "I come from function fun(): 2" << endl;
(gdb) n
I come from function fun(): 2I come from function fun(): 1
20 sleep(1);
(gdb) n <== note that both child threads are running at the same time
I come from function fun(): 1
nI come from function fun(): 1
15 void threadFunc2()
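# To let one thread run while the other waits at its breakpoint, we need to lock the scheduler to the current thread
# Scheduler locking is used very often in multithreaded debugging
(gdb) info threads <== first print the state of the threads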
Id Target Id Frame
* 3 Thread 0x7ffff67d1700 (LWP 2380) "debug.out" threadFunc2 () at test.cpp:19
2 Thread 0x7ffff6fd2700 (LWP 2379) "debug.out" 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
1 Thread 0x7ffff7fec740 (LWP 2378) "debug.out" 0x00007ffff73a8f47 in pthread_join () from /lib64/libpthread.so.0
(gdb) set scheduler-locking on <== lock the scheduler so that only the current thread runs
(gdb) thread 2 <== switch to the other thread
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f713c032700 (LWP 4341))]
#0 0x00007f713c0f7e2d in nanosleep () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function nanosleep,
which has no line number information.
0x00007f713c0f7cc4 in sleep () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function sleep,
which has no line number information.
threadFunc1 () at test.cpp:6
6 void threadFunc1()
(gdb) info threads
Id Target Id Frame
3 Thread 0x7ffff67d1700 (LWP 2380) "debug.out" threadFunc2 () at test.cpp:19
*2 Thread 0x7f713c032700 (LWP 2380) "debug.out" threadFunc1 () at test.cpp:6
1 Thread 0x7f713d04e740 (LWP 2378) "debug.out" 0x00007f713c408f47 in pthread_join () from /lib64/libpthread.so.0
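# Here we can see that we are now in the thread that runs threadFunc1
# We can also run a command in a specific thread without affecting the thread we are currently debugging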
(gdb) thread apply 3 n
Thread 3 (Thread 0x7ffff67d1700 (LWP 2380)):
19 cout << "I come from function fun(): 2" << endl;
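3.3 Debugging a multithreaded core with gdb: a simulated case
If you are lucky enough to have the debug build available, congratulations: when gdb loads the core you can see exactly which line crashed, and the steps are identical to debugging a single-threaded core.
Below we use a core generated by a release build without debugging information to illustrate the process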
#include <iostream>
#include <thread>
#include <unistd.h>
using namespace std;
void threadFunc1()
{
cout << "The function child thread begin..." << endl;
while (1) {
cout << "I come from function fun(): 1" << endl;
sleep(3);
char * a ="i'm test";
a[0] = 'x'; // <<<< this write always dumps core
}
cout << "The function child thread end..." << endl;
}
void threadFunc2()
{
cout << "The function child thread begin..." << endl;
while (1) {
cout << "I come from function fun(): 2" << endl;
sleep(1);
}
cout << "The function child thread end..." << endl;
}
int main(int argc, char** argv)
{
cout << "Main thread begin..." << endl;
cout.sync_with_stdio(true); // keep the C++ streams synchronized with C stdio
thread t1(threadFunc1);
thread t2(threadFunc2);
t1.join();
t2.join();
cout << "Main thread end..." << endl;
return 0;
}
# Compile the source into release.out; running it is guaranteed to dump core
$ g++ -std=c++11 -pthread test.cpp -o release.out
$ gdb release.out core_hugi_centos_release.out_1558974085.6559
Reading symbols from /home/hugifish/cpp/thread/release.out...(no debugging symbols found)...done.
[New LWP 6560]
[New LWP 6559]
[New LWP 6561]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./release.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400fd3 in threadFunc1() () <== this is where the core happened
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.5.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
# The output above tells us the core happened inside threadFunc1
# So we set a breakpoint right there and run the program
(gdb) b threadFunc1
Breakpoint 1 at 0x400f81
(gdb) r
Starting program: /home/hugifish/cpp/thread/release.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Main thread begin...
[New Thread 0x7ffff6fd2700 (LWP 7572)]
[Switching to Thread 0x7ffff6fd2700 (LWP 7572)]
Breakpoint 1, 0x0000000000400f81 in threadFunc1() ()
(gdb) info threads
[New Thread 0x7ffff67d1700 (LWP 7573)]
Id Target Id Frame
3 Thread 0x7ffff67d1700 (LWP 7573) "release.out" 0x00007ffff70d0e71 in clone () from /lib64/libc.so.6
* 2 Thread 0x7ffff6fd2700 (LWP 7572) "release.out" 0x0000000000400f81 in threadFunc1() ()
1 Thread 0x7ffff7fec740 (LWP 7568) "release.out" 0x00007ffff70d0e71 in clone () from /lib64/libc.so.6
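# We are now stopped in the function that crashes. From here it works like the single-threaded case: run until the fault, use disassemble to get the assembly of the function, then match it against the source to find the faulty statement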
(gdb) n
Single stepping until exit from function _Z11threadFunc1v,
which has no line number information.
The function child thread begin...
I come from function fun(): 1
The function child thread begin...
I come from function fun(): 2
I come from function fun(): 2
I come from function fun(): 2
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400fd3 in threadFunc1() ()
(gdb) disassemble
Dump of assembler code for function _Z11threadFunc1v:
0x0000000000400f7d <+0>: push %rbp
0x0000000000400f7e <+1>: mov %rsp,%rbp
0x0000000000400f81 <+4>: sub $0x10,%rsp
0x0000000000400f85 <+8>: mov $0x4024d8,%esi
0x0000000000400f8a <+13>: mov $0x604140,%edi
0x0000000000400f8f <+18>: callq 0x400cc0 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400f94 <+23>: mov $0x400d60,%esi
0x0000000000400f99 <+28>: mov %rax,%rdi
0x0000000000400f9c <+31>: callq 0x400d30 <_ZNSolsEPFRSoS_E@plt>
0x0000000000400fa1 <+36>: mov $0x4024fb,%esi
0x0000000000400fa6 <+41>: mov $0x604140,%edi
0x0000000000400fab <+46>: callq 0x400cc0 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400fb0 <+51>: mov $0x400d60,%esi
0x0000000000400fb5 <+56>: mov %rax,%rdi
0x0000000000400fb8 <+59>: callq 0x400d30 <_ZNSolsEPFRSoS_E@plt>
0x0000000000400fbd <+64>: mov $0x3,%edi
0x0000000000400fc2 <+69>: callq 0x400cd0 <sleep@plt>
0x0000000000400fc7 <+74>: movq $0x402519,-0x8(%rbp)
0x0000000000400fcf <+82>: mov -0x8(%rbp),%rax
=> 0x0000000000400fd3 <+86>: movb $0x78,(%rax)
0x0000000000400fd6 <+89>: jmp 0x400fa1 <_Z11threadFunc1v+36>
End of assembler dump.
# See the faulting instruction marked in the assembly above? From here on it is the same as debugging a single-threaded core.
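4. gdb and multiple processes
4.1 A light-hearted look at multiprocessing
When it comes to multiple processes, the following three functions have to be mentioned
Function | Details | Analogy |
---|---|---|
fork | The simplest and most commonly used approach. It creates a process that is a copy of the parent and runs in an address space separate from the parent's. Because fork uses COW (copy-on-write), parent and child share the same physical pages as long as neither of them writes; the pages are merely marked copy-on-write in the page tables. Only when one of the processes writes to data in a page does the system copy that physical page for the writing process and map it into its user address space | The father (parent process) runs a restaurant, and the son (child process) opens an identical restaurant next door; only the house number differs. Father and son each run their own business without disturbing each other. |
vfork | Obsolete. It creates a child process that shares the parent's address space and suspends the parent until the child calls exec or exit; only then does the parent resume. Why obsolete? The original fork had no COW mechanism, and fork is very often followed directly by exec. Without COW, the copying done by fork was wasted work, costing both system resources and time, so vfork was introduced for that case. | The father runs a restaurant and the son wants one too, so he talks the father into resting at home for a couple of days, locks the original front door, opens a new door in the wall, and hangs up his own house number. Only when the son has had enough and tells the father to come back does the father restore the restaurant and resume business. |
exec | The new program replaces the entire contents of the calling process; apart from keeping the process ID, the execution environment is brand new | The father runs a restaurant and the son wants to run one too, so he sends the father home to retire, replaces the menu, redecorates the shopfront, and builds his own business in the premises the father left behind. |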
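The separation of address spaces after fork can be observed directly. The sketch below is a hypothetical illustration (fork_and_wait and its parameter are assumed names): the child overwrites a variable and reports the new value through its exit status, while the parent's copy stays unchanged.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: forks, lets the child overwrite `value` with 42 and
// exit with that value as its status. Returns the child's exit status as
// seen by the parent, or -1 on error.
int fork_and_wait(int& value) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 // fork failed
    }
    if (pid == 0) {
        value = 42;                // the write lands in the child's private copy of the page
        _exit(value);              // leave the child immediately, skipping atexit handlers
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the caller passes a variable holding 1, the function returns 42 while the variable in the parent still holds 1. A program like this is also a small target for experimenting with how gdb behaves across fork.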
4.2 gdb commands for multi-process debugging
Command | Details |
---|---|
set follow-fork-mode [parent/child] | Selects which process gdb follows after a fork. The default is parent: gdb keeps debugging the parent and the child runs unaffected; child: gdb follows the child after the fork |
set detach-on-fork [on/off] | Tells gdb whether to detach from the process it is not following after a fork. The default is on: gdb detaches from the other process and debugs only the one selected by follow-fork-mode; off: gdb keeps control of the other process, holds it suspended at the fork, and debugs the selected one |
info inferiors | Shows the state of the processes under debug; each entry has a unique identifier |
inferior [id] | Switches to the inferior with the given identifier |
An aside
What does inferior actually stand for in GDB? Why is the command info inferiors and not info processes?
Debugging with gdb contains this passage: "gdb represents the state of each program execution with an object called an inferior. An inferior typically corresponds to a process, but is more general and applies also to targets that do not have processes. Inferiors may be created before a process runs, and may be retained after a process exits. Inferiors have unique identifiers that are different from process ids. Usually each inferior will also have its own distinct address space, although some embedded targets may have several inferiors running in different parts of a single address space. Each inferior may in turn have multiple threads running in it.
To find out what inferiors exist at any moment, use info inferiors:"
What follows is my own understanding; corrections are welcome
That long-winded passage is really saying that GDB creates an object called an inferior to record the state of a process over time: before it runs, while it runs, and after it exits. It is a container that GDB itself creates to hold the information debugging needs, not the debugged process itself. That is why we inspect the debugged processes with info inferiors and not info processes (no such command exists)
4.3 Multi-process debugging example
#include<iostream>
#include<unistd.h>
#include<sys/types.h>
#include<thread>
using namespace std;
void thread_func1()
{
while(1)
{
cout << " thread 1"<<endl;
sleep(1);
}
}
void thread_func2()
{
while(1)
{
cout << " thread 2"<<endl;
sleep(1);
}
}
int main(int argc,char** argv)
{
cout.sync_with_stdio(true); // keep the C++ streams synchronized with C stdio
cout << "main begin"<<endl;
pid_t pid = fork();
if(-1 == pid)
{
perror("fork error");
}
else if(0 == pid)
{
thread t1(thread_func1);
thread t2(thread_func2);
cout << "i'm child pid = "<<getpid()<<endl;
t1.join();
t2.join();
}
else
{
cout <<"i'm father pid = "<<getpid()<<endl;
}
cout <<"main end" <<endl;
return 0;
}
# The forked child starts two threads, so -pthread is needed when compiling
$ g++ -std=c++11 -pthread -g test.cpp -o debug.out
$ gdb debug.out
Reading symbols from /home/hugifish/cpp/process/debug.out...done.
# First turn off detach-on-fork so that gdb keeps control of the other process
(gdb) set detach-on-fork off
# Then start the program and use n to step up to the fork call
(gdb) start
Temporary breakpoint 1 at 0x401105: file test.cpp, line 24.
Starting program: /home/hugifish/cpp/process/debug.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Temporary breakpoint 1, main (argc=1, argv=0x7fffffffe438) at test.cpp:24
24 cout.sync_with_stdio(true); // keep the C++ streams synchronized with C stdio
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.5.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
(gdb) n
25 cout << "main begin"<<endl;
(gdb) n
main begin
26 pid_t pid = fork();
(gdb) n
[New process 21151] <== a new process has appeared
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
27 if(-1 == pid)
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.5.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libstdc++-4.8.5-36.el7_6.2.x86_64
(gdb) n
31 else if(0 == pid)
# List the existing inferiors: there are two, one for each process, and 1 is the parent process we are currently debugging
(gdb) info inferiors
Num Description Executable
2 process 21151 /home/hugifish/cpp/process/debug.out
* 1 process 21137 /home/hugifish/cpp/process/debug.out
# Switch to the new inferior created by fork and debug the new process
(gdb) inferior 2
[Switching to inferior 2 [process 21151] (/home/hugifish/cpp/process/debug.out)]
[Switching to thread 2 (Thread 0x7ffff7fec740 (LWP 21151))]
#0 0x00007ffff7097f42 in fork () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function fork,
which has no line number information.
main (argc=1, argv=0x7fffffffe438) at test.cpp:27
27 if(-1 == pid)
(gdb) list
22 int main(int argc,char** argv)
23 {
24 cout.sync_with_stdio(true); // keep the C++ streams synchronized with C stdio
25 cout << "main begin"<<endl;
26 pid_t pid = fork();
27 if(-1 == pid)
28 {
29 perror("fork error");
30 }
31 else if(0 == pid)
(gdb) n
31 else if(0 == pid)
(gdb) n
33 thread t1(thread_func1);
(gdb) n
[New Thread 0x7ffff6fd2700 (LWP 21238)]
34 thread t2(thread_func2);
(gdb) n
thread 1
[New Thread 0x7ffff67d1700 (LWP 21240)]
35 cout << "i'm child pid = "<<getpid()<<endl;
(gdb) n
thread 2
thread 1
i'm child pid = 21151
36 t1.join();
(gdb) n
thread 2
thread 1
thread 2
thread 1
thread 2
thread 1
^C
Program received signal SIGINT, Interrupt.
0x00007ffff73a8f47 in pthread_join () from /lib64/libpthread.so.0
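# info threads shows that there are four threads
# Every process starts with a single thread whose thread ID equals the process ID. Threads 1 and 2 are the initial threads of the two processes; 3 and 4 are the new threads that the forked process created through the thread class.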
(gdb) info threads
Id Target Id Frame
4 Thread 0x7ffff67d1700 (LWP 21240) "debug.out" 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
3 Thread 0x7ffff6fd2700 (LWP 21238) "debug.out" 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
* 2 Thread 0x7ffff7fec740 (LWP 21151) "debug.out" 0x00007ffff73a8f47 in pthread_join () from /lib64/libpthread.so.0
1 Thread 0x7ffff7fec740 (LWP 21137) "debug.out" main (argc=1, argv=0x7fffffffe438) at test.cpp:31
# From here on, thread debugging follows the same steps as in the multithreading section above.
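Closing thoughts
If you develop C/C++ on Linux or UNIX, GDB is one of the essential tools. On Linux desktop systems there is also a graphical debugging tool called Graphic GDB, which beginners can use to get familiar with the GDB workflow. But when you debug a core in a production environment you cannot count on graphical tools: command-line debugging is your last and only resort, and it shows you the program as it really is. In addition, if you need a deep understanding of how C/C++ is implemented underneath, GDB is the best means. Get to know it well and you will discover that it has a beauty of its own.
GDB says: "I may be ugly, but I am gentle~"
A good book for getting to know it is Debugging with GDB. Read the English edition, because the Chinese edition is abridged.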