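Using bpftrace in a container on CentOS 7 (kernel 5.4)
Overview
bpftrace is an open-source tracer built on BPF and BCC. It reimplements most of the tools that BCC provides, which makes it convenient for troubleshooting system performance problems. The bpftrace syntax is very simple, so writing small tools with it is quick and efficient. For example, dynamically tracing the tcp_retransmit_skb() function shows TCP retransmissions. Below is the output of tcpretrans.bt, the bpftrace implementation of that idea. It shows that retransmits occurred on the connection 10.126.161.85:36924 -> 10.126.168.197:8092.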
root@c4:/# tcpretrans.bt
Attaching 3 probes...
Tracing tcp retransmits. Hit Ctrl-C to end.
TIME PID LADDR:LPORT RADDR:RPORT STATE
06:41:55 0 10.126.161.85:33838 10.126.148.72:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
06:41:55 33 10.126.161.85:36924 10.126.168.197:8092 ESTABLISHED
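When the output is longer than this sample, it can help to count retransmits per connection instead of reading line by line. The following is a minimal sketch, not part of bpftrace: count_retrans is a hypothetical helper name, and it assumes the column layout shown above (TIME PID LADDR:LPORT RADDR:RPORT STATE).

```shell
# Summarize tcpretrans.bt output: count retransmits per connection.
# Reads the tool's output on stdin. Assumes the column layout
# TIME PID LADDR:LPORT RADDR:RPORT STATE; lines whose first field is not
# an HH:MM:SS timestamp (banner, header) are ignored.
count_retrans() {
    awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
             n[$3 " -> " $4]++
         }
         END {
             for (c in n) print n[c], c
         }' | sort -rn
}
```

Fed the sample output above (for example from a file it was saved to), this would report 7 retransmits for 10.126.161.85:36924 -> 10.126.168.197:8092 and 1 for 10.126.161.85:33838 -> 10.126.148.72:8092.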
Once you know the basic syntax, the code is easy to read:
#!/usr/bin/env bpftrace
/*
* tcpretrans.bt Trace or count TCP retransmits
* For Linux, uses bpftrace and eBPF.
*
* USAGE: tcpretrans.bt
*
* This is a bpftrace version of the bcc tool of the same name.
* It is limited to ipv4 addresses, and doesn't support tracking TLPs.
*
* This uses dynamic tracing of kernel functions, and will need to be updated
* to match kernel changes.
*
* Copyright (c) 2018 Dale Hamel.
* Licensed under the Apache License, Version 2.0 (the "License")
*
* 23-Nov-2018 Dale Hamel created this.
*/
#include <linux/socket.h>
#include <net/sock.h>

BEGIN
{
    printf("Tracing tcp retransmits. Hit Ctrl-C to end.\n");
    printf("%-8s %-8s %20s %21s %6s\n", "TIME", "PID", "LADDR:LPORT",
        "RADDR:RPORT", "STATE");

    // See include/net/tcp_states.h:
    @tcp_states[1] = "ESTABLISHED";
    @tcp_states[2] = "SYN_SENT";
    @tcp_states[3] = "SYN_RECV";
    @tcp_states[4] = "FIN_WAIT1";
    @tcp_states[5] = "FIN_WAIT2";
    @tcp_states[6] = "TIME_WAIT";
    @tcp_states[7] = "CLOSE";
    @tcp_states[8] = "CLOSE_WAIT";
    @tcp_states[9] = "LAST_ACK";
    @tcp_states[10] = "LISTEN";
    @tcp_states[11] = "CLOSING";
    @tcp_states[12] = "NEW_SYN_RECV";
}

kprobe:tcp_retransmit_skb
{
    $sk = (struct sock *)arg0;
    $inet_family = $sk->__sk_common.skc_family;

    if ($inet_family == AF_INET || $inet_family == AF_INET6) {
        // initialize variable type:
        $daddr = ntop(0);
        $saddr = ntop(0);
        if ($inet_family == AF_INET) {
            $daddr = ntop($sk->__sk_common.skc_daddr);
            $saddr = ntop($sk->__sk_common.skc_rcv_saddr);
        } else {
            $daddr = ntop(
                $sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8);
            $saddr = ntop(
                $sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8);
        }
        $lport = $sk->__sk_common.skc_num;
        $dport = $sk->__sk_common.skc_dport;

        // Destination port is big endian, it must be flipped
        $dport = ($dport >> 8) | (($dport << 8) & 0x00FF00);

        $state = $sk->__sk_common.skc_state;
        $statestr = @tcp_states[$state];

        time("%H:%M:%S ");
        printf("%-8d %14s:%-6d %14s:%-6d %6s\n", pid, $saddr, $lport,
            $daddr, $dport, $statestr);
    }
}

END
{
    clear(@tcp_states);
}
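The one non-obvious line in the script is the byte swap applied to $dport. The port is stored in network (big-endian) byte order, so reading it as a 16-bit integer on a little-endian machine yields the two bytes reversed. The sketch below reproduces the same expression in shell arithmetic; swap16 is a hypothetical name used only for illustration, and the example assumes a little-endian host and an input that fits in 16 bits.

```shell
# Swap the two bytes of a 16-bit value, mirroring the script's expression
#   ($dport >> 8) | (($dport << 8) & 0x00FF00)
# Assumes the argument fits in 16 bits.
swap16() {
    echo $(( (($1 >> 8) | (($1 << 8) & 0xFF00)) ))
}

# Port 8092 is 0x1F9C. Its network-order bytes (1F 9C), read as a
# little-endian integer, give 0x9C1F = 39967.
swap16 39967    # prints 8092
```

Applying the swap twice returns the original value, which is a quick way to sanity-check the expression.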
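The main purpose of this article is to show how to use bpftrace on CentOS 7 with a 5.4 kernel.
Poor support on CentOS
The official installation guide (https://github.com/iovisor/bpftrace/blob/master/INSTALL.md#ubuntu-packages) offers no CentOS package, and building from source has demanding dependencies and is troublesome, especially in production. With the official image (quay.io/iovisor/bpftrace:latest), the tools bundled with bpftrace run fine, but programs you write yourself may be rejected with errors. For example, the network packet tracing tool I wrote failed, as shown below, when reading the device name through struct sk_buff. At first I suspected that the kernel headers referenced by BPF were the wrong version, but the problem turned out to be related to the bpftrace version.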
root@mi:/bpftools# ./pkgtool.bt
./pkgtool.bt:41:10-19: ERROR: Struct/union of type 'struct sk_buff' does not contain a field named 'dev'
$net = $skb->dev->nd_net.net;
~~~~~~~~~
./pkgtool.bt:42:12-21: ERROR: Struct/union of type 'struct sk_buff' does not contain a field named 'dev'
$netif = $skb->dev->name;
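Since the error depends on the bpftrace version, it is worth checking which version an image ships before debugging headers. The sketch below compares version strings with GNU sort -V. Both helper names are hypothetical, and the 0.11.0 threshold in the usage note is an assumption taken from the tag of the image built later in this article, not a documented minimum.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the numeric part from a line such as "bpftrace v0.11.0",
# read on stdin. Assumes that "bpftrace v<version>" output format.
parse_bpftrace_version() {
    sed -n 's/^bpftrace v\([0-9][0-9.]*\).*/\1/p'
}
```

Inside a container this could be used as: ver=$(bpftrace --version | parse_bpftrace_version); version_ge "$ver" 0.11.0 || echo "bpftrace $ver is older than 0.11.0".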
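Installation
The most convenient way I have found so far to get bpftrace on CentOS is to install it inside an Ubuntu image, then package that image and run it on CentOS. Installing Docker is out of scope here; the steps below cover installing bpftrace in an Ubuntu image:
- Pull the Ubuntu image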
# docker pull ubuntu:20.10
- Start an Ubuntu container
# docker run -ti -v /usr/src:/usr/src:ro -v /lib/modules/:/lib/modules:ro -v /sys/kernel/debug/:/sys/kernel/debug:rw --net=host --pid=host --privileged ubuntu:20.10 bash
- Update the package index
# apt-get update
- Install bpftrace
# apt-get install -y bpftrace
- Install an SSH client to make it easy to copy our own programs in (optional)
# apt-get install openssh-client
- Create a new image from the container
# docker commit -a "zhangzhifei" -m "bpftrace on ubuntu" fb64c3f2c0a6 xxx/zhangzhifei/bpftrace:v0.11.0
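fb64c3f2c0a6 is the ID of the Ubuntu container started above; you can look it up with docker ps | grep ubuntu. You can push the resulting image to an image registry and then run it wherever you need it. Running the program that failed earlier now succeeds, and the device name is read correctly: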
07:20:35 4026531992 eth0 783692,device-server,__dev_queue_xmit,33 flags:ack, seq:3811853281, ack:1073387667, win:0 10.xx.xx.xx:3442 10.xx.xx.xx:20171 741 ms
07:20:35 4026531992 docker0 783692,device-server,ip_finish_output2,33 flags:ack, seq:3617609915, ack:4105713565, win:0 10.xx.xx.xx:61003 10.xx.xx.xx:3442 741 ms
- Run
docker run -d --rm -ti -v /root/work/bpf:/bpftools:rw -v /usr/src:/usr/src:ro -v /lib/modules/:/lib/modules/:ro -v /sys/kernel/debug/:/sys/kernel/debug:rw --net=host --pid=host --privileged xxx/zhangzhifei/bpftrace:v0.11.0 sleep 3600
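The docker run commands above bind-mount /usr/src, /lib/modules and /sys/kernel/debug from the host, so the container only works if those directories exist there. A small preflight check along the following lines can catch a missing mount source before the container starts. This is a sketch; missing_dirs is a hypothetical helper, not a Docker or bpftrace command.

```shell
# Print every argument that is not an existing directory.
# Returns non-zero when at least one path is missing.
missing_dirs() {
    rc=0
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "missing: $d"
            rc=1
        fi
    done
    return "$rc"
}
```

On the CentOS host it could be called as missing_dirs /usr/src /lib/modules /sys/kernel/debug, adding /root/work/bpf if that directory is mounted as the tools directory.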
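Summary
The overall installation is very simple. If this approach does not occur to you, though, building from source is still inconvenient, and on an older CentOS release it also runs into dependency problems. If you hit errors when running your own programs with bpftrace, I hope this article helps.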