Data source
- tracepoints: kernel static tracing
- kprobes: kernel dynamic tracing
- uprobes: user level dynamic tracing
- USDT / dtrace probes
These tracepoints are hard coded in interesting and logical locations of
the kernel, so that higher-level behavior can be easily traced. You can
find all the tracepoints in include/trace/events/
Dynamic tracing can see everything, it’s also an unstable interface since
it is instrumenting raw code.
cBPF used for filter. eBPF has (key/value) maps and more helper functions.
Ref: https://jvns.ca/blog/2017/07/05/linux-tracing-systems/
Mechanisms for collecting data
- ftrace/kprobes/tracepoints: /sys/kernel/debug/tracing
- perf_events
- eBPF
- Systemtap kernel module
- sysdig
user frontends: display data
- perf
- ftrace/trace-cmd
- bpftrace/bcc
- systemtap
For user to choose, here are some reference
- PMC statistics: perf
- CPU sampling with timestamps: perf, as its perf.data output has been
optimized
- Function flow tracing: ftrace function graphing.
One-Liners
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
| # bpftrace -l "tracepoint:net:*"
# bpftrace -lv tracepoint:net:net_dev_xmit
# bpftrace -lv "struct path"
struct path {
struct vfsmount *mnt;
struct dentry *dentry;
};
# bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
# bpftrace -e 'BEGIN { printf("hello world\n"); }'
# bpftrace -lv "t:syscalls:sys_enter_openat"
# bpftrace -e 'tracepoint:syscalls:sys_enter_openat / comm == "vim" /{ printf("%s %s\n", comm, str(args->filename)); }'
# bpftrace -e 'kprobe:ip_output { @syscalls = count(); }'
# bpftrace -e 'kprobe:ip_output { @syscalls = count(); } interval:s:1 { print(@syscalls); clear(@syscalls); } '
# bpftrace -e 'kprobe:ip_output { @[kstack()] = count(); }'
|
bpftrace file
1
2
3
4
5
6
7
8
9
10
11
12
13
| # cat path.bt
#!/usr/bin/bpftrace
kprobe:vfs_open
{
printf("open path: %s\n", str(((struct path *)arg0)->dentry->d_name.name));
}
# bpftrace path.bt
Attaching 1 probe...
open path: dev
open path: if_inet6
open path: retrans_time_ms
|
buildin examples
1
2
3
4
5
6
7
| # rpm -ql bpftrace | grep tcp.*bt
/usr/share/bpftrace/tools/tcpaccept.bt
/usr/share/bpftrace/tools/tcpconnect.bt
/usr/share/bpftrace/tools/tcpdrop.bt
/usr/share/bpftrace/tools/tcplife.bt
/usr/share/bpftrace/tools/tcpretrans.bt
/usr/share/bpftrace/tools/tcpsynbl.bt
|
Drop monitor
Drop monitor:
- software drops: rigister probe on
kfree_skb
tracepoint - hardware drops
- devlink call back (origin)
- devlink tracepoint
devlink_trap_report
(5.10)
comparation: perf, bpftrace, dropwatch
1
2
3
| # tc qdisc add dev lo clsact
# tc filter add dev lo egress proto ip flower dst_ip 127.0.0.1 action drop
# ping 127.0.0.1
|
##drop watch
1
2
3
4
5
6
7
8
9
10
11
12
| # dropwatch -l kas
> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
6 drops at ip_rcv+98 (0xffffffff83e276d8)
1 drops at __netif_receive_skb_core+12c (0xffffffff83d9b61c)
2 drops at __netif_receive_skb_core+12c (0xffffffff83d9b61c)
3 drops at ip_rcv+98 (0xffffffff83e276d8)
5 drops at ip_rcv+98 (0xffffffff83e276d8)
...
|
perf
1
2
3
4
5
6
7
8
9
| # perf record -e skb:kfree_skb -ag
# perf script
swapper 0 [001] 3557578.978726: skb:kfree_skb: skbaddr=0xffff9850400bba00 protocol=2048 location=0xffffffff83e276d8
ffffffff83d81563 kfree_skb+0x73 (/usr/lib/debug/lib/modules/4.18.0-300.el8.x86_64/vmlinux)
ffffffff83d81563 kfree_skb+0x73 (/usr/lib/debug/lib/modules/4.18.0-300.el8.x86_64/vmlinux)
ffffffff83e276d8 ip_rcv+0x98 (/usr/lib/debug/lib/modules/4.18.0-300.el8.x86_64/vmlinux)
ffffffff83d9c03b __netif_receive_skb_core+0xb4b (/usr/lib/debug/lib/modules/4.18.0-300.el8.x86_64/vmlinux)
ffffffff83d9c1cd netif_receive_skb_internal+0x3d (/usr/lib/debug/lib/modules/4.18.0-300.el8.x86_64/vmlinux)
...
|
bpftrace
1
2
3
4
5
6
7
8
9
10
11
| # bpftrace -e 'tracepoint:skb:kfree_skb { @[kstack] = count(); }'
@[
kfree_skb+115
kfree_skb+115
ip_rcv+152
__netif_receive_skb_core+2891
netif_receive_skb_internal+61
napi_gro_receive+186
...
secondary_startup_64_no_verify+194
]: 4
|
drop watch usage
For software
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
| # dropwatch -lkas
Initalizing kallsyms db
dropwatch> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
2 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
1 drops at ip_rcv_finish+20d (0xffffffffafc2702d)
3 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at __init_scratch_end+e0ad2db (0xffffffffc06ad2db)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
2 drops at __udp4_lib_rcv+ab0 (0xffffffffafc60da0)
2 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at __init_scratch_end+e0ad2db (0xffffffffc06ad2db)
3 drops at ip_rcv+98 (0xffffffffafc276d8)
2 drops at __init_scratch_end+e0ad2db (0xffffffffc06ad2db)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
4 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
4 drops at ip_rcv+98 (0xffffffffafc276d8)
2 drops at __init_scratch_end+e0ad2db (0xffffffffc06ad2db)
3 drops at ip6_mc_input+17a (0xffffffffafcb7b9a)
4 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
1 drops at __init_scratch_end+e0ad2db (0xffffffffc06ad2db)
1 drops at icmpv6_rcv+17c (0xffffffffafcdf55c)
96 drops at ip6_mc_input+17a (0xffffffffafcb7b9a)
71 drops at ip6_mc_input+17a (0xffffffffafcb7b9a)
1 drops at ip_rcv+98 (0xffffffffafc276d8)
1 drops at br_net_exit+7160 (0xffffffffc075f510)
2 drops at ip_rcv+98 (0xffffffffafc276d8)
^CGot a stop message
dropwatch> exit
Shutting down ...
|
For hardware
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
| # modprobe netdevsim
# echo "10 2" > /sys/bus/netdevsim/new_device
# ls /sys/bus/netdevsim/devices/netdevsim10/net
eth4 eth5
# ip link set eth4 up
# devlink dev show
netdevsim/netdevsim10
# devlink trap show
netdevsim/netdevsim10:
name source_mac_is_multicast type drop generic true action drop group l2_drops
name vlan_tag_mismatch type drop generic true action drop group l2_drops
name ingress_vlan_filter type drop generic true action drop group l2_drops
...
# devlink trap set netdevsim/netdevsim10 trap blackhole_route action trap
# devlink trap show netdevsim/netdevsim10 trap blackhole_route
netdevsim/netdevsim10:
name blackhole_route type drop generic true action trap group l3_drops
|
Reference: https://lwn.net/Articles/796088/
Author
Hangbin Liu
LastMod
2021-07-15
(132542b)