AWK Examples
2022-05-31
QXPLUS
1. Insert 'e f g' after 'b' in 'a b c d'
{$2=$2" e f g";}
[Mb18@login Work_qx]$ echo 'a b c d' | awk '{$2=$2" e f g";print}'
a b e f g c d
- Assigning to $2 rebuilds $0, joining the fields with OFS (default: OFS=" ")
[Mb18@login Work_qx]$ echo 'a b c d' | awk '{$2=$2;print}'
a b c d
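As a small sketch of the rebuild rule, the commands below use an input with irregular spacing and a non-default OFS so that the rebuilt separators are visible. The value OFS="-" and the variable names are assumptions made only for this illustration.

```shell
# Input with irregular spacing; OFS="-" is an arbitrary choice for this sketch.
input='a  b   c d'

# No field is assigned, so $0 is printed exactly as it was read.
untouched=$(echo "$input" | awk 'BEGIN{OFS="-"} {print}')

# Assigning to $2 rebuilds $0: the fields are joined with OFS.
rebuilt=$(echo "$input" | awk 'BEGIN{OFS="-"} {$2=$2" e f g"; print}')

echo "$untouched"   # a  b   c d
echo "$rebuilt"     # a-b e f g-c-d
```

OFS only shows up after a field assignment; setting it alone does not change how an unmodified $0 is printed.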
2. Normalizing whitespace
[Mb18@login TESTA01]$ cat test1.txt
aaaa bbbb ccc
dd ff ggg
ddd fff eee gg hh jj jj
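- Assigning to any field (here $2=$2) rebuilds $0, so every run of whitespace between fields becomes a single OFS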
[Mb18@login TESTA01]$ awk '{$2=$2;print}' test1.txt
aaaa bbbb ccc
dd ff ggg
ddd fff eee gg hh jj jj
- Set OFS = "\t" to join the fields with tabs instead
[Mb18@login TESTA01]$ awk 'BEGIN{OFS = "\t"} {$1=$1;print}' test1.txt
aaaa	bbbb	ccc
dd	ff	ggg
ddd	fff	eee	gg	hh	jj	jj
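The same idiom can be wrapped in a small helper. This is a sketch only: the function name squeeze_ws and its optional separator argument are hypothetical, not something the original text defines.

```shell
# squeeze_ws [SEP]: read stdin, trim each line, and join its fields with SEP
# (default: one space). Hypothetical helper written for this sketch.
squeeze_ws() {
    awk -v OFS="${1:- }" '{$1=$1; print}'
}

# A line with leading, trailing, and mixed space/tab separators.
messy=$(printf '  aaaa \t bbbb   ccc  \n')

spaces=$(printf '%s\n' "$messy" | squeeze_ws)
commas=$(printf '%s\n' "$messy" | squeeze_ws ',')

echo "$spaces"   # aaaa bbbb ccc
echo "$commas"   # aaaa,bbbb,ccc
```

Because -v processes escape sequences, passing '\t' as the argument joins the fields with tabs, as in the BEGIN{OFS = "\t"} example.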
3. Extracting IPv4 addresses
From the output of the ifconfig command, extract the IPv4 addresses of every interface except lo
Method 1:
ifconfig
[Mb18@login TESTA01]$ ifconfig
ens2f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.10.10.41 netmask 255.255.0.0 broadcast 10.10.255.255
inet6 fe80::ea61:1fff:fe21:c9d8 prefixlen 64 scopeid 0x20<link>
ether e8:61:1f:21:c9:d8 txqueuelen 1000 (Ethernet)
RX packets 962102443 bytes 188205391048 (175.2 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 903676777 bytes 86806163792 (80.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xc5000000-c57fffff
ens2f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.201 netmask 255.255.252.0 broadcast 192.168.3.255
inet6 fe80::ea61:1fff:fe21:c9d9 prefixlen 64 scopeid 0x20<link>
ether e8:61:1f:21:c9:d9 txqueuelen 1000 (Ethernet)
RX packets 14030133704 bytes 15282171947551 (13.8 TiB)
RX errors 0 dropped 28663 overruns 764 frame 0
TX packets 28089382895 bytes 39695929075318 (36.1 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xc4800000-c4ffffff
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 12.12.12.41 netmask 255.255.255.0 broadcast 12.12.12.255
inet6 fe80::526b:4b03:3a:8c5c prefixlen 64 scopeid 0x20<link>
infiniband 20:00:00:67:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 28984646487 bytes 57171279777901 (51.9 TiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 14558790557 bytes 26489320986891 (24.0 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1 (Local Loopback)
RX packets 212338469 bytes 111843472616 (104.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 212338469 bytes 111843472616 (104.1 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
- Select the lines that contain 'inet ' (4 lines); the inet6 lines do not match because of the trailing space
[Mb18@login TESTA01]$ ifconfig | awk '/inet /'
inet 10.10.10.41 netmask 255.255.0.0 broadcast 10.10.255.255
inet 192.168.1.201 netmask 255.255.252.0 broadcast 192.168.3.255
inet 12.12.12.41 netmask 255.255.255.0 broadcast 12.12.12.255
inet 127.0.0.1 netmask 255.0.0.0
- Building on that, drop the lines whose address starts with 127 (4 lines -> 3 lines)
[Mb18@login TESTA01]$ ifconfig | awk '/inet / && !($2~ /^127/)'
inet 10.10.10.41 netmask 255.255.0.0 broadcast 10.10.255.255
inet 192.168.1.201 netmask 255.255.252.0 broadcast 192.168.3.255
inet 12.12.12.41 netmask 255.255.255.0 broadcast 12.12.12.255
- Print the second field, which is the IPv4 address
[Mb18@login TESTA01]$ ifconfig | awk '/inet / && !($2~ /^127/) {print $2}'
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
10.10.10.41
192.168.1.201
12.12.12.41
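To try Method 1 on a machine without ifconfig, the sketch below feeds awk an abridged, made-up sample in the same layout. It also swaps the /inet / regex for the comparison $1 == "inet", which is a variant and not the command used above.

```shell
# Abridged, made-up ifconfig-style text so the sketch runs without ifconfig.
sample='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.5  netmask 255.255.255.0  broadcast 10.0.0.255
        inet6 fe80::1  prefixlen 64  scopeid 0x20<link>

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>'

# $1 == "inet" selects IPv4 lines only (on inet6 lines $1 is "inet6");
# $2 !~ /^127\./ drops the loopback address.
ipv4=$(printf '%s\n' "$sample" | awk '$1 == "inet" && $2 !~ /^127\./ {print $2}')

echo "$ipv4"   # 10.0.0.5
```

The Infiniband warning in the transcript above is not filtered by awk, which suggests it is written to stderr; if so, ifconfig 2>/dev/null | awk ... would hide it.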
Method 2:
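- Make awk read by paragraph instead of by line (the default) by changing the input record separator RS:
- RS="": read by paragraph; records are separated by one or more blank lines
- RS="\0": read all data at once, although some special files may contain the NUL character \0
- RS="^$": truly read all data at once; an empty file yields no record and is therefore skipped (regex RS, as supported by gawk)
- RS="\n+": read line by line but ignore all empty lines (regex RS, as supported by gawk)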
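The difference between line mode and paragraph mode can be checked by counting records. The sketch below sticks to RS="" because it is the only form in the list that does not depend on regex RS support; the sample text is made up.

```shell
# Two paragraphs separated by two blank lines (made-up sample).
text=$(printf 'a\nb\n\n\nc\n')

# Default RS: every line, including the empty ones, is a record.
by_line=$(printf '%s\n' "$text" | awk 'END{print NR}')

# RS="": a run of blank lines separates records, so each paragraph is one record.
by_paragraph=$(printf '%s\n' "$text" | awk 'BEGIN{RS=""} END{print NR}')

echo "$by_line $by_paragraph"   # 5 2
```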
Print the first paragraph of the ifconfig output. Each paragraph is stored in $0.
[Mb18@login TESTA01]$ ifconfig | awk 'BEGIN{RS = ""} NR==1 {print}'
ens2f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.10.10.41 netmask 255.255.0.0 broadcast 10.10.255.255
inet6 fe80::ea61:1fff:fe21:c9d8 prefixlen 64 scopeid 0x20<link>
ether e8:61:1f:21:c9:d8 txqueuelen 1000 (Ethernet)
RX packets 962144533 bytes 188213637436 (175.2 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 903722507 bytes 86810472621 (80.8 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xc5000000-c57fffff
- After splitting into paragraphs, $0 is split into fields on whitespace (spaces, \t, and \n)
[Mb18@login TESTA01]$ ifconfig | awk 'BEGIN{RS = ""} NR==1 {print $6}'
10.10.10.41
- Print the IPv4 address of every interface except lo
[Mb18@login TESTA01]$ ifconfig | awk 'BEGIN{RS = ""} !/^lo:/ {print $6}'
10.10.10.41
192.168.1.201
12.12.12.41
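Printing $6 assumes that every paragraph has an inet line directly after its header. A possible variant, sketched below on a made-up sample with an interface that has no IPv4 address, searches for the field that follows "inet" instead of relying on a fixed position.

```shell
# Made-up sample: eth1 has no IPv4 address, so its 6th field is a MAC address.
sample='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.5  netmask 255.255.255.0  broadcast 10.0.0.255

eth1: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 00:11:22:33:44:55  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0'

# Fixed position: also prints the MAC address of eth1.
fixed=$(printf '%s\n' "$sample" | awk 'BEGIN{RS=""} !/^lo:/ {print $6}')

# Search: print the field that follows "inet" (an "inet6" field never equals "inet").
searched=$(printf '%s\n' "$sample" | awk 'BEGIN{RS=""} !/^lo:/ {
    for (i = 1; i < NF; i++)
        if ($i == "inet") print $(i + 1)
}')

echo "$searched"   # 10.0.0.5
```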
Method 3:
- First set FS="\n" so that each line of a paragraph is one field, then take the second line
[Mb18@login TESTA01]$ ifconfig | awk 'BEGIN{RS = ""; FS = "\n"} !/^lo:/ {print $2}'
inet 10.10.10.41 netmask 255.255.0.0 broadcast 10.10.255.255
inet 192.168.1.201 netmask 255.255.252.0 broadcast 192.168.3.255
inet 12.12.12.41 netmask 255.255.255.0 broadcast 12.12.12.255
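- Then switch back to FS=" ", re-split the record with $0 = $0, and print the second element of the second line. Because FS stays " " for the records that follow, their $2 is no longer the inet line, so only the first interface yields an address; resetting FS = "\n" at the end of the action would handle every interface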
[Mb18@login TESTA01]$ ifconfig | awk 'BEGIN{RS = ""; FS = "\n"} !/^lo:/ {$0 = $2; FS = " "; $0 = $0; print $2}'
10.10.10.41
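A sketch of Method 3 that handles every paragraph is shown here on a made-up sample. It restores FS = "\n" at the end of the action so the next record is split by line again; the second command is an alternative that uses split() and never touches FS. Both are illustrations under these assumptions, not the original author's commands.

```shell
# Made-up sample with two non-loopback interfaces.
sample='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.5  netmask 255.255.255.0  broadcast 10.0.0.255

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.7  netmask 255.255.255.0  broadcast 192.168.0.255

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0'

# Re-split line 2 on whitespace, then put FS back for the next record.
via_fs=$(printf '%s\n' "$sample" | awk 'BEGIN{RS = ""; FS = "\n"}
    !/^lo:/ {$0 = $2; FS = " "; $0 = $0; print $2; FS = "\n"}')

# Alternative: split line 2 into an array; " " means default whitespace splitting.
via_split=$(printf '%s\n' "$sample" | awk 'BEGIN{RS = ""; FS = "\n"}
    !/^lo:/ {split($2, part, " "); print part[2]}')

echo "$via_fs"
echo "$via_split"
```

Both commands print 10.0.0.5 and 192.168.0.7, one per line.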
4. Reading a .ini configuration file
test.ini
[Mb18@login TESTA01]$ cat test.ini
[mysqld]
#设置3306端口号
port=3306
#设置MySQL的安装目录
basedir=D:\\mysql\\mysql-8.0.16-winx64(这是我的MySQL路径,注意用\\而非\)
#设置MySQL数据库的数据存放目录
datadir=D:\\mysql\\mysql-8.0.16-winx64\\data(与上面同理,注意最后的data文件名保存不变)
#运行最大连接数
max_connections=200
#运行连接失败的次数。这也是为了防止有人从该主机试图攻击数据库系统
max_connect_errors=10
#服务端使用的字符集默认为utf-8
character-set-server=utf8
[mysql]
#客户端使用的字符集默认为utf8
default-character-set=utf8
[client]
#客户端默认端口号为3306
port=3306
- Find the [mysql] section and print all of it
- Return values of getline:
  >0: a record was read
  =0: end of file (EOF) was reached and nothing was read
  <0: a read error occurred
- 1.awk
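# Print the [mysql] section: its header and every line up to the next header.
index($0, "[mysql]") {
    print
    # getline returns >0 while there is another line to read
    while ((getline var) > 0) {
        # Stop at the next section header, e.g. [client].
        # (The earlier /\[*\]/ matched any line that contains "]".)
        if (var ~ /^\[.*\]/) {
            exit
        }
        print var
    }
}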
- Run the 1.awk script
[Mb18@login TESTA01]$ awk -f 1.awk test.ini
[mysql]
#客户端使用的字符集默认为utf8
default-character-set=utf8
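The three getline return values can be exercised with a redirected getline in a BEGIN block. The sketch below is one possible generalization, not the script above: print_section, its two arguments, and the temporary sample file are all hypothetical.

```shell
# Small made-up .ini file for this sketch; removed when the shell exits.
ini=$(mktemp)
trap 'rm -f "$ini"' EXIT
printf '%s\n' '[mysqld]' 'port=3306' '[mysql]' '# client charset' \
    'default-character-set=utf8' '[client]' 'port=3306' > "$ini"

# print_section FILE NAME: print section [NAME], header included.
# Hypothetical helper; exits with status 1 if FILE cannot be read.
print_section() {
    awk -v file="$1" -v want="[$2]" '
    BEGIN {
        # getline returns >0 for a record, 0 at EOF, <0 on a read error
        while ((rc = (getline line < file)) > 0) {
            if (line ~ /^\[.*\]$/) inside = (line == want)
            if (inside) print line
        }
        if (rc < 0) { print "cannot read " file > "/dev/stderr"; exit 1 }
    }'
}

section=$(print_section "$ini" mysql)
echo "$section"
```

Here the loop ends with rc equal to 0 at EOF, while a missing file makes the first getline return a negative value, which the helper turns into a non-zero exit status.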
5. Deduplicating by a field
- Remove the lines whose uid=xxx value has already appeared
test.txt
2019-01-13_12:00_index?uid=123
2019-01-13_13:00_index?uid=123
2019-01-13_14:00_index?uid=333
2019-01-13_15:00_index?uid=9710
2019-01-14_12:00_index?uid=123
2019-01-14_13:00_index?uid=123
2019-01-15_15:00_index?uid=333
2019-01-16_15:00_index?uid=9710
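- Deduplicate on uid=xxx
- -F "?" sets the field separator to "?", so uid=xxx becomes $2
- Count how often each value occurs with an associative (hash) array: arr[$2]=arr[$2]+1
- Print only the first occurrence: if (arr[$2]==1) {print}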
[Mb18@login TESTA01]$ awk -F "?" '{arr[$2]=arr[$2]+1; if (arr[$2]==1) {print}}' test.txt
2019-01-13_12:00_index?uid=123
2019-01-13_14:00_index?uid=333
2019-01-13_15:00_index?uid=9710
- Count with the increment operator instead
[Mb18@login TESTA01]$ awk -F "?" '{arr[$2]++; if (arr[$2]==1) {print}}' test.txt
2019-01-13_12:00_index?uid=123
2019-01-13_14:00_index?uid=333
2019-01-13_15:00_index?uid=9710
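- arr[$2]++ returns the old value first and then increments, so on the first occurrence it returns 0 (false) before adding 1, and !arr[$2]++ is true only for that first occurrence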
[Mb18@login TESTA01]$ awk -F "?" '!arr[$2]++ {print}' test.txt
2019-01-13_12:00_index?uid=123
2019-01-13_14:00_index?uid=333
2019-01-13_15:00_index?uid=9710
- The print action can be omitted entirely, because printing $0 is the default action of a pattern
[Mb18@login TESTA01]$ awk -F "?" '!arr[$2]++' test.txt
2019-01-13_12:00_index?uid=123
2019-01-13_14:00_index?uid=333
2019-01-13_15:00_index?uid=9710
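To make the post-increment behaviour visible, the sketch below prints the value returned by the counter on every line of a short made-up log, then applies the same pattern to keep only first occurrences. The array name seen and the sample lines are assumptions of this example.

```shell
# Made-up log in the same "prefix?uid=N" shape as test.txt.
log='a?uid=1
b?uid=1
c?uid=2
d?uid=1'

# Value returned by seen[$2]++ on each line: 0 only on the first occurrence.
trace=$(printf '%s\n' "$log" | awk -F '?' '{print $2, seen[$2]++}')

# !0 is true, so only the first line of each uid passes the pattern.
first=$(printf '%s\n' "$log" | awk -F '?' '!seen[$2]++')

echo "$trace"
echo "$first"
```

The trace reads uid=1 0, uid=1 1, uid=2 0, uid=1 2, and the filtered result keeps the lines a?uid=1 and c?uid=2.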
6. Counting occurrences with AWK
test.txt
portmapper
portmapper
portmapper
portmapper
portmapper
portmapper
status
status
mountd
mountd
mountd
mountd
mountd
mountd
nfs
nfs
nfs_acl
nfs
nfs
nfs_acl
nlockmgr
nlockmgr
nlockmgr
nlockmgr
nlockmgr
- Word frequency count
[Mb18@login TESTA01]$ awk '{arr[$0]++} END {for (i in arr) {print i, arr[i]}}' test.txt
nfs 4
status 2
nlockmgr 5
portmapper 6
nfs_acl 2
mountd 6
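The order in which for (i in arr) visits the keys is unspecified, which is why the output above does not follow the order of the input. One way to get a stable report, sketched here with made-up input, is to print the count first and pipe the result through sort.

```shell
# Count each word, then sort by count (descending) and by word to break ties.
counts=$(printf '%s\n' nfs status nfs mountd nfs status |
    awk '{count[$0]++} END {for (w in count) print count[w], w}' |
    sort -k1,1nr -k2,2)

echo "$counts"
```

For this input the report lists 3 nfs, 2 status, and 1 mountd, in that order.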
7. Counting TCP connection states
netstat -tnap
[Mb18@login TESTA01]$ netstat -tnap
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:6059 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6027 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6060 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6028 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6061 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6029 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6062 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6030 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:48750 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:46830 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6063 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6031 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:6032 0.0.0.0:* LISTEN -
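- Count how often each state ($6) occurs and print the totals; the two header lines are counted too, which is where "established)" and "Foreign" come from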
[Mb18@login TESTA01]$ netstat -tnap | awk '{arr[$6]++} END {for (i in arr) {print arr[i], i}}'
153 LISTEN
134 ESTABLISHED
1 established)
1 Foreign
14 TIME_WAIT
- Count only the lines that start with tcp, which skips the header lines
[Mb18@login TESTA01]$ netstat -tnap | awk '/^tcp/ {arr[$6]++} END {for (i in arr) {print arr[i], i}}'
153 LISTEN
134 ESTABLISHED
42 TIME_WAIT
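The effect of the /^tcp/ filter can be reproduced without netstat. The sketch below uses a made-up sample that includes the two header lines, and sorts the report by count; the addresses and counts are invented for illustration.

```shell
# Made-up netstat-style sample, including the two header lines.
sample='Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:6059          0.0.0.0:*               LISTEN      -
tcp        0      0 10.0.0.5:22             10.0.0.9:51234          ESTABLISHED -
tcp        0      0 10.0.0.5:22             10.0.0.9:51240          ESTABLISHED -
tcp        0      0 10.0.0.5:22             10.0.0.9:51250          ESTABLISHED -
tcp6       0      0 :::111                  :::*                    LISTEN      -
tcp        0      0 10.0.0.5:80             10.0.0.7:40000          TIME_WAIT   -'

# Unfiltered: field 6 of the header lines is counted as if it were a state.
all=$(printf '%s\n' "$sample" | awk '{n[$6]++} END {for (s in n) print n[s], s}')

# Filtered: only lines starting with tcp (tcp6 included), largest count first.
tcp_only=$(printf '%s\n' "$sample" |
    awk '/^tcp/ {n[$6]++} END {for (s in n) print n[s], s}' | sort -rn)

echo "$tcp_only"
```

The filtered report lists 3 ESTABLISHED, 2 LISTEN, and 1 TIME_WAIT, while the unfiltered one also contains 1 established) and 1 Foreign.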