Analyzing nginx logs with shell

2020-02-27  水平号

192.168.40.1 - - [26/Feb/2020:20:10:22 +0800] "GET /Public/Front/js/plugins/layui/lay/modules/element.js HTTP/1.1" 200 7011 "http://192.168.40.136/index.html" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36" "-"

1. Count the page views (PV) for 26/Feb/2020

grep '26/Feb/2020' /var/log/nginx/access.log |wc -l
Alternative:
awk '/26\/Feb\/2020/' /var/log/nginx/access.log |wc -l

2. Count the PV within a time range

awk '/26\/Feb\/2020:20:10:22/,/26\/Feb\/2020:20:10:31/' /var/log/nginx/access.log |wc -l
Alternative:
sed -n '/26\/Feb\/2020:20:10:22/,/26\/Feb\/2020:20:10:31/p' /var/log/nginx/access.log |wc -l
A third way:
awk '$4>="[26/Feb/2020:20:10:22" && $4<="[26/Feb/2020:20:10:31"' /var/log/nginx/access.log |wc -l
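The range-pattern and field-comparison variants are not interchangeable. A range pattern only starts selecting at a line that actually matches the first timestamp, while the comparison on $4 is a plain string comparison and does not need an exact match. The sketch below illustrates this on an invented four-line log (the temporary file and its contents are assumptions made for the example, not real data):

```shell
# Hypothetical four-line sample log; the timestamps are invented.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.40.1 - - [26/Feb/2020:20:10:21 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.1 - - [26/Feb/2020:20:10:23 +0800] "GET /b HTTP/1.1" 200 20
192.168.40.2 - - [26/Feb/2020:20:10:30 +0800] "GET /a HTTP/1.1" 404 30
192.168.40.2 - - [26/Feb/2020:20:10:35 +0800] "GET /c HTTP/1.1" 200 40
EOF

# Range pattern: selection only starts at a line matching the first regex.
# 20:10:22 never occurs in the sample, so no line is counted.
range_count=$(awk '/26\/Feb\/2020:20:10:22/,/26\/Feb\/2020:20:10:31/{n++} END{print n+0}' "$log")

# String comparison on $4: the bounds do not have to occur in the log.
cmp_count=$(awk '$4>="[26/Feb/2020:20:10:22" && $4<="[26/Feb/2020:20:10:31"{n++} END{print n+0}' "$log")

echo "range pattern: $range_count, field comparison: $cmp_count"
rm -f "$log"
```

Because $4 is compared as text, the comparison is only dependable when both bounds fall on the same day.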

3. Find the 10 IPs with the most requests on 26/Feb/2020 (top 10)

grep '26/Feb/2020' /var/log/nginx/access.log |awk '{ips[$1]++}END{for(i in ips){print i,ips[i]}}' |sort -k2 -rn |head -n 10
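The same ranking can also be produced without an awk array, using the common sort | uniq -c idiom. This is a minimal sketch on an invented sample log (the temporary file and the variable names top_ips, top_ip and top_hits are assumptions for illustration). Note that uniq -c prints the count first and the IP second, the reverse of the awk version above.

```shell
# Hypothetical sample log; three requests on 26/Feb and one on 27/Feb.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.40.1 - - [26/Feb/2020:20:10:21 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.2 - - [26/Feb/2020:20:10:22 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.1 - - [26/Feb/2020:20:10:23 +0800] "GET /b HTTP/1.1" 200 20
192.168.40.1 - - [27/Feb/2020:08:50:03 +0800] "GET /b HTTP/1.1" 200 20
EOF

# Extract the IP, group identical IPs with sort, count them with uniq -c,
# then rank by count in descending numeric order.
top_ips=$(grep '26/Feb/2020' "$log" | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 10)
echo "$top_ips"

# The first row is the busiest IP; awk also strips uniq's leading padding.
top_ip=$(echo "$top_ips" | awk 'NR==1{print $2}')
top_hits=$(echo "$top_ips" | awk 'NR==1{print $1}')
rm -f "$log"
```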

4. List the IPs with more than 100 requests on 26/Feb/2020

awk '/26\/Feb\/2020/{ips[$1]++}END{for(i in ips){print i,ips[i]}}' /var/log/nginx/access.log |awk '{if($2>100){print $1}}'
Alternative:
awk '/26\/Feb\/2020/{ips[$1]++}END{for(i in ips){if(ips[i]>100){print i}}}' /var/log/nginx/access.log

5. Find the 10 most requested pages on 26/Feb/2020 ($request)

awk '/26\/Feb\/2020/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log |sort -k2rn |head -n 10

To keep only pages requested more than twice:
awk '/26\/Feb\/2020/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log |awk '$2>2'|sort -k2rn |head -n 10

6. Total bytes sent per URL ($body_bytes_sent)

awk '/26\/Feb\/2020/{size[$7]+=$10}END{for(i in size){print i,size[i]}}' /var/log/nginx/access.log |sort -k2rn |head

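size[$7]+=$10 relies on awk's associative arrays: the field you want to group by (here $7, the URL) becomes the array index, and the value to total ($10, the bytes sent) is added to that element.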
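Because the grouping field is just an array index, several arrays can share it and collect different totals in one pass. The following sketch (invented sample data; the names hits, size and per_url are assumptions for illustration) gathers both the request count and the byte total per URL:

```shell
# Hypothetical sample log: /a is requested twice (10 + 30 bytes), /b once.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.40.1 - - [26/Feb/2020:20:10:21 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.1 - - [26/Feb/2020:20:10:23 +0800] "GET /b HTTP/1.1" 200 20
192.168.40.2 - - [26/Feb/2020:20:10:30 +0800] "GET /a HTTP/1.1" 404 30
EOF

# Both arrays are indexed by the URL ($7): hits counts requests,
# size sums the bytes sent ($10). Output columns: URL, requests, bytes.
per_url=$(awk '{hits[$7]++; size[$7]+=$10}
               END{for(u in hits) print u, hits[u], size[u]}' "$log" |
          sort -k3,3nr)
echo "$per_url"
rm -f "$log"
```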

7. Count status codes per IP ($status)

 awk '/26\/Feb\/2020/{ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log

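awk '/26\/Feb\/2020/{ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log |sort -t' ' -k1,1 -k3,3n
-t' ' uses a space as the column separator. -k1,1 sorts on the first column (the IP, compared as text so that identical IPs stay together), and -k3,3n then sorts on the third column (the count) numerically, in ascending order.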

{ip_code[$1" "$9]++} treats the IP and the status code as a single key, so lines with the same IP and the same status code are counted together.
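Joining the two fields with " " works because neither field contains a space. awk also has built-in multi-dimensional subscripts that serve the same purpose: ip_code[$1,$9] joins the parts with the SUBSEP character, and split() recovers them. A minimal sketch on invented data (the temporary file and the variable name by_ip_code are assumptions for illustration):

```shell
# Hypothetical sample log: 192.168.40.1 gets 200 twice and 404 once,
# 192.168.40.2 gets 404 once.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.40.1 - - [26/Feb/2020:20:10:21 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.1 - - [26/Feb/2020:20:10:22 +0800] "GET /b HTTP/1.1" 200 20
192.168.40.1 - - [26/Feb/2020:20:10:23 +0800] "GET /x HTTP/1.1" 404 30
192.168.40.2 - - [26/Feb/2020:20:10:24 +0800] "GET /x HTTP/1.1" 404 30
EOF

# ip_code[$1,$9] builds the key as $1 SUBSEP $9; split() takes it apart
# again in the END block. Output columns: IP, status code, count.
by_ip_code=$(awk '{ip_code[$1,$9]++}
                  END{for(k in ip_code){split(k, part, SUBSEP)
                                        print part[1], part[2], ip_code[k]}}' "$log" |
             sort -t' ' -k1,1 -k2,2n)
echo "$by_ip_code"
rm -f "$log"
```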

8. Count the 404 responses per IP ($status)

 awk '/26\/Feb\/2020/{if($9==404){ip_code[$1" "$9]++}}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log

This adds the condition if($9==404).

9. Count the PV for the previous minute

dates=$(LC_ALL=C date -d '1 minute ago' +%d/%b/%Y:%H:%M); awk -v date="$dates" '$0 ~ date {i++} END{print i+0}' /var/log/nginx/access.log

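The -v option passes an external (shell) variable into awk.

LC_ALL=C date -d '1 minute ago' +%d/%b/%Y:%H:%M
This prints the time one minute ago in the same format as the log timestamp. -d with a relative time is a GNU date feature, and %b depends on the locale, so LC_ALL=C keeps the month abbreviation in English (Feb).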
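The date call can be wrapped in a small helper so that the pattern is easy to test with a fixed point in time. This is a sketch only: prev_minute_pattern is a hypothetical function name, and it assumes GNU date (the -d @SECONDS form and +%s).

```shell
# prev_minute_pattern [EPOCH]
# Hypothetical helper: prints the minute before EPOCH (default: now) as
# dd/Mon/yyyy:HH:MM in the local time zone. Assumes GNU date.
prev_minute_pattern() {
    now=${1:-$(date +%s)}
    # LC_ALL=C forces English month abbreviations such as "Feb".
    LC_ALL=C date -d "@$((now - 60))" +%d/%b/%Y:%H:%M
}

# Fixed instant so the result is reproducible: 1582793403 is
# 2020-02-27 08:50:03 UTC, so with TZ=UTC the previous minute is 08:49.
fixed=$(TZ=UTC; export TZ; prev_minute_pattern 1582793403)
echo "$fixed"
```

With the helper in place, the count for the previous minute becomes: awk -v date="$(prev_minute_pattern)" '$0 ~ date {i++} END{print i+0}' /var/log/nginx/access.log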

10. Count the requests per status code

sed -n '/27\/Feb\/2020:08:50:03/,/27\/Feb\/2020:08:50:04/p' /var/log/nginx/access.log |awk '{codes[$9]++}END{for(i in codes){print i,codes[i]}}'

11. Count the 404 responses per IP within a time range

awk '/27\/Feb\/2020:08:50:03/,/27\/Feb\/2020:08:50:04/' /var/log/nginx/access.log |awk '{if($9==404)ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}'

12. Percentage of each status code

awk '/27\/Feb\/2020:08:50:03/{codes[$9]++;total++}END{for(i in codes){printf "%s %d %.2f%%\n", i, codes[i], codes[i]/total*100}}' /var/log/nginx/access.log

codes[$9]++ counts the requests for each status code
total++ counts all matching lines
codes[i]/total*100 is the percentage
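As a quick check of the arithmetic, here is a sketch on an invented four-line log with three 200 responses and one 404 (the temporary file and the variable name pct are assumptions for illustration). In a printf format a literal percent sign has to be written as %%.

```shell
# Hypothetical sample log: three 200 responses and one 404.
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.40.1 - - [27/Feb/2020:08:50:03 +0800] "GET /a HTTP/1.1" 200 10
192.168.40.1 - - [27/Feb/2020:08:50:03 +0800] "GET /b HTTP/1.1" 200 20
192.168.40.2 - - [27/Feb/2020:08:50:03 +0800] "GET /x HTTP/1.1" 404 30
192.168.40.2 - - [27/Feb/2020:08:50:03 +0800] "GET /c HTTP/1.1" 200 40
EOF

# Per-code count divided by the total number of lines, times 100.
# Output columns: status code, count, percentage.
pct=$(awk '{codes[$9]++; total++}
           END{for(c in codes) printf "%s %d %.2f%%\n", c, codes[c], codes[c]/total*100}' "$log" |
      sort -k1,1n)
echo "$pct"
rm -f "$log"
```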
