Analyzing nginx logs with shell
2020-02-27
水平号
192.168.40.1 - - [26/Feb/2020:20:10:22 +0800] "GET /Public/Front/js/plugins/layui/lay/modules/element.js HTTP/1.1" 200 7011 "http://192.168.40.136/index.html" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36" "-"
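The commands in this article address the log line by whitespace-separated awk fields. As a rough orientation, assuming the log format of the sample line above, the sketch below prints the fields that are used later. `show_fields` is a hypothetical helper name, not part of nginx or awk.

```shell
#!/usr/bin/env bash
# show_fields: hypothetical helper. For each log line on stdin, prints the
# awk fields this article relies on (assumes the sample line's log format).
show_fields() {
  awk '{
    print "ip($1)=" $1        # client address
    print "time($4)=" $4      # timestamp, still carrying the leading "["
    print "url($7)=" $7       # requested path
    print "status($9)=" $9    # HTTP status code
    print "bytes($10)=" $10   # response body size
  }'
}

# Usage: head -n 1 /var/log/nginx/access.log | show_fields
```

If the `log_format` directive on your server differs from the sample, the field numbers shift accordingly.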
1. Count the page views (PV) on 26/Feb/2020
grep '26/Feb/2020' /var/log/nginx/access.log |wc -l
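Alternative (counting inside awk, since a bare pattern would only print the matching lines):
awk '/26\/Feb\/2020/{n++}END{print n+0}' /var/log/nginx/access.log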
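2. Count the page views within a time range
awk '/26\/Feb\/2020:20:10:22/,/26\/Feb\/2020:20:10:31/' /var/log/nginx/access.log |wc -l
Alternative:
sed -n '/26\/Feb\/2020:20:10:22/,/26\/Feb\/2020:20:10:31/p' /var/log/nginx/access.log |wc -l
In both forms the range starts at the first line matching the start pattern and ends at the first line matching the end pattern, so both timestamps must actually occur in the log.
A third way, comparing the timestamp field as a string (valid only while both bounds fall on the same day):
awk '$4>="[26/Feb/2020:20:10:22" && $4<="[26/Feb/2020:20:10:31"' /var/log/nginx/access.log |wc -l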
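The string comparison on `$4` breaks once the range crosses midnight, because the timestamp begins with the day of the month followed by a month name. One possible workaround, sketched here under the assumption that timestamps look like the sample line, converts each timestamp into a sortable `yyyymmddHHMMSS` key first. `count_between` and its argument order are made up for illustration.

```shell
#!/usr/bin/env bash
# count_between: hypothetical helper. Counts lines whose timestamp ($4) lies
# between two bounds given as dd/Mon/yyyy:HH:MM:SS (inclusive).
count_between() {
  local from=$1 to=$2 file=$3
  awk -v from="$from" -v to="$to" '
    # Turn dd/Mon/yyyy:HH:MM:SS into the sortable string yyyymmddHHMMSS.
    function key(ts,    p, m) {
      split(ts, p, "[/:]")
      m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", p[2]) + 2) / 3
      return sprintf("%04d%02d%02d%02d%02d%02d", p[3], m, p[1], p[4], p[5], p[6])
    }
    BEGIN { lo = key(from); hi = key(to) }
    {
      k = key(substr($4, 2))          # drop the leading "["
      if (k >= lo && k <= hi) n++
    }
    END { print n + 0 }
  ' "$file"
}

# Usage:
# count_between 26/Feb/2020:23:00:00 27/Feb/2020:01:00:00 /var/log/nginx/access.log
```

The sketch ignores the time zone offset in `$5`, so it assumes every line in the file was logged with the same offset.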
3. Find the 10 IPs with the most requests on 26/Feb/2020 (top 10)
grep '26/Feb/2020' /var/log/nginx/access.log |awk '{ips[$1]++}END{for(i in ips){print i,ips[i]}}' |sort -k2 -rn |head -n 10
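The same ranking can also be produced without an awk array, using the common `sort | uniq -c` idiom. The sketch below wraps it in a function so the day and the number of rows become parameters. `top_ips` is a hypothetical name, and the output puts the count first, unlike the awk version above.

```shell
#!/usr/bin/env bash
# top_ips: hypothetical helper. Prints "count ip" for the N busiest client
# IPs on a given day, using sort | uniq -c instead of an awk array.
top_ips() {
  local day=$1 n=$2 file=$3
  grep -F "$day" "$file" |   # -F: treat the date as a literal string
    awk '{print $1}' |       # keep only the client IP
    sort | uniq -c |         # count identical IPs
    sort -rn | head -n "$n"  # largest counts first
}

# Usage: top_ips 26/Feb/2020 10 /var/log/nginx/access.log
```

`uniq -c` pads the count with leading spaces, so strip them if the output feeds another tool.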
4. List the IPs with more than 100 requests on 26/Feb/2020
awk '/26\/Feb\/2020/{ips[$1]++}END{for(i in ips){print i,ips[i]}}' /var/log/nginx/access.log |awk '{if($2>100){print $1}}'
Alternative (filtering inside the END block instead of a second awk):
awk '/26\/Feb\/2020/{ips[$1]++}END{for(i in ips){if(ips[i]>100){print i}}}' /var/log/nginx/access.log
5. Find the 10 most requested pages on 26/Feb/2020 (the URI part of $request)
awk '/26\/Feb\/2020/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log |sort -k2rn |head -n 10
The same, keeping only pages requested more than twice:
awk '/26\/Feb\/2020/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log |awk '$2>2'|sort -k2rn |head -n 10
6. Sum the bytes sent for each URL ($body_bytes_sent)
awk '/26\/Feb\/2020/{size[$7]+=$10}END{for(i in size){print i,size[i]}}' /var/log/nginx/access.log |sort -k2rn |head
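size[$7]+=$10
This relies on awk's associative arrays: the field you want to group by becomes the **array index**, and the value accumulates the total for that index.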
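To see the grouping in isolation, here is a tiny worked example on made-up "key value" pairs rather than a real log. `sum_by_key` is a hypothetical helper that follows the same pattern as `size[$7]+=$10`.

```shell
#!/usr/bin/env bash
# sum_by_key: hypothetical helper. Reads "key value" pairs from stdin and
# prints the total per key. The key is used as the array index.
sum_by_key() {
  awk '{ total[$1] += $2 } END { for (k in total) print k, total[k] }' | sort
}

# Three made-up records; two of them share the key /a.js.
printf '%s\n' '/a.js 100' '/b.css 50' '/a.js 25' | sum_by_key
# prints:
# /a.js 125
# /b.css 50
```

The trailing `sort` is there because awk does not guarantee any order for `for (k in total)`.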
7. Count the status codes returned to each IP ($status)
awk '/26\/Feb\/2020/{ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log
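awk '/26\/Feb\/2020/{ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log |sort -t' ' -k1,1 -k3,3n
-t' ' sets a space as the column delimiter; -k1,1 sorts by column 1 (the IP) first, and -k3,3n then sorts by column 3 (the count) numerically in ascending order.
The IP and the status code are treated as one combined key, so lines with the same IP and status code are counted together: {ip_code[$1" "$9]++}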
8. Count the 404 responses for each IP ($status)
awk '/26\/Feb\/2020/{if($9==404){ip_code[$1" "$9]++}}END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log
The only change is the added condition if($9==404).
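9. Count the page views in the previous minute
dates=$(LC_ALL=C date -d '1 minute ago' +%d/%b/%Y:%H:%M);awk -v date="$dates" '$0 ~ date {i++}END{print i+0}' /var/log/nginx/access.log
The -v option passes an external (shell) variable into awk, and print i+0 prints 0 instead of an empty line when nothing matches.
date -d '1 minute ago' +%d/%b/%Y:%H:%M
This prints the previous minute in the same format as the log timestamp. -d is a GNU date option, and %b follows the current locale, hence LC_ALL=C to get English month names.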
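Because the minute prefix is known exactly, it can also be matched as a literal string against the timestamp field instead of as a regular expression against the whole line. The following is a minimal sketch of that variant. `count_minute` is a hypothetical helper, and it assumes `$4` holds the timestamp with a leading `[` as in the sample line.

```shell
#!/usr/bin/env bash
# count_minute: hypothetical helper. Counts lines whose timestamp field
# starts with the given dd/Mon/yyyy:HH:MM prefix, matched literally.
count_minute() {
  local minute=$1 file=$2
  # index() returns 2 when the prefix starts right after the leading "[".
  awk -v m="$minute" 'index($4, m) == 2 { n++ } END { print n + 0 }' "$file"
}

# Usage (GNU date assumed; LC_ALL=C keeps %b as an English month name):
# count_minute "$(LC_ALL=C date -d '1 minute ago' +%d/%b/%Y:%H:%M)" /var/log/nginx/access.log
```

Matching on `$4` only also avoids counting lines where the same text happens to appear in the URL or referer.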
10. Count the occurrences of each status code (within a time range)
sed -n '/27\/Feb\/2020:08:50:03/,/27\/Feb\/2020:08:50:04/p' /var/log/nginx/access.log |awk '{codes[$9]++}END{for(i in codes){print i,codes[i]}}'
11. Count the 404 responses per IP within a time range
awk '/27\/Feb\/2020:08:50:03/,/27\/Feb\/2020:08:50:04/' /var/log/nginx/access.log |awk '{if($9==404)ip_code[$1" "$9]++}END{for(i in ip_code){print i,ip_code[i]}}'
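12. Percentage of each status code
awk '/27\/Feb\/2020:08:50:03/{codes[$9]++;total++}END{for(i in codes){printf "%s %d %.2f%%\n", i, codes[i], codes[i]/total*100}}' /var/log/nginx/access.log
codes[$9]++ counts the matching lines per status code.
total++ counts all matching lines.
codes[i]/total*100 is then the percentage for each code. A literal percent sign in a printf format must be written as %%.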
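As a quick check of the arithmetic, the sketch below runs the same calculation on log lines read from stdin, so it can be tried on a few hand-written lines first. `status_share` is a hypothetical helper name, and the output is sorted by status code because awk's `for (c in codes)` order is unspecified.

```shell
#!/usr/bin/env bash
# status_share: hypothetical helper. Reads log lines from stdin and prints
# "status count percent%" for each status code, sorted by status code.
status_share() {
  awk '{ codes[$9]++; total++ }
       END {
         for (c in codes)
           printf "%s %d %.2f%%\n", c, codes[c], codes[c] / total * 100
       }' | sort
}

# Usage: grep '27/Feb/2020:08:50' /var/log/nginx/access.log | status_share
```

With three 200 responses and one 404 as input, the shares come out as 75.00% and 25.00%.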