The Three Musketeers (grep, sed, awk)
2021-07-01
天生顽皮
Part 2.
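2.1 Regular expression overview
1. Goals:
* Make it easier to process text and character content
* Make it easier to process content that follows a pattern
* Make it easier to process characters with the Three Musketeers (grep, sed, awk) and with high-level languages
2. Use cases:
Use special symbols such as "^ $ . * .* () [] [^] | + ..." to express or match content that follows a pattern
3. Examples
Match mobile phone numbers
Match ID card numbers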
2.2 Regex flavors
re (regular expression)
Basic regular expressions: BRE
Extended regular expressions: ERE
1. BRE symbols: ^ $ . * .* ^$ [] [^]
2. ERE symbols: + | () {} ?
2.3 Differences
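In practice the difference is whether + | () {} ? are special as written. A minimal sketch, assuming GNU grep; the sample strings are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample input, for illustration only.
sample=$'job\njoob\njb'

# BRE (plain grep): + is an ordinary character, so 'jo+b' looks for a literal "jo+b".
bre_plain=$(printf '%s\n' "$sample" | grep -c 'jo+b' || true)

# BRE with a backslash: GNU grep treats \+ as "one or more" (a GNU extension).
bre_escaped=$(printf '%s\n' "$sample" | grep -c 'jo\+b' || true)

# ERE (grep -E, the current spelling of egrep): + is special without a backslash.
ere=$(printf '%s\n' "$sample" | grep -Ec 'jo+b' || true)

echo "BRE jo+b: $bre_plain, BRE jo\\+b: $bre_escaped, ERE jo+b: $ere"
```

The `|| true` only keeps the script going when grep finds nothing and exits with status 1.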
2.4 Common regex misconceptions
(1) Regex vs. wildcards
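The same character can mean different things in the two systems. A small sketch of how * behaves as a wildcard and as a regex, using a throwaway directory and made-up file names:

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/ab.txt" "$tmpdir/b.log"

# Wildcard: the shell expands *.txt against file names; * means "any string".
glob_count=$(cd "$tmpdir" && set -- *.txt && echo "$#")

# Regex: grep matches text; * means "the previous character, zero or more times".
# '^ab*$' therefore matches a, ab, abb but not b.
regex_count=$(printf 'a\nab\nabb\nb\n' | grep -c '^ab*$')

echo "files matched by the wildcard: $glob_count"
echo "lines matched by the regex: $regex_count"
rm -rf "$tmpdir"
```

Wildcards are expanded by the shell and apply to file names; regular expressions are interpreted by grep, sed and awk and apply to file content.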
(2) Quick wildcard review
# Match file names
## * everything
ls *.txt
find / -type f -name "*.avi"
## {} generates sequences
### Regular (ranges)
echo {a..z} {A..Z}
echo {0..10}
echo {01..100}
### Irregular (comma-separated lists)
[root@m01 ~]# echo {a,b,z}
a b z
[root@m01 ~]# cp oldboy.txt{,.bak}
[root@m01 ~]# ll oldboy.txt*
-rwxrwxrwx. 1 root root 23 Apr 25 16:41 oldboy.txt
-rwxr-xr-x 1 root root 23 Jul 1 09:11 oldboy.txt.bak
[root@m01 ~]# echo cp oldboy.txt{,.bak}
cp oldboy.txt oldboy.txt.bak
[root@m01 ~]#
[root@m01 ~]# echo cp A{,.bak}
cp A A.bak
[root@m01 ~]# echo cp A{,C}
cp A AC
[root@m01 ~]#
### Advanced (ranges with a step)
[root@m01 ~]# echo {1..10..2}
1 3 5 7 9
[root@m01 ~]# echo {2..10..2}
2 4 6 8 10
[root@m01 ~]# echo {a..z..2}
a c e g i k m o q s u w y
[root@m01 ~]#
[root@m01 ~]#
[root@m01 ~]# seq 1 2 10
1
3
5
7
9
### Learn more
man bash
2.5 Basic regular expressions (BRE)
1) ^ $
Sample file oldboy.txt:
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
2) ^$ matches empty lines
[root@m01 /server/files]# grep -v '^$' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
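Case: exclude the empty lines and comment lines in /etc/ssh/sshd_config
grep -v '^$' /etc/ssh/sshd_config |grep -v '#'
egrep -v '^$|#' /etc/ssh/sshd_config
sed -r '/^$|#/d' /etc/ssh/sshd_config
awk '!/^$|#/' /etc/ssh/sshd_config
Other approaches:
grep '^[a-zA-Z]' /etc/ssh/sshd_config
# [a-Z] depends on the locale's collation order and is rejected in the C locale; [a-zA-Z] is the portable form
grep '^[a-Z]' /etc/ssh/sshd_config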
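Matching '#' anywhere on the line also drops settings that carry a trailing comment. A stricter sketch anchors the comment marker to the start of the line; the temporary file below is a hypothetical stand-in for sshd_config:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for /etc/ssh/sshd_config.
conf=$(mktemp)
printf '%s\n' '# comment' '' 'Port 22' '   # indented comment' 'PermitRootLogin no' > "$conf"

# Optional leading whitespace, then either # or the end of the line.
pattern='^[[:space:]]*(#|$)'

by_grep=$(grep -Ev "$pattern" "$conf")   # -v keeps the lines that do NOT match
by_sed=$(sed -E "/$pattern/d" "$conf")   # d deletes the matching lines
# awk: NF is 0 on blank lines; the first field starts with # on comment lines.
by_awk=$(awk 'NF && $1 !~ /^#/' "$conf")

echo "$by_grep"
rm -f "$conf"
```

All three variants should leave only the `Port` and `PermitRootLogin` lines.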
3) .
4) * the preceding character repeated zero or more times
[root@m01 /server/files]# grep '0*' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]#
[root@m01 /server/files]#
[root@m01 /server/files]#
[root@m01 /server/files]# #grep '0*' oldboy.txt
[root@m01 /server/files]# # one or more consecutive occurrences: 0 0000 0000000
[root@m01 /server/files]# # zero occurrences: '' (the empty string)
[root@m01 /server/files]# grep '' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
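Because "zero times" is allowed, '0*' also matches the empty string, which is why every line is printed. A short sketch with three sample lines shows the difference once at least one 0 is required:

```shell
#!/usr/bin/env bash
sample=$'my qq is 49000448\nI teach linux.\nnot 4900000448.'

# '0*' can match nothing at all, so every line counts as a match.
star_lines=$(printf '%s\n' "$sample" | grep -c '0*')

# '00*' needs one literal 0 first, so only lines that really contain a 0 match.
real_lines=$(printf '%s\n' "$sample" | grep -c '00*')

# -o prints only the non-empty matches, one per line.
runs=$(printf '%s\n' "$sample" | grep -o '00*' | tr '\n' ' ')

echo "lines matched by 0*: $star_lines"
echo "lines matched by 00*: $real_lines"
echo "runs of zeros: $runs"
```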
Challenge: exclude the empty lines and the lines that contain only spaces
[root@m01 /server/files]# echo -e 'oldboy\n\nlidao \n \n\nlidao\n lidao '>star.txt
[root@m01 /server/files]# cat star.txt
oldboy
lidao
lidao
lidao
[root@m01 /server/files]#
[root@m01 /server/files]# cat -A star.txt
oldboy$
$
lidao $
$
$
lidao$
lidao $
# Method 1
[root@m01 /server/files]# grep -v '^$' star.txt
oldboy
lidao
lidao
lidao
[root@m01 /server/files]# grep -v '^$' star.txt |grep -v '^ *$'
oldboy
lidao
lidao
lidao
# Method 2
[root@m01 /server/files]# grep -v '^ *$' star.txt
oldboy
lidao
lidao
lidao
[root@m01 /server/files]# # ^ *$ with * matching one or more spaces: '^ $' '^  $' '^   $'
[root@m01 /server/files]# # ^ *$ with * matching zero spaces: ^$
[root@m01 /server/files]# grep -v '^ *$' star.txt
oldboy
lidao
lidao
lidao
# Method 3
[root@m01 /server/files]# egrep -v '^$|^ +$' star.txt
oldboy
lidao
lidao
lidao
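The three methods only look for spaces. If a file may also contain tab-only lines, a bracket expression with [[:space:]] is one way to cover both; the tab line in this sample is a hypothetical addition to the star.txt idea:

```shell
#!/usr/bin/env bash
# Six lines: text, empty, text with trailing space, spaces only, tab only, text.
sample=$'oldboy\n\nlidao \n   \n\t\nlidao'

# '^ *$' only knows about spaces, so the tab-only line survives.
spaces_only=$(printf '%s\n' "$sample" | grep -vc '^ *$')

# [[:space:]] covers spaces and tabs.
any_blank=$(printf '%s\n' "$sample" | grep -vc '^[[:space:]]*$')

echo "lines kept by '^ *\$': $spaces_only"
echo "lines kept by '^[[:space:]]*\$': $any_blank"
```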
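5) .* everything
# Regex greediness: .* matches as much as it can; the same holds for the repetition symbols * + {} ?
grep '^.*:' passwd
# To limit the match, add more conditions/content to the pattern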
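A quick sketch of greediness and one way to limit it, using a typical root entry in /etc/passwd format as sample data:

```shell
#!/usr/bin/env bash
# One passwd-style line, used here as sample data.
line='root:x:0:0:root:/root:/bin/bash'

# Greedy: .* runs on to the LAST colon on the line.
greedy=$(printf '%s\n' "$line" | grep -o '^.*:')

# Limited: [^:]* cannot cross a colon, so the match stops at the FIRST one.
limited=$(printf '%s\n' "$line" | grep -o '^[^:]*:')

echo "$greedy"
echo "$limited"
```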
Expressing AND with a regular expression
# Match the lines in oldboy.txt that start with the letter m and end with "m " (m followed by a space)
[root@m01 /server/files]# grep '^m.*m $' oldboy.txt
my blog is http://oldboy.blog.51cto.com
[root@m01 /server/files]#
[root@m01 /server/files]#
[root@m01 /server/files]# grep '^m' oldboy.txt
my blog is http://oldboy.blog.51cto.com
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]# grep '^m' oldboy.txt |grep 'm $'
my blog is http://oldboy.blog.51cto.com
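6) [] [abc] is a single unit that matches any one character: a, b or c
# Match digits
[0-9]
# Match letters (upper and lower case)
[a-z]
[A-Z]
[a-zA-Z]
# [a-Z] only works in some locales; prefer [a-zA-Z]
[a-Z]
# Basic usage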
[root@m01 /server/files]# grep '[abc]' oldboy.txt
[root@m01 /server/files]# grep -o '[abc]' oldboy.txt
a
b
a
c
a
c
b
a
[root@m01 /server/files]# grep -o '[abc][abc]' oldboy.txt
ac
ac
ba
ba
ba
[root@m01 /server/files]# grep '[abc][abc]' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess
Note: \ is the escape character; it strips a symbol of its special meaning and turns it back into a plain character
[root@m01 /server/files]# grep '\.$' oldboy.txt
I teach linux.
not 4900000448.
[root@m01 /server/files]# # \ is the escape character: it removes the special meaning of the next character
[root@m01 /server/files]#
[root@m01 /server/files]# grep '[.^$!!??]$' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
not 4900000448.
[root@m01 /server/files]#
[root@m01 /server/files]# vim oldboy.txt
[root@m01 /server/files]#
[root@m01 /server/files]# grep '[.^$!!??]$' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
not 4900000448.
^^^^$$$$$....****?????!!!!
$$..**?????!!!!
^^^^$$..**???!!!!
[root@m01 /server/files]# grep '[.^$!!??]' oldboy.txt
I am oldboy teacher!
I teach linux.
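Inside [] most special symbols are ordinary characters, so no backslash is needed. A small sketch on made-up lines compares the three spellings:

```shell
#!/usr/bin/env bash
sample=$'I teach linux.\nprice is $5\nno punctuation here\nwhat?'

# Outside brackets . matches any character, so every non-empty line matches '.$'.
dot_any=$(printf '%s\n' "$sample" | grep -c '.$')

# Escaped, the dot is literal: only the line ending in a real full stop matches.
dot_escaped=$(printf '%s\n' "$sample" | grep -c '\.$')

# Inside brackets . $ and ? are literal as well.
in_brackets=$(printf '%s\n' "$sample" | grep -c '[.$?]')

echo "$dot_any $dot_escaped $in_brackets"
```

One caveat: ^ keeps a special meaning (negation) when it is the first character inside the brackets.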
7) [^] [^abc] is a single unit that matches any one character other than a, b or c
Advanced: match the first column of /etc/passwd
grep '^[a-zA-Z0-9]*' /etc/passwd
useradd quede_user_name666
grep '^[a-zA-Z0-9]*' /etc/passwd
grep '^[a-zA-Z0-9_]*' /etc/passwd
useradd quede_user_name666-01
grep '^[a-zA-Z0-9_]*' /etc/passwd
grep '^[a-zA-Z0-9_-]*' /etc/passwd
useradd quede_user_name666-01...
grep '^[a-zA-Z0-9_-]*' /etc/passwd
# Take your time with this one: [^:]* means any run of characters that are not a colon
[root@m01 /server/files]# grep '^[^:]*' /etc/passwd
egrep '^[^:]+' /etc/passwd
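The progression above can be replayed on sample data. The two passwd-style lines below are hypothetical, and `cut` is added only as a cross-check:

```shell
#!/usr/bin/env bash
# Hypothetical passwd-style lines; the second name uses _, - and digits.
sample=$'root:x:0:0:root:/root:/bin/bash\nquede_user_name666-01:x:1001:1001::/home/q:/bin/bash'

# Listing the allowed characters breaks as soon as a name uses one that is not listed.
by_class=$(printf '%s\n' "$sample" | grep -o '^[a-zA-Z0-9]*')

# "Anything that is not a colon" works for every name.
by_negation=$(printf '%s\n' "$sample" | grep -Eo '^[^:]+')

# cut splits on the delimiter directly and should agree with the negated class.
by_cut=$(printf '%s\n' "$sample" | cut -d: -f1)

echo "$by_class"
echo "$by_negation"
```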
8) BRE summary
2.6 Extended regular expressions (ERE)
1) |
[root@m01 /server/files]# egrep 'oldboy|my' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]# grep -E 'oldboy|my' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]#
[root@m01 /server/files]# grep 'oldboy\|my' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]# egrep 'oldbo|ey' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]# egrep 'oldb(o|e)y' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my god ,i am not oldbey,but OLDBOY!
[root@m01 /server/files]# egrep 'oldb[oe]y' oldboy.txt
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my god ,i am not oldbey,but OLDBOY
[] vs. |
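A brief sketch of when the two are interchangeable and when they are not, on made-up words:

```shell
#!/usr/bin/env bash
sample=$'oldboy\noldbey\noldbay\noldgirl'

# Choice between single characters: [] and (|) give the same result.
bracket=$(printf '%s\n' "$sample" | grep -E 'oldb[oe]y')
alternation=$(printf '%s\n' "$sample" | grep -E 'oldb(o|e)y')

# Choice between multi-character strings: only | can express it.
words=$(printf '%s\n' "$sample" | grep -E 'old(boy|girl)')

echo "$bracket"
echo "$words"
```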
2) + the preceding character repeated one or more times
# Extract every word in the file, then count how often each one occurs
egrep -o '[a-zA-Z]+' oldboy.txt
egrep -o '[a-zA-Z]+' oldboy.txt |sort
egrep -o '[a-zA-Z]+' oldboy.txt |sort |uniq -c
egrep -o '[a-zA-Z]+' oldboy.txt |sort |uniq -c |sort -rn
egrep -o '[a-zA-Z]+' oldboy.txt |awk '{word[$1]++}END{for(n in word)print n,word[n]}'
# Extract every letter in the file, then count how often each one occurs
egrep -o '[a-zA-Z]' oldboy.txt
egrep -o '[a-zA-Z]' oldboy.txt |sort
egrep -o '[a-zA-Z]' oldboy.txt |sort |uniq -c
egrep -o '[a-zA-Z]' oldboy.txt |sort |uniq -c |sort -rn
egrep -o '[a-zA-Z]' oldboy.txt |awk '{word[$1]++}END{for(n in word)print n,word[n]}'
awk -vRS='[^a-zA-Z]+' -F '' '{for(i=1;i<=NF;i++)word[$i]++}END{for(n in word)print n,word[n]}' oldboy.txt
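The pipeline can be tried without the file. A sketch on one made-up sentence, where the `top` and `awk_count` variables exist only for this illustration:

```shell
#!/usr/bin/env bash
text='I am oldboy teacher! my blog is oldboy blog, my god, not oldbey but oldboy'

# grep -o puts each word on its own line, sort groups duplicates,
# uniq -c counts them, sort -rn puts the most frequent first.
top=$(printf '%s\n' "$text" | grep -Eo '[a-zA-Z]+' | sort | uniq -c | sort -rn \
      | head -n 1 | awk '{print $2, $1}')

# awk counts in a single pass and needs no pre-sorting.
awk_count=$(printf '%s\n' "$text" | grep -Eo '[a-zA-Z]+' \
      | awk '{n[$1]++} END {print n["oldboy"]}')

echo "most frequent word: $top"
echo "awk count for oldboy: $awk_count"
```

`sort` has to come before `uniq -c` because uniq only merges adjacent duplicate lines.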
3) {} a{n,m} the preceding character a repeated at least n times and at most m times
[root@m01 /server/files]# egrep '0{1,5}' oldboy.txt
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep -o '0{1,5}' oldboy.txt
000
00000
[root@m01 /server/files]# egrep '0{3,4}' oldboy.txt
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep -o '0{3,4}' oldboy.txt
000
0000
[root@m01 /server/files]# egrep '0{3}' oldboy.txt
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep '0{2}' oldboy.txt
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep '0{2}' oldboy.txt -o
00
00
00
id.txt
金 211324198705244720
万 500224197105168312
任 1231231231oldboy
任 3oldboy
任 lidao97303136098
任 alex2197303136098
任 350182197303oldgir
吕 211282199209113038
孔 150000198309176071
邹 371001197412221284
贺 130185200011215926
杜 362522198711278101
向 14052219961008852X
Extract the lines that contain a valid ID card number
[root@m01 /server/files]# egrep '[0-9]{17}[0-9X]' id.txt
金 211324198705244720
万 500224197105168312
# Regex workflow
## First find the pattern in the data
## Then express it as a regex
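Applying that workflow to the ID numbers: the pattern is 17 digits followed by a digit or X, and nothing else in that field. A sketch under that assumption; the one-letter labels and the 20-digit line are hypothetical test data, the numbers are taken from id.txt:

```shell
#!/usr/bin/env bash
sample=$'A 211324198705244720\nB 14052219961008852X\nC 1231231231oldboy\nD alex2197303136098\nE 12345678901234567890'

# Unanchored: any run of 18 matching characters is accepted,
# so the 20-digit number on line E slips through.
loose=$(printf '%s\n' "$sample" | grep -Ec '[0-9]{17}[0-9X]')

# Anchored between the separating space and the end of the line:
# exactly 18 characters are allowed.
strict=$(printf '%s\n' "$sample" | grep -Ec ' [0-9]{17}[0-9X]$')

echo "loose: $loose, strict: $strict"
```

This checks only the shape of the number, not the check digit or the embedded birth date.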
4) ( ) groups a subexpression into one unit; also used for back-references
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep 'isize|bsize'
meta-data=/dev/sda1 isize=512 agcount=4, agsize=65536 blks
data = bsize=4096 blocks=262144, imaxpct=25
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep 'i|bsize'
meta-data=/dev/sda1 isize=512 agcount=4, agsize=65536 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0 spinodes=0
data = bsize=4096 blocks=262144, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep '(i|b)size'
meta-data=/dev/sda1 isize=512 agcount=4, agsize=65536 blks
data = bsize=4096 blocks=262144, imaxpct=25
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal bsize=4096 blocks=2560, version=2
## ( ) in back-references
[root@m01 /server/files]# echo 123456
123456
[root@m01 /server/files]# echo '<123456>'
<123456>
[root@m01 /server/files]# echo 123456 |sed -r 's#(.*)#<\1>#g'
<123456>
[root@m01 /server/files]# echo 123456 |sed -r 's#(.)#<\1>#g'
<1><2><3><4><5><6>
[root@m01 /server/files]# echo {1..10}
1 2 3 4 5 6 7 8 9 10
[root@m01 /server/files]# echo {1..10} |sed -r 's#([0-9])#<\1>#g'
<1> <2> <3> <4> <5> <6> <7> <8> <9> <1><0>
[root@m01 /server/files]# echo {1..10} |sed -r 's#([0-9]+)#<\1>#g'
<1> <2> <3> <4> <5> <6> <7> <8> <9> <10>
[root@m01 /server/files]#
[root@m01 /server/files]#
[root@m01 /server/files]# echo {1..10} |sed -r 's#(4|6|10)#<\1>#g'
1 2 3 <4> 5 <6> 7 8 9 <10>
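With more than one group, \1, \2 and so on are numbered by opening parenthesis from left to right. A short sketch with sed -E (the same ERE mode as -r); the input strings are made up:

```shell
#!/usr/bin/env bash
# Two groups: swap the two words by emitting \2 before \1.
swapped=$(echo 'oldboy lidao' | sed -E 's#([a-z]+) ([a-z]+)#\2 \1#')

# & stands for the whole match, so no group is needed just to wrap it.
wrapped=$(echo '1 2 10' | sed -E 's#[0-9]+#<&>#g')

echo "$swapped"
echo "$wrapped"
```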
5) ? the preceding character occurs zero times or once
[root@m01 /server/files]# cat job.txt
joooooooob
jooooob
jooob
job
jb
[root@m01 /server/files]# egrep 'job|jb' job.txt
job
jb
[root@m01 /server/files]# egrep 'jo?b' job.txt
job
jb
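The three repetition symbols can be compared side by side. A sketch on lines like those in job.txt:

```shell
#!/usr/bin/env bash
sample=$'joooooooob\njooob\njob\njb'

zero_or_one=$(printf '%s\n' "$sample" | grep -Ec 'jo?b')   # jb, job
zero_or_more=$(printf '%s\n' "$sample" | grep -Ec 'jo*b')  # every line
one_or_more=$(printf '%s\n' "$sample" | grep -Ec 'jo+b')   # every line except jb

echo "? $zero_or_one   * $zero_or_more   + $one_or_more"
```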
Extract the IP address using only a regex
[root@m01 /server/files]# ip a s eth0 |egrep -o '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'|head -1
10.0.0.61
[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 10.0.0.61/24 brd 10.0.0.255 scope global eth0
[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}' -o
1500
1000
10.0.0.61
10.0.0.255
[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}' -o |sed -n 3p
10.0.0.61
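Making the dot optional is what lets 1500 and 1000 through. One possible tightening keeps the dot mandatory inside the repeated group; the sample line imitates the `ip a s eth0` output above:

```shell
#!/usr/bin/env bash
# Sample text in the style of the `ip a s eth0` output (assumed, not live data).
line='inet 10.0.0.61/24 brd 10.0.0.255 scope global eth0 mtu 1500'

# Three "digits plus dot" groups, then a final group without the dot.
# Plain numbers such as 1500 no longer match.
ips=$(printf '%s\n' "$line" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}')
first_ip=$(printf '%s\n' "$ips" | head -n 1)

echo "$ips"
echo "first: $first_ip"
```

The pattern still accepts values above 255, so it picks out the shape of an address rather than validating it.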
2.7 Perl regular expressions
1) Zero-width assertions (lookaround)
# Extract the uptime from the lines below
16:30:13 up 41 days, 17:41, 1 user, load average: 0.03, 0.09, 0.08
16:29:51 up 7:20, 2 users, load average: 0.01, 0.03, 0.05
16:41:42 up 0 min, 1 user, load average: 0.43, 0.13, 0.05
[root@m01 ~]# grep -Po '(?<=up).*(?=\d+ user)' uptime.txt
41 days, 17:41,
7:20,
0 min,
23 min,
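The output above still carries the surrounding spaces and the trailing comma. A variant sketch moves them into the assertions; it assumes a GNU grep built with PCRE support (-P) and uses the three sample lines shown earlier:

```shell
#!/usr/bin/env bash
sample=$' 16:30:13 up 41 days, 17:41, 1 user, load average: 0.03, 0.09, 0.08\n 16:29:51 up 7:20, 2 users, load average: 0.01, 0.03, 0.05\n 16:41:42 up 0 min, 1 user, load average: 0.43, 0.13, 0.05'

# (?<=up )           : the match must be preceded by "up "
# (?=, +[0-9]+ user) : the match must be followed by ", N user"
# Neither assertion is part of the match, so -o prints only the text in between.
uptimes=$(printf '%s\n' "$sample" | grep -Po '(?<=up ).*(?=, +[0-9]+ user)')

echo "$uptimes"
```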
2) Perl symbols
2.8 Other symbols
1) Bracket expressions (POSIX character classes), for reference
[:alnum:]
[root@m01 /server/files]# grep '[[:alnum:]]' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
____aaaaolidao996
[root@m01 /server/files]# grep '[a-zA-Z0-9]' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
____aaaaolidao996
[root@m01 /server/files]#
[root@m01 /server/files]# grep -P '\w' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com
our size is http://blog.oldboyedu.com
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!
____aaaaolidao996
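All three commands print the same lines here because every line contains a letter or digit. The classes do differ on the underscore, which a line made only of underscores brings out (a sketch; GNU grep accepts \w without -P as well):

```shell
#!/usr/bin/env bash
# [[:alnum:]] is letters and digits only: no match on underscores.
alnum_hits=$(echo '____' | grep -c '[[:alnum:]]' || true)

# Adding _ to the bracket expression gives the same set as \w.
alnum_us_hits=$(echo '____' | grep -c '[[:alnum:]_]' || true)

# \w is letters, digits and underscore.
word_hits=$(echo '____' | grep -c '\w' || true)

echo "$alnum_hits $alnum_us_hits $word_hits"
```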
2) Other characters
[root@m01 /server/files]# cat name.txt
oldboy
oldboylidao
oldboy666
oldboyedu
[root@m01 /server/files]# grep -w oldboy name.txt
oldboy
[root@m01 /server/files]# grep '\boldboy\b' name.txt
oldboy
[root@m01 /server/files]# grep '\<oldboy\>' name.txt
oldboy
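A word boundary sits between a word character (letter, digit, underscore) and anything else. A sketch with a few extra made-up lines shows what counts as a whole word; `-x` is added for comparison:

```shell
#!/usr/bin/env bash
sample=$'oldboy\noldboylidao\noldboy666\nmy oldboy blog\noldboy-edu'

plain=$(printf '%s\n' "$sample" | grep -c 'oldboy')          # substring: every line
whole_w=$(printf '%s\n' "$sample" | grep -cw 'oldboy')       # -w: whole word
whole_b=$(printf '%s\n' "$sample" | grep -c '\boldboy\b')    # \b: same idea inside the regex
whole_line=$(printf '%s\n' "$sample" | grep -cx 'oldboy')    # -x: the entire line

echo "$plain $whole_w $whole_b $whole_line"
```

`oldboy-edu` counts as a whole-word match because `-` is not a word character, while `oldboy666` does not because digits are.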