The Three Musketeers: grep, sed, and awk

2021-07-01  天生顽皮

Part 2.

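2.1 Regular expression overview

1. Goals:
    * Make it easier to process text and character data
    * Make it easier to process content that follows a pattern
    * Make it easier to process characters with the three tools (grep, sed, awk) and with high-level languages
2. Use cases:
  Use special symbols such as "^  $  .  *  .*  ()  []  [^]  |  +  ..." to express or match content that follows a pattern
3. Examples
   Matching mobile phone numbers
   Matching ID card numbers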

2.2 Regex flavors

re (regular expression)
Basic regular expressions: BRE
Extended regular expressions: ERE
1. BRE symbols: ^  $  .  *   .*     ^$  []    [^]
2. ERE symbols: +    |   ()    {}   ?
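A minimal sketch of how the two flavors differ in practice, assuming a POSIX-style grep. The sample lines and variable names are made up for illustration. In BRE a bare `+` is an ordinary character, while in ERE it is a repetition operator:

```shell
# Sample data (hypothetical): four short lines
sample=$(printf 'ab\naab\nb\na+b\n')

# BRE: + is a literal plus sign, so only the line "a+b" matches
bre_count=$(printf '%s\n' "$sample" | grep -c 'a+b')

# ERE: + means "one or more of the preceding character", so "ab" and "aab" match
ere_count=$(printf '%s\n' "$sample" | grep -Ec 'a+b')

# BRE can express the same repetition with an escaped interval
bre_interval_count=$(printf '%s\n' "$sample" | grep -c 'a\{1,\}b')

printf 'BRE literal: %s, ERE: %s, BRE interval: %s\n' \
  "$bre_count" "$ere_count" "$bre_interval_count"
```

The same pattern text can therefore select different lines depending on whether the tool is running in BRE or ERE mode.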

2.3 Differences


2.4 Common regex misconceptions

(1) Regex vs. wildcards
(2) Wildcard quick-review guide
# Matching file names
## * matches everything
ls *.txt   
find /    -type f  -name "*.avi"
## {} generates sequences
### Ranges
echo {a..z}   {A..Z}  
echo {0..10}  
echo {01..100}
### Irregular lists
[root@m01 ~]# echo {a,b,z}
a b z
[root@m01 ~]# cp   oldboy.txt{,.bak}
[root@m01 ~]# ll oldboy.txt* 
-rwxrwxrwx. 1 root root 23 Apr 25 16:41 oldboy.txt
-rwxr-xr-x  1 root root 23 Jul  1 09:11 oldboy.txt.bak
[root@m01 ~]# echo cp   oldboy.txt{,.bak}
cp oldboy.txt oldboy.txt.bak
[root@m01 ~]# 
[root@m01 ~]# echo cp   A{,.bak}
cp A A.bak
[root@m01 ~]# echo cp   A{,C}
cp A AC
[root@m01 ~]# 
### Advanced
[root@m01 ~]# echo {1..10..2}
1 3 5 7 9
[root@m01 ~]# echo {2..10..2}
2 4 6 8 10
[root@m01 ~]# echo {a..z..2}
a c e g i k m o q s u w y
[root@m01 ~]# 
[root@m01 ~]# 
[root@m01 ~]# seq 1   2 10 
1
3
5
7
9
### Learn more
man bash
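The wildcard/regex distinction can be sketched as follows. This is an illustration under assumptions: the scratch directory and file names are hypothetical. The shell expands `*.txt` against file names before the command runs, whereas grep applies a regex to the text it reads, where `.` means "any character" unless escaped:

```shell
# Work in a scratch directory (file names are hypothetical)
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/atxt" "$dir/notes.md"

# Wildcard: the shell expands *.txt into the matching file names
set -- "$dir"/*.txt
glob_count=$#

# Regex: the dot must be escaped to mean a literal dot
regex_count=$(ls "$dir" | grep -c '\.txt$')

# An unescaped dot matches any character, so "atxt" matches as well
loose_count=$(ls "$dir" | grep -c '.txt$')

printf 'glob: %s, regex: %s, loose regex: %s\n' \
  "$glob_count" "$regex_count" "$loose_count"
rm -r "$dir"
```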

2.5 Basic regular expressions (BRE)

(1) ^ $  (^ anchors the start of a line, $ anchors the end)
Sample file oldboy.txt:
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY! 
2) ^$ matches empty lines
[root@m01 /server/files]# grep -v '^$' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!  
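Example: exclude the empty lines and comment lines in /etc/ssh/sshd_config
grep -v '^$' /etc/ssh/sshd_config |grep -v '#'
egrep -v '^$|#' /etc/ssh/sshd_config
sed -r '/^$|#/d'   /etc/ssh/sshd_config
awk   '!/^$|#/'   /etc/ssh/sshd_config
Other approaches: 
grep '^[a-zA-Z]' /etc/ssh/sshd_config
# [a-Z] depends on the locale's collation order; in the C locale grep rejects it with "Invalid range end"
grep '^[a-Z]' /etc/ssh/sshd_config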
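The three tools can be compared side by side on a small file. A minimal sketch, assuming a made-up config file rather than the real sshd_config; it also anchors `#` to the start of the line (an assumption that differs from the commands above) so that a line with a trailing comment is kept:

```shell
# Hypothetical sample config written to a temp file
conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample config (hypothetical content)
Port 22

#PermitRootLogin yes
PasswordAuthentication no  # inline note
EOF

# grep: -v inverts the match, dropping blank lines and lines starting with #
by_grep=$(grep -Ev '^$|^#' "$conf")

# sed: delete (d) the matching lines and print the rest
by_sed=$(sed -E '/^$|^#/d' "$conf")

# awk: ! negates the pattern, so only non-matching lines are printed
by_awk=$(awk '!/^$|^#/' "$conf")

printf '%s\n' "$by_grep"
rm -f "$conf"
```

All three commands keep the same two setting lines here; which tool to use is mostly a matter of what else the pipeline needs to do.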
3) . matches any single character
4) * the preceding character repeated zero or more times
[root@m01 /server/files]# grep '0*' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# 
[root@m01 /server/files]# 
[root@m01 /server/files]# 
[root@m01 /server/files]# #grep '0*' oldboy.txt 
[root@m01 /server/files]# # repeated one or more times: 0 0000 0000000
[root@m01 /server/files]# # repeated zero times: '' (the empty string)
[root@m01 /server/files]# grep '' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
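Why `0*` selects every line can be checked with a short sketch. The three sample lines are taken from oldboy.txt; the variable names are hypothetical. Because `*` allows zero repetitions, `0*` also matches the empty string, which is present in every line; requiring one literal `0` first changes the result:

```shell
sample=$(printf 'my qq is 49000448\nI teach linux.\nnot 4900000448.\n')

# Total number of lines in the sample
total=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')

# 0* can match zero characters, so every line matches
star_count=$(printf '%s\n' "$sample" | grep -c '0*')

# 00* needs at least one 0, so only the two lines with zeros match
one_or_more_count=$(printf '%s\n' "$sample" | grep -c '00*')

printf 'total: %s, 0*: %s, 00*: %s\n' "$total" "$star_count" "$one_or_more_count"
```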
Challenge: exclude the empty lines and the lines that contain only spaces
[root@m01 /server/files]# echo -e 'oldboy\n\nlidao   \n     \n\nlidao\n   lidao     '>star.txt
[root@m01 /server/files]# cat star.txt
oldboy
lidao    
     
lidao
   lidao     
[root@m01 /server/files]# 
[root@m01 /server/files]# cat -A star.txt
oldboy$
$
lidao    $
     $
$
lidao$
   lidao     $
   
# Method 1
[root@m01 /server/files]# grep -v '^$' star.txt 
oldboy
lidao    
     
lidao
   lidao     
[root@m01 /server/files]# grep -v '^$' star.txt |grep -v '^ *$'
oldboy
lidao    
lidao
   lidao     
# Method 2
[root@m01 /server/files]# grep -v '^ *$' star.txt 
oldboy
lidao    
lidao
   lidao     
[root@m01 /server/files]# # ^ *$ with * matching one or more spaces: "^ $", "^  $", "^         $"
[root@m01 /server/files]# # ^ *$ with * matching zero spaces: "^$"
[root@m01 /server/files]# grep -v '^ *$' star.txt 
oldboy
lidao    
lidao
   lidao 
   
   
# Method 3
[root@m01 /server/files]# egrep -v '^$|^ +$' star.txt
oldboy
lidao    
lidao
   lidao 
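The methods above only handle spaces. A sketch of a variant that also covers tab-only lines, assuming the POSIX class `[[:space:]]` is acceptable for the task; the sample file is a hypothetical variation of star.txt with one tab-only line added:

```shell
f=$(mktemp)
# Six lines: text, empty, text with trailing spaces, spaces only, tab only, text
printf 'oldboy\n\nlidao   \n     \n\t\nlidao\n' > "$f"

# '^ *$' matches empty and space-only lines; the tab-only line survives
kept_spaces_only=$(grep -vc '^ *$' "$f")

# '^[[:space:]]*$' also matches the tab-only line
kept_any_blank=$(grep -vc '^[[:space:]]*$' "$f")

printf 'kept with "^ *$": %s, kept with [[:space:]]: %s\n' \
  "$kept_spaces_only" "$kept_any_blank"
rm -f "$f"
```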
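5) .* matches everything
# Regex greediness: .* matches as much as it can; the same holds for the repetition operators * + {} ?
grep  '^.*:' passwd 
# To limit the match, add more conditions/content to the pattern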
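Greediness and one way to restrict it can be sketched on a single passwd-style line. The line is a typical root entry used for illustration, and `grep -o` is assumed to be available (GNU and BSD grep both provide it):

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Greedy: .* runs as far as possible, so the match ends at the LAST colon
greedy=$(printf '%s\n' "$line" | grep -o '^.*:')

# Restricted: [^:]* cannot cross a colon, so the match ends at the FIRST colon
limited=$(printf '%s\n' "$line" | grep -o '^[^:]*:')

printf 'greedy:  %s\nlimited: %s\n' "$greedy" "$limited"
```

Replacing `.` with a negated bracket expression is a common way to add the extra condition that stops the match early.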
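Expressing AND with a regular expression
# Match the lines in oldboy.txt that start with the letter m and end with "m " (m followed by a space)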
[root@m01 /server/files]# grep '^m.*m $' oldboy.txt 
my blog is http://oldboy.blog.51cto.com 
[root@m01 /server/files]# 
[root@m01 /server/files]# 
[root@m01 /server/files]# grep '^m' oldboy.txt 
my blog is http://oldboy.blog.51cto.com 
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# grep '^m' oldboy.txt |grep 'm $'
my blog is http://oldboy.blog.51cto.com 
6) [] [abc] is a single unit that matches any one character: a, b, or c
# Match digits
[0-9]
# Match letters (upper and lower case)
[a-z]
[A-Z]
[a-zA-Z]
[a-Z]
# Basic usage
[root@m01 /server/files]# grep '[abc]' oldboy.txt 
[root@m01 /server/files]# grep -o '[abc]' oldboy.txt 
a
b
a
c
a
c
b
a

[root@m01 /server/files]# grep -o '[abc][abc]' oldboy.txt 
ac
ac
ba
ba
ba
[root@m01 /server/files]# grep '[abc][abc]' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess
Note: \ is the escape character; it strips a character of its special meaning so that it is matched literally
[root@m01 /server/files]# grep '\.$' oldboy.txt 
I teach linux.
not 4900000448.
[root@m01 /server/files]# # \ is the escape character: it removes the special meaning of the next character
[root@m01 /server/files]# 
[root@m01 /server/files]# grep '[.^$!!??]$' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
not 4900000448.
[root@m01 /server/files]# 
[root@m01 /server/files]# vim oldboy.txt 
[root@m01 /server/files]# 
[root@m01 /server/files]# grep '[.^$!!??]$' oldboy.txt 
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
not 4900000448.
^^^^$$$$$....****?????!!!!
$$..**?????!!!!
^^^^$$..**???!!!!
[root@m01 /server/files]# grep '[.^$!!??]' oldboy.txt 
I am oldboy teacher!
I teach linux.
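The two ways of matching a special character literally, escaping it or placing it inside `[]`, can be sketched like this. The sample lines and variable names are hypothetical:

```shell
sample=$(printf 'I teach linux.\nmy qq is 49000448\ncost is 5$\nwhat?\n')

# Unescaped . matches any character, so every non-empty line "ends with ."
any_end=$(printf '%s\n' "$sample" | grep -c '.$')

# \. is a literal dot: only the line ending in "." matches
dot_end=$(printf '%s\n' "$sample" | grep -c '\.$')

# Inside [] the characters . $ ? are literal, no backslash needed
set_end=$(printf '%s\n' "$sample" | grep -c '[.$?]$')

printf 'any: %s, literal dot: %s, bracket set: %s\n' "$any_end" "$dot_end" "$set_end"
```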
7) [^] [^abc] is a single unit that matches any one character except a, b, or c
Advanced: match the first column of /etc/passwd
grep '^[a-zA-Z0-9]*' /etc/passwd
useradd quede_user_name666
grep '^[a-zA-Z0-9]*' /etc/passwd
grep '^[a-zA-Z0-9_]*' /etc/passwd
useradd quede_user_name666-01
grep '^[a-zA-Z0-9_]*' /etc/passwd
grep '^[a-zA-Z0-9_-]*' /etc/passwd
useradd quede_user_name666-01...
grep '^[a-zA-Z0-9_-]*' /etc/passwd
# This one takes some time to digest
[root@m01 /server/files]# grep '^[^:]*' /etc/passwd 
egrep '^[^:]+' /etc/passwd 
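The "everything up to the first colon" idea can be checked against `cut`, which splits on the delimiter directly. A minimal sketch with two hypothetical passwd-style lines, assuming a grep that supports `-o`:

```shell
sample=$(printf 'root:x:0:0:root:/root:/bin/bash\nquede_user-01:x:1001:1001::/home/q:/bin/bash\n')

# [^:]+ means "one or more characters that are not a colon", anchored at line start
first_col=$(printf '%s\n' "$sample" | grep -Eo '^[^:]+')

# Reference result: field 1 with ":" as the delimiter
cut_col=$(printf '%s\n' "$sample" | cut -d: -f1)

printf '%s\n' "$first_col"
```

Describing what the field must not contain avoids having to list every character a user name may contain.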
8) BRE summary

2.6 Extended regular expressions (ERE)

1) | means "or" (alternation)
[root@m01 /server/files]# egrep 'oldboy|my' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# grep -E 'oldboy|my' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# 
[root@m01 /server/files]# grep   'oldboy\|my' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
my god ,i am not oldbey,but OLDBOY! 
[root@m01 /server/files]# egrep 'oldbo|ey' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# egrep 'oldb(o|e)y' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my god ,i am not oldbey,but OLDBOY!  
[root@m01 /server/files]# egrep 'oldb[oe]y' oldboy.txt 
I am oldboy teacher!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my god ,i am not oldbey,but OLDBOY
[] vs. |: [] matches exactly one character from a set, whereas | chooses between whole patterns of any length
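The scope of `|` is easy to get wrong, so here is a small sketch with hypothetical sample words. Without parentheses, `oldbo|ey` means "oldbo" or "ey", which is broader than intended:

```shell
sample=$(printf 'oldboy\noldbey\nmonkey\noldbay\n')

# Ungrouped: "oldbo" OR "ey", so "monkey" matches too
loose=$(printf '%s\n' "$sample" | grep -Ec 'oldbo|ey')

# Grouped: the alternation is limited to the single position inside ()
grouped=$(printf '%s\n' "$sample" | grep -Ec 'oldb(o|e)y')

# Bracket expression: same result when each alternative is one character
bracket=$(printf '%s\n' "$sample" | grep -Ec 'oldb[oe]y')

printf 'loose: %s, grouped: %s, bracket: %s\n' "$loose" "$grouped" "$bracket"
```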
2) + the preceding character repeated one or more times
# Extract every word in the file, then count how often each one occurs
egrep -o '[a-Z]+' oldboy.txt 
egrep -o '[a-Z]+' oldboy.txt |sort 
egrep -o '[a-Z]+' oldboy.txt |sort |uniq -c
egrep -o '[a-Z]+' oldboy.txt |sort |uniq -c |sort -rn
egrep -o '[a-Z]+' oldboy.txt |awk '{word[$1]++}END{for(n in word)print n,word[n]}'

# Extract every letter in the file, then count how often each one occurs
egrep -o '[a-Z]' oldboy.txt 
egrep -o '[a-Z]' oldboy.txt |sort 
egrep -o '[a-Z]' oldboy.txt |sort |uniq -c
egrep -o '[a-Z]' oldboy.txt |sort |uniq -c |sort -rn
egrep -o '[a-Z]' oldboy.txt |awk '{word[$1]++}END{for(n in word)print n,word[n]}'
awk -vRS='[^a-zA-Z]+' -F ''  '{for(i=1;i<=NF;i++)word[$i]++}END{for(n in word)print n,word[n]}' oldboy.txt
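The word-count pipeline can be tried on a one-line sample. This sketch uses the POSIX class `[[:alpha:]]` instead of `[a-Z]` (a substitution, chosen because it does not depend on the locale's collation order); the sample sentence and variable names are hypothetical:

```shell
text='the cat and the dog and the bird'

# One word per line, group identical words, count them, sort by count descending,
# then print "word count" instead of uniq's "count word"
freq=$(printf '%s\n' "$text" \
  | grep -Eo '[[:alpha:]]+' \
  | sort | uniq -c | sort -rn \
  | awk '{print $2, $1}')

# Most frequent word and its count
top=$(printf '%s\n' "$freq" | head -n 1)

printf '%s\n' "$freq"
```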
3) {} a{n,m} the preceding character a repeated at least n times and at most m times
[root@m01 /server/files]# egrep '0{1,5}' oldboy.txt 
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep -o '0{1,5}' oldboy.txt 
000
00000
[root@m01 /server/files]# egrep '0{3,4}' oldboy.txt 
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep -o '0{3,4}' oldboy.txt 
000
0000
[root@m01 /server/files]# egrep '0{3}' oldboy.txt 
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep '0{2}' oldboy.txt 
my qq is 49000448
not 4900000448.
[root@m01 /server/files]# egrep '0{2}' oldboy.txt -o 
00
00
00
Sample file id.txt:
金 211324198705244720
万 500224197105168312
任 1231231231oldboy
任 3oldboy
任 lidao97303136098
任 alex2197303136098
任 350182197303oldgir
吕 211282199209113038
孔 150000198309176071
邹 371001197412221284
贺 130185200011215926
杜 362522198711278101
向 14052219961008852X
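Extract the lines that contain a valid ID card number

[root@m01 /server/files]# egrep '[0-9]{17}[0-9X]' id.txt 
金 211324198705244720
万 500224197105168312
吕 211282199209113038
孔 150000198309176071
邹 371001197412221284
贺 130185200011215926
杜 362522198711278101
向 14052219961008852X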

# Regex workflow
## First find the pattern in the data
## Then express it as a regex and match
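Following that workflow one step further: the pattern `[0-9]{17}[0-9X]` has no anchors, so it also accepts an over-long digit string that merely contains 18 matching characters. A sketch of a stricter variant, assuming the number is the last field on the line and is preceded by a space; the labels a to e and line e are hypothetical, the other values come from id.txt:

```shell
ids=$(mktemp)
cat > "$ids" <<'EOF'
a 211324198705244720
b 14052219961008852X
c 1231231231oldboy
d 350182197303oldgir
e 9211324198705244720123
EOF

# Unanchored: line e (22 digits) matches because it contains an 18-character hit
loose=$(grep -Ec '[0-9]{17}[0-9X]' "$ids")

# Anchored: a space, exactly 17 digits, one digit or X, then end of line
strict=$(grep -Ec ' [0-9]{17}[0-9X]$' "$ids")

printf 'loose: %s, strict: %s\n' "$loose" "$strict"
rm -f "$ids"
```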
4) ( ) groups a sub-expression into a single unit and enables back-references
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep 'isize|bsize'
meta-data=/dev/sda1              isize=512    agcount=4, agsize=65536 blks
data     =                       bsize=4096   blocks=262144, imaxpct=25
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep 'i|bsize'
meta-data=/dev/sda1              isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@m01 /server/files]# xfs_info /dev/sda1 |egrep '(i|b)size'
meta-data=/dev/sda1              isize=512    agcount=4, agsize=65536 blks
data     =                       bsize=4096   blocks=262144, imaxpct=25
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
## () in back-references
[root@m01 /server/files]# echo 123456 
123456
[root@m01 /server/files]# echo '<123456>'
<123456>

[root@m01 /server/files]# echo 123456 |sed -r 's#(.*)#<\1>#g'
<123456>
[root@m01 /server/files]# echo 123456 |sed -r 's#(.)#<\1>#g'
<1><2><3><4><5><6>
[root@m01 /server/files]# echo {1..10} 
1 2 3 4 5 6 7 8 9 10
[root@m01 /server/files]# echo {1..10} |sed -r 's#([0-9])#<\1>#g'
<1> <2> <3> <4> <5> <6> <7> <8> <9> <1><0>
[root@m01 /server/files]# echo {1..10} |sed -r 's#([0-9]+)#<\1>#g'
<1> <2> <3> <4> <5> <6> <7> <8> <9> <10>
[root@m01 /server/files]# 
[root@m01 /server/files]# 
[root@m01 /server/files]# echo {1..10} |sed -r 's#(4|6|10)#<\1>#g'
1 2 3 <4> 5 <6> 7 8 9 <10>
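Back-references are not limited to `\1`. A minimal sketch with two groups, using `sed -E` (the portable spelling of the `-r` option used above) and hypothetical input strings:

```shell
# \1 and \2 refer to the first and second parenthesised groups, in order
swapped=$(printf 'oldboy:lidao\n' | sed -E 's/^([^:]*):([^:]*)$/\2:\1/')

# A group combined with + captures a whole number rather than a single digit
wrapped=$(printf '1 2 10\n' | sed -E 's/([0-9]+)/<\1>/g')

printf '%s\n%s\n' "$swapped" "$wrapped"
```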
5) ? the preceding character occurs zero or one time
[root@m01 /server/files]# cat job.txt
joooooooob
jooooob
jooob
job
jb
[root@m01 /server/files]# egrep 'job|jb' job.txt
job
jb
[root@m01 /server/files]# egrep 'jo?b' job.txt
job
jb
Extract the IP address using only regular expressions
[root@m01 /server/files]# ip a s eth0 |egrep -o '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'|head -1 
10.0.0.61
[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
   inet 10.0.0.61/24 brd 10.0.0.255 scope global eth0
[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}' -o 
1500
1000
10.0.0.61
10.0.0.255

[root@m01 /server/files]# ip a s eth0 |egrep '([0-9]{1,3}\.?){4}' -o |sed -n 3p 
10.0.0.61
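The stray `1500` and `1000` appear because `\.?` makes every dot optional, so four bare digits satisfy `([0-9]{1,3}\.?){4}`. A sketch of a tighter pattern that requires the three dots, run against a copy of the `ip` output shown above so it does not depend on a real eth0 interface:

```shell
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 10.0.0.61/24 brd 10.0.0.255 scope global eth0'

# Optional dots: plain numbers such as 1500 also match
loose_count=$(printf '%s\n' "$sample" | grep -Eo '([0-9]{1,3}\.?){4}' | wc -l | tr -d ' ')

# Mandatory dots: three "digits + dot" groups followed by a final digit group
strict=$(printf '%s\n' "$sample" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}')

# The first hit is the interface address, the second is the broadcast address
first_ip=$(printf '%s\n' "$strict" | head -n 1)

printf 'loose hits: %s\nstrict hits:\n%s\n' "$loose_count" "$strict"
```

Neither pattern checks that each octet is at most 255; they only describe the dotted shape.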

2.7 Perl regular expressions

1) Zero-width assertions (lookaround)
# Extract the uptime from the lines below
16:30:13 up 41 days, 17:41,  1 user, load average: 0.03, 0.09, 0.08
16:29:51 up  7:20,  2 users, load average: 0.01, 0.03, 0.05
16:41:42 up 0 min,  1 user, load average: 0.43, 0.13, 0.05

[root@m01 ~]# grep -Po '(?<=up).*(?=\d+ user)' uptime.txt 
41 days, 17:41,  
  7:20,  
0 min,  
23 min, 
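Where `grep -P` is not available, a capture group in sed can stand in for the lookaround. This is a different technique from the one above, not an equivalent of it, and it also drops the trailing comma and spaces that the lookahead version leaves in its output. A sketch using the three sample lines, written to a temp file:

```shell
up=$(mktemp)
cat > "$up" <<'EOF'
16:30:13 up 41 days, 17:41,  1 user, load average: 0.03, 0.09, 0.08
16:29:51 up  7:20,  2 users, load average: 0.01, 0.03, 0.05
16:41:42 up 0 min,  1 user, load average: 0.43, 0.13, 0.05
EOF

# Keep only the text between "up" and ", N user(s),":
#   ^.* up +           everything through "up" and the spaces after it
#   (.*)               the uptime itself, captured as \1
#   , +[0-9]+ users?,  the user count that follows it
uptimes=$(sed -E 's/^.* up +(.*), +[0-9]+ users?,.*$/\1/' "$up")

printf '%s\n' "$uptimes"
rm -f "$up"
```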
2) Perl symbols

2.8 Other symbols

1) Bracket expressions (POSIX character classes); for reference
[:alnum:]
[root@m01 /server/files]# grep '[[:alnum:]]' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!  
____aaaaolidao996
[root@m01 /server/files]# grep '[a-zA-Z0-9]' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!  
____aaaaolidao996
[root@m01 /server/files]# 
[root@m01 /server/files]# grep -P '\w' oldboy.txt
I am oldboy teacher!
I teach linux.
I like badminton ball ,billiard ball and chinese chess!
my blog is http://oldboy.blog.51cto.com 
our size is http://blog.oldboyedu.com 
my qq is 49000448
not 4900000448.
my god ,i am not oldbey,but OLDBOY!  
____aaaaolidao996
2) Other symbols
[root@m01 /server/files]# cat name.txt 
oldboy
oldboylidao
oldboy666
oldboyedu
[root@m01 /server/files]# grep -w oldboy name.txt
oldboy
[root@m01 /server/files]# grep '\boldboy\b' name.txt
oldboy
[root@m01 /server/files]# grep '\<oldboy\>' name.txt
oldboy
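Word matching also applies in the middle of a line, not only when the word is the whole line. A short sketch, assuming a grep that supports `-w` (GNU and BSD grep do); the last sample line is a hypothetical addition to name.txt:

```shell
names=$(printf 'oldboy\noldboylidao\noldboy666\noldboyedu\nmy oldboy blog\n')

# Plain match: "oldboy" appears somewhere in every line
plain=$(printf '%s\n' "$names" | grep -c 'oldboy')

# -w: the match must not be adjacent to letters, digits, or underscore
whole=$(printf '%s\n' "$names" | grep -wc 'oldboy')

printf 'plain: %s, whole word: %s\n' "$plain" "$whole"
```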