
Linux Day 21: grep/sed/awk

2018-11-07  泥人吴

grep

Basic regular expressions

$ vim regular_express.txt 
$ grep -n 'the' regular_express.txt 
8:I can't finish the test.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
# Highlight the matched keyword in color
$ grep -n  --color=auto 'the' regular_express.txt 
8:I can't finish the test.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
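# Typing --color=auto every time is tedious, so define an alias instead.
# The alias takes effect immediately in the current shell. To keep it across sessions, add the same line to ~/.bashrc and reload that file:
$ alias grep='grep --color=auto'
$ source ~/.bashrc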
# Invert the match: print the lines that do not contain 'the'
$ grep -vn 'the' regular_express.txt
# Match 'the' regardless of case
$ grep -in 'the' regular_express.txt
# Note: [^[:lower:]] means the same as [^a-z] (in the C locale)
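The options above can be tried on a small file. The following is a minimal sketch, assuming a made-up three-line sample (it is not the regular_express.txt used in this post) and forcing the C locale for the bracket-expression comparison:

```shell
# Hypothetical three-line sample file (not the regular_express.txt used above).
sample=$(mktemp)
printf '%s\n' 'The quick fox' 'over the lazy dog' 'nothing here' > "$sample"

with_the=$(grep -n 'the' "$sample")       # lines containing lowercase "the"
without_the=$(grep -vn 'the' "$sample")   # -v: lines that do NOT contain "the"
any_case=$(grep -in 'the' "$sample")      # -i: "the", "The", "THE", ...

# In the C locale these two bracket expressions select the same lines.
by_class=$(LC_ALL=C grep -n '^[^[:lower:]]' "$sample")
by_range=$(LC_ALL=C grep -n '^[^a-z]' "$sample")

printf '%s\n' "$with_the" '---' "$without_the" '---' "$any_case" '---' "$by_class"
rm -f "$sample"
```

Only line 2 contains the lowercase word, so -v returns lines 1 and 3, while -i also picks up "The" on line 1.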
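# The dot has a special meaning (it matches any single character), so it must be escaped with a backslash ( \ ) to match a literal dot. Here: lines that end with a period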
$ grep '\.$' regular_express.txt 
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
go! go! Let's go.
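To see why the backslash matters, here is a small sketch that counts matching lines with and without it. The three input lines are invented for illustration:

```shell
# Hypothetical sample lines, chosen so that only one ends with a period.
lines=$(printf '%s\n' 'ends with a period.' 'no period here' 'really?')

# '\.$' matches a literal dot at the end of the line.
literal=$(printf '%s\n' "$lines" | grep -c '\.$')
# '.$' matches ANY character at the end of the line, i.e. every non-empty line.
anychar=$(printf '%s\n' "$lines" | grep -c '.$')

echo "literal dot: $literal, any character: $anychar"
```

The unescaped pattern matches all three lines, which is why the escaped form is needed when the goal is to find sentences that end with a period.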

sed (stream editor)

NAME
       sed - stream editor for filtering and transforming text

SYNOPSIS
       sed [OPTION]... {script-only-if-no-other-script} [input-file]...
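# 马哥 (Ma Ge): sed 'AddressCommand' file ...
# Address forms:
1. StartLine,EndLine, for example 1,100. $ stands for the last line. (sed addresses do not support arithmetic, so $-1 cannot be used for the second-to-last line.)
2. /RegExp/, for example /^root/
$ sed '/oot/d' /etc/fstab
3. /pattern1/,/pattern2/: from the first line matched by pattern1 through the next line matched by pattern2, including every line in between
4. LineNumber: that single line
5. StartLine,+N: StartLine and the N lines after it (a GNU sed extension)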
# List /etc/passwd with line numbers, and delete lines 2 to 5
ubuntu@VM-0-3-ubuntu:~$ nl /etc/passwd | sed '2,5d'
     1  root:x:0:0:root:/root:/bin/bash
     6  games:x:5:60:games:/usr/games:/usr/sbin/nologin
     7  man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
     8  lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin

# Delete only line 2
ubuntu@VM-0-3-ubuntu:~$ nl /etc/passwd | sed '2d'
     1  root:x:0:0:root:/root:/bin/bash
     3  bin:x:2:2:bin:/bin:/usr/sbin/nologin

# To delete from line 3 through the last line, use nl /etc/passwd | sed '3,$d'. The dollar sign $ stands for the last line.
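The address forms can be compared side by side on a tiny input. This is a sketch under the assumption that GNU sed and seq are available; the ten numbered lines are just stand-in data:

```shell
# Ten numbered lines as hypothetical input.
input=$(seq 1 10)

range=$(printf '%s\n' "$input" | sed '2,5d')            # lines 2 through 5
to_end=$(printf '%s\n' "$input" | sed '3,$d')           # line 3 through the last line
by_regex=$(printf '%s\n' "$input" | sed '/^1/d')        # every line starting with 1
between=$(printf '%s\n' "$input" | sed '/^3$/,/^6$/d')  # from the "3" line to the "6" line
plus_n=$(printf '%s\n' "$input" | sed '4,+2d')          # GNU extension: line 4 plus 2 more

# Unquoted expansion joins the remaining lines with spaces for compact display.
echo "2,5d      ->" $range
echo "3,\$d      ->" $to_end
echo "/^1/d     ->" $by_regex
echo "/3/,/6/d  ->" $between
echo "4,+2d     ->" $plus_n
```

Note that the sed scripts are in single quotes, so the shell leaves the $ in '3,$d' alone.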

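# Append two lines after line 2, for example "Drink tea or ......" and "drink beer ?"
ubuntu@VM-0-3-ubuntu:~$ nl /etc/passwd | sed '2a Drink tea or ......\
> drink beer ?'
     1  root:x:0:0:root:/root:/bin/bash
     2  daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
Drink tea or ......
drink beer ?
     3  bin:x:2:2:bin:/bin:/usr/sbin/nologin
# The key point is that more than one line can be appended, but every appended line except the last must end with a backslash ( \ ). That is why only the first appended line above ends with \ . The leading > on the second input line is the shell's continuation prompt, not part of the text.
# \n can also be used to start a new line in the appended text (GNU sed).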
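The same append can be reproduced without touching /etc/passwd. A minimal sketch, assuming GNU sed (the one-line `2a text` form is a GNU extension) and a made-up three-line input:

```shell
# Hypothetical three-line input; the text after "2a" is appended after line 2.
# The backslash at the end of the first appended line continues the text
# onto the next line of the script.
out=$(printf '%s\n' one two three | sed '2a Drink tea or ......\
drink beer ?')

printf '%s\n' "$out"
```

In a script file there is no interactive prompt, so the second line of the sed script starts directly with the text to append.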
ubuntu@VM-0-3-ubuntu:~$ nano sed.sh
ubuntu@VM-0-3-ubuntu:~$ cat sed.sh 
hello,like
hi,my love
# Turn like into liker and love into lover ( & stands for the whole matched text)
ubuntu@VM-0-3-ubuntu:~$ sed 's#l..e#&r#' sed.sh 
hello,liker
hi,my lover
# Or use a back-reference:
ubuntu@VM-0-3-ubuntu:~$ sed 's#\(l..e\)#\1r#' sed.sh 
hello,liker
hi,my lover
# Turn like into Like and love into Love
ubuntu@VM-0-3-ubuntu:~$ sed 's#l\(..e\)#L\1#g' sed.sh 
hello,Like
hi,my Love
ubuntu@VM-0-3-ubuntu:~$ sed 's#l\(..e\)#L\1#' sed.sh 
hello,Like
hi,my Love
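The two commands above print the same result, with and without the g flag, because each line of sed.sh contains only one match. A short sketch with a hypothetical line that contains two matches shows where the flag makes a difference:

```shell
# Hypothetical input line with two words that match l..e.
line='like love'

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's#l\(..e\)#L\1#')
# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's#l\(..e\)#L\1#g')
# & inserts the whole match; \1 inserts only the parenthesized group.
amp=$(printf '%s\n' "$line" | sed 's#l..e#&r#g')

printf '%s\n' "$first_only" "$all" "$amp"
```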

awk

ubuntu@VM-0-3-ubuntu:~$ last -n 5 
ubuntu   pts/0        218.70.16.106    Wed Nov  7 14:40   still logged in
ubuntu   pts/1        119.86.113.106   Fri Nov  2 23:18 - 02:05  (02:47)
ubuntu   pts/0        113.205.193.254  Fri Nov  2 21:17 - 01:26  (04:08)
ubuntu   pts/0        218.70.16.106    Fri Nov  2 18:50 - 18:54  (00:04)
ubuntu   pts/0        218.70.16.106    Fri Nov  2 15:40 - 18:47  (03:07)

wtmp begins Fri Nov  2 13:13:25 2018

# Select the first and third columns
ubuntu@VM-0-3-ubuntu:~$ last -n 5 | awk '{$1 "\t" $3}'
# No output: the print keyword was missing
ubuntu@VM-0-3-ubuntu:~$ last -n 5 | awk '{print $1 "\t" $3}'
ubuntu  218.70.16.106
ubuntu  119.86.113.106
ubuntu  113.205.193.254
ubuntu  218.70.16.106
ubuntu  218.70.16.106
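The column selection can be checked on fixed input instead of live login records. A minimal sketch, assuming two invented rows laid out like the output of last:

```shell
# Two hypothetical rows; fields are separated by runs of spaces.
rows=$(printf '%s\n' 'ubuntu   pts/0   218.70.16.106' 'ubuntu   pts/1   119.86.113.106')

# Strings placed next to each other are concatenated, so "\t" joins the fields.
tabbed=$(printf '%s\n' "$rows" | awk '{print $1 "\t" $3}')
# A comma between print arguments inserts the output separator (a space by default).
spaced=$(printf '%s\n' "$rows" | awk '{print $1, $3}')

printf '%s\n' "$tabbed" "$spaced"
```

By default awk treats any run of spaces or tabs as a single separator, so the uneven spacing in the input does not matter.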
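# The default field separator is a space or a [TAB].
# In /etc/passwd, fields are separated by a colon ":".
# The first field of that file is the account name and the third field is the UID.
# Suppose we want the rows whose third field is less than 10, printing only the account name and the third field. It can be done like this: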
ubuntu@VM-0-3-ubuntu:~$ cat /etc/passwd | awk '{FS=":"} $3 < 10 {print $1 "\t" $3}'
root:x:0:0:root:/root:/bin/bash 
daemon  1
bin 2
sys 3
sync    4
games   5
man 6
lp  7
mail    8
news    9
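# The first line is not displayed correctly.
# This is because when the first line is read, the fields $1, $2, ... are still split on whitespace by default. Although we set FS=":", it only takes effect from the second line on. The fix is to set the awk variable in advance with the BEGIN keyword: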
ubuntu@VM-0-3-ubuntu:~$ cat /etc/passwd | awk 'BEGIN {FS=":"} $3 < 10 {print $1 "\t" $3}'
root    0
daemon  1
bin 2
sys 3
sync    4
games   5
man 6
lp  7
mail    8
news    9
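The timing of the FS assignment can be reproduced on a few fixed records. The sketch below uses invented passwd-style lines; the -F option in the last command is not used in this post and is shown only as a common alternative to BEGIN {FS=":"}:

```shell
# Hypothetical passwd-style records (only four fields each).
records=$(printf '%s\n' 'root:x:0:0' 'daemon:x:1:1' 'ubuntu:x:500:500')

# FS set in the main block: the first record has already been split on whitespace.
late=$(printf '%s\n' "$records" | awk '{FS=":"} $3 < 10 {print $1 "\t" $3}')
# FS set in BEGIN: it is in place before the first record is read.
early=$(printf '%s\n' "$records" | awk 'BEGIN {FS=":"} $3 < 10 {print $1 "\t" $3}')
# -F sets the separator from the command line (alternative, not from the original post).
flag=$(printf '%s\n' "$records" | awk -F: '$3 < 10 {print $1 "\t" $3}')

printf '%s\n' '--- late ---' "$late" '--- early ---' "$early" '--- -F ---' "$flag"
```

In the "late" output the whole first record appears as $1 with an empty $3, which mirrors the unsplit root line shown earlier.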

