#hello, JS: 07 Regular Expressions

2018-08-11 饥人谷_远方

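I. Purpose:

Regular expressions are used to check whether user input follows specific rules, for example whether a value is a valid mobile number or a valid email address. In many text editors they are more commonly used to search for and replace text that matches a pattern.

II. Definition:

In JS, a regular expression is a dedicated object that describes a rule. The rule is matched against a string, either to test whether the string conforms to it or to extract the parts that match.

III. How to create a regular expression

Regular expressions are supported through the built-in RegExp object. Suppose we want to match the substrings delimited by percent-sign tags, as in <%xxx%>. There are two ways to create the regular expression object:

1. Constructor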

 var reg=new RegExp('<%[^%>]+%>','g');

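Note: new RegExp takes the pattern as a string. The optional second argument is a string of flags that control how matching is carried out.

2. Literal

 var reg=/<%[^%>]+%>/g;

Note:

g: global, search the whole string; without it, the search stops at the first result
i: ignore case; matching is case-sensitive by default
m: multiline, so ^ and $ also match at the start and end of each line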
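
To make the two creation styles and the three flags concrete, here is a minimal sketch. The sample text and the variable names (template, fromConstructor, fromLiteral and so on) are made up for illustration.

```javascript
// Two equivalent ways to build the same pattern.
const fromConstructor = new RegExp('<%[^%>]+%>', 'g');
const fromLiteral = /<%[^%>]+%>/g;

const template = 'Hi <%name%>, you are <%age%> years old';
const tagsA = template.match(fromConstructor);
const tagsB = template.match(fromLiteral);
console.log(tagsA); // [ '<%name%>', '<%age%>' ]

// Without g, match() stops at the first result.
const firstOnly = template.match(/<%[^%>]+%>/);
console.log(firstOnly[0]); // <%name%>

// In the string form, every backslash has to be doubled.
const digitRun = new RegExp('\\d+');
const hasDigits = digitRun.test('abc123');

// i ignores case; m lets ^ and $ match at each line break.
const ignoreCase = /hello/i.test('HELLO');
const multiLine = 'a\nb'.match(/^b$/m) !== null;
const singleLine = 'a\nb'.match(/^b$/) !== null;
console.log(hasDigits, ignoreCase, multiLine, singleLine); // true true true false
```

The literal form is usually easier to read; the constructor form is handy when the pattern is assembled from strings at run time.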

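IV. What a regular expression looks like

var reg = /hello/ig   // reg now holds a regular expression object
--> undefined
reg
--> /hello/gi   // this is how the console displays a regex object
// reg is an object, not a string:
typeof reg
-->"object"

// how a string is represented, for comparison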
var str = '/hello/ig'
--> undefined
str
--> "/hello/ig"
typeof str
--> "string"

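V. Other regular expression concepts

A big reason regular expressions look intimidating is the number of escape characters and the even larger number of ways they combine.
My own summary:
A regular expression is a pattern built from matching characters. It is searched against a given string to find the parts that fit the pattern.

1. Metacharacters

These are characters with a special meaning in a regular expression; some of them qualify the character that precedes them.

( [ { \ ^ $ | ) ? * + .

To match a metacharacter literally, escape it with a backslash:

var reg = /\[h\]ello/ig
--> undefined
reg
--> /\[h\]ello/gi

A metacharacter does not always have one fixed meaning; the same character means different things in different combinations: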

image

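2. Character classes

Most characters in a regular expression match themselves in the string.
The metacharacters [ ] build a simple class that matches any one character from a set,
for example:

var reg =/[a-z]/ig
or
var reg =/[abc01234]/ig
// [a-z] or [0-9] stands for a single character; - denotes a range

Matching one digit or one letter of either case

var reg =/[0-9a-zA-Z]/g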

3. Negated classes

var reg =/[^abc01234]/ig
// ^ inside the brackets selects any character that is not one of those listed
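
A short sketch of character classes and their negation side by side. The sample string and variable names are illustrative only.

```javascript
const sample = 'Room 42, Block B';

// A class matches exactly one character from the set.
const letters = sample.match(/[a-z]/gi);          // every letter, either case
const digits = sample.match(/[0-9]/g);            // every digit
const alnum = sample.match(/[0-9a-zA-Z]/g);       // letters and digits

// A leading ^ inside the brackets inverts the set.
const notAlnum = sample.match(/[^0-9a-zA-Z]/g);   // everything else

console.log(letters.length); // 10
console.log(digits);         // [ '4', '2' ]
console.log(notAlnum);       // [ ' ', ',', ' ', ' ' ]
```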

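4. Predefined classes

Is there a shorthand for "a digit" or "a non-digit"? Yes: \d and \D are two of the predefined classes.
For example, to match ab followed by a digit and then any character, write /ab\d./

var reg = /ab\d./

Figure: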

image
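
The commonly used predefined classes can be tried out directly. This is a small sketch with made-up sample strings; it covers \d, \D, \w, \s and the dot.

```javascript
// ab, then one digit, then any single character.
const candidates = 'ab1x ab22 abcd';
const abDigitAny = candidates.match(/ab\d./g);
console.log(abDigitAny); // [ 'ab1x', 'ab22' ]

const mixed = 'a1 _\t-';
const digits = mixed.match(/\d/g);      // digits: [0-9]
const nonDigits = mixed.match(/\D/g);   // anything that is not a digit
const wordChars = mixed.match(/\w/g);   // letters, digits and underscore
const whitespace = mixed.match(/\s/g);  // space, tab, line breaks, ...
console.log(digits, wordChars);         // [ '1' ] [ 'a', '1', '_' ]

// The dot matches any character except a line terminator.
const anyChars = 'a\nb'.match(/./g);
console.log(anyChars); // [ 'a', 'b' ]
```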

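5. Boundaries

(1) ^ as negation versus ^ as a boundary

Negation:

// Negation: matches every single character that is not h, e, l or o
var str = 'hello world 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie'
str.match(/[^hello]/g)
--> (47) [" ", "w", "r", "d", " ", "1", "2", "3", "4", "5", "6", "7", "8", " ", "\t", " ", "\r", " ", "j", "i", "r", "n", "g", "u", " ", "\n", " ", "w", "a", "n", "g", "x", "i", "a", "q", "i", "n", " ", "\n", " ", "n", "i", "a", "s", "i", "j", "i"]

Boundary:

// Boundary: matches hello only at the very start of the string
str.match(/^hello/g)
--> ["hello"]

// The pattern still matches the text hello, but only if that hello sits at the beginning of the string

Or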

var str = 'hello world hello 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie'
--> undefined
str.match(/^hello/g)
--> ["hello"]

(2) Matching "ends with XXX"

str.match(/hello$/g)
--> null  // no match

Or

var str = 'hello1 world hello2 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/hello3$/g)
--> ["hello3"]

Or

var str = 'hello1 world hello2 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/^hello\d/g)
--> ["hello1"]
str.match(/hello\d/g)
--> (3) ["hello1", "hello2", "hello3"]

(3) Word boundaries

var str = 'hello1 world hello2 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d\b/g)
--> (3) ["hello1", "hello2", "hello3"]

Or

var str = 'hello1 whello9-Ahello5orld hello2 12-hello8-A345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d\b/g)
--> (4) ["hello1", "hello2", "hello8", "hello3"]  // a word boundary exists when the word has a space on each side, or in the cases below

Or: a - also creates a word boundary

var str = 'hello1 hello9-Ahello5orld hello2 12-hello8-A345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d/g)
// Note: the right-hand \b is omitted here, so the result is:
--> (5) ["hello1", "hello9", "hello2", "hello8", "hello3"]

Or: \t also creates a word boundary

var str = '\thello1 hello9-Ahello5orld hello2 12-hello8-A345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d\b/g)
-->(5) ["hello1", "hello9", "hello2", "hello8", "hello3"]

\r also creates a word boundary

var str = '\rhello1 hello9-Ahello5orld hello2 12-hello8-A345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d\b/g)
--> (5) ["hello1", "hello9", "hello2", "hello8", "hello3"]

var str = '\nhello1 hello9-Ahello5orld hello2 12-hello8-A345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie hello3'
--> undefined
str.match(/\bhello\d\b/g)
--> (5) ["hello1", "hello9", "hello2", "hello8", "hello3"]

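How do we check whether a string contains a given word (for example, whether a class attribute contains a given class name)? Suppose:

var str = 'header3 clearfix active header-fixed'

How do we look for the word header?
❌ Wrong approach

var str = 'header3 clearfix active header-fixed'
-->undefined
str.match(/\bheader\b/g)
--> ["header"]  // matches the header inside the fourth word, header-fixed, because - is a word boundary

✅ Correct approach:
With |, the group matches either a whitespace character or the start (or end) of the string

var str = 'header3 clearfix active header-fixed'
-->undefined
str.match(/(^|\s)header($|\s)/g)
--> null  // nothing matched
var str = 'header3 clearfix active header-fixed header'
--> undefined
str.match(/(^|\s)header($|\s)/g)
--> [" header"]  // only the last word, header, is matched (together with its leading space)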
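
The class-name check above can be wrapped in a small helper. This is a sketch under assumptions: hasClass and its parameters are hypothetical names, and the escaping step is added so that a name containing metacharacters is still matched literally.

```javascript
// Returns true when `name` appears as a whole, whitespace-delimited word.
function hasClass(classList, name) {
  // Escape regex metacharacters so the name is matched literally.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('(^|\\s)' + escaped + '($|\\s)');
  return pattern.test(classList);
}

const classes = 'header3 clearfix active header-fixed';

// \b treats the - in header-fixed as a boundary, so this is a false positive.
const naive = /\bheader\b/.test(classes);

const strict = hasClass(classes, 'header');
const strictHit = hasClass(classes + ' header', 'header');
console.log(naive, strict, strictHit); // true false true
```

In browser code, element.classList.contains(name) does the same job without a regular expression; the helper is only meant to show the (^|\s)...($|\s) idea.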

6. Quantifiers

image

How do we check whether a string is a URL?

var str = 'http://wangxiaoqin.com'
--> undefined
var str2 = 'https://wangxiaoqin.com'
-->undefined
str.match(/https?:\/\/.+/g) // the ? after s means zero or one occurrence (at most once), so s is optional
                          // . is any character, + means one or more times
--> ["http://wangxiaoqin.com"]
str2.match(/https?:\/\/.+/g)
--> ["https://wangxiaoqin.com"]

// When s appears many times, ? no longer matches but * (zero or more) does
var str3 = 'httpssssss://wangxiaoqin.com'
--> undefined
str3.match(/https?:\/\/.+/g)
--> null
str3.match(/https*:\/\/.+/g)
--> ["httpssssss://wangxiaoqin.com"]
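
To compare the quantifiers side by side, the sketch below applies ?, *, + and {n} to the same letter s. The URLs in the list are made-up test inputs.

```javascript
const urls = ['http://a.com', 'https://a.com', 'httpss://a.com', 'htt://a.com'];

// Helper: run one pattern against every sample URL.
const check = (pattern) => urls.map((u) => pattern.test(u));

const zeroOrOne = check(/^https?:\/\//);     // s appears 0 or 1 times
const zeroOrMore = check(/^https*:\/\//);    // s appears 0 or more times
const oneOrMore = check(/^https+:\/\//);     // s appears 1 or more times
const exactlyTwo = check(/^https{2}:\/\//);  // s appears exactly 2 times
const oneToTwo = check(/^https{1,2}:\/\//);  // s appears 1 to 2 times

console.log(zeroOrOne);  // [ true, true, false, false ]
console.log(zeroOrMore); // [ true, true, true, false ]
console.log(oneOrMore);  // [ false, true, true, false ]
console.log(exactlyTwo); // [ false, false, true, false ]
```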

7. Practice

(1) Practice 1: how do we check a URL more precisely?
The fixed prefixes are http://, https:// and //

str.match(/^(https?:)?\/\/.+/)   // the ? after (https?:) makes the whole group optional
--> (2) ["http://wangxiaoqin.com", "http:", index: 0, input: "http://wangxiaoqin.com", groups: undefined]
str2.match(/^(https?:)?\/\/.+/)
--> (2) ["https://wangxiaoqin.com", "https:", index: 0, input: "https://wangxiaoqin.com", groups: undefined]
var str4 = '//wangxiaoqin.com'
--> undefined
str4.match(/^(https?:)?\/\/.+/)
--> (2) ["//wangxiaoqin.com", undefined, index: 0, input: "//wangxiaoqin.com", groups: undefined]

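(2) Practice 2: how do we check that user input is a mobile number?
Suppose:

var str1 ='15011112222'
-->undefined
var str2 ='aaadd15011112222'
--> undefined
var str3 ='150111111222222555555'
--> undefined
var str4 ='150aaadd11112222'
--> undefined

Analysis: a mobile number has exactly 11 digits, begins and ends with a digit, and in China starts with 1

// continuing from the code above
str1.match(/1[3578]\d{9}/g)
--> ["15011112222"]  // ✅ an exact, complete match
str2.match(/1[3578]\d{9}/g)
--> ["15011112222"]   // this also returns something that fits the pattern, but only part of the input matched

How do we fix this? Analysis: nothing may come before or after the number, so the match must start at the digit 1 and must also end on a digit

// continuing from the code above
str2.match(/^1[3578]\d{9}/g)
--> null
str1.match(/^1[3578]\d{9}/g)
-->["15011112222"]   ✅
str3.match(/^1[3578]\d{9}/g)
--> ["15011111122"]   // fits the pattern and starts with 1, but the end of the input is not checked

// $ anchors the match to the end of the string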
str1.match(/^1[3578]\d{9}$/g)
--> ["15011112222"]
str2.match(/^1[3578]\d{9}$/g)
--> null
str3.match(/^1[3578]\d{9}$/g)
--> null
str4.match(/^1[3578]\d{9}$/g)
--> null

Figure:

image
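
The final pattern from this exercise fits naturally into a validation helper. A minimal sketch, assuming the same 1[3578] prefix rule used above; isMobileNumber and MOBILE_PATTERN are hypothetical names.

```javascript
// Anchored at both ends so the whole input must be an 11-digit number.
const MOBILE_PATTERN = /^1[3578]\d{9}$/;

function isMobileNumber(input) {
  return MOBILE_PATTERN.test(input);
}

const checks = [
  isMobileNumber('15011112222'),            // exactly 11 digits
  isMobileNumber('aaadd15011112222'),       // extra text in front
  isMobileNumber('150111111222222555555'),  // too long
  isMobileNumber('150aaadd11112222'),       // letters in the middle
];
console.log(checks); // [ true, false, false, false ]
```

The pattern is written without the g flag on purpose: for a yes/no check there is nothing to iterate over.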

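8. How do we put regular expressions to use?

(1) First, apply regular expressions through the string search methods

var str = 'hello world 12345678 \t \r jirengu \n wangxiaoqin \n nihaoshijie'
--> undefined

Result:

image

(2) Next, recall that the string search method returns the starting index of a piece of text (here the argument passed is a string)

str.search('world') 
--> 6  // indexes start at 0 and the space counts as a character

A regular expression can be used for the search as well

str.match('world')
str.match(/\d/)
// without g only the first match, 1, is returned and the search ends

As shown:

image

// match every digit character
str.match(/\d/g)
-->(8) ["1", "2", "3", "4", "5", "6", "7", "8"]

As shown: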

image

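VI. Greedy and non-greedy modes

1. Greedy mode:

In greedy (the default) mode, the regex engine repeats a quantified match [as many times as possible].
Consider this code first: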

var str = 'a "witch" and her "boom" is one'
--> undefined
str.match(/".*"/g)
--> [""witch" and her "boom""]

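We wanted to match the two strings "witch" and "boom", but the result is, unexpectedly, the single string "witch" and her "boom". This is the effect of greedy mode.
How the regex engine gets there:
The engine starts searching at index 0 of the string. In greedy mode, .* first consumes as much as it can and then backtracks one character at a time until the closing " matches. (A shout-out to teacher 若愚 here; his explanation of how the regex engine works is really good 😂)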

image

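2. Non-greedy mode:

The opposite of greedy mode. It is switched on by placing a ? after a quantifier, which makes the quantifier match as little as possible, as in *?, +? and even ??
In non-greedy mode, the regex engine repeats a quantified match as few times as possible.
Demonstration: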

var str = 'a "witch" and her "boom" is one'
--> undefined
str.match(/".*?"/g)   
--> (2) [""witch"", ""boom""]

Figure:

image
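
The two modes can be compared on the same sentence. The sketch below also shows a third option that is not from the text above: a negated class, which avoids the issue without a lazy quantifier.

```javascript
const sentence = 'a "witch" and her "boom" is one';

// Greedy: .* runs to the last quote it can reach.
const greedy = sentence.match(/".*"/g);

// Non-greedy: .*? stops at the first closing quote.
const lazy = sentence.match(/".*?"/g);

// Alternative (an addition for comparison): forbid quotes inside the match.
const negated = sentence.match(/"[^"]*"/g);

console.log(greedy);  // [ '"witch" and her "boom"' ]
console.log(lazy);    // [ '"witch"', '"boom"' ]
console.log(negated); // [ '"witch"', '"boom"' ]
```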

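VII. Grouping

Grouping lets a quantifier apply to several characters rather than to a single character as in the examples above.
First, hunger{10} matches hunge followed by r repeated 10 times

/hunger{10}/   // hunge, then r 10 times

What if hunger should be treated as a whole? Wrapping it in () does that; this is called grouping

/(hunger){10}/

How does the regex engine handle the alternation character | ?

var str = 'helloworld'
--> undefined
str.match(/hello|sworld/)  // the question: does this mean (hello)|(sworld), or hell(o|s)world?
              // in the second reading the match could begin with hello or hells
-->["hello", index: 0, input: "helloworld", groups: undefined]
// checking shows it is equivalent to
str.match(/(hello)|(sworld)/)
--> (3) ["hello", "hello", undefined, index: 0, input: "helloworld", groups: undefined]
str.match(/hello|sworld/g)
--> ["hello"]

// so to match the whole string helloworld, the alternatives on either side of | must be wrapped in ()
 str.match(/hell(o|s)world/g)
--> ["helloworld"]

So

(hello|world){20}  // 20 repetitions in a row, each of which is either hello or world

What grouping gives you

str
--> "helloworld"
str.match(/(he)l(lo)/)
--> (3) ["hello", "he", "lo", index: 0, input: "helloworld", groups: undefined]
VS
str.match(/hello/)
-->["hello", index: 0, input: "helloworld", groups: undefined]

Grouping in practice (the non-greedy ? is also used so that only the exact URL is captured):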

var str ='<a href="http://wangxiaoqin.com">'
--> undefined
str.match(/href="(https?:\/\/.+?)"/)
--> (2) ["href="http://wangxiaoqin.com"", "http://wangxiaoqin.com", index: 3, input: "<a href="http://wangxiaoqin.com">", groups: undefined]
str.match(/href="(https?:\/\/.+?)"/)[1]
-->"http://wangxiaoqin.com"

Non-greedy mode vs. greedy mode

var str ='<a href="http://wangxiaoqin.com">"hello"'
--> undefined
str.match(/href="(https?:\/\/.+?)"/)
--> (2) ["href="http://wangxiaoqin.com"", "http://wangxiaoqin.com", index: 3, input: "<a href="http://wangxiaoqin.com">"hello"", groups: undefined]
str.match(/href="(https?:\/\/.+?)"/g)
--> ["href="http://wangxiaoqin.com""]

str.match(/href="(https?:\/\/.+)"/)
--> (2) ["href="http://wangxiaoqin.com">"hello"", "http://wangxiaoqin.com">"hello", index: 3, input: "<a href="http://wangxiaoqin.com">"hello"", groups: undefined]
str.match(/href="(https?:\/\/.+)"/g)
--> ["href="http://wangxiaoqin.com">"hello""]
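
Capture groups become more useful once there are several matches to pull out. A minimal sketch, assuming an environment with String.prototype.matchAll (ES2020); the second link, https://example.com/a, is a made-up sample.

```javascript
const html =
  '<a href="http://wangxiaoqin.com">home</a> <a href="https://example.com/a">a</a>';

// matchAll needs the g flag; each result carries its capture groups.
const hrefPattern = /href="(https?:\/\/.+?)"/g;
const links = Array.from(html.matchAll(hrefPattern), (m) => m[1]);
console.log(links); // [ 'http://wangxiaoqin.com', 'https://example.com/a' ]

// Groups can also be referenced in a replacement string as $1, $2, ...
const swapped = 'helloworld'.replace(/(hello)(world)/, '$2$1');
console.log(swapped); // worldhello
```

With str.match and the g flag only the full matches come back, which is why the groups disappear in the g examples above; matchAll keeps both.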

VIII. Lookahead

A lookahead (?=...) requires the text that follows to match, without including that text in the result.

var reg = /good(?=Byron)/

reg.exec('goodByron123'); //['good']
reg.exec('goodCasper123'); //null
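
A short sketch of the lookahead in action. The negative form (?!...) is included for comparison as an addition; it is standard JavaScript syntax but is not covered in the text above.

```javascript
const positive = /good(?=Byron)/;   // good, only when Byron follows
const negative = /good(?!Byron)/;   // good, only when Byron does NOT follow

const hit = positive.exec('goodByron123');
const miss = positive.exec('goodCasper123');
console.log(hit[0], miss); // good null

const negHit = negative.test('goodCasper123');
const negMiss = negative.test('goodByron123');
console.log(negHit, negMiss); // true false

// The lookahead text is checked but not consumed, so it survives a replace.
const replaced = 'goodByron123'.replace(positive, 'bad');
console.log(replaced); // badByron123
```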

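IX. Common regular expression methods

1. reg.test(str)

Tests whether the pattern occurs in the string argument; returns true if it does, false otherwise

var reg = /\d+\.\d{1,2}$/;  // no g flag: with g, test() resumes from reg.lastIndex on each call

reg.test('123.45'); //true
reg.test('0.2'); //true
reg.test('a.34'); //false
reg.test('34.5678'); //false

Practical use:
Testing whether a mobile number is valid

var reg = /^1[3578]\d{9}$/   // the mobile number pattern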
--> undefined
reg.test('18320158956')
--> true
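
One pitfall is worth showing when test() is combined with the g flag: a global regex remembers where its last match ended in lastIndex, so repeated calls on the same object can alternate between true and false. A small sketch with illustrative variable names:

```javascript
const globalReg = /^1[3578]\d{9}$/g;
const number = '18320158956';

const first = globalReg.test(number);     // matches; lastIndex moves to 11
const indexAfterFirst = globalReg.lastIndex;
const second = globalReg.test(number);    // search resumes at 11 and fails
const indexAfterSecond = globalReg.lastIndex; // a failed match resets it to 0
const third = globalReg.test(number);     // starts from 0 again

console.log(first, second, third);              // true false true
console.log(indexAfterFirst, indexAfterSecond); // 11 0

// Without g, lastIndex is ignored and the result is stable.
const plainReg = /^1[3578]\d{9}$/;
const stable = [plainReg.test(number), plainReg.test(number)];
console.log(stable); // [ true, true ]
```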

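2. reg.exec(str)

Runs the pattern against the string. If exec() finds matching text it returns a result array, otherwise it returns null. With the g flag, each call continues from where the previous match ended.
For example:

var str ='123 456 789'
var reg =/\d{3}/   // without g, every call returns the first match
reg.exec(str)
--> ["123", index: 0, input: "123 456 789", groups: undefined]

/* global matching with g */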
var str ='123 456 789'
var reg =/\d{3}/g
reg.exec(str)
--> ["123", index: 0, input: "123 456 789", groups: undefined]
reg.exec(str)
--> ["456", index: 4, input: "123 456 789", groups: undefined]
reg.exec(str)
--> ["789", index: 8, input: "123 456 789", groups: undefined]

Practical use:
Finding every email address, mobile number and so on in a piece of text

// code:
  var str ='123 456 789'
  var reg =/\d{3}/g
  var result
while(result=reg.exec(str)){
         console.log(result[0])
}

--> 123
    456
    789
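
The exec loop can be packaged so that it collects every match together with its position. This is a sketch; findAll is a hypothetical helper name, and the sample sentence is made up.

```javascript
// Collects every match of `pattern` in `text` as { text, index } records.
function findAll(pattern, text) {
  // Work on a copy with g set, so the caller's lastIndex is never touched.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const reg = new RegExp(pattern.source, flags);
  const found = [];
  let result;
  while ((result = reg.exec(text)) !== null) {
    found.push({ text: result[0], index: result.index });
    // An empty match does not advance lastIndex; step forward to avoid looping forever.
    if (result[0] === '') reg.lastIndex++;
  }
  return found;
}

const triples = findAll(/\d{3}/, '123 456 789');
console.log(triples.map((m) => m.index)); // [ 0, 4, 8 ]

const phones = findAll(/1[3578]\d{9}/, 'call 15011112222 or 18320158956');
console.log(phones.map((m) => m.text)); // [ '15011112222', '18320158956' ]
```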

3. str.split(reg)

The split method divides a string into an array of substrings

var str = 'h   e  llo wan g xi ao qi   n'
--> undefined
// before: splitting on a string
str.split('')
--> (29) ["h", " ", " ", " ", "e", " ", " ", "l", "l", "o", " ", "w", "a", "n", " ", "g", " ", "x", "i", " ", "a", "o", " ", "q", "i", " ", " ", " ", "n"]
str.split(' ')
--> (14) ["h", "", "", "e", "", "llo", "wan", "g", "xi", "ao", "qi", "", "", "n"]
// now: splitting on a regular expression
str.split(/\s/)
--> (14) ["h", "", "", "e", "", "llo", "wan", "g", "xi", "ao", "qi", "", "", "n"]
str.split(/\s*/)
--> (16) ["h", "e", "l", "l", "o", "w", "a", "n", "g", "x", "i", "a", "o", "q", "i", "n"]
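
Between \s (empty strings for every extra space) and \s* (every character split apart) sits \s+, which treats a run of whitespace as one separator. A brief sketch; the second and third sample strings are made up.

```javascript
const spaced = 'h   e  llo wan g xi ao qi   n';

// One or more whitespace characters count as a single separator.
const words = spaced.split(/\s+/);
console.log(words); // [ 'h', 'e', 'llo', 'wan', 'g', 'xi', 'ao', 'qi', 'n' ]

// Trim first, otherwise leading or trailing spaces produce empty strings.
const trimmed = '  a  b '.trim().split(/\s+/);
console.log(trimmed); // [ 'a', 'b' ]

// Several different separators, with optional spaces around them.
const fields = 'a, b;c ,d'.split(/\s*[,;]\s*/);
console.log(fields); // [ 'a', 'b', 'c', 'd' ]
```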
