Python | One Way to Understand () Groups in Regular Expressions
2019-02-24
Quora文选
Let's start with an example:
import re

phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?              # area code: findall index 1, group(2)
    (\s|-|\.)?                      # separator
    (\d{3})                         # first 3 digits: findall index 3, group(4)
    (\s|-|\.)                       # separator
    (\d{4})                         # last 4 digits: findall index 5, group(6)
    (\s*(ext|x|ext\.)\s*(\d{2,5}))? # extension digits: findall index 8, group(9)
    )''', re.VERBOSE)
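The labels 1 through 5 are easy to follow, but many people don't see why the extension is labeled 8. The labels match the 0-based positions in the tuple that findall() returns for each match. The corresponding group() numbers are one higher, because the outermost parentheses are group(1), so the extension digits are group(9).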
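One way to check this numbering is to run the pattern against a sample string and compare the findall() tuple with the group() numbers. The sketch below is only an illustration: the name phone_regex and the sample text are made up for this example, and the pattern escapes the dot in ext\. so that it matches a literal period.

```python
import re

phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?              # area code
    (\s|-|\.)?                      # separator
    (\d{3})                         # first 3 digits
    (\s|-|\.)                       # separator
    (\d{4})                         # last 4 digits
    (\s*(ext|x|ext\.)\s*(\d{2,5}))? # extension
    )''', re.VERBOSE)

text = '415-555-1234 ext 567'  # made-up sample number

# findall() returns one tuple per match; tuple index 0 holds group 1.
groups = phone_regex.findall(text)[0]
m = phone_regex.search(text)

for index, value in enumerate(groups):
    # Tuple index n corresponds to m.group(n + 1).
    print(index, repr(value), repr(m.group(index + 1)))
```

Here groups[8] and m.group(9) refer to the same text, the extension digits, because the outer parentheses and the two nested extension groups are all counted.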
The following example shows why:
>>> import re
>>> patt = r'(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-((\d{2})-(\d{2}))?'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(1)
'11'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(5)
'55'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(6)
'66'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(7)
'77-88'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(8)
'77'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(9)
'88'
>>> re.match(patt, '11-22-33-44-55-66-77-88').group(10)
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
IndexError: no such group
Note:
- group(7) is '77-88'
- group(8) is '77'
- group(9) is '88'
- group(10) raises IndexError: no such group
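Groups are numbered by their opening parenthesis, counting from left to right: an outer pair of parentheses gets its number first, and the groups nested inside it are numbered next, in order.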
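The numbering rule can be checked by printing each group's number together with the span it matched. This is a minimal sketch that reuses the pattern from the example above; the variable names are chosen only for illustration.

```python
import re

patt = r'(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-((\d{2})-(\d{2}))?'
m = re.match(patt, '11-22-33-44-55-66-77-88')

# m.re.groups is the number of capturing groups in the compiled pattern.
group_count = m.re.groups

for n in range(1, group_count + 1):
    # span(n) gives the (start, end) offsets of what group n matched.
    print(n, m.span(n), m.group(n))

# Start offsets of every group, in group-number order.
starts = [m.start(n) for n in range(1, group_count + 1)]
```

In this pattern, group 7 and group 8 start at the same offset: the outer parentheses open first and take number 7, and the first nested pair takes number 8.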
Here is an even more direct illustration:
>>> m = re.match('(a(b)(c))', 'abc')
>>> m.groups()
('abc', 'b', 'c')
>>> m.group(1)
'abc'
>>> m.group(2)
'b'
>>> m.group(3)
'c'
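The same counting can be done by hand on the pattern text. The helper below, capture_group_sources, is a hypothetical function written for this illustration and is not part of the re module. It assumes the pattern contains only plain (...) groups, with no escaped parentheses, character classes, or (?...) constructs.

```python
import re

def capture_group_sources(pattern):
    """Return the text of each capturing group, in group-number order.

    Simplified sketch: assumes only plain (...) groups, with no escaped
    parentheses, character classes, or (?...) constructs.
    """
    stack = []       # (group number, offset) for each '(' still open
    sources = {}     # group number -> sub-pattern text
    next_number = 1
    for i, ch in enumerate(pattern):
        if ch == '(':
            # A group gets its number when its opening parenthesis is seen.
            stack.append((next_number, i))
            next_number += 1
        elif ch == ')':
            number, start = stack.pop()
            sources[number] = pattern[start:i + 1]
    return [sources[n] for n in sorted(sources)]

pattern = '(a(b)(c))'
sources = capture_group_sources(pattern)
m = re.match(pattern, 'abc')

for number, source in enumerate(sources, start=1):
    print(number, source, m.group(number))
```

For real patterns, the compiled pattern's groups attribute reports the group count directly. The helper only makes the left-to-right counting of opening parentheses visible.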