Greedy and Non-Greedy Modes in Regular Expressions

2021-01-25, by 星光下的胖子

I. What are greedy and non-greedy modes?

Definition

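Greedy mode: a quantifier matches as many characters as it can while still letting the whole pattern match.
Non-greedy mode: a quantifier matches as few characters as it can, stopping as soon as the rest of the pattern can match.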

How to tell them apart?

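Quantifiers are greedy by default.
Appending a question mark (?) to a quantifier makes it non-greedy. There are four kinds of quantifier:
- {m,n}: matches m to n repetitions, inclusive. Write it without spaces inside the braces. Variants:
  - {N} --> exactly N times
  - {M,N} --> M to N times
  - {M,} --> at least M times
  - {,N} --> at most N times (Python's re accepts this form; the java.util.regex documentation lists only {n}, {n,} and {n,m}, so use {0,N} in Java)
- *: matches any number of repetitions, including zero.
- +: matches one or more.
- ?: matches zero or one.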
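
To make the list concrete, here is a minimal Python sketch that runs each quantifier with and without the trailing ? against the same input. The sample string "aaaaa" and the names `pairs` and `results` are assumptions made for illustration; they are not from the original article.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "aaaaa"

# Each pair is (greedy quantifier, the same quantifier with ? appended).
pairs = [
    (r"a{2,4}", r"a{2,4}?"),
    (r"a*", r"a*?"),
    (r"a+", r"a+?"),
    (r"a?", r"a??"),
]

results = {}
for greedy, lazy in pairs:
    # re.match anchors at the start of the string, so both forms begin at index 0.
    results[greedy] = re.match(greedy, text).group(0)
    results[lazy] = re.match(lazy, text).group(0)
    print(greedy, repr(results[greedy]), "|", lazy, repr(results[lazy]))
```

The greedy forms take the upper end of their range ("aaaa" for a{2,4}, "aaaaa" for a* and a+), while the non-greedy forms take the lower end ("aa" for a{2,4}?, "a" for a+?). a*? and a?? match the empty string here, because nothing after the quantifier in the pattern forces it to consume more.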

Simple example

String str = "abcaxc";
String p1 = "ab.*c";   // greedy: matches "abcaxc"
String p2 = "ab.*?c";  // ? after the quantifier * makes it non-greedy: matches "abc"
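
The two patterns can be checked with a short Python sketch. It also prints the match spans to show that both matches start at the same position and differ only in where they end. The last lines use a hypothetical extra input, "aac", which is not from the original article, to show that a non-greedy quantifier still starts at the leftmost possible position and does not search for the shortest match anywhere in the string.

```python
import re

text = "abcaxc"  # the string from the simple example

greedy = re.search(r"ab.*c", text)   # .* runs to the end, then backs off to the last "c"
lazy = re.search(r"ab.*?c", text)    # .*? grows one character at a time until a "c" follows

print(greedy.group(0), greedy.span())  # abcaxc (0, 6)
print(lazy.group(0), lazy.span())      # abc (0, 3)

# Hypothetical extra input: the match begins at index 0, so the result is "aac",
# not the shorter "ac" that begins one character later.
leftmost = re.search(r"a.*?c", "aac")
print(leftmost.group(0))               # aac
```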

II. Program examples

1) Java code

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegularExpression {
    public static void main(String[] args) {
        String text = "(content:\"rcpt to root\";pcre:\"word\";)";
        String rule1 = "content:\".+\""; // greedy
        String rule2 = "content:\".+?\""; // non-greedy

        System.out.println("Text: " + text);

        System.out.println("Greedy: " + rule1);
        Pattern p1 = Pattern.compile(rule1);
        Matcher m1 = p1.matcher(text);
        while (m1.find()) {
            System.out.println("Match: " + m1.group(0));
        }

        System.out.println("Non-greedy: " + rule2);
        Pattern p2 = Pattern.compile(rule2);
        Matcher m2 = p2.matcher(text);
        while (m2.find()) {
            System.out.println("Match: " + m2.group(0));
        }
    }
}

Output:

Text: (content:"rcpt to root";pcre:"word";)
Greedy: content:".+"
Match: content:"rcpt to root";pcre:"word"
Non-greedy: content:".+?"
Match: content:"rcpt to root"

2) Python code

# Import the re module
import re

text = '<table><td><th>greedy</th><th>greedy</th></td></table>greedy'
print("Text:", text)
print("Greedy:")
print(re.findall(r"<.*>", text))
print("Non-greedy:")
print(re.findall(r"<.*?>", text))

Output:

Text: <table><td><th>greedy</th><th>greedy</th></td></table>greedy
Greedy:
['<table><td><th>greedy</th><th>greedy</th></td></table>']
Non-greedy:
['<table>', '<td>', '<th>', '</th>', '<th>', '</th>', '</td>', '</table>']
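
A common use of the non-greedy form is pulling out the text between paired delimiters. The sketch below is one possible illustration, not part of the original article: the HTML fragment and the names `lazy_cells` and `greedy_cells` are assumptions. It wraps the quantifier in a capturing group so that re.findall returns only the text inside each <th> element.

```python
import re

# Hypothetical fragment, similar in shape to the one used above.
html = "<table><td><th>first</th><th>second</th></td></table>"

# Non-greedy: each match stops at the nearest closing </th>.
lazy_cells = re.findall(r"<th>(.*?)</th>", html)

# Greedy: the single match runs on to the last closing </th>.
greedy_cells = re.findall(r"<th>(.*)</th>", html)

print(lazy_cells)    # ['first', 'second']
print(greedy_cells)  # ['first</th><th>second']
```

With the greedy group, the two cells collapse into one result that still contains markup, which is rarely what is wanted when extracting fields.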
