Java Regular Expressions: String Matching

2023-10-25  Tinyspot

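1. Regex characters

1.1 Escape characters

1.2 Character matching

Character    Description
.            Matches any single character (except \n)
\s           Matches any whitespace character, including space, tab, and form feed. Equivalent to [ \t\n\x0B\f\r]
?            {0,1}, optional
+            {1,}, one or more
*            {0,}, zero or more
[...]        Matches any single character in the character class
[^...]       Negated character class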
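The entries in the table can be checked with a few whole-string matches. The sketch below is only an illustration: the class name RegexCharDemo and the helper full are made up for this example, and Pattern.matches succeeds only when the entire input matches.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class RegexCharDemo {
    // True only when the entire input matches the regex.
    static boolean full(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(full(".", "a"));      // true: any single character
        System.out.println(full(".", "\n"));     // false: . skips \n by default
        System.out.println(full("\\s", "\t"));   // true: tab is whitespace
        System.out.println(full("ab?", "a"));    // true: b is optional
        System.out.println(full("ab+", "a"));    // false: at least one b required
        System.out.println(full("ab*", "abbb")); // true: zero or more b
        System.out.println(full("[abc]", "b"));  // true: b is in the class
        System.out.println(full("[^abc]", "b")); // false: b is excluded
    }
}
```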

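2. Quantifier matching

2.1 Matching any character

Note: a bare .* also matches the empty string, so find() reports an extra empty match at the end of the input.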

@Test
public void anyChar() {
    String input = "3.141592653";

    String regex = ".*";
    Matcher matcher = Pattern.compile(regex).matcher(input);

    while (matcher.find()) {
        System.out.println(matcher.group());
    }
}

Changing the regex to .+ avoids the empty match.
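To see the difference between the two quantifiers, the matches can be collected into a list. This is a minimal sketch; AnyCharDemo and findAll are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class AnyCharDemo {
    // Collects every match that find() reports, including empty ones.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "3.141592653";
        // Two matches: the whole input, then an empty match at the end.
        System.out.println(findAll(".*", input).size()); // 2
        // One match: + needs at least one character.
        System.out.println(findAll(".+", input).size()); // 1
    }
}
```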

2.2 Matching a literal dot

@Test
public void dotMatch() {
    String data = "3.141592653";
    String regex = "\\.";
    String result = data.replaceAll(regex, "-");
    System.out.println(result);
}
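The escape matters because an unescaped dot matches every character. The sketch below compares it with a few ways of treating the dot literally; LiteralDotDemo and its method names are assumptions for illustration, while Pattern.quote and String.replace are standard JDK methods.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class LiteralDotDemo {
    // Unescaped dot: every character is replaced.
    static String unescaped(String s) {
        return s.replaceAll(".", "-");
    }

    // Escaped dot, as in the example above.
    static String escaped(String s) {
        return s.replaceAll("\\.", "-");
    }

    // Pattern.quote turns the whole argument into a literal pattern.
    static String quoted(String s) {
        return s.replaceAll(Pattern.quote("."), "-");
    }

    // String.replace does not use regex at all.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(unescaped("3.14")); // ----
        System.out.println(escaped("3.14"));   // 3-14
        System.out.println(quoted("3.14"));    // 3-14
        System.out.println(literal("3.14"));   // 3-14
    }
}
```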

2.3 The ? quantifier: optional

@Test
public void stringMatch() {
    // Optional country code (+86 or 0086), optional hyphen, then 11 digits
    String regex = "(\\+86|0086)?-?\\d{11}";
    boolean phone = "+86-18823238789".matches(regex);   // true
    boolean phone2 = "0086-18823238789".matches(regex); // true
    boolean phone3 = "18823238789".matches(regex);      // true
}
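Because the prefix and the hyphen are optional independently, the regex above also accepts a leading hyphen with no country code. The sketch below contrasts it with a stricter variant that allows the hyphen only after a prefix; the STRICT pattern and the name PhoneDemo are assumptions for illustration, not something the example above requires.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class PhoneDemo {
    // Regex from the example: prefix and hyphen are independently optional.
    static final Pattern LOOSE = Pattern.compile("(\\+86|0086)?-?\\d{11}");
    // Assumed stricter variant: the hyphen may only follow a prefix.
    static final Pattern STRICT = Pattern.compile("((\\+86|0086)-?)?\\d{11}");

    static boolean loose(String s) {
        return LOOSE.matcher(s).matches();
    }

    static boolean strict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(loose("-18823238789"));      // true
        System.out.println(strict("-18823238789"));     // false
        System.out.println(strict("+86-18823238789"));  // true
        System.out.println(strict("18823238789"));      // true
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the regex on every String.matches call.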

3. String matching

String.matches(String regex) delegates to Pattern.matches, so the regex must match the entire string:

public final class String {
    public boolean matches(String regex) {
      return Pattern.matches(regex, this);
    }
}
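Since Pattern.matches tests the entire input, it behaves differently from Matcher.find, which looks for a match anywhere in the input. A minimal sketch of the difference follows; MatchVsFindDemo and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class MatchVsFindDemo {
    // Whole-string match, same semantics as String.matches.
    static boolean whole(String regex, String input) {
        return input.matches(regex);
    }

    // Substring match: true if the regex matches anywhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("\\d+", "abc123"));    // false: letters are not digits
        System.out.println(anywhere("\\d+", "abc123")); // true: "123" is found
        System.out.println(whole(".*\\d+", "abc123"));  // true: .* absorbs the letters
    }
}
```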

3.1 String.matches(String regex)

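@Test
public void stringMatch2() {
    boolean words = "abc".matches("...");   // true: exactly three characters
    boolean words2 = "abc".matches(".{3}"); // true: same meaning as "..."
    // true; note that this pattern covers three dot-separated groups only
    boolean ip = "192.168.10".matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}");
    boolean mail = "Tinyspot@163.com".matches("\\w+@\\w+\\.(com|cn)"); // true
}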
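The ip pattern above accepts three groups of digits. For a dotted-quad address, the repeated group can be folded into a quantifier. The sketch below is an assumption about what a four-group check might look like; Ipv4Demo and looksLikeIpv4 are hypothetical names, and the pattern checks only the shape, not the 0 to 255 range.

```java
// Hypothetical demo class, not part of the original article.
class Ipv4Demo {
    // One group of 1 to 3 digits, followed by exactly three ".digits" groups.
    static boolean looksLikeIpv4(String s) {
        return s.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIpv4("192.168.1.10")); // true
        System.out.println(looksLikeIpv4("192.168.10"));   // false: only three groups
        System.out.println(looksLikeIpv4("999.1.1.1"));    // true: range is not checked
    }
}
```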

4. String replacement

4.1 replaceAll(regex, str)

Example 1: remove all digits from the text

@Test
public void test() {
    String str = "aa12bb44ee";
    // Each run of digits is replaced with the empty string: "aabbee"
    String result = str.replaceAll("\\d+", "");
}

4.2 Replacement with the ? and * quantifiers

@Test
public void test() {
    String str = "123a45bcd";
    String result = str.replaceAll("\\d?", "-");
    String result2 = str.replaceAll("\\d*", "-");

    // ----a---b-c-d-
    // --a--b-c-d-
}

Output

# result
----a---b-c-d-
# result2
--a--b-c-d-

Note: the empty string is matched as well, and each empty match is replaced with -
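The empty matches become visible when the start and end offsets of each match are listed. The following sketch does that for \d*; EmptyMatchDemo and spans are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class EmptyMatchDemo {
    // Lists each match as "start-end"; equal offsets mean an empty match.
    static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        // Seven matches, five of them empty, hence seven dashes in result2.
        System.out.println(spans("\\d*", "123a45bcd"));
        // [0-3, 3-3, 4-6, 6-6, 7-7, 8-8, 9-9]

        // \d+ cannot match the empty string, so only digit runs are replaced.
        System.out.println("123a45bcd".replaceAll("\\d+", "-")); // -a-bcd
    }
}
```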

4.3 Group replacement with $

Example: wrap numbers of 10 or more digits in quotes so that they become strings

@Test
public void groupReplace() {
    /**
     * {
     *     "1001": [
     *         "id": 1234567890123,
     *         "itemId": 123456789
     *     ],
     *     "1002": [
     *         "id": 12345,
     *         "itemId": 1234567890123
     *     ]
     * }
     */
    String jsonStr = "{\"1001\":[\"id\":1234567890123,\"itemId\":123456789],\"1002\":[\"id\":12345,\"itemId\":1234567890123]}";

    String result = jsonStr.replaceAll("(\\d{10,})", "\"$1\"");
    System.out.println(result);
}

Result after replacement:

{
    "1001": [
        "id": "1234567890123",
        "itemId": 123456789
    ],
    "1002": [
        "id": 12345,
        "itemId": "1234567890123"
    ]
}
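The same replacement can be written with a named group, which Java references as ${name} in the replacement string. This is a minimal sketch on a shortened input; GroupReplaceDemo, quoteLongNumbers, and the group name num are assumptions for illustration. The pattern is purely textual, so a run of 10 or more digits inside an already quoted value would be wrapped again.

```java
// Hypothetical demo class, not part of the original article.
class GroupReplaceDemo {
    // Wraps every run of 10 or more digits in double quotes.
    static String quoteLongNumbers(String json) {
        return json.replaceAll("(?<num>\\d{10,})", "\"${num}\"");
    }

    public static void main(String[] args) {
        String json = "{\"id\":1234567890123,\"itemId\":123456789}";
        System.out.println(quoteLongNumbers(json));
        // {"id":"1234567890123","itemId":123456789}
    }
}
```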

5. Miscellaneous

5.1 Start anchor (^) and end anchor ($)

Example: strip leading and trailing zeros

@Test
public void test() {
    String data = "00230045000";
    String start = data.replaceAll("^(0+)", "");
    String end = data.replaceAll("(0+)$", "");
    System.out.println(start + "; " + end);

    String result = data.replaceAll("^0+(.*?)0+$", "$1");
    System.out.println(result);
}

Printed output

230045000; 00230045
230045
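The combined pattern ^0+(.*?)0+$ matches only when the input has zeros at both ends. An alternation of the two anchored patterns also handles inputs with zeros on one side only. The sketch below compares the two; TrimZerosDemo and its method names are hypothetical.

```java
// Hypothetical demo class, not part of the original article.
class TrimZerosDemo {
    // Pattern from the example: needs zeros at the start AND at the end.
    static String bothEnds(String s) {
        return s.replaceAll("^0+(.*?)0+$", "$1");
    }

    // Alternation: removes leading zeros, trailing zeros, or both.
    static String eitherEnd(String s) {
        return s.replaceAll("^0+|0+$", "");
    }

    public static void main(String[] args) {
        System.out.println(bothEnds("00230045000"));  // 230045
        System.out.println(eitherEnd("00230045000")); // 230045
        System.out.println(bothEnds("00123"));        // 00123 (no trailing zeros, no match)
        System.out.println(eitherEnd("00123"));       // 123
    }
}
```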

5.2 Character boundaries

5.3 Blank lines

" \n": a space followed by a newline, i.e. a blank line

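@Test
public void whiteLines() {
    // \s* also consumes the newline, so the whole input is replaced: "----"
    String lines = " \n".replaceAll("^\\s*$", "----");

    // Whitespace except \n: only the space is replaced, giving "----\n"
    String lines2 = " \n".replaceAll("^[\\s&&[^\\n]]*$", "----");

    // true: optional non-newline whitespace followed by a newline
    boolean matches = " \n".matches("^[\\s&&[^\\n]]*\\n$");
}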
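In a multi-line text, ^ matches only at the start of the input unless the MULTILINE flag is set. The sketch below uses the inline flag (?m) to drop lines that contain only spaces or tabs; BlankLineDemo and dropBlankLines are hypothetical names, and the sketch assumes \n line endings.

```java
// Hypothetical demo class, not part of the original article.
class BlankLineDemo {
    // (?m) makes ^ match at the start of every line, not only the input.
    // Each line made of spaces/tabs plus its newline is removed.
    static String dropBlankLines(String text) {
        return text.replaceAll("(?m)^[ \\t]*\\n", "");
    }

    public static void main(String[] args) {
        String text = "a\n \nb\n";
        // The middle line holds only a space, so it is dropped.
        System.out.print(dropBlankLines(text)); // prints "a" and "b" on two lines
    }
}
```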