The Difference Between replace and replaceAll in Java

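replace and replaceAll are two methods of the String class for replacing characters or strings. Going by the names alone, it is easy to assume that replace substitutes a single match while replaceAll substitutes every match. That is not the case at all: both replace every match, and the real difference is whether the target is interpreted as a regular expression.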

1. Overview

2. Related classes: String, Pattern, Matcher

3. Related methods

3.1 Matcher

3.2 Pattern

3.3 String

4. Conclusion


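1. Overview

The String class provides four methods for replacing characters or strings: the two overloads of replace, plus replaceAll and replaceFirst.

  • replace(char): replaces every match; the parameters are of type char; calls no Pattern or Matcher method.
  • replace(CharSequence): replaces every match; the parameters are implementations of the CharSequence interface (such as String); no regular-expression matching. In the JDK source quoted in this post it calls Pattern (in literal, non-regex mode) and Matcher.replaceAll; other JDK versions may implement it differently.
  • replaceAll: replaces every match; the parameters are of type String; the target is a regular expression; calls Pattern (in regex mode) and Matcher.replaceAll.
  • replaceFirst: replaces only the first match; the parameters are of type String; the target is a regular expression; calls Pattern (in regex mode) and Matcher.replaceFirst.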
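The four methods listed above can be compared side by side on one input. This is only a minimal sketch: the class name ReplaceOverview, the helper compare, and the sample string are made up for illustration.

```java
public class ReplaceOverview {
    // Applies all four String replacement methods to the same input.
    static String[] compare(String s) {
        return new String[] {
            s.replace('a', 'x'),        // char overload: every 'a' is replaced
            s.replace(".", "-"),        // CharSequence overload: "." is a literal dot
            s.replaceAll(".", "-"),     // regex: "." matches any character
            s.replaceFirst("\\.", "-")  // regex: only the first literal dot
        };
    }

    public static void main(String[] args) {
        for (String r : compare("a.b.a.b")) {
            System.out.println(r);
        }
        // prints:
        // x.b.x.b
        // a-b-a-b
        // -------
        // a-b.a.b
    }
}
```

Note how replace and replaceAll both touch every match; only the interpretation of "." differs.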

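2. Related classes: String, Pattern, Matcher

  • The String class:
public final class String implements java.io.Serializable, Comparable<String>, CharSequence

           The class for strings and their operations: The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.

  • Pattern && Matcher

For details, see the following two blog posts:

The concept of capturing groups in regular expressions: https://blog.csdn.net/kofandlizi/article/details/7323863

A general introduction to Pattern and Matcher: https://blog.csdn.net/yin380697242/article/details/52049999

In short, the Pattern class compiles a regular expression into a matching pattern, and the Matcher class uses the pattern held by a Pattern instance to match it against an input character sequence.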
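The division of labour between the two classes can be sketched as follows. The class name PatternMatcherSketch and the helper findAll are hypothetical; only Pattern.compile, Pattern.matcher, Matcher.find, and Matcher.group are real JDK calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherSketch {
    // Collects every substring of input that matches regex.
    static List<String> findAll(String regex, String input) {
        Pattern p = Pattern.compile(regex); // compile the regex once
        Matcher m = p.matcher(input);       // bind the pattern to one input
        List<String> hits = new ArrayList<>();
        while (m.find()) {                  // advance to the next match
            hits.add(m.group());            // text of the current match
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1bb22ccc333")); // [1, 22, 333]
    }
}
```

A Pattern is immutable and can be shared, while each Matcher carries the state of one matching run over one input.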

  • Call graph of the related methods of String, Pattern, and Matcher

[Figure 1: call graph of the String, Pattern, and Matcher methods]

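3. Related methods

3.1 Matcher

For details, see this blog post: https://www.cnblogs.com/SQP51312/p/6134324.html

  • Matcher(Pattern parent, CharSequence text);

The Matcher constructor. It has package-private access, so code outside the package cannot create a Matcher instance directly; instances are obtained through Pattern.matcher(input) instead.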

/**
 * All matchers have the state used by Pattern during a match.
 */
Matcher(Pattern parent, CharSequence text) {
    this.parentPattern = parent;
    this.text = text;

    // Allocate state storage
    int parentGroupCount = Math.max(parent.capturingGroupCount, 10);
    // groups is the storage used by groups: it holds the first and last
    // indices of each capturing group for the current match.
    groups = new int[parentGroupCount * 2];
    locals = new int[parent.localCount];

    // Put fields into initial states
    reset();
}
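  • public Matcher appendReplacement(StringBuffer sb, String replacement);

Appends to the StringBuffer the input text between the end of the previous match and the start of the current match, followed by the replacement for the current match (with any group references expanded). It returns this matcher, not a string.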

/**
 * Implements a non-terminal append-and-replace step.
 *
 * This method performs the following actions:
 *
 *   1. It reads characters from the input sequence, starting at the
 *      append position, and appends them to the given string buffer. It
 *      stops after reading the last character preceding the previous match,
 *      that is, the character at index {@link #start()} - 1.
 *
 *   2. It appends the given replacement string to the string buffer.
 *
 *   3. It sets the append position of this matcher to the index of
 *      the last character matched, plus one, that is, to {@link #end()}.
 *
 * The replacement string may contain references to subsequences
 * captured during the previous match: Each occurrence of
 * ${name} or $g
 * will be replaced by the result of evaluating the corresponding
 * {@link #group(String) group(name)} or {@link #group(int) group(g)}
 * respectively. For $g,
 * the first number after the $ is always treated as part of
 * the group reference. Subsequent numbers are incorporated into g if
 * they would form a legal group reference. Only the numerals '0'
 * through '9' are considered as potential components of the group
 * reference. If the second group matched the string "foo", for
 * example, then passing the replacement string "$2bar" would
 * cause "foobar" to be appended to the string buffer. A dollar
 * sign ($) may be included as a literal in the replacement
 * string by preceding it with a backslash (\$).
 *
 * Note that backslashes (\) and dollar signs ($) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * This method is intended to be used in a loop together with the
 * {@link #appendTail appendTail} and {@link #find find} methods. The
 * following code, for example, writes one dog two dogs in the
 * yard to the standard-output stream:
 *
 *     Pattern p = Pattern.compile("cat");
 *     Matcher m = p.matcher("one cat two cats in the yard");
 *     StringBuffer sb = new StringBuffer();
 *     while (m.find()) {
 *         m.appendReplacement(sb, "dog");
 *     }
 *     m.appendTail(sb);
 *     System.out.println(sb.toString());
 *
 * @param sb
 *        The target string buffer
 *
 * @param replacement
 *        The replacement string
 *
 * @return This matcher
 *
 * @throws IllegalStateException
 *         If no match has yet been attempted,
 *         or if the previous match operation failed
 *
 * @throws IllegalArgumentException
 *         If the replacement string refers to a named-capturing
 *         group that does not exist in the pattern
 *
 * @throws IndexOutOfBoundsException
 *         If the replacement string refers to a capturing group
 *         that does not exist in the pattern
 */
public Matcher appendReplacement(StringBuffer sb, String replacement) {

    // If no match, return error
    if (first < 0)
        throw new IllegalStateException("No match available");

    // Process substitution string to replace group references with groups
    int cursor = 0;
    StringBuilder result = new StringBuilder();

    while (cursor < replacement.length()) {  // 1start
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {  // 2start
            cursor++;
            nextChar = replacement.charAt(cursor);
            result.append(nextChar);
            cursor++;
        } else if (nextChar == '$') {  // 2end,3start
            // Skip past $
            cursor++;
            // A StringIndexOutOfBoundsException is thrown if
            // this "$" is the last character in replacement
            // string in current implementation, a IAE might be
            // more appropriate.
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {  // 4start
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {  // 5start
                    nextChar = replacement.charAt(cursor);
                    if (ASCII.isLower(nextChar) || ASCII.isUpper(nextChar) || ASCII.isDigit(nextChar)) {  // 6start
                        gsb.append(nextChar);
                        cursor++;
                    } else {  // 6end,7start
                        break;
                    }  // 7end
                }  // 5end
                if (gsb.length() == 0)
                    throw new IllegalArgumentException("named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException("named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (ASCII.isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException("capturing group name {" + gname + "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname))
                    throw new IllegalArgumentException("No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname);
                cursor++;
            } else {  // 4end,8start
                // The first number is always a group
                refNum = (int)nextChar - '0';
                if ((refNum < 0)||(refNum > 9))
                    throw new IllegalArgumentException("Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {  // 9start
                    if (cursor >= replacement.length()) {  // 10start
                        break;
                    }  // 10end
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if ((nextDigit < 0)||(nextDigit > 9)) {  // 11start
                        // not a number
                        break;
                    }  // 11end
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount() < newRefNum) {  // 12start
                        done = true;
                    } else {  // 12end,13start
                        refNum = newRefNum;
                        cursor++;
                    }  // 13end
                }  // 9end
            }  // 8end
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum));
        } else {  // 3end,14start
            result.append(nextChar);
            cursor++;
        }  // 14end
    }  // 1end
    // Append the intervening text
    sb.append(text, lastAppendPosition, first);
    // Append the match substitution
    sb.append(result);

    lastAppendPosition = last;
    return this;
}
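The find / appendReplacement / appendTail loop described in the Javadoc can be sketched with group references in the replacement. The class name AppendReplacementSketch, the helper swapPairs, and the key=value input format are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementSketch {
    // Rewrites every "key=value" pair as "value=key" using $1 and $2.
    static String swapPairs(String input) {
        Pattern p = Pattern.compile("(\\w+)=(\\w+)");
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before this match, then the expanded replacement.
            m.appendReplacement(sb, "$2=$1");
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("k1=v1; k2=v2")); // v1=k1; v2=k2
    }
}
```

This is the same loop that Matcher.replaceAll runs internally in the source quoted here, which is why $ and \ are special in its replacement string.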
  • public StringBuffer appendTail(StringBuffer sb);

Appends the input text remaining after the last match to a StringBuffer.

/**
 * Implements a terminal append-and-replace step.
 *
 * This method reads characters from the input sequence, starting at
 * the append position, and appends them to the given string buffer. It is
 * intended to be invoked after one or more invocations of the {@link
 * #appendReplacement appendReplacement} method in order to copy the
 * remainder of the input sequence.
 *
 * @param sb
 *        The target string buffer
 *
 * @return The target string buffer
 */
public StringBuffer appendTail(StringBuffer sb) {
    sb.append(text, lastAppendPosition, getTextLength());
    return sb;
}
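  • public String replaceAll(String replacement);

Replaces every matching subsequence with the given string. The method first resets the matcher and then checks whether there is a match. If there is, it creates a StringBuffer, calls appendReplacement in a loop for each match, calls appendTail at the end, and returns the StringBuffer's string form. If there is no match, it returns the input text unchanged.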

/**
 * Replaces every subsequence of the input sequence that matches the
 * pattern with the given replacement string.
 *
 * This method first resets this matcher. It then scans the input
 * sequence looking for matches of the pattern. Characters that are not
 * part of any match are appended directly to the result string; each match
 * is replaced in the result by the replacement string. The replacement
 * string may contain references to captured subsequences as in the {@link
 * #appendReplacement appendReplacement} method.
 *
 * Note that backslashes (\) and dollar signs ($) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * Given the regular expression a*b, the input
 * "aabfooaabfooabfoob", and the replacement string
 * "-", an invocation of this method on a matcher for that
 * expression would yield the string "-foo-foo-foo-".
 *
 * Invoking this method changes this matcher's state. If the matcher
 * is to be used in further matching operations then it should first be
 * reset.
 *
 * @param replacement
 *        The replacement string
 *
 * @return The string constructed by replacing each matching subsequence
 *         by the replacement string, substituting captured subsequences
 *         as needed
 */
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
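The Javadoc's warning about backslashes and dollar signs can be made concrete with a small sketch. The class name MatcherReplaceAllSketch, its helper methods, and the "PRICE" placeholder are hypothetical; Matcher.quoteReplacement is the real JDK method for escaping a replacement string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherReplaceAllSketch {
    // The Javadoc example: every match of a*b becomes "-".
    static String javadocExample() {
        return Pattern.compile("a*b").matcher("aabfooaabfooabfoob").replaceAll("-");
    }

    // Inserts text literally: quoteReplacement escapes '\' and '$'.
    static String insertLiteral(String input, String regex, String text) {
        return Pattern.compile(regex).matcher(input)
                .replaceAll(Matcher.quoteReplacement(text));
    }

    // An unquoted "$5" is read as a reference to group 5, which this pattern lacks.
    static boolean unquotedDollarFails() {
        try {
            Pattern.compile("PRICE").matcher("cost: PRICE").replaceAll("$5");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(javadocExample());                              // -foo-foo-foo-
        System.out.println(insertLiteral("cost: PRICE", "PRICE", "$5"));   // cost: $5
        System.out.println(unquotedDollarFails());                         // true
    }
}
```

Whenever the replacement comes from user input or may contain $ or \, quoting it first is the safer choice.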
  • public String replaceFirst(String replacement);

Replaces the first matching subsequence with the given string.

/**
 * Replaces the first subsequence of the input sequence that matches the
 * pattern with the given replacement string.
 *
 * This method first resets this matcher. It then scans the input
 * sequence looking for a match of the pattern. Characters that are not
 * part of the match are appended directly to the result string; the match
 * is replaced in the result by the replacement string. The replacement
 * string may contain references to captured subsequences as in the {@link
 * #appendReplacement appendReplacement} method.
 *
 * Note that backslashes (\) and dollar signs ($) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * Given the regular expression dog, the input
 * "zzzdogzzzdogzzz", and the replacement string
 * "cat", an invocation of this method on a matcher for that
 * expression would yield the string "zzzcatzzzdogzzz".
 *
 * Invoking this method changes this matcher's state. If the matcher
 * is to be used in further matching operations then it should first be
 * reset.
 *
 * @param replacement
 *        The replacement string
 * @return The string constructed by replacing the first matching
 *         subsequence by the replacement string, substituting captured
 *         subsequences as needed
 */
public String replaceFirst(String replacement) {
    if (replacement == null)
        throw new NullPointerException("replacement");
    reset();
    if (!find())
        return text.toString();
    StringBuffer sb = new StringBuffer();
    appendReplacement(sb, replacement);
    appendTail(sb);
    return sb.toString();
}
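A short sketch of Matcher.replaceFirst, reproducing the Javadoc example and adding a group reference. The class name MatcherReplaceFirstSketch and the helper wrapFirstNumber are made up for illustration.

```java
import java.util.regex.Pattern;

public class MatcherReplaceFirstSketch {
    // The Javadoc example: only the first "dog" becomes "cat".
    static String javadocExample() {
        return Pattern.compile("dog").matcher("zzzdogzzzdogzzz").replaceFirst("cat");
    }

    // Wraps only the first run of digits in angle brackets via group 1.
    static String wrapFirstNumber(String input) {
        return Pattern.compile("(\\d+)").matcher(input).replaceFirst("<$1>");
    }

    public static void main(String[] args) {
        System.out.println(javadocExample());               // zzzcatzzzdogzzz
        System.out.println(wrapFirstNumber("id 42 and 7")); // id <42> and 7
    }
}
```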

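3.2 Pattern

For details, see this blog post: http://www.cnblogs.com/SQP51312/p/6136304.html

  • private Pattern(String p, int f);

The Pattern constructor. Because it is private, outside code cannot instantiate Pattern directly; a Pattern instance is created through Pattern.compile(regex) instead.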

/**
 * This private constructor is used to create all Patterns. The pattern
 * string and match flags are all that is needed to completely describe
 * a Pattern. An empty pattern string results in an object tree with
 * only a Start node and a LastNode node.
 */
private Pattern(String p, int f) {
    pattern = p;
    flags = f;

    // to use UNICODE_CASE if UNICODE_CHARACTER_CLASS present
    if ((flags & UNICODE_CHARACTER_CLASS) != 0)
        flags |= UNICODE_CASE;

    // Reset group index count
    capturingGroupCount = 1;
    localCount = 0;

    if (pattern.length() > 0) {
        compile();
    } else {
        root = new Start(lastAccept);
        matchRoot = lastAccept;
    }
}
  • public Matcher matcher(CharSequence input);

Lets outside code obtain a Matcher instance for the given input.

/**
 * Creates a matcher that will match the given input against this pattern.
 *
 * @param input
 *        The character sequence to be matched
 *
 * @return A new matcher for this pattern
 */
public Matcher matcher(CharSequence input) {
    if (!compiled) {
        synchronized(this) {
            if (!compiled)
                compile();
        }
    }
    Matcher m = new Matcher(this, input);
    return m;
}
  • public static Pattern compile(String regex, int flags);

Calls the Pattern constructor to create a Pattern instance.

public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}
  • public static Pattern compile(String regex);
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}
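The effect of the flags argument can be sketched by compiling the same string with and without Pattern.LITERAL, the flag that matters for the String methods discussed next. The class name CompileFlagsSketch and its two helpers are hypothetical.

```java
import java.util.regex.Pattern;

public class CompileFlagsSketch {
    // Compiles pattern as a regular expression (flags = 0).
    static boolean matchesAsRegex(String pattern, String input) {
        return Pattern.compile(pattern).matcher(input).matches();
    }

    // Compiles pattern with Pattern.LITERAL: metacharacters lose their meaning.
    static boolean matchesAsLiteral(String pattern, String input) {
        return Pattern.compile(pattern, Pattern.LITERAL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAsRegex("a.c", "abc"));   // true: "." matches 'b'
        System.out.println(matchesAsLiteral("a.c", "abc")); // false: "." must be a dot
        System.out.println(matchesAsLiteral("a.c", "a.c")); // true
    }
}
```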

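3.3 String

  • public String replace(char oldChar, char newChar);

 The String class overloads replace: the arguments can be single characters, or instances of classes that implement the CharSequence interface (String is one of them). For character replacement, replace allocates a new buf array, then walks the source array and writes each character into buf, substituting the new character wherever the old one occurs.

Note: do not be misled by the name. As the source shows, replace still replaces every occurrence of the target character!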

/**
 * Returns a new string resulting from replacing all occurrences of
 * oldChar in this string with newChar.
 *
 * If the character oldChar does not occur in the
 * character sequence represented by this String object,
 * then a reference to this String object is returned.
 * Otherwise, a new String object is created that
 * represents a character sequence identical to the character sequence
 * represented by this String object, except that every
 * occurrence of oldChar is replaced by an occurrence
 * of newChar.
 *
 * Examples:
 *
 *     "mesquite in your cellar".replace('e', 'o')
 *             returns "mosquito in your collar"
 *     "the war of baronets".replace('r', 'y')
 *             returns "the way of bayonets"
 *     "sparring with a purple porpoise".replace('p', 't')
 *             returns "starring with a turtle tortoise"
 *     "JonL".replace('q', 'x') returns "JonL" (no change)
 *
 * @param oldChar the old character.
 * @param newChar the new character.
 * @return a string derived from this string by replacing every
 *         occurrence of oldChar with newChar.
 */
public String replace(char oldChar, char newChar) {
    if (oldChar != newChar) {
        int len = value.length;
        int i = -1;
        char[] val = value; /* avoid getfield opcode */

        while (++i < len) {
            if (val[i] == oldChar) {
                break;
            }
        }
        if (i < len) {
            char buf[] = new char[len];
            for (int j = 0; j < i; j++) {
                buf[j] = val[j];
            }
            while (i < len) {
                char c = val[i];
                buf[i] = (c == oldChar) ? newChar : c;
                i++;
            }
            return new String(buf, true);
        }
    }
    return this;
}
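Two behaviours documented above are easy to check: every occurrence is replaced, and the original String reference is returned when oldChar does not occur. The class name ReplaceCharSketch and the helper returnsSameInstance are made up for this sketch.

```java
public class ReplaceCharSketch {
    // True when replace(char, char) hands back the very same String object,
    // which the Javadoc promises when oldChar does not occur in s.
    static boolean returnsSameInstance(String s, char oldChar, char newChar) {
        return s.replace(oldChar, newChar) == s;
    }

    public static void main(String[] args) {
        // Every 'e' is replaced, not just the first one.
        System.out.println("mesquite in your cellar".replace('e', 'o')); // mosquito in your collar
        System.out.println(returnsSameInstance("JonL", 'q', 'x'));       // true
    }
}
```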
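  • public String replace(CharSequence target, CharSequence replacement);

This overload of replace replaces every occurrence of a string. In the source shown here it actually calls Matcher.replaceAll.

Note: as the source shows, although Pattern.compile() is called, the flag passed is Pattern.LITERAL, so the target is matched literally rather than as a regular expression! The replacement is likewise passed through Matcher.quoteReplacement, so any \ or $ in it is taken literally.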

/**
 * Replaces each substring of this string that matches the literal target
 * sequence with the specified literal replacement sequence. The
 * replacement proceeds from the beginning of the string to the end, for
 * example, replacing "aa" with "b" in the string "aaa" will result in
 * "ba" rather than "ab".
 *
 * @param target The sequence of char values to be replaced
 * @param replacement The replacement sequence of char values
 * @return The resulting string
 * @throws NullPointerException if target or
 *         replacement is null.
 * @since 1.5
 */
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
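The literal behaviour of this overload can be sketched as follows. The class name ReplaceLiteralSketch and the helper viaPattern are hypothetical; viaPattern simply restates the one-line body quoted above so that the two results can be compared.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceLiteralSketch {
    // Restates the quoted implementation: literal pattern, quoted replacement.
    static String viaPattern(String s, CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL)
                .matcher(s)
                .replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

    public static void main(String[] args) {
        System.out.println("aaa".replace("aa", "b"));      // ba (left to right)
        System.out.println("1+1=2".replace("+", "-"));     // 1-1=2 ("+" is not a quantifier here)
        System.out.println("cost: X".replace("X", "$1"));  // cost: $1 ("$1" is not a group reference)
        System.out.println(viaPattern("1+1=2", "+", "-")); // 1-1=2
    }
}
```

Because nothing is interpreted, replace is the right choice whenever both the target and the replacement are plain text.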
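  • public String replaceAll(String regex, String replacement);

The replaceAll method replaces every match within a string; both arguments are of type String.

Note: as the source shows, this method matches using a regular expression!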

/**
 * Replaces each substring of this string that matches the given regular expression with the
 * given replacement.
 *
 * An invocation of this method of the form
 * str.replaceAll(regex, repl)
 * yields exactly the same result as the expression
 *
 *     {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#compile
 *     compile}(regex).{@link
 *     java.util.regex.Pattern#matcher(java.lang.CharSequence)
 *     matcher}(str).{@link java.util.regex.Matcher#replaceAll
 *     replaceAll}(repl)
 *
 * Note that backslashes (\) and dollar signs ($) in the
 * replacement string may cause the results to be different than if it were
 * being treated as a literal replacement string; see
 * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
 * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
 * meaning of these characters, if desired.
 *
 * @param regex
 *        the regular expression to which this string is to be matched
 * @param replacement
 *        the string to be substituted for each match
 *
 * @return The resulting String
 *
 * @throws PatternSyntaxException
 *         if the regular expression's syntax is invalid
 *
 * @see java.util.regex.Pattern
 *
 * @since 1.4
 * @spec JSR-51
 */
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
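Because the first argument is a regular expression, metacharacters in it need care. The sketch below shows the classic pitfall with ".", two ways around it, and a group-based rewrite; the class name ReplaceAllRegexSketch, the helper reorderDate, and the sample strings are assumptions made up for illustration.

```java
import java.util.regex.Pattern;

public class ReplaceAllRegexSketch {
    // Rewrites yyyy-MM-dd as dd/MM/yyyy using group references.
    static String reorderDate(String date) {
        return date.replaceAll("(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1");
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        System.out.println(s.replaceAll(".", "-"));                // ----- ("." matches any character)
        System.out.println(s.replaceAll("\\.", "-"));              // a-b-c (escaped dot)
        System.out.println(s.replaceAll(Pattern.quote("."), "-")); // a-b-c (quoted as a literal)
        System.out.println(reorderDate("2024-01-15"));             // 15/01/2024
    }
}
```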

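  • public String replaceFirst(String regex, String replacement);

replaceFirst is the method String actually provides for a partial replacement: it replaces only the first matching substring, and it calls Matcher.replaceFirst.

Note: as the source shows, this method matches using a regular expression!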

public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}

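4. Conclusion

String method         | Parameter    | Replaces    | Regex | Pattern call                   | Matcher call
----------------------+--------------+-------------+-------+--------------------------------+-------------
replace(char)         | char         | all matches | no    | none                           | none
replace(CharSequence) | CharSequence | all matches | no    | Pattern.compile (literal mode) | replaceAll
replaceAll            | String       | all matches | yes   | Pattern.compile (regex mode)   | replaceAll
replaceFirst          | String       | first match | yes   | Pattern.compile (regex mode)   | replaceFirst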

 

 
