Regular Expressions: Minimal (Lazy) Matching (First Occurrence) 2020-10-01

  
  

  ⮮I needed to use Notepad++ to filter out certain strings, such as the following:

  ⮮First, going by the literal shape of the text, I wrote the following regular expression:
  \[.*\]\(https:.+\) and tested what it matches.
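A minimal sketch of what this first attempt does, using Python's `re` module rather than Notepad++ (the two engines differ, but greedy `.*` and `.+` behave the same way for this pattern). The sample line is hypothetical, since the original input is not preserved here.

```python
import re

# Hypothetical sample line with two Markdown-style links (an assumption;
# the text the post actually filtered is not available).
text = "see [foo](https://a.example/x) and [bar](https://b.example/y) (end)"

# The first attempt from the post: both .* and .+ are greedy.
greedy = re.compile(r"\[.*\]\(https:.+\)")

greedy_matches = greedy.findall(text)
for m in greedy_matches:
    print(m)
```

With this input the pattern yields a single match that runs from the first `[` to the last `)` on the line, swallowing both links and the trailing `(end)`.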

  
  
  

  ⮮I consulted the Notepad++ online help and found the relevant section on regular expressions:

Multiplying operators

  • + ⇒ This matches 1 or more instances of the previous character, as many as it can. For example, Sa+m matches Sam, Saam, Saaam, and so on. [aeiou]+ matches consecutive strings of vowels.

  • * ⇒ This matches 0 or more instances of the previous character, as many as it can. For example, Sa*m matches Sm, Sam, Saam, and so on.

  • ? ⇒ Zero or one of the last character. Thus Sa?m matches Sm and Sam, but not Saam.

  • *? ⇒ Zero or more of the previous group, but minimally: the shortest matching string, rather than the longest string as with the “greedy” operator. Thus, m.*?o applied to the text margin-bottom: 0; will match margin-bo, whereas m.*o will match margin-botto.

  • +? ⇒ One or more of the previous group, but minimally.

  • {ℕ} ⇒ Matches ℕ copies of the element it applies to (where ℕ is any decimal number).

  • {ℕ,} ⇒ Matches ℕ or more copies of the element it applies to.

  • {ℕ,ℙ} ⇒ Matches ℕ to ℙ copies of the element it applies to, as many as it can (where ℙ ≥ ℕ).

  • {ℕ,}? or {ℕ,ℙ}? ⇒ Like the above, but minimally.

  • *+ or ?+ or ++ or {ℕ,}+ or {ℕ,ℙ}+ ⇒ These so called “possessive” variants of greedy repeat marks do not backtrack. This allows failures to be reported much earlier, which can boost performance significantly. But they will eliminate matches that would require backtracking to be found. As an example:

  When regex “.*” is run against the text “abc”x:

“  matches “
.* matches abc”x
”  cannot match $ ( End of line ) => Backtracking

“  matches “
.* matches abc”
”  cannot match letter x => Backtracking

“  matches “
.* matches abc
”  matches ” => 1 overall match “abc”

  When regex “.*+”, with a possessive quantifier, is run against the text “abc”x:

“   matches “
.*+ matches abc”x ( catches all remaining characters )
” cannot match $ ( End of line )

  Notice there is no match at all for the possessive version, because the possessive repeat factor prevents backtracking to a possible solution.
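The quantifier behaviours quoted above can be checked with a short sketch in Python's `re` module (an assumption: Python is used here only for illustration, not Notepad++ itself). Straight quotes stand in for the typographic quotes in the manual's example. The last pattern does not use `.*+`; it emulates a non-backtracking repeat with a lookahead plus backreference so that it also runs on Python versions older than 3.11, which is where possessive quantifiers were added.

```python
import re

css = "margin-bottom: 0;"

# Lazy: stop at the first "o" after the "m".
lazy_match = re.search(r"m.*?o", css).group()
# Greedy: run to the last "o" on the line.
greedy_match = re.search(r"m.*o", css).group()

quoted = '"abc"x'

# Greedy .* overshoots to the end, then backtracks until the closing
# quote can match.
backtracked = re.search(r'".*"', quoted).group()

# Emulated possessive repeat: the lookahead captures everything that .*
# would take, and \1 consumes exactly that text. The engine cannot give
# characters back, so the closing quote never matches.
possessive_like = re.search(r'"(?=(.*))\1"', quoted)

print(lazy_match)       # margin-bo
print(greedy_match)     # margin-botto
print(backtracked)      # "abc"
print(possessive_like)  # None
```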

  
  
  

  ⮮Note the key description:

  • *? ⇒ Zero or more of the previous group, but minimally: the shortest matching string, rather than the longest string as with the “greedy” operator. Thus, m.*?o applied to the text margin-bottom: 0; will match margin-bo, whereas m.*o will match margin-botto.

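  Writing *? means the preceding element is matched zero or more times, but as few times as possible (minimal matching). Applying this to the requirement above: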

  ⮮The result of running it:
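A sketch of this intermediate step in Python's `re`, assuming the change at this point was to make only the URL part lazy (`.+?`) while the bracket part stayed `.*`. The sample line is hypothetical.

```python
import re

# Hypothetical sample line (an assumption; the original input is not available).
text = "see [foo](https://a.example/x) and [bar](https://b.example/y) (end)"

# Only the URL part is lazy; the bracket part \[.*\] is still greedy.
half_lazy = re.compile(r"\[.*\]\(https:.+?\)")

half_lazy_matches = half_lazy.findall(text)
for m in half_lazy_matches:
    print(m)
```

The match now stops at the first `)` after the URL, so the trailing `(end)` is no longer included, but it is still one match covering both links, because the greedy `.*` inside the brackets runs on to the last `](https:` on the line.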

  Why? A closer look at the regular expression shows that the problem lies in its first part:

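  The leading \[.*\] was left unrestricted, so it is still a greedy (maximal) match. Add ? there as well, giving \[.*?\], and check the test result: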
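A sketch of the fully lazy pattern in Python's `re`, again on a hypothetical sample line. The negated character class variant at the end is not from the post; it is a common alternative, included only for comparison.

```python
import re

# Hypothetical sample line (an assumption; the original input is not available).
text = "see [foo](https://a.example/x) and [bar](https://b.example/y) (end)"

# Both repeats are lazy, so each link is matched on its own.
lazy = re.compile(r"\[.*?\]\(https:.+?\)")
lazy_matches = lazy.findall(text)

# Alternative (not in the post): forbid the closing delimiter inside each
# part instead of relying on lazy quantifiers.
negated = re.compile(r"\[[^\]]*\]\(https:[^)]+\)")
negated_matches = negated.findall(text)

# Filtering the links out of the line, as one might with Replace All.
filtered = lazy.sub("", text)

print(lazy_matches)
print(negated_matches)
print(filtered)
```

Each link is now found separately, which is the "first occurrence" behaviour the title refers to.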
