The Regex Class in Detail

XXG學習園 2012-05-07

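In the .NET Framework, the regular expression engine is represented by the Regex class. The engine is responsible for parsing and compiling a regular expression and for performing the operations that match the regular expression pattern against an input string. It is the central component of the .NET Framework regular expression object model.

You can use the regular expression engine in either of two ways:

By calling the static methods of the Regex class. The method parameters include the input string and the regular expression pattern. The engine caches the regular expressions used in static method calls, so repeated calls to static regular expression methods that use the same regular expression offer relatively good performance.
By instantiating a Regex object, which is done by passing a regular expression to the class constructor. In this case, the Regex object is immutable (read-only) and represents a regular expression engine that is tightly coupled with a single regular expression. Because the regular expressions used by Regex instances are not cached, you should not instantiate a Regex object multiple times with the same regular expression.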
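
The two approaches above can be sketched side by side. This is a minimal illustration rather than a recommended design: the class name RegexUsageSketch, the helper methods, and the five-digit pattern are assumptions made for the example. The instance variant creates its Regex object once and reuses it, in line with the advice not to instantiate the same expression repeatedly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexUsageSketch
{
   // Hypothetical pattern: exactly five decimal digits.
   // Created once and reused by every call to IsZipInstance.
   private static readonly Regex ZipCode = new Regex(@"^\d{5}$");

   // Static call: the pattern is passed on every call and the
   // engine caches the parsed expression between calls.
   public static bool IsZipStatic(string value)
   {
      return Regex.IsMatch(value, @"^\d{5}$");
   }

   // Instance call: reuses the single Regex object created above.
   public static bool IsZipInstance(string value)
   {
      return ZipCode.IsMatch(value);
   }

   public static void Main()
   {
      foreach (string value in new[] { "98052", "9805" })
         Console.WriteLine("{0}: static={1}, instance={2}",
                           value, IsZipStatic(value), IsZipInstance(value));
   }
}
// The example displays the following output:
//       98052: static=True, instance=True
//       9805: static=False, instance=False
```

Both calls return the same results; the difference lies only in where the parsed expression is kept.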

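You can call the methods of the Regex class to perform the following operations:

Determine whether a string matches a regular expression pattern.
Extract a single match or the first match.
Extract all matches.
Replace a matched substring.
Split a single string into an array of strings.

These operations are described in the following sections.

Matching a Regular Expression Pattern

The Regex.IsMatch method returns true if the string matches the pattern, and false if it does not. IsMatch is often used to validate string input. For example, the following code checks that a string has the format of a valid U.S. social security number.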

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string[] values = { "111-22-3333", "111-2-3333"};
      string pattern = @"^\d{3}-\d{2}-\d{4}$";
      foreach (string value in values) {
         if (Regex.IsMatch(value, pattern))
            Console.WriteLine("{0} is a valid SSN.", value);
         else   
            Console.WriteLine("{0}: Invalid", value);
      }
   }
}
// The example displays the following output:
//       111-22-3333 is a valid SSN.
//       111-2-3333: Invalid

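The regular expression pattern ^\d{3}-\d{2}-\d{4}$ is interpreted as shown in the following table.

Pattern	Description
^	Match the beginning of the input string.
\d{3}	Match three decimal digits.
-	Match a hyphen.
\d{2}	Match two decimal digits.
-	Match a hyphen.
\d{4}	Match four decimal digits.
$	Match the end of the input string.

Extracting a Single Match or the First Match

The Regex.Match method returns a Match object that contains information about the first substring that matches a regular expression pattern. If the Match.Success property returns true, a match was found, and you can retrieve information about subsequent matches by calling the Match.NextMatch method. These method calls can continue until the Match.Success property returns false. For example, the following code uses the Regex.Match(String, String) method to find the first occurrence of a duplicated word in a string. It then calls the Match.NextMatch method to find any additional occurrences. After each method call, the example examines the Match.Success property to determine whether the current match succeeded and whether a call to Match.NextMatch should follow.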

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "This is a a farm that that raises dairy cattle."; 
      string pattern = @"\b(\w+)\W+(\1)\b";
      Match match = Regex.Match(input, pattern);
      while (match.Success)
      {
         Console.WriteLine("Duplicate '{0}' found at position {1}.",  
                           match.Groups[1].Value, match.Groups[2].Index);
         match = match.NextMatch();
      }                       
   }
}
// The example displays the following output:
//       Duplicate 'a' found at position 10.
//       Duplicate 'that' found at position 22.

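The regular expression pattern \b(\w+)\W+(\1)\b is interpreted as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
(\w+)	Match one or more word characters. This is the first capturing group.
\W+	Match one or more non-word characters.
(\1)	Match the first captured string. This is the second capturing group.
\b	End the match at a word boundary.

Extracting All Matches

The Regex.Matches method returns a MatchCollection object that contains information about all the matches that the regular expression engine found in the input string. For example, the previous example can be rewritten to call the Matches method instead of the Match and NextMatch methods.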

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "This is a a farm that that raises dairy cattle."; 
      string pattern = @"\b(\w+)\W+(\1)\b";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("Duplicate '{0}' found at position {1}.",  
                           match.Groups[1].Value, match.Groups[2].Index);
   }
}
// The example displays the following output:
//       Duplicate 'a' found at position 10.
//       Duplicate 'that' found at position 22.

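Replacing a Matched Substring

The Regex.Replace method replaces each substring that matches the regular expression pattern with a specified string or replacement pattern, and returns the entire input string with the replacements made. For example, the following code adds a U.S. currency symbol before a decimal number in a string.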

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b\d+\.\d{2}\b";
      string replacement = "$$$&"; 
      string input = "Total Cost: 103.64";
      Console.WriteLine(Regex.Replace(input, pattern, replacement));     
   }
}
// The example displays the following output:
//       Total Cost: $103.64

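The regular expression pattern \b\d+\.\d{2}\b is interpreted as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
\d+	Match one or more decimal digits.
\.	Match a period.
\d{2}	Match two decimal digits.
\b	End the match at a word boundary.

The replacement pattern $$$& is interpreted as shown in the following table.

Pattern	Replacement string
$$	The dollar sign ($) character.
$&	The entire matched substring.

Splitting a Single String into an Array of Strings

The Regex.Split method splits the input string at the positions defined by regular expression matches. For example, the following code places the items in a numbered list into a string array.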

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "1. Eggs 2. Bread 3. Milk 4. Coffee 5. Tea";
      string pattern = @"\b\d{1,2}\.\s";
      foreach (string item in Regex.Split(input, pattern))
      {
         if (! String.IsNullOrEmpty(item))
            Console.WriteLine(item);
      }      
   }
}
// The example displays the following output:
//       Eggs
//       Bread
//       Milk
//       Coffee
//       Tea

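The regular expression pattern \b\d{1,2}\.\s is interpreted as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
\d{1,2}	Match one or two decimal digits.
\.	Match a period.
\s	Match a white-space character.

The MatchCollection and Match Objects

Regex methods return two objects that are part of the regular expression object model: the MatchCollection object and the Match object.

The Match Collection

The Regex.Matches method returns a MatchCollection object that contains Match objects representing all the matches that the regular expression engine found, in the order in which they occur in the input string. If there are no matches, the method returns a MatchCollection object with no members. The MatchCollection.Item property lets you access individual members of the collection by index, from zero to one less than the value of the MatchCollection.Count property. Item is the collection's indexer (in C#) and default property (in Visual Basic).

By default, the call to the Regex.Matches method uses lazy evaluation to populate the MatchCollection object. Accessing properties that require a fully populated collection, such as the MatchCollection.Count and MatchCollection.Item properties, may involve a performance penalty. It is therefore recommended that you access the collection by using the IEnumerator object returned by the MatchCollection.GetEnumerator method. Individual languages provide constructs, such as For Each in Visual Basic and foreach in C#, that wrap the collection's IEnumerator interface.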
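
As a rough sketch of why enumeration is the recommended access path, the following code stops iterating as soon as it finds what it needs and never reads Count. The class name LazyMatchesSketch, the FirstLongWord helper, and the word pattern are assumptions made for this illustration. Under the lazy evaluation described above, matches after the first hit need not be computed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyMatchesSketch
{
   // Returns the first word longer than minLength, or null if none exists.
   // The loop exits at the first hit, so MatchCollection.Count is never
   // read and the collection does not have to be fully populated.
   public static string FirstLongWord(string input, int minLength)
   {
      foreach (Match match in Regex.Matches(input, @"\b\w+\b"))
      {
         if (match.Length > minLength)
            return match.Value;
      }
      return null;
   }

   public static void Main()
   {
      Console.WriteLine(FirstLongWord("a bb cccc dd eeeee", 3));
   }
}
// The example displays the following output:
//       cccc
```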

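The following example uses the Regex.Matches(String) method to populate a MatchCollection object with all the matches found in an input string. The example enumerates the collection, copies the matches to a list of strings, and records the character positions in a list of integers.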

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
       MatchCollection matches;
       List<string> results = new List<string>();
       List<int> matchposition = new List<int>();

       // Create a new Regex object and define the regular expression.
       Regex r = new Regex("abc");
       // Use the Matches method to find all matches in the input string.
       matches = r.Matches("123abc4abcd");
       // Enumerate the collection to retrieve all matches and positions.
       foreach (Match match in matches)
       {
          // Add the match string to the string array.
           results.Add(match.Value);
           // Record the character position where the match was found.
           matchposition.Add(match.Index);
       }
       // List the results.
       for (int ctr = 0; ctr < results.Count; ctr++)
         Console.WriteLine("'{0}' found at position {1}.", 
                           results[ctr], matchposition[ctr]);  
   }
}
// The example displays the following output:
//       'abc' found at position 3.
//       'abc' found at position 7.

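The Match Class

The Match class represents the result of a single regular expression match. You can access Match objects in two ways:

By retrieving them from the MatchCollection object that is returned by the Regex.Matches method. To retrieve individual Match objects, iterate the collection by using a foreach (in C#) or For Each...Next (in Visual Basic) construct, or use the MatchCollection.Item property to retrieve a specific Match object by index. You can also retrieve individual Match objects by iterating the collection by index, from zero to one less than the number of objects in the collection. However, this approach does not take advantage of lazy evaluation, because it accesses the MatchCollection.Count property.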
The following example retrieves individual Match objects from a MatchCollection object by iterating the collection with the foreach or For Each...Next construct. The regular expression simply matches the string "abc" in the input string.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "abc";
      string input = "abc123abc456abc789";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("{0} found at position {1}.",
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       abc found at position 0.
//       abc found at position 6.
//       abc found at position 12.

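By calling the Regex.Match method, which returns a Match object that represents the first match in a string or in a portion of a string. You can determine whether a match was found by retrieving the value of the Match.Success property. To retrieve Match objects that represent subsequent matches, call the Match.NextMatch method repeatedly until the Success property of the returned Match object is false.

The following example uses the Regex.Match(String, String) and Match.NextMatch methods to match the string "abc" in the input string.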

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "abc";
      string input = "abc123abc456abc789";
      Match match = Regex.Match(input, pattern);
      while (match.Success)
      {
         Console.WriteLine("{0} found at position {1}.", 
                           match.Value, match.Index);
         match = match.NextMatch();                  
      }                     
   }
}
// The example displays the following output:
//       abc found at position 0.
//       abc found at position 6.
//       abc found at position 12.

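The following two properties of the Match class return collection objects:

The Match.Groups property returns a GroupCollection object that contains information about the substrings that match the capturing groups in the regular expression pattern.
The Captures property, which Match inherits from Group (Group.Captures), returns a CaptureCollection object that is of limited use. The collection is not populated for a Match object whose Success property is false. Otherwise, it contains a single Capture object that has the same information as the Match object.

For more information about these objects, see the Group Collection and Capture Collection sections later in this topic.

Two additional properties of the Match class provide information about the match. The Match.Value property returns the substring of the input string that matches the regular expression pattern. The Match.Index property returns the zero-based starting position of the matched string in the input string.

The Match class also has two pattern-matching methods:

The Match.NextMatch method finds the match after the one represented by the current Match object, and returns a Match object that represents that match.
The Match.Result method performs a specified replacement operation on the matched string and returns the result.

The following example uses the Match.Result method to prepend a $ symbol and a space to every number that has two fractional digits.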

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b\d+(,\d{3})*\.\d{2}\b";
      string input = "16.32\n194.03\n1,903,672.08"; 

      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Result("$$ $&"));
   }
}
// The example displays the following output:
//       $ 16.32
//       $ 194.03
//       $ 1,903,672.08

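The regular expression pattern \b\d+(,\d{3})*\.\d{2}\b is defined as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
\d+	Match one or more decimal digits.
(,\d{3})*	Match zero or more occurrences of a comma followed by three decimal digits.
\.	Match the decimal point character.
\d{2}	Match two decimal digits.
\b	End the match at a word boundary.

The replacement pattern $$ $& indicates that the matched substring should be replaced by a dollar sign ($) (the $$ pattern), a space, and the value of the match (the $& pattern).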

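The Group Collection

The Match.Groups property returns a GroupCollection object that contains Group objects representing the captured groups in a single match. The first Group object in the collection (at index 0) represents the entire match. Each object that follows represents the result of a single capturing group.

You can retrieve individual Group objects in the collection by using the GroupCollection.Item property. Unnamed groups can be retrieved by their ordinal position in the collection, and named groups can be retrieved either by name or by ordinal position. Unnamed captures appear first in the collection and are indexed from left to right in the order in which they appear in the regular expression pattern. Named captures are indexed after unnamed captures, from left to right in the order in which they appear in the regular expression pattern.

The GroupCollection.Item property is the indexer of the collection (in C#) and the collection object's default property (in Visual Basic). This means that individual Group objects can be accessed by index (or by name, in the case of named groups) as follows: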

Group group = match.Groups[ctr];
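
The ordering rule described above (unnamed groups first, then named groups) can be checked with a small sketch. The class name GroupOrderSketch, the GroupValues helper, and the pattern with the group names first and last are hypothetical and chosen only so that the named groups surround an unnamed one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupOrderSketch
{
   // Hypothetical pattern: two named groups around one unnamed group.
   private const string Pattern = @"(?<first>a)(b)(?<last>c)";

   // Returns the captured value found at each ordinal position.
   public static string[] GroupValues(string input)
   {
      Match match = Regex.Match(input, Pattern);
      string[] values = new string[match.Groups.Count];
      for (int ctr = 0; ctr < match.Groups.Count; ctr++)
         values[ctr] = match.Groups[ctr].Value;
      return values;
   }

   public static void Main()
   {
      string[] values = GroupValues("abc");
      for (int ctr = 0; ctr < values.Length; ctr++)
         Console.WriteLine("Group {0}: {1}", ctr, values[ctr]);
      // A named group can also be retrieved by its name.
      Console.WriteLine("first: {0}",
                        Regex.Match("abc", Pattern).Groups["first"].Value);
   }
}
// The example displays the following output:
//       Group 0: abc
//       Group 1: b
//       Group 2: a
//       Group 3: c
//       first: a
```

Although (b) is the second group in the pattern, it is indexed before both named groups.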

The following example defines a regular expression that uses grouping constructs to capture the month, day, and year of a date.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b(\w+)\s(\d{1,2}),\s(\d{4})\b";
      string input = "Born: July 28, 1989";
      Match match = Regex.Match(input, pattern);
      if (match.Success)
         for (int ctr = 0; ctr <  match.Groups.Count; ctr++)
            Console.WriteLine("Group {0}: {1}", ctr, match.Groups[ctr].Value);
    }
}
// The example displays the following output:
//       Group 0: July 28, 1989
//       Group 1: July
//       Group 2: 28
//       Group 3: 1989

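The regular expression pattern \b(\w+)\s(\d{1,2}),\s(\d{4})\b is defined as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
(\w+)	Match one or more word characters. This is the first capturing group.
\s	Match a white-space character.
(\d{1,2})	Match one or two decimal digits. This is the second capturing group.
,	Match a comma.
\s	Match a white-space character.
(\d{4})	Match four decimal digits. This is the third capturing group.
\b	End the match at a word boundary.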

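The Captured Group

The Group class represents the result of a single capturing group. Group objects that represent the capturing groups defined in a regular expression are returned by the Item property of the GroupCollection object that the Match.Groups property returns. The Item property is the indexer (in C#) and the default property (in Visual Basic) of the GroupCollection class. You can also retrieve individual members by iterating the collection with the foreach or For Each construct. For an example, see the previous section.

The following example uses nested grouping constructs to capture substrings into groups. The regular expression pattern (a(b))c matches the string "abc". It assigns the substring "ab" to the first capturing group and the substring "b" to the second capturing group.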

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      List<int> matchposition = new List<int>();
      List<string> results = new List<string>();
      // Define substrings abc, ab, b.
      Regex r = new Regex("(a(b))c");
      Match m = r.Match("abdabc");
      // Visit every group, including group 0 (the entire match).
      for (int i = 0; i < m.Groups.Count; i++)
      {
         // Add the group's value to the results list.
         results.Add(m.Groups[i].Value);
         // Record the character position of the group.
         matchposition.Add(m.Groups[i].Index);
      }

      // Display the capture groups.
      for (int ctr = 0; ctr < results.Count; ctr++)
         Console.WriteLine("{0} at position {1}",
                           results[ctr], matchposition[ctr]);
   }
}
// The example displays the following output:
//       abc at position 3
//       ab at position 3
//       b at position 4

The following example uses named grouping constructs to capture substrings from a string that contains data in the format "DATANAME:VALUE", which the regular expression splits at the colon (:).

Regex r = new Regex("^(?<name>\\w+):(?<value>\\w+)");
Match m = r.Match("Section1:119900");
Console.WriteLine(m.Groups["name"].Value);
Console.WriteLine(m.Groups["value"].Value);
// The example displays the following output:
//       Section1
//       119900

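The regular expression pattern ^(?<name>\w+):(?<value>\w+) is defined as shown in the following table.

Pattern	Description
^	Begin the match at the beginning of the input string.
(?<name>\w+)	Match one or more word characters. The name of this capturing group is name.
:	Match a colon.
(?<value>\w+)	Match one or more word characters. The name of this capturing group is value.

The properties of the Group class provide information about the captured group: the Group.Value property contains the captured substring, the Group.Index property indicates the starting position of the captured group in the input text, the Group.Length property contains the length of the captured text, and the Group.Success property indicates whether a substring matched the pattern defined by the capturing group.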
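
The four properties just listed can be printed together for each named group. This is a small sketch only: it reuses the "Section1:119900" input and the name/value pattern from the preceding example, while the class name GroupPropertiesSketch and the Describe helper are assumptions introduced for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupPropertiesSketch
{
   // Formats the Success, Value, Index, and Length properties of one group.
   public static string Describe(Group group)
   {
      return String.Format("Success={0}, Value='{1}', Index={2}, Length={3}",
                           group.Success, group.Value, group.Index, group.Length);
   }

   public static void Main()
   {
      Match m = Regex.Match("Section1:119900", @"^(?<name>\w+):(?<value>\w+)");
      Console.WriteLine("name:  " + Describe(m.Groups["name"]));
      Console.WriteLine("value: " + Describe(m.Groups["value"]));
   }
}
// The example displays the following output:
//       name:  Success=True, Value='Section1', Index=0, Length=8
//       value: Success=True, Value='119900', Index=9, Length=6
```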

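Applying quantifiers to a group (for more information, see Quantifiers) modifies the relationship of one capture per capturing group in two ways:

If the * or *? quantifier (which specifies zero or more matches) is applied to a group, the capturing group may have no match in the input string. When there is no captured text, the properties of the Group object are set as shown in the following table.

Group property	Value
Success	false
Value	String.Empty
Length	0

The following example provides an illustration. In the regular expression pattern aaa(bbb)*ccc, the first capturing group (the substring "bbb") can be matched zero or more times. Because the input string "aaaccc" matches the pattern without containing "bbb", the capturing group has no match.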

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "aaa(bbb)*ccc";
      string input = "aaaccc";
      Match match = Regex.Match(input, pattern);
      Console.WriteLine("Match value: {0}", match.Value);
      if (match.Groups[1].Success)
         Console.WriteLine("Group 1 value: {0}", match.Groups[1].Value);
      else
         Console.WriteLine("The first capturing group has no match.");
   }
}
// The example displays the following output:
//       Match value: aaaccc
//       The first capturing group has no match.

Quantifiers can match multiple occurrences of the pattern defined by a capturing group. In this case, the Value and Length properties of the Group object contain information only about the last captured substring. For example, the following regular expression matches a single sentence that ends in a period. It uses two grouping constructs: the first captures individual words along with a white-space character; the second captures individual words. As the output from the example shows, although the regular expression succeeds in capturing the entire sentence, the second capturing group captures only the last word.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = @"\b((\w+)\s?)+\.";
      string input = "This is a sentence. This is another sentence.";
      Match match = Regex.Match(input, pattern);
      if (match.Success)
      {
         Console.WriteLine("Match: " + match.Value);
         Console.WriteLine("Group 2: " + match.Groups[2].Value);
      }
   }
}
// The example displays the following output:
//       Match: This is a sentence.
//       Group 2: sentence

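The Capture Collection

The Group object contains information only about the last capture. However, the entire set of captures made by a capturing group is still available from the CaptureCollection object that is returned by the Group.Captures property. Each member of the collection is a Capture object that represents a capture made by that capturing group, in the order in which they were captured (and, therefore, in the order in which the captured strings were matched from left to right in the input string). You can retrieve individual Capture objects from the collection in either of two ways:

By iterating through the collection with a construct such as foreach (in C#) or For Each (in Visual Basic).
By using the CaptureCollection.Item property to retrieve a specific object by index. The Item property is the CaptureCollection object's default property (in Visual Basic) or indexer (in C#).

If a quantifier is not applied to a capturing group, the CaptureCollection object contains a single Capture object that is of little interest, because it provides information about the same match as its Group object. If a quantifier is applied to a capturing group, the CaptureCollection object contains all captures made by the capturing group, and the last member of the collection represents the same capture as the Group object.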
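
The difference between a quantified and an unquantified group can be seen by counting captures. The following is a minimal sketch; the class name CaptureCountSketch, the Group1Captures helper, and the "ababab" input are assumptions made for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountSketch
{
   // Returns how many captures group 1 made in the first match.
   public static int Group1Captures(string pattern, string input)
   {
      return Regex.Match(input, pattern).Groups[1].Captures.Count;
   }

   public static void Main()
   {
      string input = "ababab";
      // No quantifier on the group: a single capture.
      Console.WriteLine(Group1Captures("(ab)", input));
      // + applied to the group: one capture per repetition.
      Console.WriteLine(Group1Captures("(ab)+", input));

      // The last capture is the one that the Group object itself reports.
      Group g = Regex.Match(input, "(ab)+").Groups[1];
      Capture last = g.Captures[g.Captures.Count - 1];
      Console.WriteLine(last.Index == g.Index && last.Value == g.Value);
   }
}
// The example displays the following output:
//       1
//       3
//       True
```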

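When the regular expression pattern ((a(b))c)+ (where the + quantifier specifies one or more matches) is used to capture matches in the string "abcabcabc", the CaptureCollection object of each capturing group contains three members, while the group at index 0, which represents the entire match, contains one.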

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "((a(b))c)+";
      string input = "abcabcabc";

      Match match = Regex.Match(input, pattern);
      if (match.Success)
      {
         Console.WriteLine("Match: '{0}' at position {1}",  
                           match.Value, match.Index);
         GroupCollection groups = match.Groups;
         for (int ctr = 0; ctr < groups.Count; ctr++) {
            Console.WriteLine("   Group {0}: '{1}' at position {2}", 
                              ctr, groups[ctr].Value, groups[ctr].Index);
            CaptureCollection captures = groups[ctr].Captures;
            for (int ctr2 = 0; ctr2 < captures.Count; ctr2++) {
               Console.WriteLine("      Capture {0}: '{1}' at position {2}", 
                                 ctr2, captures[ctr2].Value, captures[ctr2].Index);
            }                     
         }
      }
   }
}
// The example displays the following output:
//       Match: 'abcabcabc' at position 0
//          Group 0: 'abcabcabc' at position 0
//             Capture 0: 'abcabcabc' at position 0
//          Group 1: 'abc' at position 6
//             Capture 0: 'abc' at position 0
//             Capture 1: 'abc' at position 3
//             Capture 2: 'abc' at position 6
//          Group 2: 'ab' at position 6
//             Capture 0: 'ab' at position 0
//             Capture 1: 'ab' at position 3
//             Capture 2: 'ab' at position 6
//          Group 3: 'b' at position 7
//             Capture 0: 'b' at position 1
//             Capture 1: 'b' at position 4
//             Capture 2: 'b' at position 7

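The following example uses the regular expression (Abc)+ to find one or more consecutive runs of the string "Abc" in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the Group.Captures property to return multiple groups of captured substrings.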

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      int counter;
      Match m;
      CaptureCollection cc;
      GroupCollection gc;

      // Look for groupings of "Abc".
      Regex r = new Regex("(Abc)+");
      // Define the string to search.
      m = r.Match("XYZAbcAbcAbcXYZAbcAb");
      gc = m.Groups;

      // Display the number of groups.
      Console.WriteLine("Captured groups = " + gc.Count.ToString());

      // Loop through each group.
      for (int i = 0; i < gc.Count; i++)
      {
         cc = gc[i].Captures;
         counter = cc.Count;

         // Display the number of captures in this group.
         Console.WriteLine("Captures count = " + counter.ToString());

         // Loop through each capture in the group.
         for (int ii = 0; ii < counter; ii++)
         {
            // Display the capture and its position.
            Console.WriteLine(cc[ii] + "   Starts at character " +
                              cc[ii].Index);
         }
      }
   }
}
// The example displays the following output:
//       Captured groups = 2
//       Captures count = 1
//       AbcAbcAbc   Starts at character 3
//       Captures count = 3
//       Abc   Starts at character 3
//       Abc   Starts at character 6
//       Abc   Starts at character 9

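The Individual Capture

The Capture class contains the results from a single subexpression capture. The Capture.Value property contains the matched text, and the Capture.Index property indicates the zero-based position in the input string at which the matched substring begins.

The following example parses an input string for the temperatures of selected cities. A comma (",") separates a city from its temperature, and a semicolon (";") separates each city's data. The entire input string represents a single match. In the regular expression pattern ((\w+(\s\w+)*),(\d+);)+, which is used to parse the string, the city name is assigned to the second capturing group, and the temperature is assigned to the fourth capturing group.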

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "Miami,78;Chicago,62;New York,67;San Francisco,59;Seattle,58;"; 
      string pattern = @"((\w+(\s\w+)*),(\d+);)+";
      Match match = Regex.Match(input, pattern);
      if (match.Success)
      {
         Console.WriteLine("Current temperatures:");
         for (int ctr = 0; ctr < match.Groups[2].Captures.Count; ctr++)
            Console.WriteLine("{0,-20} {1,3}", match.Groups[2].Captures[ctr].Value, 
                              match.Groups[4].Captures[ctr].Value);
      }
   }
}
// The example displays the following output:
//       Current temperatures:
//       Miami                 78
//       Chicago               62
//       New York              67
//       San Francisco         59
//       Seattle               58

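The regular expression is defined as shown in the following table.

Pattern	Description
\w+	Match one or more word characters.
(\s\w+)*	Match zero or more occurrences of a white-space character followed by one or more word characters. This pattern matches multi-word city names. This is the third capturing group.
(\w+(\s\w+)*)	Match one or more word characters followed by zero or more occurrences of a white-space character and one or more word characters. This is the second capturing group.
,	Match a comma.
(\d+)	Match one or more digits. This is the fourth capturing group.
;	Match a semicolon.
((\w+(\s\w+)*),(\d+);)+	Match the pattern of a word followed by any additional words, followed by a comma, one or more digits, and a semicolon, one or more times. This is the first capturing group.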
