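Extracting strings with a regular expression: help needed
Asking the experts for help.
A block of HTML source contains many lines like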
<a href="read-htm-tid-4546.html" id="a_ajax_4546">今后转移工作重点</a>
<a href="read-htm-tid-7969541.html" id="a_ajax_7969541">检讨书一份</a>
and so on.
From each link I need to extract the post number and the title, like this:
4546 and 今后转移工作重点
7969541 and 检讨书一份
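Each array element should hold the post number and the title, separated by a comma. What regular expression should I use?
This is the code I have so far: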
MatchCollection mc = Regex.Matches(strHtmlBody, /* what pattern goes here? */);
string[] result = new string[mc.Count];
for (int i = 0; i < mc.Count; i++)
{
    result[i] = mc[i].Value;
}
[Solution]
using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample input: the two links from the question.
            string HTMLBody = @"<a href=""read-htm-tid-4546.html"" id=""a_ajax_4546"">今后转移工作重点</a><a href=""read-htm-tid-7969541.html"" id=""a_ajax_7969541"">检讨书一份</a>";

            // Group 1: the digits after "a_ajax_" in the id attribute (the post number).
            // Group 2: the link text up to the nearest </a> (the title, matched lazily).
            foreach (Match m in Regex.Matches(HTMLBody, @"id\=""a_ajax_(\d+)"">(.+?)</a>"))
            {
                Console.WriteLine("id {0} text {1}", m.Groups[1].Value, m.Groups[2].Value);
            }
        }
    }
}
[Solution]
Try this:
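// (?is) turns on case-insensitive matching and lets "." match newlines.
// "number" captures the post number from the href; "title" captures the link text.
MatchCollection mc = Regex.Matches(yourStr, @"(?is)<a href=""read-htm-tid-(?<number>[^.]+)\.html""[^>]*>(?<title>.*?)</a>");
string[] result = new string[mc.Count];
for (int i = 0; i < mc.Count; i++)
{
    // Join the post number and the title with a comma.
    result[i] = mc[i].Groups["number"].Value + "," + mc[i].Groups["title"].Value;
}

// Test code
foreach (string s in result)
{
    richTextBox2.Text += s + "\n";
}

/*----- Output -----
4546,今后转移工作重点
7969541,检讨书一份
*/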