Looking for a regular expression to extract the content of a specific HTML tag
<div id="content">
<p>ddd</p>
<div id="cc">ddd</div>
<img src="yili120x60.gif" />
<img src="120x60/omj120X60.gif" />
</div>
I want to extract everything inside <div id="content">, that is:
<p>ddd</p>
<div id="cc">ddd</div>
<img src="yili120x60.gif" />
<img src="120x60/omj120X60.gif" />
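This is for article scraping. I spent a whole day studying regex balancing groups and still can't figure them out. Any help would be appreciated.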
[Solution]
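using System;
using System.Text.RegularExpressions;

string str = @"<div class=""info"">aaaaaa<div id=""content""> <p>ddd</p> <div id=""cc"">ddd</div> <img src=""yili120x60.gif"" /> <img src=""120x60/omj120X60.gif"" /> </div>bbbbb</div>";
// Balancing group: each nested <div ...> pushes onto "Open" and each nested </div> pops it;
// (?(Open)(?!)) fails the match if a nested div is still unclosed.
Regex reg = new Regex(@"(?is)<div[^>]*?id=""content"">((?:(?<Open><div[^>]*?>)|(?<-Open></div>)|.*?)*)(?(Open)(?!))</div>");
// Group 1 holds the content between <div id="content"> and its matching </div>.
Console.WriteLine(reg.Match(str).Groups[1].Value);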
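
If you need this for more than one id, the same balancing-group idea can be wrapped in a small helper with the pattern spread out and commented. This is only a sketch under some assumptions: `DivExtractor`, `TryExtractInner`, and the group name `inner` are made-up names, the id attribute is assumed to be double-quoted, and the HTML is assumed to be reasonably well formed. It also swaps the `.*?` branch for a negative lookahead inside an atomic group, so ordinary text cannot swallow a div tag during backtracking.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DivExtractor
{
    // Hypothetical helper (not part of the original answer): tries to get the inner
    // HTML of the first <div> whose id attribute equals `id`.
    public static bool TryExtractInner(string html, string id, out string inner)
    {
        // IgnorePatternWhitespace lets the pattern carry its own # comments.
        string pattern =
            @"<div\b[^>]*?\sid=""" + Regex.Escape(id) + @"""[^>]*>   # opening tag of the target div
              (?<inner>
                (?>
                    <div\b[^>]*> (?<Open>)     # nested opening div: push onto the Open stack
                  | </div>       (?<-Open>)    # nested closing div: pop, fails when the stack is empty
                  | (?:(?!</?div\b).)+         # a run of text that does not start a div tag
                )*
              )
              (?(Open)(?!))                    # fail if a nested div is still open
              </div>                           # the closing tag that balances the target div";

        RegexOptions options = RegexOptions.IgnoreCase
                             | RegexOptions.Singleline
                             | RegexOptions.IgnorePatternWhitespace;

        Match m = Regex.Match(html, pattern, options);
        inner = m.Success ? m.Groups["inner"].Value.Trim() : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        string html = @"<div class=""info"">aaaaaa<div id=""content""> <p>ddd</p> <div id=""cc"">ddd</div> <img src=""yili120x60.gif"" /> <img src=""120x60/omj120X60.gif"" /> </div>bbbbb</div>";

        if (TryExtractInner(html, "content", out string inner))
        {
            Console.WriteLine(inner);
        }
        else
        {
            Console.WriteLine("no matching div found");
        }
    }
}
```

The `</div>` branch can only succeed while the `Open` stack is non-empty, so the loop stops at the first `</div>` that has no nested partner, and that one is matched as the end of the target div.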