
Regular expression: how should this be handled?

2012-05-10 

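Regular expression, urgent!!
I need a regular expression that extracts the values inside the td elements of the table whose id is documentContainer from the HTML below.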
<html>
<div class="content" id="content" style="overflow:auto;width:100%;">
<table width="640" border="0" cellpadding="5" cellspacing="1" bgcolor="#EAEAEA">
  <tbody><tr>
  <td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">货币名称</td>
  <td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">现汇买入价</td>
  <td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">现钞买入价</td>
  <td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">现汇卖出价</td>
<td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">现钞卖出价</td>
  <td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">人民币汇率中间价</td>
<td width="73" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">中行折算价</td>
<td width="110" align="center" valign="middle" bgcolor="#EFEFEF" class="lan12_hover">发布时间</td>
  </tr>
</tbody>
</table>
<table id="documentContainer">
<tbody><tr>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">新西兰元</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">513.87</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20"></td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">518</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20"></td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20"></td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">514.07</td>
<td width="110" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">2012-03-07 11:31:28</td>
  </tr>
<tr>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">英镑</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">989.53</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">958.98</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">997.48</td>


<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">997.48</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">993.61</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">993.61</td>
<td width="110" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">2012-03-07 11:31:28</td>
  </tr>
<tr>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">澳门元</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">78.94</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">78.28</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">79.24</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">79.24</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20"></td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">79.07</td>
<td width="110" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">2012-03-07 11:31:28</td>
  </tr>
<tr>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">加拿大元</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">629.21</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">609.78</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">634.26</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">634.26</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">631.21</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">631.21</td>
<td width="110" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">2012-03-07 11:31:28</td>
  </tr>

<tr>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">卢布</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">21.25</td>


  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">&nbsp;</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">21.42</td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">&nbsp;</td>
  <td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20"></td>
<td width="73" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">21.34</td>
<td width="110" align="center" valign="middle" bgcolor="#FFFFFF" class="hui12_20">2012-03-07 11:24:14</td>
  </tr>



</tbody></table>

<table width="300" align="center" border="0" cellpadding="5" cellspacing="1" bgcolor="#FFFFFF">
<tbody><tr><td height="25" align="center">&nbsp;</td></tr>
<tr>
<td align="center" width="300"><a href="../../whpj/" class="lan12_hover" onclick="javascript:parent.scroll=(0,0) " target="_parent&quot;"><img align="center" width="120" height="32" border="0" src="../../images/drwhpj.jpg"></a></td>
</tr>
</tbody></table>
</div>
</html>

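[Solution]
1. First use <table id="documentContainer">[\s\S]+?</table> to grab that block of content.
2. Then strip the tags.

For example:
string text = Regex.Match(yourHtml, @"<table id=""documentContainer"">[\s\S]+?</table>").Value;
string text2 = Regex.Replace(text, @"<[^<>]*>", "");
What is left is the content you want.
If the format is not what you expected, then as the person asking you need to say exactly what output you want. Beyond that, tidy up the result yourself.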
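The two steps above can be sketched as a small helper class. This is only a minimal sketch: the class and method names (TwoStepExtractor, GetTableBlock, StripTags, StripTagsKeepCells) are made up for illustration, it assumes the opening tag is written exactly as <table id="documentContainer"> as in the sample, and the last method is a variation that the answer itself does not mention.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names; only Regex.Match and Regex.Replace come from the answer.
public static class TwoStepExtractor
{
    // Step 1: isolate the table whose id is documentContainer.
    // Assumes the opening tag is written exactly as in the sample HTML.
    public static string GetTableBlock(string html)
    {
        Match m = Regex.Match(html, @"<table id=""documentContainer"">[\s\S]+?</table>");
        return m.Success ? m.Value : string.Empty;
    }

    // Step 2: remove every tag, leaving only the text between the tags.
    public static string StripTags(string fragment)
    {
        return Regex.Replace(fragment, @"<[^<>]*>", "");
    }

    // Variation (an assumption, not part of the answer): turn each </td> into
    // a newline first, so neighbouring cells are not glued together once the
    // tags are gone.
    public static string StripTagsKeepCells(string fragment)
    {
        string separated = Regex.Replace(fragment, @"</td\s*>", "\n", RegexOptions.IgnoreCase);
        return Regex.Replace(separated, @"<[^<>]*>", "");
    }
}
```

Calling TwoStepExtractor.StripTags(TwoStepExtractor.GetTableBlock(yourHtml)) corresponds to text2 in the answer. Because plain tag stripping leaves no separator between adjacent cells, StripTagsKeepCells may be easier to post-process if the values need to be told apart.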

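[Solution]
C# code
// Note: the pattern expects \r\n line endings between cells and rows, as in the sample HTML.
foreach (Match m in Regex.Matches(html, @"(?is)<table[^>]*?id=(['""]?)documentContainer\1[^>]*?>.*?<tbody>(\r?\n?\s*<tr>(\r\n\s*<td\b[^>]*?>(?<v>.*?)</td>)+\r\n\s*</tr>\r\n)+"))
{
    // Group "v" is captured once per td, so walk all of its captures.
    foreach (Capture c in m.Groups["v"].Captures)
    {
        Console.WriteLine(c.Value);
    }
}

/*
Output (the console prints one value per line; grouped here one table row per line, with the blank lines from empty cells omitted):
新西兰元 | 513.87 | 518 | 514.07 | 2012-03-07 11:31:28
英镑 | 989.53 | 958.98 | 997.48 | 997.48 | 993.61 | 993.61 | 2012-03-07 11:31:28
澳门元 | 78.94 | 78.28 | 79.24 | 79.24 | 79.07 | 2012-03-07 11:31:28
加拿大元 | 629.21 | 609.78 | 634.26 | 634.26 | 631.21 | 631.21 | 2012-03-07 11:31:28
卢布 | 21.25 | &nbsp; | 21.42 | &nbsp; | 21.34 | 2012-03-07 11:24:14
*/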
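If the page might arrive with \n-only line endings, or if the values need to stay grouped by row, a staged variant may be easier to work with. The following is a rough sketch and not the answerer's code: RowCellExtractor and ExtractRows are hypothetical names, it assumes the target table contains no nested tables, and it assumes System.Net.WebUtility (.NET Framework 4.0 or later) is available for decoding entities such as &nbsp;.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; a staged alternative to the single large pattern.
public static class RowCellExtractor
{
    // Returns one list of cell texts per <tr> of the documentContainer table.
    public static List<List<string>> ExtractRows(string html)
    {
        var rows = new List<List<string>>();

        // Stage 1: isolate the table. The id may be quoted with ' or ", or unquoted.
        // Assumes there is no nested table inside it.
        Match table = Regex.Match(html,
            @"(?is)<table\b[^>]*?\bid=(['""]?)documentContainer\1[^>]*>(?<body>.*?)</table>");
        if (!table.Success)
        {
            return rows;
        }

        // Stage 2: one match per row, independent of line endings.
        foreach (Match tr in Regex.Matches(table.Groups["body"].Value,
                                           @"(?is)<tr\b[^>]*>(?<row>.*?)</tr>"))
        {
            var cells = new List<string>();

            // Stage 3: one match per cell inside the current row.
            foreach (Match td in Regex.Matches(tr.Groups["row"].Value,
                                               @"(?is)<td\b[^>]*>(?<v>.*?)</td>"))
            {
                // HtmlDecode turns &nbsp; into U+00A0, which Trim() removes
                // because it counts as Unicode white space.
                cells.Add(WebUtility.HtmlDecode(td.Groups["v"].Value).Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

Each inner list then corresponds to one currency row, with empty strings standing in for empty or &nbsp; cells, so the column positions stay aligned with the header table.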
