
A question about extracting a string

2012-02-09 

A question about extracting a string
I want to extract a piece of text from an HTML file. The HTML is structured as follows:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft&reg; HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD> <BODY>
<UL>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="Robst">
<param name="Name" value="CopyRight">
<param name="Local" value="Introduction\CopyRight.htm">
<param name="Name" value="FAQ">
<param name="Local" value="FAQ\FAQ.htm">
<param name="Name" value="Welcome to Robst">
<param name="Local" value="Welcome\welcome.htm">
<param name="URL" value="Introduction\KnowRobst.htm">
<param name="Name" value="Known BlueSoleil">
</OBJECT>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="Dial-Up Networking">
<param name="Name" value="Mobile">
<param name="Local" value="Connection\Mobile\Mobile.htm">
</OBJECT>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="Environment">
<param name="Local" value="Welcome\welcome.htm">
</OBJECT>
.................
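For every <LI> <OBJECT type="text/sitemap"> I want to extract the value attribute on the first line that follows it, that is, the string inside the quotes. For example, from <param name="Name" value="Robst"> I want to extract Robst and write it to a new file. The other lines should be ignored until the next <LI> <OBJECT type="text/sitemap">.

How should I do this? I'm not very familiar with string handling, so any help would be appreciated. Thanks!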

[Solution]
Go learn regular expressions.
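As a rough illustration of the regular-expression route, here is a minimal sketch using std::regex. It assumes a C++11 compiler and that the whole file has already been read into one string; the function name first_values and the pattern itself are my own assumptions, not something posted in this thread. The pattern tolerates optional spaces around the quotes and captures only the first value after each text/sitemap marker.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the value of the first <param> that follows
// each <OBJECT type="text/sitemap"> marker in `html`.
std::vector<std::string> first_values(const std::string& html)
{
    // text/sitemap"   the end of the marker's type attribute
    // <param[^>]*value   the next <param> tag, up to its value attribute
    // ([^"]*?)\s*"   the quoted text, without trailing spaces
    static const std::regex pattern(
        R"re(text/sitemap\s*"\s*>\s*<param[^>]*value\s*=\s*"\s*([^"]*?)\s*")re");

    std::vector<std::string> result;
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end;
         it != end; ++it)
    {
        result.push_back((*it)[1].str()); // capture group 1 is the value text
    }
    return result;
}
```

To use it, one could read the file into a std::string (for example through a std::ostringstream and infile.rdbuf()), call first_values, and write each returned string to the output file on its own line.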
[Solution]
Then use the find member of the string class; have a look at the string class documentation.
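A minimal sketch of what the find approach could look like for a single line. The helper name extract_value is hypothetical, and the sketch assumes that the attribute is spelled value and that its text is enclosed in double quotes. It locates the attribute, then the two quotes after it, and trims spaces from both ends of the text in between.

```cpp
#include <string>

// Hypothetical helper: returns the quoted text of the value attribute in
// `line`, with surrounding spaces trimmed, or "" if there is none.
std::string extract_value(const std::string& line)
{
    std::string::size_type pos = line.find("value");
    if (pos == std::string::npos)
        return "";

    // Opening and closing quote of the attribute text.
    std::string::size_type open = line.find('"', pos);
    if (open == std::string::npos)
        return "";
    std::string::size_type close = line.find('"', open + 1);
    if (close == std::string::npos)
        return "";

    std::string value = line.substr(open + 1, close - open - 1);

    // Trim spaces at both ends so that "Robst   " becomes "Robst".
    std::string::size_type first = value.find_first_not_of(' ');
    if (first == std::string::npos)
        return "";
    std::string::size_type last = value.find_last_not_of(' ');
    return value.substr(first, last - first + 1);
}
```

Calling this only on the first line after each <LI> <OBJECT type="text/sitemap"> line gives the strings the question asks for.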
[Solution]
"text/sitemap">\s.*value="(.*)[^"]" // looks like this might work
boss is really dumb
[Solution]
I'm dumb too. The one above was wrong; it should be this:
"text/sitemap">\s.*value="[^"](.*)"
[Solution]
If you can't use regular expressions, then parse it by hand.

Contents of test.txt:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft&reg; HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD> <BODY>
<UL>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="Robst">
<param name="Name" value="CopyRight">
<param name="Local" value="Introduction\CopyRight.htm">


</OBJECT>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="Dial-Up Networking">
<param name="Name" value="Mobile">
<param name="Local" value="Connection\Mobile\Mobile.htm">
</OBJECT>


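#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    string line, value;
    bool inObject = false; // true after a text/sitemap line, until its first value is found
    ifstream infile("test.txt");
    if (!infile)
    {
        cerr << "cannot open test.txt" << endl;
        return 1;
    }
    // Test getline itself; looping on !infile.eof() processes the last line twice.
    while (getline(infile, line))
    {
        if (line.find("</OBJECT>") != string::npos)
            inObject = false;
        if (inObject)
        {
            string::size_type pos = line.find("value");
            string::size_type open = (pos == string::npos) ? pos : line.find('"', pos);
            string::size_type close = (open == string::npos) ? open : line.find('"', open + 1);
            if (close != string::npos)
            {
                value = line.substr(open + 1, close - open - 1); // text between the quotes
                value.erase(value.find_last_not_of(' ') + 1);    // trim trailing spaces
                cout << value << endl; // the result could also be written to another file
                inObject = false;      // only the first value after the marker is wanted
            }
        }
        if (line.find("text/sitemap") != string::npos)
            inObject = true;
    }

    return 0;
}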
