
Chinese text is garbled when fetching a page's HTML with the WinHttp component

2012-09-18 

I am using the WinHttp component to fetch the HTML text of a page, and the Chinese characters come out garbled.
The code is very simple:

C# code
using System;
using System.Collections.Generic;
using System.Text;
using System.Net;

namespace winhttptest
{
    class Program
    {
        static void Main(string[] args)
        {
            WinHttp.WinHttpRequest whr = new WinHttp.WinHttpRequest();
            string url = Console.ReadLine();
            while (url != string.Empty)
            {
                whr.Open("GET", url, false);
                whr.Send("");
                string html = whr.ResponseText;
                Console.WriteLine(html);
                url = Console.ReadLine();
            }
        }
    }
}


Enter a complete URI (for example: http://news.sina.com.cn/c/2008-10-12/095416439207.shtml) and press Enter. All the Chinese characters in the printed text are garbled. How can this be fixed?




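[Solution]
C# code
string html = whr.ResponseText;
// Re-encodes the already-decoded text and reinterprets the bytes as GB2312.
// This can only help if ResponseText preserved every original byte; decoding
// the raw bytes in whr.ResponseBody with the right encoding is generally more reliable.
html = Encoding.GetEncoding("GB2312").GetString(Encoding.UTF8.GetBytes(html));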
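A variation on the idea above is to skip ResponseText altogether and decode the raw bytes from ResponseBody. The following is only a sketch under some assumptions: `WinHttpFetcher`, `Decode` and `FetchHtml` are made-up names, the object is created by late binding through the ProgID `WinHttp.WinHttpRequest.5.1` (so it runs on Windows only), and the page is assumed to really be encoded in the charset you pass in.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class WinHttpFetcher
{
    // Decodes raw response bytes with an explicitly named charset.
    // Note: on .NET Core / .NET 5+, "gb2312" is only available after calling
    // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) once.
    public static string Decode(byte[] raw, string charsetName)
    {
        return Encoding.GetEncoding(charsetName).GetString(raw);
    }

    // Fetches a page through WinHttp and decodes the body ourselves.
    // Windows only: relies on the WinHttp COM component being registered.
    public static string FetchHtml(string url, string charsetName)
    {
        Type comType = Type.GetTypeFromProgID("WinHttp.WinHttpRequest.5.1");
        if (comType == null)
        {
            throw new InvalidOperationException("WinHttp COM component not found.");
        }

        dynamic whr = Activator.CreateInstance(comType);
        whr.Open("GET", url, false);
        whr.Send("");

        // ResponseBody holds the undecoded bytes; assumed to marshal as byte[].
        byte[] raw = (byte[])whr.ResponseBody;
        return Decode(raw, charsetName);
    }
}
```

With the page from the question, a call might look like `WinHttpFetcher.FetchHtml("http://news.sina.com.cn/c/2008-10-12/095416439207.shtml", "gb2312")`. With the typed `WinHttp.WinHttpRequest` from the question, the same `(byte[])whr.ResponseBody` cast should apply.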
[Solution]
C# code
using System;
using System.Collections.Generic;
using System.Text;
using System.Net;
using System.IO;

namespace winhttptest
{
    class Program
    {
        private static string GetResponse(string url)
        {
            // Trim() returns a new string, so the result must be assigned back.
            url = url.Trim();
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
            req.AllowAutoRedirect = true;
            req.MaximumAutomaticRedirections = 3;
            // text/xml;charset=utf-8
            req.UserAgent = "Mozilla/4.0 (compatible;MSIE 6.0;Windows NT 5.2;.NET CLR 1.1.4322)";
            req.Referer = req.RequestUri.ToString();
            req.KeepAlive = true;
            //req.Method = "Get";
            req.Timeout = -1; // -1 means wait indefinitely
            try
            {
                // Read the body as GB2312 and dispose the response and reader when done.
                using (HttpWebResponse webresponse = (HttpWebResponse)req.GetResponse())
                using (StreamReader reader = new StreamReader(webresponse.GetResponseStream(), Encoding.GetEncoding("gb2312")))
                {
                    return reader.ReadToEnd();
                }
            }
            catch (WebException ex)
            {
                return ex.Message;
            }
        }

        static void Main(string[] args)
        {
            string html = GetResponse("http://news.sina.com.cn/c/2008-10-12/095416439207.shtml");
            Console.WriteLine(html);
            Console.ReadLine();
        }
    }
}
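[Solution]
It mainly comes down to encoding. Some pages are UTF-8 and some are GB2312, so set the encoding to match the page you are actually fetching.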
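One way to "match the actual page" without hard-coding the encoding is to read the charset from the Content-Type response header when the server sends one. This is just a rough sketch: `CharsetHelper` and `PickEncoding` are made-up names, and it assumes the header is accurate. Pages that declare their charset only in an HTML meta tag are not handled here.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class CharsetHelper
{
    // Returns the encoding named in a Content-Type value such as
    // "text/html; charset=gb2312", or the fallback when no usable
    // charset parameter is present.
    public static Encoding PickEncoding(string contentType, Encoding fallback)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return fallback;
        }

        foreach (string part in contentType.Split(';'))
        {
            string p = part.Trim();
            if (!p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            // Strip optional quotes around the charset name.
            string name = p.Substring("charset=".Length).Trim().Trim('"', '\'');
            if (name.Length == 0)
            {
                return fallback;
            }

            try
            {
                return Encoding.GetEncoding(name);
            }
            catch (ArgumentException)
            {
                // Unknown or unsupported charset name.
                return fallback;
            }
        }

        return fallback;
    }
}
```

With HttpWebResponse the header value is available as `webresponse.ContentType`, and with WinHttp as `whr.GetResponseHeader("Content-Type")`. The fallback is whatever you expect the site to use, for example `Encoding.GetEncoding("gb2312")` for the page in the question.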
