Good news for fellow readers of serialized novels
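Recently I got hooked on 《那时汉朝》 and went looking online for an e-book version to read on my phone. No luck: Tadu (塔读) charges for it, and the web serialization is free but awkward to read on a phone. What does work is copying the serialized chapters, merging them into a single txt file, and putting that on the phone. But 《那时汉朝》 alone runs to more than 600 installments, and copying and pasting them by hand would be exhausting, so I wrote the program below.
What it does: automatically downloads a serialized novel from the web and saves it locally as a txt file.
Before using it you need a Python environment installed (Python 2, since the script uses httplib and the print statement), and you need to check whether the serialization site follows a regular pattern. Without such a pattern the program cannot be used, heh, it's not that smart.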
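To make "follows a regular pattern" concrete, here is a minimal sketch in Python 3 of the check the script relies on. It assumes chapter links look like read_*.html and chapter titles look like 第12节, as in the script's own regular expressions. The helper name extract_chapters and the sample HTML are made up for illustration, and a real site may well use a different layout.

```python
import re

# Hypothetical helper. The patterns mirror the ones in the script
# (links like read_*.html, titles like 第12节); adjust them per site.
LINK_RE = re.compile(r"<a.*?</a>", re.IGNORECASE | re.DOTALL)
HREF_RE = re.compile(r"read_.*?html")
TITLE_RE = re.compile(r"第(\d+)节")


def extract_chapters(index_html):
    """Return (chapter_number, href) pairs sorted by chapter number."""
    chapters = []
    for link in LINK_RE.findall(index_html):
        href = HREF_RE.search(link)
        title = TITLE_RE.search(link)
        # Skip links that are not chapter links (navigation, ads, ...)
        if href and title:
            chapters.append((int(title.group(1)), href.group(0)))
    return sorted(chapters)


# Made-up index page fragment, deliberately out of order
sample = (
    '<a href="read_2.html">第2节</A>'
    '<a href="read_10.html">第10节</A>'
    '<a href="about.html">About</A>'
    '<a href="read_1.html">第1节</A>'
)
print(extract_chapters(sample))
```

If this returns an empty list for a site's index page, the site does not match the assumed pattern and the regular expressions need to be adapted first. Sorting on the integer chapter number keeps chapter 10 after chapter 2, which a plain string sort would not.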
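# -*- coding: UTF-8 -*-
# Python 2 script: uses httplib and the print statement.
import httplib
import re

# Replace the two placeholders with the site address and the path
# of the chapter index page.
conn = httplib.HTTPConnection("site address")
conn.request("GET", "/path of the chapter index page")
r1 = conn.getresponse()

# Collect every link on the index page.
p = re.compile(r"<a.*?</A>")
aList = p.findall(r1.read())

f = open("nashihanchao/那时汉朝.txt", "w")

# Keep only chapter links and record each chapter number with its href.
articalList = []
for e in aList:
    href = re.compile(r"read_.*?html")
    hrefList = href.findall(e)
    if hrefList:
        title = re.compile(r"第\d*?节")
        titleList = title.findall(e)
        if not titleList:
            continue  # chapter-like link without a 第N节 title
        # Take the first run of digits; this does not depend on how many
        # bytes the page encoding uses for 第.
        titleNumRegx = re.compile(r"\d+")
        titleNum = titleNumRegx.findall(titleList[0])
        articalList.append({'name': int(titleNum[0]), 'href': hrefList[0]})

# Download in chapter order.
articalList.sort(key=lambda obj: obj.get('name'), reverse=False)

for e in articalList:
    print "loading", e.get('name'), e.get('href')
    conn.request("GET", e.get('href'))
    r = conn.getresponse()
    # The chapter text sits in the table cell with class ART.
    contentRegx = re.compile(r"<TD CLASS=ART>[\w\W]*?</TD>")
    contentList = contentRegx.findall(r.read())
    content = contentList[0]
    # Strip the cell tags and turn line-break tags into newlines.
    content = content.replace("<TD CLASS=ART>", "")
    content = content.replace("</TD>", "")
    content = content.replace("<br><br><br>", "\n")
    content = content.replace("<br><br>", "\n")
    content = content.replace("<br>", "\n")
    content = content.replace(" ", "\n")
    print >>f, content

f.close()
conn.close()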
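The cleanup step at the end of the script can be tried out without any network access. Below is a rough Python 3 sketch of it, assuming the chapter text sits in a <TD CLASS=ART> cell as in the script. The function name extract_text and the sample page are hypothetical. It also differs from the script in two ways: any run of consecutive <br> tags collapses into a single newline, and the final replacement of the space character is left out.

```python
import re

# Same cell pattern as the script, with a group around the cell body
CONTENT_RE = re.compile(r"<TD CLASS=ART>([\w\W]*?)</TD>")


def extract_text(page_html):
    """Return the chapter body as plain text, or None if the cell is missing."""
    match = CONTENT_RE.search(page_html)
    if match is None:
        return None
    body = match.group(1)
    # Collapse each run of <br> tags into one newline
    return re.sub(r"(?:<br>)+", "\n", body)


# Made-up chapter page fragment
page = "<html><TD CLASS=ART>line one<br><br>line two<br>line three</TD></html>"
print(extract_text(page))
```

Returning None for a page without the expected cell lets the caller skip or report that chapter, where indexing an empty match list would stop the whole download.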
Reply #1 wsh303496225 2013-09-08 There is a program called 小说阅读下载器 (a novel reader and downloader) that can do what you need.