How to scrape website data with ASP.NET


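Just write out the function's return value; you can also store the result in a database instead. This is only an example (classic ASP, VBScript). For other needs, adapt it in the same way and call it in a loop.

Response.Write Showipinfo("202.29.90.9")

' Show the location of an IP address, using the IP138 database as the reference.
' Note: the host part of the URL and the two delimiter strings passed to strCut
' are missing in the source (most likely HTML tags stripped during extraction);
' they are left exactly as found and must be filled in before the call returns anything useful.
Function Showipinfo(ip)
    Dim urls, str, showipinfos
    urls = "/ips138.asp?ip=" & ip & "&action=2"
    str = GetHttpPage(urls)
    Showipinfo = strCut(str, "", "", 2)   ' cut out the IP location text
    showipinfos = Replace(Showipinfo, "本站主数据:", "1、")
    Showipinfo = Replace(showipinfos, "参考数据一:", "2、")
End Function

'****************************************
' Function: GetHttpPage(url)   2011-5-17 xuyang
' Purpose:  fetch a web page from ASP; works for both GB2312 and UTF-8 pages
' Argument: url - address of the page
'****************************************
Function GetHttpPage(url)
    Dim ResStr, ResBody, PageCode
    If IsNull(url) = True Or url = "False" Then
        GetHttpPage = ""
        Exit Function
    End If
    Dim Http, sStartTime
    Set Http = Server.CreateObject("MSXML2.XMLHTTP")
    With Http
        .Open "GET", url, False
        .Send
    End With
    'Http.open "GET", url, False
    'Http.Send(Null)
    sStartTime = Now
    On Error Resume Next
    If Http.Status <> 200 Then
        Set Http = Nothing
        GetHttpPage = ""
        Exit Function
    End If
    ' Wait at most 10 seconds for the request to complete.
    Do While Http.ReadyState <> 4
        If DateDiff("s", sStartTime, Now) > 10 Then
            GetHttpPage = ""
            Exit Function
        End If
    Loop
    If Http.ReadyState = 4 Then
        If Http.Status = 200 Then
            PageCode = test(url)   ' note: this requests the page a second time
            GetHttpPage = bytesToBSTR(Http.responseBody, PageCode)
        End If
    End If
    Set Http = Nothing
    If Err.Number <> 0 Then
        Err.Clear
    End If
End Function

' Convert a raw response body to a string using the given charset.
Function bytesToBSTR(body, Cset)
    Dim Objstream
    Set Objstream = CreateObject("adodb.stream")
    Objstream.Type = 1        ' binary
    Objstream.Mode = 3        ' read/write
    Objstream.Open
    Objstream.Write body
    Objstream.Position = 0
    Objstream.Type = 2        ' text
    Objstream.Charset = Cset
    bytesToBSTR = Objstream.ReadText
    Objstream.Close
    Set Objstream = Nothing
End Function

' Fetch sUrl and return the detected charset of its body.
Function test(sUrl)
    Dim ox
    Set ox = Server.CreateObject("msxml2.xmlhttp")
    ox.Open "get", sUrl, False
    ox.Send
    test = charsetOf(ox.responseBody)
End Function

' Guess the charset of a byte string: NUL bytes, then a "charset=" declaration,
' then byte-pattern checks for UTF-8, GBK and Unicode.
Function charsetOf(bstr)
    Dim p, c, r, matches
    If InStrB(bstr, ChrB(0)) > 0 Then
        charsetOf = "unicode"
        Exit Function
    End If
    c = s2b("charset=")
    p = InStrB(1, bstr, c, 1)
    If p > 0 Then
        c = b2s(MidB(bstr, p + LenB(c), 20))
        Set r = New RegExp
        r.Pattern = "^['""]?([-\w]+)"
        Set matches = r.Execute(c)
        If matches.Count > 0 Then
            charsetOf = LCase(matches(0).SubMatches(0))
            Exit Function
        End If
    End If
    Dim n, ucsOnly, ret
    ucsOnly = False
    n = LenB(bstr)
    For p = 1 To n
        c = AscB(MidB(bstr, p, 1))
        If c And &H80 Then Exit For
        ' The comparison operators on the next lines were lost in the source;
        ' the "< &H20" bound (control characters other than CR, LF, TAB) is a reconstruction.
        If c < &H20 Then
            If c <> &HD And c <> &HA And c <> &H9 Then
                ucsOnly = True
                Exit For
            End If
        End If
    Next
    If p > n Then
        ret = "ascii"
    ElseIf Not ucsOnly Then
        If isUtf8(bstr, p, n) Then
            ret = "utf-8"
        ElseIf isGbk(bstr, p, n) Then
            ret = "GB2312"
        End If
    End If
    If IsEmpty(ret) Then
        If isUnicode(bstr, p, n) Then
            charsetOf = "unicode"
        Else
            charsetOf = "unknown"
        End If
    Else
        charsetOf = ret
    End If
End Function

' String to byte string.
Function s2b(str)
    Dim r, i
    For i = 1 To Len(str)
        r = r + ChrB(Asc(Mid(str, i, 1)) And &HFF)
    Next
    s2b = r
End Function

' Byte string to string.
Function b2s(bs)
    Dim r, i
    For i = 1 To LenB(bs)
        r = r + Chr(AscB(MidB(bs, i, 1)))
    Next
    b2s = r
End Function

' True if bytes start..Length form valid 2- and 3-byte UTF-8 sequences.
' The masks are parenthesised because VBScript evaluates "=" and "<>" before And.
' The continuation-byte masks were garbled in the source ("And &H30 &HC0");
' the standard test (b And &HC0) <> &H80 is used here instead.
Function isUtf8(bs, start, Length)
    isUtf8 = True
    Dim p, e, c
    e = False
    For p = start To Length
        c = AscB(MidB(bs, p, 1))
        If c And &H80 Then
            If (c And &HE0) = &HC0 Then
                If p = Length Then
                    e = True
                Else
                    p = p + 1
                    If (AscB(MidB(bs, p, 1)) And &HC0) <> &H80 Then e = True
                End If
            ElseIf (c And &HF0) = &HE0 Then
                If p = Length Or p = Length - 1 Then
                    e = True
                Else
                    p = p + 2
                    If (AscB(MidB(bs, p - 1, 1)) And &HC0) <> &H80 Then
                        e = True
                    ElseIf (AscB(MidB(bs, p, 1)) And &HC0) <> &H80 Then
                        e = True
                    End If
                End If
            Else
                e = True
            End If
        End If
        If e Then
            isUtf8 = False
            Exit Function
        End If
    Next
End Function

' True if every high byte in start..Length is followed by another high byte.
Function isGbk(bs, start, Length)
    isGbk = True
    Dim p, e, c
    e = False
    For p = start To Length
        c = AscB(MidB(bs, p, 1))
        If c And &H80 Then
            If p = Length Then
                e = True
            Else
                p = p + 1
                If (AscB(MidB(bs, p, 1)) And &H80) = 0 Then e = True
            End If
        End If
        If e Then
            isGbk = False
            Exit Function
        End If
    Next
End Function

Function isUnicode(bs, start, Length)
    isUnicode = True
    Dim p, c
    If start Mod 2 = 0 Then
        isUnicode = False
        Exit Function
    End If
    For p = start To Length
        c = AscB(MidB(bs, p, 1))
        If c And &H80 Then
            If p = Length Then
                isUnicode = False
                Exit Function
            Else
                p = p + 1
            End If
        End If
    Next
End Function

' Cut a substring. CutType 1: include the start and end strings; 2: exclude them.
Function strCut(strContent, StartStr, EndStr, CutType)
    Dim strHtml, S1, S2
    strHtml = strContent
    On Error Resume Next
    Select Case CutType
        Case 1
            S1 = InStr(strHtml, StartStr)
            S2 = InStr(S1, strHtml, EndStr) + Len(EndStr)
        Case 2
            S1 = InStr(strHtml, StartStr) + Len(StartStr)
            S2 = InStr(S1, strHtml, EndStr)
    End Select
    If Err Then
        strCut = "The requested content was not found."
        Err.Clear
        Exit Function
    Else
        strCut = Mid(strHtml, S1, S2 - S1)
    End If
End Function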
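For readers who want to try the decoding and cutting steps outside an ASP server, here is a minimal Python sketch of the same idea. It is not a port of the listing above: `charset_of` and `str_cut` are hypothetical names, the byte-pattern checks are replaced by simply attempting each decode in turn, and `str_cut` returns `None` instead of an error message when a marker is missing.

```python
import re


def charset_of(body: bytes) -> str:
    """Guess the encoding of a raw page body (assumed order: NUL bytes,
    a declared charset, then trial decoding)."""
    if b"\x00" in body:
        return "utf-16"  # the ASP listing reports this case as "unicode"
    declared = re.search(rb"charset=['\"]?([-\w]+)", body, re.IGNORECASE)
    if declared:
        return declared.group(1).decode("ascii").lower()
    # No declaration: try the candidate encodings from strictest to loosest.
    for name in ("ascii", "utf-8", "gbk"):
        try:
            body.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"


def str_cut(content: str, start: str, end: str, inclusive: bool = False):
    """Return the text between start and end, or None if either is missing.

    inclusive=True keeps the markers (CutType 1 in the listing),
    inclusive=False drops them (CutType 2).
    """
    s1 = content.find(start)
    if s1 < 0:
        return None
    s2 = content.find(end, s1 + len(start))
    if s2 < 0:
        return None
    if inclusive:
        return content[s1:s2 + len(end)]
    return content[s1 + len(start):s2]
```

A typical use would be `text = body.decode(charset_of(body))` followed by `str_cut(text, start_marker, end_marker)`, where the two markers are whatever HTML surrounds the data you want on the target page.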

With ASP or ASP.NET, the server has to support the XMLHTTP component.

PHP has an fopen option (presumably `allow_url_fopen`) that has to be switched on.

Both of these approaches scrape from your own website, so they depend on what the server supports.

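You can also scrape with a program running locally, that is, have your own computer collect articles and add them to your site automatically. The most capable tool for this is 火车头 (LocoySpider). I often use it to scrape content from one site onto another, which leaves that site's content a complete mess... and achieves the goal of making mischief.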

On your side, being familiar with HTML is enough. You don't even need to know it well; a little will do.

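For example, given a site's HTML source, you only need to be able to tell which parts are the content you want to scrape and which are not. It really is simple; it took me only a few minutes to learn.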
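To illustrate what "finding the part you need" can look like in practice, here is a small Python sketch using only the standard library's `html.parser`. It assumes the wanted content sits inside an element with a known `id`; the class name `BlockText`, the function `extract_text` and the id `article` are hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser


class BlockText(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    # Tags that never have a closing tag, so they must not change the depth.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # greater than 0 while inside the target element
        self.done = False    # set once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in self.VOID:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in self.VOID:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.parts.append(data.strip())


def extract_text(html, target_id):
    """Return the text of the element with the given id, or '' if absent."""
    parser = BlockText(target_id)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

Navigation bars, footers and login widgets fall outside the chosen element and are skipped, which is the "which parts you don't need" half of the job.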


Feel free to share; when reposting, please credit the source: 内存溢出

Original article: https://outofmemory.cn/yw/12519174.html
