Reading the source code of a URL page with a Java program


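Pass in a URL and get the page source back as a string. The method below goes inside any class:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

// Fetches the page at the given URL and returns its content as a string;
// that string can then be saved to a file. Returns null on failure.
public static String getHTML(String url) {
    try {
        URL newUrl = new URL(url);
        URLConnection connect = newUrl.openConnection();
        // Some servers reject requests that carry no browser-like User-Agent
        connect.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // Assumes the page is UTF-8 encoded; try-with-resources closes the reader
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connect.getInputStream(), "UTF-8"))) {
            StringBuilder html = new StringBuilder();
            String readLine;
            while ((readLine = in.readLine()) != null) {
                // readLine() strips the line terminator, so add it back
                html.append(readLine).append('\n');
            }
            return html.toString();
        }
    } catch (MalformedURLException me) {
        System.out.println("MalformedURLException: " + me);
    } catch (IOException ioe) {
        System.out.println("IOException: " + ioe);
    }
    return null;
}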
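The comment on getHTML mentions saving the returned string to a file. A minimal sketch of that step follows; the class name PageSaver, the method name save, and the temp-file name are illustrative assumptions, not part of the original answer, and the content is written as UTF-8 to match the decoding used when reading the page.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not from the original post): writes page source to disk.
class PageSaver {

    // Writes the given HTML string to the target path as UTF-8,
    // creating the file or overwriting it if it already exists.
    static void save(String html, Path target) throws IOException {
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // In real use, this string would be the return value of getHTML(url).
        String html = "<html><body>example</body></html>";
        Path target = Files.createTempFile("page", ".html");
        save(html, target);
        System.out.println("Saved " + html.length() + " characters to " + target);
    }
}
```

Since getHTML returns null on failure, the caller would want to check for null before passing the result to save.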

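In JavaScript, the same can be done with a synchronous XMLHTTP request (ActiveXObject is available only in Internet Explorer):

var xml = new ActiveXObject("Msxml2.XMLHTTP");
xml.open("get", "http://www.baidu.com", false);
xml.send(null);
if (xml.readyState != 4 || xml.status != 200) {
    alert("An error occurred");
} else {
    // xml.responseText holds the content of the requested page
    alert(xml.responseText);
}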
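The status check in the JavaScript snippet has a Java counterpart: HttpURLConnection exposes the response code, so a non-200 answer can be rejected before the body is read. The following is a sketch under stated assumptions. The class name StatusCheckedFetch, the method name fetch, and the 5-second timeouts are illustrative choices, and the code assumes an http or https URL and a UTF-8 encoded page.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not from the original post).
class StatusCheckedFetch {

    // Returns the page body as a string, or throws IOException
    // if the server answers with anything other than 200 OK.
    static String fetch(String url) throws IOException {
        // Assumes an http/https URL; other schemes would fail this cast.
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // assumed timeout values, in milliseconds
        conn.setReadTimeout(5000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status: " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                // Assumes the page is UTF-8 encoded
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java StatusCheckedFetch <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Throwing an exception on a bad status, instead of returning null, lets the caller tell a failed request apart from an empty page.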


Sharing is welcome; when reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/yw/11755408.html
