A regex question about matching HTML in Java

Don't match line by line. Read the entire page into a single string first, then run the regex against that string.

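import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The statements below belong inside a method declared with "throws IOException".
URL url = new URL("http://www.weather.com.cn:80/weather/101210101.shtml");
BufferedReader bufIn = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuilder builder = new StringBuilder();
// The original pattern "<table.*" is greedy and, with DOTALL, runs from the first
// <table to the end of the page; the reluctant form matches each table separately.
String regex = "<table.*?</table>";
String line = null;
// DOTALL lets "." match line terminators as well
Pattern p = Pattern.compile(regex, Pattern.DOTALL);
List<String> list = new ArrayList<String>();
// Read the whole page into one string before matching
while ((line = bufIn.readLine()) != null) {
    // readLine() strips the line terminator, so put one back
    builder.append(line).append('\n');
}
bufIn.close();
line = builder.toString();
Matcher m = p.matcher(line);
while (m.find()) {
    // Store every match in the list
    list.add(m.group());
}
for (String list3 : list) {
    System.out.println(list3);
}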
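To see why reading the whole page first matters, here is a minimal, self-contained sketch. It works on an inline HTML string rather than a live download; the class name TableExtractor, the method findTables, and the sample markup are assumptions made for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TableExtractor {
    // DOTALL lets '.' cross line breaks; the reluctant '.*?' stops at the first </table>.
    private static final Pattern TABLE =
            Pattern.compile("<table.*?</table>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns every <table>...</table> block found in the given text. */
    public static List<String> findTables(String html) {
        List<String> tables = new ArrayList<>();
        Matcher m = TABLE.matcher(html);
        while (m.find()) {
            tables.add(m.group());
        }
        return tables;
    }

    public static void main(String[] args) {
        // Hypothetical sample page: the first table spans several lines.
        String html = "<html><body>\n"
                + "<table id=\"a\">\n<tr><td>1</td></tr>\n</table>\n"
                + "<p>text</p>\n"
                + "<table id=\"b\"><tr><td>2</td></tr></table>\n"
                + "</body></html>";

        // Matching against the whole document finds both tables.
        System.out.println(findTables(html).size()); // prints 2

        // Matching line by line misses the table that spans several lines.
        int perLine = 0;
        for (String line : html.split("\n")) {
            perLine += findTables(line).size();
        }
        System.out.println(perLine); // prints 1
    }
}
```

Note that this pattern does not handle nested tables: the reluctant quantifier stops at the first closing tag it meets. For pages with nesting, an HTML parser is likely a better fit than a regex.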

<html>
<head>
<script language="javascript">
function window_load() {
    var strHTML = ""; // document.body.innerHTML
    strHTML += "<html>";
    strHTML += " <head>";
    strHTML += " </head>";
    strHTML += " <body>";
    strHTML += " <font color='red'>test1</font><br />";
    strHTML += " <font size='18'>test2</font><br />";
    strHTML += " <font >test3</font><br />";
    strHTML += " <font></font>";
    strHTML += " </body>";
    strHTML += "</html>";
    // Match each <font ...>...</font> pair; \1 refers back to the tag name captured by (font)
    var reg = /<(font)\s*[^<>]*>[^<>]*<\/\1\s*>/ig;
    var aryResult = strHTML.match(reg);
    alert("Result of matching with match():\n\n" + aryResult.join("\n"));
}
</script>
</head>
<body onload="window_load()">
<!--
<font color='red'>test1</font><br />
<font size='18'>test2</font><br />
<font >test3</font><br />
<font></font>
-->
</body>
</html>
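Since the question is about Java, the same font-tag pattern can also be tried with java.util.regex. The following is only a sketch under stated assumptions: the class name FontTagMatcher and the method findFontTags are hypothetical, and the input is the same sample markup used in the JavaScript snippet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FontTagMatcher {
    // Same pattern as the JavaScript version. In a Java string literal each
    // backslash is doubled; \1 must repeat the tag name captured by (font).
    private static final Pattern FONT =
            Pattern.compile("<(font)\\s*[^<>]*>[^<>]*</\\1\\s*>", Pattern.CASE_INSENSITIVE);

    /** Returns every <font ...>text</font> element that contains no nested tags. */
    public static List<String> findFontTags(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = FONT.matcher(html);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<font color='red'>test1</font><br />"
                + "<font size='18'>test2</font><br />"
                + "<font >test3</font><br />"
                + "<font></font>";
        // Prints the four font elements, one per line.
        for (String tag : findFontTags(html)) {
            System.out.println(tag);
        }
    }
}
```

Because the element body is matched with [^<>]*, a font element that contains another tag is skipped rather than partially matched.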


Original article: http://outofmemory.cn/zaji/7436210.html
