Java 从网页抽取数据 存入数据库

Java 从网页抽取数据 存入数据库,第1张

台风的编号和名称直接在源码中有,但时间和地理位置我只能跟踪到

function totf(tfbh){

location.href( "Typhoon.aspx?id="+tfbh)

}

数据需要从aspx中拿到的,应该是存放到数据库的,页面上是拿不到的

我认为可以通过循环模拟发送请求Typhoon.aspx?id="+XXX,然后通过解析response包的方式可以获得详细的信息

下面一个页面是讲模拟发送请求的

http://tidus2005.javaeye.com/blog/195544

希望对你有帮助

写了一段获得一组数据的代码

//get Typhoon content by param

public static String getTyphoon(String param) {

URL url = null

try {

url = new URL(param)

} catch (MalformedURLException e) {

e.printStackTrace()

}

HttpURLConnection connection = null

InputStream is = null

try {

connection = (HttpURLConnection) url.openConnection()

is = connection.getInputStream()

} catch (IOException e) {

e.printStackTrace()

}

BufferedInputStream bis = new BufferedInputStream(is)

int len = 0

byte[] buf_all = new byte[0]

try {

while (true) {

byte[] buf1 = new byte[4096]

byte[] buf2 = buf_all

len = bis.read(buf1)

if(len <= 0){

break

}

buf_all = new byte[len+buf2.length]

System.arraycopy(buf2, 0, buf_all, 0, buf2.length)

System.arraycopy(buf1, 0, buf_all, buf2.length, len)

}

} catch (IOException e) {

e.printStackTrace()

}

String content = null

try {

content = new String(buf_all, "utf-8")

} catch (UnsupportedEncodingException e) {

e.printStackTrace()

}

int startIndex = content.indexOf("var ary0=")+9

content = content.substring(startIndex)

int endIndex = content.indexOf("var aryyb0=")

content = content.substring(0, endIndex)

return content

}

得到的结果是这样的:

[['200906','2009-07-19 20:00:00','23.8','109.6','','15','','','260','','','54440','','莫拉菲','Molave','7'],

['200906','2009-07-19 15:00:00','23.5','111','993','15','25','西北西','260','','','54439','','莫拉菲','Molave','7'],

['200906','2009-07-19 14:00:00','23.3','111.2','','18','','','260','','','54438','','莫拉菲','Molave','8'],

['200906','2009-07-19 13:00:00','23.3','111.5','990','18','25','西北西','260','','','54437','','莫拉菲','Molave','8'],

['200906','2009-07-19 12:00:00','23.2','111.8','990','18','25','西北西','260','','','54436','','莫拉菲','Molave','8'],

['200906','2009-07-19 11:00:00','23.2','112.1','987','18','25','西北西','260','','','54435','','莫拉菲','Molave','8'],

['200906','2009-07-19 10:00:00','23.2','112.4','987','18','25','西北西','260','','','54434','','莫拉菲','Molave','8'],

['200906','2009-07-19 09:00:00','23','112.6','987','20','25','西北西','260','','','54433','','莫拉菲','Molave','8'],

['200906','2009-07-19 08:00:00','22.9','112.9','987','20','','','260','','','54432','','莫拉菲','Molave','8'],

['200906','2009-07-19 07:00:00','22.9','113.2','985','23','25','西北西','260','','','54431','','莫拉菲','Molave','9'],

['200906','2009-07-19 06:00:00','22.8','113.4','982','25','25','西北西','260','','','54430','','莫拉菲','Molave','10'],

['200906','2009-07-19 05:00:00','22.7','113.7','980','28','25','西北西','260','','','54429','','莫拉菲','Molave','10'],

['200906','2009-07-19 04:00:00','22.7','114','975','30','25','西北西','260','','','54428','','莫拉菲','Molave','11'],

['200906','2009-07-19 03:00:00','22.7','114.2','975','33','25','西北偏西','260','80','','54426','','莫拉菲','Molave','12'],

['200906','2009-07-19 02:00:00','22.6','114.5','','35','','','260','80','','54425','','莫拉菲','Molave','12'],

['200906','2009-07-19 01:00:00','22.5','114.5','970','35','28','西北西','260','80','','54424','','莫拉菲','Molave','12'],

['200906','2009-07-19 00:00:00','22.5','114.8','965','38','28','西北西','260','80','','54423','','莫拉菲','Molave','13'],

['200906','2009-07-18 23:00:00','22.4','115.1','','38','','','260','80','','54422','','莫拉菲','Molave','13'],

['200906','2009-07-18 22:00:00','22.3','115.5','965','38','25','西北西','260','80','','54421','','莫拉菲','Molave','13'],

['200906','2009-07-18 21:00:00','22.2','115.7','965','38','25','西北西','260','80','','54420','','莫拉菲','Molave','13'],

['200906','2009-07-18 20:00:00','22.2','116','','35','','','260','80','','54419','','莫拉菲','Molave','12'],

['200906','2009-07-18 19:00:00','22.2','116.2','970','35','25','西北偏西','260','80','','54418','','莫拉菲','Molave','12'],

['200906','2009-07-18 18:00:00','22.1','116.5','970','35','25','西北偏西','260','80','','54417','','莫拉菲','Molave','12'],

['200906','2009-07-18 17:00:00','22','116.7','970','35','25','西北西','260','80','','54416','','莫拉菲','Molave','12'],

['200906','2009-07-18 16:00:00','21.9','116.9','970','35','25','西北偏西','260','80','','54415','','莫拉菲','Molave','12'],

['200906','2009-07-18 15:00:00','21.8','117.1','970','35','25','西北偏西','260','80','','54414','','莫拉菲','Molave','12'],

['200906','2009-07-18 14:00:00','21.7','117.2','970','35','25','西北西','260','80','','54413','','莫拉菲','Molave','12'],

['200906','2009-07-18 13:00:00','21.7','117.4','970','35','25','西北西','260','80','','54412','','莫拉菲','Molave','12'],

['200906','2009-07-18 12:00:00','21.6','117.5','975','33','25','西北西','260','80','','54411','','莫拉菲','Molave','12'],

['200906','2009-07-18 11:00:00','21.6','117.7','975','33','25','西北西','260','80','','54410','','莫拉菲','Molave','12'],

['200906','2009-07-18 10:00:00','21.6','117.9','975','33','25','西北西','260','80','','54409','','莫拉菲','Molave','12'],

['200906','2009-07-18 09:00:00','21.5','118.2','975','33','25','西北西','260','80','','54408','','莫拉菲','Molave','12'],

['200906','2009-07-18 08:00:00','21.4','118.3','975','33','25','西北偏西','260','80','','54407','','莫拉菲','Molave','12'],

['200906','2009-07-18 07:00:00','21.4','118.5','975','33','25','西北西','260','80','','54406','','莫拉菲','Molave','12'],

['200906','2009-07-18 06:00:00','21.3','118.7','975','33','25','西北西','260','80','','54405','','莫拉菲','Molave','12'],

['200906','2009-07-18 05:00:00','21.2','119','975','33','','','260','60','','54404','','莫拉菲','Molave','12'],

['200906','2009-07-18 04:00:00','21.2','119.2','978','30','25','西北西','260','60','','54403','','莫拉菲','Molave','11'],

['200906','2009-07-18 03:00:00','21.1','119.4','978','30','25','西北偏西','260','60','','54402','','莫拉菲','Molave','11'],

['200906','2009-07-18 02:00:00','21','119.6','978','30','','','260','60','','54401','','莫拉菲','Molave','11'],

['200906','2009-07-18 01:00:00','21','120.1','978','30','25','西北偏西','260','60','','54400','','莫拉菲','Molave','11'],

['200906','2009-07-18 00:00:00','20.9','120.3','978','30','25','西北偏西','260','60','','54399','','莫拉菲','Molave','11'],

['200906','2009-07-17 23:00:00','20.8','120.5','978','30','20','西北偏西','260','60','','54398','','莫拉菲','Molave','11'],

['200906','2009-07-17 22:00:00','20.7','121','978','30','20','西北偏西','260','60','','54397','','莫拉菲','Molave','11'],

['200906','2009-07-17 21:00:00','20.7','121.2','978','30','20','西北偏西','260','60','','54396','','莫拉菲','Molave','11'],

['200906','2009-07-17 20:00:00','20.6','121.5','978','30','20','西北偏西','260','60','','54395','','莫拉菲','Molave','11'],

['200906','2009-07-17 19:00:00','20.4','121.8','980','28','20','西北西','260','60','','54394','','莫拉菲','Molave','10'],

['200906','2009-07-17 18:00:00','20.3','121.9','980','28','20','西北偏西','260','60','','54393','','莫拉菲','Molave','10'],

['200906','2009-07-17 17:00:00','20.2','122.1','980','28','20','西北偏西','200','50','','54392','','莫拉菲','Molave','10'],

['200906','2009-07-17 14:00:00','19.5','122.7','','25','','','200','50','','54391','','莫拉菲','Molave','10'],

['200906','2009-07-17 11:00:00','18.9','123.3','985','25','15','西北','200','50','','54390','','莫拉菲','Molave','10'],

['200906','2009-07-17 08:00:00','18.6','123.6','994','20','','','100','','','54389','','莫拉菲','Molave','8'],

['200906','2009-07-17 05:00:00','18.4','123.9','996','18','15','西北','100','','','54388','','莫拉菲','Molave','8'],

['200906','2009-07-17 02:00:00','17.9','124.1','996','18','15','西北','50','','','54387','','莫拉菲','Molave','8'],

['200906','2009-07-16 23:00:00','17.6','124.6','996','18','15','西北','','','','54386','','莫拉菲','Molave','8'],

['200906','2009-07-16 20:00:00','17.4','124.7','996','18','','','','','','54385','','莫拉菲','Molave','8']]

再下去字符串的拆分实在是太复杂了,不想写了

使用时只要参数为http://www.wztf121.com/Typhoon.aspx?id=

id后是台风的代码号,写一个循环就可以了


欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/sjk/9886964.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2023-05-03
下一篇 2023-05-03

发表评论

登录后才能评论

评论列表(0条)

保存