http://www.flipkart.com/mens-footwear/shoes/casual-shoes/pr?p%5B%5D=sort%3Dpopularity&sID=osp%2Ccil%2Cnit%2Ce1f&start=31&AJAX=true
http://www.flipkart.com/mens-footwear/shoes/casual-shoes/pr?p%5B%5D=sort%3Dpopularity&sID=osp%2Ccil%2Cnit%2Ce1f&start=46&AJAX=true
http://www.flipkart.com/mens-footwear/shoes/casual-shoes/pr?p%5B%5D=sort%3Dpopularity&sID=osp%2Ccil%2Cnit%2Ce1f&start=61&AJAX=true
...
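The sample URLs differ only in the `start` parameter, which rises by 15 each time (31, 46, 61). A minimal sketch of generating such URLs with only the standard library is shown here. The names `BASE_URL`, `PAGE_SIZE`, `page_url` and `page_urls` are hypothetical, and the fixed step of 15 is an assumption inferred from the three samples, not something the site documents.

```ruby
require 'uri'

# Path shared by every sample URL above.
BASE_URL = "http://www.flipkart.com/mens-footwear/shoes/casual-shoes/pr"

# Assumption: page size inferred from start=31, 46, 61 in the samples.
PAGE_SIZE = 15

# Build the URL for one page. encode_www_form percent-encodes the keys and
# values, so "p[]" becomes "p%5B%5D" and "," becomes "%2C".
def page_url(start)
  query = URI.encode_www_form([
    ["p[]", "sort=popularity"],
    ["sID", "osp,cil,nit,e1f"],
    ["start", start],
    ["AJAX", "true"]
  ])
  "#{BASE_URL}?#{query}"
end

# Build `count` consecutive page URLs beginning at `first_start`.
def page_urls(first_start, count)
  Array.new(count) { |i| page_url(first_start + i * PAGE_SIZE) }
end

puts page_urls(31, 3)
```

Building the query from unescaped pairs keeps the parameters readable and leaves the percent-encoding to `URI.encode_www_form`.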
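require 'nokogiri'
require 'open-uri'

# Walk the paginated endpoint until a page comes back with no products.
# The AJAX parameter and the #AJAX / .Title selectors keep the casing shown
# above; adjust them if the live markup differs.
number = 1
loop do
  url = "http://www.flipkart.com/mens-footwear/shoes" \
        "/casual-shoes/pr?p%5B%5D=sort%3Dpopularity&" \
        "sID=osp%2Ccil%2Cnit%2Ce1f&start=#{number}&AJAX=true"

  # Parse the response, then parse the text of its #AJAX element again,
  # since the product markup is carried inside that element as text.
  doc = Nokogiri::HTML(URI.open(url))
  doc = Nokogiri::HTML(doc.at_css('#AJAX').text)

  products = doc.css(".browse-product")
  break if products.empty?

  products.each do |item|
    # at_css returns nil when nothing matches, so guard before calling text.
    title = item.at_css(".fk-display-block,.Title")&.text.to_s.strip
    price = item.at_css(".pu-final")&.text.to_s.strip
    link  = item.at_xpath(".//a[@class='fk-display-block']/@href")
    image = item.at_xpath(".//div/a/img/@src")

    puts number
    puts "#{title} - #{price}"
    puts "http://www.flipkart.com#{link}"
    puts image
    puts "========================"

    # Count every product so the next request starts after the last one seen.
    number += 1
  end
end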