All the other answers here refer to Scrapy v0.x. According to the updated documentation, Scrapy 1.0 requires:
    import scrapy
    from scrapy.crawler import CrawlerProcess

    class MySpider(scrapy.Spider):
        # Your spider definition
        ...

    process = CrawlerProcess({
        'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
    })

    process.crawl(MySpider)
    process.start()  # the script will block here until the crawling is finished