Here is what I am currently doing:
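import java.net.URL;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;

import com.gargoylesoftware.htmlunit.WebRequestSettings;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

// Accept header values, keyed by tag name.
public static final HashMap<String, String> acceptTypes = new HashMap<String, String>() {{
    put("html", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.5");
    put("script", "*/*;q=0.1");
}};

protected void downloadCssAndImages(HtmlPage page) {
    // Matches every <img>, plus every <link> whose type is text/css.
    String xPathExpression = "//*[name() = 'img' or name() = 'link' and @type = 'text/css']";
    List<?> resultList = page.getByXPath(xPathExpression);

    Iterator<?> i = resultList.iterator();
    while (i.hasNext()) {
        try {
            HtmlElement el = (HtmlElement) i.next();

            // <img> carries its URL in src, <link> in href.
            String path = el.getAttribute("src").equals("") ? el.getAttribute("href") : el.getAttribute("src");
            if (path == null || path.equals("")) continue;

            URL url = page.getFullyQualifiedUrl(path);

            WebRequestSettings wrs = new WebRequestSettings(url);
            wrs.setAdditionalHeader("Referer", page.getWebResponse().getRequestSettings().getUrl().toString());

            // Only set Accept when the map has an entry for this tag.
            String accept = acceptTypes.get(el.getTagName().toLowerCase());
            if (accept != null) {
                client.addRequestHeader("Accept", accept);
            }
            client.getPage(wrs);
        } catch (Exception e) {
            // Ignored on purpose: one failed resource should not stop the rest.
        }
    }
    client.removeRequestHeader("Accept");
}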
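One thing to watch in the code above: `acceptTypes.get(el.getTagName().toLowerCase())` returns `null` for any tag that has no entry in the map. A minimal sketch of a lookup with a catch-all fallback might look like the following. The class name `AcceptHeaders`, the method `forTag`, and the `img` / `link` header values are all hypothetical and chosen only for illustration; they are not taken from the code above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: picks an Accept header for a tag name, with a catch-all default.
final class AcceptHeaders {
    // Illustrative values only (assumptions, not from the original code).
    private static final Map<String, String> BY_TAG = Map.of(
            "img", "image/*,*/*;q=0.5",
            "link", "text/css,*/*;q=0.1");

    // Used when the tag has no dedicated entry.
    private static final String DEFAULT = "*/*";

    static String forTag(String tagName) {
        if (tagName == null) {
            return DEFAULT;
        }
        // Normalize case so "IMG" and "img" resolve to the same entry.
        return BY_TAG.getOrDefault(tagName.toLowerCase(Locale.ROOT), DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(forTag("IMG"));    // prints image/*,*/*;q=0.5
        System.out.println(forTag("script")); // prints */*
    }
}
```

With a helper along these lines, the value passed as the `Accept` header is never `null`, whichever element the XPath query returns.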