http://www.bluenile.com/api/public/loose-diamond/diamond-details/panel?country=USA&currency=USD&language=en-us&productSet=BN&sku=LD04077082

Here is my (failing) request in R:

    test2 <- fromJSON(getURL(
      "http://www.bluenile.com/api/public/loose-diamond/diamond-details/panel?country=USA&currency=USD&language=en-us&productSet=BN&sku=LD04077082",
      ssl.verifypeer = FALSE,
      useragent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36"
    ))

My research so far
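First, I looked at this earlier question on Stack Overflow and added a user agent to my request (this did not solve the problem, but it may still be necessary):
ViralHeat API issues with getURL() command in RCurl package

Next, I looked at this helpful post, which guided my reasoning:
R Disparity between browser and GET / getURL

My thoughts on a solution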
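This is not my area of expertise, but my guess is that the request is missing a cookie that the server needs in order to complete it (which would explain why the URL also fails in my browser in incognito mode). I compared the request and response of a successful request with those of an unsuccessful one:

Successful request:

Unsuccessful request:

Does anyone have an idea? Should I try the RSelenium package that MrFlick suggested in the second post I linked?

Solution

This is a polite website. It wants to know where you come from, which currency you use, and so on, so that it can give you a better user experience. It does this by setting a number of cookies on the landing page. So we play along: navigate to the landing page first to collect the cookies, then go to the page we actually want: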
    library(RCurl)
    library(RJSONIO)

    myURL <- "http://www.bluenile.com/api/public/loose-diamond/diamond-details/panel?country=USA&currency=USD&language=en-us&productSet=BN&sku=LD04077082"
    agent <- "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0"

    # Set RCurl parameters: one shared handle with a cookie jar, so cookies
    # set by the first request are sent along with the second one
    curl <- getCurlHandle()
    curlSetOpt(cookiejar = "cookies.txt", useragent = agent, followlocation = TRUE, curl = curl)

    # Visit the landing page first to collect the cookies
    firstPage <- getURL("http://www.bluenile.com", curl = curl)

    # Then request the API page with the same handle
    myPage <- getURL(myURL, curl = curl)

    > names(fromJSON(myPage))
    [1] "diamondDetailsHeader" "diamondDetailsBodies" "pageMetadata"         "expandedUrl"
    [5] "newVersion"           "multiDiamond"

And the cookies:
    > getCurlInfo(curl)$cookieList
     [1] ".bluenile.com\tTRUE\t/\tFALSE\t2412270275\tGUID\tDA5C11F5_E468_46B5_B4E8_D551D4D6EA4D"
     [2] ".bluenile.com\tTRUE\t/\tFALSE\t1475342275\tsplit\tver~3&presetFilters~TEST"
     [3] ".bluenile.com\tTRUE\t/\tFALSE\t1727630275\tsitetrack\tver~2&Jse~0"
     [4] ".bluenile.com\tTRUE\t/\tFALSE\t1425230275\tpop\tver~2&china~false&french~false&IE~false&internationalSelect~false&iphoneApp~false&survey~false&uae~false"
     [5] ".bluenile.com\tTRUE\t/\tFALSE\t1475342275\tdsearch\tver~6&newUser~true"
     [6] ".bluenile.com\tTRUE\t/\tFALSE\t1443806275\tlocale\tver~1&country~IRL&currency~EUR&language~en-gb&productSet~BNUK"
     [7] ".bluenile.com\tTRUE\t/\tFALSE\t0\tbnses\tver~1&ace~false&isbml~false&fbcs~false&ss~0&mbpop~false&sswpu~false&deo~false"
     [8] ".bluenile.com\tTRUE\t/\tFALSE\t1727630275\tbnper\tver~5&NIB~0&DM~-&GUID~DA5C11F5_E468_46B5_B4E8_D551D4D6EA4D&SESS-CT~1&STC~32RPVK&FB_MINI~false&SUB~false"
     [9] "#HttpOnly_www.bluenile.com\tFALSE\t/\tFALSE\t0\tJSESSIONID\tB8475C3AEC08205E5AC6252C94E4B858"
    [10] ".bluenile.com\tTRUE\t/\tFALSE\t1727630278\tmigrationstatus\tver~1&redirected~false"
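Each entry in the cookie list above appears to follow the Netscape cookie-file layout that libcurl uses: seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry as Unix time, name, value), with an `#HttpOnly_` prefix on the domain for HTTP-only cookies. As a minimal sketch under that assumption, the lines can be split into a data frame to check what the landing page actually set, for example the `locale` cookie. `parse_cookie_lines` is a hypothetical helper written for illustration, not part of RCurl, and the two input lines are copied from the output above.

```r
# Hypothetical helper: split Netscape-format cookie lines into a data frame.
parse_cookie_lines <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # strsplit() drops a trailing empty field, so pad cookies with an empty value
  fields <- lapply(fields, function(f) if (length(f) == 6) c(f, "") else f)
  # Ignore anything that is not a well-formed 7-field line
  fields <- fields[lengths(fields) == 7]
  if (length(fields) == 0) {
    return(data.frame())
  }
  mat <- do.call(rbind, fields)
  data.frame(
    domain    = sub("^#HttpOnly_", "", mat[, 1], ignore.case = TRUE),
    http_only = grepl("^#HttpOnly_", mat[, 1], ignore.case = TRUE),
    path      = mat[, 3],
    secure    = mat[, 4] == "TRUE",
    expires   = as.numeric(mat[, 5]),  # 0 means a session cookie
    name      = mat[, 6],
    value     = mat[, 7],
    stringsAsFactors = FALSE
  )
}

# Two of the lines from the output above
example_lines <- c(
  ".bluenile.com\tTRUE\t/\tFALSE\t1443806275\tlocale\tver~1&country~IRL&currency~EUR&language~en-gb&productSet~BNUK",
  "#HttpOnly_www.bluenile.com\tFALSE\t/\tFALSE\t0\tJSESSIONID\tB8475C3AEC08205E5AC6252C94E4B858"
)

cookies <- parse_cookie_lines(example_lines)
print(cookies[, c("domain", "name", "http_only", "expires")])
```

With a live handle, the same helper could be applied to `getCurlInfo(curl)$cookieList` directly.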