python – Using R SOAP (SSOAP) to retrieve data / scrape

On the B-cycle page (www.bcycle.com/whowantsitmore.aspx), I am trying to scrape the locations and values of the votes.

The URL http://mapservices.bcycle.com/bcycleservice.asmx is a SOAP service.

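Based on the documentation, I believe I am calling it correctly, but I get an error when the input parameters are parsed. Even calling a function that takes no parameters produces an error.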

    # working with SOAP
    #install.packages("SSOAP", repos = "http://www.omegahat.org/R", dependencies = TRUE, type = "source")
    library(SSOAP)

    # Process the Web Service Definition Language (WSDL) file
    bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")

    # Generate functions based on definitions to access the different data sets
    bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose = TRUE)

    # Get the data by requesting the number of cities, username and password (yes it is public)
    bcycle.interface@functions$getCities("10", "bcycle", "c@rbont0ns")

    # receive error: Error in as(parameters, "limit.username.pw") :
    # no method or default for coercing "character" to "limit.username.pw"

This is caused by the following code inside the generated function:

    function(parameters = list(...), ... etc) {
        ...
        as(parameters, "limit.username.pw")
        ...
    }
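One possible reading of this error, sketched as a Python analogy rather than SSOAP's actual code: R matches the first unnamed argument to `parameters`, so a positional call hands the coercion step a single character value instead of a list of named fields. The function name `get_cities` and the dict check are hypothetical stand-ins for the generated R function and its `as()` call.

```python
def get_cities(parameters=None, *rest, **named):
    """Python stand-in for R's function(parameters = list(...), ...)."""
    if parameters is None:
        # Nothing was passed positionally: collect the named arguments,
        # roughly what the R default `parameters = list(...)` does.
        parameters = dict(named)
    if not isinstance(parameters, dict):
        # Stand-in for as(parameters, "limit.username.pw") failing
        # when `parameters` is a bare character value.
        raise TypeError(
            "cannot coerce %s to limit.username.pw" % type(parameters).__name__
        )
    return parameters


# Named arguments are gathered into one structure and pass the check.
print(get_cities(limit="10", username="bCycle", pw="c@rbont0ns"))

# A positional call binds "10" to `parameters` and fails the check.
try:
    get_cities("10", "bcycle", "c@rbont0ns")
except TypeError as err:
    print("error:", err)
```

If this reading is right, naming every argument in the R call would avoid binding the first value to `parameters`.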

So I tried using the .SOAP function directly:

    # Using RCurl library
    library(RCurl)

    # set up curl options
    curl.opts <- curlOptions(
        verbose = TRUE,
        header = TRUE,
        cookie = "ASP.NET_SessionId=dv25ws551nquoezqwq3iu545;__utma=27517231.373920809.1357910914.1357910914.1357912862.2;__utmc=27517231;__utmz=27517231.1357910914.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);__utmb=27517231.13.10.1357912862",
        httpheader = c('Content-Type' = 'text/xml; charset=utf-8',
                       Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
        followLocation = TRUE,
        useragent = "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0")

    # define header and submit request
    bcycle.server <- SOAPServer("http://mapservices.bcycle.com/bcycleservice.asmx")
    .SOAP(bcycle.server,
          "getCities",
          limit = 250,
          username = "bCycle",
          pw = "c@rbont0ns",
          action = "http://bcycle.com/getCities",
          xmlns = "http://bcycle.com/",
          .opts = curl.opts,
          .literal = TRUE,
          nameSpaces = "1.2",
          elementFormQualified = TRUE,
          .returnNodeName = 'getCitiesResponse',
          .soapHeader = NULL)

I managed to connect to their server, but received this error:

    System.Web.Services.Protocols.SoapException: Server did not recognize the value of HTTP Header SOAPAction: http://bcycle.com/getCities#getCities
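The error message shows that the server received a SOAPAction value with "#getCities" appended to the action. As a way to see what a request without that suffix would look like, here is a minimal Python sketch that only builds the headers and envelope and sends nothing. It assumes a SOAP 1.1 envelope and that the expected action is the namespace plus the operation name; the WSDL is the place to confirm both. `build_request` is a hypothetical helper, and the namespace and parameter names are taken from the .SOAP call above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://bcycle.com/"  # xmlns value used in the .SOAP call


def build_request(operation, **params):
    """Return (headers, body) for a SOAP 1.1 call; nothing is sent."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (SERVICE_NS, operation))
    # Each parameter becomes a child element in the service namespace.
    for name, value in params.items():
        ET.SubElement(call, "{%s}%s" % (SERVICE_NS, name)).text = str(value)
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # Assumption: action = namespace + operation, with no "#operation" suffix.
        "SOAPAction": '"%s%s"' % (SERVICE_NS, operation),
    }
    return headers, ET.tostring(envelope, encoding="unicode")


headers, body = build_request(
    "getCities", limit=250, username="bCycle", pw="c@rbont0ns"
)
print(headers["SOAPAction"])
print(body)
```

Comparing this header with the one in the verbose curl output may help show where the extra "#getCities" comes from.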

These are the options I have tried so far, without success.

Using Python, I was able to issue the getCities request, but received nothing in response.

    from suds.client import Client

    client = Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
    print(client)  # prints WSDL info
    print(client.service.getCities(10, 'bcycle', 'c@rbont0ns'))  # prints nothing

I would really like to keep this focused on R, but using Python might make it easier to see what the problem is.

Any ideas?

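Solution

Try correcting the username (it is case-sensitive: "bCycle", not "bcycle") and naming the arguments explicitly: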

    library(SSOAP)
    bcycle.asmx <- processWSDL("http://mapservices.bcycle.com/bcycleservice.asmx?WSDL")
    bcycle.interface <- genSOAPClientInterface(bcycle.asmx@operations[[1]], def = bcycle.asmx, bcycle.asmx@name, verbose = TRUE)

    # Name every argument so none of them is matched to `parameters` by position
    out <- bcycle.interface@functions$getCities(
                         limit = "10", username = "bCycle", pw = "c@rbont0ns")

    # Slots available on the first returned city:
    #> out[[1]]@
    #out[[1]]@zip               out[[1]]@state_name
    #out[[1]]@pop               out[[1]]@latitude
    #out[[1]]@ambassador_count  out[[1]]@longitude
    #out[[1]]@city_name

    out[[1]]@city_name
    #[1] "toledo"

The Python call also works with the corrected username:

    from suds.client import Client

    client = Client('http://mapservices.bcycle.com/bcycleservice.asmx?WSDL')
    print(client.service.getCities(10, 'bCycle', 'c@rbont0ns'))

    (ArrayOfCities){
       Cities[] =
          (Cities){
             zip = "43606"
             pop = 337362
             ambassador_count = 455261
             city_name = "toledo"
             state_name = "oh"
             latitude = 41.6743
             longitude = -83.6029
          },
    ............
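Once the call returns records like the one above, the locations and values can be flattened into plain rows. The sketch below is one way to do that, assuming the suds result exposes the records as a `Cities` attribute with the field names shown in the printed output. `cities_to_rows` is a hypothetical helper, and the `sample` object is a stand-in built from the single record above so that the example runs without contacting the service.

```python
from types import SimpleNamespace

# Field names as they appear in the printed response.
FIELDS = ("city_name", "state_name", "zip", "latitude", "longitude",
          "pop", "ambassador_count")


def cities_to_rows(result):
    """Flatten an ArrayOfCities-like object into a list of plain dicts."""
    rows = []
    # Assumption: the records live in a `Cities` attribute on the result.
    for city in getattr(result, "Cities", []):
        rows.append({name: getattr(city, name, None) for name in FIELDS})
    return rows


# Stand-in for the suds result, filled with the one record shown above.
sample = SimpleNamespace(Cities=[SimpleNamespace(
    zip="43606", pop=337362, ambassador_count=455261, city_name="toledo",
    state_name="oh", latitude=41.6743, longitude=-83.6029)])

for row in cities_to_rows(sample):
    print(row["city_name"], row["latitude"], row["longitude"],
          row["ambassador_count"])
```

With a live client, the same function would be called on the value returned by `client.service.getCities(...)` in place of `sample`.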

Original article: http://outofmemory.cn/langs/1196765.html
