python - Scrapy: ERROR: Spider error processing


I am new to Python and Scrapy. I tried to run some existing code and got this error on every URL:

2015-07-02 01:52:19 [scrapy] DEBUG: Crawled (200) <GET http://www.tripadvisor.com/showuserreviews-g187147-d197524-r281927613-hotel_mirific_opera-paris_ile_de_france.html> (referer: http://www.tripadvisor.com/hotel_review-g187147-d197524-reviews-hotel_mirific_opera-paris_ile_de_france.html)
2015-07-02 01:52:19 [scrapy] ERROR: Spider error processing <GET http://www.tripadvisor.com/showuserreviews-g187147-d197524-r281927613-hotel_mirific_opera-paris_ile_de_france.html> (referer: http://www.tripadvisor.com/hotel_review-g187147-d197524-reviews-hotel_mirific_opera-paris_ile_de_france.html)

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermiddlewares/offsite.py", line 28, in process_spider_output
    for x in result:
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermiddlewares/referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spidermiddlewares/depth.py", line 54, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spiders/crawl.py", line 67, in _parse_response
    cb_res = callback(response, **cb_kwargs) or ()
  File "/home/talmosko/documents/scrapy/tripadvisor/spiders/tripadvisor.py", line 30, in parse_item
    item['state'] = hxs.xpath('//*[@id="page"]/div[2]/div[1]/ul/li[2]/a/span/text()').extract()[0].encode('ascii', errors='ignore')
IndexError: list index out of range

This is the code: http://pastebin.com/xzm5drdd

What is the problem? It seems the spider didn't get an answer.

Thanks!

You are trying to access an element that doesn't exist. The error is in this line:

item['state'] =  hxs.xpath('//*[@id="page"]/div[2]/div[1]/ul/li[2]/a/span/text()').extract()[0].encode('ascii', errors='ignore') 

Probably

hxs.xpath('//*[@id="page"]/div[2]/div[1]/ul/li[2]/a/span/text()').extract()

returns an empty list, and you are then trying to access its first element with [0]. You have 2 options:
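
One way to guard against the empty list can be sketched as follows. This is only an illustration, not code from the original spider: the helper name first_or_default and the sample values are hypothetical, and plain lists stand in for what extract() returns so the snippet runs without Scrapy.

```python
def first_or_default(values, default=u''):
    """Return the first item of a list (such as the one extract() gives back),
    or `default` when the list is empty. Hypothetical helper for illustration."""
    return values[0] if values else default


# Stand-ins for hxs.xpath(...).extract(): a non-empty list when the XPath
# matches something on the page, an empty list when it matches nothing.
matched = [u'\xcele-de-France']
unmatched = []

# Same encode() call as in the spider; it no longer depends on [0] succeeding.
state = first_or_default(matched).encode('ascii', errors='ignore')
missing = first_or_default(unmatched).encode('ascii', errors='ignore')
```

In the spider this would wrap the existing call, for example item['state'] = first_or_default(hxs.xpath(...).extract()).encode('ascii', errors='ignore'). Wrapping the original line in try/except IndexError is an alternative with the same effect, and recent Scrapy versions also provide extract_first() on selector results, assuming your installed version has it.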

