Cannot deploy spiders when importing `urlparse`
Environment:
- macOS Sierra 10.12.3 (16D32)
- Python 3.6 [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin, installed via brew
- Scrapy 1.3.2
- shub 2.5.1
Steps:
mkdir shubissue
cd shubissue
python3 -m venv .pyenv
source .pyenv/bin/activate
pip install scrapy shub
scrapy startproject myscrapy
cd myscrapy
scrapy genspider example example.com
shub deploy
# provide project ID
# set as default
Message:
{"status": "ok", "spiders": 1, "project": XXXXXX, "version": "1.0"}
Change the contents of myscrapy/myscrapy/spiders/example.py to:
import scrapy
from urllib.parse import urlparse


class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ['http://example.com/']

    def parse(self, response):
        pass
Rerun:
shub deploy
Message:
{"status": "ok", "spiders": 0, "project": XXXXXX, "version": "1.0"}
Any new spiders you create are ignored as well.
I will try to reproduce this in a Linux environment.
Issue Analytics
- State:
- Created 7 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
Hey @rtodea, thanks for the issue!
For compatibility reasons Scrapy Cloud uses Python 2 by default, where the urllib module still had a different structure, so your import fails with an ImportError. You can switch to Python 3 by specifying a corresponding stack in your scrapinghub.yml, e.g. like this:
What's curious is that your deploy didn't fail with a build error; apparently the build went through just fine but dropped the spiders. This is what following your steps produced on my machine:
Thanks for the heads up @rubhanazeem
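For anyone who has to keep a spider importable under both interpreters, the module difference described above can be worked around with a fallback import. This is only a minimal sketch, not something taken from the thread: it assumes the only Python 3 specific line in the spider is the urlparse import, and the sample URL is made up for illustration.

```python
# Fallback import: urlparse lives in urllib.parse on Python 3
# and in the top-level urlparse module on Python 2.
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

# Hypothetical URL, used only to show that the import works.
parts = urlparse("http://example.com/path?q=1")
print(parts.netloc)
print(parts.path)
```

With this in place the spider module should import under the default Python 2 stack as well, although switching the project to a Python 3 stack, as suggested above, is the cleaner fix if the rest of the code is Python 3 only.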