OSError when downloading a very long URL
When you run into some horrible image URL, like this:
https://o.aolcdn.com/images/dims?resize=2000%2C2000%2Cshrink&image_uri=https%3A%2F%2Fo.aolcdn.com%2Fimages%2Fdimse%2F5845cadfecd996e0372f%2Fccc34660c41122e3170c0d586c151a29397c0fcf%2FY3JvcD0xOTIwJTJDMTA5NyUyQzAlMkMwJnF1YWxpdHk9ODUmZm9ybWF0PWpwZyZyZXNpemU9MTYwMCUyQzkxNCZpbWFnZV91cmk9aHR0cHMlM0ElMkYlMkZzLnlpbWcuY29tJTJGb3MlMkZjcmVhdHItdXBsb2FkZWQtaW1hZ2VzJTJGMjAxOS0wOCUyRjg2YjNlYjkwLWI5YjgtMTFlOS05ZWFlLTQ5YWU2NTcxMjM0MyZjbGllbnQ9YTFhY2FjM2UxYjMyOTA5MTdkOTImc2lnbmF0dXJlPTZmZWJkYjQwN2E0NzU0YzM0YTJjY2ViMDczNDc1YTE1ZjBiODA3OGQ%3D&client=a1acac3e1b3290917d92&signature=bf3461468aef0cb3ecaea00d2ed611e04a88bc70
Then…
Traceback (most recent call last):
  File "c:\program files\python37\lib\site-packages\scrapy\pipelines\files.py", line 419, in media_downloaded
    checksum = self.file_downloaded(response, request, info)
  File "c:\program files\python37\lib\site-packages\scrapy\pipelines\files.py", line 452, in file_downloaded
    self.store.persist_file(path, buf, info)
  File "c:\program files\python37\lib\site-packages\scrapy\pipelines\files.py", line 53, in persist_file
    with open(absolute_path, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'E:\\2019-08-12\\resources\\885443110bae0e1149e017dbea5ca3935efa38c0.com%2Fimages%2Fdimse%2F5845cadfecd996e0372f%2F108a4af73772ae197fa2c4ec4e9fe7a47390433c%2FY3JvcD0xMTc0JTJDNTgwJTJDMCUyQzAmcXVhbGl0eT04NSZmb3JtYXQ9anBnJnJlc2l6ZT0xNjAwJTJDNzkxJmltYWdlX3VyaT1odHRwcyUzQSUyRiUyRnMueWltZy5jb20lMkZvcyUyRmNyZWF0ci11cGxvYWRlZC1pbWFnZXMlMkYyMDE5LTA4JTJGMWJmZGQxNDAtYjliYy0xMWU5LWJmZjMtMjMyNzcwMTg1MzE5JmNsaWVudD1hMWFjYWMzZTFiMzI5MDkxN2Q5MiZzaWduYXR1cmU9OTFiNzQ3Y2MyZTY5ODY3OGIxNWI0OTkyMjdjM2NmZWRlYTE1NGIxOA%3D%3D&client=a1acac3e1b3290917d92&signature=6517aece82e79d536edeaccc275ad88090df0252'
So I think that when downloading a file, you should use a random name instead of extracting it from the URL. For some particularly weird URLs, the extracted name causes an OSError when writing to the file.
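A rough sketch of the kind of naming I mean, using only the standard library. The file name in the traceback looks like a 40-character hash followed by everything after the last dot in the URL, so my guess is that the extension is being taken from the whole URL, query string included. The function name `safe_media_name` and the `MAX_EXT_LEN` cap are made up for illustration; this is not Scrapy's code, and it uses a hash of the URL rather than a truly random name so the same URL always maps to the same file.

```python
import hashlib
import os
from urllib.parse import urlparse

# Hypothetical cap on extension length; anything longer is treated as junk.
MAX_EXT_LEN = 10


def safe_media_name(url: str) -> str:
    """Return a short, filesystem-safe file name derived from ``url``."""
    # The SHA-1 hex digest is always 40 characters, whatever the URL length.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()

    # Look for an extension in the path component only, so dots inside
    # the query string are never mistaken for an extension.
    path = urlparse(url).path
    ext = os.path.splitext(path)[1]

    # Keep the extension only if it is a short, plain ASCII alphanumeric
    # suffix such as ".jpg"; otherwise drop it.
    suffix = ext[1:]
    if not (suffix and len(ext) <= MAX_EXT_LEN
            and suffix.isascii() and suffix.isalnum()):
        ext = ""

    return digest + ext
```

In a Scrapy project this could presumably be called from an overridden `file_path` method in a `FilesPipeline` subclass, passing `request.url`, but I have not tested that here.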
Issue Analytics
- State:
- Created 4 years ago
- Comments: 9 (5 by maintainers)
Top GitHub Comments
@Gallaecio Working on that
@ritik-malik There’s no need to ask for permission to fix an issue, you can usually just start a pull request including a reference to the ticket and that’s it.
However, in this case there is already a pull request open, #3954, which seems promising. Maybe you could find a different issue?