problem with 'mailto' links
Mandatory
- I read the documentation (readme and wiki).
- I searched other issues (including closed issues) and could not find any to be related. If you find related issues post them below or directly add your issue to the most related one.
Related issues:
- I did not find any related issues in news-please and scrapy
Describe your question
I am trying to crawl the site at “http://www.ladigetto.it” and I get the following error from scrapy:
File "/usr/local/lib/python3.6/dist-packages/scrapy/http/request/__init__.py", line 25, in __init__
self._set_url(url)
File "/usr/local/lib/python3.6/dist-packages/scrapy/http/request/__init__.py", line 69, in _set_url
raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: mailto:redazione@ladigetto.it
By looking at the content of the initial page of the url, I actually see the href generating the error.
<a href="mailto:redazione@ladigetto.it">redazione@ladigetto.it</a>
Is it possible to force news-please not to follow this kind of link? Can this be done through the configuration file?
Versions (please complete the following information):
- OS: Ubuntu 16.04.6 LTS
- Python Version: 3.6.8
- news-please Version: 1.5.3
Intent (optional; we’ll use this info to prioritize upcoming tasks to work on)
- personal
- academic
- business
- other
- Some information on your project:
Issue Analytics
- State:
- Created 3 years ago
- Comments:10 (4 by maintainers)
Top GitHub Comments
For me, the problem was solved by defining an ignore_regex in config.cfg:
ignore_regex = "(mail[tT]o)|([jJ]avascript)|(tel)|(fax)"
added it 😃
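To see what the pattern from the comment above would filter, here is a minimal Python sketch. It does not reproduce news-please's internals: the helper names `filter_links_loose` and `filter_links_anchored`, the sample links other than the `mailto:` one from the issue, and the use of `re.search` are all assumptions made for illustration. The anchored variant is a hypothetical alternative, not something proposed in the thread.

```python
import re

# Pattern copied from the comment above (without the surrounding quotes).
LOOSE = re.compile(r"(mail[tT]o)|([jJ]avascript)|(tel)|(fax)")

# Hypothetical stricter variant: only matches when the URL *starts* with
# one of these schemes followed by a colon.
ANCHORED = re.compile(r"^(mailto|javascript|tel|fax):", re.IGNORECASE)


def filter_links_loose(hrefs):
    """Drop any href that contains a match anywhere (assumes re.search)."""
    return [h for h in hrefs if LOOSE.search(h) is None]


def filter_links_anchored(hrefs):
    """Drop only hrefs whose scheme is mailto, javascript, tel or fax."""
    return [h for h in hrefs if ANCHORED.match(h) is None]


# Sample links; only the mailto one comes from the issue, the rest are made up.
links = [
    "mailto:redazione@ladigetto.it",
    "http://www.ladigetto.it/",
    "javascript:void(0)",
    "tel:+390000000",
    "http://www.ladigetto.it/hotel-news",
]

kept_loose = filter_links_loose(links)
kept_anchored = filter_links_anchored(links)
print(kept_loose)
print(kept_anchored)
```

If the pattern is applied anywhere in the URL, as assumed here, the unanchored `(tel)` alternative also drops the made-up `hotel-news` page because "hotel" contains "tel". Anchoring on the scheme avoids that, at the cost of being more specific about what is ignored.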