Add note that "allowed_domains" should be a list of domains, not URLs


(just logging the issue before I forget)

It may seem obvious from the name of the attribute that allowed_domains is about domain names, but it's not uncommon for Scrapy users to mistakenly write allowed_domains = ['http://www.example.com']

I believe it is worth adding a note in http://doc.scrapy.org/en/latest/topics/spiders.html?#scrapy.spiders.Spider.allowed_domains
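To illustrate the mistake described above, here is a minimal sketch that contrasts a URL entry with a bare domain entry. The helper name looks_like_url is hypothetical and is not part of Scrapy; it only assumes that anything carrying a scheme separator or a path is a URL rather than a domain.

```python
def looks_like_url(entry: str) -> bool:
    """Return True if the entry has a scheme separator or a path.

    Hypothetical helper for illustration only; not a Scrapy API.
    """
    return "://" in entry or "/" in entry


# The common mistake: a full URL where a domain name is expected.
wrong_allowed_domains = ["http://www.example.com"]

# What the attribute expects: bare domain names.
right_allowed_domains = ["example.com"]

# Collect the entries that would need fixing.
suspicious = [d for d in wrong_allowed_domains if looks_like_url(d)]
```

A check like this is deliberately crude: it flags suspicious entries and makes no attempt to decide which domain the user meant.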

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 2
  • Comments: 11 (8 by maintainers)

Top GitHub Comments

1 reaction
redapple commented, Sep 15, 2016

+1 to issue a warning.

I’m less sure about inferring domain, for example for http://www.example.com, should it infer example.com or www.example.com?

1 reaction
eliasdorneles commented, Sep 15, 2016

What if, instead of simply documenting this, Scrapy detected this case and issued a warning?

Even better, it could extract the domain from the URL and use that, while issuing a warning like:

logging.warning("allowed_domains accepts only domains, not URLs. Using allowed_domains = %r", effective_allowed_domains)
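
The suggestion to extract the domain and warn could be sketched roughly as follows. This is only an illustration of the idea under discussion, not Scrapy's actual behaviour; the function name normalize_allowed_domains is an assumption. It keeps the full hostname (www.example.com) rather than guessing a parent domain, which sidesteps the ambiguity raised in the earlier comment without resolving it.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


def normalize_allowed_domains(allowed_domains):
    """Replace URL entries with their hostname and warn about each one.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    effective = []
    for entry in allowed_domains:
        # Only entries with a scheme separator are treated as URLs.
        host = urlparse(entry).hostname if "://" in entry else None
        if host:
            logger.warning(
                "allowed_domains accepts only domains, not URLs. "
                "Using %r instead of %r",
                host,
                entry,
            )
            effective.append(host)
        else:
            # Bare domains pass through unchanged.
            effective.append(entry)
    return effective


effective_allowed_domains = normalize_allowed_domains(
    ["http://www.example.com", "example.org"]
)
```

Whether to infer example.com or www.example.com remains a design decision; the sketch simply uses whatever hostname the URL contains.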