find_api_page HTTPError 403

For me, find_api_page doesn’t work:

(base) hfm-1804a:scipy deil$ python
Python 3.7.3 (default, Mar 27 2019, 16:54:48) 
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from astropy import find_api_page
>>> from astropy.units import Quantity
>>> find_api_page(Quantity)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/deil/software/anaconda3/lib/python3.7/site-packages/astropy/utils/misc.py", line 228, in find_api_page
    uf = urllib.request.urlopen(baseurl + 'objects.inv')
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/Users/deil/software/anaconda3/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Following https://stackoverflow.com/questions/3336549, I printed the body of the error response, which suggests that the Cloudflare CDN is blocking these requests:

>>> try:find_api_page(Quantity)
... except Exception as e: print(e.fp.read())
... 
b'<!DOCTYPE html>\n<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->\n<!--[if IE 7]>    <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->\n<!--[if IE 8]>    <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->\n<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->\n<head>\n<title>Access denied | docs.astropy.org used Cloudflare to restrict access</title>\n<meta charset="UTF-8" />\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />\n<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />\n<meta name="robots" content="noindex, nofollow" />\n<meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1" />\n<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" type="text/css" media="screen,projection" />\n<!--[if lt IE 9]><link rel="stylesheet" id=\'cf_styles-ie-css\' href="/cdn-cgi/styles/cf.errors.ie.css" type="text/css" media="screen,projection" /><![endif]-->\n<style type="text/css">body{margin:0;padding:0}</style>\n\n\n<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/zepto.min.js"></script><!--<![endif]-->\n<!--[if gte IE 10]><!--><script type="text/javascript" src="/cdn-cgi/scripts/cf.common.js"></script><!--<![endif]-->\n\n\n\n</head>\n<body>\n  <div id="cf-wrapper">\n    <div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>\n    <div id="cf-error-details" class="cf-error-details-wrapper">\n      <div class="cf-wrapper cf-header cf-error-overview">\n        <h1>\n          <span class="cf-error-type" data-translate="error">Error</span>\n          <span class="cf-error-code">1010</span>\n          <small class="heading-ray-id">Ray ID: 4f5c36966fc6cc4a &bull; 2019-07-13 15:15:36 UTC</small>\n        </h1>\n        <h2 class="cf-subheadline">Access denied</h2>\n      </div><!-- /.header -->\n\n      <section></section><!-- spacer 
-->\n\n      <div class="cf-section cf-wrapper">\n        <div class="cf-columns two">\n          <div class="cf-column">\n            <h2 data-translate="what_happened">What happened?</h2>\n            <p>The owner of this website (docs.astropy.org) has banned your access based on your browser\'s signature (4f5c36966fc6cc4a-ua48).</p>\n          </div>\n\n          \n        </div>\n      </div><!-- /.section -->\n\n      <div class="cf-error-footer cf-wrapper">\n  <p>\n    <span class="cf-footer-item">Cloudflare Ray ID: <strong>4f5c36966fc6cc4a</strong></span>\n    <span class="cf-footer-separator">&bull;</span>\n    <span class="cf-footer-item"><span>Your IP</span>: 147.86.175.50</span>\n    <span class="cf-footer-separator">&bull;</span>\n    <span class="cf-footer-item"><span>Performance &amp; security by</span> <a href="https://www.cloudflare.com/5xx-error-landing?utm_source=error_footer" id="brand_link" target="_blank">Cloudflare</a></span>\n    \n  </p>\n</div><!-- /.error-footer -->\n\n\n    </div><!-- /#cf-error-details -->\n  </div><!-- /#cf-wrapper -->\n\n  <script type="text/javascript">\n  window._cf_translation = {};\n  \n  \n</script>\n\n</body>\n</html>\n'
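
The same inspection can be done a bit more narrowly by catching `urllib.error.HTTPError` and decoding its body. This is only a sketch: the helper name `error_body` is hypothetical, and the error object is built by hand so the snippet runs without network access.

```python
import io
import urllib.error


def error_body(err, limit=300):
    """Return the first `limit` characters of an HTTPError's response body."""
    return err.read().decode("utf-8", errors="replace")[:limit]


# Offline demo: a hand-built 403 standing in for the real Cloudflare response.
fake = urllib.error.HTTPError(
    "http://example.invalid/objects.inv",  # placeholder URL, never contacted
    403,
    "Forbidden",
    None,
    io.BytesIO(b"<title>Access denied</title>"),
)
demo = (fake.code, error_body(fake))
```

In a real session the call would be wrapped as `try: find_api_page(Quantity)` / `except urllib.error.HTTPError as e: print(e.code, error_body(e))`.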

The problem is not with the URL: http://docs.astropy.org/en/v3.1.2/objects.inv exists, and I can download it with my browser or like this:

>>> import requests
>>> requests.get("http://docs.astropy.org/en/v3.1.2/objects.inv")
<Response [200]>

@eteq or anyone - Can you reproduce? What should we do?

I guess if we want to keep it, we should either use requests as a dependency for this, or find the right incantation to do this with urllib?
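
One possible urllib incantation, as a minimal sketch: the Cloudflare page says access was banned based on the browser's signature, so sending an explicit `User-Agent` header might be enough. Whether Cloudflare accepts it is an assumption and has not been verified here. The `USER_AGENT` value and the helper names `build_inventory_request` and `fetch_inventory` are made up for illustration and are not astropy API.

```python
import urllib.request

# Hypothetical User-Agent string for illustration; not what astropy sends.
USER_AGENT = "find-api-page-example/0.1"


def build_inventory_request(baseurl, user_agent=USER_AGENT):
    """Build a Request for the Sphinx inventory with an explicit User-Agent."""
    return urllib.request.Request(
        baseurl + "objects.inv",
        headers={"User-Agent": user_agent},
    )


def fetch_inventory(baseurl, timeout=10):
    """Download objects.inv as bytes; this needs network access."""
    req = build_inventory_request(baseurl)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_inventory("http://docs.astropy.org/en/v3.1.2/")` would then replace the bare `urllib.request.urlopen(baseurl + 'objects.inv')` call shown in the traceback. Note that urllib normalizes the header name, so it is stored as `User-agent`.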

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 20 (20 by maintainers)

Top GitHub Comments

mhvk commented, Jul 24, 2019 (1 reaction)

We’ll presumably know for 3.2.2, no?

mhvk commented, Jul 22, 2019 (1 reaction)

Congratulations on getting 403?!
