URS not working for NSIDC OPeNDAP


I am attempting to access a granule in the NSIDC ECS OPeNDAP instance using pydap, but I am unable to authenticate with URS. Looking at PR #57, I would expect this to work when using pydap v3.2.2:

import os

from pydap.client import open_url
from pydap.cas.urs import setup_session

url = 'https://n5eil02u.ecs.nsidc.org/opendap/OTHR/NISE.004/2012.10.02/NISE_SSMISF17_20121002.HDFEOS'
session = setup_session(os.environ['EARTHDATA_USER'], os.environ['EARTHDATA_PASS'], check_url=url)
open_url(url, session=session)

This raises an HTTPError, which suggests that a redirect is not being followed.

HTTPError                                 Traceback (most recent call last)
<ipython-input-7-4a425d17a509> in <module>
----> 1 open_url('https://n5eil02u.ecs.nsidc.org/opendap/OTHR/NISE.004/2012.10.02/NISE_SSMISF17_20121002.HDFEOS.html')

~/.pyenv/versions/miniconda3-latest/envs/test/lib/python3.7/site-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout)
     65     """
     66     dataset = DAPHandler(url, application, session, output_grid,
---> 67                          timeout).dataset
     68
     69     # attach server-side functions

~/.pyenv/versions/miniconda3-latest/envs/test/lib/python3.7/site-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout)
     52         ddsurl = urlunsplit((scheme, netloc, path + '.dds', query, fragment))
     53         r = GET(ddsurl, application, session, timeout=timeout)
---> 54         raise_for_status(r)
     55         if not r.charset:
     56             r.charset = 'ascii'

~/.pyenv/versions/miniconda3-latest/envs/test/lib/python3.7/site-packages/pydap/net.py in raise_for_status(response)
     37             detail=response.status+'\n'+response.text,
     38             headers=response.headers,
---> 39             comment=response.body
     40         )
     41

HTTPError: 302 Found
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://urs.earthdata.nasa.gov/oauth/authorize?client_id=PGVMJ5nUzSnQkI5o23gMxA&amp;response_type=code&amp;redirect_uri=https%3A%2F%2Fn5eil02u.ecs.nsidc.org%2FOPS%2Fredirect&amp;state=aHR0cHM6Ly9uNWVpbDAydS5lY3MubnNpZGMub3JnL29wZW5kYXAvT1RIUi9OSVNFLjAwNC8yMDEyLjEwLjAyL05JU0VfU1NNSVNGMTdfMjAxMjEwMDIuSERGRU9TLmh0bWwuZGRz">here</a>.</p>
</body></html>

I tried tracing the execution down to the webob.Request but I am not sure at that point what I should be seeing for this to work. Any help much appreciated!
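As a small diagnostic aid, the redirect link in the 302 body above can be unpacked with the standard library. This sketch assumes the `state` query parameter is URL-safe base64, which is what decoding it suggests; `decode_state` is a hypothetical helper name, not part of pydap or URS.

```python
import base64
from html import unescape
from urllib.parse import parse_qs, urlsplit

# href copied verbatim from the 302 response body in the traceback
href = (
    "https://urs.earthdata.nasa.gov/oauth/authorize?client_id=PGVMJ5nUzSnQkI5o23gMxA"
    "&amp;response_type=code"
    "&amp;redirect_uri=https%3A%2F%2Fn5eil02u.ecs.nsidc.org%2FOPS%2Fredirect"
    "&amp;state=aHR0cHM6Ly9uNWVpbDAydS5lY3MubnNpZGMub3JnL29wZW5kYXAvT1RIUi9OSVNFLjAwNC8y"
    "MDEyLjEwLjAyL05JU0VfU1NNSVNGMTdfMjAxMjEwMDIuSERGRU9TLmh0bWwuZGRz"
)


def decode_state(href):
    """Return the decoded `state` parameter of an authorize link (hypothetical helper)."""
    query = urlsplit(unescape(href)).query       # undo the HTML escaping of '&'
    state = parse_qs(query)["state"][0]
    padded = state + "=" * (-len(state) % 4)     # restore base64 padding if it was stripped
    return base64.urlsafe_b64decode(padded).decode("ascii")


print(decode_state(href))
```

The decoded value is the `.dds` URL that pydap requested, so the server is asking the client to log in through URS and then return to that resource. It also shows that the call in the traceback used the `.html` URL and was made without the `session` argument.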

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 6

Top GitHub Comments

wallinb commented, May 9, 2019 (3 reactions)

Thanks to Peter L. Smith at Raytheon for this workaround and the following explanation:

import os
from urllib.parse import urlsplit  # needed by get_session() below

import requests
from requests.packages.urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

sessions = {}

class URSSession(requests.Session):
    def __init__(self, username=None, password=None):
        super(URSSession, self).__init__()
        self.username = username
        self.password = password
        self.original_url = None

    def authenticate(self, url):
        self.original_url = url
        super(URSSession, self).get(url)
        self.original_url = None

    def get_redirect_target(self, resp):
        if resp.is_redirect:
            if resp.headers['location'] == self.original_url:
                # Redirected back to original URL, so OAuth2 complete. Exit here
                return None
        return super(URSSession, self).get_redirect_target(resp)

    def rebuild_auth(self, prepared_request, response):
        # If being redirected to URS and we have credentials, add them in
        # otherwise default session code will look to pull from .netrc
        if "https://urs.earthdata.nasa.gov" in prepared_request.url \
                and self.username and self.password:
            prepared_request.prepare_auth((self.username, self.password))
        else:
            super(URSSession, self).rebuild_auth(prepared_request, response)
        return


def get_session(url):
    """ Get existing session for host or create it
    """
    global sessions
    host = urlsplit(url).netloc

    if host not in sessions:
        session = requests.Session()
        if 'urs' in session.get(url).url:
            session = URSSession(os.environ['EARTHDATA_USER'], os.environ['EARTHDATA_PASS'])
            session.authenticate(url)

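        # Note: urllib3 1.26 renamed method_whitelist to allowed_methods,
        # and the old name was removed in urllib3 2.0.
        retries = Retry(total=5, connect=3, backoff_factor=1, method_whitelist=False,
                        status_forcelist=[400, 401, 403, 404, 408, 500, 502, 503, 504])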
        session.mount('http', HTTPAdapter(max_retries=retries))

        sessions[host] = session

    return sessions[host]

The pydap library starts with the client (the pydap.cas.urs ‘setup_session’ method) making a call to the URS server at urs.earthdata.nasa.gov and providing credentials in a Basic Authorization header. This establishes a logged-in session with the URS service (but not with the opendap service). Next, pydap uses this session to issue a HEAD request to the opendap server for the given resource. Under normal circumstances, the opendap server triggers the URS OAuth2 process and redirects pydap to URS. Because pydap is using a session that has already logged in to URS, this redirect returns immediately (no additional credentials needed) with another redirect back to the opendap server. The opendap server then establishes its own logged-in session before finally redirecting pydap back to the original resource, completing the OAuth2 process. At this point, pydap issues a GET request for the resource and successfully downloads the data.

The issue with the NSIDC opendap server is that HEAD requests are permitted with no authentication required, so the OAuth2 process is never triggered and a logged-in session with the opendap server is never established. This causes the subsequent GET request for the resource to fail.

The sample code above avoids this by creating a session and issuing a GET request for the resource up front. The OAuth2 process is invoked and a URS session is established, followed by an opendap server session. At the tail end of this process, the final redirect back to the original resource is intercepted and halted (we don’t wish to actually download it at this point; we want to leave that to pydap). However, the session has established logins with both URS and the opendap server. This session is then passed to pydap, whereupon it will be used for the HEAD and subsequent GET request as per normal.

If the NSIDC opendap server were configured to require authentication for HEAD requests, I believe pydap would work out-of-the-box with it. However, it would not be very efficient because at the end of the OAuth2 process, the final redirect back to the original resource is converted from a HEAD request to a GET request, resulting in the entire file being transmitted (or at least partially transmitted/buffered before the client disconnects).
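The interception step of the workaround (stop following redirects once the server sends the client back to the URL it originally asked for) can be isolated as a pure function. This is only a sketch of the rule that `URSSession.get_redirect_target` applies: the name `next_redirect` and the explicit status-code tuple are assumptions made for illustration, and the real class delegates the redirect check to requests.

```python
from typing import Optional

# Assumption: the status codes that count as a followable redirect here.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def next_redirect(status_code: int, location: Optional[str],
                  original_url: str) -> Optional[str]:
    """Return the URL to follow next, or None to stop (hypothetical helper)."""
    if location is None or status_code not in REDIRECT_CODES:
        return None  # not a redirect at all
    if location == original_url:
        # Redirected back to the original resource: both logins are done,
        # so stop here and leave the actual download to pydap.
        return None
    return location  # any other hop (URS, the opendap callback) is followed


original = "https://example.org/opendap/granule"  # placeholder URL
print(next_redirect(302, "https://urs.earthdata.nasa.gov/oauth/authorize", original))
print(next_redirect(302, original, original))
```

With the workaround in place, the session returned by `get_session(url)` is then handed to pydap as `open_url(url, session=session)`, matching the call shown in the original report.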

simonrp84 commented, Sep 28, 2020 (0 reactions)

Likewise, the issue persists.

I think that the fix described by @wallinb works, but it also seems to download the entire file. Calling session = get_session(url) produces this debug message:

2020-09-28 13:57:22,176 - urllib3.connectionpool - DEBUG - https://ladsweb.modaps.eosdis.nasa.gov:443 "GET /archive/allData/5200/VJ102IMG/2020/225/VJ102IMG.A2020225.0418.002.2020225101640.nc HTTP/1.1" 200 268321100

There is also a long pause before Python becomes responsive again, which makes me think it is downloading the file.
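To put a number on that suspicion, the debug line can be parsed with the standard library. This sketch assumes the usual urllib3 connection-pool log layout, in which the two trailing fields are the HTTP status and the response length in bytes; the regular expression is written for this one line only.

```python
import re

# Debug line copied from the comment above
log_line = (
    '2020-09-28 13:57:22,176 - urllib3.connectionpool - DEBUG - '
    'https://ladsweb.modaps.eosdis.nasa.gov:443 '
    '"GET /archive/allData/5200/VJ102IMG/2020/225/'
    'VJ102IMG.A2020225.0418.002.2020225101640.nc HTTP/1.1" 200 268321100'
)

# Capture: method, path, status code, reported body length
match = re.search(r'"(\w+) (\S+) HTTP/[\d.]+" (\d{3}) (\d+)$', log_line)
method, path, status, length = match.groups()

size_bytes = int(length)
size_mib = size_bytes / 1024 ** 2
print(f"{method} -> {status}, reported body size {size_mib:.1f} MiB")
```

A reported length of roughly 256 MiB on a 200 response to a GET is consistent with the whole granule being requested during session setup, rather than the final redirect being halted.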
