Decode error for Unicode characters
Maybe related to https://github.com/pydap/pydap/pull/152 and https://github.com/pydap/pydap/issues/164
I am trying to serve a netCDF file via pydap. The file in question works fine in a standard Python console with direct access using python-netcdf4, but in pydap the DAS is not available on the web interface, and the Apache log returns this error:
[Wed May 22 13:55:25.685392 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934] File "/usr/local/lib/python3.6/dist-packages/pydap/responses/das.py", line 44, in __iter__, referer: http://dap.metsis.met.no/
[Wed May 22 13:55:25.685402 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934] #yield line.encode('ascii'), referer: http://dap.metsis.met.no/
[Wed May 22 13:55:25.685429 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934] UnicodeEncodeError: 'ascii' codec can't encode character '\\xd8' in position 33: ordinal not in range(128), referer: http://dap.metsis.met.no/
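The encode failure in the log can be reproduced outside pydap. The snippet below is only a minimal sketch: the attribute value is hypothetical, chosen because it contains 'Ø' (U+00D8, the character reported as \xd8), and it does not use any pydap code.

```python
# Hypothetical metadata value containing 'Ø' (U+00D8).
line = "String station_name \"\u00d8rland\";"

ascii_error = None
try:
    # The same call the DAS response makes on each line.
    line.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc
    print(exc)

# UTF-8 can represent the character; 'Ø' becomes the two bytes 0xC3 0x98.
utf8_bytes = line.encode("utf-8")
print(utf8_bytes)
```

Once the server emits UTF-8, the byte 0xC3 appears in the response body, which is why a client that still decodes as ASCII fails on that byte.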
A bad hack to fix the DAS … is to add an exception handler and try again using utf-8 … which now gives me a working page for the DAS, but this doesn't fix pydap.client … as the error when trying to load such a dataset is:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-1-2bb713f8a88f> in <module>
1 from pydap.client import open_url
----> 2 dataset = open_url('http://dap.metsis.met.no/SN99938.nc')
/usr/local/lib/python3.7/dist-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout, verify)
65 """
66 dataset = DAPHandler(url, application, session, output_grid,
---> 67 timeout=timeout, verify=verify).dataset
68
69 # attach server-side functions
/usr/local/lib/python3.7/dist-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout, verify)
61 verify=verify)
62 raise_for_status(r)
---> 63 das = safe_charset_text(r)
64
65 # build the dataset from the DDS and add attributes from the DAS
/usr/local/lib/python3.7/dist-packages/pydap/handlers/dap.py in safe_charset_text(r)
115 else:
116 r.charset = get_charset(r)
--> 117 return r.text
118
119
/usr/local/lib/python3.7/dist-packages/webob/response.py in _text__get(self)
620 decoding = self.charset or self.default_body_encoding
621 body = self.body
--> 622 return body.decode(decoding, self.unicode_errors)
623
624 def _text__set(self, value):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 94: ordinal not in range(128)
If I put a print statement in /usr/local/lib/python3.7/dist-packages/webob/response.py at line 622, it tells me the decoding is set to ascii, while to work in my case it should be utf-8. The decoding is defined a few lines above by: decoding = self.charset or self.default_body_encoding. So adding another try/except to switch to utf-8 … will work, but this is a hack and, most importantly … this is happening on the client side … where I have no control over the pydap version used by a potential user. Do you have any suggestions?
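The try/except fallback described above can be sketched in isolation. This is not pydap or webob code: `decode_body` is a hypothetical helper, and the sample bytes are made up to stand in for a DAS response containing 'Ø'.

```python
from typing import Optional


def decode_body(body: bytes, declared_charset: Optional[str] = None) -> str:
    """Hypothetical helper: decode with the declared charset, else retry as UTF-8."""
    encoding = declared_charset or "ascii"
    try:
        return body.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an unknown charset name.
        # This retry can still raise if the body is not valid UTF-8.
        return body.decode("utf-8")


# Made-up DAS fragment, encoded as UTF-8 but declared as ASCII.
das_fragment = "String station_name \"\u00d8rland\";".encode("utf-8")
text = decode_body(das_fragment, "ascii")
print(text)
```

A fallback like this only hides the mismatch. The more robust fix is for the server to declare the charset it actually uses, so that clients on any pydap version decode the body correctly.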
This seemed to work for me, in pydap/handlers/dap.py
Can you help with this? This is a serious setback for countries that use Unicode characters in their netCDF metadata 😦