Decode error for Unicode characters


Maybe related to https://github.com/pydap/pydap/pull/152 and https://github.com/pydap/pydap/issues/164

I am trying to serve a netCDF file via pydap. The file works fine in a standard Python console with direct access through python-netcdf4, but on the pydap web interface the DAS is not available and the Apache log shows this error:

[Wed May 22 13:55:25.685392 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934]   File "/usr/local/lib/python3.6/dist-packages/pydap/responses/das.py", line 44, in __iter__, referer: http://dap.metsis.met.no/
[Wed May 22 13:55:25.685402 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934]     #yield line.encode('ascii'), referer: http://dap.metsis.met.no/
[Wed May 22 13:55:25.685429 2019] [wsgi:error] [pid 20625:tid 140168119965440] [client 157.249.114.74:44934] UnicodeEncodeError: 'ascii' codec can't encode character '\\xd8' in position 33: ordinal not in range(128), referer: http://dap.metsis.met.no/
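
A minimal sketch of the failure in the log above, using only the standard library. It assumes the offending character is U+00D8 ('Ø'), which is what the escaped \xd8 in the message denotes; the attribute text itself is made up for illustration, since the log does not show it.

```python
# Hypothetical DAS line containing U+00D8; the real attribute text
# is not shown in the Apache log.
line = 'String station_name "BJ\u00d8RN\u00d8YA";'

try:
    line.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    # Same class of error as in the log: 'ascii' codec can't encode '\xd8'
    ascii_failed = True
    print(exc)

# UTF-8 can represent the character; U+00D8 becomes the two bytes C3 98.
utf8_bytes = line.encode("utf-8")
print(utf8_bytes)
```

The UTF-8 form of U+00D8 starts with byte 0xc3, which is the same byte value reported by the client-side UnicodeDecodeError further down.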

A bad hack to fix the DAS is to catch the exception and retry using utf-8. That gave me a working page for the DAS, but it doesn't fix pydap.client: the error when trying to load such a dataset is:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-1-2bb713f8a88f> in <module>
      1 from pydap.client import open_url
----> 2 dataset = open_url('http://dap.metsis.met.no/SN99938.nc')

/usr/local/lib/python3.7/dist-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout, verify)
     65     """
     66     dataset = DAPHandler(url, application, session, output_grid,
---> 67                          timeout=timeout, verify=verify).dataset
     68 
     69     # attach server-side functions

/usr/local/lib/python3.7/dist-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout, verify)
     61                 verify=verify)
     62         raise_for_status(r)
---> 63         das = safe_charset_text(r)
     64 
     65         # build the dataset from the DDS and add attributes from the DAS

/usr/local/lib/python3.7/dist-packages/pydap/handlers/dap.py in safe_charset_text(r)
    115     else:
    116         r.charset = get_charset(r)
--> 117         return r.text
    118 
    119 

/usr/local/lib/python3.7/dist-packages/webob/response.py in _text__get(self)
    620         decoding = self.charset or self.default_body_encoding
    621         body = self.body
--> 622         return body.decode(decoding, self.unicode_errors)
    623 
    624     def _text__set(self, value):

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 94: ordinal not in range(128)

If I put a print statement at line 622 of /usr/local/lib/python3.7/dist-packages/webob/response.py, it tells me the decoding is set to ascii, whereas in my case it should be utf-8. The decoding is defined a few lines above by: decoding = self.charset or self.default_body_encoding. So adding another try/except to switch to utf-8 will work, but this is a hack and, most importantly, it happens on the client side, where I have no control over the pydap version used by a potential user. Do you have any suggestions?
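
The try/except fallback described above can be sketched without pydap or webob. This is only an illustration of the idea, not code from either library: decode_body is a hypothetical helper, and the sample body is invented.

```python
from typing import Optional


def decode_body(body: bytes, charset: Optional[str] = None) -> str:
    """Decode a response body, retrying with UTF-8 (hypothetical helper)."""
    # Mirrors the 'ascii' default seen in the traceback when no charset is set.
    declared = charset or "ascii"
    try:
        return body.decode(declared)
    except UnicodeDecodeError:
        # Fallback: may still raise if the body is not valid UTF-8 either.
        return body.decode("utf-8")


# Invented sample body with a non-ASCII character, sent without a charset.
body = 'String name "Bj\u00f8rn\u00f8ya";'.encode("utf-8")
text = decode_body(body)
print(text)
```

A declared charset is still honoured first, so a response that correctly announces, say, latin-1 is not affected by the fallback.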

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

petejan commented, Mar 30, 2020 (2 reactions)

This seemed to work for me, in pydap/handlers/dap.py

 def get_charset(r):
     charset = r.charset
     if not charset:
-        charset = 'ascii'
+        charset = 'utf-8'
     return charset
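
For reference, the function as it would read with the change above applied, exercised against stand-in objects. The SimpleNamespace responses are hypothetical substitutes for the real HTTP response; only the charset attribute is assumed.

```python
from types import SimpleNamespace


def get_charset(r):
    """Return the response's declared charset, defaulting to UTF-8."""
    charset = r.charset
    if not charset:
        charset = "utf-8"
    return charset


# Stand-in responses (hypothetical; real ones come from the HTTP layer).
no_charset = SimpleNamespace(charset=None)
latin1 = SimpleNamespace(charset="latin-1")

print(get_charset(no_charset))
print(get_charset(latin1))
```

Because UTF-8 is a superset of ASCII, responses that contain only ASCII bytes decode to the same text under the new default.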

epifanio commented, Jun 13, 2019 (1 reaction)

Can you help with this? It is a crucial setback for countries that use Unicode characters in their netCDF metadata 😦
