to_csv encoding problem?

I’m getting a UnicodeEncodeError when I try to write a table with to_csv. The table was made by joining two other tables, both loaded from CSV files using utf-8 encoding. The character that seems to have caused the error is apparently just an en-dash – nothing too exotic. Not sure if I’m doing something wrong or if something worse is going on.

The traceback:

Traceback (most recent call last):
  File "path/goes/here/lobbyists.py", line 57, in <module>
    combo.to_csv('combo.csv')
  File "/Users/theofrancis/miniconda3/lib/python3.5/site-packages/agate/table/__init__.py", line 408, in to_csv
    writer.writerow(tuple(csv_funcs[i](d) for i, d in enumerate(row)))
  File "/Users/theofrancis/miniconda3/lib/python3.5/site-packages/agate/csv_py3.py", line 91, in writerow
    self.writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 282: ordinal not in range(128)
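
The 'ascii' codec in the last line suggests the output file was opened with ASCII as its text encoding. In Python 3 that typically happens when open() is called without an encoding argument and the locale's preferred encoding is ASCII. The sketch below is a standalone reproduction under that assumption, using only the standard library; the row contents and the write_row helper are hypothetical and it does not call agate.

```python
import csv
import locale
import os
import tempfile

# With no encoding argument, open() typically uses this value for text files.
print(locale.getpreferredencoding(False))

# Hypothetical row containing an en dash (U+2013), like the one in the traceback.
row = ["Smith", "2010\u20132012"]


def write_row(path, encoding):
    """Write the row to a CSV file opened with an explicit text encoding."""
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerow(row)


path = os.path.join(tempfile.mkdtemp(), "combo.csv")

# An ASCII-encoded file cannot represent U+2013, so this write fails.
try:
    write_row(path, "ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)

# The same row written as UTF-8 succeeds and reads back unchanged.
write_row(path, "utf-8")
with open(path, encoding="utf-8", newline="") as f:
    round_tripped = next(csv.reader(f))
```

If the first print shows an ASCII-family name such as ANSI_X3.4-1968 or US-ASCII, the locale settings of the environment running the script are a likely culprit.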

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

6 reactions
bitlather commented, Sep 5, 2018

This was among the top results when I was looking up the error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

I thought I’d show how I fixed my particular issue for anyone else who ends up here.

Replicating The Problem

Consider this script:

import csv

data = [["a", "b", u'\xe9']]

with open("output.csv", "w") as csv_file:
    writer = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
    writer.writerows(data)

When I run it, I get:

$ python tmp_csv_script.py 
Traceback (most recent call last):
  File "tmp_csv_script.py", line 7, in <module>
    writer.writerows(data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

Solution

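Set default encoding to utf8. (This works on Python 2 only: Python 3 has no builtin reload and no sys.setdefaultencoding.)

import csv

import sys
# reload(sys) restores sys.setdefaultencoding, which site.py deletes at startup.
reload(sys)
# Changes the implicit str/unicode conversion codec for the whole process.
sys.setdefaultencoding('utf8')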

data = [["a", "b", u'\xe9']]

with open("output.csv", "w") as csv_file:
    writer = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
    writer.writerows(data)

Running this was successful for me, and the output looked like this:

"a","b","é"

Further Reading

There is an excellent blog article that might be able to help you out more, if you don’t want to set the default encoding:

Python: UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 11: ordinal not in range(128)
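
For anyone on Python 3, one alternative that avoids touching the default encoding is to pass the encoding to open() directly. This is a minimal sketch of the same script under that assumption, not something taken from the linked article; it writes to a temporary directory rather than the current one.

```python
import csv
import os
import tempfile

data = [["a", "b", "\xe9"]]

# Assumption for this sketch: write into a throwaway temporary directory.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# An explicit encoding makes the result independent of the locale.
# newline="" lets the csv module control line endings itself.
with open(path, "w", encoding="utf-8", newline="") as csv_file:
    writer = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
    writer.writerows(data)

# Read the file back without newline translation to inspect what was written.
with open(path, encoding="utf-8", newline="") as csv_file:
    content = csv_file.read()
```

Using "utf-8-sig" instead of "utf-8" additionally writes a byte order mark, which some spreadsheet applications use to detect the encoding.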

0 reactions
jpmckinney commented, Jul 14, 2021

Noting that this might help solve some of the encoding issues on Windows prior to Python 3.8 (currently skipped in CI).


Top Results From Across the Web

Pandas df.to_csv("file.csv" encode="utf-8") still gives trash ...
A simple, complete example that reproduces the problem is what we want: df = pd.DataFrame({"A": ['a', '≥']}); df.to_csv('test.csv') , ...

Pandas to_csv Encoding Error Solution - varunpramanik.com
The error was unusual to me because I was using Pandas in a way I typically would, on data that should not have...

Python 3 writing to_csv file ignores encoding argument. #13068
is missing the UTF8 BOM (encoded with default encoding UTF8) with open('path_to_f', 'w') as f: df.to_csv(f, encoding='utf-8-sig') # is not ...

pandas.DataFrame.to_csv — pandas 0.18.0 documentation
A string representing the encoding to use in the output file, defaults to 'ascii' on Python 2 and 'utf-8' on Python 3. compression...

PYTHON : Pandas df.to_csv("file.csv" encode="utf-8") still ...
