read_csv character encoding bug?

This is a weird one from StackOverflow: the file contains some \x00 (NUL) characters, which seem to be ignored when printing but confuse read_csv:

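import pandas as pd
from io import StringIO  # Python 3; the Python 2 session below would use the StringIO module instead
x = 'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'
X = StringIO(x)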

In [3]: pd.read_csv(X)
Out[3]: 
     x    y
0          
1  NaN  NaN
2    I  Swp
3    I  Swp

In [4]: print x
x,y
 ,Reg
 ,Reg
I,Swp
I,Swp
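
One way to work around the behaviour above is to locate and remove the NUL characters before the text reaches the parser. The following is a minimal Python 3 sketch, not a fix proposed in the issue itself; the helper names `find_nul_lines` and `strip_nuls` are hypothetical, and it assumes the NULs carry no meaning and can simply be dropped.

```python
from io import StringIO

# Same sample text as in the report above.
raw = 'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'


def find_nul_lines(text):
    """Return the 1-based numbers of the lines that contain a NUL character."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if '\x00' in line
    ]


def strip_nuls(text):
    """Return a copy of the text with every NUL character removed."""
    return text.replace('\x00', '')


# print() hides the NULs, so inspect the text explicitly instead.
nul_lines = find_nul_lines(raw)

# Wrap the cleaned text in a file-like object for a CSV parser.
cleaned = StringIO(strip_nuls(raw))
```

The `cleaned` buffer can then be handed to `pd.read_csv(cleaned)` in place of the original `X`. Whether silently dropping the characters is appropriate depends on where the file came from; if the NULs indicate a wrong encoding, decoding the file correctly is the better fix.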

Issue Analytics

  • State: closed
  • Created 11 years ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

1 reaction
jreback commented, Jul 31, 2016

If you want to put up tests for the C engine and a nice error message for the Python engine, then we can close this.

0 reactions
wesm commented, Aug 1, 2016

It seems that if the regex delimiter problem can be addressed (easier said than done), it may be possible to deprecate the Python engine. That would be easier in a possible pandas 2.0 future, in which we might add libre2 to the build/development toolchain.
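
For context on the regex delimiter problem: in pandas, a separator that is a regular expression is handled by the Python engine only. The sketch below shows one possible way to sidestep that by rewriting the delimiter to a single character before parsing. It is an illustration only, not something proposed in the thread; the `||` delimiter pattern and the `normalize_delimiter` helper are hypothetical, and the approach assumes the delimiter pattern never occurs inside a field value.

```python
import re

# Hypothetical multi-character delimiter: "||" with optional surrounding whitespace.
DELIMITER = re.compile(r"\s*\|\|\s*")


def normalize_delimiter(text, pattern=DELIMITER, replacement=","):
    """Replace every regex-delimiter match with a single character, line by line.

    Working per line keeps the whitespace part of the pattern from
    swallowing the newline at the end of a row.
    """
    lines = (pattern.sub(replacement, line) for line in text.splitlines())
    return "\n".join(lines) + "\n"


sample = "a || b\n1 || 2\n"
normalized = normalize_delimiter(sample)
```

The normalized text uses a plain comma, so it can be read with the default engine, whereas reading `sample` directly would need something like `sep=r"\s*\|\|\s*"` together with `engine="python"`.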
