SUB-character in a csv causes read_csv() with C-Engine to detect EOF
Problem description
If a string in a CSV file contains a SUB character (0x1A), read_csv() with the default C engine raises:
ParserError: Error tokenizing data. C error: EOF inside string starting at line 0
The Python-engine can read the file fine.
It seems I can’t put example data with a SUB-character here, so I pasted an example line here instead:
https://pastebin.com/x6QPY4Hf
Just paste the line into a CSV file and try to read it with read_csv().
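For readers who cannot fetch the pastebin line, here is a minimal sketch of the same kind of input built in memory. The column names and values are made up for illustration and are not the original data, and reading from io.StringIO instead of a file on disk is an assumption that may not trigger the error in exactly the same way. The C engine call is wrapped in try/except because the outcome depends on the installed pandas version.

```python
import io

import pandas as pd

# A quoted field containing the SUB control character (0x1A, Ctrl-Z).
# Hypothetical sample data, not the line from the pastebin link.
csv_text = 'id,comment\n1,"before\x1aafter"\n2,"plain"\n'

# The Python engine is reported to read such input without problems.
df_python = pd.read_csv(io.StringIO(csv_text), engine="python")

# The C engine raised ParserError on pandas 0.20.2 according to this report.
# Other versions may parse the data, so both outcomes are handled here.
try:
    df_c = pd.read_csv(io.StringIO(csv_text), engine="c")
    c_result = "C engine parsed {} rows".format(len(df_c))
except pd.errors.ParserError as exc:
    c_result = "C engine failed: {}".format(exc)

print(df_python)
print(c_result)
```

On an affected version, passing engine="python" is the workaround that matches the behaviour described above.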
I don’t know whether this behaviour is expected, since this character is indeed used as EOF in certain cases. However, I see little sense in interpreting a SUB character as EOF in the middle of a CSV file.
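To check whether a given file actually contains SUB characters before blaming the parser, a small standard-library helper like the one below can list their byte offsets. This is only a diagnostic sketch; the function name find_sub_offsets is hypothetical and not part of pandas.

```python
SUB = b"\x1a"  # ASCII SUB (Ctrl-Z), historically used as an EOF marker


def find_sub_offsets(data: bytes) -> list:
    """Return the byte offsets of every SUB character in data."""
    offsets = []
    start = 0
    while True:
        pos = data.find(SUB, start)
        if pos == -1:
            return offsets
        offsets.append(pos)
        start = pos + 1


# Hypothetical sample: one SUB inside a quoted field, one at the very end.
sample = b'a,"b\x1ac"\n\x1a'
print(find_sub_offsets(sample))
```

For a real file, read it in binary mode (open(path, "rb")) so that no platform-specific text handling interferes with the bytes.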
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
pandas: 0.20.2
Issue Analytics
- State:
- Created 6 years ago
- Comments:9 (5 by maintainers)
Top GitHub Comments
Thanks for adding the test. Since I’m only here on weekdays you were faster than me. 😃
Can confirm that updating solves the problem.