error_bad_lines is ignored if names argument is used in read_csv function

Code Sample, a copy-pastable example if possible

# Example taken from issue #20573
import io
import pandas as pd

# The second row has 5 fields instead of the expected 4
buf = io.StringIO("0,1,Amigo,3\n1,1,Inimigo,amigo,9\n2,1,Cowboy,42\n")
names = ['ID', 'X1', 'X2', 'X3']
dtypes = {"X3": int}
# error_bad_lines=False should skip the malformed row, but a ValueError is raised instead
pd.read_csv(buf, names=names, error_bad_lines=False, dtype=dtypes, header=None)

Problem description

The bad-lines option (error_bad_lines=False) is ignored when the names argument is used. Without names, everything works fine with pandas 0.23.3 (see issue #20573), but with names a ValueError is raised (ValueError: invalid literal for int() with base 10: 'amigo').

Expected Output

b'Skipping line 3: expected 4 fields, saw 5\n'

ID	X1	X2	X3
0	1	Amigo	3
2	1	Cowboy	42
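
Until the option behaves as expected, one possible workaround is to drop malformed rows before pandas ever sees them. The sketch below is not part of pandas and is not the fix for this issue; `drop_malformed_rows` is a hypothetical helper that uses only the standard-library `csv` module and assumes a row is "bad" exactly when its field count differs from `len(names)`.

```python
import csv
import io


def drop_malformed_rows(text, n_fields):
    """Return (clean_buffer, skipped_record_numbers) for CSV text.

    Hypothetical helper: records whose field count differs from
    n_fields are left out of the returned buffer, and their 1-based
    record numbers are collected instead.
    """
    kept = io.StringIO()
    writer = csv.writer(kept, lineterminator="\n")
    skipped = []
    for record_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) == n_fields:
            writer.writerow(row)
        else:
            skipped.append(record_no)
    kept.seek(0)  # rewind so a reader starts from the first kept row
    return kept, skipped


raw = "0,1,Amigo,3\n1,1,Inimigo,amigo,9\n2,1,Cowboy,42\n"
names = ["ID", "X1", "X2", "X3"]

clean, skipped = drop_malformed_rows(raw, len(names))
print(clean.getvalue())
print("skipped records:", skipped)
```

The returned buffer can then be handed to the original call, for example `pd.read_csv(clean, names=names, dtype={"X3": int}, header=None)`, without relying on error_bad_lines at all. Note that this filter also drops rows with too few fields, which pandas would normally pad with NaN.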

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-24-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.3
pytest: 3.4.2
pip: 9.0.2
setuptools: 38.5.2
Cython: 0.27.3
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.8.0
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.0
openpyxl: 2.5.1
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.5
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: 0.1.4
pandas_gbq: None
pandas_datareader: None