read_fwf: skip_blank_lines does nothing
Code Sample, a copy-pastable example if possible
from io import StringIO
import pandas as pd
# The second line of the input is blank (see the problem description below).
f = StringIO("A B\n\nC D")
df = pd.read_fwf(f, colspecs=[(0, 1), (2, 3)], header=None, skip_blank_lines=True)
print(df)
Problem description
Output:
     0    1
0    A    B
1  NaN  NaN
2    C    D
The (second) blank line is not skipped, but instead there is a row with two NaN values. It seems that skip_blank_lines has no effect on read_fwf. On the other hand, read_csv(f, sep=' ', header=None) produces the expected output below.
Expected Output
   0  1
0  A  B
1  C  D
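On a pandas version where read_fwf still returns the all-NaN row, one possible workaround is to drop rows in which every field is missing after parsing. This is only a sketch, not something suggested in the issue itself. It assumes that a genuine data row is never entirely empty, and the name `cleaned` is made up for illustration.

```python
from io import StringIO

import pandas as pd

# Same input as in the report: the second line is blank.
data = "A B\n\nC D"

df = pd.read_fwf(StringIO(data), colspecs=[(0, 1), (2, 3)], header=None)

# Drop rows in which every field is NaN, then renumber the index.
# If the parser already skipped the blank line, this changes nothing.
cleaned = df.dropna(how="all").reset_index(drop=True)
print(cleaned)
```

Because dropna(how="all") only removes rows that are completely empty, rows with some missing fields are kept.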
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None python: 3.6.6.final.0 python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None
pandas: 0.23.1 pytest: 3.6.2 pip: 18.0 setuptools: 39.0.1 Cython: None numpy: 1.15.1 scipy: None pyarrow: None xarray: None IPython: 6.5.0 sphinx: 1.5.5 patsy: None dateutil: 2.7.3 pytz: 2018.5 blosc: None bottleneck: None tables: None numexpr: None feather: None matplotlib: None openpyxl: None xlrd: None xlwt: None xlsxwriter: None lxml: None bs4: None html5lib: 1.0.1 sqlalchemy: None pymysql: None psycopg2: None jinja2: 2.10 s3fs: None fastparquet: None pandas_gbq: None pandas_datareader: None
Issue Analytics
- State:
- Created 5 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
@dcdenu4 or anyone can submit a PR. pandas is all volunteer and we have 3000+ open issues.
As mentioned above, the original issue isn't reproducible anymore and there is a test covering this case (test_fwf_skip_blank_lines): https://github.com/pandas-dev/pandas/blob/36dcf519c67a8098572447f7d5a896740fc9c464/pandas/tests/io/parser/test_read_fwf.py#L354
As for read_csv, I think the behavior is as expected (NaN for comma-separated blank values). I guess this issue can be closed?
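One way to read the read_csv remark is that a truly blank line is different from a line containing only separators. The sketch below shows that difference. The two input strings and the variable names are made up for illustration and do not come from the thread.

```python
from io import StringIO

import pandas as pd

# A truly blank second line: skipped, since skip_blank_lines defaults to True.
blank = pd.read_csv(StringIO("A,B\n\nC,D"), header=None)

# A second line holding only a separator: it is not blank, so it is kept
# and both of its empty fields are parsed as NaN.
empty_fields = pd.read_csv(StringIO("A,B\n,\nC,D"), header=None)

print(blank)
print(empty_fields)
```

The first frame should have two rows and the second three, with the middle row all NaN.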