read_csv in combination with index_col and usecols


Starting point:

http://pandas.pydata.org/pandas-docs/stable/io.html#index-columns-and-trailing-delimiters

If there is one more column of data than there are column names, usecols exhibits some (at least for me) unintuitive behavior:

>>> from io import StringIO
>>> import pandas as pd
>>> data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'
>>> pd.read_csv(StringIO(data))
        a    b     c
4   apple  bat   5.7
8  orange  cow  10.0
>>> pd.read_csv(StringIO(data), usecols=['a', 'b'])
   a       b
0  4   apple
1  8  orange

I was expecting it to be equal to

>>> pd.read_csv(StringIO(data))[['a', 'b']]
        a    b
4   apple  bat
8  orange  cow

I am not sure whether my expectation is unfounded, though. Is this behavior intentional?

Issue Analytics

  • State: closed
  • Created 11 years ago
  • Comments:7 (4 by maintainers)

Top GitHub Comments

gfyoung commented, Oct 6, 2017 (1 reaction)

That’s not a bug. index_col is relative to usecols. In this case, you only have one column that you want to extract from the CSV, but you want two columns for the index.

gfyoung commented, Oct 6, 2017 (0 reactions)

Exactly. That’s part of what I was proposing you do in #9098.
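
For the data in the original question, one possible workaround (a sketch, not something proposed in the thread) is to stop relying on the implicit index and name every column explicitly, so that usecols and index_col refer to the same set of names. The label 'idx' for the unnamed leading column is a made-up name for illustration, and skipping the header line with skiprows=1 is an assumption about how you would want to handle the mismatched header.

```python
from io import StringIO

import pandas as pd

data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'

# 'idx' is a hypothetical name for the leading column that the header omits.
names = ['idx', 'a', 'b', 'c']

df = pd.read_csv(
    StringIO(data),
    skiprows=1,                 # skip the original three-name header line
    header=None,                # the remaining lines are all data
    names=names,                # one name per data column
    usecols=['idx', 'a', 'b'],  # the index column must be among the used columns
    index_col='idx',            # refer to the index by name, not by position
)
print(df)
```

The values match pd.read_csv(StringIO(data))[['a', 'b']], except that the index now carries the name 'idx' instead of being unnamed.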
