read_csv in combination with index_col and usecols
Starting point:
http://pandas.pydata.org/pandas-docs/stable/io.html#index-columns-and-trailing-delimiters
If there is one more column of data than there are column names, usecols exhibits some (at least to me) unintuitive behavior:
>>> import pandas as pd
>>> from io import StringIO
>>> data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'
>>> pd.read_csv(StringIO(data))
        a    b     c
4   apple  bat   5.7
8  orange  cow  10.0
>>> pd.read_csv(StringIO(data), usecols=['a', 'b'])
   a       b
0  4   apple
1  8  orange
I was expecting it to be equal to
>>> pd.read_csv(StringIO(data))[['a', 'b']]
        a    b
4   apple  bat
8  orange  cow
I am not sure whether my expectation is unfounded, though. Is this behavior indeed intentional?
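For anyone who wants the result described above without relying on how usecols treats the implicit index, here is a minimal sketch of two options. The first simply selects the columns after reading. The second names every field explicitly, including the unnamed leading one; the label 'idx' is a hypothetical name chosen for illustration and does not come from the original data.

```python
from io import StringIO

import pandas as pd

data = 'a,b,c\n4,apple,bat,5.7\n8,orange,cow,10'

# Option 1: read everything, then select. The extra leading field
# becomes the index, and the selection keeps that index.
selected = pd.read_csv(StringIO(data))[['a', 'b']]

# Option 2 (assumption: 'idx' is a made-up label for the unnamed
# first field). header=0 discards the original header row, names
# supplies one label per field, and index_col refers to the label.
explicit = pd.read_csv(
    StringIO(data),
    header=0,
    names=['idx', 'a', 'b', 'c'],
    usecols=['idx', 'a', 'b'],
    index_col='idx',
)

print(selected)
print(explicit)
```

Both frames hold the same values; the only difference is that the second one has an index named 'idx'.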
Issue Analytics
- State:
- Created 11 years ago
- Comments:7 (4 by maintainers)
That’s not a bug. index_col is relative to usecols. In this case, you only have one column that you want to extract from the CSV, but you want two columns for the index.
Exactly. That’s part of what I was proposing you do in #9098.
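To illustrate the point that index_col is relative to usecols, here is a small sketch. The four-column CSV is made up for this example and is not the data from the issue. Under the assumption that a positional index_col counts within the columns kept by usecols, position 0 should pick 'b' (the first kept column) rather than 'a' (the first column in the file).

```python
from io import StringIO

import pandas as pd

# Made-up data for illustration only.
data = 'a,b,c,d\nA,a,1,one\nB,b,2,two'

# Keep only 'b' and 'c'; index_col=0 is interpreted relative to
# that selection, so it refers to 'b'.
by_position = pd.read_csv(StringIO(data), usecols=['b', 'c'], index_col=0)

# Naming the index column avoids depending on positions at all.
by_name = pd.read_csv(StringIO(data), usecols=['b', 'c'], index_col='b')

print(by_position)
print(by_name)
```

Passing the column name is probably the less surprising choice when usecols is also in play, since it reads the same regardless of how positions are counted.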