numpy error using read_csv with parse_dates=[...] and index_col=[...]

Consider a file of the following format:

week,sow,prn,rxstatus,az,elv,l1_cno,s4,s4_cor,secsigma1,secsigma3,secsigma10,secsigma30,secsigma60,code_carrier,c_cstdev,tec45,tecrate45,tec30,tecrate30,tec15,tecrate15,tec00,tecrate00,l1_loctime,chanstatus,l2_locktime,l2_cno
1765,68460.00,126,00E80000,0.00,0.00,39.38,0.118447,0.107595,0.252663,0.532384,0.600540,0.603073,0.603309,-13.255543,0.114,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,1692.182,8C023D84,0.000,0.00
1765,68460.00,23,00E80000,0.00,0.00,53.48,0.034255,0.021177,0.035187,0.042985,0.061142,0.061738,0.061801,-22.760003,0.015,24.955111,0.112239,25.115330,-0.119774,25.146603,-0.065852,24.747576,-0.243804,10426.426,08109CC4,10409.660,44.52
1765,68460.00,13,00E80000,0.00,0.00,54.28,0.046218,0.019314,0.037818,0.056421,0.060602,0.060698,0.060735,-20.679035,0.090,25.670250,-0.070761,25.752224,-0.055089,26.045048,-0.180056,25.360369,-0.062119,7553.020,18109CA4,7202.660,47.27

I try to read it with the following code:

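import pandas as pd

# FILE is the path to the CSV above; GPStime2datetime is my custom date parser.
data = pd.read_csv(FILE, date_parser=GPStime2datetime,
                   parse_dates={'datetime': ['week', 'sow']},
                   index_col=['datetime', 'prn'])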

Here I’m parsing week and sow into a datetime column using a custom function (this works properly) and using datetime and the prn column as a MultiIndex. The file is read successfully when index_col='datetime', but not when trying to create the MultiIndex using index_col=['datetime', 'prn'] (or when using column numbers instead of names). I get the following traceback:

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 474, in parser_f
    return _read(filepath_or_buffer, kwds)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 260, in _read
    return parser.read()

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 721, in read
    ret = self._engine.read(nrows)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1223, in read
    index, names = self._make_index(data, alldata, names)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 898, in _make_index
    index = self._agg_index(index, try_parse_dates=False)

  File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 984, in _agg_index
    index = MultiIndex.from_arrays(arrays, names=self.index_names)

  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 4410, in from_arrays
    cats = [Categorical.from_array(arr, ordered=True) for arr in arrays]

  File "C:\Anaconda\lib\site-packages\pandas\core\categorical.py", line 355, in from_array
    return Categorical(data, **kwargs)

  File "C:\Anaconda\lib\site-packages\pandas\core\categorical.py", line 271, in __init__
    codes, categories = factorize(values, sort=False)

  File "C:\Anaconda\lib\site-packages\pandas\core\algorithms.py", line 131, in factorize
    (hash_klass, vec_klass), vals = _get_data_algo(vals, _hashtables)

  File "C:\Anaconda\lib\site-packages\pandas\core\algorithms.py", line 412, in _get_data_algo
    mask = com.isnull(values)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 230, in isnull
    return _isnull(obj)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 240, in _isnull_new
    return _isnull_ndarraylike(obj)

  File "C:\Anaconda\lib\site-packages\pandas\core\common.py", line 330, in _isnull_ndarraylike
    result = np.isnan(values)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I am using Python 2.7, Pandas 0.16.1 and numpy 1.9.2.
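
The issue does not show the body of GPStime2datetime. For readers who want to reproduce the setup, here is a minimal sketch of what such a converter could look like. The name gps_to_datetime is hypothetical, and the sketch assumes that week is the GPS week number counted from the GPS epoch (1980-01-06), that sow is seconds of week, and that leap seconds can be ignored.

```python
from datetime import datetime, timedelta

# GPS time counts from 1980-01-06 00:00:00; leap seconds are ignored here.
GPS_EPOCH = datetime(1980, 1, 6)


def gps_to_datetime(week, sow):
    """Convert a GPS week number and seconds-of-week to a naive datetime.

    Hypothetical stand-in for the author's GPStime2datetime. Accepts
    strings or numbers, since a CSV parser may hand over raw strings.
    """
    return GPS_EPOCH + timedelta(weeks=int(week), seconds=float(sow))


# First data row of the sample file: week 1765, sow 68460.00
first_row = gps_to_datetime("1765", "68460.00")
```

A scalar function like this returns plain datetime.datetime objects, which is relevant to the coercion point raised in the comments.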

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 31 (31 by maintainers)

Top GitHub Comments

jreback commented, Jun 1, 2015

@cmeeren ok thanks.

The basic issue is that some of the inference in read_csv is not as general as to_datetime, which correctly handles all of these cases. So the output of the date_parser needs to be coerced to fix this.

pull-requests are welcome!
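
To illustrate the coercion point, here is a small sketch. It assumes, and the issue does not confirm, that the custom parser handed back an object-dtype array of datetime.datetime values. np.isnan has no implementation for object arrays, which produces the same kind of TypeError as in the traceback, while passing the values through pd.to_datetime first yields a datetime64 index that MultiIndex.from_arrays accepts. The sample values are made up for the example.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Assumed parser output: plain datetime objects in an object-dtype array.
raw = np.array(
    [dt.datetime(2013, 11, 3, 19, 1), dt.datetime(2013, 11, 3, 19, 2)],
    dtype=object,
)

# np.isnan cannot operate on object arrays and raises TypeError.
try:
    np.isnan(raw)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# Coercing with to_datetime gives a datetime64-backed DatetimeIndex,
# which can be combined with the prn values into a MultiIndex.
coerced = pd.to_datetime(raw)
index = pd.MultiIndex.from_arrays([coerced, [126, 23]], names=["datetime", "prn"])
```

This only shows why coercion helps; it is not the change that was eventually made inside pandas.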

jreback commented, Jun 2, 2015

best to do a pull-request. you need to add your example as a test.
