Failing to specify numeric columns fails with "could not convert foofoo to numeric"
A minor suggestion to improve error messages: the following code fails because the metadata does not specify that column A
is numeric. However, the error message is somewhat non-intuitive:
import pandas as pd
import dask.dataframe as dd
from dask import delayed
df = pd.DataFrame({"A": [1, 2, 3]})
ddf = dd.from_delayed(delayed(lambda: df), meta=pd.DataFrame(columns=["A"]))
ddf["A"].mean()
# TypeError: Could not convert foofoo to numeric
This is probably a consequence of the data in _simple_fake_mapping in dataframe/utils.py. It would be good to turn the error into something more explicit that mentions the column name and dtype, so that users don’t start searching for foofoo.
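For readers who hit the same error, the following is a minimal sketch of why the meta above is treated as non-numeric and how a typed meta could be written instead. It uses only pandas; the variable names are illustrative and not taken from the issue.

```python
import pandas as pd

# Meta built from column names only: pandas has no data to infer from,
# so column "A" falls back to object dtype.
untyped_meta = pd.DataFrame(columns=["A"])
print(untyped_meta.dtypes["A"])

# Meta that states the dtype explicitly (an empty but typed frame).
typed_meta = pd.DataFrame({"A": pd.Series(dtype="int64")})
print(typed_meta.dtypes["A"])

# Alternative: derive the meta from a real sample by keeping zero rows,
# which preserves the column names and dtypes.
df = pd.DataFrame({"A": [1, 2, 3]})
sample_meta = df.iloc[:0]
print(sample_meta.dtypes["A"])
```

Passing either typed_meta or sample_meta as the meta argument of dd.from_delayed should let dask treat column A as numeric.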
Issue Analytics
- State:
- Created 6 years ago
- Comments: 6 (6 by maintainers)
Top GitHub Comments
If the column is actually an object column, then the metadata doesn’t have to be wrong; the operation is just invalid. I’d rather fix series reductions to check for object dtype and raise explicitly with ValueError("mean reduction not supported for object dtype"). If the metadata is correct, this is a better error message than what pandas throws. If the metadata is wrong, the user will see “object dtype” and should be able to track down their prior mistake.

My idea would be to wrap all lines like this into:
with an appropriate context manager that catches only TypeError containing “Could not convert” and re-raises with an explicit hint that the metadata must be wrong.

It took me quite some time to make sense of this error, probably because:
- meta including dtypes, just with a tiny mistake.
- .mean() is a lazy operation, so it was surprising to see hints of a numeric operation in the first place.
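
The context-manager idea from the comments could look roughly like the sketch below. The helper name explain_meta_errors, its parameters, and the message wording are assumptions made for illustration; this is not code from dask, and fake_reduction only stands in for the pandas call that fails on the fake meta data.

```python
from contextlib import contextmanager


@contextmanager
def explain_meta_errors(column, dtype):
    """Hypothetical helper (not part of dask): re-raise conversion errors with a hint."""
    try:
        yield
    except TypeError as err:
        # Only rewrite the specific pandas failure; let other TypeErrors through.
        if "Could not convert" not in str(err):
            raise
        raise TypeError(
            f"Reduction failed on column {column!r} with dtype {dtype}. "
            "If the column is numeric, the `meta` passed earlier is probably wrong."
        ) from err


def fake_reduction():
    # Stand-in for the reduction that pandas rejects on the placeholder values.
    raise TypeError("Could not convert foofoo to numeric")


try:
    with explain_meta_errors("A", "object"):
        fake_reduction()
except TypeError as err:
    message = str(err)
    print(message)
```

Chaining with `from err` keeps the original pandas error in the traceback, so the foofoo message stays available for debugging without being the first thing a user sees.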