get_dummies not executing


Dask get_dummies “runs” but it doesn’t actually execute the get_dummies task.

import pandas as pd
import dask.dataframe as dd

pandasData = pd.DataFrame({'var1': ['a', 'b', 'a'], 'var2': ['b', 'a', 'c'], 'var3': ['c', 'a', 'b']})
pd.get_dummies(pandasData)

daskData = dd.from_pandas(pandasData, npartitions=1)
daskData.head()

daskDataDummies = dd.get_dummies(daskData).compute()
daskDataDummies.head()

daskDataDummies.to_csv('daskDataDummies_out.csv', header=True, index=False)

There’s no error message; it simply doesn’t transform the dataframe.

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

mrocklin commented, Nov 24, 2016 (1 reaction)

Adding to that parametrized test has been helpful so far. Next time I take a look at this I’ll probably dump in a bunch more dtypes and such.

mrocklin commented, Nov 24, 2016 (0 reactions)

Sampling probably isn’t sufficient. We need to know all of the values throughout the file to determine the columns.
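
To illustrate the point about needing every value, here is a minimal sketch in plain pandas. The two small frames `part_a` and `part_b` are hypothetical stand-ins for partitions of a larger dataset; they are not taken from the issue. Encoding each one on its own gives mismatched columns, while encoding against a categorical dtype built from all observed values gives the same columns everywhere.

```python
import pandas as pd

# Two hypothetical "partitions" of one larger table (illustrative data only).
part_a = pd.DataFrame({"var1": ["a", "b"]})
part_b = pd.DataFrame({"var1": ["a", "c"]})

# Encoding each partition independently yields different column sets,
# because each call only sees the values present in its own partition.
naive_a = pd.get_dummies(part_a)
naive_b = pd.get_dummies(part_b)
print(list(naive_a.columns))
print(list(naive_b.columns))

# Scan every partition once to collect the full set of values ...
all_values = sorted(set(part_a["var1"]) | set(part_b["var1"]))
dtype = pd.CategoricalDtype(categories=all_values)

# ... then encode each partition against that shared categorical dtype.
# Categories missing from a partition still get an (all-zero) column.
fixed_a = pd.get_dummies(part_a.astype({"var1": dtype}))
fixed_b = pd.get_dummies(part_b.astype({"var1": dtype}))
print(list(fixed_a.columns))
print(list(fixed_b.columns))
```

In Dask itself, `DataFrame.categorize()` performs this kind of full pass over the data, so calling it before `dd.get_dummies` is the usual way to give the encoder known categories. Whether that fully covers the case reported here depends on the Dask version in use.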


Top Results From Across the Web

Using get_dummies(), but it's not working on array
I am trying to encode a column in a dataset using Pandas get_dummies, but it returns 0 as it is not filtering each...

How to Use Pandas Get Dummies in Python - Sharp Sight
In this tutorial, I'll show you how to use the Pandas get dummies function to create dummy variables in Python.

pandas.get_dummies — pandas 1.5.2 documentation
Convert categorical variable into dummy/indicator variables. Parameters: data (array-like, Series, or DataFrame): Data of which to get dummy indicators. prefix ...

Using get_dummies(), but it's not working on array-Pandas ...
Coding example for the question Using get_dummies(), but it's not working on array-Pandas,Python.

Pandas Get Dummies – pd.get_dummies() - Data Independent
Be careful, if your categorical column has too many distinct values in it, you'll quickly explode your new dummy columns. Before you run...
