ElemWise DataFrame operations by Series
It is not possible to perform trivial operations such as standardizing data, e.g.
ddf = (ddf - ddf.mean(axis=0)) / ddf.std(axis=0)
It throws an error saying
ValueError: Not all divisions are known, can't align partitions.
Please use `set_index` or `set_partition` to set the index.
The output of each reduction operation is a Series with one row and the same number of columns as the DataFrame, so it shouldn't need to know the divisions. However, would it make sense to set the divisions of the Series to be the start and end divisions of the DataFrame that was reduced?
Issue Analytics
- State:
- Created 7 years ago
- Comments:10 (5 by maintainers)
Top GitHub Comments
I recommend raising a new issue with a minimal reproducible example.
https://stackoverflow.com/help/mcve
The error here is saying that “OK, you’ve given me a dask.dataframe with two partitions without any information about how those partitions are aligned. For example I don’t know how many rows are in each partition. You’ve also given me a pandas series with 10 rows (or whatever). I have no idea how to split the pandas series to align with the partitions of the dask dataframe.”
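To make the alignment problem concrete, here is a pandas-only sketch that stands in for a two-partition dask dataframe. It assumes divisions behave as a tuple of index boundaries where each partition covers a half-open range and the last partition also includes its upper bound. The names `partitions`, `offsets`, and `slice_for_partition` are hypothetical and not part of dask. Once the divisions are known, a row-aligned Series can be cut into matching pieces. Without them there is no way to tell which rows belong to which partition.

```python
import pandas as pd

# Stand-in for a dask dataframe with two partitions.
full = pd.DataFrame({"x": range(10)}, index=range(10))
partitions = [full.iloc[:6], full.iloc[6:]]

# A row-aligned pandas Series with 10 rows.
offsets = pd.Series(range(100, 110), index=range(10))

# Known divisions: the first index of each partition, then the last index overall.
divisions = (0, 6, 9)


def slice_for_partition(series, divisions, i):
    """Return the rows of `series` that fall inside partition `i`."""
    lo, hi = divisions[i], divisions[i + 1]
    is_last = i == len(divisions) - 2
    # Every partition excludes its upper bound except the last one.
    upper = series.index <= hi if is_last else series.index < hi
    return series[(series.index >= lo) & upper]


pieces = [slice_for_partition(offsets, divisions, i) for i in range(len(partitions))]
combined = pd.concat([part["x"] + piece for part, piece in zip(partitions, pieces)])
print(combined)
```

If `divisions` were unknown, `slice_for_partition` would have nothing to work with, which is the situation the error message describes.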