BUG: exponential moving window covariance fails for multiIndexed DataFrame

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example


import numpy as np
import pandas as pd

# Three-level MultiIndex on the columns: 3 * 4 * 9 = 108 columns
columns = pd.MultiIndex.from_product([['a', 'b', 'c'], ['x', 'y', 'w', 'z'], list(range(9))])
index = range(1000)
df = pd.DataFrame(
    np.random.normal(size=(len(index), len(columns))),
    index=index,
    columns=columns,
)

# Throws AssertionError: Length of order must be same as number of levels (4), got 3
df.ewm(alpha=0.1).cov()

Problem description

When calculating the ewm covariance, pandas fails if the DataFrame has MultiIndex columns. It works when the columns are a plain Index. For example, this works:


pd.DataFrame(df.values).ewm(alpha=0.1).cov()
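Until the bug is fixed, the labels can be put back by hand after running the computation on plain columns. The following is only a sketch of that idea on a smaller frame: the names `flat`, `cov_flat` and `labeled` are made up for illustration, and it assumes the default integer column positions of `flat` line up with the original column order.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([["a", "b"], ["x", "y"], [0, 1]])
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, len(columns))), columns=columns)

# Step 1: same values, but plain integer columns (0, 1, 2, ...) instead of the MultiIndex.
flat = pd.DataFrame(df.to_numpy(), index=df.index)
cov_flat = flat.ewm(alpha=0.1).cov()  # rows are labeled (row label, column position)

# Step 2: map every column position back to its original MultiIndex tuple.
labeled = cov_flat.copy()
labeled.columns = columns
labeled.index = pd.MultiIndex.from_tuples(
    [(row, *columns[pos]) for row, pos in cov_flat.index]
)
```

`labeled` then has one row level for the original index plus one per column level, which is the shape the failing call would be expected to return.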

Expected Output

The covariance matrices. I actually only need the last matrix (last level of index).
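If only the last matrix is needed, it can also be computed directly with NumPy, which avoids the pairwise code path altogether. This is a sketch, not pandas' own implementation: `last_ewm_cov` is a hypothetical helper, and it assumes no missing values, at least two rows, and the pandas defaults `adjust=True` and `bias=False`.

```python
import numpy as np
import pandas as pd


def last_ewm_cov(df: pd.DataFrame, alpha: float) -> pd.DataFrame:
    """Hypothetical helper: EW covariance matrix at the final row only."""
    x = df.to_numpy(dtype=float)
    n = x.shape[0]
    # adjust=True weights: the newest row gets 1, the row before it (1 - alpha), ...
    w = (1.0 - alpha) ** np.arange(n - 1, -1, -1)
    sum_w = w.sum()
    mean = (w[:, None] * x).sum(axis=0) / sum_w
    centered = x - mean
    # Weighted (biased) covariance around the weighted means.
    biased = (centered * w[:, None]).T @ centered / sum_w
    # Debiasing factor that corresponds to bias=False.
    correction = sum_w**2 / (sum_w**2 - (w**2).sum())
    return pd.DataFrame(biased * correction, index=df.columns, columns=df.columns)


cols = pd.MultiIndex.from_product([["a", "b"], ["x", "y"]])
rng = np.random.default_rng(1)
df_small = pd.DataFrame(rng.normal(size=(30, len(cols))), columns=cols)
last = last_ewm_cov(df_small, alpha=0.1)
```

The result keeps the MultiIndex on both axes, because the labels are attached only after the numeric work is done.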

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.0.3
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3.post20200330
Cython : 0.29.15
pytest : 5.4.1
hypothesis : 5.8.3
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.9.0
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.16
tables : 3.6.1
tabulate : 0.8.3
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.8
numba : 0.49.0

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

2 reactions
jorisvandenbossche commented, May 29, 2020

@PablocFonseca thanks for the report, and @arw2019 thanks for the confirmation and simple reproducer!

It’s also failing on 0.25

0 reactions
Dinkarkumar commented, Aug 18, 2021

I want to work on this issue.


