PerformanceWarning: DataFrame is highly fragmented


Which version are you running? The latest version is on GitHub. Pip is for major releases. 0.3.2b0

Upgrade. I appear to be running the latest (?)

Describe the bug

When adding technical indicators to an existing DataFrame, I receive the following warning:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider using pd.concat instead.  To get a de-fragmented frame, use `newframe = frame.copy()`

These warnings are reduced when I implement the suggestion of newframe = frame.copy(), but not eliminated. I believe fragmentation is happening during pandas-ta calls, but I can’t tell for sure because there is no trace.
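Since the warning arrives without a trace, one way to find where it comes from is to promote it to an exception for the duration of a run, so the traceback points at the offending call. The sketch below only illustrates that idea and is not the reporter's code: `fragment_frame` is a hypothetical stand-in that inserts columns one at a time, and whether the warning actually fires depends on the installed pandas version.

```python
import traceback
import warnings

import pandas as pd


def fragment_frame(df, count=150):
    """Hypothetical stand-in: insert many columns one at a time."""
    for i in range(count):
        df[f"col_{i}"] = float(i)
    return df


raised_at = None
with warnings.catch_warnings():
    # Turn only PerformanceWarning into an exception inside this block.
    warnings.simplefilter("error", category=pd.errors.PerformanceWarning)
    try:
        fragment_frame(pd.DataFrame({"Close": [1.0, 2.0, 3.0]}))
    except pd.errors.PerformanceWarning:
        # The formatted traceback names the call that triggered the warning.
        raised_at = traceback.format_exc()

if raised_at is not None:
    print(raised_at)
```

Wrapping the real indicator calls the same way should show whether the warning is raised from user code or from inside a library call.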

To Reproduce

I start with a DataFrame populated with just the right columns for pandas-ta. Then I add technical indicators like so:

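for n in self.__SMA_INTERVALS:
    key = "SMA_{}".format(n)
    # Assigning a new column inserts it into the existing frame
    df[key] = ta.sma(df["Close"], length=n)
    df[key] = df[key].fillna(0)
    # Copy to de-fragment, as the warning suggests
    df = df.copy()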

I’m not sure whether the warnings come from the SMA indicator. I’m adding SMA, EMA, MACD, STOCH, RSI, ADX, CCI, AROON, BBANDS, CMF (ta.cmf, Chaikin Money Flow), and OBV.
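For single-column indicators like the SMA loop above, the `pd.concat` route suggested by the warning can be sketched as follows: compute every new column first, then attach them in one step. This is a minimal sketch under stated assumptions, not pandas-ta's own approach: `add_sma_columns` is a hypothetical helper, and a plain pandas rolling mean stands in for `ta.sma` so the example runs without pandas-ta installed.

```python
import numpy as np
import pandas as pd


def add_sma_columns(df, intervals):
    """Return a new frame with one SMA_<n> column per interval.

    Assumption: a rolling mean stands in for ta.sma here.
    """
    new_cols = {
        f"SMA_{n}": df["Close"].rolling(n).mean().fillna(0)
        for n in intervals
    }
    # One concat instead of one insert per column.
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)


prices = pd.DataFrame({"Close": np.arange(1.0, 11.0)})
result = add_sma_columns(prices, [2, 3])
print(result.head())
```

The original `prices` frame is left unchanged; the caller keeps working with the returned frame.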

Expected behavior

No warnings or excessive fragmentation during the computations.

Screenshots

N/A

Additional context

pandas: 1.3.0
pandas-ta: 0.3.2b0
python: 3.9.6
numpy: 1.21.0

Thanks for using Pandas TA!

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

6 reactions
twopirllc commented, Jul 13, 2021

Hello @mikeperalta1,

I am aware of the PerformanceWarning (PW), but it is only a warning, not a bug. Earlier versions of Pandas did not emit it, so you can either downgrade to an earlier Pandas version or suppress the warning. In fact, the current development version suppresses the PW until a permanent fix is in place.

import pandas as pd
from warnings import simplefilter
# Ignore pandas' PerformanceWarning for the rest of the process
simplefilter(action="ignore", category=pd.errors.PerformanceWarning)
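The filter above applies for the rest of the process. If that is broader than you want, the standard library's `warnings.catch_warnings()` context manager can limit the suppression to one block. A minimal sketch, where `add_many_columns` is a hypothetical stand-in for the indicator calls:

```python
import warnings

import pandas as pd


def add_many_columns(df, count):
    """Hypothetical stand-in: insert columns one at a time."""
    for i in range(count):
        df[f"col_{i}"] = float(i)
    return df


with warnings.catch_warnings():
    # Filters changed here are restored when the block exits.
    warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
    frame = add_many_columns(pd.DataFrame({"Close": [1.0, 2.0, 3.0]}), 150)

print(frame.shape)
```

Any PerformanceWarning raised outside the `with` block is still reported as usual.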

It occurs when you are appending a Pandas DataFrame to an existing Pandas DataFrame. They recommend using pd.concat() to quickly combine two or more DataFrames. It is not because of an internal calculation of an indicator.

In Pandas TA, it occurs in the internal method _append of the main Pandas TA DataFrame extension class. It happens when appending a resultant DataFrame (aroon, bbands, et al.) to the current DataFrame, and almost never when appending a resultant Series (sma, rsi, et al.).

@pd.api.extensions.register_dataframe_accessor("ta")
class AnalysisIndicators(BasePandasObject):
    # ...
    def _append(self, result=None, **kwargs) -> None:
        # ...

Furthermore, I have already tried using pd.concat() and frame.copy() in the _append() method to make the PW disappear, to no avail. In fact, with those changes it would not append the resultant DataFrame to the current DataFrame as expected, the way the current for loop does:

https://github.com/twopirllc/pandas-ta/blob/1deb7559f626d2a5cf664b6c0af7a8016a53bfca/pandas_ta/core.py#L416-L418

That for loop is what generates this PW. Now this is either a Pandas bug or I am coding pd.concat incorrectly. 🤷🏼‍♂️ I am open to contributions to fix this issue, as it is obviously a concern. 😎
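One possible reason a `pd.concat` version would appear not to append, offered as an assumption and not as a diagnosis of the Pandas TA code: `pd.concat` returns a new DataFrame and does not modify its inputs, so rebinding a name inside a method leaves the caller's frame untouched, whereas column-by-column assignment mutates it in place. The sketch below uses two hypothetical helpers and made-up column names to show the difference.

```python
import pandas as pd


def append_with_concat(df, result):
    """Rebinds the local name only; the caller's frame is not modified."""
    df = pd.concat([df, result], axis=1)
    return df


def append_in_place(df, result):
    """Assigns column by column, which mutates the caller's frame."""
    for column in result.columns:
        df[column] = result[column]


base = pd.DataFrame({"Close": [1.0, 2.0, 3.0]})
# Hypothetical two-column indicator result.
result = pd.DataFrame({"UPPER": [1.1, 2.1, 3.1], "LOWER": [0.9, 1.9, 2.9]})

combined = append_with_concat(base, result)
columns_after_concat = list(base.columns)  # base is unchanged so far

append_in_place(base, result)
columns_after_loop = list(base.columns)  # base now carries the new columns

print(columns_after_concat, columns_after_loop)
```

If that is what is happening, a concat-based fix would need the caller to use the returned frame instead of relying on in-place mutation.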

Hope this helps!

Kind Regards, KJ

3 reactions
mikeperalta1 commented, Jul 13, 2021

@twopirllc Thank you for that thorough explanation! I suppose for now I will simply mute the warning.


