PerformanceWarning: DataFrame is highly fragmented
Which version are you running? The latest version is on GitHub. Pip is for major releases. 0.3.2b0
Upgrade. I appear to be running the latest (?)
Describe the bug When adding technical indicators to an existing data frame, I receive the following warning:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider using pd.concat instead. To get a de-fragmented frame, use `newframe = frame.copy()`
These warnings are reduced when I implement the suggestion of `newframe = frame.copy()`, but not eliminated. I believe fragmentation is happening during pandas-ta calls, but I can’t tell for sure because there is no trace.
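One way to get a trace is to escalate the warning to an exception, so that the traceback points at the line that triggered it. The following is only a sketch: the helper name `find_fragmentation_source` and the toy column-adding loop are hypothetical stand-ins for the real indicator calls, and whether the warning fires at all depends on the installed pandas version.

```python
import warnings

import numpy as np
import pandas as pd
from pandas.errors import PerformanceWarning


def find_fragmentation_source(n_cols=150):
    """Hypothetical helper: add columns one by one and report where
    pandas first emits PerformanceWarning (if it does at all)."""
    df = pd.DataFrame({"Close": np.arange(10.0)})
    with warnings.catch_warnings():
        # Turn only PerformanceWarning into an exception; other
        # warnings keep their normal behaviour.
        warnings.simplefilter("error", PerformanceWarning)
        try:
            for i in range(n_cols):
                # Toy stand-in for an indicator call such as ta.sma(...)
                df[f"col_{i}"] = df["Close"] * i
        except PerformanceWarning as exc:
            return i, str(exc)
    return None, None


step, message = find_fragmentation_source()
print(step, message)
```

In a real script, wrapping the indicator calls in the same `catch_warnings` block and letting the exception propagate would show the full traceback instead of only the message.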
To Reproduce
I start with a DataFrame populated with just the right columns for pandas-ta. Then add technical indicators like so:
for n in self.__SMA_INTERVALS:
    key = "SMA_{}".format(n)
    df[key] = ta.sma(df["Close"], length=n)
    df[key] = df[key].fillna(0)
    df = df.copy()
I’m not sure whether the warnings come from the SMA indicator. I’m adding SMA, EMA, MACD, STOCH, RSI, ADX, CCI, AROON, BBANDS, `ta.cmf` (Chaikin Money Flow), and OBV.
Expected behavior No warnings or excessive fragmentation during the computations.
Screenshots N/A
Additional context pandas: 1.3.0 pandas-ta: 0.3.2b0 python: 3.9.6 numpy: 1.21.0
Thanks for using Pandas TA!
Top GitHub Comments
Hello @mikeperalta1,
I am aware of the `PerformanceWarning` (PW), but it’s just a warning and not a bug. It was never a warning in prior versions of Pandas. So you can downgrade to an earlier Pandas version, or you can suppress the warning. In fact, the current development version does suppress the PW until a permanent fix is in place.

It occurs when you are appending a Pandas DataFrame to an existing Pandas DataFrame. They recommend using `pd.concat()` to quickly combine two or more DataFrames. It is not because of an internal calculation of an indicator.

In Pandas TA, it occurs in the internal method `_append` of the main Pandas TA DataFrame extension class when trying to append the resultant (aroon, bbands, et al) DataFrame to the current DataFrame, and almost never when appending a resultant (sma, rsi, et al) Series to the current DataFrame.

Furthermore, I have already tried to use `pd.concat()` and `frame.copy()` in the `_append()` method to make the PW disappear, to no avail. In fact, it would not append the resultant DataFrame to the current DataFrame as expected, like it currently does in the for loop at https://github.com/twopirllc/pandas-ta/blob/1deb7559f626d2a5cf664b6c0af7a8016a53bfca/pandas_ta/core.py#L416-L418, which is generating this PW. Now this is either a Pandas bug or I am coding `pd.concat` incorrectly. 🤷🏼♂️ I am open to contributions to fix this Issue as it is obviously a concern. 😎

Hope this helps!
Kind Regards, KJ
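
For user code like the loop in the report, the `pd.concat()` recommendation can be sketched roughly as follows: build all new columns first, then attach them in one step. This is an illustration under assumptions, not a fix for `_append`. `SMA_INTERVALS` and `add_sma_columns` are hypothetical names, and a plain rolling mean stands in for `ta.sma` so the snippet runs without pandas-ta.

```python
import numpy as np
import pandas as pd

SMA_INTERVALS = [5, 10, 20]  # hypothetical stand-in for self.__SMA_INTERVALS


def add_sma_columns(df, intervals):
    # Compute every new column up front without inserting into df.
    # The rolling mean is an assumed stand-in for ta.sma(df["Close"], length=n).
    new_cols = {
        f"SMA_{n}": df["Close"].rolling(n).mean().fillna(0)
        for n in intervals
    }
    # Attach all new columns with a single concat along the column axis.
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)


prices = pd.DataFrame({"Close": np.arange(1.0, 31.0)})
result = add_sma_columns(prices, SMA_INTERVALS)
print(result.columns.tolist())
```

Because the columns are added in one operation rather than one insert per loop iteration, this pattern avoids the repeated inserts that the warning message describes. Whether the same approach works inside `_append` is not something this sketch establishes.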
@twopirllc Thank you for that thorough explanation! I suppose for now I will simply mute the warning.