_get_column() change proposal
Hi Kevin,
I noticed some behavior that seemed confusing, at least to me. While using df.ta.hlc3(append=True), I noticed that the column it created was identical to the ‘close’ column (note that I have set df.ta.adjusted = close). Stepping through the code showed that this happens because I didn’t pass explicit ‘high’, ‘low’ and ‘close’ columns: when those are left as None, _get_column() returns the ‘adjusted’ column (‘close’ in my case), and it only falls back to the default passed as an argument when ‘adjusted’ is not set.
Lines 388-389 in core.py:

    elif series is None or default is None:
        return df[self.adjusted] if self.adjusted is not None else df[default]

What I found strange about this implementation is that even if a default is given (‘high’, ‘low’, ‘close’ respectively for hlc3), the function will still return the ‘adjusted’ column if it is set explicitly. To me that doesn’t really fit the name ‘default’ for the argument.
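
To make the precedence concrete, here is a minimal sketch of the branch quoted above. It uses a hypothetical stand-alone function (get_column_current, with adjusted passed as a plain argument) rather than the library's accessor, so it only illustrates the logic and is not the actual core.py code.

```python
import pandas as pd


def get_column_current(df, series, default, adjusted=None):
    """Hypothetical stand-in for the quoted branch; not the library's code."""
    if isinstance(series, pd.Series):
        return series
    elif series is None or default is None:
        # Mirrors the quoted lines: 'adjusted' wins whenever it is set,
        # and 'default' is only used when 'adjusted' is None.
        return df[adjusted] if adjusted is not None else df[default]
    return df[series]


df = pd.DataFrame({"high": [3.0, 4.0], "low": [1.0, 2.0], "close": [2.0, 3.0]})

# No explicit series is passed and the default is "high",
# but 'adjusted' is set to "close", so "close" is returned.
with_adjusted = get_column_current(df, None, "high", adjusted="close")
print(with_adjusted.name)  # close

# Without 'adjusted', the default is used.
without_adjusted = get_column_current(df, None, "high")
print(without_adjusted.name)  # high
```

With all three inputs of hlc3 resolving to ‘close’ in this way, the average of the three is simply ‘close’ again, which matches what I observed.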
I propose to change the _get_column() function along these lines:

    def _get_column(self, series, default):
        """Attempts to get the correct series or 'column' and return it."""
        df = self._df
        if df is None: return

        # def _get_case(column: str):
        #     cases = [column.lower(), column.upper(), column.title()]
        #     return [c for i, c in enumerate(cases) if column == cases[i]].pop()
        # default = _get_case(default)

        # Explicitly passing a pd.Series to override default.
        if isinstance(series, pd.Series):
            return series
        # Apply default if no series, or adjusted if no default is passed.
        elif series is None:
            if default is not None:
                series = default
            else:
                series = self.adjusted
        # Try to interpret passed string as a column.
        if isinstance(series, str):
            # Return the df column since it's in there.
            if series in df.columns:
                return df[series]
            else:
                # Attempt to match the 'series' because it was likely misspelled.
                matches = df.columns.str.match(series, case=False)
                match = [i for i, x in enumerate(matches) if x]
                # If found, awesome. Return it or return the 'series'.
                cols = ', '.join(list(df.columns))
                NOT_FOUND = f"[X] Ooops!!!: It's {series not in df.columns}, the series '{series}' was not found in {cols}"
                return df.iloc[:,match[0]] if len(match) else print(NOT_FOUND)

This way the passed ‘default’ column gets priority over the ‘adjusted’ column. Moving the ‘if isinstance(series, str)’ check into a separate ‘if’ statement also means that slightly misspelled default column names go through the same matching step and still work.
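
The effect of the proposal can be sketched outside the class as well. The function below (get_column_proposed, a hypothetical name) follows the same order of checks but takes df and adjusted as arguments, and returns None instead of printing when nothing matches; both are simplifications for the example, not part of the proposal itself.

```python
import pandas as pd


def get_column_proposed(df, series, default, adjusted=None):
    """Hypothetical stand-alone version of the proposed lookup order."""
    if isinstance(series, pd.Series):
        return series
    if series is None:
        # The default now takes priority; 'adjusted' is only the fallback.
        series = default if default is not None else adjusted
    if isinstance(series, str):
        if series in df.columns:
            return df[series]
        # Case-insensitive regex match against the column names.
        matches = df.columns.str.match(series, case=False)
        match = [i for i, x in enumerate(matches) if x]
        return df.iloc[:, match[0]] if match else None
    return None


df = pd.DataFrame({"High": [3.0, 4.0], "Low": [1.0, 2.0], "Close": [2.0, 3.0]})

# The default "high" wins over adjusted="Close" and is matched to "High".
print(get_column_proposed(df, None, "high", adjusted="Close").name)  # High

# With no default, 'adjusted' is used.
print(get_column_proposed(df, None, None, adjusted="Close").name)  # Close
```

Note that the match step is anchored at the start of the column name, so it tolerates differences in case or a truncated name, but not arbitrary typos.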
Since this is a core function I wanted to first open a discussion before sending a PR as it may potentially break things that I’m not aware of, or there may be a particular reason why this has been implemented as it is right now. If that’s the case, I’m happy to hear the reasoning as it may help me pay attention to usage of the _get_column() function in the future.
EDIT: cleaned up code markup
Best regards, Wout
Top GitHub Comments
Hi @DrPaprikaa ,
You are absolutely correct! I’m less experienced with the usage of kwargs, so I wasn’t aware that the second parameter in kwargs.pop() is the fallback in case the key is not found. This does exactly what I tried to propose, and in a clean way (although I like the explicit method definition ‘of old’). I’m sorry for raising this issue, since it seems that everything would have been solved if I had just updated to the latest version. I still find it strange, because I made sure to look up that piece of the source code in what I believed was the latest version.
@twopirllc , I believe this issue can be closed, unless of course you want to keep this as a discussion thread for explicit method definition?
Wout
Hi, If I’m not mistaken this bug is not happening in the current master version.

    high = self._get_column(kwargs.pop("high", "high"))

Here, pop searches for the value associated with the key ‘high’ in kwargs (chained indicator), defaulting with the value ‘high’ (normal indicators). So _get_column() is never passed None. If None is passed for whatever reason, _get_column() simply returns adjusted, or None if not present via the ‘else None’ that I added a few versions back in line 384. So I think that nothing needs to be changed, correct me if I’m wrong 😃
DrPaprikaa
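
For readers unfamiliar with the fallback argument of dict.pop(), the short sketch below shows the behavior this comment relies on. The function resolve_high and the column name "High_1" are made up for illustration; only the kwargs.pop("high", "high") call is taken from the snippet above.

```python
def resolve_high(**kwargs):
    # dict.pop(key, default) returns the stored value when the key exists
    # and the given default when it does not.
    return kwargs.pop("high", "high")


print(resolve_high())               # high
print(resolve_high(high="High_1"))  # High_1

# An explicit None is still a stored value, so the fallback is not applied.
print(resolve_high(high=None))      # None
```

The last call is the "None is passed for whatever reason" case: the key is present, so pop returns None rather than the fallback string.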