Un-deprecate DataFrame.from_items()
Under #18262 a FutureWarning was added suggesting that existing code like this:
pd.DataFrame.from_items(x)
Should be changed to this:
import collections
pd.DataFrame.from_dict(collections.OrderedDict(x))
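As a concrete sketch of that suggested migration, the snippet below uses a made-up `x` (the item list and the column labels are illustrative assumptions, not taken from the issue). It also shows the `orient="index"` variant, where `from_dict()` accepts a `columns` argument as of 0.23:

```python
import collections

import pandas as pd

# Illustrative items: (key, values) pairs, as from_items() used to accept.
x = [("A", [1, 2, 3]), ("B", [4, 5, 6])]

# Keys become columns, in the order they appear in x.
df = pd.DataFrame.from_dict(collections.OrderedDict(x))

# Keys become row labels; columns names the positions within each value list.
rows = pd.DataFrame.from_dict(
    collections.OrderedDict(x),
    orient="index",
    columns=["one", "two", "three"],  # hypothetical labels
)
```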
The fact that from_items() appeared only 6 times (now 8 times) in a Stack Overflow search was used as partial justification for removing it. But if you search on GitHub, pd.DataFrame.from_items() appears more than 15,000 times in Python, almost half as many as from_records()!
We should celebrate the fact that this function doesn’t cause enough confusion to appear often on Stack Overflow. But it does occur (a lot!) in real code, and deprecating it is a mistake.
If constructing a temporary OrderedDict around items is the best way to construct a DataFrame, Pandas should implement that as a short function called DataFrame.from_items(), rather than asking thousands of people to busy themselves to accommodate this unnecessary API change.
I recommend removing the FutureWarning, and retaining this widely-used, longstanding function.
For reference, the FutureWarning starts in 0.23 and looks like this:
FutureWarning: from_items is deprecated. Please use DataFrame.from_dict(dict(items), …) instead. DataFrame.from_dict(OrderedDict(items)) may be used to preserve the key order.
Issue Analytics
- Created 5 years ago
- Reactions: 8
- Comments: 22 (19 by maintainers)
Top GitHub Comments
@jreback It is clear that you think this is a good deprecation, otherwise you would not have done it. However, the initial justification (“only 6 hits on Stack Overflow”) completely ignored the larger picture, which is that this function is widely used in a huge number of applications and libraries.
It might be the case that the original implementation was unwieldy or costly to maintain. But what could be the reason not to maintain backward compatibility with a 3-line function that delegates the work to OrderedDict and the main DataFrame constructor?
In other words, maybe the old code was not helpful. Would you accept new code to implement this widely-used function in a single place, rather than having thousands of users sprinkle the same into their local codebases?
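A rough sketch of what such a delegating function might look like follows. This is an assumption about its shape, not the pandas implementation: the standalone name `from_items` and its signature are hypothetical, `columns` is only usable with `orient="index"` (a restriction inherited from `from_dict()`), and repeated keys collapse inside the OrderedDict.

```python
from collections import OrderedDict

import pandas as pd


def from_items(items, orient="columns", columns=None):
    """Hypothetical stand-in that delegates to OrderedDict and from_dict()."""
    data = OrderedDict(items)  # keeps key order; repeated keys are collapsed
    return pd.DataFrame.from_dict(data, orient=orient, columns=columns)


by_column = from_items([("A", [1, 2]), ("B", [3, 4])])
by_row = from_items(
    [("r1", [1, 2]), ("r2", [3, 4])], orient="index", columns=["x", "y"]
)
```

Keeping the wrapper in one place would mean callers do not each have to repeat the OrderedDict boilerplate.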
I haven’t seen any mention that the from_dict(OrderedDict(items)... approach doesn’t work in cases where there would be duplicates in the index. Here is a test that would fail:
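A minimal sketch of the failure mode described, assuming items with a repeated key (an illustration written for this purpose, not the commenter's original test). It shows only why the OrderedDict route cannot keep duplicate labels, and one way to keep them with the plain DataFrame constructor:

```python
from collections import OrderedDict

import pandas as pd

# Illustrative items in which the key "a" appears twice.
items = [("a", [1, 2]), ("b", [3, 4]), ("a", [5, 6])]

# A dict can hold each key once, so the later "a" overwrites the earlier one.
as_dict = OrderedDict(items)
df = pd.DataFrame.from_dict(as_dict, orient="index")

# Passing labels and rows separately keeps all three rows, duplicates included.
kept = pd.DataFrame([v for _, v in items], index=[k for k, _ in items])
```

Here `df` ends up with two rows while `kept` has three, so any code relying on repeated index labels would get different results after switching to the OrderedDict form.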