sort_values bug while by both list and other values


First, I know the strings in column b can be used directly for sorting. My function aFucntion looks pointless because it is a simplified version that only serves to present the problem. In my real code it produces a list of English characters, which is then used to sort words in my native language in a custom order.

Second, I know that a list is not hashable.

# coding = utf-8
import pandas as pd

def aFucntion(var):
    return list(var)

a = {'a': [1, 2],
     'b': ['Python', 'Java']
     }

df =pd.DataFrame(a)

print(df)
print('\n')


df['fun'] = df['b'].map(aFucntion)


df1 = df.sort_values(['b']) # this works
print(df1)
print('\n')

df2 = df.sort_values(['fun'])  # this works
print(df2)
print('\n')
df3 = df.sort_values(['b', 'fun'])  # this yields `TypeError: unhashable type: 'list'`
# but why does df.sort_values(['fun']) work?
print(df3)
print('\n')

df['fun'] = df['b'].map(lambda e: tuple(aFucntion(e)))
df4 = df.sort_values(['b', 'fun'])  # this works
print(df4)
print('\n')

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
TomAugspurger commented, Feb 13, 2019

sort_values needs the values of a column to be hashable when sorting by multiple columns. The original example could be fixed by converting the column of (unhashable) lists to a column of tuples.

In [10]: df['fun2'] = df['fun'].map(tuple)

In [11]: df
Out[11]:
   a       b                 fun                fun2
0  1  Python  [P, y, t, h, o, n]  (P, y, t, h, o, n)
1  2    Java        [J, a, v, a]        (J, a, v, a)

In [12]: df.sort_values(['a', 'fun2'])
Out[12]:
   a       b                 fun                fun2
0  1  Python  [P, y, t, h, o, n]  (P, y, t, h, o, n)
1  2    Java        [J, a, v, a]        (J, a, v, a)
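
The hashability point can be checked without pandas. The following is a minimal plain-Python sketch, not part of the original thread; `is_hashable` is a hypothetical helper introduced only for illustration. It shows that a list cannot be hashed, that the equivalent tuple can, and that both compare element by element, so converting to tuples should not change the resulting order.

```python
def is_hashable(value):
    """Return True if hash(value) succeeds (hypothetical helper, for illustration)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


chars_list = list("Java")          # what aFucntion returns for 'Java'
chars_tuple = tuple(chars_list)    # the same characters as a tuple

list_ok = is_hashable(chars_list)    # lists are mutable, so hash() raises TypeError
tuple_ok = is_hashable(chars_tuple)  # a tuple of strings is hashable

# Lists and tuples both compare element by element, so the relative
# order of two values is the same before and after the conversion.
same_order = (list("Java") < list("Python")) == (tuple("Java") < tuple("Python"))

print(list_ok, tuple_ok, same_order)
```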

0 reactions
m-martin-j commented, Feb 13, 2019

A workaround that helped me is

df2 = df.applymap(tuple)
df2.sort_values(['b', 'fun'])

to sort by more than one column
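
One caveat, assuming the frame from the question: applymap(tuple) calls tuple() on every cell, and tuple(1) raises a TypeError for the integers in column a. A possible variant, sketched below, converts only the cells that actually hold lists. `lists_to_tuples` is a hypothetical helper name, not a pandas function, and the sketch has not been checked against every pandas version.

```python
import pandas as pd


def lists_to_tuples(frame):
    """Return a copy of frame with list cells replaced by tuples (hypothetical helper)."""
    out = frame.copy()
    for col in out.columns:
        # Only touch columns that contain at least one list.
        if out[col].map(lambda v: isinstance(v, list)).any():
            out[col] = out[col].map(lambda v: tuple(v) if isinstance(v, list) else v)
    return out


# Rebuild the frame from the question.
df = pd.DataFrame({"a": [1, 2], "b": ["Python", "Java"]})
df["fun"] = df["b"].map(list)

df2 = lists_to_tuples(df)
result = df2.sort_values(["b", "fun"])  # every sort key is hashable now
print(result)
```

Leaving the non-list columns untouched also keeps their original dtypes, which a blanket conversion would not.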
