pd.NA reverses axis ordering

Issue

When plotting data that contains pd.NA, the axis ordering gets reversed into descending order.

Workaround

np.nan does not produce this issue
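
One way to apply this workaround to an existing column is sketched below. It assumes the column holds only numbers and pd.NA; the frame `df` is a hypothetical stand-in for the data in the report, and going through the nullable Float64 dtype is just one possible conversion, not something prescribed in the issue.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: pd.NA inside an otherwise numeric column.
df = pd.DataFrame({"date": ["0", "1", "2", "3"], "value": [1, 2, pd.NA, 1.5]})
print(df["value"].dtype)  # object

# Convert to the nullable Float64 dtype first, then export to a plain
# float64 array that uses np.nan as the missing-value marker.
df["value"] = df["value"].astype("Float64").to_numpy(dtype="float64", na_value=np.nan)
print(df["value"].dtype)  # float64
```

After this conversion the column has a numeric dtype, so it is the same kind of input as the np.nan case described above.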

Expected Behavior

NAs should be excluded without reversing axis order

Reproducible Example

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt


mock_data = pd.DataFrame({
    'date': ['0', '1', '2', '3'],
    'value': [1, 2, 1, 1.5]
})

mock_data_full = mock_data.copy()
mock_data_full['type'] = 'no_NA'

mock_data_pd_na = mock_data.copy()
mock_data_pd_na['type'] = 'pd.NA'
mock_data_pd_na.loc[2, 'value'] = pd.NA

mock_data_np_nan = mock_data.copy()
mock_data_np_nan['type'] = 'np.nan'
mock_data_np_nan.loc[2, 'value'] = np.nan

test_data = pd.concat([mock_data_full, mock_data_pd_na, mock_data_np_nan])

grid = sns.FacetGrid(
    data=test_data,
    col='type',
    sharey=False,
    sharex=True,  # time-series consistency
)
grid.map(sns.lineplot, 'date', 'value', alpha=0.5)
plt.show()

Result

[image: one line plot per type; the pd.NA panel shows the axis in descending order]

System Info

import sys

print(f'''
    python: {sys.version},
    seaborn: {sns.__version__},
    pandas: {pd.__version__}
''')
    python: 3.9.7 (default, Sep  9 2021, 23:20:13)  [GCC 9.3.0],
    seaborn: 0.11.2,
    pandas: 1.3.4

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions
mwaskom commented, Oct 26, 2021

This is a weird one! Let’s debug.

First thought: this has something to do with using FacetGrid directly (which is discouraged). But no, we can reproduce it using relplot:

g = sns.relplot(
    data=test_data,
    col="type",
    x="date", y="value",
    kind="line", height=3,
    marker="o",
    facet_kws=dict(sharey=False),
)

Let’s get rid of FacetGrid altogether, and set up the figure with matplotlib. Still there:

f, axs = plt.subplots(1, 3, sharex=True, sharey=False, figsize=(7, 3))
kws = dict(x="date", y="value", marker="o")
sns.lineplot(data=test_data.query("type == 'no_NA'"), **kws, ax=axs[0])
sns.lineplot(data=test_data.query("type == 'pd.NA'"), **kws, ax=axs[1])
sns.lineplot(data=test_data.query("type == 'np.nan'"), **kws, ax=axs[2])
f.tight_layout()

Now we can focus on a single lineplot and simplify even further. Is it because the x values are strings? No:

sns.lineplot(x=[1, 2, 3, 4], y=[1, 2, pd.NA, 1.5], marker="o")

Hm,

pd.Series([1, 2, pd.NA, 1.5]).dtype
object

What if we force that to numeric?

sns.lineplot(
    x=[1, 2, 3, 4],
    y=pd.Series([1, 2, pd.NA, 1.5], dtype="Float64"),
)

There we go. So why is this happening? Seaborn will invert the y axis if the y variable is categorical, and indeed, it thinks the y variable is categorical:

sns._core.variable_type(pd.Series([1, 2, pd.NA, 1.5]))
'categorical'

But that’s because of the object dtype:

sns._core.variable_type(pd.Series([1, 2, pd.NA, 1.5], dtype="Float64"))
'numeric'

Seaborn will introspect an object-typed series and consider it numeric if every element is an instance of numbers.Number:

def all_numeric(x):
    from numbers import Number
    for x_i in x:
        if not isinstance(x_i, Number):
            return False
    return True

This is intended to allow object-typed series that mix int and np.nan. But while np.nan is a Number, pd.NA is not:

all_numeric(pd.Series([1, 2, pd.NA, 1.5]))
False
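
The difference between the two missing-value markers can be checked directly. This is a small standalone sketch of the same isinstance test applied to each marker on its own; the variable names are illustrative only.

```python
from numbers import Number

import numpy as np
import pandas as pd

# np.nan is an ordinary Python float, so it passes the Number check.
nan_is_number = isinstance(np.nan, Number)

# pd.NA is a dedicated singleton type, so it does not pass the check.
na_is_number = isinstance(pd.NA, Number)

print(nan_is_number, na_is_number)  # True False
```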

So this is happening because seaborn thinks your y variable is categorical in the case where you are using pd.NA for missing.

0 reactions
mwaskom commented, Jun 29, 2022

seaborn is not doing the wrong thing from a plotting perspective here; it is just treating the vector as categorical because a) it does not have a numeric dtype and b) not all of its elements are numbers.
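
The two conditions (a) and (b) can be turned into a pre-plot check. The helper below is a hypothetical, simplified sketch and not seaborn's actual implementation, which handles more cases; `falls_back_to_categorical` is an illustrative name that does not exist in seaborn or pandas.

```python
from numbers import Number

import numpy as np
import pandas as pd


def falls_back_to_categorical(values: pd.Series) -> bool:
    """Hypothetical check: True when neither numeric test passes."""
    # (a) A numeric dtype, including nullable ones such as Float64, is enough.
    if pd.api.types.is_numeric_dtype(values):
        return False
    # (b) Otherwise every element must be an instance of numbers.Number.
    return not all(isinstance(v, Number) for v in values)


with_pd_na = pd.Series([1, 2, pd.NA, 1.5])                    # object dtype
with_np_nan = pd.Series([1, 2, np.nan, 1.5], dtype=object)    # object dtype
nullable = pd.Series([1, 2, pd.NA, 1.5], dtype="Float64")     # numeric dtype

print(falls_back_to_categorical(with_pd_na))   # True
print(falls_back_to_categorical(with_np_nan))  # False
print(falls_back_to_categorical(nullable))     # False
```

Running a check like this on the plotted columns before calling seaborn could flag the ones that need an explicit numeric dtype.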
