px - string or categorical integer values considered as numeric in strip/box/violin plots


Issue:

When integer values are used as a categorical variable in a strip / box / violin plot, the values of the categorical variable are mapped to a continuous numeric axis, even if the values are of string or pd.Categorical type.

Example:

We create a dataframe whose column names are integer values in string format. These could be any category that makes sense for the specific business case (e.g. product codes).

import numpy as np
import pandas as pd
import plotly_express as px

n = 50

df = pd.DataFrame({
    '1': np.random.normal(2, .3, n),
    '2': np.random.lognormal(.5, .2, n),
    '34': np.random.triangular(0, 2, 3, n),
    '123': np.random.uniform(1, 3, n)
})

We unpivot the data using melt() and make a strip plot.

df1 = df.melt()

px.strip(df1, x='variable', y='value')

[Figure: strip plot with string-valued categories]

The values of the categorical variable (‘1’, ‘2’, ‘34’ and ‘123’) get mapped to a continuous numeric scale. Here, the variables ‘1’ and ‘2’ blend together, and this can get worse if there are orders of magnitude between the different values.

Converting the string values to pd.Categorical type yields the same result as above.

df2 = df.melt()
df2.variable = pd.Categorical(df2.variable)

px.strip(df2, x='variable', y='value')
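As a quick sanity check, the following sketch confirms with pandas alone that the melted column is not numeric in either case, so the numeric axis is chosen at plotting time rather than by the dataframe. The seeded generator and the reduced set of columns are assumptions made only to keep the example short and reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only
n = 50

df = pd.DataFrame({
    "1": rng.normal(2, 0.3, n),
    "34": rng.triangular(0, 2, 3, n),
    "123": rng.uniform(1, 3, n),
})

# Case 1: column names stay strings after melting.
df1 = df.melt()
string_is_numeric = pd.api.types.is_numeric_dtype(df1["variable"])

# Case 2: explicit conversion to a categorical dtype.
df2 = df.melt()
df2["variable"] = pd.Categorical(df2["variable"])
categorical_is_numeric = pd.api.types.is_numeric_dtype(df2["variable"])

print(string_is_numeric, categorical_is_numeric)
```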

Workaround:

Adding a character to the values causes them to be recognized as categorical, which is the expected result (except for the added character in the category names). Unfortunately, adding a blank space instead does not work.

df3 = df.copy()
df3.columns = [f"c{c}" for c in df3.columns]

px.strip(df3.melt(), x='variable', y='value')

[Figure: strip plot with a character prepended to the category names]

Considering that numeric categorical values are legitimate in many contexts, it should be possible to use numbers as categories if they are represented by a string or categorical data type, as is the case with the color parameter (https://github.com/plotly/plotly_express/issues/140):

px.strip(df.melt(), y='value', color='variable')

[Figure: strip plot with the categories mapped to color]

Thanks!

Package              Version  
-------------------- --------- 
plotly               4.1.1    
plotly-express       0.4.1    

@emmanuelle, this is one of the two issues we discussed at the PyData meetup.

Issue Analytics

  • State: open
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
nicolaskruchten commented, Jun 25, 2020

@cvrnogueira If you use fig.update_xaxes(type='category') it will apply to all facets.

2 reactions
emmanuelle commented, Nov 22, 2019

Hey @DrGFreeman, sure, this is a valid concern. You can force the axis to be categorical by creating the plotly figure with the px function and then calling

fig.update_layout(xaxis_type='category')

which will force your axis to be categorical. Then of course there is the question whether we should impose this at the plotly.express level…
