Incorporate categorical labels in plots/summary

I’m using arviz in conjunction with pymc3 to fit a model with categorical variables. When I use arviz to view or plot variables, the names associated with the categories are replaced by their integer codes, which makes the output harder to interpret. It would be nice if there were a way to restore meaningful category names, perhaps during creation of the InferenceData object?

For example, plot_forest:

import arviz as az
import pandas as pd
import pymc3 as pm

data_path = "https://raw.githubusercontent.com/rmcelreath/rethinking/master/data/NWOGrants.csv"
nwo = pd.read_csv(data_path, delimiter=';')

gender = pd.Categorical(nwo['gender'])  # 0 is female, 1 is male

with pm.Model() as m_1:
    a = pm.Normal('a', 0, 1.5, shape=gender.categories.size)
    
    p = pm.invlogit(a[gender.codes])
    # Category labels are removed by the use of `.codes`, which provides integer labels.
    # This is necessary because theano throws a RecursionError if the categories are not integers.
    # This is the information that it would be nice to re-incorporate in arviz. 
    
    award = pm.Binomial('award', nwo.applications, p, observed=nwo.awards)
    
    trace_1 = pm.sample()

az.plot_forest(trace_1, combined=True)

[Image: forest plot produced by the az.plot_forest call above]

It would be nice if there were a way to re-incorporate the category labels that are lost when the categories have to be provided as integers (gender.codes).

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

oscarbranson commented, May 18, 2020 (1 reaction)

Great, thanks @OriolAbril. Think this can probably be closed now?

OriolAbril commented, May 14, 2020 (1 reaction)

I have been making some tweaks to the docs: from_pymc3 will have a link to the cookbook, and the cookbook will have a table of contents to ease navigation. Links point to the updated docs (hosted in my fork until merged in #1184).
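
For readers without access to the linked cookbook, here is a minimal sketch of the general idea, not necessarily the exact recipe the maintainers documented. It assumes ArviZ 0.x, where az.from_dict accepts coords and dims keyword arguments. The gender values and the posterior draws are synthetic stand-ins (hypothetical), so nothing is downloaded or sampled.

```python
import arviz as az
import numpy as np
import pandas as pd

# Hypothetical stand-in for nwo['gender']; the real column is not downloaded here.
gender = pd.Categorical(["m", "f", "f", "m"])

# Synthetic draws standing in for the sampled `a`, shaped (chain, draw, category).
rng = np.random.default_rng(0)
fake_a = rng.normal(size=(2, 50, gender.categories.size))

# Name the last dimension of `a` and attach the category labels as its coordinates.
idata = az.from_dict(
    posterior={"a": fake_a},
    coords={"gender": list(gender.categories)},
    dims={"a": ["gender"]},
)

# The labels are now stored on the variable and can be used for selection.
labels = list(idata.posterior["a"].coords["gender"].values)
a_female = idata.posterior["a"].sel(gender="f")
```

With a real trace, the same coords and dims arguments can be passed to az.from_pymc3 (called inside the model context, or with the model passed explicitly). Plotting the resulting InferenceData with az.plot_forest should then label the entries of a by category name instead of by integer index.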
