Overlapping Histogram Count

(Code provided at the bottom)

Consider the results of two experiments:

  Trial A    Trial B
 0.397371  -0.600645
-0.110611  -1.075366
 0.518151  -1.940370
 1.218424  -2.646937
-0.187323  -1.301777

After tidying up the data I create a plot of overlapping histograms.

Under the plot of the overlapping histograms I would like to include a bar representing the amount of overlap. I would also like the graphic to include an interval selection linking the two.

I am unsure how to do this.

Thanks for any advice.

-Eitan

import pandas as pd
import altair as alt
import numpy as np
np.random.seed(42)

# Generating Data
source = pd.DataFrame({'Trial A': np.random.normal(0, 0.8, 1000),
                   'Trial B': np.random.normal(-2, 1, 1000)})

interval = alt.selection_interval(encodings=['x'])

# Tidying Data
source = pd.melt(
    source,
    id_vars=source.index.name,
    value_vars=source.columns,
    var_name='Experiment',
    value_name='Measurement'
)

# Overlapping Histograms 
hist = alt.Chart(source).mark_area(
    opacity=0.3,
    interpolate='step'
).encode(
    alt.X('Measurement', bin=alt.Bin(maxbins=100)),
    alt.Y('count()', stack=None),
    alt.Color('Experiment')
).add_selection(interval)

# Amount of overlap (???)
bar = alt.Chart(source).mark_bar().encode(
    x = alt.X('count()', scale=alt.Scale(domain=(0, 2100)))
).transform_filter(interval)

hist & bar

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
jakevdp commented, Nov 25, 2019

Yeah, TBH I probably couldn’t have answered this when you first asked it, but creating the http://github.com/altair-viz/altair-transform package has given me more practice with thinking in terms of Vega-Lite transforms…

1 reaction
jakevdp commented, Nov 24, 2019

Here’s the same with an interval selection to choose which points to include:

import pandas as pd
import altair as alt
import numpy as np
np.random.seed(42)

# Generating Data
source = pd.DataFrame({'Trial A': np.random.normal(0, 0.8, 1000),
                   'Trial B': np.random.normal(-2, 1, 1000)})

interval = alt.selection_interval()

scatter = alt.Chart(source).mark_point().encode(
    x='Trial A',
    y='Trial B',
    color=alt.condition(interval, alt.value('steelblue'), alt.value('lightgray'))
).add_selection(
    interval
)

base = alt.Chart(source).transform_fold(
    ['Trial A', 'Trial B'],
    ['Experiment', 'Measurement']
).transform_filter(
    interval
).transform_bin(
    field='Measurement',
    bin=alt.Bin(maxbins=50),
    as_=['Measurement_min', 'Measurement_max']
).transform_aggregate(
    count='count()',
    groupby=['Measurement_min', 'Measurement_max', 'Experiment']
)

hist = base.mark_area(
    opacity=0.3,
    interpolate='step'
).encode(
    x=alt.X('Measurement_min:Q', bin='binned'),
    x2='Measurement_max:Q',
    y=alt.Y('count:Q', stack=None),
    color='Experiment:N'
)

overlap = base.transform_impute(
    impute='count',
    key='Measurement_min',
    value=0,
    groupby=['Experiment']
).transform_aggregate(
    overlap='min(count)',
    groupby=['Measurement_min']
).mark_bar().encode(
    x='sum(overlap):Q'
)

scatter | (hist & overlap)
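
To sanity-check what the overlap bar reports, the same quantity can be sketched outside Altair: count both trials over shared bins, take the smaller count in each bin, and sum. This is only an illustration under stated assumptions. It uses 50 equal-width NumPy bins, whereas Vega-Lite's maxbins=50 chooses its own "nice" bin boundaries, so the number will not match the chart exactly. The names trial_a, trial_b, edges and total_overlap are made up for this sketch.

```python
import numpy as np

np.random.seed(42)

# Same synthetic data as in the thread.
trial_a = np.random.normal(0, 0.8, 1000)
trial_b = np.random.normal(-2, 1, 1000)

# Shared bin edges so both histograms are counted over identical bins.
# Assumption: 50 equal-width bins, not Vega-Lite's "nice" binning.
edges = np.histogram_bin_edges(np.concatenate([trial_a, trial_b]), bins=50)
count_a, _ = np.histogram(trial_a, bins=edges)
count_b, _ = np.histogram(trial_b, bins=edges)

# Per-bin overlap is the smaller of the two counts. A bin that is empty
# for one trial contributes 0, which is the role of the impute step in
# the Altair version.
overlap_per_bin = np.minimum(count_a, count_b)
total_overlap = int(overlap_per_bin.sum())

print(total_overlap)
```

In the Altair chart, transform_impute supplies the zero counts for bins that one experiment never hits, min(count) grouped by Measurement_min picks the per-bin minimum, and sum(overlap) adds them up into the single bar.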