Document differences in dimension handling needed for tidy vs. wide tabular datasets

See original GitHub issue

When multiple Curves are composed into a single chart using an Overlay, the y-axis label is taken from the first Curve’s value Dimension (its vdims) and formatted as “{label} ({unit})”. That convention is fine for a single Curve, but when a chart contains several Curves, the y-axis label should not mention the first Curve’s label; it should instead show the units common to all the Curves in the Overlay.

How does one directly change the axis label of an Overlay?
Is there a way to get the Dimension’s units to be used as the axis label?
Is there control over the axis label’s format?

For example, this code creates a chart comparing the acceleration of two cars. The y-axis label should be the unit “mph”, not “Porsche (mph)”.

# Matplotlib defaults
import matplotlib as mpl
mpl.rcParams['figure.facecolor'] = 'white'
# HoloViews
import holoviews as hv
hv.extension('matplotlib')
# Acceleration data in wide format: one column per car
import pandas as pd
raw_data = [{'time': 0, 'yugo': 0, 'porsche': 0},
            {'time': 1, 'yugo': 10, 'porsche': 10},
            {'time': 2, 'yugo': 20, 'porsche': 30},
            {'time': 3, 'yugo': 30, 'porsche': 90}]
data = pd.DataFrame(raw_data)
# Dimensions annotating each column with a label, range and unit
porsche = hv.Dimension('porsche', label='Porsche', range=(0, 100), unit='mph')
yugo = hv.Dimension('yugo', label='Yugo', range=(0, 55), unit='mph')
time = hv.Dimension('time', label='Time', range=(0, 3), unit='sec')
# One Curve and one Scatter per car
porsche_curve = hv.Curve(data, kdims=[time], vdims=[porsche], label=porsche.label)
porsche_dot = hv.Scatter(data, kdims=[time], vdims=[porsche], label=porsche.label)
yugo_curve = hv.Curve(data, kdims=[time], vdims=[yugo], label=yugo.label)
# .opts replaces the older call syntax obj(style={...}) / obj(plot={...})
yugo_dot = hv.Scatter(data, kdims=[time], vdims=[yugo], label=yugo.label).opts(marker='s')
# Comparison chart: overlay all four elements
cmp_chart = (porsche_curve * porsche_dot * yugo_curve * yugo_dot).relabel('Compare Auto Acceleration').opts(legend_position='top_left')
cmp_chart

Issue Analytics

State:
Created 6 years ago
Comments: 8 (5 by maintainers)

Top GitHub Comments

johnzzzz commented, Dec 7, 2017

Thank you for pointing out that HoloViews assumes data is always “tidy”. I agree that converting between tidy and wide datasets is a conceptual hurdle, one at least as important as the separation of model (data) from view (visualization) and of plot (chart type) from style (rendering). I thought I could annotate the data columns, map the data columns to the views, annotate the views with a style, and HoloViews would render it. The missing step is that the raw data must be normalized (tidied) before the columns are annotated. The HoloViews annotation concept of “group” also seems to be part of tidiness.
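As a minimal sketch of the tidying step described above, the wide table from the question could be reshaped with pandas `melt`. The column names `car` and `speed` are assumptions made for illustration; they do not appear in the original issue.

```python
import pandas as pd

# Wide format from the question: one column per car
wide = pd.DataFrame([
    {'time': 0, 'yugo': 0, 'porsche': 0},
    {'time': 1, 'yugo': 10, 'porsche': 10},
    {'time': 2, 'yugo': 20, 'porsche': 30},
    {'time': 3, 'yugo': 30, 'porsche': 90},
])

# Tidy format: one row per (time, car) observation.
# 'car' and 'speed' are hypothetical names chosen for this sketch.
tidy = wide.melt(id_vars='time', value_vars=['yugo', 'porsche'],
                 var_name='car', value_name='speed')

print(tidy)
```

In the tidy table every measurement lives in a single `speed` column, so one Dimension (and therefore one label and unit) can describe the values for both cars.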

One problem with the current documentation is that it always shows the clever cases where the data is tidy and the labels can be correctly inferred from the name. This helps motivate novices to use HoloViews, but it hides the subtlety of what is going on in the inference layer, which slows users down once they need to take control of the inference. I would suggest using the clever examples in the Getting Started section but never in the main user guide.

jbednar commented, Dec 7, 2017

Oops; I see you basically had this code already. Ok, here’s a slightly expanded version showing how you can change the dimension if you want to do it after the fact:

I don’t think there’s anything less well supported about this approach; it’s just more verbose than with tidy data and not yet very well documented in the examples.
