Make `sort.op` optional

Crossposting from vega/vega-lite#1489

Why is the op keyword on sort required?

Here’s an example Jupyter Notebook using Vega via Altair where I see the requirement causing a problem.

The user has a simple table with one nominal value and one quantitative value per row, in this case the median income of each county in the United States.

Each row is encoded into a bar on a chart. The user would like to sort the nominal bars using the y-axis’ quantitative value with no transformation or aggregation.

To the user in that circumstance, it seems extraneous to have to submit any operation at all.

import altair as alt

# df is the notebook's table: one row per county, with columns `name` and `b19013001`
alt.Chart(df, title="Median household income of U.S. counties").mark_bar().encode(
    x=alt.X(
        "name:N",
        axis=alt.Axis(labels=False, title="", ticks=False),
        sort=alt.SortField(
            field='b19013001',
            op='sum',  # <-- Why is this necessary?
            order="descending"
        )
    ),
    y=alt.Y(
        "b19013001:Q",
        axis=alt.Axis(title="", format="$s", ticks=False)
    )
).properties(width=620)

If the chart is not aggregated, why should the user have to specify an aggregation?

Am I crazy to think that a sensible default would be this: if no aggregation function is provided, Vega should assume there is a 1:1 relationship between the axis and the sort, perhaps raising an error if there isn’t?
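To make the proposed default concrete, here is a minimal sketch in plain Python of what "assume 1:1, raise an error otherwise" could look like. The helper `sort_domain_without_op` and the sample rows are hypothetical and written only for illustration; this is not how Vega or Altair resolve sorting.

```python
from collections import Counter


def sort_domain_without_op(records, key, field, descending=True):
    """Hypothetical helper: order `key` values by `field` with no aggregate.

    Raises ValueError when a key appears in more than one record, because
    the sort value for that key would then be ambiguous.
    """
    counts = Counter(r[key] for r in records)
    duplicated = sorted(k for k, n in counts.items() if n > 1)
    if duplicated:
        raise ValueError(f"sort needs an op; multiple records for: {duplicated}")
    ordered = sorted(records, key=lambda r: r[field], reverse=descending)
    return [r[key] for r in ordered]


# Made-up rows, one per county, standing in for the notebook's table.
counties = [
    {"name": "A", "b19013001": 41000},
    {"name": "B", "b19013001": 67000},
    {"name": "C", "b19013001": 52000},
]
order = sort_domain_without_op(counties, "name", "b19013001")
```

With one row per county the helper returns the names ordered by value. Adding a second row for any county makes it raise instead of silently picking an aggregate.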

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
jheer commented, Jun 19, 2018

There has to be some aggregate here, as there is no guarantee that there are not multiple records for values in the scale domain. So the question becomes: should there be a default aggregate operation and if so what should it be? Min? Max? Something else?

The “virtue” of including the op is twofold: (1) it makes it clear what is being done, hopefully preventing future confusion at the cost of some upfront learning, and (2) while options like “sum” and “average” are only applicable to numeric values (and so not suitable as default operations), they permit more efficient streaming operations than “min” or “max”, thus enabling more performant visualizations. (To deal with possible value removals, min/max must keep a list of all data records seen.)

As a result of the above, I’m inclined to keep the design as-is, though I welcome more discussion. Another option is for Vega-Lite to make its own decision here, and keep Vega as-is regardless.
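The streaming point in the comment above can be illustrated with a small Python sketch. The classes `RunningSum` and `RunningMin` are hypothetical names chosen for this example and are not Vega's actual aggregate implementation. The sketch assumes that only values previously added are ever removed.

```python
from collections import Counter


class RunningSum:
    """Sum under inserts and removals: a single number of state."""

    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value

    def remove(self, value):
        self.total -= value


class RunningMin:
    """Min under inserts and removals: must remember every live value."""

    def __init__(self):
        self.values = Counter()  # value -> how many live records carry it

    def add(self, value):
        self.values[value] += 1

    def remove(self, value):
        # Assumes `value` was added earlier.
        self.values[value] -= 1
        if self.values[value] == 0:
            del self.values[value]

    def current(self):
        return min(self.values) if self.values else None


running_sum, running_min = RunningSum(), RunningMin()
for v in (5, 3, 8):
    running_sum.add(v)
    running_min.add(v)

# Remove the record that currently holds the minimum.
running_sum.remove(3)
running_min.remove(3)
```

After the removal, the sum is corrected by a single subtraction, while the minimum can only be recovered because the remaining values (5 and 8) were kept around.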

0 reactions
CMCDragonkai commented, Oct 26, 2018

The documentation should mention that if you just want to do a simple sort on some quantitative value, just use the sum aggregate.
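A short Python sketch shows why `sum` works as a stand-in for "no aggregation" when each category has exactly one record, and why the choice of op starts to matter once it does not. The function `domain_order`, the field name `income`, and the rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def domain_order(records, key, field, op, descending=True):
    """Group `field` values by `key`, aggregate each group with `op`, then sort the keys."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r[field])
    return sorted(groups, key=lambda k: op(groups[k]), reverse=descending)


# One record per category: every aggregate returns the raw value.
rows = [
    {"name": "A", "income": 41000},
    {"name": "B", "income": 67000},
    {"name": "C", "income": 52000},
]
orders = {op.__name__: domain_order(rows, "name", "income", op)
          for op in (sum, min, max, mean)}

# A second record for "A": sum and max now rank the categories differently.
rows_with_duplicate = rows + [{"name": "A", "income": 30000}]
by_sum = domain_order(rows_with_duplicate, "name", "income", sum)
by_max = domain_order(rows_with_duplicate, "name", "income", max)
```

In the 1:1 case all four ops produce the same order, so `sum` is a harmless choice. With the duplicate row, `by_sum` and `by_max` disagree, which is the ambiguity the required op is meant to resolve.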
