Ordinal and nominal types produce the same Vega specs
The documentation for ordinal type says that the default sorting is the “natural” sorting, but both ordinal and nominal types are generating scales with sort: true in Vega.
For example, if I want to render months in sequential order, my data might already be sorted. Alphabetical sorting is not helpful in this case. To reproduce, try this spec. The expected behavior is that the months should be rendered in the same order as they are provided:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"month": "March", "amount": 5000},
      {"month": "April", "amount": 12000},
      {"month": "May", "amount": 3750},
      {"month": "June", "amount": 9000}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "month", "type": "ordinal"},
    "y": {"field": "amount", "type": "quantitative"}
  }
}
Compare the ordinal months and nominal months versions and you’ll see that they compile to the same Vega spec.
As a workaround, you can add a transform to calculate the natural sort order, but this should not be required:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"month": "March", "amount": 5000},
      {"month": "April", "amount": 12000},
      {"month": "May", "amount": 3750},
      {"month": "June", "amount": 9000}
    ]
  },
  "mark": "bar",
  "transform": [{"window": [{"op": "rank", "as": "rank"}]}],
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "sort": {"field": "rank"}},
    "y": {"field": "amount", "type": "quantitative"}
  }
}
Edit: I don’t believe this is the same issue as https://github.com/vega/vega-lite/issues/4229 and https://github.com/vega/vega-lite/issues/5013, but they look related.
Issue Analytics
- Created 2 years ago
- Comments: 18 (13 by maintainers)
I think there are two topics here:
1. What’s the right default for nominal?
I could see an argument that sort: null should be the default for nominal. However, in many cases, the original data order is not meaningful, and thus applying sort: true by default seems sensible because it’s more efficient to read through a list of categories if they are alphabetically sorted. (Basically, I disagree with the latter part of “Row number isn’t always meaningful, but neither is alphabetical order.”)
Also, changing this would be a massive breaking change that I’d rather not make, even for the next major version.
A more middle-ground approach is to add a config for nominal’s default sort (which is true by default). With this config, people who use VL in SQL can set it to null once and reuse that config, so they don’t have to repeat sort: null in every encoding.
(That said, if we were to do this, we would need to determine the best place to put it in the config. For example, should it be config.sortNominal or something?)
2. What does ordinal mean?
To me, the confusion above is further evidence of my concern in #6633. (We have an “ordinal” type, but it doesn’t provide the right affordance: to make it work reasonably at all, one has to specify a custom “sort” order for that ordinal field.)
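As an illustration of the custom sort order mentioned above, here is a minimal sketch that reuses the month data from the issue and lists the desired order explicitly in an array on the x channel. The data values and field names come from the original spec; spelling the order out in a sort array is just one way to do this, and not necessarily what the commenter had in mind.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"month": "March", "amount": 5000},
      {"month": "April", "amount": 12000},
      {"month": "May", "amount": 3750},
      {"month": "June", "amount": 9000}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "month",
      "type": "ordinal",
      "sort": ["March", "April", "May", "June"]
    },
    "y": {"field": "amount", "type": "quantitative"}
  }
}
```

Unlike relying on data order, the array keeps the intended order even if the rows arrive shuffled, at the cost of repeating the values in the spec.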
If it were up to me, I’d rather kill this confusion (and remove internal complexity) by deprecating nominal/ordinal (which would be 100% backward compatible for nominal, and mostly for ordinal except for color) and introducing “type: categorical” (which is basically nominal but sounds far less like jargon) in the next major version.
I know that there was pushback when we discussed this earlier in #6633, but the discussion here confirms my feeling that there is a fundamental confusion. (Plus, given that type is in the first chapter of our teaching material, we had better make it not confusing.)
After all, I think type is so fundamental to our language, teaching, and also internal complexity that it justifies the cost.
(If we do this, then config.sortCategorical might make more sense as the name for Topic 1.)
I would disagree with classifying "sort": null as a workaround. Rather, I consider it the documented API for doing what you have in mind.
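For reference, a sketch of what that looks like with the spec from the issue. Setting "sort": null on the x channel tells Vega-Lite not to sort the domain, so for a discrete field like month the bars should appear in data order (March, April, May, June) without the window transform. This assumes the data already arrives in the desired order.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"month": "March", "amount": 5000},
      {"month": "April", "amount": 12000},
      {"month": "May", "amount": 3750},
      {"month": "June", "amount": 9000}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "sort": null},
    "y": {"field": "amount", "type": "quantitative"}
  }
}
```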