Replace nominal/ordinal with categorical to avoid confusion.
Our ordinal type has a long history, going back to the original version.
I’ve been feeling more and more that it’s a mistake.
The only thing “ordinal” really does in the whole compiler is select the “ordinal” color scheme instead of “category” when the field is used with a color scale.
More importantly, the keyword “ordinal” is incomplete on its own because it doesn’t tell the order. Users still have to specify a custom sort order (e.g., "sort": ["small", "medium", "large"]) anyway.
Basically, there is no difference between:
x: {field: 'size', type: 'nominal', sort: ["small", "medium", "large"]}
and
x: {field: 'size', type: 'ordinal', sort: ["small", "medium", "large"]}
So having “ordinal” in the language is simply just confusing because the keyword doesn’t do anything related to order.
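To make the point concrete, here is a minimal sketch. It assumes a hypothetical helper, `resolveDomain`, which is not a Vega-Lite function, and it assumes that both types fall back to plain ascending order when no `sort` is given. The order comes entirely from the `sort` array; the type argument is never consulted.

```typescript
// Hypothetical helper for illustration only; not part of the Vega-Lite API.
type DiscreteType = 'nominal' | 'ordinal';

function resolveDomain(values: string[], _type: DiscreteType, sort?: string[]): string[] {
  // Keep the first occurrence of each value.
  const distinct = values.filter((v, i) => values.indexOf(v) === i);

  if (sort === undefined) {
    // Assumption: without an explicit order, both types use ascending string order.
    return distinct.sort();
  }

  // Rank each value by its position in `sort`; values not listed go last.
  const rank = (v: string): number => {
    const i = sort.indexOf(v);
    return i === -1 ? sort.length : i;
  };
  return distinct.sort((a, b) => rank(a) - rank(b));
}

const sizes = ['large', 'small', 'medium', 'small'];
const order = ['small', 'medium', 'large'];

const asNominal = resolveDomain(sizes, 'nominal', order);
const asOrdinal = resolveDomain(sizes, 'ordinal', order);
const unsorted = resolveDomain(sizes, 'ordinal');
```

Here `asNominal` and `asOrdinal` are identical, and without `sort` the “ordinal” field comes out alphabetically, which is not the order a reader would expect for sizes.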
In a way, some books even just call nominal “unordered categorical” and ordinal “ordered categorical”.
I think we should consider deprecating it in VL 5.0 (with backward compatibility so we don’t break people’s code) so we can simplify both the documentation and the internal code.
For internal code, there are several places where we have to check whether the type is either nominal or ordinal, even though the ordinal bit is never useful beyond the color range.
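A rough sketch of what the backward-compatible path could look like, under stated assumptions: the `'categorical'` type name is the proposal from this issue, and `normalizeType`, `isCategorical`, and `defaultColorScheme` are hypothetical names, not existing compiler functions. Legacy types are mapped once at the boundary, and an `ordered` flag keeps the one behavior that currently depends on “ordinal”.

```typescript
// All names below are hypothetical; this is not the Vega-Lite compiler's code.
type LegacyDiscrete = 'nominal' | 'ordinal';
type FieldType = LegacyDiscrete | 'categorical' | 'temporal' | 'quantitative' | 'geojson';

interface NormalizedType {
  type: Exclude<FieldType, LegacyDiscrete>;
  // Preserves the only distinction the issue attributes to 'ordinal'.
  ordered: boolean;
}

// Map legacy type names to the proposed one so existing specs keep working.
function normalizeType(type: FieldType): NormalizedType {
  switch (type) {
    case 'nominal':
      return {type: 'categorical', ordered: false};
    case 'ordinal':
      return {type: 'categorical', ordered: true};
    default:
      return {type, ordered: false};
  }
}

// Internal checks collapse from "nominal or ordinal" to a single comparison.
function isCategorical(t: NormalizedType): boolean {
  return t.type === 'categorical';
}

// The color default is the one place the flag is read; other types return undefined.
function defaultColorScheme(t: NormalizedType): 'ordinal' | 'category' | undefined {
  if (!isCategorical(t)) {
    return undefined;
  }
  return t.ordered ? 'ordinal' : 'category';
}
```

With this shape, a spec that still says `type: 'ordinal'` would keep its current color default, while new specs could use the single proposed type name.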
This is also related to the discussion about adding a cyclical type: https://github.com/vega/vega-lite/issues/6590
Issue Analytics
- Created 3 years ago
- Comments:14 (14 by maintainers)
Top GitHub Comments
Seeing #7654, I still feel that option c) is probably the right long-term solution.
After all, I think the teaching argument that we need to align with Stevens’s levels of measurement is less convincing because we don’t align ratio/interval with our current type system anyway. It’s simpler to teach that we have categorical, temporal, and quantitative as the 3 primary data types (+ geojson for maps). No one would get confused.
Personally I am generally in favour of using breaking changes only as an exceptional last resort, and the costs of losing Ordinal greatly outweigh the (questionable) benefits of doing so. Adding overlapping alternatives to the API like ‘categorical’ would probably add more confusion than clarity. FWIW, I’d vote on keeping the existing labels in Vega-Lite.
I would question the reasoning that synonymous use of terms like ‘interval’ is the block in people’s understanding though. After all, the more general use of the term ‘quantitative’ (as in ‘countable’) covers many ‘ordinal’ data too. I think the confusion arises (along with @kanitw’s point about ‘interval’ also meaning a time interval) because students often fail to realise that each measurement scale property applies not only to the named scale, but also to the others ‘later’ in the sequence (so ratio data also allow intervals to be derived and are also orderable; interval data are also orderable; ordinal data are also identifiable by name). No amount of renaming will solve that problem.
BTW, the other common interval data example that students often stumble on is temperature, especially with respect to zero axis baseline.