Provide simple mechanism for adding icons to datasets
Description
Is your feature request related to a problem? A clear and concise description of what the problem is: “I’m always frustrated when …”
Users can label their dataset in the catalog and provide layers - but there is very little they can do to differentiate datasets beyond this from a visual perspective. Adding the facility to apply an icon from an existing library of icons would be an effective mechanism for making the pipeline visualisation a clearer and more efficient story-telling tool.
Context
Why is this change important to you? How would you use it? How can it benefit other users?
A simple example: users could mark Excel data sources versus SQL data sources at a glance, which is even more useful in the collapsed, label-less view.
Possible Implementation
(Optional) Suggest an idea for implementing the addition or change.
On the YAML catalog side, there could be an extra key, icon, like so:
flight_times:
  type: pandas.CSVDataSet
  layer: raw
  load_args:
    sep: '|'
  icon: carbon-csv
This could pull in the following icon from the Carbon Design System (by IBM), provided by the Iconify framework, which collects several open source icon libraries:
https://iconify.design/icon-sets/carbon/csv.html
Using the [iconify-react](https://github.com/iconify/iconify-react) library, this would hopefully be a low-effort addition.
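To make the proposal a little more concrete, here is a minimal sketch of how the icon key might be separated from the rest of a catalog entry before the dataset is instantiated, so that the dataset class never sees an argument it does not expect. Everything here is an assumption for illustration: the helper name split_icons, the ICON_KEY constant, and the second catalog entry (airports) are hypothetical and not part of Kedro or Kedro-Viz. The catalog is written as a plain dict, as it would look after the YAML above has been parsed.

```python
from copy import deepcopy
from typing import Any, Dict, Tuple

# Hypothetical key name, taken from the proposal in this issue.
ICON_KEY = "icon"

CatalogConfig = Dict[str, Dict[str, Any]]


def split_icons(catalog: CatalogConfig) -> Tuple[CatalogConfig, Dict[str, str]]:
    """Return (catalog without icon keys, mapping of dataset name -> icon).

    The input is left untouched; datasets without an icon key are simply
    absent from the returned mapping.
    """
    cleaned = deepcopy(catalog)
    icons: Dict[str, str] = {}
    for name, config in cleaned.items():
        icon = config.pop(ICON_KEY, None)
        if icon is not None:
            icons[name] = icon
    return cleaned, icons


# The flight_times entry mirrors the YAML above; airports is a made-up
# entry that shows what happens when no icon is given.
catalog = {
    "flight_times": {
        "type": "pandas.CSVDataSet",
        "layer": "raw",
        "load_args": {"sep": "|"},
        "icon": "carbon-csv",
    },
    "airports": {"type": "pandas.ExcelDataSet"},
}

cleaned, icons = split_icons(catalog)
print(icons)
```

The icon mapping could then be passed to the visualisation layer separately, while the cleaned configuration is used to build the datasets as usual.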
Checklist
- Include labels so that we can categorise your feature request
Issue Analytics
- Created 2 years ago
- Reactions:1
- Comments:7 (4 by maintainers)
Top GitHub Comments
Such a feature would be nice, not only at the dataset level, but also at the node level.
Personally, when I build pipelines with Kedro, I try to make each node responsible for only one main activity (one that can be clearly explained to others). It would be nice to represent each node in Kedro-Viz with an icon summarizing its main task, e.g. clean, stack, join, filter, train, predict…
And naturally, as you said in the issue title, it would also be interesting to identify dataset types by their icon, giving a hint about the origin of the data source (CSV, SQL, Excel file), as a pipeline is often a mix of various dataset types…
Possibilities are quite numerous if a good icon collection can be provided!
I would second this. It could help keep the pipeline maintainer right as well as improve comprehension among a non-technical audience. It should go without saying that it should not be compulsory, however. Perhaps it could be toggled in Viz.
To play devil’s advocate, I would also say that the wrong implementation could add unnecessary complexity. Should Kedro be laissez-faire about which icon packs can be used? If so, would it still be easy in most cases to distinguish between a node and a dataset? Should users be able to specify custom icons for their custom datasets? If yes, would certain visualisations begin to look messy? If no, could things look incomplete? Would the OCD among us end up spending more time customising icons than writing code? lol.
I think it’s a great idea, but only with the right design choices. Users frequently say they like Kedro because it’s opinionated, so whatever the implementation is, it should be congruent with that broader ethos.
That’s my two pence anyways!
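One way to picture the "opinionated but optional" approach discussed in the comments above is a small lookup with sensible defaults per dataset type, an explicit per-dataset override, and a global toggle. This is only a sketch under stated assumptions: resolve_icon and DEFAULT_ICONS are hypothetical names, and the icon identifiers other than carbon-csv are placeholders that have not been checked against the Iconify Carbon set.

```python
from typing import Dict, Optional

# Hypothetical defaults keyed by dataset class name. The icon names are
# placeholders for illustration, not verified Iconify identifiers.
DEFAULT_ICONS: Dict[str, str] = {
    "CSVDataSet": "carbon-csv",
    "ExcelDataSet": "carbon-xls",
    "SQLTableDataSet": "carbon-sql",
}


def resolve_icon(
    dataset_type: str,
    explicit_icon: Optional[str] = None,
    icons_enabled: bool = True,
) -> Optional[str]:
    """Pick the icon to show for a dataset, or None to show no icon.

    Precedence: the global toggle first, then an icon set explicitly in
    the catalog, then the default for the dataset class, if any.
    """
    if not icons_enabled:
        return None
    if explicit_icon:
        return explicit_icon
    # "pandas.CSVDataSet" -> "CSVDataSet"
    class_name = dataset_type.rsplit(".", 1)[-1]
    return DEFAULT_ICONS.get(class_name)


print(resolve_icon("pandas.CSVDataSet"))
print(resolve_icon("pandas.CSVDataSet", explicit_icon="carbon-table"))
print(resolve_icon("pickle.PickleDataSet"))
```

With a fixed default table, most pipelines would get consistent icons without any configuration, and datasets with no known default would simply fall back to the current icon-less rendering.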