Improvements to Silhouette Visualizer
The following improvements to the Silhouette Visualizer are left over from #91:
Note to contributors: items in the checklist below don’t need to be completed in a single PR; if you see one that catches your eye, feel free to pick it off the list!
- Improve the documentation describing what Silhouette scores are and how to use the visualizer to qualitatively evaluate a clustering solution.
- Find a real-world example rather than just using make_blobs (note: we also have an example using the Iris dataset; ideally we’d have something a bit more unique to YB that we can add to the `yellowbrick.datasets` module - perhaps this should be a separate issue?).
- Instead of hard-fixing the limits of the X-axis from -1.0 to 1.0, be more flexible so that the visualizer has a better display (or give the user the option of setting the limits).
- Move the cluster identity labels away from the middle and to the y-axis.
- Add ability to define cluster colors and improve color selection methodology.
- Add a legend/annotation that describes the average clustering coefficient (e.g. label the red axvline)
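
As background for the first checklist item, here is a minimal sketch of what a silhouette score is. For a sample, let `a` be its mean distance to the other members of its own cluster and `b` the smallest mean distance to the members of any other cluster; the silhouette coefficient is `(b - a) / max(a, b)`, which lies between -1 and 1. The function name, the Euclidean distance, and the toy data are assumptions chosen for illustration; this is not Yellowbrick's or scikit-learn's implementation.

```python
from math import dist
from statistics import mean


def silhouette_samples(points, labels):
    """Return the silhouette coefficient of every point (pure Python, O(n^2))."""
    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention: a singleton cluster scores 0
            continue
        # a: mean distance to the *other* members of the point's own cluster
        # (the sum includes dist(p, p) == 0, hence the division by len - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(p, q) for q in other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


# Toy data (assumed for illustration): two tight, well-separated clusters.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
labels = [0, 0, 1, 1]
scores = silhouette_samples(points, labels)
average = mean(scores)  # the quantity a mean-score line would mark
```

Values near 1 indicate a sample that sits well inside its cluster, values near 0 a sample on the boundary between two clusters, and negative values a sample that is probably assigned to the wrong cluster.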
Issue Analytics
- Created 5 years ago
- Comments: 22 (20 by maintainers)
Top GitHub Comments
The bullet point "Add a legend/annotation that describes the average clustering coefficient (e.g. label the red axvline)" has been addressed with #839 merged.

@gokriznastic glad to hear you’re willing to keep going with the updates to this visualizer. The label should indicate the value of the `SilhouetteVisualizer.silhouette_score_` property, which is what is plotted by the red axvline: L178. This may be as simple as just adding a `label=""` keyword argument to the axvline; it’ll just be a matter of how we communicate this score. Note that if you need to add math to the matplotlib figure, you can wrap it in `$` in the label; for example, `$S_i=0.2$` will render as LaTeX math. Not sure if this is necessary or not, but perhaps there is a symbol for the mean silhouette score.

As for tick box 2 - I appreciate the Iris data set, but I was hoping for something a bit more unique to us.
Also, let’s not forget the user specified limits to the figure!
Thanks again!
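
The suggestion above (a `label` keyword on the axvline, optional `$...$` math, and user-specified limits) might look roughly like the following. This is only a sketch under assumptions: `draw_average_line` is a hypothetical helper, not part of Yellowbrick; the label wording and the `xlim` parameter are invented for illustration; and the value `0.2` stands in for `silhouette_score_`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def draw_average_line(ax, silhouette_score, xlim=None):
    """Draw a labeled mean-score line; optionally apply user-supplied x-limits.

    Hypothetical helper for illustration only, not Yellowbrick's API.
    """
    # The $...$ portion is rendered by matplotlib's mathtext when drawn.
    label = f"Average Silhouette Score: ${silhouette_score:0.2f}$"
    line = ax.axvline(x=silhouette_score, color="red", linestyle="--", label=label)
    if xlim is not None:
        ax.set_xlim(xlim)  # user override instead of a hard-coded (-1.0, 1.0)
    ax.legend(loc="best", frameon=True)  # the legend picks up the axvline label
    return line


fig, ax = plt.subplots()
line = draw_average_line(ax, 0.2, xlim=(-0.1, 1.0))
```

Passing `xlim=None` leaves the axis limits to matplotlib, so a caller can keep whatever default the visualizer chooses and override it only when needed.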