Feature Request: Include loss_ as an attribute after fitting
First off, thank you for implementing this method in Python! Very stoked to start using it for my bioinformatics datasets. I have been trying to quantify which parameters are best for my datasets and am having some trouble. I was wondering whether the loss shown in your Enthought talk from SciPy 2018 could be included as an attribute we can access after fitting, so I can figure out which of my hyperparameter settings should be used?
I have a precomputed distance matrix (137 x 137) and am using the following hyperparameter configurations:
for n_neighbors in [3, 4, 5, 6, 7]:
    for min_dist in [0.01, 0.1, 0.2, 0.3, 0.5]:
        for spread in [0.01, 0.1, 0.2, 0.3, 0.5]:
            for learning_rate in [1e-3, 1e-2, 1e-1, 1]:
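For reference, a minimal runnable sketch of that sweep might look like the following. It assumes `D` is the 137 x 137 precomputed distance matrix mentioned above, and it keeps every embedding so each configuration can be inspected afterwards. Note that umap-learn raises an error when min_dist is greater than spread, so those combinations are skipped:

```python
import itertools
import umap

# Assumption: D is the 137 x 137 precomputed distance matrix described above.
param_grid = {
    "n_neighbors": [3, 4, 5, 6, 7],
    "min_dist": [0.01, 0.1, 0.2, 0.3, 0.5],
    "spread": [0.01, 0.1, 0.2, 0.3, 0.5],
    "learning_rate": [1e-3, 1e-2, 1e-1, 1],
}

embeddings = {}
for values in itertools.product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    # umap-learn requires min_dist <= spread, so skip invalid combinations
    if params["min_dist"] > params["spread"]:
        continue
    reducer = umap.UMAP(metric="precomputed", **params)
    embeddings[values] = reducer.fit_transform(D)
```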
There is some structure that fits my hypothesis in some of these configurations, and I want to know which one specifically I should choose, so I thought some sort of loss_ metric would be really useful in this scenario.
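Until something like a loss_ attribute exists, one possible workaround (a sketch, not UMAP's actual cross-entropy loss) is to rank the stored embeddings with an external quality score such as scikit-learn's trustworthiness, which measures how well local neighborhoods are preserved. This builds on the `embeddings` dict from the sketch above; passing metric="precomputed" tells trustworthiness that `D` already holds pairwise distances:

```python
from sklearn.manifold import trustworthiness

# Proxy score: higher trustworthiness = local neighborhoods better preserved.
scores = {
    params: trustworthiness(D, emb, n_neighbors=5, metric="precomputed")
    for params, emb in embeddings.items()
}
best = max(scores, key=scores.get)
print("best (n_neighbors, min_dist, spread, learning_rate):", best, scores[best])
```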
Also, if you have a few moments, can you describe why I sometimes see this topology where it looks like a regression?
Issue Analytics
- Created 5 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
I would start with a learning rate of 1.0 (the default). You can scale it down a bit if you need to, but realistically you shouldn't need to go much below 0.5, I would imagine.
You shouldn’t need a learning rate that low – I suspect that that was simply working around the bug that got fixed very recently. You may want to try again with fresh code and see if you can get away with a higher learning rate.
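In code, that advice amounts to leaving learning_rate at its default and only nudging it down if the layout looks unstable (a minimal sketch, again assuming the precomputed matrix `D`):

```python
import umap

# Default learning rate (1.0) is the sensible starting point.
emb = umap.UMAP(metric="precomputed").fit_transform(D)

# Only if the layout looks unstable, scale it down modestly, e.g. to 0.5.
emb_slow = umap.UMAP(metric="precomputed", learning_rate=0.5).fit_transform(D)
```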
As to comparing hyperparameters … the main ones to change are n_neighbors and min_dist. Ultimately there are no right values, nor is one any better than the others; they are simply different views of your data. You can think of it as being somewhat loosely analogous to looking at 3D data from different viewing angles – no one angle is more true than any other, but some angles may highlight different properties of the data than others. The parameters for UMAP are not quite so simple, but it comes down to a similar thing – they offer you different lenses on the data, and ultimately the lens that helps you see relevant things is the useful one (as opposed to being the true one).
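One practical way to act on that advice is to lay out a small grid of embeddings over n_neighbors and min_dist and compare the "lenses" side by side. A sketch, assuming the precomputed matrix `D` and matplotlib (the particular value grids here are illustrative, not recommendations):

```python
import matplotlib.pyplot as plt
import umap

n_neighbors_values = [3, 5, 7]
min_dist_values = [0.01, 0.1, 0.5]

fig, axes = plt.subplots(
    len(n_neighbors_values), len(min_dist_values), figsize=(9, 9)
)
for i, nn in enumerate(n_neighbors_values):
    for j, md in enumerate(min_dist_values):
        # One embedding per (n_neighbors, min_dist) pair
        emb = umap.UMAP(
            metric="precomputed", n_neighbors=nn, min_dist=md
        ).fit_transform(D)
        ax = axes[i, j]
        ax.scatter(emb[:, 0], emb[:, 1], s=5)
        ax.set_title(f"n_neighbors={nn}, min_dist={md}", fontsize=8)
        ax.set_xticks([])
        ax.set_yticks([])
plt.tight_layout()
plt.show()
```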