NA returned with Warning: Embedding 8 connected components using meta-embedding (experimental) n_components
Working environment: Mac OS 13.5, Python 3.6. My data has 660K rows and 6 dimensions. First, I tried n_neighbors=100, and it worked fine. Then I tried n_neighbors=15, and it gave this warning:
lib/python3.6/site-packages/umap/spectral.py:229: UserWarning: Embedding 8 connected components using meta-embedding (experimental) n_components
The returned embedding was all NA. I then tried n_neighbors=200 and n_neighbors=500, and the embedding was all NA in both cases. I am not sure what happened.
Thank you!
Top GitHub Comments
I suspect the spectral initialisation is failing for one reason or another. This can often happen with particularly oddly distributed data. As a workaround you can pass init='random' as a parameter to UMAP. It should at least stop the NaNs from happening. This isn't ideal, but it should get you past the immediate problem. I'll try to look into the deeper issue soon.
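For anyone landing here, the workaround might look roughly like the sketch below. It assumes the umap-learn package is installed; the helper names embedding_is_valid and fit_with_fallback are made up for illustration and are not part of UMAP. The idea is to try the default initialisation first and only fall back to init='random' when the result contains NaNs.

```python
import numpy as np


def embedding_is_valid(embedding):
    """Return True if the embedding contains no NaN or infinite values."""
    return bool(np.isfinite(embedding).all())


def fit_with_fallback(data, n_neighbors=15, random_state=42):
    """Hypothetical helper: try the default init, fall back to init='random'.

    The import is done lazily so the NaN check above can be used
    even where umap-learn is not installed.
    """
    import umap  # assumption: the umap-learn package

    # First attempt uses the default (spectral) initialisation.
    embedding = umap.UMAP(
        n_neighbors=n_neighbors, random_state=random_state
    ).fit_transform(data)
    if embedding_is_valid(embedding):
        return embedding

    # Fallback suggested in this thread: random initialisation.
    return umap.UMAP(
        n_neighbors=n_neighbors, init="random", random_state=random_state
    ).fit_transform(data)
```

Calling fit_with_fallback(my_data, n_neighbors=15) would then return a usable embedding in the failing case, at the cost of a second fit. If the extra fit is too slow on 660K rows, passing init='random' directly is the simpler option.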
Each instance is viewed as an independent object, so even if they are identical in the data they are treated as technically separate, and thus embed into different locations.
Checking for unique rows is certainly an option, but a very expensive one computationally. There are other checks that should catch such situations, so I’m not sure whether this was technically the problem or not.
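If you do want to check for duplicate rows yourself before embedding, a minimal sketch with NumPy could look like this. The function name deduplicate_rows and the tiny example array are only for illustration. As noted above, this requires sorting all rows, so it may be slow on very large inputs.

```python
import numpy as np


def deduplicate_rows(data):
    """Return the unique rows and, for each original row, its index in them."""
    unique_rows, inverse = np.unique(data, axis=0, return_inverse=True)
    # ravel() keeps the index array 1-D regardless of NumPy version.
    return unique_rows, np.ravel(inverse)


# Illustrative data: the first and last rows are identical.
data = np.array([[1.0, 2.0], [3.0, 4.0], [1.0, 2.0]])
unique_rows, inverse = deduplicate_rows(data)
n_duplicates = len(data) - len(unique_rows)
print(f"{n_duplicates} duplicate row(s) out of {len(data)}")
```

One possible use is to embed only unique_rows and then index the result with inverse to get one position per original row. Note that this changes the behaviour described above, since identical rows would then share a single location.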