Warning message: failed creating initial embedding; using random embedding instead


Hello, I am trying to run UMAP on a dataset of ~5000 observations and 20 features (selected by a previous PCA step), using the R implementation provided by the umap package.

When computing UMAP I get this warning message: "failed creating initial embedding; using random embedding instead". So the spectral initialization is not working. How should I tackle this sort of instability?

Plots are quite different from each other in different runs, and I would like something reproducible and possibly robust. I imagine this could be due to my data, but I wasn’t able to find help about that in the UMAP documentation.

Feel free to close this issue if it’s not appropriate for this repo. Thank you for your kind attention.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
lmcinnes commented, Nov 21, 2019

Essentially this would mean that the internal topological representation (a graph) does not have a good spectral gap in the Laplacian; or that the spectral embedding/eigenvector solver is not working correctly on your system. In the first case that means that potentially you may want to add a small amount of noise or jitter to your data to hopefully nudge it out of this odd situation. In the second case you probably want to look at how the ARPACK components are installed.
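The "small amount of noise or jitter" idea can be sketched as follows. This is a minimal, language-agnostic illustration in plain Python, not code from the umap package: the helper `add_jitter`, its `rel_scale` parameter, and the toy data are all hypothetical. It adds Gaussian noise scaled to a small fraction of each column's standard deviation and uses a fixed seed so that the perturbed data is the same on every run.

```python
import random
import statistics

def add_jitter(rows, rel_scale=1e-4, seed=42):
    """Return a copy of rows with small Gaussian noise added to every value.

    rel_scale and seed are illustrative defaults, not recommended settings.
    """
    rng = random.Random(seed)  # fixed seed keeps the jittered data reproducible
    columns = list(zip(*rows))
    # Noise level per column: a small fraction of that column's spread.
    sigmas = [rel_scale * statistics.pstdev(col) for col in columns]
    return [
        [value + rng.gauss(0.0, sigma) for value, sigma in zip(row, sigmas)]
        for row in rows
    ]

# Toy data with exact duplicate rows, one plausible source of degenerate graphs.
data = [[0.0, 1.0], [0.0, 1.0], [2.0, 3.0], [2.0, 3.0]]
jittered = add_jitter(data)
```

The noise should be small enough that it does not change which points are neighbors in any meaningful way; whether it actually removes the warning depends on the data and would need to be tested.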

0 reactions
tkonopka commented, Nov 21, 2019

I like the approach to experiment with the number of neighbors. Along similar lines, another experiment might be to change the input data… The 20 features in this dataset come from PCA preprocessing. Using a couple more or fewer features from that preprocessing stage should not affect the interpretation of the workflow, but might introduce that “noise or jitter” to avoid numerical problems. Not sure if this would work in practice, but it is quick to test.
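One way to see why the number of neighbors might matter: a graph Laplacian has one zero eigenvalue per connected component, so a nearest-neighbor graph that splits into several pieces has no spectral gap. The sketch below counts the components of a symmetrized k-nearest-neighbor graph for different values of k. It is written in plain Python for illustration and is an assumption about a possible cause, not a description of how the umap package builds or checks its graph; `knn_components` and the toy points are hypothetical.

```python
import math

def knn_components(points, k):
    """Count connected components of the symmetrized k-nearest-neighbor graph."""
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, p in enumerate(points):
        # Brute-force neighbor search; fine for a sketch, slow for large data.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        for j in order[:k]:
            parent[find(i)] = find(j)  # an edge merges the two components

    return len({find(i) for i in range(n)})

# Two well-separated clusters of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
for k in (2, 3):
    print(k, knn_components(points, k))
# prints "2 2" then "3 1"
```

With k = 2 every point finds both of its neighbors inside its own cluster, so the graph stays in two pieces; with k = 3 each point must reach across to the other cluster and the graph becomes connected. On real data, a component count above one at the chosen number of neighbors would be a hint, not proof, that the spectral initialization may struggle.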
