-1 entries in neighbor_graph
I get some -1 entries in the kNN graph when I build it on a sparse matrix with cosine distance. Is this intentional? Does -1 mean X.shape[0]-1? When I run query(X), I don’t get any negative elements. E.g.:
X = scipy.sparse.csr_matrix(np.random.randn(10000,1247)>3)
nn = pynndescent.NNDescent(X, metric='cosine', n_neighbors=15)
Now nn.neighbor_graph[0] has some values equal to -1 but nn.query(X, k=15)[0] does not.
Update: I forgot to say that I do get a warning “UserWarning: Failed to correctly find n_neighbors for some samples.Results may be less than ideal. Try re-running withdifferent parameters.” from NNDescent. Maybe that’s what the -1 entries indicate? But then query() does not return any negative elements and does not complain. How should one approach this?
Issue Analytics
- State:
- Created 3 years ago
- Comments:11 (4 by maintainers)
Top GitHub Comments
For your information, I’ve been looking at this dataset: https://johnhw.github.io/umap_primes/index.md.html. It turns out that this (all neighbors -1) happens for around 10% of points (at least among the first 100K numbers). These seem to be mostly prime numbers, and UMAP mostly keeps them where they were at initialization.
Hi Leland, I ran into a similar issue with -1’s in the neighbor_graph.
When I run the snippet below I see that there are no -1’s in the graph.
output: 0
However, if I query the graph there are suddenly -1’s.
output: 62251
This seems odd since the underlying graph shouldn’t be affected by querying (new) data. The number of -1’s varies when using different metrics, but there’s always some.
I’m on pynndescent 0.48.1 and numba 0.49.1.
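On the question of whether -1 means X.shape[0]-1: NumPy itself interprets a -1 index as "last row", so passing such an index array straight into fancy indexing would silently pull in the last sample. The sketch below masks those entries first. It assumes -1 is a placeholder rather than a real neighbor, and neighbor_means is a hypothetical helper, not a pynndescent function.

```python
import numpy as np


def neighbor_means(X, indices, sentinel=-1):
    """Mean of each row's valid neighbors; NaN where a row has no valid neighbor.

    Assumption: `sentinel` marks an unfilled neighbor slot, not a real index.
    """
    X = np.asarray(X, dtype=float)
    indices = np.asarray(indices)
    valid = indices != sentinel
    safe = np.where(valid, indices, 0)      # placeholder index, masked out below
    gathered = X[safe] * valid[..., None]   # zero the rows fetched via placeholders
    counts = valid.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        return gathered.sum(axis=1) / counts


X_toy = np.array([[0.0], [10.0], [20.0]])
idx_toy = np.array([[1, 2], [0, -1], [-1, -1]])

naive = X_toy[idx_toy].mean(axis=1)   # -1 wraps around to the last row of X_toy
masked = neighbor_means(X_toy, idx_toy)
print(naive.ravel(), masked.ravel())
```

In the toy data the second row has one real neighbor (value 0), yet the naive version averages it with the last row and reports 10, while the masked version reports 0.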