LLVM-Error when using mahalanobis metric with larger datasets
Hello!
First: Thanks for this splendid project! It’s looking great, especially the semi-supervised part.
When I try to use Mahalanobis distances with larger datasets, so that the code path for fewer than 4000 samples is not taken, I get an LLVM IR parsing failure. The other code path works just fine.
Steps to reproduce:
Version: umap-learn-0.3.2
import numpy as np
from umap import UMAP
matrix = np.random.rand(5000,50)
umap = UMAP(n_components=2, n_neighbors=30, metric='mahalanobis', metric_kwds={'V': np.eye(50)})
umap_model = umap.fit_transform(matrix)
Resulting in
Failed at nopython (nopython mode backend)
LLVM IR parsing error
<string>:1121:137: error: invalid use of function-local name
%".786" = extractvalue [1 x {i8*, i8*, i64, i64, double*, [2 x i64], [2 x i64]}] [{i8*, i8*, i64, i64, double*, [2 x i64], [2 x i64]} %".785"], 0
Full stack trace (redacted for brevity) here: https://gist.github.com/johanbev/1918108e65f014600b2e44affcc35fee
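A note for anyone reproducing this: because the example passes np.eye(50), the Mahalanobis distance should reduce to the plain Euclidean distance, so a run with metric='euclidean' can serve as a reference for what the embedding ought to look like. Below is a minimal pure-Python sketch of that equivalence. It assumes the matrix enters the usual formula sqrt((x - y)^T M (x - y)) as the inverse covariance; the function names are illustrative only and are not UMAP's API.

```python
import math
import random


def mahalanobis(x, y, vinv):
    """Return sqrt((x - y)^T vinv (x - y)) for plain Python lists.

    vinv is assumed to be the inverse covariance matrix, given as a list of rows.
    """
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for i, row in enumerate(vinv):
        # i-th component of (vinv @ diff), weighted by diff[i]
        total += diff[i] * sum(r * d for r, d in zip(row, diff))
    return math.sqrt(total)


def euclidean(x, y):
    """Return the ordinary Euclidean distance between two lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def identity(n):
    """Build an n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


random.seed(0)
dim = 50  # same dimensionality as the reproduction above
x = [random.random() for _ in range(dim)]
y = [random.random() for _ in range(dim)]

d_mahalanobis = mahalanobis(x, y, identity(dim))
d_euclidean = euclidean(x, y)

# With an identity matrix the two distances agree up to rounding.
print(math.isclose(d_mahalanobis, d_euclidean))
```

With a non-identity matrix the two metrics differ, so this check only applies to the specific reproduction shown here.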
Top GitHub Comments
Thanks for the quick reply! It seems that this issue is caused by jitting the make_nn_descent() in nopython mode. My guess is that njit()-variant cannot access the outer *dist_args parameter somehow. Changing this into @jit and turning off parallelization fixed the issue. Same issue and solution with seuclidean.
Very cursory benchmarking with euclidean distance suggests that there is no performance impact with running nn_descent in object-mode.

Thanks for the report – I’m not sure quite what the cause is here, but I’ll try to track down what might be at issue soon. I’m working on other projects right now, so it might be a little while before I can get to this in earnest. Sorry.
On Fri, Sep 14, 2018 at 6:15 AM Johan notifications@github.com wrote: