UMAP crashes on my computer with 900,000 points
Hi, I have been trying to embed 900,000 points using UMAP on my computer. The program eventually gets killed by the system. I tried running it both in Jupyter and in a terminal.
My system: 16-core/32-thread AMD CPU, 128 GB RAM (the terminal reports 125 GB), Ubuntu 18.04.3 LTS.
I was wondering whether this is a system requirement issue or an issue with how UMAP handles this many points. (Judging by the paper, UMAP can handle millions of points, as it includes a visualization of 3 million points.)
Here is code that reproduces the error on my computer:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
import umap

# 900,000 random points in 1,000 dimensions
X_main = np.random.rand(900000, 1000)

# Reduce to 50 dimensions with PCA before running UMAP
pca = PCA(n_components=50)
X_train = pca.fit_transform(X_main)

n_components = 2
n_neighbors = 50
MIN_DIST = 0.1

ump = umap.UMAP(n_neighbors=n_neighbors,
                min_dist=MIN_DIST,
                n_components=n_components,
                random_state=100,
                metric='euclidean')
y_umap = ump.fit_transform(X_train)
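For a sense of scale, here is a rough back-of-the-envelope sketch of how much memory the arrays in the script above take on their own. It only counts what can be read off the script: the float64 input, the float64 PCA output, and a 50-nearest-neighbor table. The 4-byte index plus 4-byte distance per neighbor entry is an assumption about how the neighbor table is stored, and the helper `array_gib` is made up for this illustration. UMAP's internal working memory is not estimated here.

```python
def array_gib(rows, cols, bytes_per_item=8):
    """Size in GiB of a dense rows x cols array (8 bytes per item = float64)."""
    return rows * cols * bytes_per_item / 2**30


# Sizes taken from the reproduction script
n_points = 900_000
n_features = 1000
n_pca = 50
n_neighbors = 50

# np.random.rand returns float64, so 8 bytes per value
x_main_gib = array_gib(n_points, n_features)

# PCA output for float64 input is also float64
x_train_gib = array_gib(n_points, n_pca)

# Assumption: one 4-byte index and one 4-byte distance per neighbor entry
knn_gib = array_gib(n_points, n_neighbors, bytes_per_item=4 + 4)

print(f"X_main  (input)      : {x_main_gib:.2f} GiB")
print(f"X_train (PCA output) : {x_train_gib:.2f} GiB")
print(f"kNN table (assumed)  : {knn_gib:.2f} GiB")
```

The input array comes to roughly 6.7 GiB and the PCA output to about a third of a GiB, both small next to 128 GB of RAM. If memory is what kills the job, the pressure would have to come from intermediate structures rather than from these arrays.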
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:8 (2 by maintainers)
Top GitHub Comments
The most likely reason for a silent crash, with the system killing the job, is a memory issue. UMAP can be pretty memory hungry (newer development versions are working to fix this). One option is to try
low_memory=True
which uses a sometimes slower but less memory-hungry approach. Another option is to install the latest version of pynndescent (0.5 or newer).
Thank you for your answers! I have a couple of new insights on this: