Unstable pairwise_distances


Description

On some machines, the code below prints a nonzero value, even though both calls compute the distance between the same two rows.

Steps/Code to Reproduce

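import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

np.random.seed(42)

# With the default float64 dtype (line below), the difference is not observed:
# l = np.array(np.random.rand(9, 10000))
l = np.array(np.random.rand(9, 10000), dtype=np.float32)
m1 = pairwise_distances(l)      # distances between all 9 rows
m2 = pairwise_distances(l[:4])  # distances between the first 4 rows only

# print(m1[0, 1])
# print(m2[0, 1])
# Both entries are the distance between rows 0 and 1, so this should be 0.0
print(m2[0, 1] - m1[0, 1])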

The bug also occurs with the cosine metric, but not with cityblock.
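One plausible reading of why euclidean and cosine are affected while cityblock is not: the scikit-learn documentation for euclidean_distances says the distance is computed as sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y)), and notes that this is not the most precise formulation. The dot products go through matrix multiplication, whose rounding can depend on the matrix shape and the underlying BLAS build. The sketch below uses NumPy only and hypothetical helper names (euclidean_expanded, euclidean_direct); it is not scikit-learn's implementation, just an illustration of how the expanded form loses precision in float32 compared with a float64 reference.

```python
import numpy as np

rng = np.random.RandomState(42)
X = rng.rand(9, 10000).astype(np.float32)


def euclidean_expanded(A):
    """sqrt(||x||^2 - 2 x.y + ||y||^2), evaluated in the dtype of A."""
    sq_norms = np.einsum("ij,ij->i", A, A)
    d2 = sq_norms[:, None] - 2.0 * (A @ A.T) + sq_norms[None, :]
    np.maximum(d2, 0, out=d2)   # rounding can push tiny values below zero
    d = np.sqrt(d2)
    np.fill_diagonal(d, 0)      # self-distances are zero by definition
    return d


def euclidean_direct(A):
    """sqrt(sum((x - y)^2)), evaluated in the dtype of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt(np.sum(diff * diff, axis=-1))


# float64 direct computation serves as the reference value
reference = euclidean_direct(X.astype(np.float64))

err_expanded = np.abs(euclidean_expanded(X) - reference).max()
err_direct = np.abs(euclidean_direct(X) - reference).max()
print("max error, expanded form in float32:", err_expanded)
print("max error, direct form in float32:  ", err_direct)
```

The exact error values depend on the machine and the NumPy build, which is consistent with the report that only some configurations show the discrepancy.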

Expected Results

0.0

Actual Results

1.1444092e-05
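To put that number in perspective, here is a small sketch that expresses the reported difference in units of float32 spacing. It assumes the distance between two rows is around sqrt(10000 / 6), since the expected squared difference of two independent uniform [0, 1) values is 1/6; that magnitude is an estimate, not a value taken from the issue. The helper same_up_to_rounding is a hypothetical name showing a tolerance-based comparison instead of an exact one.

```python
import numpy as np

reported_diff = np.float32(1.1444092e-05)

# Assumption: typical distance between two rows of 10000 uniform samples
typical_distance = np.float32(np.sqrt(10000 / 6.0))

# Gap between adjacent float32 values at that magnitude
ulp = np.spacing(typical_distance)
print("float32 spacing near the distance:", ulp)
print("reported difference in spacings:  ", reported_diff / ulp)


def same_up_to_rounding(a, b, rtol=1e-5):
    """Compare distances with a relative tolerance instead of exact equality."""
    return np.allclose(a, b, rtol=rtol, atol=0.0)


# Two values that differ by the reported amount compare as equal
print(same_up_to_rounding(typical_distance, typical_distance + reported_diff))
```

Under that assumption the difference amounts to a few float32 spacings, so comparing results with a tolerance, or casting the input to float64 before computing distances, would hide it.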

Versions

Darwin-16.7.0-x86_64-i386-64bit
Python 3.6.4 |Anaconda, Inc.| (default, Mar 12 2018, 20:05:31) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.15.0
SciPy 1.1.0
Scikit-Learn 0.19.1

With this configuration, the bug does not occur:

Linux-4.4.0-130-generic-x86_64-with-debian-stretch-sid
Python 3.6.4 |Anaconda custom (64-bit)| (default, Jan 16 2018, 18:10:19) 
[GCC 7.2.0]
NumPy 1.15.0
SciPy 1.1.0
Scikit-Learn 0.19.1

The problem might come from another component.

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 3
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

ManifoldFR commented, Jul 30, 2018 (2 reactions)

I’m having the same problem with:

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32

Same NumPy, SciPy, and scikit-learn versions, on Windows 10 build 17134.

rth commented, Aug 1, 2018 (1 reaction)

Thanks for the detailed report in any case, @louisabraham!

