BUG: Different behaviour of scipy.spatial.distance.cdist depending on parameters

Describe your issue.

The function scipy.spatial.distance.cdist behaves differently depending on whether the parameter w is specified: if w is omitted, the C implementation of the distance measure is used; if w is passed explicitly, the Python implementation is used instead. For the cosine metric, the two implementations disagree when one of the vectors is all zeros.

Reproducing Code Example

import numpy, scipy.spatial
v1 = numpy.ndarray((1, 4))
v2 = numpy.ndarray((1, 4))
v1[0] = [0, 1, 0, 0]
v2[0] = [0, 1, 0, 0]

# The 'cosine' metric is the cosine distance: 1 minus the cosine of the angle between the two vectors

print(scipy.spatial.distance.cdist(v1, v2, 'cosine')) # no parameter `w` so the C implementation of `cosine` is used
# array([[0.]])    -> 0 means that we have a 0 degree angle, so a perfect match

v2[0] = [0, 0, 0, 0] # Change the vector

print(scipy.spatial.distance.cdist(v1, v2, 'cosine')) # no parameter `w` so the C implementation of `cosine` is used
# array([[nan]])  # nan because the dot product is divided by the product of the vector norms, which is 0 here

print(scipy.spatial.distance.cdist(v1, v2, 'cosine', w=[1,1,1,1])) # parameter `w` has been specified so the Python implementation of `cosine` is used
# array([[0.]])  # Different behavior
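To see where the nan comes from, the unweighted cosine distance can be computed by hand. This is a minimal NumPy-only sketch of the textbook formula, not SciPy's actual C or Python code; the helper name cosine_distance is made up for illustration.

```python
import numpy as np

def cosine_distance(u, v):
    """Textbook cosine distance: 1 - (u . v) / (||u|| * ||v||).

    Hypothetical helper for illustration; it does no special-casing
    of zero vectors.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    # Silence the 0/0 warning; the division still evaluates to nan.
    with np.errstate(invalid="ignore", divide="ignore"):
        return 1.0 - np.dot(u, v) / denom

same = cosine_distance([0, 1, 0, 0], [0, 1, 0, 0])  # identical vectors
zero = cosine_distance([0, 1, 0, 0], [0, 0, 0, 0])  # second vector is all zeros
print(same, zero)
```

With an all-zero vector both the dot product and the norm product are 0, so the ratio is 0/0 and the distance is undefined. Returning nan reflects that; returning 0 would report the zero vector as a perfect match.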

Error message

None

SciPy/NumPy/Python version information

1.8.0 1.22.3 sys.version_info(major=3, minor=10, micro=4, releaselevel='final', serial=0)

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
czgdp1807 commented, May 18, 2022

I think #14611 fixes the issue. Output I got on my PR,

[[0.]]
[[nan]]
[[nan]]
1 reaction
czgdp1807 commented, May 11, 2022

Update - Went out of my mind. Will do the checking tomorrow for sure. Apologies.
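Until a fix is available in the installed SciPy version, one possible workaround is to compute the pairwise cosine distances directly, so that the result does not depend on whether w is passed. The sketch below assumes that nan for zero-norm vectors is the desired behaviour (matching the output reported for #14611 above) and that the weighted form is 1 - sum(w*u*v) / sqrt(sum(w*u*u) * sum(w*v*v)) with non-negative weights. The function name weighted_cosine_cdist is hypothetical and not part of SciPy.

```python
import numpy as np

def weighted_cosine_cdist(XA, XB, w=None):
    """Pairwise cosine distance between rows of XA and rows of XB.

    Hypothetical helper, not a SciPy function. Pairs involving a
    zero-norm row yield nan, with or without weights.
    """
    XA = np.asarray(XA, dtype=float)
    XB = np.asarray(XB, dtype=float)
    # Assumption: omitting w is equivalent to unit weights.
    w = np.ones(XA.shape[1]) if w is None else np.asarray(w, dtype=float)

    dots = (XA * w) @ XB.T                     # weighted dot products
    norm_a = np.sqrt((w * XA**2).sum(axis=1))  # weighted norms, rows of XA
    norm_b = np.sqrt((w * XB**2).sum(axis=1))  # weighted norms, rows of XB
    denom = np.outer(norm_a, norm_b)

    with np.errstate(invalid="ignore", divide="ignore"):
        dist = 1.0 - dots / denom
    dist[denom == 0] = np.nan                  # undefined for zero vectors
    return dist

v1 = np.array([[0.0, 1.0, 0.0, 0.0]])
v2 = np.array([[0.0, 0.0, 0.0, 0.0]])
print(weighted_cosine_cdist(v1, v1))                   # identical vectors
print(weighted_cosine_cdist(v1, v2))                   # no weights
print(weighted_cosine_cdist(v1, v2, w=[1, 1, 1, 1]))   # unit weights
```

Both calls with the zero vector go through the same code path, so passing w=[1, 1, 1, 1] cannot change the result.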
