Different algorithms giving different results in `shortest_path`
It seems that, at least when masked arrays are passed to `shortest_path` and some entries are 0, the output is algorithm-dependent.
In my application, a 0 in a dense graph passed to `shortest_path` should mean an edge with no cost, not an edge with infinite cost (an absent edge). I understand that this is not the view adopted by `shortest_path`, so I have resorted to using masked arrays instead. However, this leads to unexpected discrepancies in the results.
Reproducing code example:
import numpy as np
from scipy.sparse.csgraph import shortest_path
csgraph = np.array(
[[0, 1, 0],
[1, 0, 0],
[0, 0, 0]]
)
csgraph_masked = np.ma.masked_invalid(csgraph)
shortest_FW = shortest_path(csgraph_masked, method='FW', directed=False)
shortest_J = shortest_path(csgraph_masked, method='J', directed=False)
shortest_D = shortest_path(csgraph_masked, method='D', directed=False)
shortest_BF = shortest_path(csgraph_masked, method='BF', directed=False)
Expected result:
The documentation does not suggest that, for undirected graphs with no negative values, different methods should produce different outputs (and since the 'auto' option is available, a user would want a guarantee that they do not). However, in the above example, while the last three results are all equal to
array([[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]])
(which, incidentally, is the one I want), `shortest_FW` equals
array([[ 0., 1., inf],
[ 1., 0., inf],
[inf, inf, 0.]])
My guess is that 'FW' converts to a dense representation and forgets about the mask completely.
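To make the two readings of a 0 entry concrete, here is a minimal pure-Python sketch. It is a textbook Floyd-Warshall, not SciPy's implementation, and the function name `floyd_warshall` and its `absent` parameter are hypothetical names chosen for illustration. Run on the matrix from the report, the "zero is a free edge" reading gives the all-zero result, and the "zero is an absent edge" reading gives the result with `inf` entries.

```python
import math


def floyd_warshall(weights, absent, directed=False):
    """All-pairs shortest paths; absent[i][j] == True means there is no edge i -> j."""
    n = len(weights)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j in range(n):
            if i == j or absent[i][j]:
                continue
            w = float(weights[i][j])
            dist[i][j] = min(dist[i][j], w)
            if not directed:
                # An undirected edge can be traversed both ways.
                dist[j][i] = min(dist[j][i], w)
    # Standard relaxation through every intermediate node k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


graph = [[0, 1, 0],
         [1, 0, 0],
         [0, 0, 0]]

# Reading 1: nothing is masked, so every 0 is a zero-cost edge.
nothing_absent = [[False] * 3 for _ in range(3)]
zero_is_free = floyd_warshall(graph, nothing_absent)

# Reading 2: every 0 is treated as a missing edge.
zeros_absent = [[w == 0 for w in row] for row in graph]
zero_is_missing = floyd_warshall(graph, zeros_absent)

print(zero_is_free)
print(zero_is_missing)
```

Under the first reading, node 2 connects to both other nodes at no cost, which is why the direct edge of weight 1 between nodes 0 and 1 is never the cheapest route.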
Scipy/Numpy/Python version information:
1.4.1 1.18.4 sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)
Comments: 6 (3 by maintainers)
I think the issue is here: https://github.com/scipy/scipy/blob/655ce1bd180e2554a12d501fea0f24a0c6e8123a/scipy/sparse/csgraph/_shortest_path.pyx#L302-L304
This logic doesn’t account for masked arrays with explicit zeros.
I don’t think the mask is being ignored. All algorithms start by calling `validate_graph`, which handles masked inputs: https://github.com/scipy/scipy/blob/655ce1bd180e2554a12d501fea0f24a0c6e8123a/scipy/sparse/csgraph/_validation.py#L28-L34
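As a side check, the mask in the reproduction can be inspected directly. A minimal sketch, assuming only NumPy: `np.ma.masked_invalid` masks NaN and inf entries, and the integer array in the report contains neither, so the resulting mask is empty. The second half shows one way to mark entries as absent explicitly. Masking the diagonal there is an arbitrary choice for illustration, not something taken from the report.

```python
import numpy as np

csgraph = np.array([[0, 1, 0],
                    [1, 0, 0],
                    [0, 0, 0]])

# masked_invalid only masks NaN/inf; this integer array has none of those.
masked = np.ma.masked_invalid(csgraph)
mask_from_repro = np.ma.getmaskarray(masked)  # always a full boolean array
print(mask_from_repro.any())

# Illustration only: mask the diagonal explicitly (hypothetical choice),
# leaving the off-diagonal zeros as unmasked, explicit values.
explicit = np.ma.masked_array(csgraph, mask=np.eye(3, dtype=bool))
explicit_mask = np.ma.getmaskarray(explicit)
print(explicit_mask)
```

`np.ma.getmaskarray` is used rather than the `.mask` attribute because it returns a full boolean array even when no entry is masked.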