linalg.solve broadcasting behavior is ambiguous
The spec for the `linalg.solve` function seems ambiguous. In `solve(x1, x2)`, `x1` has shape `(..., M, M)` and `x2` has shape either `(..., M)` or `(..., M, K)`. In either case, the `...` parts should be broadcast compatible.
This is ambiguous. For example, if `x1` has shape `(2, 2, 2)` and `x2` has shape `(2, 2)`, should `x2` be interpreted as a `(2,)` stack of `(2,)` vectors, i.e., the result would be `(2, 2, 2, 1)` after broadcasting, or as a single `2x2` matrix broadcast against the stack, i.e., resulting in `(2, 2, 2, 2)`?
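The two competing readings can be forced explicitly in NumPy by reshaping, which sidesteps the ambiguity entirely (a sketch; the concrete matrices are illustrative):

```python
import numpy as np

# A (2,) stack of invertible 2x2 matrices, and an x2 of shape (2, 2).
x1 = np.stack([np.eye(2), 2 * np.eye(2)])   # shape (2, 2, 2)
x2 = np.arange(4.0).reshape(2, 2)           # shape (2, 2)

# Interpretation 1: x2 is a (2,) stack of length-2 vectors.
# Make that reading explicit by appending a size-1 column axis.
res_vec = np.linalg.solve(x1, x2[..., None])
print(res_vec.shape)  # (2, 2, 1)

# Interpretation 2: x2 is a single 2x2 matrix, broadcast against the stack.
# Make that reading explicit by prepending a size-1 stack axis.
res_mat = np.linalg.solve(x1, x2[None, ...])
print(res_mat.shape)  # (2, 2, 2)
```

Both calls are unambiguous because `x2` ends up with the same number of dimensions as `x1`, so only the matrix-stack reading applies.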
- Relevant PyTorch issue: https://github.com/pytorch/pytorch/issues/52915
- Relevant NumPy issue: https://github.com/numpy/numpy/issues/15349
- `torch.linalg.solve` docs: https://pytorch.org/docs/stable/generated/torch.linalg.solve.html
- `numpy.linalg.solve` docs: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html#numpy.linalg.solve
Regarding NumPy, it seems to sometimes pick one interpretation over the other, even when only the other one makes sense. For example:

```python
>>> x1 = np.eye(1)
>>> x2 = np.asarray([[0.], [0.]])
>>> x1.shape
(1, 1)
>>> x2.shape
(2, 1)
>>> np.linalg.solve(x1, x2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 5, in solve
  File "/Users/aaronmeurer/anaconda3/envs/array-apis/lib/python3.9/site-packages/numpy/linalg/linalg.py", line 393, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
ValueError: solve: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 2 is different from 1)
```
Here it wants to treat `x2` as a single `2x1` matrix, which is shape-incompatible with the `1x1` `x1`, but it could also treat it as a `(2,)` stack of length-1 vectors.
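The stack-of-vectors reading of this example does succeed if it is spelled out explicitly, by appending a size-1 column axis so that `x2` becomes a `(2,)` stack of `1x1` matrices (a sketch of the workaround, not a fix for the ambiguity itself):

```python
import numpy as np

x1 = np.eye(1)                  # shape (1, 1)
x2 = np.asarray([[0.], [0.]])   # shape (2, 1)

# Appending a trailing axis makes x2 a (2,) stack of 1x1 matrices,
# which broadcasts against the single 1x1 matrix in x1.
res = np.linalg.solve(x1, x2[..., None])
print(res.shape)  # (2, 1, 1)
```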
I think there are also some issues with the way the spec describes broadcasting. It says "`shape(x2)[:-1]` must be compatible with `shape(x1)[:-1]`", but I think this should be `shape(x2)[:-2]` with `shape(x1)[:-2]`, and so on, since matrix dimensions should never broadcast with each other. It also says that the output should always have the same shape as `x2`, which contradicts the requirement that the inputs broadcast together.
If I am reading the PyTorch docs correctly, it resolves this by only allowing broadcasting when `x2` is exactly 1- or 2-dimensional. Otherwise, when `x2` is a stack of matrices, the stack part of its shape has to match the stack part of `shape(x1)` exactly.
However, I think this is still ambiguous in the case I noted above, where `x1` is `(2, 2, 2)` and `x2` is `(2, 2)`: `x2` could be a single matrix, which would broadcast, or a `(2,)` stack of `(2,)` vectors, whose stack shape matches that of `x1`.
So I think more is required to disambiguate, e.g., only allow broadcasting for matrices and not for vectors. One could also remove the vector case completely, or only allow it in the simple case of `x2` being 1-D (i.e., no stacks of 1-D vectors).
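The last proposal can be made concrete with a small shape-inference sketch. This is a hypothetical helper written for this discussion, not part of any library: `x2` is treated as a vector only when it is exactly 1-D, and otherwise the matrix case applies with broadcasting over the stack dimensions only.

```python
from itertools import zip_longest


def broadcast_shapes(a, b):
    # Standard right-aligned broadcasting of two shape tuples.
    out = []
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcast compatible")
        out.append(max(da, db))
    return tuple(reversed(out))


def solve_result_shape(x1_shape, x2_shape):
    """Result shape of solve(x1, x2) under the proposed rule:
    x2 is a vector only when it is exactly 1-D (no stacks of vectors)."""
    m = x1_shape[-1]
    if x1_shape[-2] != m:
        raise ValueError("x1 must be a stack of square matrices")
    if len(x2_shape) == 1:
        # Vector case: no broadcasting beyond the stack of x1.
        if x2_shape[0] != m:
            raise ValueError("dimension mismatch")
        return x1_shape[:-2] + (m,)
    # Matrix case: broadcast only the stack parts, i.e. everything
    # except the last two (matrix) dimensions.
    if x2_shape[-2] != m:
        raise ValueError("dimension mismatch")
    stack = broadcast_shapes(x1_shape[:-2], x2_shape[:-2])
    return stack + (m, x2_shape[-1])
```

Under this rule the ambiguous case above has exactly one answer: `solve_result_shape((2, 2, 2), (2, 2))` is the matrix reading, giving `(2, 2, 2)`, while the vector reading is only reachable via a 1-D `x2`.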
Issue Analytics
- State:
- Created: 2 years ago
- Reactions: 1
- Comments: 16 (11 by maintainers)
Top GitHub Comments
@IvanYashchuk As I mentioned in yesterday’s call this is not the case. There are ways to make only 1 call, which is an implementation detail, so this alone is not enough to justify the API design decision we agreed upon.
@leofang Already on today’s agenda. 😅