numpy.einsum new feature request: repeated output subscripts as diagonal
I think that the following new feature would make numpy.einsum even more powerful/useful/awesome than it already is. Moreover, the change should not interfere with existing code, it would preserve the “minimalistic” spirit of numpy.einsum, and the new functionality would integrate in a seamless/intuitive manner for the users.
In short, the new feature would allow for repeated subscripts to appear in the “output” part of the subscripts parameter (i.e., on the right-hand side of ->). The corresponding dimensions in the resulting ndarray would only be filled along their diagonal, leaving the off-diagonal entries at the default value for this dtype (typically zero). Note that the current behavior is to raise an exception when repeated output subscripts are used.
This is simplest to describe with an example involving the dual behavior of numpy.diag.
from numpy import arange, diag, einsum
# Extracting the diagonal of a 2-D array.
A = arange(16).reshape(4,4)
print(diag(A)) # Output: [ 0 5 10 15 ]
print(einsum('ii->i', A)) # Same as previous line (current behavior).
# Constructing a diagonal 2-D array.
v = arange(4)
print(diag(v)) # Output: [[0 0 0 0] [0 1 0 0] [0 0 2 0] [0 0 0 3]]
print(einsum('i->ii', v)) # New behavior would be same as previous line.
# The current behavior of the previous line is to raise an exception.
Unlike numpy.diag, the approach generalizes to higher dimensions: einsum('iii->i', A) extracts the diagonal of a 3-D array (this already works), and einsum('i->iii', v) would build a diagonal 3-D array.
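Until something like this exists, the "build" direction can be emulated with current NumPy. The following is only a sketch, assuming NumPy 1.10 or later, where a single-operand einsum such as 'ii->i' returns a writeable view of the diagonal: allocate zeros, then write through that view.

```python
import numpy as np

v = np.arange(4)

# Emulates the proposed einsum('i->ii', v): zero-fill, then write the diagonal.
D2 = np.zeros((4, 4), dtype=v.dtype)
np.einsum('ii->i', D2)[...] = v   # the view aliases D2's main diagonal

# Emulates the proposed einsum('i->iii', v) for a 3-D array.
D3 = np.zeros((4, 4, 4), dtype=v.dtype)
np.einsum('iii->i', D3)[...] = v  # writes D3[k, k, k] = v[k]

print(D2)
print(D3[3, 3, 3])
```

This works but needs the output shape spelled out by hand, which is exactly the bookkeeping the proposed notation would remove.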
The proposed behavior really starts to shine in more intricate cases.
from numpy import arange, einsum, eye
# Dummy values, these should be probabilities to make sense below.
P_w_ab = arange(24).reshape(3,2,4)
P_y_wxab = arange(144).reshape(3,3,2,2,4)
# With the proposed behavior, the following two lines should be equivalent.
P_xyz_ab = einsum('wab,xa,ywxab,zy->xyzab', P_w_ab, eye(2), P_y_wxab, eye(3))
also_P_xyz_ab = einsum('wab,ywaab->ayyab', P_w_ab, P_y_wxab)
If this is not convincing enough, replace eye(2) by eye(P_w_ab.shape[1]) and replace eye(3) by eye(P_y_wxab.shape[0]), then imagine more dimensions and repeated indices… The new notation would allow for crisper code and reduce the opportunities for dumb mistakes.
For those who wonder, the above computation amounts to $P(X=x,Y=y,Z=z|A=a,B=b) = \sum_w P(W=w|A=a,B=b) P(X=x|A=a) P(Y=y|W=w,X=x,A=a,B=b) P(Z=z|Y=y)$ with $P(X=x|A=a)=\delta_{xa}$ and $P(Z=z|Y=y)=\delta_{zy}$ (using LaTeX notation, and $\delta_{ij}$ is Kronecker’s delta).
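The claimed equivalence can be checked with current NumPy. This is a sketch rather than the proposed implementation: the repeated-output form 'wab,ywaab->ayyab' is emulated by first computing 'wab,ywaab->ayb' and then writing the result onto the diagonals of a zero-filled array through a writeable einsum view (assuming NumPy 1.10 or later). The names reference, core and emulated are illustrative only.

```python
import numpy as np

# Same dummy values as in the example above.
P_w_ab = np.arange(24).reshape(3, 2, 4)
P_y_wxab = np.arange(144).reshape(3, 3, 2, 2, 4)

n_a, n_b = P_w_ab.shape[1], P_w_ab.shape[2]
n_y = P_y_wxab.shape[0]

# Currently supported form, with explicit Kronecker deltas as identity matrices.
reference = np.einsum('wab,xa,ywxab,zy->xyzab',
                      P_w_ab, np.eye(n_a), P_y_wxab, np.eye(n_y))

# Step 1: the same contraction with x := a and z := y substituted.
core = np.einsum('wab,ywaab->ayb', P_w_ab, P_y_wxab)

# Step 2: place core on the (x == a, z == y) diagonals of a zero array.
emulated = np.zeros((n_a, n_y, n_y, n_a, n_b), dtype=core.dtype)
np.einsum('ayyab->ayb', emulated)[...] = core

print(np.array_equal(reference, emulated))
```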
Issue Analytics
- Created 9 years ago
- Reactions: 3
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Let's say that you have an axis a of length 100 that you want to have three times in your output. If you do the math, that means that only 100 in 1,000,000, or 1 in 10,000, entries are going to be different from 0. Even in the simplest case of an axis of length 2, repeated only twice, half of the entries are going to be 0. You do not want to step over every item of the output array, check whether it is on a diagonal, and put a zero in there if it is not: you memset everything to 0 at the beginning, and then fill in the right positions. If I am not mistaken, it turns out that, for diagonals, you can access them without any checking with some striding magic. If you, e.g., wanted to do an operation like 'ijk,ik->ijiji', which is a madman's storage for the products of a stack of matrices with a stack of vectors, you could simply do the following from your Python interpreter:

You can probably get a general case function in about 50 lines of Python code.

I have just looked at einsum.c.src and there are probably something like 500 lines of painful-to-read code, spread over three functions, to do the subscript parsing… It is probably better, and infinitely simpler, to write a new diagonals method that uses einsum's subscript notation, perhaps inside numpy.lib.stride_tricks, to build or extract diagonals from arbitrary dimensional arrays. So your example above would turn into:

Umh. Consider the following hack: I implement a diagonals such as the one you propose. I then edit einsum at the place where it would usually raise an exception to do the following:
- Call einsum without repeated indices in the outputs (such as your 'wab,ywaab->ayb').
- Call diagonals to add in the missing dimensions (such as your 'ayb->ayyab').
This way, I can neither break nor slow down the current einsum, unless the user's call would have raised an error anyway. Thoughts?

(BTW, I sent this to the mailing list: http://mail.scipy.org/pipermail/numpy-discussion/2014-August/070969.html )
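The two-step hack from the last comment can be sketched in pure Python. Both functions below are hypothetical: diagonals is not an existing NumPy function, and einsum_diag is an illustrative wrapper name, not a patch to einsum. The sketch assumes explicit '->' subscripts without ellipsis, and relies on single-operand einsum returning a writeable view (NumPy 1.10 or later) instead of hand-written stride arithmetic.

```python
import numpy as np


def diagonals(subscripts, arr):
    """Hypothetical helper: 'ayb->ayyab' builds, 'ayyab->ayb' extracts."""
    arr = np.asarray(arr)
    src, dst = subscripts.replace(" ", "").split("->")
    if len(set(dst)) == len(dst):
        # No repeated output label: einsum already extracts diagonals.
        return np.einsum(f"{src}->{dst}", arr)
    if (len(src) != arr.ndim or len(set(src)) != len(src)
            or set(src) != set(dst)):
        raise ValueError("building needs unique input labels matching the output")
    sizes = dict(zip(src, arr.shape))
    # Zero-fill once, then write only the diagonal entries through a view.
    out = np.zeros([sizes[c] for c in dst], dtype=arr.dtype)
    np.einsum(f"{dst}->{src}", out)[...] = arr
    return out


def einsum_diag(subscripts, *operands):
    """Hypothetical wrapper: einsum that accepts repeated output subscripts."""
    inputs, dst = subscripts.replace(" ", "").split("->")
    # Step 1: plain einsum with each output label kept once (first-seen order).
    reduced = "".join(dict.fromkeys(dst))
    core = np.einsum(f"{inputs}->{reduced}", *operands)
    if reduced == dst:
        return core  # nothing repeated: identical to the current behavior
    # Step 2: add the missing dimensions along the diagonals.
    return diagonals(f"{reduced}->{dst}", core)


# The "madman's storage" example: a stack of matrices times a stack of vectors.
M = np.arange(24).reshape(2, 3, 4)
V = np.arange(8).reshape(2, 4)
out = einsum_diag("ijk,ik->ijiji", M, V)
print(out.shape)
```

Calls without repeated output subscripts go straight through to einsum, so the extra step only runs for calls that currently raise an exception.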