[Question] Best way to create a bidiagonal LazyTensor
See original GitHub issue

Hi everyone,
I am looking for some advice on how to create a specific LazyTensor such that matrix products/inversions are done efficiently.
Goal
My goal is to create the following LazyTensor, of size n x (n+1):
[[1, -g, 0, 0, ..., 0, 0],
[0, 1, -g, 0, ..., 0, 0],
...
[0, 0, 0, 0, ..., -g, 0],
[0, 0, 0, 0, ..., 1, -g]]
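For reference, the dense version of this matrix is easy to build directly (an illustrative helper, not from the issue; the whole point of the question is to avoid materializing it for large n):

```python
import torch

def dense_bidiagonal(g, n):
    # Dense n x (n+1) reference matrix: 1 on the main diagonal,
    # -g on the superdiagonal, zeros everywhere else.
    A = torch.zeros(n, n + 1)
    idx = torch.arange(n)
    A[idx, idx] = 1.0
    A[idx, idx + 1] = -g
    return A
```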
Current solution and issues
The current way I do this is:
from gpytorch.lazy import ZeroLazyTensor, DiagLazyTensor, CatLazyTensor
import torch
g = 0.5
n = 5
eye_n = DiagLazyTensor(torch.tensor([1.] * n))
g_n = DiagLazyTensor(torch.tensor([-g] * n))
# Create two different zero_col to make sure gradients are
# correctly backpropagated
zero_col_1 = ZeroLazyTensor(n, 1, dtype=torch.float)
zero_col_2 = ZeroLazyTensor(n, 1, dtype=torch.float)
diag_part = CatLazyTensor(eye_n, zero_col_1, dim=-1)
superdiag_part = CatLazyTensor(zero_col_2, g_n, dim=-1)
out = diag_part + superdiag_part
print(out.evaluate())
This creates the correct tensor, but it scales very poorly: because the tensor has a complicated structure, it cannot be inverted or multiplied efficiently. Consequently, when I use this tensor in a custom kernel (with n the number of points), I can only use the GP on a few thousand samples, after which my GPU's memory is not large enough for the operations.
Is there a more efficient way to define this LazyTensor?
Issue Analytics
- Created: 3 years ago
- Comments: 6 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Wonderful, thank you. I will keep this thread open for now just in case, and I will close it when I manage to implement the custom LazyTensor.
I’d write your own lazy tensor.
All a `LazyTensor` is, is a way to define how to do operations efficiently on a matrix whose entries are specified in a potentially unusual way. In your case, you could imagine a LazyTensor that takes `g` as input and defines, at a minimum, the three abstract methods `_matmul`, `_size`, and `_transpose_nonbatch`, as well as `_solve`, since bidiagonal matrices can't use CG for solves. Something like this might be the minimum viable implementation:
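A minimal sketch of the `_matmul` arithmetic, in plain PyTorch (the function name and framing here are illustrative, not from the thread; a real implementation would put this inside a `gpytorch.lazy.LazyTensor` subclass):

```python
import torch

def bidiag_matmul(g, rhs):
    # A is n x (n+1) with 1 on the diagonal and -g on the superdiagonal,
    # so row i of A @ rhs is rhs[i] - g * rhs[i + 1]: a shifted difference,
    # computed in O(n) memory without ever materializing A.
    return rhs[:-1] - g * rhs[1:]
```

Inside the subclass, `_matmul(self, rhs)` would return exactly this, `_size` would report `torch.Size([n, n + 1])`, and `_transpose_nonbatch` would return the analogous lower-bidiagonal tensor.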
Now, in your specific case I wouldn't stop there. By just defining `_matmul`, the default behavior of `_solve` (the method that computes A^{-1}b with the lazy tensor) will be to use CG, which obviously isn't applicable to bidiagonal matrices. With a bidiagonal matrix with constant diagonals we'd need to write our own `_solve` method, which we can do because we know that the inverse of such a matrix is upper triangular with a specific structure.

In particular, if C = A^{-1} with A having your specified structure, then C[i, j] = 0 whenever i > j (i.e., C is upper triangular), and C[i, j] = g^(j - i) whenever i <= j (assuming 0-indexing; in particular, the diagonal of C is all ones). In other words, the jth superdiagonal of C is constant with value g^j, where the "0th" superdiagonal is taken to be the diagonal itself.

Thus, you could override `_solve` to exploit this structure. If you wanted to keep things linear in space, you'd need to code up your own version of backward substitution specific to this C, which shouldn't be too hard.

A good place to start would be to look at `TriangularLazyTensor` here, since your new LT has very similar properties: it isn't positive definite, so many of the methods overridden in `TriangularLazyTensor` will also be overridden in `BidiagonalLazyTensor`.
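The backward-substitution idea, for the square n x n analogue of A (1 on the diagonal, -g on the superdiagonal), might look like the following sketch; the function name is illustrative, not from the thread:

```python
import torch

def bidiag_solve(g, b):
    # Solve A x = b for the square bidiagonal A with 1 on the diagonal and
    # -g on the superdiagonal. Row i reads x[i] - g * x[i + 1] = b[i], so
    # starting from x[n-1] = b[n-1] we substitute backwards:
    # x[i] = b[i] + g * x[i + 1]. O(n) time and space, no CG required.
    x = b.clone()
    for i in range(b.shape[0] - 2, -1, -1):
        x[i] = b[i] + g * x[i + 1]
    return x
```

Unrolling the recursion gives x[i] = sum over j >= i of g^(j - i) * b[j], which matches the upper-triangular structure of A^{-1} discussed above.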