non-convergence since upgrade
Hi all, I upgraded gpytorch to test out the new white noise kernel, and now a 3D exact GP classification model (without the white noise addition) that had previously converged is failing. My target data is -1/1.
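(The repro code isn't included in this archive. The following is a minimal sketch of the kind of model described, written against the pre-0.1 gpytorch API whose module paths appear in the tracebacks below — `VariationalGP`, `GaussianRandomVariable`, `VariationalMarginalLogLikelihood`. The `BernoulliLikelihood`, the `n_data` keyword, and the synthetic data are assumptions based on that era's tutorials, not the original code.)

```python
import torch
import gpytorch
from gpytorch.kernels import RBFKernel
from gpytorch.likelihoods import BernoulliLikelihood
from gpytorch.means import ConstantMean
from gpytorch.random_variables import GaussianRandomVariable

# Placeholder data: 3D inputs with targets in {-1, 1}, as in the issue.
train_x = torch.randn(100, 3)
train_y = torch.sign(train_x.sum(dim=-1))

class GPClassificationModel(gpytorch.models.VariationalGP):
    def __init__(self, train_x):
        super(GPClassificationModel, self).__init__(train_x)
        self.mean_module = ConstantMean()
        # Swapping in gpytorch.kernels.MaternKernel(nu=2.5) here would
        # correspond to the second traceback instead of the first.
        self.covar_module = RBFKernel()

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return GaussianRandomVariable(mean_x, covar_x)

model = GPClassificationModel(train_x)
likelihood = BernoulliLikelihood()
# The variational objective from the first traceback; the n_data keyword
# follows the era's tutorials and is an assumption here.
mll = gpytorch.mlls.VariationalMarginalLogLikelihood(
    likelihood, model, n_data=train_y.numel()
)
```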
File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/module.py", line 162, in __call__ outputs = self.forward(*inputs, **kwargs) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/mlls/variational_marginal_log_likelihood.py", line 29, in forward variational_strategy.kl_divergence().sum() for variational_strategy in self.model.variational_strategies() File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/mlls/variational_marginal_log_likelihood.py", line 29, in <genexpr> variational_strategy.kl_divergence().sum() for variational_strategy in self.model.variational_strategies() File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/variational/mvn_variational_strategy.py", line 27, in kl_divergence inv_quad_rhs=inv_quad_rhs, log_det=True File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/lazy_variable.py", line 454, in inv_quad_log_det preconditioner=self._preconditioner()[0], File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/added_diag_lazy_variable.py", line 50, in _preconditioner self._piv_chol_self = pivoted_cholesky.pivoted_cholesky(self._lazy_var, max_iter) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/utils/pivoted_cholesky.py", line 97, in pivoted_cholesky return L[0, :m, :] RuntimeError: dimension out of range (expected to be in range of [-1, 0], but got 1)
And with Matern 2.5:
File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/models/variational_gp.py", line 37, in __call__ prior_output = self.prior_output() File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/models/abstract_variational_gp.py", line 65, in prior_output res = GaussianRandomVariable(res.mean(), res.covar().evaluate_kernel()) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/lazy_variable.py", line 274, in evaluate_kernel return self.representation_tree()(*self.representation()) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/lazy_variable.py", line 578, in representation_tree return LazyVariableRepresentationTree(self) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/lazy_variable_representation_tree.py", line 14, in __init__ for arg in lazy_var._args: File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/mul_lazy_variable.py", line 73, in _args right_lazy_var = RootLazyVariable(right_lazy_var.root_decomposition()) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/lazy/lazy_variable.py", line 593, in root_decomposition )(*self.representation()) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/functions/_root_decomposition.py", line 54, in forward eigenvalues, eigenvectors = lanczos_tridiag_to_diag(t_mat) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/utils/lanczos.py", line 170, in lanczos_tridiag_to_diag return batch_symeig(t_mat) File "/Users/stanleybiryukov/miniconda2/envs/idp/lib/python2.7/site-packages/gpytorch/utils/eig.py", line 23, in batch_symeig evals, evecs = mat[i, j].symeig(eigenvectors=True) RuntimeError: Lapack Error syev : 2 off-diagonal elements didn't converge to zero at /Users/soumith/minicondabuild3/conda-bld/pytorch_1524587833086/work/aten/src/TH/generic/THTensorLapack.c:388
Please let me know if I can provide additional details to help with debugging.
Top GitHub Comments
As a side note, because the original code was unintentionally using some fairly complicated machinery intended for handling product kernels, training should also be much faster without the `expand_as`.

Excellent, thanks for catching this.
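(The model code behind this exchange isn't quoted in the archive. Based on the `MulLazyVariable` frames in the Matern traceback, a purely hypothetical before/after reconstruction is sketched below; `self.log_outputscale` and the surrounding class body are assumed, not taken from the issue.)

```python
from gpytorch.random_variables import GaussianRandomVariable

# Hypothetical reconstruction -- the original forward() isn't quoted here.
# `self.log_outputscale` is an assumed registered scalar parameter.

def forward_before(self, x):
    mean_x = self.mean_module(x)
    covar_x = self.covar_module(x)
    # Expanding the scalar hyperparameter to the covariance's shape turns the
    # elementwise multiply into a lazy product of two kernels (the
    # MulLazyVariable in the Matern traceback), pulling in the
    # product-kernel machinery.
    covar_x = covar_x.mul(self.log_outputscale.exp().expand_as(covar_x))
    return GaussianRandomVariable(mean_x, covar_x)

def forward_after(self, x):
    mean_x = self.mean_module(x)
    covar_x = self.covar_module(x)
    # Multiplying by the scalar directly keeps this a constant rescaling of
    # a single kernel -- mathematically equivalent, and much faster to train.
    covar_x = covar_x.mul(self.log_outputscale.exp())
    return GaussianRandomVariable(mean_x, covar_x)
```

The design point, if this reconstruction is right: a scalar multiply stays a cheap constant rescaling of one kernel, whereas an expanded-tensor multiply is treated as an elementwise product of two kernels, which requires the root decomposition that fails in the Matern traceback above.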