LinAlgError (Array must not contain infs or NaNs) thrown in get_mu_tensor
Below is a simple piece of code to try YellowFin on my dataset.
x = tf.placeholder( tf.float32, [ None, train_x.shape[ 1 ] ] )
y = tf.placeholder( tf.float32, [ None, train_y.shape[ 1 ] ] )
m = tf.layers.dense( x, hidden_dim )
m = tf.layers.batch_normalization( m )
m = tf.nn.elu( m )
m = tf.layers.dense( m, hidden_dim )
m = tf.layers.batch_normalization( m )
m = tf.nn.elu( m )
m = tf.layers.dense( m, hidden_dim )
m = tf.layers.batch_normalization( m )
m = tf.nn.elu( m )
m = tf.layers.dense( m, train_y.shape[ 1 ] )
prediction = tf.nn.softmax( m )
loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( labels=y, logits=m ) )
optimizer = yellowfin.YFOptimizer().minimize( loss )
s = tf.Session()
s.run( tf.global_variables_initializer() )
for epoch in range( epochs ):
    _, h = s.run( [ optimizer, loss ], feed_dict={ x: train_x, y: train_y } )
Usually, it crashes and throws the following exception.
Caused by op 'update_hyper/cond/PyFuncStateless', defined at:
File "test2.py", line 47, in <module>
optimizer = yf.YFOptimizer( learning_rate=1., momentum=0. ).minimize( loss )
File "/data/python-mp-test/libs/yellowfin.py", line 268, in minimize
return self.apply_gradients(grads_and_vars)
File "/data/python-mp-test/libs/yellowfin.py", line 223, in apply_gradients
update_hyper_op = self.update_hyper_param()
File "/data/python-mp-test/libs/yellowfin.py", line 191, in update_hyper_param
lambda: self._mu_var) )
File "/usr/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 289, in new_func
return func(*args, **kwargs)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1814, in cond
orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1689, in BuildCondBranch
original_result = fn()
File "/data/python-mp-test/libs/yellowfin.py", line 190, in <lambda>
self._mu = tf.identity(tf.cond(self._do_tune, lambda: self.get_mu_tensor(),
File "/data/python-mp-test/libs/yellowfin.py", line 173, in get_mu_tensor
roots = tf.py_func(np.roots, [coef], Tout=tf.complex64, stateful=False)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/script_ops.py", line 201, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/gen_script_ops.py", line 56, in _py_func_stateless
Tout=Tout, name=name)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
UnknownError (see above for traceback): LinAlgError: Array must not contain infs or NaNs
[[Node: update_hyper/cond/PyFuncStateless = PyFuncStateless[Tin=[DT_FLOAT], Tout=[DT_COMPLEX64], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](update_hyper/cond/ScatterUpdate)]]
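For reference, the message itself can be reproduced outside TensorFlow. The traceback shows `np.roots` being called on `coef` through `tf.py_func`; the sketch below assumes that `coef` ended up containing a NaN or inf, which is what makes NumPy raise this `LinAlgError`. The `safe_roots` helper is hypothetical and not part of YellowFin; it only illustrates checking the coefficients before solving.

```python
import numpy as np


def safe_roots(coef):
    """Hypothetical helper: return np.roots(coef), or None if coef is not finite."""
    coef = np.asarray(coef, dtype=np.float64)
    if not np.all(np.isfinite(coef)):
        # np.roots would raise LinAlgError here, so bail out instead.
        return None
    return np.roots(coef)


# Finite coefficients: roots of x^2 - 3x + 2.
roots = safe_roots([1.0, -3.0, 2.0])

# A NaN coefficient makes plain np.roots raise the error from the traceback.
try:
    np.roots([1.0, np.nan, 2.0])
    raised = False
    message = ""
except np.linalg.LinAlgError as exc:
    raised = True
    message = str(exc)

# The guarded version reports the problem without raising.
guarded = safe_roots([1.0, np.nan, 2.0])
```

A check like this does not fix the underlying non-finite value; it only makes the failure easier to catch and inspect.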
Issue Analytics
- State:
- Created 6 years ago
- Comments:6 (2 by maintainers)
Top GitHub Comments
Hi @ywchan2005,
Thanks for trying out the optimizer. This is most likely caused by exploding gradients during training.
1. If it happens at the very beginning, you might want to play with the initial value a bit.
2. If it happens in the middle of training, please consider using gradient clipping. There is a discussion with solutions in our PyTorch YellowFin repo here. A similar solution can be applied to the TF repo.
We are working on a better automatic gradient clipping feature, which should be ready in a few days, so you may also wait for that. But I suggest you start working on 2 already.
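As a rough illustration of the gradient clipping suggested above, here is a minimal, framework-free sketch of clipping by global norm. The function name, the list-of-lists gradient layout, and the `clip_norm` value are assumptions made for the example; this is not YellowFin's own clipping code. In TensorFlow 1.x, `tf.clip_by_global_norm` performs the same rescaling on a list of tensors.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Rescale gradient vectors so their joint L2 norm is at most clip_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if not math.isfinite(global_norm):
        # Clipping cannot repair inf/NaN gradients; the caller should skip the step.
        raise ValueError("gradient contains inf or NaN")
    # Only shrink: gradients already within the threshold are left unchanged.
    scale = min(1.0, clip_norm / global_norm) if global_norm > 0.0 else 1.0
    return [[g * scale for g in vec] for vec in grads], global_norm


# Two gradient vectors whose joint norm is 13.
grads = [[3.0, 4.0], [12.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for vec in clipped for g in vec))
```

Clipping bounds the size of a finite gradient, but it cannot recover from a gradient that is already inf or NaN, which is why the sketch raises in that case instead of rescaling.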
I can confirm that I ran into this with a standard AlexNet architecture trained on the ImageNet corpus using PyTorch. After 7 full epochs (that is, having trained on 60928 minibatches, each of size 64), I received the following error:
It would be really nice to have gradient clipping or some kind of workaround for this built into YellowFin. 😃