Blas GEMM launch failed
After upgrading to TensorFlow 1.1, the example python -m baselines.deepq.experiments.train_cartpole stopped working for me. How can it be fixed?
2017-06-01 17:37:06.830729: I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
WARNING:tensorflow:VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
[2017-06-01 17:37:07,224] VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
WARNING:tensorflow:VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
[2017-06-01 17:37:07,262] VARIABLES collection name is deprecated, please use GLOBAL_VARIABLES instead; VARIABLES will be removed after 2017-03-02.
2017-06-01 17:37:08.309557: E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_blas.cc:365] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2017-06-01 17:37:08.309714: W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\stream.cc:1550] attempting to perform BLAS operation using StreamExecutor without BLAS support
Traceback (most recent call last):
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1039, in _do_call
return fn(*args)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1021, in _run_fn
status, run_metadata)
File "C:\Users\Viktor\Anaconda3\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(1, 4), b.shape=(4, 64), m=1, n=64, k=4
[[Node: deepq/q_func/fully_connected/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_deepq/observation_0/_11, deepq/q_func/fully_connected/weights/read)]]
[[Node: deepq/cond/Merge/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_42_deepq/cond/Merge", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Viktor\Anaconda3\lib\runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\Viktor\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\experiments\train_cartpole.py", line 31, in <module>
main()
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\experiments\train_cartpole.py", line 24, in main
callback=callback
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\simple.py", line 216, in learn
action = act(np.array(obs)[None], update_eps=exploration.value(t))[0]
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\common\tf_util.py", line 402, in <lambda>
return lambda *args, **kwargs: f(*args, **kwargs)[0]
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\common\tf_util.py", line 445, in __call__
results = get_session().run(self.outputs_update, feed_dict=feed_dict)[:-1]
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 778, in run
run_metadata_ptr)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(1, 4), b.shape=(4, 64), m=1, n=64, k=4
[[Node: deepq/q_func/fully_connected/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_deepq/observation_0/_11, deepq/q_func/fully_connected/weights/read)]]
[[Node: deepq/cond/Merge/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_42_deepq/cond/Merge", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'deepq/q_func/fully_connected/MatMul', defined at:
File "C:\Users\Viktor\Anaconda3\lib\runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "C:\Users\Viktor\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\experiments\train_cartpole.py", line 31, in <module>
main()
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\experiments\train_cartpole.py", line 24, in main
callback=callback
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\simple.py", line 178, in learn
grad_norm_clipping=10
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\build_graph.py", line 178, in build_train
act_f = build_act(make_obs_ph, q_func, num_actions, scope=scope, reuse=reuse)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\build_graph.py", line 111, in build_act
q_values = q_func(observations_ph.get(), num_actions, scope="q_func")
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\models.py", line 27, in <lambda>
return lambda *args, **kwargs: _mlp(hiddens, *args, **kwargs)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\baselines\deepq\models.py", line 9, in _mlp
out = layers.fully_connected(out, num_outputs=hidden, activation_fn=tf.nn.relu)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 1433, in fully_connected
outputs = layer.apply(inputs)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\layers\base.py", line 320, in apply
return self.__call__(inputs, **kwargs)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\layers\base.py", line 290, in __call__
outputs = self.call(inputs, **kwargs)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\layers\core.py", line 144, in call
outputs = standard_ops.matmul(inputs, self.kernel)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1801, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1263, in _mat_mul
transpose_b=transpose_b, name=name)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
op_def=op_def)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Users\Viktor\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
self._traceback = _extract_stack()
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(1, 4), b.shape=(4, 64), m=1, n=64, k=4
[[Node: deepq/q_func/fully_connected/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_deepq/observation_0/_11, deepq/q_func/fully_connected/weights/read)]]
[[Node: deepq/cond/Merge/_17 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_42_deepq/cond/Merge", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Issue Analytics
- Created 6 years ago
- Comments: 5
Top Results From Across the Web
TensorFlow: Blas GEMM launch failed
I ran into this problem when trying to run several servers that use a model to serve predictions. As I wasn't training a...
InternalError: Blas GEMM launch failed · Issue #11812
Try 'Shutdown' on the running notebooks which use your GPU. Restart the kernel. Run the code again. This time it should work. This worked...
Error Internal: Blas GEMM launch failed
Hi, I am encountering an error when I moved to a new laptop with RTX3070. I am new to GPU world and I...
Execute failed: Blas GEMM launch failed
Running what I thought was a simple test using the RedField BERT extensions. My workstation has 2 x 2080ti nvidia GPUs - nothing...
Internal Error: Blas GEMM launch failed: how to fix it
The Blas GEMM launch failed error occurs because GPU memory allocation goes wrong when TensorFlow uses the GPU. By default TensorFlow requests all of the available GPU memory, and when a TensorFlow program runs a session but ...
Top GitHub Comments
Quick update - after restarting my laptop it works the same as before the TF upgrade, without any crashes. Learning is fast and stable.
Try configuring the session so that TensorFlow uses a fixed amount of GPU memory.
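The code this comment referred to did not survive on this page, so what follows is only a sketch of the usual TensorFlow 1.x approach, not the commenter's original snippet. It uses tf.ConfigProto with gpu_options.per_process_gpu_memory_fraction (a fixed share of GPU memory) or gpu_options.allow_growth (allocate on demand). The helper names memory_fraction and make_session, and the 2048 MB / 8192 MB figures, are illustrative assumptions.

```python
def memory_fraction(limit_mb, total_mb):
    """Return the share of GPU memory to give TensorFlow, capped at 1.0."""
    if limit_mb <= 0 or total_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return min(1.0, limit_mb / total_mb)


def make_session(fraction=None):
    """Build a TF 1.x session that does not reserve all GPU memory up front.

    Hypothetical helper for illustration; requires TensorFlow 1.x.
    """
    import tensorflow as tf  # imported lazily so the helper above works without TF

    config = tf.ConfigProto()
    if fraction is None:
        # Allocate GPU memory on demand instead of reserving it all at start.
        config.gpu_options.allow_growth = True
    else:
        # Cap this process at a fixed share of the GPU's total memory.
        config.gpu_options.per_process_gpu_memory_fraction = fraction
    return tf.Session(config=config)


# Example (assumed numbers): allow about 2 GB on a card with 8 GB of memory.
fraction = memory_fraction(2048, 8192)
# sess = make_session(fraction)  # uncomment on a machine with TensorFlow 1.x
```

How such a session gets handed to baselines depends on the baselines version and is not shown here. If the failure comes from another process already holding the GPU's memory, as the restart fix above suggests, closing that process is still needed.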