GPU mode error
Hi,
I got the error shown in the log below.
I’m using the DeepVariant Docker image, version 1.1.0 (gpu), with an NVIDIA GeForce RTX 3090.
Any suggestion or advice?
Thanks in advance. Amin
`2021-05-06 16:56:50.765879: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
I0506 16:56:52.008759 140393620989696 call_variants.py:338] Shape of input examples: [100, 221, 6]
2021-05-06 16:56:52.013998: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-06 16:56:52.046181: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2100000000 Hz
2021-05-06 16:56:52.053674: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x47507d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-05-06 16:56:52.053727: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-05-06 16:56:52.058754: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-05-06 16:56:52.188018: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x47b9240 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-05-06 16:56:52.188089: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2021-05-06 16:56:52.191811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:20:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2021-05-06 16:56:52.191885: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-05-06 16:56:52.195656: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-05-06 16:56:52.199014: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-05-06 16:56:52.199715: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-05-06 16:56:52.203305: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-05-06 16:56:52.205473: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-05-06 16:56:52.211828: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-05-06 16:56:52.216068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-05-06 16:56:52.216108: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-05-06 16:58:21.551842: E tensorflow/core/common_runtime/session.cc:91] Failed to create session: Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid
2021-05-06 16:58:21.551943: E tensorflow/c/c_api.cc:2184] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_1q2x77gk/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 502, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/tmp/Bazel.runfiles_1q2x77gk/runfiles/absl_py/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/tmp/Bazel.runfiles_1q2x77gk/runfiles/absl_py/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_1q2x77gk/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 492, in main
    use_tpu=FLAGS.use_tpu,
  File "/tmp/Bazel.runfiles_1q2x77gk/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 393, in call_variants
    with tf.compat.v1.Session(config=config) as sess:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1586, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 701, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

real 1m31.942s
user 1m31.967s
sys 0m8.062s`
Top GitHub Comments
Hi @aardes, my understanding is that cuDNN v8, CUDA 11, TF 2.5, and Python 3.8 will be needed for RTX 3090. Our code is currently not ready to be upgraded to Python 3.8, but this is something we are looking into for future releases.
Looking forward to it, thanks
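For anyone hitting the same error, the mismatch the maintainer describes can be read straight off the log: the container loads `libcudart.so.10.1` (CUDA 10.1), while the GPU reports compute capability 8.6. Below is a minimal sketch of that check. The helper names are hypothetical and not part of DeepVariant or TensorFlow, and the rule it encodes (compute capability 8.x needs CUDA 11.0 or newer) is a simplification that only covers Ampere cards such as the RTX 3090.

```python
import re

# Hypothetical helpers for illustration only; not part of DeepVariant or TensorFlow.


def cudart_version(log):
    """Return the CUDA runtime version from a 'libcudart.so.X.Y' mention, or None."""
    match = re.search(r"libcudart\.so\.(\d+)\.(\d+)", log)
    return (int(match.group(1)), int(match.group(2))) if match else None


def compute_capability(log):
    """Return the GPU compute capability as (major, minor), or None.

    Matches both spellings seen in the log above:
    'Compute Capability 8.6' and 'computeCapability: 8.6'.
    """
    match = re.search(r"[Cc]ompute ?[Cc]apability:? (\d+)\.(\d+)", log)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_supports_gpu(cuda, capability):
    """Assumption: compute capability 8.x (Ampere) needs CUDA 11.0 or newer.

    Other architectures are not modeled in this sketch, so None is returned.
    """
    if capability[0] != 8:
        return None
    return cuda >= (11, 0)


# Two lines taken from the log in this issue.
sample = (
    "Successfully opened dynamic library libcudart.so.10.1\n"
    "StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6\n"
)

cuda = cudart_version(sample)
capability = compute_capability(sample)
print(cuda, capability, cuda_supports_gpu(cuda, capability))
# prints: (10, 1) (8, 6) False
```

A `False` result here is consistent with the "device kernel image is invalid" status, although the sketch does not prove that this is the only cause. On TensorFlow 2.3 or later, `tf.sysconfig.get_build_info()` should also report the CUDA version the installed binary was built against, which can be compared in the same way.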