Building JAX with ROCm fails
Please:
- Check for duplicate issues.
- Provide a complete example of how to reproduce the bug, wrapped in triple backticks like this:
I used Singularity to build an image and followed the README.md, running the command below.
./build_rocm.sh
- If applicable, include full error messages/tracebacks. This is the complete output:
Singularity> ./build_rocm.sh
+ ROCM_TF_FORK_REPO=https://github.com/ROCmSoftwarePlatform/tensorflow-upstream
+ ROCM_TF_FORK_BRANCH=develop-upstream
+ ROCM_PATH=/opt/rocm/
+ python3 ../build.py --enable_rocm --rocm_path=/opt/rocm/ --bazel_options=--override_repository=org_tensorflow=/tmp/tensorflow-upstream
     _   _  __  __
    | | / \ \ \/ /
 _  | |/ _ \ \  /
| |_| / ___ \/  \
 \___/_/   \/_/\_\
Bazel binary path: ./bazel-5.1.0-linux-x86_64
Bazel version: 5.1.0
Python binary path: /opt/conda/bin/python3
Python version: 3.9
NumPy version: 1.20.3
MKL-DNN enabled: yes
Target CPU: x86_64
Target CPU features: release
CUDA enabled: no
TPU enabled: no
ROCm enabled: yes
ROCm toolkit path: /opt/rocm/
ROCm amdgpu targets: gfx900,gfx906,gfx908,gfx90a,gfx1030
Building XLA and installing it in the jaxlib source tree...
./bazel-5.1.0-linux-x86_64 run --verbose_failures=true --override_repository=org_tensorflow=/tmp/tensorflow-upstream --config=avx_posix --config=mkl_open_source_only --config=rocm :build_wheel -- --output_path=/home/code/jax/build/rocm/dist --cpu=x86_64
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'run' from /home/code/jax/.bazelrc:
Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'run' from /home/code/jax/.bazelrc:
Inherited 'build' options: --apple_platform_type=macos --macos_minimum_os=10.9 --announce_rc --define open_source_build=true --spawn_strategy=standalone --enable_platform_specific_config --experimental_cc_shared_library --define=no_aws_support=true --define=no_gcp_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true -c opt --config=short_logs --copt=-DMLIR_PYTHON_PACKAGE_PREFIX=jaxlib.mlir.
INFO: Reading rc options for 'run' from /home/code/jax/.jax_configure.bazelrc:
Inherited 'build' options: --strategy=Genrule=standalone --repo_env PYTHON_BIN_PATH=/opt/conda/bin/python3 --action_env=PYENV_ROOT --python_path=/opt/conda/bin/python3 --action_env ROCM_PATH=/opt/rocm/ --distinct_host_configuration=false
INFO: Found applicable config definition build:short_logs in file /home/code/jax/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:avx_posix in file /home/code/jax/.bazelrc: --copt=-mavx --host_copt=-mavx
INFO: Found applicable config definition build:mkl_open_source_only in file /home/code/jax/.bazelrc: --define=tensorflow_mkldnn_contraction_kernel=1
INFO: Found applicable config definition build:rocm in file /home/code/jax/.bazelrc: --crosstool_top=@local_config_rocm//crosstool:toolchain --define=using_rocm=true --define=using_rocm_hipcc=true --define=xla_python_enable_gpu=true --repo_env TF_NEED_ROCM=1 --action_env TF_ROCM_AMDGPU_TARGETS=gfx900,gfx906,gfx908
INFO: Found applicable config definition build:rocm in file /home/code/jax/.jax_configure.bazelrc: --action_env TF_ROCM_AMDGPU_TARGETS=gfx900,gfx906,gfx908,gfx90a,gfx1030
INFO: Found applicable config definition build:linux in file /home/code/jax/.bazelrc: --config=posix --copt=-Wno-stringop-truncation --copt=-Wno-array-parameter
INFO: Found applicable config definition build:posix in file /home/code/jax/.bazelrc: --copt=-fvisibility=hidden --copt=-Wno-sign-compare --cxxopt=-std=c++14 --host_cxxopt=-std=c++14
Loading:
Loading: 0 packages loaded
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/tensorflow/runtime/archive/2123408fb43a5c4afdf87dafd67117d9c0ff70cd.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
Loading: 0 packages loaded
Loading: 0 packages loaded
INFO: Repository local_config_rocm instantiated at:
/home/code/jax/WORKSPACE:31:14: in <toplevel>
/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/tensorflow/workspace2.bzl:869:19: in workspace
/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/tensorflow/workspace2.bzl:101:19: in _tf_toolchains
Repository rule rocm_configure defined at:
/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl:843:33: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_config_rocm':
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 824, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 550, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 398, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx, find_rocm_config_script)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 376, column 41, in find_rocm_config
exec_result = _exec_find_rocm_config(repository_ctx, script_path)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 372, column 19, in _exec_find_rocm_config
return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
ERROR: /home/code/jax/WORKSPACE:31:14: fetching rocm_configure rule //external:local_config_rocm: Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 824, column 38, in _rocm_autoconf_impl
_create_local_rocm_repository(repository_ctx)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 550, column 35, in _create_local_rocm_repository
rocm_config = _get_rocm_config(repository_ctx, bash_bin, find_rocm_config_script)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 398, column 30, in _get_rocm_config
config = find_rocm_config(repository_ctx, find_rocm_config_script)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 376, column 41, in find_rocm_config
exec_result = _exec_find_rocm_config(repository_ctx, script_path)
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/gpus/rocm_configure.bzl", line 372, column 19, in _exec_find_rocm_config
return execute(repository_ctx, [python_bin, "-c", decompress_and_execute_cmd])
File "/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/third_party/remote_config/common.bzl", line 230, column 13, in execute
fail(
Error in fail: Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
INFO: Repository bazel_skylib instantiated at:
/home/code/jax/WORKSPACE:28:14: in <toplevel>
/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/org_tensorflow/tensorflow/workspace3.bzl:21:17: in workspace
Repository rule http_archive defined at:
/root/.cache/bazel/_bazel_root/7777a22c05c38dcb5674638712ab6fae/external/bazel_tools/tools/build_defs/repo/http.bzl:353:31: in <toplevel>
ERROR: Skipping ':build_wheel': no such package '@local_config_rocm//rocm': Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_rocm//rocm': Repository command failed
ERROR: ROCm version file not found in ['include/rocm-core/rocm_version.h', 'include/rocm_version.h']
INFO: Elapsed time: 77.870s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
ERROR: Build failed. Not running target
FAILED: Build did NOT complete successfully (0 packages loaded)
b''
Traceback (most recent call last):
File "/home/code/jax/build/rocm/../build.py", line 527, in <module>
main()
File "/home/code/jax/build/rocm/../build.py", line 522, in main
shell(command)
File "/home/code/jax/build/rocm/../build.py", line 53, in shell
output = subprocess.check_output(cmd)
File "/opt/conda/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/opt/conda/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['./bazel-5.1.0-linux-x86_64', 'run', '--verbose_failures=true', '--override_repository=org_tensorflow=/tmp/tensorflow-upstream', '--config=avx_posix', '--config=mkl_open_source_only', '--config=rocm', ':build_wheel', '--', '--output_path=/home/code/jax/build/rocm/dist', '--cpu=x86_64']' returned non-zero exit status 1.
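The repeated error in the log names two header locations that the ROCm configure step could not find. A minimal sketch for checking this inside the container follows. It assumes the two paths are resolved relative to the ROCm toolkit path shown in the log (`/opt/rocm/`), and that the header defines `ROCM_VERSION_MAJOR`, `ROCM_VERSION_MINOR` and `ROCM_VERSION_PATCH` (the macro names are an assumption; the log does not show them). `find_version_header` and `parse_version` are hypothetical helpers, not part of JAX or TensorFlow.

```python
import os
import re
from pathlib import Path

# Relative paths copied from the error message in the log above.
CANDIDATES = ("include/rocm-core/rocm_version.h", "include/rocm_version.h")


def find_version_header(rocm_path):
    """Return the first candidate header that exists under rocm_path, else None."""
    root = Path(rocm_path)
    for rel in CANDIDATES:
        candidate = root / rel
        if candidate.is_file():
            return candidate
    return None


def parse_version(header_text):
    """Extract (major, minor, patch) from ROCM_VERSION_* defines, or None.

    The macro names are assumed, not taken from the log.
    """
    parts = []
    for name in ("MAJOR", "MINOR", "PATCH"):
        match = re.search(rf"#define\s+ROCM_VERSION_{name}\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # build_rocm.sh in the log sets ROCM_PATH=/opt/rocm/; fall back to that.
    rocm_path = os.environ.get("ROCM_PATH", "/opt/rocm/")
    header = find_version_header(rocm_path)
    if header is None:
        print(f"no version header under {rocm_path}; checked: {list(CANDIDATES)}")
    else:
        print(f"found {header}: version {parse_version(header.read_text())}")
```

If this reports no header, the ROCm installation inside the image is probably incomplete or lives somewhere other than `/opt/rocm/`, and pointing `ROCM_PATH` at the actual install root would be the first thing to try.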
I installed ROCm 5.1.3 and hit a similar failure during the build process as well:
I’m guessing this issue is fixed!