Cupy error during inference when starting docker container as non-root

Describe the bug: When running the MONAI Label Docker container as non-root, inference with the segmentation model throws an error (DeepEdit inference works fine, though).

To Reproduce:

  1. Run docker container (e.g. projectmonai/monailabel:0.4.2) as non-root (e.g. with parameter --user $(id -u):$(id -g)).
  2. Run default radiology app with all models, on any abdominal CT dataset.
  3. Run inference with segmentation model.

Expected behavior: Inference should work. Instead, a cupy error is thrown (detailed log below). In Slicer, only an “Internal Server Error” is reported; the cupy error shows up only in the MONAI Label server logs.

Environment: Docker container projectmonai/monailabel:0.4.2

Detailed error output:

[2022-08-12 11:54:57,313] [263] [MainThread] [ERROR] (uvicorn.error:369) - Exception in ASGI application
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 366, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/opt/conda/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/fastapi/applications.py", line 269, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/opt/conda/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/exceptions.py", line 93, in __call__
    raise exc
  File "/opt/conda/lib/python3.8/site-packages/starlette/exceptions.py", line 82, in __call__
    await self.app(scope, receive, sender)
  File "/opt/conda/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/opt/conda/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/routing.py", line 670, in __call__
    await route.handle(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/routing.py", line 266, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.8/site-packages/starlette/routing.py", line 65, in app
    response = await func(request)
  File "/opt/conda/lib/python3.8/site-packages/fastapi/routing.py", line 227, in app
    raw_response = await run_endpoint_function(
  File "/opt/conda/lib/python3.8/site-packages/fastapi/routing.py", line 160, in run_endpoint_function
    return await dependant.call(**values)
  File "/opt/conda/lib/python3.8/site-packages/monailabel/endpoints/infer.py", line 179, in api_run_inference
    return run_inference(background_tasks, model, image, session_id, params, file, label, output)
  File "/opt/conda/lib/python3.8/site-packages/monailabel/endpoints/infer.py", line 161, in run_inference
    result = instance.infer(request)
  File "/opt/conda/lib/python3.8/site-packages/monailabel/interfaces/app.py", line 289, in infer
    result_file_name, result_json = task(request)
  File "/opt/conda/lib/python3.8/site-packages/monailabel/interfaces/tasks/infer.py", line 286, in __call__
    data = self.run_post_transforms(data, self.post_transforms(data))
  File "/opt/conda/lib/python3.8/site-packages/monailabel/interfaces/tasks/infer.py", line 348, in run_post_transforms
    return run_transforms(data, transforms, log_prefix="POST")
  File "/opt/conda/lib/python3.8/site-packages/monailabel/interfaces/utils/transform.py", line 93, in run_transforms
    data = t(data)
  File "/opt/monai/monai/transforms/post/dictionary.py", line 259, in __call__
    d[key] = self.converter(d[key])
  File "/opt/monai/monai/transforms/post/array.py", line 369, in __call__
    mask = get_largest_connected_component_mask(foreground, self.connectivity)
  File "/opt/monai/monai/transforms/utils.py", line 961, in get_largest_connected_component_mask
    x_label = cucim.skimage.measure.label(x_cupy, connectivity=connectivity)
  File "/opt/conda/lib/python3.8/site-packages/cucim/skimage/_shared/utils.py", line 231, in fixed_func
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/cucim/skimage/measure/_label.py", line 125, in label
    num = _label(label_image, structure, labels, greyscale_mode=True)
  File "/opt/conda/lib/python3.8/site-packages/cucim/skimage/measure/_label_kernels.py", line 23, in _label
    _kernel_init()(x, y)
  File "cupy/_core/_kernel.pyx", line 850, in cupy._core._kernel.ElementwiseKernel.__call__
  File "cupy/_core/_kernel.pyx", line 875, in cupy._core._kernel.ElementwiseKernel._get_elementwise_kernel
  File "cupy/_util.pyx", line 59, in cupy._util.memoize.decorator.ret
  File "cupy/_core/_kernel.pyx", line 662, in cupy._core._kernel._get_elementwise_kernel
  File "cupy/_core/_kernel.pyx", line 62, in cupy._core._kernel._get_simple_elementwise_kernel
  File "cupy/_core/core.pyx", line 2040, in cupy._core.core.compile_with_cache
  File "/opt/conda/lib/python3.8/site-packages/cupy/cuda/compiler.py", line 461, in compile_with_cache
    return _compile_with_cache_cuda(
  File "/opt/conda/lib/python3.8/site-packages/cupy/cuda/compiler.py", line 516, in _compile_with_cache_cuda
    os.makedirs(cache_dir, exist_ok=True)
  File "/opt/conda/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/.cupy'

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9

Top GitHub Comments

nvahmadi commented, Aug 31, 2022

Hi @SachidanandAlle - you are fully right. The MONAI docker images are not built with user profiles, and the way I mounted the container was too careless to set the profile inside the container correctly (only enough to have file ownership set correctly on the host system). Therefore, the home directory of my user did not exist, which led to the cupy error.
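The path `/.cupy` in the traceback is consistent with this explanation. As a rough sketch (not CuPy's actual code), the snippet below mimics how the kernel cache directory appears to be chosen: the `CUPY_CACHE_DIR` environment variable if it is set, otherwise `.cupy/kernel_cache` under the home directory. The function name `resolve_cupy_cache_dir` is hypothetical, and the fallback of `HOME` to `/` for a user without a passwd entry is an assumption based on the path seen in the log.

```python
import posixpath


def resolve_cupy_cache_dir(env):
    """Approximate where CuPy would try to create its kernel cache.

    `env` is a mapping of environment variables (for example a copy of
    os.environ). This is an illustrative helper, not part of CuPy.
    """
    # Assumption: with no passwd entry for the uid, HOME ends up as "/".
    home = env.get("HOME", "/")
    default_dir = posixpath.join(home, ".cupy", "kernel_cache")
    # An explicit CUPY_CACHE_DIR takes precedence over the default.
    return env.get("CUPY_CACHE_DIR", default_dir)


if __name__ == "__main__":
    # Non-root user without a home directory: cache lands under "/".
    print(resolve_cupy_cache_dir({"HOME": "/"}))
    # Same user once a real home directory is mapped into the container.
    print(resolve_cupy_cache_dir({"HOME": "/home/alice"}))
```

With `HOME` set to `/`, the first directory that has to be created is `/.cupy`, which an unprivileged user cannot create at the filesystem root; this matches the `PermissionError` above.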

For reference, I am posting a solution here, since I really want file ownership to be set correctly on my host system. To mount the user profile correctly, it is necessary to also map the home folder, and to map the /etc/passwd file in read-only mode (this is apparently not preferred, as described in this forum message, but I am working on a system with relatively few users, so I think it should be fine). The complete docker run command, with image v0.4.2 (as in the example above), would look something like this:

docker run -it \
    --gpus all \
    --shm-size=16g \
    --ulimit memlock=-1 \
    --ulimit stack=67108864 \
    --user $(id -u):$(id -g) \
    -v /etc/passwd:/etc/passwd:ro \
    -v /home/$(id -u -n):/home/$(id -u -n) \
    --network=host \
    --ipc=host \
    --name monailabel_0.4.2_nonroot \
    projectmonai/monailabel:0.4.2

with the key lines being these:

    --user $(id -u):$(id -g) \
    -v /etc/passwd:/etc/passwd:ro \
    -v /home/$(id -u -n):/home/$(id -u -n) \

With this, the cupy error does not happen anymore, and I am still mapping files correctly to my user profile on host. Thanks for the help everyone, and for the key hint, @SachidanandAlle!
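A possible alternative, which was not discussed in this thread and is untested against the MONAI Label image, is to point CuPy at a writable cache location through the `CUPY_CACHE_DIR` environment variable instead of mapping a home directory. The sketch below is a minimal illustration under that assumption: the helper name `ensure_cupy_cache_dir` is hypothetical, and it would have to run before the first CuPy kernel is compiled.

```python
import os
import tempfile


def ensure_cupy_cache_dir(env=None):
    """Return a writable kernel cache directory, updating `env` if needed.

    Illustrative helper, not part of CuPy or MONAI Label. `env` defaults
    to os.environ so the setting is visible to a later `import cupy`.
    """
    if env is None:
        env = os.environ
    # Start from the explicit setting, else the usual home-based default.
    candidate = env.get(
        "CUPY_CACHE_DIR", os.path.expanduser("~/.cupy/kernel_cache")
    )
    try:
        os.makedirs(candidate, exist_ok=True)
        writable = os.access(candidate, os.W_OK)
    except OSError:
        # Covers PermissionError such as the one for '/.cupy' above.
        writable = False
    if not writable:
        # Fall back to a fresh temporary directory owned by this user.
        candidate = tempfile.mkdtemp(prefix="cupy-cache-")
        env["CUPY_CACHE_DIR"] = candidate
    return candidate
```

The same effect might be reachable without code by passing something like `-e CUPY_CACHE_DIR=/tmp/.cupy` to `docker run`. Unlike the passwd and home mapping above, this only addresses the CuPy cache, so other tools that expect a home directory could still fail.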

nvahmadi commented, Aug 12, 2022

@SachidanandAlle - " wondering any issues with specific system here…" The system is a DGX Station V100 32GB with the latest DGX OS. And yes, I can confirm that it works perfectly fine when starting the Docker container as the root user. I am not blocked at all; it was a coincidental finding, and I thought I’d report it here in case someone else runs into it. The workaround is easy: just run the container as root.
