
[backend] OutputPath is giving "permission denied" - why?


A step that runs fine when compiled for Kubeflow Pipelines v1 fails with the following error when compiled in v2-compatible mode:

time="2022-04-26T21:53:30.710Z" level=info msg="capturing logs" argo=true
I0426 21:53:30.745547      18 launcher.go:144] PipelineRoot defaults to "minio://mlpipeline/v2/artifacts".
I0426 21:53:30.745908      18 cache.go:120] Connecting to cache endpoint 10.100.244.104:8887
I0426 21:53:30.854201      18 launcher.go:193] enable caching
F0426 21:53:30.979055      18 main.go:50] Failed to execute component: failed to create directory "/tmp/outputs/output_context_path" for output parameter "output_context_path": mkdir /tmp/outputs/output_context_path: permission denied
time="2022-04-26T21:53:30.980Z" level=info msg="/tmp/outputs/output_context_path/data -> /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz" argo=true
time="2022-04-26T21:53:30.981Z" level=info msg="Taring /tmp/outputs/output_context_path/data"
Error: failed to tarball the output /tmp/outputs/output_context_path/data to /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz: stat /tmp/outputs/output_context_path/data: permission denied
failed to tarball the output /tmp/outputs/output_context_path/data to /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz: stat /tmp/outputs/output_context_path/data: permission denied
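For context, before the user code runs, the v2 launcher creates the enclosing directory for each OutputPath parameter; the fatal `mkdir` above is that step failing because the process has no write permission on the parent directory (`/tmp/outputs`). A minimal sketch of the intended behaviour (hypothetical helper name, not the launcher's actual Go code):

```python
import tempfile
from pathlib import Path

def prepare_output_parameter(output_path: str) -> Path:
    """Create the directory that will hold an OutputPath parameter.

    This is the operation that fails in the log above: if the process
    lacks write permission on an existing parent (here, /tmp/outputs),
    mkdir raises PermissionError ("permission denied").
    """
    p = Path(output_path)
    p.parent.mkdir(parents=True, exist_ok=True)
    return p

# Demonstrate against a writable scratch root instead of /tmp/outputs:
root = tempfile.mkdtemp()
out = prepare_output_parameter(f"{root}/outputs/output_context_path/data")
out.write_text("session bytes")
print(out.read_text())
```

Run as a non-root user against a root-owned `/tmp/outputs`, the same `mkdir(parents=True, ...)` call raises `PermissionError`, which matches the launcher log line by line.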

The code that produces this is here:

import kfp
from kfp.v2.dsl import component, Artifact, Input, InputPath, Output, OutputPath, Dataset, Model
from typing import NamedTuple

def same_step_000_afc67b36914c4108b47e8b4bb316869d_fn(
    input_context_path: InputPath(str),
    output_context_path: OutputPath(str),
    run_info: str = "gAR9lC4=",
    metadata_url: str = "",
):
    from base64 import urlsafe_b64encode, urlsafe_b64decode
    from pathlib import Path
    import datetime
    import requests
    import tempfile
    import dill
    import os

    input_context = None
    with Path(input_context_path).open("rb") as reader:
        input_context = reader.read()

    # Helper function for posting metadata to mlflow.
    def post_metadata(json):
        if metadata_url == "":
            return

        try:
            req = requests.post(metadata_url, json=json)
            req.raise_for_status()
        except requests.exceptions.HTTPError as err:
            print(f"Error posting metadata: {err}")

    # Move to writable directory as user might want to do file IO.
    # TODO: won't persist across steps, might need support in SDK?
    os.chdir(tempfile.mkdtemp())

    # Load information about the current experiment run:
    run_info = dill.loads(urlsafe_b64decode(run_info))

    # Post session context to mlflow.
    if len(input_context) > 0:
        # urlsafe_b64encode returns bytes; decode so the value is JSON-serialisable.
        input_context_str = urlsafe_b64encode(input_context).decode("ascii")
        post_metadata({
            "experiment_id": run_info["experiment_id"],
            "run_id": run_info["run_id"],
            "step_id": "same_step_000",
            "metadata_type": "input",
            "metadata_value": input_context_str,
            "metadata_time": datetime.datetime.now().isoformat(),
        })

    # User code for step, which we run in its own execution frame.
    user_code = f"""
import dill

# Load session context into global namespace:
if { len(input_context) } > 0:
    dill.load_session("{ input_context_path }")

{dill.loads(urlsafe_b64decode("gASVGAAAAAAAAACMFHByaW50KCJIZWxsbyB3b3JsZCIplC4="))}

# Remove anything from the global namespace that cannot be serialised.
# TODO: this will include things like pandas dataframes, needs sdk support?
_bad_keys = []
_all_keys = list(globals().keys())
for k in _all_keys:
    try:
        dill.dumps(globals()[k])
    except TypeError:
        _bad_keys.append(k)

for k in _bad_keys:
    del globals()[k]

# Save new session context to disk for the next component:
dill.dump_session("{output_context_path}")
"""

    # Runs the user code in a new execution frame. Context from the previous
    # component in the run is loaded into the session dynamically, and we run
    # with a single globals() namespace to simulate top-level execution.
    exec(user_code, globals(), globals())

    # Post new session context to mlflow:
    with Path(output_context_path).open("rb") as reader:
        # Decode to str so the value is JSON-serialisable.
        context = urlsafe_b64encode(reader.read()).decode("ascii")
        post_metadata({
            "experiment_id": run_info["experiment_id"],
            "run_id": run_info["run_id"],
            "step_id": "same_step_000",
            "metadata_type": "output",
            "metadata_value": context,
            "metadata_time": datetime.datetime.now().isoformat(),
        })
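The `run_info` default above is a base64-encoded dill pickle; the payload `gAR9lC4=` is simply an empty dict. Since dill emits standard pickle opcodes for plain dicts, the round-trip can be sketched with the stdlib `pickle` module (assumption: `run_info` contains no dill-specific types):

```python
import pickle
from base64 import urlsafe_b64decode, urlsafe_b64encode

# Decode the component's default run_info payload:
print(pickle.loads(urlsafe_b64decode("gAR9lC4=")))  # prints {}

# Encoding a populated run_info the same way (example values):
run_info = {"experiment_id": "exp-1", "run_id": "run-1"}
blob = urlsafe_b64encode(pickle.dumps(run_info)).decode("ascii")
assert pickle.loads(urlsafe_b64decode(blob)) == run_info
```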

Environment

  • How did you deploy Kubeflow Pipelines (KFP)? From manifests
  • KFP version: 1.8.1
  • KFP SDK version: 1.8.12

Expected result

Is there supposed to be a different way of writing this file? Do I need to mount a storage location to do this? (You didn't have to in KFP v1.)


Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 4
  • Comments: 5

Top GitHub Comments

1 reaction
aronchick commented, Apr 29, 2022

FYI, this is a pure regression from KFP 1.

Here are two gists, one compiled in legacy (v1) mode and one compiled in v2-compatible mode:

  • v1 compiler: https://gist.github.com/aronchick/0dfc57d2a794c1bd4fb9bff9962cfbd6
  • v2 compiler: https://gist.github.com/aronchick/473060503ae189b360fbded04d802c80

First one executes fine. Second one gives:

time="2022-04-26T21:53:30.710Z" level=info msg="capturing logs" argo=true
I0426 21:53:30.745547      18 launcher.go:144] PipelineRoot defaults to "minio://mlpipeline/v2/artifacts".
I0426 21:53:30.745908      18 cache.go:120] Connecting to cache endpoint 10.100.244.104:8887
I0426 21:53:30.854201      18 launcher.go:193] enable caching
F0426 21:53:30.979055      18 main.go:50] Failed to execute component: failed to create directory "/tmp/outputs/output_context_path" for output parameter "output_context_path": mkdir /tmp/outputs/output_context_path: permission denied
time="2022-04-26T21:53:30.980Z" level=info msg="/tmp/outputs/output_context_path/data -> /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz" argo=true
time="2022-04-26T21:53:30.981Z" level=info msg="Taring /tmp/outputs/output_context_path/data"
Error: failed to tarball the output /tmp/outputs/output_context_path/data to /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz: stat /tmp/outputs/output_context_path/data: permission denied
failed to tarball the output /tmp/outputs/output_context_path/data to /var/run/argo/outputs/artifacts/tmp/outputs/output_context_path/data.tgz: stat /tmp/outputs/output_context_path/data: permission denied
0 reactions
asiricml commented, Dec 21, 2022

This issue is affecting us as well. Is there any solution or workaround? Thank you.
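As a diagnostic (not a fix), you can confirm the ownership mismatch from inside the failing step by comparing the owner of `/tmp/outputs` with the effective UID the step runs as. The sketch below uses a scratch directory as a stand-in for `/tmp/outputs`, so owner and process UID match here:

```python
import os
import stat
import tempfile

def describe(path: str) -> str:
    """Report the mode and owner of `path` alongside this process's effective UID."""
    st = os.stat(path)
    return (f"{path}: mode={stat.filemode(st.st_mode)} "
            f"owner_uid={st.st_uid} process_euid={os.geteuid()}")

# In the failing container you would call describe("/tmp/outputs");
# a "permission denied" on mkdir means owner_uid != process_euid and
# the mode grants no write bit to the step's user.
scratch = tempfile.mkdtemp()
print(describe(scratch))
```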
