Allow use of KubernetesJobEnvironment + S3 storage + KubernetesAgent
Current behavior
As of #2666, it’s now possible to use non-docker storage with dockerized agents such as KubernetesAgent. This is a super exciting feature!
However, if you are using S3 storage and KubernetesAgent, it seems that it’s not possible to customize the jobs created for flow runs using KubernetesJobEnvironment.
I’ve observed that no errors are generated today (prefect 0.12.0) when you use this combination:
- S3 storage
- KubernetesJobEnvironment
- KubernetesAgent
The agent I’ve created is happily creating jobs in the cluster for flow runs, but the manifests for those jobs are all default values and ignore anything I customize in KubernetesJobEnvironment.
After digging for a bit, I found the root cause. From “Kubernetes Job Environment”:
The KubernetesJobEnvironment accepts an argument job_spec_file which is a string representation of a path to a Kubernetes Job YAML file. On initialization that Job spec file is loaded and stored in the Environment. It will never be sent to Prefect Cloud and will only exist inside your Flow’s Docker storage.
I can also see this in code in KubernetesJobEnvironment.create_flow_run_job(), where the environment explicitly expects a docker image containing the job_spec_file.
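To make the failure mode concrete, here is a minimal, standard-library-only sketch of why a spec file that is only read from disk can be found at build time but missing at run time. It does not use Prefect and is not Prefect's implementation: the class names `PathOnlyEnvironment` and `EmbeddedSpecEnvironment`, and the helper `simulate_remote_run`, are hypothetical and exist only for illustration. The deleted temporary directory stands in for a job container that was not built from a Docker image containing the file.

```python
import os
import tempfile


class PathOnlyEnvironment:
    """Hypothetical: remembers only where the spec file lives."""

    def __init__(self, job_spec_file):
        self.job_spec_file = os.path.abspath(job_spec_file)
        # Check at build time that the file is readable.
        with open(self.job_spec_file) as f:
            f.read()

    def load_spec_at_run_time(self):
        # Re-reads from disk, so the file must exist wherever the flow runs.
        with open(self.job_spec_file) as f:
            return f.read()


class EmbeddedSpecEnvironment:
    """Hypothetical: keeps the spec contents on the object itself."""

    def __init__(self, job_spec_file):
        with open(job_spec_file) as f:
            self.job_spec = f.read()

    def load_spec_at_run_time(self):
        # No filesystem access needed; the contents travel with the object.
        return self.job_spec


def simulate_remote_run(env_cls):
    """Build the environment locally, then use it after the file is gone."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "job_spec.yaml")
        with open(path, "w") as f:
            f.write("kind: Job\n")
        env = env_cls(path)  # stands in for the flow object placed in storage
    # The temporary directory no longer exists here, like a fresh container.
    try:
        return env.load_spec_at_run_time()
    except FileNotFoundError as exc:
        return f"FileNotFoundError: {exc.filename}"


if __name__ == "__main__":
    print(simulate_remote_run(PathOnlyEnvironment))
    print(simulate_remote_run(EmbeddedSpecEnvironment))
```

In this toy model, only the variant that carries the spec contents with the object survives the move to a machine without the file, which is the property the proposal below asks for with S3 storage.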
Proposed behavior
I’d like to be able to store the job_spec_file from KubernetesJobEnvironment in S3 storage, so that the job details of a flow run using S3 storage and run by KubernetesAgent can be customized.
Example
The benefits of non-docker Storage are explained in https://docs.prefect.io/orchestration/execution/storage_options.html#non-docker-storage-for-containerized-environments. Adding this proposed behavior would allow flows using KubernetesAgent to take advantage of that storage without sacrificing the ability to customize the jobs using KubernetesJobEnvironment.
Without this proposed behavior, I think users of KubernetesAgent have to choose between non-docker storage and customizing their jobs.
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:8
Hi, I have a similar request: KubernetesAgent, KubernetesJobEnvironment and storage: GitHub.
I expected the file specified in KubernetesJobEnvironment to be picked up, and prefect register flow validates that it can read the YAML file, but when running from KubernetesAgent I get:
Failed to load and execute Flow's environment: FileNotFoundError(2, 'No such file or directory')
@joshmeek I think it can be closed, yes.
I haven’t tried with S3 storage since #2950 was merged, but I have been using another non-Docker storage (Webhook) successfully with KubernetesJobEnvironment + a custom spec file for a few days and it’s been working exactly as expected.