k8 executor ignores namespace configuration from task's executor_config

Apache Airflow version: 1.10.10

Kubernetes version (if you are using kubernetes) (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.6-eks-4e7f64", GitCommit:"4e7f642f9f4cbb3c39a4fc6ee84fe341a8ade94c", GitTreeState:"clean", BuildDate:"2020-06-11T13:55:35Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS EKS
  • OS (e.g. from /etc/os-release): official airflow docker image
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened: Airflow ignores the namespace provided in the executor_config of the task. Our use case is a centralized airflow running in one namespace scheduling pods in multiple other namespaces based on configuration.

Running the following snippet from the example:

other_ns_task = PythonOperator(
    task_id="other_namespace_task",
    python_callable=print_stuff,
    executor_config={
        "KubernetesExecutor": {
            "namespace": "test-namespace",
            "labels": {
                "release": "stable"
            }
        }
    }
)

What you expected to happen: The scheduler should create the pod in the namespace as provided in the task’s configuration.

I believe the problem is in pod_generator.py where the executor_config is overridden by the airflow.cfg kubernetes namespace configuration.

        # Reconcile the pods starting with the first chronologically,
        # Pod from the airflow.cfg -> Pod from executor_config arg -> Pod from the K8s executor
        pod_list = [worker_config, kube_executor_config, dynamic_pod]

The pods are reconciled in list order, so later entries win. dynamic_pod.namespace is initialized with the same value as worker_config.namespace, so it overrides the namespace provided in kube_executor_config.
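To illustrate the ordering problem described above, here is a minimal sketch that models each pod as a plain dict. The `reconcile` function is a simplified stand-in written for this example, not Airflow's actual `PodGenerator.reconcile_pods`, and the `"airflow"` namespace value is an assumption standing in for whatever is set in airflow.cfg.

```python
from functools import reduce


def reconcile(base, client):
    """Simplified stand-in for pod reconciliation: values set on the
    later (client) pod replace those on the earlier (base) pod."""
    merged = dict(base)
    merged.update({k: v for k, v in client.items() if v is not None})
    return merged


# Pod built from airflow.cfg (namespace value is assumed for illustration).
worker_config = {"namespace": "airflow", "labels": None}

# Pod built from the task's executor_config.
kube_executor_config = {
    "namespace": "test-namespace",
    "labels": {"release": "stable"},
}

# Pod built by the executor; its namespace is copied from the config.
dynamic_pod = {"namespace": worker_config["namespace"], "labels": None}

result = reduce(reconcile, [worker_config, kube_executor_config, dynamic_pod])
print(result["namespace"])  # airflow
print(result["labels"])     # {'release': 'stable'}
```

In this toy model the labels from executor_config survive because dynamic_pod leaves them unset, while the namespace is lost because dynamic_pod always sets one.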

How to reproduce it:

Set AIRFLOW__KUBERNETES__NAMESPACE (or the equivalent value in airflow.cfg) to one namespace, then try to run a task in a different namespace:

executor_config={
    "KubernetesExecutor": {
        "namespace": "test-namespace"
    }
}

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
johngtam commented, Mar 9, 2022

@eladkal, I’m not OP, but I can confirm this still happens on 2.0.2. Looking at more of the code flow:

        dynamic_pod = k8s.V1Pod(
            metadata=k8s.V1ObjectMeta(
                namespace=namespace,
                annotations=annotations,
                name=PodGenerator.make_unique_pod_id(pod_id),
                labels=labels,
            ),
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name="base",
                        args=args,
                        image=image,
                        env=[k8s.V1EnvVar(name="AIRFLOW_IS_K8S_EXECUTOR_POD", value="True")],
                    )
                ]
            ),
        )

        # Reconcile the pods starting with the first chronologically,
        # Pod from the pod_template_File -> Pod from executor_config arg -> Pod from the K8s executor
        pod_list = [base_worker_pod, pod_override_object, dynamic_pod]

        return reduce(PodGenerator.reconcile_pods, pod_list)

I see here that dynamic_pod takes precedence, because it is last in pod_list. This dynamic pod gets its namespace from the caller of this method, so I searched through the three callers of this method to find out where the namespace comes from.

If it is coming from kube_config.executor_namespace, that value is read from the Airflow configuration. Going from the code block above, dynamic_pod’s namespace therefore takes precedence, and it is determined by the config’s namespace.

As a result, dynamic_pod will ultimately override the namespace without regard to pod_override, because a pod can only have one namespace. I’m going to explore upgrading Airflow on my own (and am using a workaround until then), but it doesn’t seem like this particular code flow has changed much. Might this be worth verifying by the Airflow maintainers?
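One way to picture a different precedence is sketched below. This is not Airflow's code and not a confirmed fix: `PodMeta`, `merge`, and `build_dynamic_pod` are hypothetical names invented for this example, and the `"airflow"` namespace and `"task-abc123"` pod id are placeholder values. The idea is only that the dynamic pod could take its namespace from the override when one is set, falling back to the configured namespace otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PodMeta:
    """Hypothetical, minimal stand-in for a pod's metadata."""
    namespace: Optional[str] = None
    name: Optional[str] = None


def merge(base: PodMeta, client: PodMeta) -> PodMeta:
    """Later (client) fields win whenever they are set."""
    return PodMeta(
        namespace=client.namespace or base.namespace,
        name=client.name or base.name,
    )


def build_dynamic_pod(config_namespace: str, pod_override: PodMeta, pod_id: str) -> PodMeta:
    # Assumption for this sketch: prefer the override's namespace if present.
    namespace = pod_override.namespace or config_namespace
    return PodMeta(namespace=namespace, name=pod_id)


base_worker_pod = PodMeta(namespace="airflow")
pod_override = PodMeta(namespace="test-namespace")
dynamic_pod = build_dynamic_pod("airflow", pod_override, "task-abc123")

# Same order as in the issue: base -> override -> dynamic.
final = merge(merge(base_worker_pod, pod_override), dynamic_pod)
print(final.namespace)  # test-namespace
print(final.name)       # task-abc123
```

The list order stays the same here, so the dynamic pod still controls fields such as the pod name. Whether the rest of the executor would handle pods created in other namespaces is outside what this sketch covers.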

2 reactions
DerekHeldtWerle commented, Mar 16, 2021

Running into this as well @liorhar, did you end up doing anything to resolve it outside of changing the order?
