Secrets Backend Search Path Ordering/Priority

Description

A way to set a custom secrets backend to be lower priority than the built-in airflow.secrets.environment_variables.EnvironmentVariablesBackend and airflow.secrets.metastore.MetastoreBackend.

Use case / motivation

When creating our own secrets backend utilizing Secret Server, our team noticed you cannot configure the custom backend to be a lower priority than the default secrets backends. In certain cases, we have DAGs that write to different sets of external systems, and being able to switch one of those external systems via an environment variable is a very simple way to test certain conditions. We also have several variables that have no need of security, and checking the env vars first eliminates a network call to, and load on, a busy system.

Now as a workaround, I do realize we can have our own secrets backend check available env vars first, but this does seem a bit clunky given the current design.
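
The workaround mentioned above could look roughly like the following. This is a minimal sketch, not the team's actual backend: `EnvFirstBackend` and `remote_lookup` are hypothetical names, and the class is kept standalone so it runs without Airflow installed (a real backend would subclass `airflow.secrets.BaseSecretsBackend`). It assumes the `AIRFLOW_VAR_<KEY>` naming that Airflow uses for variables stored in environment variables.

```python
import os
from typing import Callable, Optional

# Prefix Airflow uses when reading Variables from environment variables.
VAR_ENV_PREFIX = "AIRFLOW_VAR_"


class EnvFirstBackend:
    """Hypothetical backend that consults env vars before the remote store."""

    def __init__(self, remote_lookup: Callable[[str], Optional[str]]) -> None:
        # remote_lookup stands in for the call to the external secret store.
        self._remote_lookup = remote_lookup

    def get_variable(self, key: str) -> Optional[str]:
        # Check the environment first; this avoids the network call entirely.
        value = os.environ.get(VAR_ENV_PREFIX + key.upper())
        if value is not None:
            return value
        # Fall back to the remote secret store only when no env var is set.
        return self._remote_lookup(key)
```

The downside the issue author points out still applies: every custom backend has to reimplement this check itself.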

The goal would be to be able to toggle a custom backend to be lower priority such that the metastore and env vars are checked first.

Are you willing to submit a PR?

Yes, definitely.

Related Issues

You could argue #16404 is slightly related.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 10 (8 by maintainers)

Top GitHub Comments

1 reaction
uranusjr commented, Jul 19, 2021

Maybe we should ship a secret backend implementation that allows the user to pass multiple secret backends and search them in that order? Something like:

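from typing import List, Optional

from airflow.secrets import BaseSecretsBackend
from airflow.utils.module_loading import import_string


class AggregatedBackend(BaseSecretsBackend):
    def __init__(self, backends: List[str]) -> None:
        super().__init__()
        # Import each backend class by its dotted path and instantiate it.
        self._backends = [import_string(name)() for name in backends]

    def get_variable(self, key: str) -> Optional[str]:
        for backend in self._backends:
            if (value := backend.get_variable(key)) is not None:
                return value
        return None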

This way the user can do something like

[secrets]
backend = airflow.secrets.aggregated.AggregatedBackend
backend_kwargs = {"backends": ["airflow.secrets.environment_variables.EnvironmentVariablesBackend", "my_backend.MyBackend", "airflow.secrets.metastore.MetastoreBackend"]}

We could even introduce some shorthands like airflow.EnvironmentVariablesBackend.
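
To make the effect of the ordering concrete, here is a small self-contained sketch using in-memory stand-ins for the three backends. `DictBackend` and `first_match` are hypothetical helpers invented for this illustration; they only mimic the `get_variable` lookup and are not part of Airflow.

```python
from typing import Dict, List, Optional


class DictBackend:
    """Hypothetical in-memory backend used only for illustration."""

    def __init__(self, values: Dict[str, str]) -> None:
        self._values = values

    def get_variable(self, key: str) -> Optional[str]:
        return self._values.get(key)


def first_match(backends: List[DictBackend], key: str) -> Optional[str]:
    """Return the value from the first backend that knows the key."""
    for backend in backends:
        value = backend.get_variable(key)
        if value is not None:
            return value
    return None


# Stand-ins for env vars, the custom backend, and the metastore.
env = DictBackend({"region": "from-env"})
custom = DictBackend({"region": "from-custom", "token": "from-custom"})
metastore = DictBackend({"token": "from-metastore", "owner": "from-metastore"})

# Same order as in the backend_kwargs example: env vars, custom, metastore.
order = [env, custom, metastore]
```

With this order, `first_match(order, "region")` is answered by the env stand-in even though the custom backend also defines the key, which is the behavior the issue asks for.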

1 reaction
kaxil commented, Jul 14, 2021

I think the priority is correct and will cause confusion if changed later on. And with a custom Secrets Backend, you can mix and match however you like. We intentionally did this (same for XCom Backend) so that companies can create one for their own needs, as ONE SIZE DOES NOT FIT ALL.

That being said, something I had planned earlier was to allow DAG authors to pick a single backend to get the variable or connection from (though not for getting configurations from the Secrets Backend).

For example, the following will only check environment variables to get Airflow Variables:

Variable.get(
    key="example_key",
    secrets_backend="airflow.secrets.environment_variables.EnvironmentVariablesBackend"
)

Only changes required:

diff --git a/airflow/models/variable.py b/airflow/models/variable.py
index 7d4726966..05b266bd3 100644
--- a/airflow/models/variable.py
+++ b/airflow/models/variable.py
@@ -124,6 +124,7 @@ class Variable(Base, LoggingMixin):
         key: str,
         default_var: Any = __NO_DEFAULT_SENTINEL,
         deserialize_json: bool = False,
+        secrets_backend: Optional[str] = None,
     ) -> Any:
         """
         Gets a value for an Airflow Variable Key
@@ -132,7 +133,7 @@ class Variable(Base, LoggingMixin):
         :param default_var: Default value of the Variable if the Variable doesn't exists
         :param deserialize_json: Deserialize the value to a Python dict
         """
-        var_val = Variable.get_variable_from_secrets(key=key)
+        var_val = Variable.get_variable_from_secrets(key, secrets_backend)
         if var_val is None:
             if default_var is not cls.__NO_DEFAULT_SENTINEL:
                 return default_var
@@ -193,14 +194,35 @@ class Variable(Base, LoggingMixin):
             self._val = fernet.rotate(self._val.encode('utf-8')).decode()

     @staticmethod
-    def get_variable_from_secrets(key: str) -> Optional[str]:
+    def get_variable_from_secrets(
+        key: str,
+        secrets_backend: Optional[str] = None,
+    ) -> Optional[str]:
         """
         Get Airflow Variable by iterating over all Secret Backends.

         :param key: Variable Key
         :return: Variable Value
         """
-        for secrets_backend in ensure_secrets_loaded():
+        secrets_backends = ensure_secrets_loaded()
+        secrets_backends_classes = {
+            f"{backend.__class__.__module__}.{backend.__class__.__name__}": backend
+            for backend in secrets_backends
+        }
+
+        if secrets_backend is not None and secrets_backend not in secrets_backends_classes:
+            raise KeyError(
+                f"Invalid secrets backend - '{secrets_backend}'. "
+                f"Should be one of {', '.join(secrets_backends_classes.keys())}"
+            )
+
+        if secrets_backend:
+            var_val = secrets_backends_classes[secrets_backend].get_variable(key=key)
+            if var_val is not None:
+                return var_val
+            return None
+

cc @fhoda
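
The selection logic the diff above is aiming for can be sketched as a standalone function. This is an illustration under assumptions, not the code that was merged into Airflow: `get_variable_from_backends` is a hypothetical name, the already-loaded backends are passed in explicitly instead of coming from `ensure_secrets_loaded()`, and an unset `secrets_backend` is taken to mean "search every backend in order".

```python
from typing import Any, List, Optional


def get_variable_from_backends(
    key: str,
    backends: List[Any],
    secrets_backend: Optional[str] = None,
) -> Optional[str]:
    """Look up ``key`` in one named backend, or in all of them in order."""
    # Map "module.ClassName" to the loaded backend instance.
    by_path = {
        f"{type(backend).__module__}.{type(backend).__name__}": backend
        for backend in backends
    }

    if secrets_backend is not None:
        # Validate only when the caller actually named a backend.
        if secrets_backend not in by_path:
            raise KeyError(
                f"Invalid secrets backend - '{secrets_backend}'. "
                f"Should be one of {', '.join(by_path)}"
            )
        # Query just the chosen backend; no fallback to the others.
        return by_path[secrets_backend].get_variable(key)

    # Default behavior: first non-None value in configured order.
    for backend in backends:
        value = backend.get_variable(key)
        if value is not None:
            return value
    return None
```

Keeping the validation inside the `secrets_backend is not None` branch means callers who never pass the argument see the existing search behavior unchanged.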
