feature: consider allowing meltano run plugins to export env vars to context for downstream plugins to use
Feature scope
CLI (options, error messages, logging, etc.)
Description
I have a tap-spreadsheets-anywhere configuration that I want to make more dynamic by injecting a DATE into the Azure IP address URL (because it's rotated every 14 days). Currently, environment variables don't expand at that depth of config, so I can't use regular environment variable templating. Also, when using the scheduled jobs feature, I don't have the ability to set environment variables prior to my run, like `MY_DATE=$(date +'%Y-%m-%d') meltano run tap-spreadsheets-anywhere target-snowflake`, because Airflow builds the command for me from the input `tap-spreadsheets-anywhere target-snowflake`. Another thing to consider is that my production environment is in read-only mode (I think many others do this too), so I can't make changes to files.
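To illustrate the depth at which the expansion would need to happen, here is a sketch of the kind of config I mean (the URL, stream name, and exact tap-spreadsheets-anywhere option names are placeholders, not my actual config):

```yaml
# Sketch of a meltano.yml extractor config -- placeholder values only.
# The goal is for ${MY_DATE} to be expanded inside the nested `tables` entry,
# which is the depth at which env var expansion currently doesn't happen.
plugins:
  extractors:
    - name: tap-spreadsheets-anywhere
      config:
        tables:
          - name: azure_ip_ranges                                # hypothetical stream name
            path: https://example.com/azure-ips-${MY_DATE}.json  # hypothetical URL with the rotating date
            format: json
```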
Something I considered doing was writing a Python script to populate the config value and try to export it to the Meltano environment context, like `meltano run my_script_to_export_env tap-spreadsheets-anywhere target-snowflake`. This is definitely a hack, but I wonder if there are other use cases for something like this; maybe multi-tenant setups would also want something like this.
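For illustration, roughly how that hack could be wired up in meltano.yml (the utility plugin and wiring here are hypothetical, and the "export env vars back to the run context" part is exactly what is missing today):

```yaml
# Hypothetical wiring -- the utility runs as an ordinary plugin, but there is
# currently no mechanism for it to pass env vars on to downstream blocks.
plugins:
  utilities:
    - name: my_script_to_export_env
      namespace: my_script_to_export_env
      executable: ./scripts/export_date.py   # hypothetical script that computes MY_DATE

jobs:
  - name: spreadsheets-to-snowflake
    tasks:
      - my_script_to_export_env tap-spreadsheets-anywhere target-snowflake
```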
I know there's probably a challenge with the idea of potentially parallelizing these blocks in the future, in which case the timing of the script would get messed up.
@kgpayne suggested that we consider adding functionality to the EDK to allow writing back to the Meltano environment; maybe that's a path forward.
Is there a simpler way to achieve what I want?
Related Slack thread: https://meltano.slack.com/archives/C01UTUSP34M/p1660163864060499
Top GitHub Comments
@tayloramurphy I might be misunderstanding, but I think @pnadolny13 has a need for more “dynamic” env vars. Things you define in the schedule config would generally be static.
A context concept is something we’ve chatted about in the EDK spec discussions, and something I could definitely see being handy. A Databricks plugin might provision a Spark cluster and pass on what cluster was provisioned, etc.
IMHO there shouldn’t be (if we’re smart about how jobs/tasks get parallelized): you would just need to ensure that your script runs before downstream parallel tasks get farmed out. Basically, it “just” means run invocations need to support both sequential and parallel execution stages, which, with our current “Block” architecture, is totally doable.
Another related request came from @JulesHuisman in the Slack thread https://meltano.slack.com/archives/C03QCPY1XBQ/p1668112601570319. The idea was to create an extension that lets you inject env vars from Azure Key Vault, like `meltano run vault:inject tap-xyz target-xyz`. This may potentially be solved by secrets backends, though.