Using env in BashOperator results in invalid SyntaxError

Apache Airflow version: 2.0.1

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): OSX11 arm64
  • Kernel (e.g. uname -a): Darwin 20.3.0
  • Install tools: pip
  • Others:

What happened:

I am executing python scripts using the BashOperator. For my scripts to work I need to set some environment variables for connecting to the database. Until a few hours ago I was explicitly exporting them in the same bash command in the BashOperator:

my_task = BashOperator(
        task_id='my-task',
        bash_command=f"export DB_URL={db_url}; echo -e 'y' | python /path/to/my/script/my_script.py"
    )

This works, but the password appears in plain text in the logs. I therefore tried passing the variable through the env parameter instead:

my_task = BashOperator(
        task_id='my-task',
        bash_command=f"echo -e 'y' | python /path/to/my/script/my_script.py",
        env={'DB_URL': db_url}
    )

This results in an error claiming there is a SyntaxError in my Python script, even though the script runs fine without env:

[2021-03-05 10:13:20,416] {bash.py:158} INFO - Running command: echo -e 'y' | python  /path/to/my/script/my_script.py
[2021-03-05 10:13:20,424] {bash.py:169} INFO - Output:
[2021-03-05 10:13:20,518] {bash.py:173} INFO -   File "/path/to/my/script/my_script.py", line 16
[2021-03-05 10:13:20,530] {bash.py:173} INFO -     n_rows: int = query.count()
[2021-03-05 10:13:20,531] {bash.py:173} INFO -           ^
[2021-03-05 10:13:20,531] {bash.py:173} INFO - SyntaxError: invalid syntax
[2021-03-05 10:13:20,531] {bash.py:177} INFO - Command exited with return code 1
[2021-03-05 10:13:20,544] {taskinstance.py:1455} ERROR - Bash command failed. The command returned a non-zero exit code.

What you expected to happen: The script runs without errors, with the environment variable set and not shown in the log.

How to reproduce it: Create a file hello.py:

import os

print(f"Hello {os.environ.get('name')}")

And create a DAG:

from datetime import timedelta, datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    'owner': 'Bilbo',
    'depends_on_past': False,
    'email': ['bilbo@the-shire.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
    'retry_delay': timedelta(minutes=5),
}

with DAG(
    dag_id='test-env',
    default_args=default_args,
    description='Test env',
    schedule_interval=None,
    start_date=datetime(2021, 2, 24)
) as dag:

    my_task = BashOperator(
        task_id='my-task',
        bash_command=f"python /Users/bilbo/hello.py",
        env={'name': 'AirFlow'}
    )

This results in the following error:

[2021-03-05 10:21:59,646] {bash.py:158} INFO - Running command: python /Users/bilbo/hello.py
[2021-03-05 10:21:59,654] {bash.py:169} INFO - Output:
[2021-03-05 10:21:59,717] {bash.py:173} INFO -   File "/Users/bilbo/hello.py", line 3
[2021-03-05 10:21:59,718] {bash.py:173} INFO -     print(f"Hello {os.environ.get('name')}")
[2021-03-05 10:21:59,718] {bash.py:173} INFO -                                           ^
[2021-03-05 10:21:59,718] {bash.py:173} INFO - SyntaxError: invalid syntax
[2021-03-05 10:21:59,719] {bash.py:177} INFO - Command exited with return code 1
[2021-03-05 10:21:59,742] {taskinstance.py:1455} ERROR - Bash command failed. The command returned a non-zero exit code.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

1 reaction
mik-laj commented, Mar 5, 2021

This doesn’t look like a problem with Airflow. Your script is not compatible with Python 2, and the default system interpreter on your machine is Python 2. You probably use a virtual environment, and that information was not passed on to the subprocess.

Can you try the code below?

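    # Requires `import os` at the top of the DAG file.
    my_task = BashOperator(
        task_id='my-task',
        bash_command=f"python /Users/bilbo/hello.py",
        # Start from the parent environment (PATH, virtualenv settings) and add 'name'.
        env={**os.environ, 'name': 'AirFlow'}
    )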
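
The diagnosis above can be checked outside Airflow. The following is a minimal sketch, not Airflow code: `resolve_python` is a hypothetical helper that looks up `python` on the `PATH` of a given environment dict, falling back to `os.defpath` as a rough stand-in for the shell's built-in default when `PATH` is missing. The real shell default may differ from `os.defpath`.

```python
import os
import shutil


def resolve_python(env):
    """Return the `python` executable found on env's PATH, or None.

    Hypothetical helper for illustration. When PATH is absent, os.defpath
    is used as an approximation of the shell's default search path.
    """
    search_path = env.get("PATH", os.defpath)
    return shutil.which("python", path=search_path)


# Environment inherited from the parent process (virtualenv PATH included).
print("inherited:", resolve_python(dict(os.environ)))

# Environment as passed in the failing DAG: no PATH at all.
print("replaced: ", resolve_python({"name": "AirFlow"}))
```

If the two lines print different paths, the task is running under a different interpreter than the one in the active virtual environment, which would explain a SyntaxError on Python 3 only syntax.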

0 reactions
HansBambel commented, Mar 6, 2021

Ok, so the env parameter creates a new environment and does not add environment variables. Got it, thanks!
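
The replace-versus-extend behaviour described here can be reproduced with plain `subprocess`, assuming BashOperator hands `env` to the child process unchanged, as this thread suggests. This is a sketch under that assumption; `CHILD` and `run_child` are names made up for the example.

```python
import os
import subprocess
import sys

# Child program: report the value of `name` and whether PATH is visible.
CHILD = "import os; print(os.environ.get('name'), 'PATH' in os.environ)"


def run_child(env):
    """Run the child program with exactly the given environment."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Passing only the new variable replaces the whole environment: PATH is gone.
replaced = run_child({"name": "AirFlow"})

# Merging with os.environ keeps the parent's variables and adds `name`.
merged = run_child({**os.environ, "name": "AirFlow"})

print("replaced:", replaced)
print("merged:  ", merged)
```

The child is started through `sys.executable` with an absolute path, so it still launches when `PATH` is missing. A bare `python` in a shell command has no such guarantee, which is what the original DAG ran into.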
