3rd-party packages firing "ModuleNotFoundError" on official airflow 2.0.2 images

Apache Airflow version: 2.0.2

Kubernetes version (if you are using kubernetes) (use kubectl version):

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): official airflow images apache/airflow:2.0.2-python3.8 and apache/airflow:2.0.2-python3.6
  • Kernel (e.g. uname -a):
  • Install tools: docker
  • Others:

What happened: With the official 2.0.2 Airflow image, 3rd-party packages are not recognized by the DAG parser, which raises ModuleNotFoundError for DAGs and plugins.

As a test, I built a new Docker image with one 3rd-party package (surveymonty). A simple DAG with no tasks imports it, and the following error is observed:

Broken DAG: [/usr/src/airflow/dags/simpledag.py] Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/src/airflow/dags/simpledag.py", line 2, in <module>
    import surveymonty
ModuleNotFoundError: No module named 'surveymonty'

I confirmed the package is installed: pip freeze lists it, and python simpledag.py runs with no errors.
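
A shell check like this only shows what that shell's interpreter and user can see. As a minimal sketch (the helper name describe_import is hypothetical and not part of Airflow), the following Python snippet reports whether a top-level module is importable, where it resolves from, and which interpreter and user site directory are in play. Comparing its output from a shell with its output from inside a DAG file could show whether the two environments differ. That they differ in this case is an assumption, not something the report establishes.

```python
import importlib.util
import os
import site
import sys


def describe_import(module_name):
    """Report whether and where the current interpreter finds a top-level module.

    Illustrative helper only; pass a top-level name such as "surveymonty".
    """
    spec = importlib.util.find_spec(module_name)
    # os.getuid is not available on every platform, so look it up defensively.
    getuid = getattr(os, "getuid", None)
    return {
        "module": module_name,
        "found": spec is not None,
        # File the module would be loaded from, or None if it is not found.
        "origin": spec.origin if spec is not None else None,
        "uid": getuid() if getuid is not None else None,
        "executable": sys.executable,
        # Per-user site-packages directory and whether it is enabled.
        "user_site": site.getusersitepackages(),
        "user_site_enabled": site.ENABLE_USER_SITE,
        "sys_path": list(sys.path),
    }
```

For example, printing describe_import("surveymonty") at the top of simpledag.py and again from a python shell in the same container would give two reports to compare side by side.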

What you expected to happen: 3rd-party packages to be recognized by the parser.

With a plain python image and a manual install of Airflow 2.0.2 and the requirements, the issue does not occur.

How to reproduce it: Create the following Dockerfile:

FROM apache/airflow:2.0.2-python3.8

USER root
RUN pip3 install surveymonty==0.2.5
WORKDIR /usr/src/airflow
ENV AIRFLOW_HOME /usr/src/airflow
ENV AIRFLOW_GPL_UNIDECODE true
RUN chown airflow:airflow .
USER airflow
COPY ./simpledag.py dags/simpledag.py
# Airflow webserver
EXPOSE 8080

ENTRYPOINT []

simpledag.py:

from airflow import DAG
import surveymonty

dag = DAG('test', schedule_interval="* * * * *")

docker-compose.yml:

version: '3.8'
services:
    db:
        image: postgres:9.6
        environment:
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
        ports:
            - "5432:5432"

    webserver:
        build: .
        restart: always
        depends_on:
            - db
        environment: 
            - AIRFLOW__CORE__LOAD_EXAMPLES=False
            - AIRFLOW__CORE__DAGS_FOLDER=/usr/src/airflow/dags
            - AIRFLOW__CORE__PLUGINS_FOLDER=/usr/src/airflow/plugins
            - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@db:5432/airflow
        ports:
            - "8080:8080"
        command: bash -c "airflow db init && airflow webserver"

    scheduler:
        build: .
        restart: always
        depends_on:
            - db
        environment: 
            - AIRFLOW__CORE__LOAD_EXAMPLES=False
            - AIRFLOW__CORE__DAGS_FOLDER=/usr/src/airflow/dags
            - AIRFLOW__CORE__PLUGINS_FOLDER=/usr/src/airflow/plugins
            - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@db:5432/airflow
        command: airflow scheduler

In my production case, all 3rd-party libraries imported in plugins and DAGs raise ModuleNotFoundError, and the latest Google provider has multiple failures as well.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

argemiront commented, Apr 23, 2021 (1 reaction)

Installing the PyPI dependencies as the airflow user solved the issue! Thank you so much!
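
To confirm a fix like this, a small smoke check can be run inside the container as the same user that runs the scheduler. The sketch below is illustrative only: missing_modules and running_as_root are hypothetical helper names, and the module list you pass in is an assumption about what your DAGs and plugins import.

```python
import importlib.util
import os


def missing_modules(module_names):
    """Return the top-level modules the current interpreter cannot find."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


def running_as_root():
    """Return True when the effective user is root (POSIX only)."""
    # os.geteuid does not exist on Windows, so treat its absence as "not root".
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0
```

Calling missing_modules(["surveymonty"]) as the airflow user should return an empty list once the package is installed for that user. If running_as_root() is true, the check is not exercising the same environment the scheduler uses.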

potiuk commented, Apr 23, 2021 (0 reactions)

FYI: This works:

[jarek:~/tmp] 29s % cat Dockerfile 
FROM apache/airflow:2.0.2-python3.8

RUN pip3 install surveymonty==0.2.5


[jarek:~/tmp] % docker build -t test .                           
Sending build context to Docker daemon   38.4kB
Step 1/2 : FROM apache/airflow:2.0.2-python3.8
 ---> 50c98ebd1c4b
Step 2/2 : RUN pip3 install surveymonty==0.2.5
 ---> Using cache
 ---> 8afe77240fd9
Successfully built 8afe77240fd9
Successfully tagged test:latest
[jarek:~/tmp] 3s % docker run -it test python
Python 3.8.8 (default, Mar 27 2021, 03:01:29) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import surveymonty
>>> 