Add production-ready Docker Compose for the production image
Description
To complement the production image we are already working on a Helm Chart, but we might also want to add a production-ready Docker Compose that can run an Airflow installation.
Use case / motivation
For local tests and small deployments, having such a Docker Compose environment would be really useful.
We seem to have reached consensus that we need several docker-compose “sets” of files:
- Local Executor
- Celery Executor
- Kubernetes Executor (do we really need a Kubernetes Executor in a Compose file? Probably not…)
They should come in variants and make it possible to specify a number of parameters:
- Database (Postgres/MySQL)
- Redis vs. RabbitMQ (should we choose just one?)
- Ports
- Volumes (persistent / not)
- Airflow Images
- Fernet Key
- RBAC
Depending on the setup, those Docker Compose files should also perform proper DB initialisation.
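For example, the initialisation step could be a one-shot service that runs before everything else. A minimal sketch, assuming the `apache/airflow` production image (whose entrypoint forwards the command to the `airflow` CLI); the `airflow-init` service name and the `FERNET_KEY` variable are illustrative:

```yaml
# Hypothetical one-shot init service; run it once before the other services.
airflow-init:
  image: apache/airflow:1.10.10
  environment:
    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
    # Generate a key with:
    #   python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
    - AIRFLOW__CORE__FERNET_KEY=${FERNET_KEY}
  depends_on:
    - postgres
  # Airflow 1.10 command; Airflow 2.x renames it to "db init".
  command: initdb
```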
Below is an example Docker Compose (from https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1587748008106000) that we might use as a base, together with #8548. This is just an example, so this issue will not implement all of it; we will likely split those docker-compose files into separate Postgres/SQLite/MySQL variants, similarly to what we do in the CI scripts, which is why I wanted to keep this as a separate issue. User creation will be dealt with in #8606.
```yaml
version: '3'
services:
  postgres:
    image: postgres:latest
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=airflow
      - POSTGRES_PORT=5432
    ports:
      - 5432:5432
  redis:
    image: redis:latest
    ports:
      - 6379:6379
  flower:
    image: apache/airflow:1.10.10
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: flower
    ports:
      - 5555:5555
  airflow:
    image: apache/airflow:1.10.10
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: webserver
    ports:
      - 8080:8080
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
      - ./airflow-data/logs:/opt/airflow/logs
      - ./airflow-data/plugins:/opt/airflow/plugins
  airflow-scheduler:
    image: apache/airflow:1.10.10
    container_name: airflow_scheduler_cont
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: scheduler
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
      - ./airflow-data/logs:/opt/airflow/logs
      - ./airflow-data/plugins:/opt/airflow/plugins
  airflow-worker1:
    image: apache/airflow:1.10.10
    container_name: airflow_worker1_cont
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: worker
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
      - ./airflow-data/logs:/opt/airflow/logs
      - ./airflow-data/plugins:/opt/airflow/plugins
  airflow-worker2:
    image: apache/airflow:1.10.10
    container_name: airflow_worker2_cont
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: worker
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
      - ./airflow-data/logs:/opt/airflow/logs
      - ./airflow-data/plugins:/opt/airflow/plugins
  airflow-worker3:
    image: apache/airflow:1.10.10
    container_name: airflow_worker3_cont
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__CELERY__BROKER_URL=redis://:@redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://postgres:postgres@postgres:5432/airflow
      - AIRFLOW__CORE__FERNET_KEY=FB0o_zt4e3Ziq3LdUUO7F2Z95cvFFx16hU8jTeR1ASM=
      - AIRFLOW__CORE__LOAD_EXAMPLES=False
      - AIRFLOW__WEBSERVER__RBAC=True
    command: worker
    volumes:
      - ./airflow-data/dags:/opt/airflow/dags
      - ./airflow-data/logs:/opt/airflow/logs
      - ./airflow-data/plugins:/opt/airflow/plugins
```
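Most of the variant parameters listed above (image, ports, Fernet key, RBAC) could be expressed with Compose variable substitution instead of being hard-coded. A hedged sketch; `AIRFLOW_IMAGE_NAME`, `FERNET_KEY`, `AIRFLOW_RBAC` and `WEBSERVER_PORT` are made-up variable names to be supplied via the environment or an `.env` file:

```yaml
# Sketch only; illustrative variable names with sensible defaults.
airflow:
  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:1.10.10}
  environment:
    - AIRFLOW__CORE__FERNET_KEY=${FERNET_KEY:?FERNET_KEY must be set}
    - AIRFLOW__WEBSERVER__RBAC=${AIRFLOW_RBAC:-True}
  ports:
    - "${WEBSERVER_PORT:-8080}:8080"
```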
Another example from https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1587679356095400:
```yaml
version: '3.7'
networks:
  airflow:
    name: airflow
    attachable: true
volumes:
  logs:
x-database-env: &database-env
  POSTGRES_USER: airflow
  POSTGRES_DB: airflow
  POSTGRES_PASSWORD: airflow
x-airflow-env: &airflow-env
  AIRFLOW__CORE__EXECUTOR: CeleryExecutor
  AIRFLOW__WEBSERVER__RBAC: 'True'
  AIRFLOW__CORE__CHECK_SLAS: 'False'
  AIRFLOW__CORE__STORE_SERIALIZED_DAGS: 'False'
  AIRFLOW__CORE__PARALLELISM: 50
  AIRFLOW__CORE__LOAD_EXAMPLES: 'False'
  AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS: 'False'
  AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC: 10
services:
  postgres:
    image: postgres:11.5
    environment:
      <<: *database-env
      PGDATA: /var/lib/postgresql/data/pgdata
    ports:
      - 5432:5432
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./database/data:/var/lib/postgresql/data/pgdata
      - ./database/logs:/var/lib/postgresql/data/log
    command: >
      postgres
        -c listen_addresses=*
        -c logging_collector=on
        -c log_destination=stderr
        -c max_connections=200
    networks:
      - airflow
  redis:
    image: redis:5.0.5
    environment:
      REDIS_HOST: redis
      REDIS_PORT: 6379
    ports:
      - 6379:6379
    networks:
      - airflow
  webserver:
    image: airflow:1.10.10
    user: airflow
    ports:
      - 8090:8080
    volumes:
      - ./dags:/opt/airflow/dags
      - logs:/opt/airflow/logs
      - ./files:/opt/airflow/files
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      # Merge both env anchors; a mapping cannot repeat the "<<" key.
      <<: [*database-env, *airflow-env]
      ADMIN_PASSWORD: airflow
    depends_on:
      - postgres
      - redis
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /opt/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
    networks:
      - airflow
  flower:
    image: airflow:1.10.10
    user: airflow
    ports:
      - 5555:5555
    depends_on:
      - redis
    volumes:
      - logs:/opt/airflow/logs
    command: flower
    networks:
      - airflow
  scheduler:
    image: airflow:1.10.10
    volumes:
      - ./dags:/opt/airflow/dags
      - logs:/opt/airflow/logs
      - ./files:/opt/airflow/files
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      <<: *database-env
    command: scheduler
    networks:
      - airflow
  worker:
    image: airflow:1.10.10
    user: airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - logs:/opt/airflow/logs
      - ./files:/opt/airflow/files
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      <<: *database-env
    command: worker
    depends_on:
      - scheduler
    # The worker must join the shared network to reach redis and postgres.
    networks:
      - airflow
```
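Note that in both examples plain `depends_on` only orders container start-up; it does not wait for Postgres or Redis to actually accept connections. One way around that is a database healthcheck combined with `depends_on` conditions, sketched below (conditions were dropped from the 3.x file format, so this assumes a Compose version implementing the newer Compose specification):

```yaml
postgres:
  image: postgres:11.5
  healthcheck:
    # pg_isready ships with the official postgres image.
    test: ["CMD", "pg_isready", "-U", "airflow"]
    interval: 5s
    retries: 5
webserver:
  depends_on:
    postgres:
      condition: service_healthy
```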
Related issues
- The initial user creation: #8606, #8548
- Quick start documentation planned in #8542
Top GitHub Comments
I have prepared some Dockerfiles with some common configurations:
- Postgres - Redis - Airflow 2.0
- Postgres - Redis - Airflow 1.10.14
- MySQL 8.0 - Redis - Airflow 2.0
- MySQL 8.0 - Redis - Airflow 1.10.14
I added health checks where it was simple. Does anyone have an idea for health checks for `airflow-scheduler`/`airflow-worker`? That would improve stability. Besides, I am planning to prepare a tool that generates docker-compose files through a simple wizard; I am thinking of something similar to the PyTorch project: https://pytorch.org/get-started/locally/
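One possible answer to the health-check question, assuming Airflow 2.1+ for the scheduler probe (`airflow jobs check` does not exist in 1.10) and the Celery 5 CLI bundled in the image for the worker; a sketch rather than a tested recipe:

```yaml
airflow-scheduler:
  healthcheck:
    # Fails if no recent SchedulerJob heartbeat exists for this host.
    test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
    interval: 30s
    retries: 5
airflow-worker:
  healthcheck:
    # Pings this specific Celery worker over the control channel.
    test: ["CMD-SHELL", 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"']
    interval: 30s
    retries: 5
```

(The doubled `$$` keeps Compose from interpolating `$HOSTNAME` itself.)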
Thank you all for the `docker-compose` files 😃 I’m sharing mine, as it addresses some aspects that I couldn’t find in this thread and that took me some time to get working, among them a `git-sync` sidecar (optional, but quite convenient). @mik-laj I also have a working healthcheck on the scheduler. Not the most expressive, but it works.
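For reference, a `git-sync` sidecar in Compose can look roughly like the following sketch (the image tag is a placeholder to pin yourself, the repository URL is made up, and the `GIT_SYNC_*` variables are the git-sync v3 names):

```yaml
git-sync:
  image: k8s.gcr.io/git-sync/git-sync:v3.6.5  # placeholder tag; pin your own
  environment:
    - GIT_SYNC_REPO=https://example.com/your/dags.git  # made-up URL
    - GIT_SYNC_BRANCH=main
    - GIT_SYNC_ROOT=/git
    - GIT_SYNC_DEST=repo
    - GIT_SYNC_WAIT=60  # seconds between pulls
  volumes:
    - dags:/git  # share the same volume with the Airflow containers
```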
This configuration relies on an existing and initialized database.
External database - LocalExecutor - Airflow 2.0.0 - Traefik - DAGs mostly based on DockerOperator.
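A minimal sketch of what the Traefik wiring on the webserver might look like, assuming Traefik v2 label names (`airflow.example.com` is a made-up host):

```yaml
webserver:
  labels:
    - traefik.enable=true
    # Route the made-up hostname to the webserver container.
    - traefik.http.routers.airflow.rule=Host(`airflow.example.com`)
    - traefik.http.services.airflow.loadbalancer.server.port=8080
```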
I have an extra container (not shown) to handle rotating the logs that are written directly to files. It is based on logrotate. I am not sharing it here because it is a custom image and beyond the scope of the thread, but if anybody is interested, message me.
Hope it helps!