DockerOperator fails to pull an image
Apache Airflow version: 2.0
Environment:
- OS (from /etc/os-release): Debian GNU/Linux 10 (buster)
- Kernel (uname -a): Linux 37365fa0b59b 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 GNU/Linux
- Others: running inside a Docker container, forked puckel/docker-airflow
What happened:
DockerOperator does not attempt to pull an image unless force_pull is set to True, instead displaying a misleading 404 error.
What you expected to happen:
DockerOperator should attempt to pull an image when it is not present locally.
How to reproduce it:
Make sure you don’t have an image tagged debian:buster-slim present locally.
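from airflow.providers.docker.operators.docker import DockerOperator

DockerOperator(
    task_id='try_to_pull_debian',
    image='debian:buster-slim',
    command='echo hello',
    force_pull=False,  # rely on the operator to pull only if the image is missing
)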
prints: {taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: debian:buster-slim")
This, on the other hand:
DockerOperator(
    task_id='try_to_pull_debian',
    image='debian:buster-slim',
    command='echo hello',
    force_pull=True,
)
pulls the image and prints {docker.py:263} INFO - hello
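Short of setting force_pull=True on every task, one possible workaround is to attempt the container creation first and pull only when the daemon reports the image as missing. The following is a minimal sketch, not the operator's actual logic: the function name create_container_with_lazy_pull and the not_found_exc parameter are hypothetical, and cli is assumed to be a docker.APIClient (or anything exposing create_container and pull).

```python
def create_container_with_lazy_pull(cli, image, command, not_found_exc):
    """Create a container, pulling the image once if the daemon lacks it.

    Hypothetical helper. `cli` is assumed to behave like docker.APIClient;
    `not_found_exc` is the exception type raised for a missing image
    (docker.errors.ImageNotFound when using the Docker SDK for Python).
    """
    try:
        return cli.create_container(image=image, command=command)
    except not_found_exc:
        # The daemon answered 404 "No such image": pull, then retry once.
        cli.pull(image)
        return cli.create_container(image=image, command=command)


# Possible usage with the Docker SDK (assumes the `docker` package and a
# reachable daemon, so it is left commented out here):
#
#   import docker
#   cli = docker.APIClient()
#   create_container_with_lazy_pull(
#       cli, "debian:buster-slim", "echo hello", docker.errors.ImageNotFound
#   )
```

Passing the exception type in keeps the helper independent of the SDK import. A second failure after the pull is deliberately not caught, so a genuinely unknown image still surfaces as an error.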
Anything else we need to know:
I overrode DockerOperator to track down what I was doing wrong and found the following:
When trying to run an image that’s not present locally, self.cli.images(name=self.image) in the line:
https://github.com/apache/airflow/blob/8723b1feb82339d7a4ba5b40a6c4d4bbb995a4f9/airflow/providers/docker/operators/docker.py#L286
returns a non-empty array even when the image has been deleted from the local machine.
In fact, self.cli.images appears to return non-empty arrays even when supplied with nonsense image names.
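Because a non-empty result evidently does not prove the image exists, a stricter presence check could compare the requested name against the RepoTags of each returned entry instead of testing the list for emptiness. The sketch below rests on assumptions: image_present_locally is a hypothetical helper, cli is assumed to behave like docker.APIClient (whose images() call returns a list of dicts with a RepoTags key), and references given by digest (name@sha256:...) are not handled and will simply be reported as absent.

```python
def image_present_locally(cli, image):
    """Return True only if a local image carries exactly this repo:tag.

    Hypothetical helper. `cli` is assumed to behave like docker.APIClient,
    whose images() returns dicts with a "RepoTags" list (possibly None).
    """
    # A name without an explicit tag refers to ":latest". Only the last
    # path component is inspected so that a registry port such as
    # "localhost:5000/debian" is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    wanted = image if ":" in last_component else f"{image}:latest"

    for entry in cli.images():
        # Untagged (dangling) images report RepoTags as None.
        if wanted in (entry.get("RepoTags") or []):
            return True
    return False
```

With a check like this, an unrelated entry such as ubuntu:latest in the returned list would no longer be taken as evidence that debian:buster-slim is available.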
force_pull_false.log
[2021-01-27 06:15:28,987] {__init__.py:124} DEBUG - Preparing lineage inlets and outlets
[2021-01-27 06:15:28,987] {__init__.py:168} DEBUG - inlets: [], outlets: []
[2021-01-27 06:15:28,987] {config.py:21} DEBUG - Trying paths: ['/usr/local/airflow/.docker/config.json', '/usr/local/airflow/.dockercfg']
[2021-01-27 06:15:28,987] {config.py:25} DEBUG - Found file at path: /usr/local/airflow/.docker/config.json
[2021-01-27 06:15:28,987] {auth.py:182} DEBUG - Found 'auths' section
[2021-01-27 06:15:28,988] {auth.py:142} DEBUG - Found entry (registry='https://index.docker.io/v1/', username='xxxxxxx')
[2021-01-27 06:15:29,015] {connectionpool.py:433} DEBUG - http://localhost:None "GET /version HTTP/1.1" 200 851
[2021-01-27 06:15:29,060] {connectionpool.py:433} DEBUG - http://localhost:None "GET /v1.41/images/json?filter=debian%3Abuster-slim&only_ids=0&all=0 HTTP/1.1" 200 None
[2021-01-27 06:15:29,060] {docker.py:224} INFO - Starting docker container from image debian:buster-slim
[2021-01-27 06:15:29,063] {connectionpool.py:433} DEBUG - http://localhost:None "POST /v1.41/containers/create HTTP/1.1" 404 48
[2021-01-27 06:15:29,063] {taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: debian:buster-slim")
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 261, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.8/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/containers/create

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1086, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1260, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1300, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 305, in execute
    return self._run_image()
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 231, in _run_image
    self.container = self.cli.create_container(
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 427, in create_container
    return self.create_container_from_config(config, name)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 438, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 267, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 263, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.8/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.ImageNotFound: 404 Client Error: Not Found ("No such image: debian:buster-slim")
force_pull_true.log
[2021-01-27 06:17:01,811] {__init__.py:124} DEBUG - Preparing lineage inlets and outlets
[2021-01-27 06:17:01,811] {__init__.py:168} DEBUG - inlets: [], outlets: []
[2021-01-27 06:17:01,811] {config.py:21} DEBUG - Trying paths: ['/usr/local/airflow/.docker/config.json', '/usr/local/airflow/.dockercfg']
[2021-01-27 06:17:01,811] {config.py:25} DEBUG - Found file at path: /usr/local/airflow/.docker/config.json
[2021-01-27 06:17:01,811] {auth.py:182} DEBUG - Found 'auths' section
[2021-01-27 06:17:01,812] {auth.py:142} DEBUG - Found entry (registry='https://index.docker.io/v1/', username='xxxxxxxxx')
[2021-01-27 06:17:01,825] {connectionpool.py:433} DEBUG - http://localhost:None "GET /version HTTP/1.1" 200 851
[2021-01-27 06:17:01,826] {docker.py:287} INFO - Pulling docker image debian:buster-slim
[2021-01-27 06:17:01,826] {auth.py:41} DEBUG - Looking for auth config
[2021-01-27 06:17:01,826] {auth.py:242} DEBUG - Looking for auth entry for 'docker.io'
[2021-01-27 06:17:01,826] {auth.py:250} DEBUG - Found 'https://index.docker.io/v1/'
[2021-01-27 06:17:01,826] {auth.py:54} DEBUG - Found auth config
[2021-01-27 06:17:04,399] {connectionpool.py:433} DEBUG - http://localhost:None "POST /v1.41/images/create?tag=buster-slim&fromImage=debian HTTP/1.1" 200 None
[2021-01-27 06:17:04,400] {docker.py:301} INFO - buster-slim: Pulling from library/debian
[2021-01-27 06:17:04,982] {docker.py:301} INFO - a076a628af6f: Pulling fs layer
[2021-01-27 06:17:05,884] {docker.py:301} INFO - a076a628af6f: Downloading
[2021-01-27 06:17:11,429] {docker.py:301} INFO - a076a628af6f: Verifying Checksum
[2021-01-27 06:17:11,429] {docker.py:301} INFO - a076a628af6f: Download complete
[2021-01-27 06:17:11,480] {docker.py:301} INFO - a076a628af6f: Extracting
Top GitHub Comments
I am facing the same issue. This seems to happen only when running Airflow inside a Docker container. I had a non-dockerized deployment for which it worked fine without the force_pull=True option. Since moving to a docker-compose deployment, I’ve been getting the same 404 error.

The issue template is just all the questions asked when you create this GitHub issue, like the Airflow version, how to reproduce, etc. (https://github.com/apache/airflow/issues/new?assignees=&labels=kind%3Abug&template=bug_report.md&title=)