DockerOperator fails to pull an image

Apache Airflow version: 2.0

Environment:

  • OS (from /etc/os-release): Debian GNU/Linux 10 (buster)
  • Kernel (uname -a): Linux 37365fa0b59b 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 GNU/Linux
  • Others: running inside a docker container, forked puckel/docker-airflow

What happened:

DockerOperator does not attempt to pull an image unless force_pull is set to True, instead displaying a misleading 404 error.

What you expected to happen:

DockerOperator should attempt to pull an image when it is not present locally.

How to reproduce it:

Make sure you don’t have an image tagged debian:buster-slim present locally.

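from airflow.providers.docker.operators.docker import DockerOperator

# Fails with a 404 when the image is not already present locally.
DockerOperator(
    task_id='try_to_pull_debian',
    image='debian:buster-slim',
    command='echo hello',
    force_pull=False,
)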

prints: {taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: debian:buster-slim")

This, on the other hand:

DockerOperator(
    task_id='try_to_pull_debian',
    image='debian:buster-slim',
    command='echo hello',
    force_pull=True,
)

pulls the image and prints {docker.py:263} INFO - hello

Anything else we need to know:

I overrode DockerOperator to track down what I was doing wrong and found the following:

When the image is not present locally, the call self.cli.images(name=self.image) at https://github.com/apache/airflow/blob/8723b1feb82339d7a4ba5b40a6c4d4bbb995a4f9/airflow/providers/docker/operators/docker.py#L286 still returns a non-empty array, even after the image has been deleted from the local machine. Because the operator only pulls when that list is empty (or when force_pull is set), the pull is skipped.

In fact, self.cli.images appears to return non-empty arrays even when supplied with nonsense image names.
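One way to guard against an unfiltered listing is to match the requested reference against each entry's RepoTags on the client side instead of trusting the list to be non-empty only when the image exists. The sketch below is a minimal illustration, not the fix adopted by Airflow: the helper names `normalize_reference` and `image_present_locally` are hypothetical, and it assumes the listing has the shape returned by docker-py's `APIClient.images()` (a list of dicts with a `RepoTags` key that may be null). Digest references (`name@sha256:...`) are not handled. The request in the log below uses the `filter` query parameter; if the daemon ignores that parameter (an assumption consistent with the behaviour described above), the listing would contain every local image.

```python
from typing import Any, Dict, Iterable


def normalize_reference(image: str) -> str:
    """Append ':latest' when the reference carries no explicit tag."""
    # Only the last path segment can hold a tag; a colon earlier in the
    # string belongs to a registry port such as 'localhost:5000'.
    last_segment = image.rsplit("/", 1)[-1]
    return image if ":" in last_segment else f"{image}:latest"


def image_present_locally(listing: Iterable[Dict[str, Any]], image: str) -> bool:
    """Return True only if some listed image is tagged exactly as requested."""
    wanted = normalize_reference(image)
    for entry in listing:
        # RepoTags can be None for untagged (dangling) images.
        if wanted in (entry.get("RepoTags") or []):
            return True
    return False
```

With docker-py, the listing would come from something like `cli.images()` on an `APIClient` instance, and the operator-style check would become `force_pull or not image_present_locally(cli.images(), image)`.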

force_pull_false.log

[2021-01-27 06:15:28,987] {__init__.py:124} DEBUG - Preparing lineage inlets and outlets
[2021-01-27 06:15:28,987] {__init__.py:168} DEBUG - inlets: [], outlets: []
[2021-01-27 06:15:28,987] {config.py:21} DEBUG - Trying paths: ['/usr/local/airflow/.docker/config.json', '/usr/local/airflow/.dockercfg']
[2021-01-27 06:15:28,987] {config.py:25} DEBUG - Found file at path: /usr/local/airflow/.docker/config.json
[2021-01-27 06:15:28,987] {auth.py:182} DEBUG - Found 'auths' section
[2021-01-27 06:15:28,988] {auth.py:142} DEBUG - Found entry (registry='https://index.docker.io/v1/', username='xxxxxxx')
[2021-01-27 06:15:29,015] {connectionpool.py:433} DEBUG - http://localhost:None "GET /version HTTP/1.1" 200 851
[2021-01-27 06:15:29,060] {connectionpool.py:433} DEBUG - http://localhost:None "GET /v1.41/images/json?filter=debian%3Abuster-slim&only_ids=0&all=0 HTTP/1.1" 200 None
[2021-01-27 06:15:29,060] {docker.py:224} INFO - Starting docker container from image debian:buster-slim
[2021-01-27 06:15:29,063] {connectionpool.py:433} DEBUG - http://localhost:None "POST /v1.41/containers/create HTTP/1.1" 404 48
[2021-01-27 06:15:29,063] {taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: debian:buster-slim")
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 261, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.8/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/containers/create

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1086, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1260, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1300, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 305, in execute
    return self._run_image()
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 231, in _run_image
    self.container = self.cli.create_container(
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 427, in create_container
    return self.create_container_from_config(config, name)
  File "/usr/local/lib/python3.8/site-packages/docker/api/container.py", line 438, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 267, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.8/site-packages/docker/api/client.py", line 263, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.8/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.ImageNotFound: 404 Client Error: Not Found ("No such image: debian:buster-slim")
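The traceback ends in docker.errors.ImageNotFound raised from create_container. Short of setting force_pull=True, one possible workaround in a custom subclass is to catch that error, pull the image once, and retry. The sketch below shows only the control flow: `create_with_pull_fallback` is a hypothetical helper, not part of Airflow or docker-py, and the exception types are passed in by the caller so that the helper itself has no dependency on the Docker SDK.

```python
from typing import Any, Callable, Tuple, Type


def create_with_pull_fallback(
    create: Callable[[], Any],
    pull: Callable[[], Any],
    missing_image_errors: Tuple[Type[BaseException], ...],
) -> Any:
    """Call create(); if it fails because the image is missing, pull once and retry.

    A second failure after the pull propagates to the caller unchanged.
    """
    try:
        return create()
    except missing_image_errors:
        pull()
        return create()
```

With docker-py this might be wired up as `create=lambda: cli.create_container(image=image, command=command)`, `pull=lambda: cli.pull(image)`, and `missing_image_errors=(docker.errors.ImageNotFound,)`, where `cli`, `image`, and `command` are whatever the surrounding operator already holds.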

force_pull_true.log

[2021-01-27 06:17:01,811] {__init__.py:124} DEBUG - Preparing lineage inlets and outlets
[2021-01-27 06:17:01,811] {__init__.py:168} DEBUG - inlets: [], outlets: []
[2021-01-27 06:17:01,811] {config.py:21} DEBUG - Trying paths: ['/usr/local/airflow/.docker/config.json', '/usr/local/airflow/.dockercfg']
[2021-01-27 06:17:01,811] {config.py:25} DEBUG - Found file at path: /usr/local/airflow/.docker/config.json
[2021-01-27 06:17:01,811] {auth.py:182} DEBUG - Found 'auths' section
[2021-01-27 06:17:01,812] {auth.py:142} DEBUG - Found entry (registry='https://index.docker.io/v1/', username='xxxxxxxxx')
[2021-01-27 06:17:01,825] {connectionpool.py:433} DEBUG - http://localhost:None "GET /version HTTP/1.1" 200 851
[2021-01-27 06:17:01,826] {docker.py:287} INFO - Pulling docker image debian:buster-slim
[2021-01-27 06:17:01,826] {auth.py:41} DEBUG - Looking for auth config
[2021-01-27 06:17:01,826] {auth.py:242} DEBUG - Looking for auth entry for 'docker.io'
[2021-01-27 06:17:01,826] {auth.py:250} DEBUG - Found 'https://index.docker.io/v1/'
[2021-01-27 06:17:01,826] {auth.py:54} DEBUG - Found auth config
[2021-01-27 06:17:04,399] {connectionpool.py:433} DEBUG - http://localhost:None "POST /v1.41/images/create?tag=buster-slim&fromImage=debian HTTP/1.1" 200 None
[2021-01-27 06:17:04,400] {docker.py:301} INFO - buster-slim: Pulling from library/debian
[2021-01-27 06:17:04,982] {docker.py:301} INFO - a076a628af6f: Pulling fs layer
[2021-01-27 06:17:05,884] {docker.py:301} INFO - a076a628af6f: Downloading
[2021-01-27 06:17:11,429] {docker.py:301} INFO - a076a628af6f: Verifying Checksum
[2021-01-27 06:17:11,429] {docker.py:301} INFO - a076a628af6f: Download complete
[2021-01-27 06:17:11,480] {docker.py:301} INFO - a076a628af6f: Extracting

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

3 reactions
ldealmei commented, Feb 18, 2021

I am facing the same issue. It seems to happen only when Airflow itself runs inside a Docker container. My previous non-dockerized deployment worked fine without force_pull=True. Since moving to a docker-compose deployment, I have been getting the same 404 error.

[2021-02-18 09:24:44,360] {docker.py:224} INFO - Starting docker container from image busybox:stable
[2021-02-18 09:24:44,394] {taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: busybox:stable")
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
    response.raise_for_status()
  File "/home/airflow/.local/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/containers/create

1 reaction
kaxil commented, Jan 26, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

Relatedly, I looked for an issue template in the contributing guidelines and couldn’t find one.

The issue template is simply the set of questions you are asked when creating a GitHub issue, such as the Airflow version and how to reproduce the problem (https://github.com/apache/airflow/issues/new?assignees=&labels=kind%3Abug&template=bug_report.md&title=).
