Running Airflow in Docker - Tutorial dags backfill tutorial example fails with TypeError: cannot serialize '_io.TextIOWrapper' object

Hello, apologies in advance if this is a newbie mistake. I'm working through the tutorial locally with Docker Desktop for Mac, using the docker-compose file from Running Airflow in Docker. I'm copying the tutorial code as is, except for replacing airflow with ./airflow.sh so the commands run in Docker. All the commands work as expected except for the backfill example, which fails with TypeError: cannot serialize '_io.TextIOWrapper' object. Please advise.

# start your backfill on a date range
 % ./airflow.sh dags backfill tutorial \
    --start-date 2015-06-01 \
    --end-date 2015-06-07
Creating airflow_tut_airflow-worker_run ... done
BACKEND=postgresql+psycopg2
DB_HOST=postgres
DB_PORT=5432

/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/dag_command.py:62 PendingDeprecationWarning: --ignore-first-depends-on-past is deprecated as the value is always set to True
[2021-02-23 08:11:03,363] {dagbag.py:448} INFO - Filling up the DagBag from /opt/airflow/dags
[2021-02-23 08:11:04,308] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-01T00:00:00+00:00', '--ignore-depends-on-past', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,343] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-02T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,375] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-03T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,406] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-04T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,446] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-05T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,487] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-06T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
[2021-02-23 08:11:04,573] {base_executor.py:82} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'tutorial', 'print_date', '2015-06-07T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/home/airflow/.local/lib/python3.6/site-packages/airflow/example_dags/tutorial.py']
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/__main__.py", line 40, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/cli.py", line 89, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/cli/commands/dag_command.py", line 116, in dag_backfill
    run_backwards=args.run_backwards,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/dag.py", line 1706, in run
    job.run()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/base_job.py", line 237, in run
    self._execute()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 65, in wrapper
    return func(*args, session=session, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 805, in _execute
    session=session,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 727, in _execute_for_run_dates
    session=session,
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/utils/session.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/jobs/backfill_job.py", line 602, in _process_backfill_task_instances
    executor.heartbeat()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 158, in heartbeat
    self.trigger_tasks(open_slots)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 263, in trigger_tasks
    self._process_tasks(task_tuples_to_send)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 272, in _process_tasks
    key_and_async_results = self._send_tasks_to_celery(task_tuples_to_send)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 332, in _send_tasks_to_celery
    send_task_to_executor, task_tuples_to_send, chunksize=chunksize
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot serialize '_io.TextIOWrapper' object
 % docker --version
Docker version 20.10.2, build 2291f61
% docker-compose --version
docker-compose version 1.27.4, build 40524192
% docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                    PORTS                              NAMES
cfb65ccd5432   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes             8080/tcp                           airflow_tut_airflow-scheduler_1
46b70dfa6c90   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes             8080/tcp                           airflow_tut_airflow-worker_1
37065f3e3d9b   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes (healthy)   0.0.0.0:5555->5555/tcp, 8080/tcp   airflow_tut_flower_1
e498390f3c50   apache/airflow:2.0.1   "/usr/bin/dumb-init …"   21 minutes ago   Up 21 minutes (healthy)   0.0.0.0:8080->8080/tcp             airflow_tut_airflow-webserver_1
5844644d1157   postgres:13            "docker-entrypoint.s…"   23 minutes ago   Up 22 minutes (healthy)   5432/tcp                           airflow_tut_postgres_1
929c10f40745   redis:latest           "docker-entrypoint.s…"   23 minutes ago   Up 22 minutes (healthy)   0.0.0.0:6379->6379/tcp             airflow_tut_redis_1
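
The traceback above ends inside multiprocessing's `_ForkingPickler.dumps`, meaning the failure happens while an object is being pickled for a worker process. The following is a minimal sketch of that class of failure only; it does not reproduce the Airflow code path, and the `payload` dictionaries and the `try_pickle` helper are hypothetical names made up for illustration. It shows that a payload holding an open text stream cannot be pickled, while the same payload without the stream can.

```python
import os
import pickle


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


command = ["airflow", "tasks", "run", "tutorial", "print_date"]

with open(os.devnull, "w") as stream:
    # Hypothetical payloads: one plain, one carrying an open text stream.
    ok_result = try_pickle({"command": command, "log_stream": None})
    bad_result = try_pickle({"command": command, "log_stream": stream})

print("plain payload:", ok_result)
# The exact wording varies by Python version, but it names _io.TextIOWrapper.
print("payload with stream:", bad_result)
```

If something similar happens inside a library, the fix is usually on the library or interpreter side rather than in the DAG code, which would be consistent with the image change discussed in the comments below.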

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 5
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

6 reactions
DougForrest commented, Feb 23, 2021

Thanks for the response. I wasn't aware there was such a large difference. I can confirm that the image apache/airflow:2.0.1-python3.8 also works in my local environment.

4 reactions
DougForrest commented, Feb 23, 2021

I updated the default Airflow image in docker-compose.yaml, which fixed the problem.

-  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.0.1}
+  image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:master-python3.8}

The output above shows that the container was running Python 3.6. The comments in docker-compose.yaml state that the default should be apache/airflow:master-python3.8.

I created a pull request against v2-0-stable. The changes are consistent with the master branch: https://github.com/apache/airflow/pull/14404
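
The image line in docker-compose.yaml uses the `${AIRFLOW_IMAGE_NAME:-...}` form, which falls back to the value after `:-` when the variable is unset or empty. The sketch below mimics that rule in Python to show why setting `AIRFLOW_IMAGE_NAME` in the environment could be an alternative to editing the file. `resolve_image` is a hypothetical helper written for illustration, not part of Docker Compose or Airflow, and whether `./airflow.sh` passes the variable through in a given setup is an assumption to verify.

```python
def resolve_image(env, default="apache/airflow:2.0.1"):
    """Mimic ${AIRFLOW_IMAGE_NAME:-default}: use default when unset or empty."""
    value = env.get("AIRFLOW_IMAGE_NAME", "")
    return value if value else default


# Variable not set: the default from the compose file applies.
unset_image = resolve_image({})

# Variable set: it overrides the default (tag taken from the comment above).
override_image = resolve_image(
    {"AIRFLOW_IMAGE_NAME": "apache/airflow:2.0.1-python3.8"}
)

# Variable set but empty: the ":-" form still falls back to the default.
empty_image = resolve_image({"AIRFLOW_IMAGE_NAME": ""})

print(unset_image)
print(override_image)
print(empty_image)
```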
