Scheduler is unable to find serialized DAG in the serialized_dag table

Apache Airflow version: 2.0.0

Kubernetes version (if you are using kubernetes) (use kubectl version): Not relevant

Environment:

  • Cloud provider or hardware configuration:

  • OS (e.g. from /etc/os-release): CentOS Linux 7 (Core)

  • Kernel (e.g. uname -a): Linux us01odcres-jamuaar-0003 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools: PostgreSQL 12.2

  • Others:

What happened:

I have two DAG files, dag1.py and dag2.py. dag1.py defines a static DAG: once parsed, it always produces one specific DAG. dag2.py creates dynamic DAGs based on JSON files kept in an external location.

A late-stage task in the static DAG (generated from dag1.py) writes JSON files, which dag2.py picks up to create the dynamic DAGs.
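For readers unfamiliar with the pattern, here is a minimal sketch of what a file like dag2.py might look like. It is not the reporter's actual code: the function names, the fallback to the file name as dag_id, and the start date are all assumptions made for illustration, and the DAG arguments follow Airflow 2.0 conventions.

```python
import json
from pathlib import Path


def load_dag_configs(config_dir):
    """Return one config dict per *.json file, ordered by file name."""
    configs = []
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open() as handle:
            config = json.load(handle)
        # Assumption: fall back to the file name when no dag_id is given.
        config.setdefault("dag_id", path.stem)
        configs.append(config)
    return configs


def build_dag(config):
    """Build an unpaused DAG that is scheduled once (Airflow 2.0 arguments)."""
    # Imported lazily so the JSON handling above works without Airflow.
    from datetime import datetime

    from airflow import DAG

    return DAG(
        dag_id=config["dag_id"],
        schedule_interval="@once",
        start_date=datetime(2021, 1, 1),  # hypothetical start date
        is_paused_upon_creation=False,
    )


def register_dags(config_dir, namespace):
    """Expose one DAG per config at module level so the DAG parser finds it."""
    for config in load_dag_configs(config_dir):
        namespace[config["dag_id"]] = build_dag(config)
```

In the DAG file itself this would be called as register_dags(some_directory, globals()), where some_directory stands for the external JSON location.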

The dynamic DAGs are created unpaused and are scheduled once. This whole process worked fine with Airflow 1.x, where DAG serialization was optional and turned off by default.

But with Airflow 2.0, I occasionally get the following exception when the scheduler tries to schedule the dynamically generated DAGs.

[2021-01-06 10:09:38,742] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
    self._run_scheduler_loop()
  File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
  File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
    self._create_dag_runs(query.all(), session)
  File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1557, in _create_dag_runs
    dag = self.dagbag.get_dag(dag_model.dag_id, session=session)
  File "/global/packages/python/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/global/packages/python/lib/python3.7/site-packages/airflow/models/dagbag.py", line 171, in get_dag
    self._add_dag_from_db(dag_id=dag_id, session=session)
  File "/global/packages/python/lib/python3.7/site-packages/airflow/models/dagbag.py", line 227, in _add_dag_from_db
    raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
airflow.exceptions.SerializedDagNotFound: DAG 'dynamic_dag_1' not found in serialized_dag table

When I checked the serialized_dag table manually, I could see the DAG entry there. Its last_updated value was 2021-01-06 10:09:38.757076+05:30, whereas the exception was logged at [2021-01-06 10:09:38,742], slightly before the last_updated time.
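The size of that gap can be checked with a few lines of Python. This sketch assumes the log timestamp is in the same +05:30 zone as the last_updated column, which the issue does not state explicitly; the variable names are illustrative.

```python
from datetime import datetime

# Value of the last_updated column, including its UTC offset.
last_updated = datetime.fromisoformat("2021-01-06 10:09:38.757076+05:30")

# Timestamp from the scheduler log line; assumed to be in the same zone.
error_logged = datetime.strptime(
    "2021-01-06 10:09:38,742", "%Y-%m-%d %H:%M:%S,%f"
).replace(tzinfo=last_updated.tzinfo)

# Positive gap means the row was written after the error was logged.
gap = last_updated - error_logged
print(f"serialized_dag row written {gap.total_seconds() * 1000:.3f} ms after the error")
```

Under that assumption, the row appears roughly 15 ms after the scheduler failed to find it.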

I think this means that the Scheduler tried to look for the DAG entry in the serialized_dag table before DagFileProcessor created the entry.

Is this right, or could something else be going on here?

What you expected to happen:

The scheduler should look for the DAG entry in the serialized_dag table only after DagFileProcessor has added it. Here it seems that DagFileProcessor added the DAG entry to the dag table, and the scheduler immediately fetched this dag_id and looked for it in the serialized_dag table before DagFileProcessor could add it there.
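The suspected ordering can be illustrated with a toy model. This is not Airflow code: the two tables are plain Python containers, the exception class is a stand-in, and every function name is hypothetical. It only shows why a reader that trusts the dag table can fail when the two writes are not atomic.

```python
class SerializedDagNotFound(Exception):
    """Stand-in for the Airflow exception of the same name."""


def simulate():
    """Replay the suspected interleaving; return (first_error, second_result)."""
    dag_table = set()           # toy stand-in for the dag table
    serialized_dag_table = {}   # toy stand-in for the serialized_dag table

    def scheduler_lookup():
        # The scheduler takes dag_ids from the dag table, then loads each
        # one from the serialized_dag table.
        found = []
        for dag_id in sorted(dag_table):
            if dag_id not in serialized_dag_table:
                raise SerializedDagNotFound(
                    f"DAG '{dag_id}' not found in serialized_dag table"
                )
            found.append(dag_id)
        return found

    # Step 1: the file processor writes the dag row first.
    dag_table.add("dynamic_dag_1")

    # Step 2: the scheduler runs before the serialized row exists.
    try:
        scheduler_lookup()
        first_error = None
    except SerializedDagNotFound as exc:
        first_error = str(exc)

    # Step 3: the file processor writes the serialized row.
    serialized_dag_table["dynamic_dag_1"] = {"dag_id": "dynamic_dag_1"}

    # Step 4: the same lookup now succeeds.
    return first_error, scheduler_lookup()
```

In this model the first lookup fails and the second succeeds, matching the observation that the row is present when inspected manually after the error.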

How to reproduce it: It occurs occasionally, and there is no well-defined way to reproduce it.

Anything else we need to know:

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

6 reactions
nik-davis commented, Jan 19, 2021

Would just like to add our temporary solution, which is helping us get around this issue and seems to be working quite nicely. We’ve added a Python script that runs before the scheduler starts and serializes any missing DAGs, so if the scheduler fails on this error, it will be fixed the next time it starts up.

Here’s serialize_missing_dags.py:

from airflow.models import DagBag
from airflow.models.serialized_dag import SerializedDagModel

dag_bag = DagBag()

# Check DB for missing serialized DAGs, and add them if missing
for dag_id in dag_bag.dag_ids:
    if not SerializedDagModel.get(dag_id):
        dag = dag_bag.get_dag(dag_id)
        SerializedDagModel.write_dag(dag)

Which we call before starting the scheduler: python serialize_missing_dags.py && exec airflow scheduler

I hope this helps!

5 reactions
kaxil commented, Jan 22, 2021

Will be fixed for 2.0.1 – currently aiming to release it in 2nd week of Feb
