DAG serialization JSONDecodeError
Apache Airflow version: 1.10.12
Kubernetes version (if you are using kubernetes) (use kubectl version):
Environment:
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Others: MySQL (RDS) metadata backend (v5.6.43)
What happened:
We recently turned on DAG serialization and noticed that when we click on large DAGs in the UI, we get the error shown in the traceback below.
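For reference, serialization in 1.10.x is driven by the store_serialized_dags setting; a quick sanity check from a Python shell (a sketch, using the 1.10.x setting names) looks like this:

# Hypothetical sanity check: confirm DAG serialization is enabled (Airflow 1.10.x config keys)
from airflow.configuration import conf

print(conf.getboolean("core", "store_serialized_dags"))           # True when serialization is on
print(conf.getint("core", "min_serialized_dag_update_interval"))  # how often serialized DAGs are rewritten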
Traceback
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.7/dist-packages/airflow/www_rbac/decorators.py", line 121, in wrapper
return f(self, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/flask_appbuilder/security/decorators.py", line 109, in wraps
return f(self, *args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/airflow/www_rbac/decorators.py", line 92, in view_func
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/airflow/www_rbac/decorators.py", line 56, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/airflow/www_rbac/views.py", line 1407, in tree
dag = dagbag.get_dag(dag_id)
File "/usr/local/lib/python3.7/dist-packages/airflow/models/dagbag.py", line 136, in get_dag
self._add_dag_from_db(dag_id=dag_id)
File "/usr/local/lib/python3.7/dist-packages/airflow/models/dagbag.py", line 191, in _add_dag_from_db
row = SerializedDagModel.get(dag_id)
File "/usr/local/lib/python3.7/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/airflow/models/serialized_dag.py", line 217, in get
row = session.query(cls).filter(cls.dag_id == dag_id).one_or_none()
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
ret = list(self)
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/loading.py", line 100, in instances
cursor.close()
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
with_traceback=exc_tb,
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/loading.py", line 80, in instances
rows = [proc(row) for row in fetch]
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/loading.py", line 80, in <listcomp>
rows = [proc(row) for row in fetch]
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/loading.py", line 588, in _instance
populators,
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/orm/loading.py", line 725, in _populate_full
dict_[key] = getter(row)
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/sql/type_api.py", line 1278, in process
return process_value(impl_processor(value), dialect)
File "/usr/local/lib/python3.7/dist-packages/sqlalchemy/sql/sqltypes.py", line 2454, in process
return json_deserializer(value)
File "/usr/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 16275 (char 16274)
We’ve determined the issue is the data column of the serialized_dag table: on MySQL it is created as TEXT, which caps values at 64KB (65,535 bytes), but some of our DAGs serialize to JSON larger than that, so the stored payload is truncated mid-string. We worked around this by running the following manually against the serialized_dag table and then waiting for the scheduler to re-serialize the affected DAGs:
-- Back up the existing serialized DAGs first
CREATE TABLE serialized_dag_backup AS SELECT * FROM serialized_dag;
-- Widen the column; MEDIUMTEXT allows up to 16MB
ALTER TABLE serialized_dag MODIFY data MEDIUMTEXT;
-- Rows at exactly 65535 bytes were truncated at the old TEXT limit:
-- inspect them, then delete them so the scheduler re-serializes those DAGs
SELECT * FROM serialized_dag WHERE LENGTH(data) = 65535;
DELETE FROM serialized_dag WHERE LENGTH(data) = 65535;
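To spot affected DAGs before they hit the limit, the serialized size can also be estimated offline. This is a rough sketch (it assumes the DAG files can be parsed locally and uses the SerializedDAG helper available in 1.10.12):

# Rough sketch: report DAGs whose serialized JSON is at or over MySQL's 64KB TEXT limit
import json

from airflow.models import DagBag
from airflow.serialization.serialized_objects import SerializedDAG

TEXT_LIMIT = 64 * 1024  # MySQL TEXT column limit in bytes

dagbag = DagBag()  # parses DAGs from the configured DAGs folder
for dag_id, dag in dagbag.dags.items():
    size = len(json.dumps(SerializedDAG.to_dict(dag)).encode("utf-8"))
    if size >= TEXT_LIMIT:
        print("{}: {} bytes (exceeds TEXT limit)".format(dag_id, size))

Anything already at the limit in the live table will have been truncated on insert, which is what the LENGTH(data) = 65535 check above catches.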
What you expected to happen:
Clicking on the DAG in the UI should work without error.
How to reproduce it: With a MySQL metadata backend, enable DAG serialization and create a DAG whose serialized JSON is larger than 64KB. Then attempt to click on that DAG in the UI.
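For illustration, a hypothetical DAG along these lines serializes to well over 64KB (the task count is arbitrary; the exact threshold depends on the task attributes):

# Hypothetical reproduction DAG (Airflow 1.10.x imports): enough tasks that the
# serialized JSON exceeds MySQL's 64KB TEXT limit and is truncated on insert
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

with DAG(
    dag_id="oversized_serialized_dag",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
) as dag:
    for i in range(1000):  # arbitrary count, large enough to push the serialized JSON past 64KB
        DummyOperator(task_id="task_{}".format(i))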
Anything else we need to know: N/A
Cool, yeah, with Airflow 2.0 around the corner I think it would be worthwhile for you to upgrade to at least 5.7, or better yet 8.0 if you want to run multiple Schedulers in 2.0 😉
Ah, nice catch – it seems we can resolve this issue fairly easily by upgrading our MySQL version. Feel free to close this issue out if you don’t think it’s worth fixing for 5.6.x. Thanks again!