2.2.3 -> 2.3 upgrade db error with mysql metadb
See original GitHub issueApache Airflow version
2.3.0 (latest released)
What happened
Hi,
Tried to upgrade from v.2.2.3 to v.2.3
airflow upgrade db:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
result = self._cnx.cmd_query(stmt, raw=self._raw,
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
mysql.connector.errors.ProgrammingError: 1054 (42S22): Unknown column 'rendered_task_instance_fields.dag_id' in 'on clause'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/media/data/anaconda3/envs/airflow2.3/bin/airflow", line 8, in <module>
sys.exit(main())
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/__main__.py", line 38, in main
args.func(args)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/cli/commands/db_command.py", line 82, in upgradedb
db.upgradedb(to_revision=to_revision, from_revision=from_revision, show_sql_only=args.show_sql_only)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
return func(*args, session=session, **kwargs)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1400, in upgradedb
for err in _check_migration_errors(session=session):
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1285, in _check_migration_errors
yield from check_fn(session=session)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1243, in check_bad_references
invalid_row_count = invalid_rows_query.count()
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3062, in count
return self._from_self(col).enable_eagerloads(False).scalar()
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2803, in scalar
ret = self.one()
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2780, in one
return self._iter().one()
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2818, in _iter
result = self.session.execute(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1670, in execute
result = conn._execute_20(statement, params or {}, execution_options)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1520, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 313, in _execute_on_connection
return connection._execute_clauseelement(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1389, in _execute_clauseelement
ret = self._execute_context(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
self._handle_dbapi_exception(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
util.raise_(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
result = self._cnx.cmd_query(stmt, raw=self._raw,
File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) 1054 (42S22): Unknown column 'rendered_task_instance_fields.dag_id' in 'on clause'
[SQL: SELECT count(*) AS count_1
FROM (SELECT rendered_task_instance_fields.dag_id AS dag_id, rendered_task_instance_fields.task_id AS task_id, rendered_task_instance_fields.execution_date AS execution_date, rendered_task_instance_fields.rendered_fields AS rendered_fields, rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM rendered_task_instance_fields
WHERE NOT (EXISTS (SELECT 1
FROM task_instance INNER JOIN dag_run ON rendered_task_instance_fields.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id AND rendered_task_instance_fields.execution_date = dag_run.execution_date
WHERE rendered_task_instance_fields.dag_id = task_instance.dag_id AND rendered_task_instance_fields.task_id = task_instance.task_id))) AS anon_1]
The above mysql statement, which causes error, seems to be generated in the count(self) function in /sqlalchemy/orm/query.py. I ran that statement through mysql workbench and it’s easy to see that it’s wrong
SELECT 1 FROM task_instance INNER JOIN dag_run ON rendered_task_instance_fields.dag_id = task_instance.dag_id
AND dag_run.run_id = task_instance.run_id
AND rendered_task_instance_fields.execution_date = dag_run.execution_date
WHERE rendered_task_instance_fields.task_id = task_instance.task_id
AND rendered_task_instance_fields.dag_id = task_instance.dag_id;
There is no the rendered_task_instance_fields table in the FROM clause. I probably know how to fix the statement but the mysql code is autogenerated (am I right?) but not hardcoded.
What you think should happen instead
No response
How to reproduce
- upgrade airflow installation from pip apache airflow
- run
airflow db upgrade --show-sql-only
to check the changes. Everything’s fine. - run
airflow db upgrade
Operating System
Oracle Linux 7.8
Versions of Apache Airflow Providers
No response
Deployment
Virtualenv installation
Deployment details
MySQL DB 5.7.10 which is supported per the docs.
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Issue Analytics
- State:
- Created a year ago
- Reactions:3
- Comments:9 (6 by maintainers)
Top Results From Across the Web
Set up a Database Backend - Apache Airflow
Set up a Database Backend¶. Airflow was built to interact with its metadata using SqlAlchemy. The document below describes the database engine ...
Read more >MySQL Shell 8.0 :: 11.1 Upgrade Checker Utility
The util.checkForServerUpgrade() function is an upgrade checker utility that enables you to verify whether MySQL server instances are ready for upgrade.
Read more >4. Kea Database Administration — Kea 1.8.2-git documentation
db -upgrade — Conducts a database schema upgrade. This is useful when upgrading Kea. lease-dump — Dumps the contents of the lease database...
Read more >Understanding the Airflow metadata database
Learn about everything you need to use the Apache Airflow metadata database.
Read more >Admin Guide - Mantis Bug Tracker
This book is targeted at MantisBT administrators, and documents the installation, upgrade, configuration, customization and administration tasks required to ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
The query was changed to left join (instead of a not exists) in https://github.com/apache/airflow/pull/23528
So that should fix this. cc @ashb
Correctl v2-3-test is there for cherry-picking changes from
main
and we merge it to v2-3-stable when we release.