2.2.3 -> 2.3 upgrade db error with mysql metadb

Apache Airflow version

2.3.0 (latest released)

What happened

Hi, I tried to upgrade from v2.2.3 to v2.3 by running airflow db upgrade, and it failed with:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
    self.dialect.do_execute(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
    cursor.execute(statement, parameters)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
    result = self._cnx.cmd_query(stmt, raw=self._raw,
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
    raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
mysql.connector.errors.ProgrammingError: 1054 (42S22): Unknown column 'rendered_task_instance_fields.dag_id' in 'on clause'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/media/data/anaconda3/envs/airflow2.3/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/__main__.py", line 38, in main
    args.func(args)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
    return func(*args, **kwargs)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
    return f(*args, **kwargs)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/cli/commands/db_command.py", line 82, in upgradedb
    db.upgradedb(to_revision=to_revision, from_revision=from_revision, show_sql_only=args.show_sql_only)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1400, in upgradedb
    for err in _check_migration_errors(session=session):
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1285, in _check_migration_errors
    yield from check_fn(session=session)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/airflow/utils/db.py", line 1243, in check_bad_references
    invalid_row_count = invalid_rows_query.count()
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3062, in count
    return self._from_self(col).enable_eagerloads(False).scalar()
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2803, in scalar
    ret = self.one()
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2780, in one
    return self._iter().one()
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2818, in _iter
    result = self.session.execute(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1670, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1520, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 313, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1389, in _execute_clauseelement
    ret = self._execute_context(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
    self._handle_dbapi_exception(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
    util.raise_(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
    self.dialect.do_execute(
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
    cursor.execute(statement, parameters)
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/cursor_cext.py", line 269, in execute
    result = self._cnx.cmd_query(stmt, raw=self._raw,
  File "/media/data/anaconda3/envs/airflow2.3/lib/python3.8/site-packages/mysql/connector/connection_cext.py", line 528, in cmd_query
    raise errors.get_mysql_exception(exc.errno, msg=exc.msg,
sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) 1054 (42S22): Unknown column 'rendered_task_instance_fields.dag_id' in 'on clause'
[SQL: SELECT count(*) AS count_1
FROM (SELECT rendered_task_instance_fields.dag_id AS dag_id, rendered_task_instance_fields.task_id AS task_id, rendered_task_instance_fields.execution_date AS execution_date, rendered_task_instance_fields.rendered_fields AS rendered_fields, rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM rendered_task_instance_fields
WHERE NOT (EXISTS (SELECT 1
FROM task_instance INNER JOIN dag_run ON rendered_task_instance_fields.dag_id = task_instance.dag_id AND dag_run.run_id = task_instance.run_id AND rendered_task_instance_fields.execution_date = dag_run.execution_date
WHERE rendered_task_instance_fields.dag_id = task_instance.dag_id AND rendered_task_instance_fields.task_id = task_instance.task_id))) AS anon_1]

The MySQL statement above, which causes the error, seems to be generated by the count(self) function in /sqlalchemy/orm/query.py. I ran the inner SELECT through MySQL Workbench, and it’s easy to see that it’s wrong:

SELECT 1 FROM task_instance INNER JOIN dag_run ON rendered_task_instance_fields.dag_id = task_instance.dag_id
AND dag_run.run_id = task_instance.run_id
AND rendered_task_instance_fields.execution_date = dag_run.execution_date
WHERE rendered_task_instance_fields.task_id = task_instance.task_id
AND rendered_task_instance_fields.dag_id = task_instance.dag_id;

The rendered_task_instance_fields table is not in the FROM clause. I probably know how to fix the statement, but the SQL is autogenerated (am I right?) rather than hardcoded.
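The observation above can be reproduced in isolation. The sketch below is only an illustration: it uses SQLite from the Python standard library as a stand-in for MySQL, and the table definitions are trimmed to the columns named in the failing statement (both are assumptions, not the real Airflow schema). It shows that the inner SELECT refers to a table that only exists in the enclosing query, so running it on its own fails whether or not the full statement is valid on a given server.

```python
import sqlite3

# SQLite stands in for MySQL here; the tables are cut down to the
# columns that appear in the failing statement (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT);
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT);
    CREATE TABLE rendered_task_instance_fields (
        dag_id TEXT, task_id TEXT, execution_date TEXT
    );
""")

# The inner SELECT from the issue, copied without the outer query.
inner_select = """
    SELECT 1
    FROM task_instance
    INNER JOIN dag_run
        ON rendered_task_instance_fields.dag_id = task_instance.dag_id
        AND dag_run.run_id = task_instance.run_id
        AND rendered_task_instance_fields.execution_date = dag_run.execution_date
    WHERE rendered_task_instance_fields.task_id = task_instance.task_id
        AND rendered_task_instance_fields.dag_id = task_instance.dag_id
"""


def run_standalone(sql):
    """Return the error message if the statement fails, else None."""
    try:
        conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


standalone_error = run_standalone(inner_select)
print(standalone_error)
```

The error text differs between SQLite and MySQL, but the cause is the same in both: rendered_task_instance_fields is referenced without appearing in that SELECT's FROM clause.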

What you think should happen instead

No response

How to reproduce

  1. Upgrade the Airflow installation with pip (apache-airflow).
  2. Run airflow db upgrade --show-sql-only to check the changes. Everything’s fine.
  3. Run airflow db upgrade.

Operating System

Oracle Linux 7.8

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

MySQL DB 5.7.10 which is supported per the docs.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 3
  • Comments:9 (6 by maintainers)

Top GitHub Comments

dstandish commented, May 10, 2022 (2 reactions)

The query was changed to left join (instead of a not exists) in https://github.com/apache/airflow/pull/23528

So that should fix this. cc @ashb
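As a rough illustration of that kind of rewrite (not the actual query from the pull request), the sketch below compares a NOT EXISTS check with a LEFT JOIN ... IS NULL check for rows that have no matching parent. The parent and child tables, their columns, and the sample rows are hypothetical, and SQLite is used only so the example runs without a MySQL server.

```python
import sqlite3

# Hypothetical two-table schema; SQLite is used only for portability.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (dag_id TEXT, task_id TEXT);
    CREATE TABLE child (dag_id TEXT, task_id TEXT, payload TEXT);
    INSERT INTO parent VALUES ('d1', 't1'), ('d1', 't2');
    INSERT INTO child VALUES
        ('d1', 't1', 'ok'),
        ('d1', 't9', 'dangling'),
        ('d2', 't1', 'dangling');
""")

# Correlated subquery form: count child rows with no matching parent.
not_exists_sql = """
    SELECT count(*) FROM child
    WHERE NOT EXISTS (
        SELECT 1 FROM parent
        WHERE parent.dag_id = child.dag_id
          AND parent.task_id = child.task_id
    )
"""

# Anti-join form: unmatched child rows come back with NULL parent columns.
left_join_sql = """
    SELECT count(*) FROM child
    LEFT JOIN parent
        ON parent.dag_id = child.dag_id
       AND parent.task_id = child.task_id
    WHERE parent.dag_id IS NULL
"""

not_exists_count = conn.execute(not_exists_sql).fetchone()[0]
left_join_count = conn.execute(left_join_sql).fetchone()[0]
print(not_exists_count, left_join_count)
```

In the LEFT JOIN form every table referenced in the ON clause is part of the same FROM clause, so no reference to an enclosing query is needed.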

potiuk commented, May 14, 2022 (1 reaction)

so, it’ll be released in v.2.3.1, right? now it’s included in v2-3-test, but not in v2-3-stable.

Correct. v2-3-test is there for cherry-picking changes from main, and we merge it to v2-3-stable when we release.
