Airflow 2.0.2: full-width comma causes an exception
Apache Airflow version: 2.0.2
Environment:
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release): Linux
- Kernel (e.g. uname -a): 3.10.0-1062.12.1.el7.x86_64
- Install tools: rh-python36
- Others:
What happened:
There is a full-width comma in the DAG's docstring. It worked fine in Airflow 1.10.10, but after upgrading to Airflow 2.0.2 it causes an exception. The exception shows up in the scheduler's log as follows:
c = cached_connections[connection].execute(statement, multiparams)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/MySQLdb/cursors.py", line 239, in execute
    args = tuple(map(db.literal, args))
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/MySQLdb/connections.py", line 321, in literal
    s = self.escape(o, self.encoders)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/site-packages/MySQLdb/connections.py", line 229, in unicode_literal
    return db.string_literal(str(u).encode(db.encoding))
UnicodeEncodeError: 'latin-1' codec can't encode character '\uff0c' in position 3234: ordinal not in range(256)
What you expected to happen:
The DAG should be imported without any issues, as it was in Airflow 1.10.10.
How to reproduce it:
Have a DAG whose file contains a docstring with a full-width comma (U+FF0C), for example:
""" this is a test python doc comment，this comma will cause the exception """
Anything else we need to know:
The backend metadata database is MySQL. The encoding of the database and its tables is utf8mb4.
Top GitHub Comments
The database stores things in utf8mb4, but you still need to connect to it using the correct encoding. You can set the client’s default encoding on the server (see MySQL documentation on this), or set the client’s encoding explicitly in the URL string as suggested above.
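For example (a sketch with placeholder credentials and host, not the reporter's actual connection string): with the MySQLdb/mysqlclient driver, the client encoding can be set by appending charset=utf8mb4 to the SQLAlchemy URL that Airflow 2.0 reads from sql_alchemy_conn under [core] in airflow.cfg:

[core]
sql_alchemy_conn = mysql+mysqldb://<user>:<password>@<mysql-host>:3306/airflow?charset=utf8mb4

With the charset parameter set, the client encodes statements as UTF-8 instead of latin-1, so the full-width comma in the DAG source can be written to the metadata database.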
Bottom line: Not an Airflow bug.
Agreed. MySQL encoding rules are complex, and the client/server encoding distinction creates more trouble than it's worth.