
Airflow UI does not pick up TFX component logs

See original GitHub issue

Summary

For some reason, TFX writes its log files in a different format/path than the Airflow web UI expects. As a result, the logs shown in the UI don't contain the really interesting part (the root cause) but only indicate that something went wrong (see the example below).

So for example, if the log root of Airflow is: /usr/local/airflow/logs

Official Log File

The log that appears on the UI is: /usr/local/airflow/logs/chicago_taxi_simple/CsvExampleGen/2019-05-29T11:03:26.751866+00:00

This follows the official format: {dag_id}/{task_id}/{execution_date}/{try_number}.log

Actual TFX log file

But the relevant logs are here: /usr/local/airflow/logs/chicago_taxi_simple.CsvExampleGen/chicago_taxi_simple.CsvExampleGen.checkcache/2019-05-29T11:03:26.751866+00:00

This seems to follow a different format (a dot instead of a subfolder): {dag_id}.{task_id}/{execution_date}/{try_number}.log

Question

Is there a way to change either the Airflow UI or TFX so that the relevant log appears in the UI? I had a look at the LoggerConfig, but it does not contain a setting to change anything other than the root log folder.
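For context on where the UI's expected path comes from: in plain Airflow 1.10.x the web UI resolves task-log locations from the `log_filename_template` setting in `airflow.cfg`. A hedged sketch of the relevant section (the template below is my understanding of the stock 1.10.x default, not something TFX sets; I'm not certain the template alone can express the dotted subdag layout, so changing it may relocate logs rather than unify them):

```ini
[core]
# Stock task-log template in Airflow 1.10.x. The UI fills this in with the
# parent DAG's dag_id and task_id, which is why logs written under the
# subdag's dotted id ({dag_id}.{task_id}) are not picked up.
log_filename_template = {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log
```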

Additional Information

Both a local Airflow instance and one running on a K8s cluster have the same problem. The Airflow version is v1.10.2. I am running the Chicago Taxi example pipeline with the settings described in this repo.

Just to make clear what I mean: this is the log shown in the UI (it contains no information about the actual error):

[2019-05-29 11:03:54,685] {models.py:1359} INFO - Dependencies all met for <TaskInstance: chicago_taxi_simple.CsvExampleGen 2019-05-29T11:03:26.751866+00:00 [queued]>
[2019-05-29 11:03:54,706] {models.py:1359} INFO - Dependencies all met for <TaskInstance: chicago_taxi_simple.CsvExampleGen 2019-05-29T11:03:26.751866+00:00 [queued]>
[2019-05-29 11:03:54,706] {models.py:1571} INFO - 
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------

[2019-05-29 11:03:54,725] {models.py:1593} INFO - Executing <Task(Component): CsvExampleGen> on 2019-05-29T11:03:26.751866+00:00
[2019-05-29 11:03:54,725] {base_task_runner.py:118} INFO - Running: ['bash', '-c', 'airflow run chicago_taxi_simple CsvExampleGen 2019-05-29T11:03:26.751866+00:00 --job_id 44 --raw -sd DAGS_FOLDER/taxi_pipeline_simple.py --cfg_path /tmp/tmpsg_x9rv7']
[2019-05-29 11:03:55,295] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen WARNING:root:OSError while attempting to symlink the latest log directory
[2019-05-29 11:03:55,553] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:03:55,553] {settings.py:174} INFO - settings.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800, pid=13
[2019-05-29 11:03:56,145] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:03:56,145] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-05-29 11:03:56,577] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:03:56,576] {models.py:273} INFO - Filling up the DagBag from /usr/local/airflow/dags/taxi_pipeline_simple.py
[2019-05-29 11:04:02,105] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/airflow/dags/assets
[2019-05-29 11:04:02,105] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/airflow/dags/assets/data/simple
[2019-05-29 11:04:02,105] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:04:02,105] {cli.py:520} INFO - Running <TaskInstance: chicago_taxi_simple.CsvExampleGen 2019-05-29T11:03:26.751866+00:00 [running]> on host chicagotaxisimplecsvexamplegen-ee1d5266818e4cf0bab94a46b794c766
[2019-05-29 11:04:02,382] {logging_mixin.py:95} INFO - [2019-05-29 11:04:02,381] {base_executor.py:56} INFO - Adding to queue: airflow run chicago_taxi_simple.CsvExampleGen chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 --local -sd DAGS_FOLDER/taxi_pipeline_simple.py --cfg_path /tmp/tmpvn0z61z_
[2019-05-29 11:04:07,172] {logging_mixin.py:95} INFO - [2019-05-29 11:04:07,172] {sequential_executor.py:45} INFO - Executing command: airflow run chicago_taxi_simple.CsvExampleGen chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 --local -sd DAGS_FOLDER/taxi_pipeline_simple.py --cfg_path /tmp/tmpvn0z61z_
[2019-05-29 11:04:07,673] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen WARNING:root:OSError while attempting to symlink the latest log directory
[2019-05-29 11:04:07,911] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:04:07,910] {settings.py:174} INFO - settings.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800, pid=23
[2019-05-29 11:04:08,401] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:04:08,401] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-05-29 11:04:08,750] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:04:08,749] {models.py:273} INFO - Filling up the DagBag from /usr/local/airflow/dags/taxi_pipeline_simple.py
[2019-05-29 11:04:14,220] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/airflow/dags/assets
[2019-05-29 11:04:14,220] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/airflow/dags/assets/data/simple
[2019-05-29 11:04:14,220] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen [2019-05-29 11:04:14,219] {cli.py:520} INFO - Running <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 [queued]> on host chicagotaxisimplecsvexamplegen-ee1d5266818e4cf0bab94a46b794c766
[2019-05-29 11:04:24,343] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
[2019-05-29 11:04:24,343] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   """)
[2019-05-29 11:04:24,343] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/lib/python3.6/site-packages/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
[2019-05-29 11:04:24,343] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
[2019-05-29 11:04:25,072] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,070] {jobs.py:1926} ERROR - Task instance <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29 11:03:26.751866+00:00 [failed]> failed
[2019-05-29 11:04:25,125] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,125] {jobs.py:2100} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 3 | succeeded: 0 | running: 0 | failed: 1 | skipped: 0 | deadlocked: 0 | not ready: 3
[2019-05-29 11:04:25,135] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,135] {jobs.py:2199} ERROR - Task instance <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.exec 2019-05-29 11:03:26.751866+00:00 [upstream_failed]> with upstream_failed state
[2019-05-29 11:04:25,181] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,181] {jobs.py:2251} ERROR - Task instance <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.publishcache 2019-05-29 11:03:26.751866+00:00 [upstream_failed]> upstream failed
[2019-05-29 11:04:25,219] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,218] {jobs.py:2251} ERROR - Task instance <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.publishexec 2019-05-29 11:03:26.751866+00:00 [upstream_failed]> upstream failed
[2019-05-29 11:04:25,251] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,251] {models.py:5258} INFO - Marking run <DagRun chicago_taxi_simple.CsvExampleGen @ 2019-05-29T11:03:26.751866+00:00: backfill_2019-05-29T11:03:26.751866+00:00, externally triggered: False> failed
[2019-05-29 11:04:25,299] {logging_mixin.py:95} INFO - [2019-05-29 11:04:25,299] {jobs.py:2100} INFO - [backfill progress] | finished run 1 of 1 | tasks waiting: 0 | succeeded: 0 | running: 0 | failed: 4 | skipped: 0 | deadlocked: 0 | not ready: 0
[2019-05-29 11:04:25,324] {models.py:1788} ERROR - ---------------------------------------------------
Some task instances failed:
{('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.exec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishexec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.checkcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1657, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/subdag_operator.py", line 103, in execute
    executor=self.executor)
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 4324, in run
    job.run()
  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 202, in run
    self._execute()
  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 2447, in _execute
    raise AirflowException(err)
airflow.exceptions.AirflowException: ---------------------------------------------------
Some task instances failed:
{('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.exec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishexec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.checkcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)}

[2019-05-29 11:04:25,345] {models.py:1819} INFO - Marking task as FAILED.
[2019-05-29 11:04:25,364] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
[2019-05-29 11:04:25,364] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   """)
[2019-05-29 11:04:25,365] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen /usr/local/lib/python3.6/site-packages/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
[2019-05-29 11:04:25,365] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen Traceback (most recent call last):
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/bin/airflow", line 32, in <module>
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     args.func(args)
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     return f(*args, **kwargs)
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 526, in run
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     _run(args, dag, ti)
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 445, in _run
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     pool=args.pool,
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in wrapper
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     return func(*args, **kwargs)
[2019-05-29 11:04:25,450] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1657, in _run_raw_task
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     result = task_copy.execute(context=context)
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/operators/subdag_operator.py", line 103, in execute
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     executor=self.executor)
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 4324, in run
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     job.run()
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 202, in run
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     self._execute()
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in wrapper
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     return func(*args, **kwargs)
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen   File "/usr/local/lib/python3.6/site-packages/airflow/jobs.py", line 2447, in _execute
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen     raise AirflowException(err)
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen airflow.exceptions.AirflowException: ---------------------------------------------------
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen Some task instances failed:
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen {('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.exec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishexec', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.publishcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1), ('chicago_taxi_simple.CsvExampleGen', 'chicago_taxi_simple.CsvExampleGen.checkcache', datetime.datetime(2019, 5, 29, 11, 3, 26, 751866, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)}
[2019-05-29 11:04:25,451] {base_task_runner.py:101} INFO - Job 44: Subtask CsvExampleGen 
[2019-05-29 11:04:29,715] {logging_mixin.py:95} INFO - [2019-05-29 11:04:29,713] {jobs.py:2527} INFO - Task exited with return code 1

And this is the log with the real information (saying that the metadata store was unable to open the database file):

[2019-05-29 11:04:14,270] {models.py:1359} INFO - Dependencies all met for <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 [queued]>
[2019-05-29 11:04:14,278] {models.py:1359} INFO - Dependencies all met for <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 [queued]>
[2019-05-29 11:04:14,278] {models.py:1571} INFO - 
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------

[2019-05-29 11:04:14,295] {models.py:1593} INFO - Executing <Task(BranchPythonOperator): chicago_taxi_simple.CsvExampleGen.checkcache> on 2019-05-29T11:03:26.751866+00:00
[2019-05-29 11:04:14,295] {base_task_runner.py:118} INFO - Running: ['bash', '-c', 'airflow run chicago_taxi_simple.CsvExampleGen chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 --job_id 46 --raw -sd DAGS_FOLDER/taxi_pipeline_simple.py --cfg_path /tmp/tmprrn3fjt_']
[2019-05-29 11:04:14,794] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache WARNING:root:OSError while attempting to symlink the latest log directory
[2019-05-29 11:04:15,017] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache [2019-05-29 11:04:15,017] {settings.py:174} INFO - settings.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800, pid=31
[2019-05-29 11:04:15,493] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache [2019-05-29 11:04:15,493] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-05-29 11:04:15,861] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache [2019-05-29 11:04:15,861] {models.py:273} INFO - Filling up the DagBag from /usr/local/airflow/dags/taxi_pipeline_simple.py
[2019-05-29 11:04:21,192] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache /usr/local/airflow/dags/assets
[2019-05-29 11:04:21,192] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache /usr/local/airflow/dags/assets/data/simple
[2019-05-29 11:04:21,192] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache [2019-05-29 11:04:21,191] {cli.py:520} INFO - Running <TaskInstance: chicago_taxi_simple.CsvExampleGen.chicago_taxi_simple.CsvExampleGen.checkcache 2019-05-29T11:03:26.751866+00:00 [running]> on host chicagotaxisimplecsvexamplegen-ee1d5266818e4cf0bab94a46b794c766
[2019-05-29 11:04:21,223] {python_operator.py:95} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_ID=chicago_taxi_simple.CsvExampleGen
AIRFLOW_CTX_TASK_ID=chicago_taxi_simple.CsvExampleGen.checkcache
AIRFLOW_CTX_EXECUTION_DATE=2019-05-29T11:03:26.751866+00:00
AIRFLOW_CTX_DAG_RUN_ID=backfill_2019-05-29T11:03:26.751866+00:00
[2019-05-29 11:04:21,263] {models.py:1788} ERROR - <built-in function CreateMetadataStore> returned a result with an error set
RuntimeError: Cannot connect sqlite3 database: unable to open database file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1657, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 131, in execute
    branch = super(BranchPythonOperator, self).execute(context)
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 103, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 108, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/lib/python3.6/site-packages/tfx/orchestration/airflow/airflow_adapter.py", line 110, in check_cache_and_maybe_prepare_execution
    self._logger) as m:
  File "/usr/local/lib/python3.6/site-packages/tfx/orchestration/metadata.py", line 53, in __enter__
    self._store = metadata_store.MetadataStore(self._connection_config)
  File "/usr/local/lib/python3.6/site-packages/ml_metadata/metadata_store/metadata_store.py", line 48, in __init__
    config.SerializeToString())
SystemError: <built-in function CreateMetadataStore> returned a result with an error set
[2019-05-29 11:04:21,298] {models.py:1819} INFO - Marking task as FAILED.
[2019-05-29 11:04:21,452] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache /usr/local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
[2019-05-29 11:04:21,452] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   """)
[2019-05-29 11:04:21,452] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache /usr/local/lib/python3.6/site-packages/apache_beam/__init__.py:84: UserWarning: Running the Apache Beam SDK on Python 3 is not yet fully supported. You may encounter buggy behavior or missing features.
[2019-05-29 11:04:21,452] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   'Running the Apache Beam SDK on Python 3 is not yet fully supported. '
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache RuntimeError: Cannot connect sqlite3 database: unable to open database file
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache 
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache The above exception was the direct cause of the following exception:
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache 
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache Traceback (most recent call last):
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/bin/airflow", line 32, in <module>
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     args.func(args)
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
[2019-05-29 11:04:21,521] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     return f(*args, **kwargs)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 526, in run
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     _run(args, dag, ti)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 445, in _run
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     pool=args.pool,
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in wrapper
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     return func(*args, **kwargs)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1657, in _run_raw_task
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     result = task_copy.execute(context=context)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 131, in execute
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     branch = super(BranchPythonOperator, self).execute(context)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 103, in execute
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     return_value = self.execute_callable()
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/airflow/operators/python_operator.py", line 108, in execute_callable
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     return self.python_callable(*self.op_args, **self.op_kwargs)
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/tfx/orchestration/airflow/airflow_adapter.py", line 110, in check_cache_and_maybe_prepare_execution
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     self._logger) as m:
[2019-05-29 11:04:21,522] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/tfx/orchestration/metadata.py", line 53, in __enter__
[2019-05-29 11:04:21,523] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     self._store = metadata_store.MetadataStore(self._connection_config)
[2019-05-29 11:04:21,523] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache   File "/usr/local/lib/python3.6/site-packages/ml_metadata/metadata_store/metadata_store.py", line 48, in __init__
[2019-05-29 11:04:21,523] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache     config.SerializeToString())
[2019-05-29 11:04:21,523] {base_task_runner.py:101} INFO - Job 46: Subtask chicago_taxi_simple.CsvExampleGen.checkcache SystemError: <built-in function CreateMetadataStore> returned a result with an error set
[2019-05-29 11:04:24,267] {logging_mixin.py:95} INFO - [2019-05-29 11:04:24,265] {jobs.py:2527} INFO - Task exited with return code 1

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 16 (8 by maintainers)

Top GitHub Comments

1 reaction
rummens commented, Sep 26, 2019

Changed permissions? Keep in mind that the Airflow path might differ from your development path (unless you are developing in ~/airflow). Maybe delete the DB file and let TFX create a new one? I also had some issues with multiple reads to the DB causing a problem, but the error message was different.
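To check the permissions theory concretely, here is a minimal sketch using only the Python standard library (the `db_path` below is hypothetical; point it at wherever your pipeline's metadata DB actually lives). It reproduces or rules out the same sqlite "unable to open database file" condition that ml-metadata surfaces through TFX:

```python
import os
import sqlite3

# Hypothetical location; substitute the metadata DB path from your
# TFX pipeline config.
db_path = "/tmp/tfx_metadata_check/metadata.db"

# sqlite reports "unable to open database file" when the parent directory
# is missing or not writable by the user the task runs as.
parent = os.path.dirname(db_path)
os.makedirs(parent, exist_ok=True)
print("parent writable:", os.access(parent, os.W_OK))

# Connecting creates the file if permissions allow; a failure here mirrors
# the RuntimeError shown in the task log above.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE IF NOT EXISTS probe (id INTEGER)")
conn.close()
print("db can be opened:", os.path.exists(db_path))
```

If the script succeeds when run as the Airflow worker's user but the task still fails, the mismatch is likely between the path TFX resolves at runtime and the one you checked.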

1 reaction
krazyhaas commented, Jun 5, 2019

Done! You should now be able to see both the component/subcomponent logs as well as the actual nodes in the DAG viewer. Sorry about the delay – getting the test updated and passing was actually more work than the code change itself. 😃

