MLMD Database locked on AirFlow K8s Deployment

Description

I am running into the following error when executing the taxi example on an Airflow (1.10.2) installation on a K8s cluster.

AIRFLOW_CTX_DAG_ID=chicago_taxi_simple.CsvExampleGen
AIRFLOW_CTX_TASK_ID=chicago_taxi_simple.CsvExampleGen.checkcache
AIRFLOW_CTX_EXECUTION_DATE=2019-06-04T12:59:18.913844+00:00
AIRFLOW_CTX_DAG_RUN_ID=backfill_2019-06-04T12:59:18.913844+00:00
[2019-06-04 13:22:36,448] {base_task_runner.py:101} INFO - Job 14: Subtask chicago_taxi_simple.CsvExampleGen.checkcache 2019-06-04 13:22:36.448618: F ml_metadata/metadata_store/metadata_source.cc:107] Non-OK-status: metadata_source_->Commit() status: Internal: Error when executing query: database is lockedquery: COMMIT;
[2019-06-04 13:22:41,491] {logging_mixin.py:95} INFO - [2019-06-04 13:22:41,489] {jobs.py:2527} INFO - Task exited with return code -6
airflow@airflow-web-d4bbc4f6c-4xg8g:~/logs/chicago_taxi_simple.CsvExampleGen/chicago_taxi_simple.CsvExampleGen.checkcache/2019-06-04T12:59

I am not sure why this happens, since the CSV component is the only one running, so there should be no concurrent access to the SQLite database. The underlying PV is ReadWriteMany, so the filesystem supports multiple writers.

Is this a bug, or is there an option to increase the timeout?
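
For readers unfamiliar with the error, here is a minimal sketch of how SQLite produces "database is locked" and how a busy timeout changes the outcome. It uses only Python's standard `sqlite3` module, not MLMD (which accesses SQLite from C++), and the function name `write_while_locked` and its parameters are hypothetical. Whether MLMD exposes a comparable timeout setting is not established by this issue.

```python
import os
import sqlite3
import tempfile
import threading
import time


def write_while_locked(db_path, hold_s, timeout_s):
    """Try one INSERT while another connection holds an exclusive lock.

    Returns "ok" if the write succeeds, otherwise the SQLite error text.
    """
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    setup.commit()
    setup.close()

    locked = threading.Event()

    def hold_lock():
        # A second connection takes an exclusive lock for hold_s seconds.
        conn = sqlite3.connect(db_path, isolation_level=None)
        conn.execute("BEGIN EXCLUSIVE")
        locked.set()
        time.sleep(hold_s)
        conn.execute("ROLLBACK")
        conn.close()

    holder = threading.Thread(target=hold_lock)
    holder.start()
    locked.wait()

    # timeout is the busy timeout: how long SQLite keeps retrying
    # before it gives up and reports that the database is locked.
    writer = sqlite3.connect(db_path, timeout=timeout_s, isolation_level=None)
    try:
        writer.execute("INSERT INTO t VALUES (1)")
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        writer.close()
        holder.join()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
    # Lock held longer than the busy timeout: the write fails.
    print(write_while_locked(path, hold_s=0.5, timeout_s=0.05))
    # Busy timeout longer than the lock: the write waits and succeeds.
    print(write_while_locked(path, hold_s=0.1, timeout_s=5.0))
```

A longer timeout only helps when the lock is released eventually. The SQLite documentation also warns that file locking on network filesystems such as NFS can be unreliable, so a ReadWriteMany volume could, under that assumption, report a lock even with a single writer.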

Possible Solution

If there is no such option, I guess the only way is to connect MLMD to its own MySQL instance, but I am struggling with that: I do not understand where to add the metadata_store variable in the pipeline definition.

I assume I have to change something in the final return statement of the pipeline (metadata_db_root, maybe?). Unfortunately, there is no API documentation for the pipeline module in tfx.orchestration, so I have no idea which attribute to use :-/

  return pipeline.Pipeline(
      pipeline_name='chicago_taxi_simple',
      pipeline_root=_pipeline_root,
      components=[
          example_gen, statistics_gen, infer_schema, validate_stats, transform,
          trainer, model_analyzer, model_validator, pusher
      ],
      enable_cache=True,
      metadata_db_root=_metadata_db_root,
      additional_pipeline_args={'logger_args': logger_overrides},
  )

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

krazyhaas commented, Jul 17, 2019 (1 reaction)

Thanks. Unfortunately, we can’t reproduce either issue.

rummens commented, Jul 17, 2019 (0 reactions)

We have actually moved away from Airflow on K8s, so I never got around to testing this. I thought you might want to keep this open in case other people have the same problem, but from my perspective we can close it.
