
Unable to delete a job using delete API

See original GitHub issue

Problem

There is a case where a job does NOT get deleted by the job delete endpoint documented here: https://marquezproject.github.io/marquez/openapi.html#operation/deleteJob.

Steps to reproduce

Create a file data.txt containing the following JSON data:

{
  "eventTime": "2022-10-06T15:11:58.316935Z",
  "eventType": "START",
  "id": {
    "namespace": "new-bolide-7243",
    "name": "etl_categories.insert"
  },
  "job": {
    "facets": {
      "sql": {
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/SqlJobFacet",
        "query": "select * from howard;"
      }
    },
    "name": "etl_categories.insert",
    "namespace": "new-bolide-7243"
  },
  "producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
  "type": "BATCH",
  "name": "etl_categories.insert",
  "simpleName": "insert",
  "parentJobName": "etl_categories",
  "createdAt": "2022-09-27T23:00:09.715054Z",
  "updatedAt": "2022-10-11T05:00:06.610820Z",
  "namespace": "new-bolide-7243",
  "inputs": [
    {
      "namespace": "bigquery",
      "name": "bq-airflow-marquez.food_delivery.tmp_categories"
    }
  ],
  "outputs": [
    {
      "namespace": "bigquery",
      "name": "bq-airflow-marquez.food_delivery.categories"
    }
  ],
  "location": null,
  "context": {},
  "description": "Loads newly added menus categories daily.",
  "run": {
    "runId": "387cd9f5-187a-443c-892a-fa5eda9a6621",
    "createdAt": "2022-10-11T05:00:06.358338Z",
    "updatedAt": "2022-10-11T05:00:08.985Z",
    "nominalStartTime": "2022-10-11T04:00:00Z",
    "nominalEndTime": null,
    "state": "COMPLETED",
    "startedAt": "2022-10-11T05:00:06.358338Z",
    "endedAt": "2022-10-11T05:00:08.985Z",
    "durationMs": 2627,
    "args": {
      "nominal_start_time": "2022-10-11T04:00Z[UTC]",
      "run_id": "2e058c29-44c9-3482-be01-ff9d2509d8e3",
      "name": "etl_categories",
      "namespace": "new-bolide-7243"
    },
    "jobVersion": {
      "namespace": "new-bolide-7243",
      "name": "etl_categories.insert",
      "version": "fca17b7b-2937-382a-9d44-bc211a1f0adc"
    },
    "inputVersions": [
      {
        "namespace": "bigquery",
        "name": "bq-airflow-marquez.food_delivery.tmp_categories",
        "version": "21604ec4-c75c-35ac-9d70-de06cfb397f0"
      }
    ],
    "outputVersions": [
      {
        "namespace": "bigquery",
        "name": "bq-airflow-marquez.food_delivery.categories",
        "version": "d91618a3-f1b1-3b59-b70e-02b85b81b4ba"
      }
    ],
    "context": {},
    "facets": {
      "bigQuery_job": {
        "cached": false,
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://github.com/OpenLineage/OpenLineage/tree/main/integration/common/openlineage/schema/bq-statistics-run-facet.json",
        "properties": "{\"kind\": \"bigquery#job\", \"etag\": \"QaNobHQBjHZ1gWc4G/cTux==\", \"id\": \"bq-airflow-marquez:US.airflow_1665464406906115_dcf516d39faf8ac9df7ff10b8b10e5ec\", \"selfLink\": \"https://bigquery.googleapis.com/bigquery/v2/projects/bq-airflow-marquez/jobs/airflow_1665464406906115_dcf516d39faf8ac9df7ff10b8b10e5ec?location=US\", \"user_email\": \"1823-bq-airflow-marquez@bq-airflow-marquez.gserviceaccount.com\", \"jobReference\": {\"projectId\": \"bq-airflow-marquez\", \"jobId\": \"airflow_1665464406906115_dcf516d39faf8ac9df7ff10b8b10e5ec\", \"location\": \"US\"}, \"statistics\": {\"creationTime\": 1665464407043.0, \"startTime\": 1665464407178.0, \"endTime\": 1665464408479.0, \"totalBytesProcessed\": \"0\", \"query\": {\"queryPlan\": [{\"name\": \"S00: Input\", \"id\": \"0\", \"startMs\": \"1665464407323\", \"endMs\": \"1665464407332\", \"waitRatioAvg\": 0.008771929824561403, \"waitMsAvg\": \"2\", \"waitRatioMax\": 0.008771929824561403, \"waitMsMax\": \"2\", \"readRatioAvg\": 0, \"readMsAvg\": \"0\", \"readRatioMax\": 0, \"readMsMax\": \"0\", \"computeRatioAvg\": 0.013157894736842105, \"computeMsAvg\": \"3\", \"computeRatioMax\": 0.013157894736842105, \"computeMsMax\": \"3\", \"writeRatioAvg\": 0.017543859649122806, \"writeMsAvg\": \"4\", \"writeRatioMax\": 0.017543859649122806, \"writeMsMax\": \"4\", \"shuffleOutputBytes\": \"0\", \"shuffleOutputBytesSpilled\": \"0\", \"recordsRead\": \"0\", \"recordsWritten\": \"0\", \"parallelInputs\": \"1\", \"completedParallelInputs\": \"1\", \"status\": \"COMPLETE\", \"steps\": [{\"kind\": \"READ\", \"substeps\": [\"$1:id, $2:name, $3:menu_id, $4:description\", \"FROM food_delivery.tmp_categories\"]}, {\"kind\": \"WRITE\", \"substeps\": [\"$1, $2, $3, $4\", \"TO __stage00_output\"]}], \"slotMs\": \"9\"}, {\"name\": \"S01: Coalesce\", \"id\": \"1\", \"startMs\": \"1665464407340\", \"endMs\": \"1665464407382\", \"inputStages\": [\"0\"], \"waitRatioAvg\": 0.0043859649122807015, \"waitMsAvg\": \"1\", 
\"waitRatioMax\": 0.0043859649122807015, \"waitMsMax\": \"1\", \"readRatioAvg\": 0, \"readMsAvg\": \"0\", \"readRatioMax\": 0, \"readMsMax\": \"0\", \"computeRatioAvg\": 0.03508771929824561, \"computeMsAvg\": \"8\", \"computeRatioMax\": 0.05263157894736842, \"computeMsMax\": \"12\", \"writeRatioAvg\": 0.06578947368421052, \"writeMsAvg\": \"15\", \"writeRatioMax\": 0.08771929824561403, \"writeMsMax\": \"20\", \"shuffleOutputBytes\": \"0\", \"shuffleOutputBytesSpilled\": \"0\", \"recordsRead\": \"0\", \"recordsWritten\": \"0\", \"parallelInputs\": \"50\", \"completedParallelInputs\": \"50\", \"status\": \"COMPLETE\", \"steps\": [{\"kind\": \"READ\", \"substeps\": [\"FROM __stage00_output\"]}], \"slotMs\": \"2024\"}, {\"name\": \"S02: Output\", \"id\": \"2\", \"startMs\": \"1665464407568\", \"endMs\": \"1665464407767\", \"inputStages\": [\"1\"], \"waitRatioAvg\": 1, \"waitMsAvg\": \"228\", \"waitRatioMax\": 1, \"waitMsMax\": \"228\", \"readRatioAvg\": 0, \"readMsAvg\": \"0\", \"readRatioMax\": 0, \"readMsMax\": \"0\", \"computeRatioAvg\": 0.06578947368421052, \"computeMsAvg\": \"15\", \"computeRatioMax\": 0.06578947368421052, \"computeMsMax\": \"15\", \"writeRatioAvg\": 0.8114035087719298, \"writeMsAvg\": \"185\", \"writeRatioMax\": 0.8114035087719298, \"writeMsMax\": \"185\", \"shuffleOutputBytes\": \"0\", \"shuffleOutputBytesSpilled\": \"0\", \"recordsRead\": \"0\", \"recordsWritten\": \"0\", \"parallelInputs\": \"1\", \"completedParallelInputs\": \"1\", \"status\": \"COMPLETE\", \"steps\": [{\"kind\": \"READ\", \"substeps\": [\"$1, $2, $3, $4\", \"FROM __stage01_output\"]}, {\"kind\": \"WRITE\", \"substeps\": [\"$1, $2, $3, $4\", \"TO __stage02_output\"]}], \"slotMs\": \"248\"}], \"estimatedBytesProcessed\": \"0\", \"timeline\": [{\"elapsedMs\": \"646\", \"totalSlotMs\": \"2282\", \"pendingUnits\": \"0\", \"completedUnits\": \"52\", \"activeUnits\": \"0\"}, {\"elapsedMs\": \"1272\", \"totalSlotMs\": \"2282\", \"pendingUnits\": \"0\", \"completedUnits\": \"52\", 
\"activeUnits\": \"0\", \"estimatedRunnableUnits\": \"0\"}], \"totalPartitionsProcessed\": \"0\", \"totalBytesProcessed\": \"0\", \"totalBytesBilled\": \"0\", \"billingTier\": 0, \"totalSlotMs\": \"2282\", \"cacheHit\": false, \"referencedTables\": [{\"projectId\": \"bq-airflow-marquez\", \"datasetId\": \"food_delivery\", \"tableId\": \"tmp_categories\"}], \"statementType\": \"SELECT\", \"performanceInsights\": {\"avgPreviousExecutionMs\": \"830\"}, \"transferredBytes\": \"0\"}, \"totalSlotMs\": \"2282\", \"finalExecutionDurationMs\": \"751\"}, \"status\": {\"state\": \"DONE\"}, \"principal_subject\": \"serviceAccount:svc-bq-airflow-marquez@bq-airflow-marquez.iam.gserviceaccount.com\", \"configuration\": {\"query\": {\"query\": \"\\n    SELECT id, name, menu_id, description\\n      FROM food_delivery.tmp_categories\\n    \", \"destinationTable\": {\"projectId\": \"bq-airflow-marquez\", \"datasetId\": \"food_delivery\", \"tableId\": \"categories\"}, \"createDisposition\": \"CREATE_IF_NEEDED\", \"writeDisposition\": \"WRITE_EMPTY\", \"priority\": \"INTERACTIVE\", \"allowLargeResults\": false, \"useLegacySql\": false}, \"jobType\": \"QUERY\"}}",
        "billedBytes": 0
      },
      "externalQuery": {
        "source": "bigquery",
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
        "externalQueryId": "airflow_1665464406906115_dcf516d39faf8ac9df7ff10b8b10e5ec"
      },
      "parent": {
        "job": {
          "name": "etl_categories",
          "namespace": "new-bolide-7243"
        },
        "run": {
          "runId": "2e058c29-44c9-3482-be01-ff9d2509d8e3"
        },
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/ParentRunFacet"
      },
      "nominalTime": {
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/NominalTimeRunFacet",
        "nominalStartTime": "2022-10-11T04:00:00Z"
      },
      "airflow_runArgs": {
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
        "externalTrigger": false
      },
      "airflow_version": {
        "operator": "airflow.contrib.operators.bigquery_operator.BigQueryOperator",
        "taskInfo": "{'_BaseOperator__init_kwargs': {'task_id': 'insert', 'dag': <DAG: etl_categories>, 'owner': 'datascience', 'email': ['datascience@example.com'], 'email_on_retry': False, 'email_on_failure': False, 'start_date': DateTime(2022, 10, 10, 0, 0, 0, tzinfo=Timezone('UTC')), 'depends_on_past': False, 'sql': '\\n    SELECT id, name, menu_id, description\\n      FROM food_delivery.tmp_categories\\n    ', 'destination_dataset_table': 'bq-airflow-marquez.food_delivery.categories', 'use_legacy_sql': False}, '_BaseOperator__from_mapped': False, 'task_id': 'insert', 'task_group': <weakproxy at 0x7fc2f41c34a0 to TaskGroup at 0x7fc2f41a2940>, 'owner': 'datascience', 'email': ['datascience@example.com'], 'email_on_retry': False, 'email_on_failure': False, 'execution_timeout': None, 'on_execute_callback': None, 'on_failure_callback': None, 'on_success_callback': None, 'on_retry_callback': None, '_pre_execute_hook': None, '_post_execute_hook': None, 'start_date': DateTime(2022, 10, 10, 0, 0, 0, tzinfo=Timezone('UTC')), 'executor_config': {}, 'run_as_user': None, 'retries': 0, 'queue': 'default', 'pool': 'default_pool', 'pool_slots': 1, 'sla': None, 'trigger_rule': <TriggerRule.ALL_SUCCESS: 'all_success'>, 'depends_on_past': False, 'ignore_first_depends_on_past': True, 'wait_for_downstream': False, 'retry_delay': datetime.timedelta(seconds=300), 'retry_exponential_backoff': False, 'max_retry_delay': None, 'params': <airflow.models.param.ParamsDict object at 0x7fc2f3da31c0>, 'priority_weight': 1, 'weight_rule': <WeightRule.DOWNSTREAM: 'downstream'>, 'resources': None, 'max_active_tis_per_dag': None, 'do_xcom_push': True, 'doc_md': None, 'doc_json': None, 'doc_yaml': None, 'doc_rst': None, 'doc': None, 'upstream_task_ids': {'if_not_exists'}, 'downstream_task_ids': set(), 'end_date': None, '_dag': <DAG: etl_categories>, '_log': <Logger airflow.task.operators (INFO)>, 'inlets': [], 'outlets': [], '_inlets': [], '_outlets': [], '_BaseOperator__instantiated': True, 
'sql': '\\n    SELECT id, name, menu_id, description\\n      FROM food_delivery.tmp_categories\\n    ', 'destination_dataset_table': 'bq-airflow-marquez.food_delivery.categories', 'write_disposition': 'WRITE_EMPTY', 'create_disposition': 'CREATE_IF_NEEDED', 'allow_large_results': False, 'flatten_results': None, 'gcp_conn_id': 'google_cloud_default', 'delegate_to': None, 'udf_config': None, 'use_legacy_sql': False, 'maximum_billing_tier': None, 'maximum_bytes_billed': None, 'schema_update_options': None, 'query_params': None, 'labels': None, 'priority': 'INTERACTIVE', 'time_partitioning': None, 'api_resource_configs': None, 'cluster_fields': None, 'location': None, 'encryption_configuration': None, 'hook': None, 'impersonation_chain': None}",
        "_producer": "https://github.com/OpenLineage/OpenLineage/tree/0.10.0/integration/airflow",
        "_schemaURL": "https://raw.githubusercontent.com/OpenLineage/OpenLineage/main/spec/OpenLineage.json#/definitions/BaseFacet",
        "airflowVersion": "2.3.3+astro.1",
        "openlineageAirflowVersion": "0.10.0"
      }
    }
  },
  "facets": {},
  "currentVersion": null
}

Start up the latest Marquez, then run the following curl command against it:

curl -v -X POST -H "Content-Type: application/json" -d @data.txt http://localhost:5000/api/v1/lineage

Now, run the following command:

curl -X DELETE -v http://localhost:5000/api/v1/namespaces/new-bolide-7243/jobs/etl_categories

Observe that this delete completes successfully (the job no longer appears in the Marquez UI). Now, run the following command:

curl -X DELETE -v http://localhost:5000/api/v1/namespaces/new-bolide-7243/jobs/etl_categories.insert

Even though the command returns successfully, observe that the deletion does NOT happen: the job is still visible.
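The two deletes above can also be scripted. A minimal Python sketch (stdlib only, assuming a local Marquez on port 5000 with the endpoints shown in the repro); note that the dot in etl_categories.insert is an unreserved URL character, so it survives quoting unchanged and URL encoding can be ruled out as a cause:

```python
import urllib.parse
import urllib.request

# Local Marquez instance from the repro above; adjust host/port as needed.
BASE = "http://localhost:5000/api/v1"

def delete_job_url(namespace: str, job_name: str) -> str:
    """Build the delete-endpoint URL. The dot in 'etl_categories.insert'
    is an unreserved character, so quoting leaves it intact."""
    ns = urllib.parse.quote(namespace, safe="")
    job = urllib.parse.quote(job_name, safe="")
    return f"{BASE}/namespaces/{ns}/jobs/{job}"

def delete_job(namespace: str, job_name: str) -> int:
    """Issue the DELETE and return the HTTP status code."""
    req = urllib.request.Request(delete_job_url(namespace, job_name),
                                 method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Usage against a live instance:
#   delete_job("new-bolide-7243", "etl_categories")         # deletes as expected
#   delete_job("new-bolide-7243", "etl_categories.insert")  # succeeds, but job survives
```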

What is expected

The deletion should soft-delete both jobs without any problems.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
pawel-big-lebowski commented, Oct 11, 2022

@howardyoo, such a well-described and reproducible issue deserves special appreciation. What if the dataset name did not contain a dot? Some frameworks skip the extension part after the dot.

0 reactions
howardyoo commented, Oct 11, 2022

So, currently the delete-job endpoint checks whether the job exists using the JobDao.findJobByName method and, if it finds the job, deletes it using the JobDao.delete method. But those two methods look at different things: findJobByName queries the jobs_view view, while delete operates on the jobs table.

To solve this, the delete functionality probably needs to resolve the simple job name from the job_fqn table.

I think that may be the solution!
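The mismatch described above can be sketched with a toy in-memory model. The table layouts here are simplified assumptions, not Marquez's actual schema: the view resolves the fully qualified name assembled from job_fqn, while the table is keyed on the simple name, so a delete keyed on the FQN silently matches nothing:

```python
# Toy model of the suspected bug: simplified stand-ins for Marquez's
# jobs table and jobs_view; not the real schema.

# jobs table: stores only the simple name (cf. "simpleName": "insert" above)
jobs_table = {"insert": {"parent": "etl_categories", "is_hidden": False}}

# jobs_view: exposes the fully qualified name assembled via job_fqn
jobs_view = {"etl_categories.insert": "insert"}

def find_job_by_name(name: str):
    """Stand-in for JobDao.findJobByName: reads jobs_view, so the FQN hits."""
    return name if name in jobs_view else None

def delete_job(name: str) -> bool:
    """Stand-in for JobDao.delete: writes the jobs table, which is keyed on
    the simple name, so the FQN matches nothing."""
    row = jobs_table.get(name)
    if row is None:
        return False          # no row touched: the delete is a silent no-op
    row["is_hidden"] = True   # soft delete
    return True

# The endpoint "finds" the job, then fails to delete it:
found = find_job_by_name("etl_categories.insert")   # hit: view knows the FQN
deleted = delete_job(found)                         # miss: table only has "insert"

# The suggested fix: resolve the simple name (e.g. via job_fqn) first.
simple_name = jobs_view[found]
fixed = delete_job(simple_name)                     # now the soft delete lands
```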

Read more comments on GitHub >

Top Results From Across the Web

  • Unable to delete a Rundeck Job via webAPI - Stack Overflow
    You need to disable the execution/schedule first, under admin rights: disable executions for that job using this endpoint.
  • Delete Job—ArcGIS REST APIs
    This operation deletes the specified asynchronous job being run by the geoprocessing service. If the current status of the job is SUBMITTED or...
  • Unable to Delete Integration Center Job - SAP Support Portal
    The job will not be able to be deleted from the UI, therefore, we will have to use a Rest Client to perform...
  • Job - Delete - REST API (Azure Batch Service) | Microsoft Learn
    When a Delete Job request is received, the Batch service sets the Job to the deleting state. All update operations on a Job...
  • Batchable class stuck due to Completed Apex Jobs. Unable to ...
    Unable to delete. I uploaded a class implementing the Database.batchable interface into our live environment. It ran with a couple of errors...
