BigQueryInsertJobOperator is broken on any type of job except `query`
Apache Airflow Provider(s)
Versions of Apache Airflow Providers
apache-airflow-providers-google==7.0.0
Apache Airflow version
2.2.5
Operating System
MacOS 12.2.1
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
We are using BigQueryInsertJobOperator to load data from Parquet files in Google Cloud Storage with this kind of configuration:
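from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# destination_table and source_files are defined elsewhere in the DAG
BigQueryInsertJobOperator(
    task_id="load_to_bq",
    configuration={
        "load": {
            "writeDisposition": "WRITE_APPEND",
            "createDisposition": "CREATE_IF_NEEDED",
            "destinationTable": destination_table,
            "sourceUris": source_files,
            "sourceFormat": "PARQUET",
        }
    },
)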
After upgrading to apache-airflow-providers-google==7.0.0, all load jobs are broken. I believe the problem lies in this line: https://github.com/apache/airflow/blob/5bfacf81c63668ea63e7cb48f4a708a67d0ac0a2/airflow/providers/google/cloud/operators/bigquery.py#L2170
The operator tries to read the destination table from the query job configuration, which makes it impossible to use any other type of job.
What you think should happen instead
No response
How to reproduce
Use BigQueryInsertJobOperator to submit any type of job except query
Anything else
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/google/cloud/operators/bigquery.py", line 2170, in execute
    table = job.to_api_repr()["configuration"]["query"]["destinationTable"]
KeyError: 'query'
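The failure can be reproduced without Airflow by applying the same lookup to a plain dict shaped like a load job. The sketch below is only an illustration: `load_job_repr` is a hypothetical stand-in for what `job.to_api_repr()` returns, the project, dataset, and table names are made up, and `find_destination_table` is a helper invented here to show one defensive way to read the field. It is not necessarily how the provider resolved the issue.

```python
# Hypothetical stand-in for job.to_api_repr() on a load job; all values are made up.
load_job_repr = {
    "configuration": {
        "load": {
            "destinationTable": {
                "projectId": "example-project",
                "datasetId": "example_dataset",
                "tableId": "example_table",
            },
            "sourceFormat": "PARQUET",
        }
    }
}


def find_destination_table(api_repr):
    """Return the first destinationTable found under a known job type, else None."""
    configuration = api_repr.get("configuration", {})
    # Query, load, and copy job configurations can each carry a destinationTable.
    for job_type in ("query", "load", "copy"):
        table = configuration.get(job_type, {}).get("destinationTable")
        if table is not None:
            return table
    return None


# The lookup from the traceback fails because a load job has no "query" section.
try:
    load_job_repr["configuration"]["query"]["destinationTable"]
    raised = False
    missing_key = None
except KeyError as exc:
    raised = True
    missing_key = exc.args[0]

# The guarded lookup finds the table under "load" instead of raising.
destination = find_destination_table(load_job_repr)
```

Using `.get()` with an empty-dict default means a job type with no destination table, such as an extract job, yields `None` rather than an exception.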
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
Issue Analytics
- State:
- Created a year ago
- Reactions: 9
- Comments:24 (12 by maintainers)
Top GitHub Comments
I installed the new 8.0.0rc1 on google composer and it seems to have fixed the problem.
Thx for your help @raphaelauv
@potiuk I’ve tested it and it’s working as expected, the test details are in test status https://github.com/apache/airflow/issues/24289#issuecomment-1148963358
Thank you @raphaelauv, and everyone 👍