Allow partition-wise copy from bq to bq tables.


Description

As of now, the existing code does not support copying a single partition from a source table into a partition of a destination table.

The bq cp CLI command supports this through the partition decorator ($).
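To make the decorator concrete, here is a minimal sketch of how a partition-decorated table reference can be built for a daily partition. The helper name `partition_ref` and the project, dataset and table names are hypothetical and are not part of Airflow or the bq CLI; the sketch only assumes the `table$YYYYMMDD` decorator form for daily partitions.

```python
from datetime import date


def partition_ref(project: str, dataset: str, table: str, day: date) -> str:
    """Return 'project.dataset.table$YYYYMMDD' for one daily partition.

    Hypothetical helper for illustration; it only formats a string.
    """
    # "$" is a literal here; {day:%Y%m%d} formats the date as YYYYMMDD.
    return f"{project}.{dataset}.{table}${day:%Y%m%d}"


# Assumed example names, not taken from a real project.
source = partition_ref("test_project", "my_dataset", "test_table", date(2022, 11, 25))
print(source)
```

A string built this way is the kind of reference that would be passed as the source or destination of a partition-level copy.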

Use case/motivation

We have many everyday use cases where we keep tables in different projects in sync by materialising new partitions at a regular frequency, so this feature would be very useful. The other option for now would be dbt, but adopting it just for this would be an added task.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

1 reaction
o-nikolas commented, Oct 6, 2022

@eladkal, @kaxil Can you add the area:providers/provider:Google label to this one?

Also, it seems like a somewhat straightforward update to the operator (though GCP is not my area of expertise) so perhaps Good First Issue as well?

0 reactions
bugraoz93 commented, Nov 26, 2022

Hey @bhankit1410, I think this feature is already supported.

I have tested it with BigQueryToBigQueryOperator using the simple DAG shown below. I generated 1000 sample records with several column types (integer, UUID and timestamp) and inserted them into a table. I then tested all three write dispositions, and every case has worked for me so far.

https://gist.github.com/bugraoz93/d3ee6d2d03d1881de4614d1e7c3b8234

import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.bigquery_to_bigquery import BigQueryToBigQueryOperator

with DAG(
    dag_id="test_dag",
    max_active_runs=1,
    start_date=datetime.datetime(2022, 11, 24),
    schedule_interval="@once",
    catchup=False,
    concurrency=1,
) as dag:
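    project = "test_project"
    dataset = "bugraoz93_test"
    # The "$YYYYMMDD" suffix is BigQuery's partition decorator:
    # it addresses a single daily partition instead of the whole table.
    source_table = "test_table$20221125"
    destination_table = "test_table_dest$20221124"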

    copy_table_to_fact = BigQueryToBigQueryOperator(
        task_id="copy_test",
        gcp_conn_id="gcp_conn",
        source_project_dataset_tables="{project}.{dataset}.{table}".format(
            project=project, dataset=dataset, table=source_table),
        destination_project_dataset_table="{project}.{dataset}.{table}".format(
            project=project, dataset=dataset, table=destination_table
        ),
        write_disposition="WRITE_APPEND",
        create_disposition="CREATE_IF_NEEDED",
        dag=dag,
    )

I have also checked the code and tested the individual methods to make sure they can process the $ sign within the table name without raising exceptions. Nothing in the code prevents this feature from working.
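As a rough illustration of what handling the $ sign in a table name involves, the sketch below splits a decorated reference into its parts. This is not the provider's parsing code: `TableRef` and `split_table_ref` are hypothetical names, and the sketch assumes a plain three-part `project.dataset.table` reference with an optional `$partition` suffix.

```python
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    project: str
    dataset: str
    table: str
    partition: Optional[str]  # e.g. "20221125", or None if no decorator


def split_table_ref(ref: str) -> TableRef:
    """Split 'project.dataset.table[$partition]' into its components.

    Hypothetical helper; assumes exactly three dot-separated parts.
    """
    # Everything after the first "$" is treated as the partition decorator.
    base, _, partition = ref.partition("$")
    parts = base.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected project.dataset.table, got {ref!r}")
    project, dataset, table = parts
    return TableRef(project, dataset, table, partition or None)


ref = split_table_ref("test_project.bugraoz93_test.test_table$20221125")
print(ref.table, ref.partition)
```

Because the decorator is split off before the dots are processed, the table name itself stays intact, which is the property the tests described above were checking for.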

Could you please expand on your case a little? Which Apache Airflow version are you using, and which version of the Google provider (apache-airflow-providers-google)?


