Add MERGE functionality

See original GitHub issue

In BigQuery it is possible to merge two tables:

MERGE DATASET.table T
USING (SELECT * FROM DATASET.another_table) S
ON T.id = S.id
WHEN MATCHED THEN
	UPDATE SET value = S.value
WHEN NOT MATCHED THEN
	INSERT ROW

Is it possible to add such functionality to the DataFrame.to_gbq function? I.e., something like df.to_gbq(if_exists='merge', on_a='id', on_b='id'), or even provide a possibility to write a whole ON clause: df.to_gbq(if_exists='merge', on='A.city = B.city AND A.date = B.date')?
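As a rough illustration of what such an option would have to generate, here is a minimal sketch that builds the MERGE statement from a target table, a staging table, an ON clause, and the columns to update. The function name `build_merge_sql` and its parameters are hypothetical and not part of pandas-gbq; the `T`/`S` aliases follow the SQL above rather than the `A`/`B` aliases in the proposal. Table and column names are interpolated as-is, so this assumes trusted input.

```python
def build_merge_sql(target, staging, on, update_columns):
    """Return a BigQuery MERGE statement that upserts `staging` into `target`.

    Hypothetical helper: `on` is a raw ON clause written against the
    aliases T (target) and S (staging).
    """
    if not update_columns:
        raise ValueError("update_columns must not be empty")
    # One "col = S.col" assignment per column to overwrite on a match.
    set_clause = ", ".join(f"{col} = S.{col}" for col in update_columns)
    return (
        f"MERGE `{target}` T\n"
        f"USING `{staging}` S\n"
        f"ON {on}\n"
        "WHEN MATCHED THEN\n"
        f"  UPDATE SET {set_clause}\n"
        "WHEN NOT MATCHED THEN\n"
        "  INSERT ROW"
    )


sql = build_merge_sql(
    "DATASET.table", "DATASET.another_table", "T.id = S.id", ["value"]
)
print(sql)
```

In practice the DataFrame would first have to be uploaded to the staging table, for example with pandas_gbq.to_gbq(df, "DATASET.another_table", if_exists="replace"), and the generated statement would then be run through a BigQuery client.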

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 4
  • Comments: 12 (1 by maintainers)

Top GitHub Comments

1 reaction
parthea commented, Apr 10, 2022

@violetbrina, PRs are welcome!

1 reaction
ZiggerZZ commented, Dec 14, 2020

I’m not yet familiar with the pandas-gbq codebase, so it will take me a while. In my projects, I split data into batches and merge them into the main table using temporary tables. Could we do it like this? Here’s the pseudocode:

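# We want to merge `data` into `table_id` using `condition`.
# Assumes `client` is a google.cloud.bigquery.Client; split_data_into_batches
# and _id_generator are helpers that this pseudocode leaves undefined.
import datetime
from google.cloud import bigquery

target_schema = client.get_table(table_id).schema
data_batches = split_data_into_batches(data)
for data_batch in data_batches:
    tmp_id = _id_generator()
    table_tmp_id = f"{table_id}_tmp_{tmp_id}"
    # Create a temporary table with the target's schema that expires in 1 hour.
    table_tmp = bigquery.Table(table_tmp_id, schema=target_schema)
    table_tmp.expires = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1)
    client.create_table(table_tmp)
    # Load the batch into the temporary table (assumes a pandas DataFrame).
    client.load_table_from_dataframe(data_batch, table_tmp_id).result()
    query = f"""MERGE `{table_id}` T
USING `{table_tmp_id}` S
ON {condition}
WHEN MATCHED THEN
	UPDATE SET ...
WHEN NOT MATCHED THEN
	INSERT ROW"""
    client.query(query).result()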