Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

BigQuery: insert_rows does not seem to work

See original GitHub issue

Hello, I have this code snippet:

from google.cloud import bigquery

client = bigquery.Client(...)
# Fetch the table so insert_rows can use its schema.
table = client.get_table(
    client.dataset("Integration_tests").table("test")
)
print(table.schema)
rows = [
    {"doi": "test-{}".format(i), "subjects": ["something"]}
    for i in range(1000)
]
client.insert_rows(table, rows)

This produces the following output:

DEBUG:urllib3.util.retry:Converted retries value: 3 -> Retry(total=3, connect=None, read=None, redirect=None, status=None)
DEBUG:google.auth.transport.requests:Making request: POST https://accounts.google.com/o/oauth2/token
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): accounts.google.com:443
DEBUG:urllib3.connectionpool:https://accounts.google.com:443 "POST /o/oauth2/token HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.googleapis.com:443
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "GET /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test HTTP/1.1" 200 None
[SchemaField('doi', 'STRING', 'REQUIRED', None, ()), SchemaField('subjects', 'STRING', 'REPEATED', None, ())]
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "POST /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test/insertAll HTTP/1.1" 200 None

It seems like it worked, but when I go to my table it’s empty. Any idea?

Python version: 3.6.0
Library versions: google-cloud-bigquery==1.1.0, google-cloud-core==0.28.1
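
A side note that may help anyone debugging the same symptom: client.insert_rows returns a list of per-row insert errors, so it is worth capturing that return value rather than discarding it; an empty list only means the rows were accepted into the streaming buffer, and rows in the streaming buffer may not show up immediately in the table preview. A minimal sketch, assuming the same client, table, and rows as in the snippet above:

errors = client.insert_rows(table, rows)
if errors:
    # Each entry describes a failed row and its per-row errors.
    print("Insert errors:", errors)
else:
    print("All rows accepted into the streaming buffer.")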

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 32 (10 by maintainers)

Top GitHub Comments

3 reactions
markvincze commented, Sep 13, 2019

Okay, I think I might have found a solution.

In the “Streaming into ingestion-time partitioned tables” section of the BigQuery streaming documentation, there is the suggestion that the partition can be explicitly specified with the syntax mydataset.table$20170301.
If I do this (i.e., replace table_ref = dataset_ref.table('payload_logs') with dataset_ref.table('payload_logs$20190913') in the code above), then it works, and the rows are immediately returned by queries.

This is a bit surprising to me, because if I don’t specify the partition time explicitly, I’d expect BigQuery to simply use the current UTC date, which seems identical to what I’m doing when I specify it explicitly.
Anyhow, this seems to solve the issue.
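
For reference, a minimal sketch of this workaround; the dataset name my_dataset and the payload field are illustrative placeholders, and the $YYYYMMDD suffix is BigQuery's partition decorator syntax:

from datetime import datetime, timezone

from google.cloud import bigquery

client = bigquery.Client()

# Build today's $YYYYMMDD partition decorator from the current UTC date.
suffix = datetime.now(timezone.utc).strftime("%Y%m%d")
table_ref = client.dataset("my_dataset").table("payload_logs$" + suffix)

table = client.get_table(table_ref)  # fetch the schema for insert_rows
errors = client.insert_rows(table, [{"payload": "hello"}])
print(errors)  # an empty list means the rows were accepted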

1 reaction
srinidhi-shankar commented, Aug 6, 2019

I had the same problem. I got around it by using load jobs to push the data instead of client.insert_rows.

Like this:

import io

from google.cloud import bigquery

# `data` is expected to hold newline-delimited JSON, one record per line.
table_ref = dataset_ref.table(table_id)
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = False  # use the table's existing schema

job = client.load_table_from_file(io.StringIO(data), table_ref, job_config=job_config)
job.result()  # waits for the load job to complete
print("Loaded {} rows into {}:{}.".format(job.output_rows, dataset_id, table_id))

Reference: https://cloud.google.com/bigquery/docs/loading-data-local
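
For completeness, a hypothetical sketch of how the names left undefined above (client, dataset_ref, dataset_id, table_id, and data) might be wired up, reusing the schema from the original question:

import json

from google.cloud import bigquery

client = bigquery.Client()
dataset_id, table_id = "Integration_tests", "test"
dataset_ref = client.dataset(dataset_id)

# Serialize the rows as newline-delimited JSON, one record per line.
rows = [{"doi": "test-{}".format(i), "subjects": ["something"]} for i in range(1000)]
data = "\n".join(json.dumps(row) for row in rows)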

Read more comments on GitHub >

Top Results From Across the Web

BigQuery: insert rows, but it's not written - Stack Overflow
This can happen if you do the insert right after deleting and re-creating the table. The streaming buffer of a deleted table is...
Error messages | BigQuery - Google Cloud
This document describes error messages you might encounter when working with BigQuery, including HTTP error codes, job errors, and Google Cloud console ...
Insert rows in BigQuery tables with complex columns - Adaltas
When reading the schema in BigQuery's UI, the complex column will first appear with its defined type and mode (record, nullable) and then...
BigQuery INSERT and UPDATE Commands - Hevo Data
Since BigQuery is a Data Warehouse service, its querying layer plays a big role in its acceptability for use cases. Data Manipulation statements ......
Chapter 4. Loading Data into BigQuery - O'Reilly
Hence, it would not work for the college scorecard dataset unless we had staged it in Google Cloud Storage first. Even if you...
