Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

BigQuery: insert_rows does not seem to work

See original GitHub issue

Hello, I have this code snippet:

from google.cloud import bigquery

client = bigquery.Client(...)
# Fetch the table so insert_rows can use its schema.
table = client.get_table(
    client.dataset("Integration_tests").table("test")
)
print(table.schema)
rows = [
    {"doi": "test-{}".format(i), "subjects": ["something"]}
    for i in range(1000)
]
client.insert_rows(table, rows)

This produces the following output:

DEBUG:urllib3.util.retry:Converted retries value: 3 -> Retry(total=3, connect=None, read=None, redirect=None, status=None)
DEBUG:google.auth.transport.requests:Making request: POST https://accounts.google.com/o/oauth2/token
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): accounts.google.com:443
DEBUG:urllib3.connectionpool:https://accounts.google.com:443 "POST /o/oauth2/token HTTP/1.1" 200 None
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.googleapis.com:443
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "GET /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test HTTP/1.1" 200 None
[SchemaField('doi', 'STRING', 'REQUIRED', None, ()), SchemaField('subjects', 'STRING', 'REPEATED', None, ())]
DEBUG:urllib3.connectionpool:https://www.googleapis.com:443 "POST /bigquery/v2/projects/{projectname}/datasets/Integration_tests/tables/test/insertAll HTTP/1.1" 200 None

It seems like it worked, but when I go to my table it’s empty. Any idea?

Python version: 3.6.0
Library versions: google-cloud-bigquery==1.1.0, google-cloud-core==0.28.1
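
A side note that may help anyone debugging the same symptom: client.insert_rows returns a list of per-row insert errors, so it is worth capturing that return value rather than discarding it; an empty list only means the rows were accepted into the streaming buffer, and rows in the streaming buffer may not show up immediately in the table preview. A minimal sketch, assuming the same client, table, and rows as in the snippet above:

errors = client.insert_rows(table, rows)
if errors:
    # Each entry describes a failed row and its per-row errors.
    print("Insert errors:", errors)
else:
    print("All rows accepted into the streaming buffer.")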

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 32 (10 by maintainers)

Top GitHub Comments

3 reactions
markvincze commented, Sep 13, 2019

Okay, I think I might have found a solution.

In the “Streaming into ingestion-time partitioned tables” section of the BigQuery streaming documentation, there is the suggestion that the partition can be explicitly specified with the syntax mydataset.table$20170301.
If I do this (i.e., replace table_ref = dataset_ref.table('payload_logs') with dataset_ref.table('payload_logs$20190913') in the code above), then it works, and the rows are immediately returned by queries.

This is a bit surprising to me, because if I don’t specify the partition time explicitly, I’d expect BigQuery to simply use the current UTC date, which seems identical to what I’m doing when I specify it explicitly.
Anyhow, this seems to solve the issue.
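
For reference, a minimal sketch of this workaround; the dataset name my_dataset and the payload field are illustrative placeholders, and the $YYYYMMDD suffix is BigQuery's partition decorator syntax:

from datetime import datetime, timezone

from google.cloud import bigquery

client = bigquery.Client()

# Build today's $YYYYMMDD partition decorator from the current UTC date.
suffix = datetime.now(timezone.utc).strftime("%Y%m%d")
table_ref = client.dataset("my_dataset").table("payload_logs$" + suffix)

table = client.get_table(table_ref)  # fetch the schema for insert_rows
errors = client.insert_rows(table, [{"payload": "hello"}])
print(errors)  # an empty list means the rows were accepted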

1 reaction
srinidhi-shankar commented, Aug 6, 2019

I had the same problem. I got around it by using load jobs to push the data instead of client.insert_rows.

Like this:

import io

from google.cloud import bigquery

# `data` is expected to hold newline-delimited JSON, one record per line.
table_ref = dataset_ref.table(table_id)
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = False  # use the table's existing schema

job = client.load_table_from_file(io.StringIO(data), table_ref, job_config=job_config)
job.result()  # waits for the load job to complete
print("Loaded {} rows into {}:{}.".format(job.output_rows, dataset_id, table_id))

Reference: https://cloud.google.com/bigquery/docs/loading-data-local
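
For completeness, a hypothetical sketch of how the names left undefined above (client, dataset_ref, dataset_id, table_id, and data) might be wired up, reusing the schema from the original question:

import json

from google.cloud import bigquery

client = bigquery.Client()
dataset_id, table_id = "Integration_tests", "test"
dataset_ref = client.dataset(dataset_id)

# Serialize the rows as newline-delimited JSON, one record per line.
rows = [{"doi": "test-{}".format(i), "subjects": ["something"]} for i in range(1000)]
data = "\n".join(json.dumps(row) for row in rows)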

Read more comments on GitHub >

Top Results From Across the Web

BigQuery: insert rows, but it's not written - Stack Overflow
This can happen if you do the insert right after deleting and re-creating the table. The streaming buffer of a deleted table is...
Error messages | BigQuery - Google Cloud
This document describes error messages you might encounter when working with BigQuery, including HTTP error codes, job errors, and Google Cloud console ...
Insert rows in BigQuery tables with complex columns - Adaltas
When reading the schema in BigQuery's UI, the complex column will first appear with its defined type and mode (record, nullable) and then...
BigQuery INSERT and UPDATE Commands - Hevo Data
Since BigQuery is a Data Warehouse service, its querying layer plays a big role in its acceptability for use cases. Data Manipulation statements ......
Chapter 4. Loading Data into BigQuery - O'Reilly
Hence, it would not work for the college scorecard dataset unless we had staged it in Google Cloud Storage first. Even if you...
