`read_gbq` fails with ValueError when a column is of type `TIME`

I believe the smallest example that reproduces this is:

from pandas_gbq import read_gbq

read_gbq('select current_time()')

The traceback is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas_gbq/gbq.py", line 967, in read_gbq
    progress_bar_type=progress_bar_type,
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas_gbq/gbq.py", line 532, in run_query
    progress_bar_type=progress_bar_type,
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas_gbq/gbq.py", line 562, in _download_results
    progress_bar_type=progress_bar_type,
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/google/cloud/bigquery/table.py", line 1776, in to_dataframe
    for frame in self.to_dataframe_iterable(dtypes=dtypes):
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/google/cloud/bigquery/table.py", line 1439, in _to_page_iterable
    for item in tabledata_list_download():
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 565, in download_dataframe_tabledata_list
    yield _tabledata_list_page_to_dataframe(page, column_names, dtypes)
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 539, in _tabledata_list_page_to_dataframe
    columns[column_name] = pandas.Series(page._columns[column_index], dtype=dtype)
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas/core/series.py", line 327, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas/core/construction.py", line 447, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas/core/construction.py", line 568, in _try_cast
    subarr = construct_1d_ndarray_preserving_na(subarr, dtype, copy=copy)
  File "/home/german/Documents/c4v/infra-tools/probe-analyzer/env/lib/python3.6/site-packages/pandas/core/dtypes/cast.py", line 1618, in construct_1d_ndarray_preserving_na
    subarr = np.array(values, dtype=dtype, copy=copy)

I believe that mapping the TIME data type to the object dtype, instead of its current dtype, would fix the issue:

https://github.com/pydata/pandas-gbq/blob/d251db03b159447331ac9ae63e13d295d75bad70/pandas_gbq/gbq.py#L698-L710

… but I haven’t tried it yet, and I have never contributed to this repository before.
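
A minimal sketch of why the proposed mapping might help, assuming the TIME column reaches the client as plain Python datetime.time objects. It calls NumPy directly instead of going through pandas-gbq, and the variable names are hypothetical. The first call mirrors the last frame of the traceback (np.array with a datetime64[ns] dtype), the second uses the object dtype suggested above.

```python
import datetime

import numpy as np

# Assumption: a BigQuery TIME column is delivered as datetime.time objects.
values = [datetime.time(12, 34, 56), datetime.time(0, 0, 1)]

# Mirrors the final frame of the traceback: forcing a datetime64[ns] dtype.
# A time-of-day has no date part, so NumPy cannot build a timestamp from it.
try:
    np.array(values, dtype="datetime64[ns]")
    cast_failed = False
except (TypeError, ValueError) as exc:
    cast_failed = True
    print(f"datetime64[ns] cast failed: {type(exc).__name__}")

# With the object dtype the datetime.time values are stored unchanged.
as_object = np.array(values, dtype=object)
print(as_object.dtype, as_object[0])
```

This only illustrates the dtype mismatch; whether changing the mapping in pandas-gbq is the right fix still depends on how the library passes dtypes downstream.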

Any other ideas for solving this issue?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7

Top GitHub Comments

tswast commented, Sep 15, 2020 (1 reaction)

Oh, and thanks for the reproducible test case!

tswast commented, Nov 9, 2020 (0 reactions)

I just stepped through the code, and https://github.com/googleapis/python-bigquery/blob/0c387dadd57fba9cdbfd39abe530de209943db9a/google/cloud/bigquery/table.py#L1721 isn’t failing on my machine, but it’s also not trying to convert from object to datetime64[ns] despite being passed that as a dtype.
