
FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported

See original GitHub issue

If I modify the SQL statement in the example (https://github.com/GoogleCloudPlatform/spark-bigquery-connector/blob/master/examples/python/query_results.py) to add a WHERE clause:

SELECT * FROM `bigquery-public-data.san_francisco.bikeshare_stations` s
JOIN `bigquery-public-data.san_francisco.bikeshare_trips` t
ON s.station_id = t.start_station_id
WHERE name = 'Mezes Park'

I am getting this error:

Waiting for job output...
Querying BigQuery
Reading query results into Spark
Traceback (most recent call last):
  File "/tmp/d34ed08caafd490d9d8ec73c0de1c7f0/demo.py", line 30, in <module>
    df.show()
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 350, in show
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o57.showString.
: com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.FailedPreconditionException: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:59)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:982)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:957)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:515)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:490)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:515)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:686)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:675)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
        at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
        at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
        at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient.createReadSession(BigQueryStorageClient.java:237)
        at com.google.cloud.spark.bigquery.direct.DirectBigQueryRelation.buildScan(DirectBigQueryRelation.scala:83)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:338)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:337)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProjectRaw(DataSourceStrategy.scala:415)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProject(DataSourceStrategy.scala:333)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.apply(DataSourceStrategy.scala:289)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
        at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
        at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:72)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:68)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:77)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:77)
        at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3254)
        at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
        at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
        at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        ... 1 more
    ... 1 more
Caused by: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.Status.asRuntimeException(Status.java:533)
    ... 24 more

ERROR: (gcloud.dataproc.jobs.submit.pyspark) Job [d34ed08caafd490d9d8ec73c0de1c7f0] failed with error: Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in 'gs://dataproc-b888cd53-d066-4ee7-a69f-a8dab6da7af6-us-west2/google-cloud-dataproc-metainfo/e5be50c0-9871-438c-bf8d-dd0f6a42bd04/jobs/d34ed08caafd490d9d8ec73c0de1c7f0/driveroutput'.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 14 (2 by maintainers)

Top GitHub Comments

1 reaction
kmjung commented, May 12, 2020

@sharadbhadouria the fix is being rolled out at the moment. I will update this bug when it’s enabled globally.

1 reaction
kmjung commented, Nov 13, 2019

It looks like you are retrieving the anonymous table to which the query result was written, rather than setting the destination table when submitting the query for execution. All query results are written to tables in BigQuery, but the workaround is to specify an output table rather than letting the system create an anonymous table for you.

I don’t know the BQQuerier object that you’re using here, but at the BQ API level, this means using the jobs.insert API rather than jobs.query, and specifying the destination table in your JobConfigurationQuery.
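
For illustration, here is a minimal sketch of that workaround using the official google-cloud-bigquery Python client (not the poster's BQQuerier, which isn't shown in the issue). The client's query() call submits a jobs.insert-style query job, and the project/dataset/table names below are placeholders:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical destination table -- replace with your own project/dataset/table.
destination = "my-project.my_dataset.bikeshare_results"

# Setting a destination in the job config is the client-library equivalent of
# JobConfigurationQuery.destinationTable; it avoids the anonymous result table.
job_config = bigquery.QueryJobConfig(
    destination=destination,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

sql = """
SELECT * FROM `bigquery-public-data.san_francisco.bikeshare_stations` s
JOIN `bigquery-public-data.san_francisco.bikeshare_trips` t
ON s.station_id = t.start_station_id
WHERE name = 'Mezes Park'
"""

client.query(sql, job_config=job_config).result()  # block until the query job finishes

The Spark connector can then read the named table, e.g. spark.read.format("bigquery").option("table", destination).load(), sidestepping the unsupported storage format of the anonymous result table.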

Read more comments on GitHub

Top Results From Across the Web

  • BigQuery Storage API: the table has a storage format that is ...
    ... FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported #46.
  • there was an error creating the session: the table has ... - GitHub
    FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported #46.
  • Error messages - Resource Manager - Google Cloud
    This response indicates that the requested document has not been modified and ... value specifies an output format that is not supported for...
  • Resolve the error "unable to create input format" in Amazon ...
    When you query the table from Athena, the query fails with the error "HIVE_UNKNOWN_ERROR: Unable to create input format".
  • 412 Precondition Failed - HTTP - MDN Web Docs
    The HyperText Transfer Protocol (HTTP) 412 Precondition Failed client error response code indicates that access to the target resource has been denied.
