FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported

If I modify the SQL statement in the example (https://github.com/GoogleCloudPlatform/spark-bigquery-connector/blob/master/examples/python/query_results.py) to have a WHERE clause

SELECT * FROM `bigquery-public-data.san_francisco.bikeshare_stations` s JOIN `bigquery-public-data.san_francisco.bikeshare_trips` t ON s.station_id = t.start_station_id WHERE name = 'Mezes Park'

I get the following error:

Waiting for job output…
Querying BigQuery
Reading query results into Spark
Traceback (most recent call last):
  File "/tmp/d34ed08caafd490d9d8ec73c0de1c7f0/demo.py", line 30, in <module>
    df.show()
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 350, in show
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o57.showString.
: com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.FailedPreconditionException: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:59)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
    at com.google.cloud.spark.bigquery.repackaged.com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:982)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:957)
    at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:515)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:490)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:515)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:686)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:675)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
        at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
        at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
        at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient.createReadSession(BigQueryStorageClient.java:237)
        at com.google.cloud.spark.bigquery.direct.DirectBigQueryRelation.buildScan(DirectBigQueryRelation.scala:83)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:293)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:338)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:337)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProjectRaw(DataSourceStrategy.scala:415)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProject(DataSourceStrategy.scala:333)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy.apply(DataSourceStrategy.scala:289)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:63)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
        at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
        at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
        at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
        at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:72)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:68)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:77)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:77)
        at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3254)
        at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
        at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
        at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        … 1 more
    … 1 more
Caused by: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported
    at com.google.cloud.spark.bigquery.repackaged.io.grpc.Status.asRuntimeException(Status.java:533)
    … 24 more

ERROR: (gcloud.dataproc.jobs.submit.pyspark) Job [d34ed08caafd490d9d8ec73c0de1c7f0] failed with error: Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found in ‘gs://dataproc-b888cd53-d066-4ee7-a69f-a8dab6da7af6-us-west2/google-cloud-dataproc-metainfo/e5be50c0-9871-438c-bf8d-dd0f6a42bd04/jobs/d34ed08caafd490d9d8ec73c0de1c7f0/driveroutput’.

Issue Analytics

State:
Created 4 years ago
Comments: 14 (2 by maintainers)

Top GitHub Comments

1 reaction

kmjung commented, May 12, 2020

@sharadbhadouria the fix is being rolled out at the moment. I will update this bug when it’s enabled globally.

1 reaction

kmjung commented, Nov 13, 2019

It looks like you are retrieving the anonymous table to which the query result was written, rather than setting the destination table when submitting the query for execution. All query results are written to tables in BigQuery, but the workaround is to specify an output table rather than letting the system create an anonymous table for you.

I don’t know the BQQuerier object that you’re using here, but at the BQ API level, this means using the jobs.insert API rather than jobs.query, and specifying the destination table in your JobConfigurationQuery.
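As a rough illustration of the workaround described above, the sketch below builds the request body one might send to jobs.insert so that the query result lands in a named destination table instead of an anonymous one. The field names follow the BigQuery REST job resource (configuration.query.destinationTable); the project, dataset, and table names are hypothetical placeholders, the writeDisposition choice is an assumption, and the body is only constructed here, not submitted.

```python
import json

def build_query_job_body(sql, project_id, dataset_id, table_id):
    """Return a jobs.insert request body whose query writes to a named table."""
    return {
        "configuration": {
            "query": {
                "query": sql,
                # The query in the issue is standard SQL (backtick-quoted names).
                "useLegacySql": False,
                # Explicit destination, so no anonymous result table is involved.
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # Assumption: overwrite the table on each run.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }

SQL = (
    "SELECT * FROM `bigquery-public-data.san_francisco.bikeshare_stations` s "
    "JOIN `bigquery-public-data.san_francisco.bikeshare_trips` t "
    "ON s.station_id = t.start_station_id WHERE name = 'Mezes Park'"
)

# "my-project", "my_dataset" and "bikeshare_join" are hypothetical names.
body = build_query_job_body(SQL, "my-project", "my_dataset", "bikeshare_join")
print(json.dumps(body, indent=2))
```

Once such a job has finished, the destination table can presumably be read like any other regular table, which is the point of the workaround.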
