"Internal Error" : com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: INTERNAL: request failed: internal error

Hello All,

I am trying to read table metadata from BigQuery using the code below.

Spark BigQuery library: spark-bigquery-with-dependencies_2.11-0.15.1-beta.jar

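// The implicit .bigquery(...) reader method comes from this import.
import com.google.cloud.spark.bigquery._

spark
.read
.bigquery("big-query-200217.132235582.__TABLES__")
.show(false)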

I am getting the exception below. Any kind of help is highly appreciated.

com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.InternalException: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: INTERNAL: request failed: internal error
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:67)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
	at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1074)
	at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1213)
	at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
	at com.google.cloud.spark.bigquery.repackaged.com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:463)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:427)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:460)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
	Suppressed: com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
		at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
		at com.google.cloud.spark.bigquery.repackaged.com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
		at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.storage.v1.BigQueryReadClient.createReadSession(BigQueryReadClient.java:230)
		at com.google.cloud.spark.bigquery.direct.DirectBigQueryRelation.buildScan(DirectBigQueryRelation.scala:134)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$11.apply(DataSourceStrategy.scala:277)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$11.apply(DataSourceStrategy.scala:277)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:321)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:320)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProjectRaw(DataSourceStrategy.scala:401)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy.pruneFilterProject(DataSourceStrategy.scala:316)
		at org.apache.spark.sql.execution.datasources.DataSourceStrategy.apply(DataSourceStrategy.scala:273)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:62)
		at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
		at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
		at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:77)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:74)
		at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
		at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
		at scala.collection.Iterator$class.foreach(Iterator.scala:893)
		at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
		at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
		at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:74)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:66)
		at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
		at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
		at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:92)
		at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
		at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
		at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
		at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
		at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2832)
		at org.apache.spark.sql.Dataset.head(Dataset.scala:2153)
		at org.apache.spark.sql.Dataset.take(Dataset.scala:2366)
		at org.apache.spark.sql.Dataset.showString(Dataset.scala:245)
		at org.apache.spark.sql.Dataset.show(Dataset.scala:646)
		at org.apache.spark.sql.Dataset.show(Dataset.scala:623)
		
Caused by: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: INTERNAL: request failed: internal error
	at com.google.cloud.spark.bigquery.repackaged.io.grpc.Status.asRuntimeException(Status.java:535)
	... 17 more

Top GitHub Comments

1 reaction

chajath commented, Jul 19, 2021

@kmjung I’m in the same boat. I’m having trouble querying project_id.data_set.INFORMATION_SCHEMA.TABLES, since it breaks some hard-wired assumption about table naming. I get the following error:

[error] {
[error]   "code" : 400,
[error]   "errors" : [ {
[error]     "domain" : "global",
[error]     "message" : "Invalid project ID 'project_id:data_set'. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include domain name separated by a colon. IDs must start with a letter and may not end with a dash.",
[error]     "reason" : "invalid"
[error]   } ],
[error]   "message" : "Invalid project ID 'project_id:data_set'. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include domain name separated by a colon. IDs must start with a letter and may not end with a dash.",
[error]   "status" : "INVALID_ARGUMENT"
[error] }

Is there a way to work around this to query INFORMATION_SCHEMA.TABLES?
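
One possible direction, offered only as a sketch: newer releases of the connector document a "query" read option (used together with "viewsEnabled" and "materializationDataset"), which lets BigQuery resolve a metadata view such as INFORMATION_SCHEMA.TABLES itself, so the connector does not have to parse it as a table name. The 0.15.1-beta jar from the original report may not support these options. The object name MetadataQuery and the parameters projectId, dataset and scratchDataset are hypothetical and not part of the connector.

```scala
// Sketch only: builds the SQL and reader options for a query-based read.
// MetadataQuery, projectId, dataset and scratchDataset are hypothetical names.
object MetadataQuery {

  /** SQL that lists table metadata for a single dataset. */
  def informationSchemaSql(projectId: String, dataset: String): String =
    s"SELECT table_name, table_type, creation_time " +
      s"FROM `${projectId}.${dataset}.INFORMATION_SCHEMA.TABLES`"

  /** Reader options; scratchDataset is where query results are materialized. */
  def readerOptions(
      projectId: String,
      dataset: String,
      scratchDataset: String): Map[String, String] =
    Map(
      "viewsEnabled" -> "true",
      "materializationDataset" -> scratchDataset,
      "query" -> informationSchemaSql(projectId, dataset)
    )
}
```

Assuming a connector version that supports these options, the map could be passed as spark.read.format("bigquery").options(MetadataQuery.readerOptions("my-project", "my_dataset", "scratch")).load().show(false), where the three arguments are placeholders.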

0 reactions

davidrabinowitz commented, Jul 21, 2021

The connector has switched to Apache Arrow as the default wire format because it is more performant. Since support for this class was added in Spark 2.3, you can try adding the following configuration to your Spark application:

spark.conf.set("readDataFormat", "AVRO")
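
If the same job has to run on several Spark versions, the setting above could be chosen at runtime. The following is a minimal sketch that assumes the 2.3 cutoff mentioned in this comment is the only relevant condition. ReadFormat and choose are hypothetical names; "ARROW" and "AVRO" are the two values of the readDataFormat option.

```scala
// Sketch only: picks a readDataFormat value from a Spark version string.
// The 2.3 cutoff is taken from the comment above and is an assumption here.
object ReadFormat {

  /** Returns "AVRO" for Spark versions before 2.3, otherwise "ARROW". */
  def choose(sparkVersion: String): String = {
    // Keep only the leading digits of the major and minor components.
    val parts = sparkVersion.split("\\.").toList.take(2).map(_.takeWhile(_.isDigit))
    val major = parts.headOption.filter(_.nonEmpty).map(_.toInt).getOrElse(0)
    val minor = parts.lift(1).filter(_.nonEmpty).map(_.toInt).getOrElse(0)
    if (major > 2 || (major == 2 && minor >= 3)) "ARROW" else "AVRO"
  }
}
```

It could then be applied as spark.conf.set("readDataFormat", ReadFormat.choose(spark.version)). Unparseable version strings fall back to "AVRO", which is the more conservative choice under this assumption.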
