The 0.16.0 version fails with "Could not initialize class com.google.cloud.spark.bigquery.SparkBigQueryConnectorUserAgentProvider"


The current latest version fails to fetch data from BigQuery with java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.spark.bigquery.SparkBigQueryConnectorUserAgentProvider

Basic metadata can be fetched, so the DataFrames are created, but the job fails as soon as the data has to be materialized or computed.

The exact fat jar used is https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-latest_2.11.jar, and it is passed to spark-submit with --jars.

The code is a simple spark load: spark.read.format("bigquery").option("table", "project.dataset.tablename").load()

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
medb commented, Jul 1, 2020

You can list all available Spark BQ connector jars using gsutil:

$ gsutil ls -al gs://spark-lib/bigquery/
  19849551  2019-03-07T01:11:14Z  gs://spark-lib/bigquery/spark-bigquery-assembly-0.5.0-beta.jar#1551921074007485  metageneration=1
  21098621  2019-06-26T00:17:54Z  gs://spark-lib/bigquery/spark-bigquery-assembly-0.6.0-beta.jar#1561508274015522  metageneration=1
  21092168  2019-06-28T22:29:58Z  gs://spark-lib/bigquery/spark-bigquery-assembly-0.7.0-beta.jar#1561760998682150  metageneration=1
  21909442  2019-10-09T22:54:00Z  gs://spark-lib/bigquery/spark-bigquery-assembly-0.8.1-beta.jar#1570661640827792  metageneration=1
  33796003  2020-06-11T15:31:57Z  gs://spark-lib/bigquery/spark-bigquery-latest.jar#1591889517604253  metageneration=1
  33796003  2020-06-11T15:31:59Z  gs://spark-lib/bigquery/spark-bigquery-latest_2.11.jar#1591889519333517  metageneration=1
  33534361  2020-06-11T15:32:03Z  gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar#1591889523678710  metageneration=1
  26172853  2020-01-29T17:31:04Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.12.0-beta.jar#1580319064309367  metageneration=1
  26745435  2020-02-14T22:24:15Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.13.1-beta.jar#1581719055262477  metageneration=1
  27510856  2020-03-31T19:32:02Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.14.0-beta.jar#1585683122570642  metageneration=1
  32434023  2020-04-21T01:50:49Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.15.0-beta.jar#1587433849225927  metageneration=1
  32436661  2020-04-27T17:11:25Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.15.1-beta.jar#1588007485112340  metageneration=1
  33796043  2020-06-10T17:20:55Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.16.0.jar#1591809655640723  metageneration=1
  33796003  2020-06-11T15:31:55Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.16.1.jar#1591889515913264  metageneration=1
  26103186  2020-01-29T17:31:10Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.12.0-beta.jar#1580319070056211  metageneration=1
  26659161  2020-02-14T22:24:20Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.13.1-beta.jar#1581719060453402  metageneration=1
  27421319  2020-03-31T19:32:31Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.14.0-beta.jar#1585683151505446  metageneration=1
  32188366  2020-04-21T01:49:59Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.15.0-beta.jar#1587433799743031  metageneration=1
  32189382  2020-04-27T17:11:15Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.15.1-beta.jar#1588007475534869  metageneration=1
  33534399  2020-06-10T17:21:01Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.16.0.jar#1591809661916636  metageneration=1
  33534361  2020-06-11T15:32:01Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.16.1.jar#1591889521931635  metageneration=1
  24495214  2019-11-15T00:14:06Z  gs://spark-lib/bigquery/spark-bigquery_2.11-0.10.0-beta-shaded.jar#1573776846054085  metageneration=1
  29064121  2019-12-18T01:04:26Z  gs://spark-lib/bigquery/spark-bigquery_2.11-0.11.0-beta-shaded.jar#1576631066526656  metageneration=1
  24449554  2019-11-12T17:24:26Z  gs://spark-lib/bigquery/spark-bigquery_2.11-0.9.0-beta-shaded.jar#1573579466230193  metageneration=1
  24450418  2019-11-12T17:22:18Z  gs://spark-lib/bigquery/spark-bigquery_2.11-0.9.1-beta-shaded.jar#1573579338461138  metageneration=1
  24476342  2019-11-12T17:22:20Z  gs://spark-lib/bigquery/spark-bigquery_2.11-0.9.2-beta-shaded.jar#1573579340376743  metageneration=1
  24426539  2019-11-15T00:14:12Z  gs://spark-lib/bigquery/spark-bigquery_2.12-0.10.0-beta-shaded.jar#1573776852403048  metageneration=1
  28993273  2019-12-18T01:04:31Z  gs://spark-lib/bigquery/spark-bigquery_2.12-0.11.0-beta-shaded.jar#1576631071111697  metageneration=1
  24393077  2019-11-12T17:24:28Z  gs://spark-lib/bigquery/spark-bigquery_2.12-0.9.0-beta-shaded.jar#1573579468329635  metageneration=1
  24394046  2019-11-12T17:22:22Z  gs://spark-lib/bigquery/spark-bigquery_2.12-0.9.1-beta-shaded.jar#1573579342419191  metageneration=1
  24415555  2019-11-12T17:22:24Z  gs://spark-lib/bigquery/spark-bigquery_2.12-0.9.2-beta-shaded.jar#1573579344322467  metageneration=1
TOTAL: 31 objects, 863156336 bytes (823.17 MiB)

You can use any of these jars.
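
To pick a pinned jar out of a listing like the one above, the output can be parsed programmatically. The following is a minimal Python sketch and not part of the original thread: the function name newest_jar and the regular expression are assumptions made for illustration, and the code only considers the spark-bigquery-with-dependencies naming scheme that appears in the listing.

```python
import re
from typing import Optional

# Matches versioned "with-dependencies" jars such as
# spark-bigquery-with-dependencies_2.11-0.16.1.jar, with an optional -beta suffix.
# Group 1: object path without the "#generation" suffix
# Group 2: Scala version, group 3: connector version
JAR_PATTERN = re.compile(
    r"(gs://\S*/spark-bigquery-with-dependencies_(\d+\.\d+)-(\d+(?:\.\d+)*)(?:-beta)?\.jar)"
)


def newest_jar(listing: str, scala_version: str) -> Optional[str]:
    """Return the highest-versioned jar path for a Scala version, or None."""
    best = None
    for line in listing.splitlines():
        match = JAR_PATTERN.search(line)
        if match is None or match.group(2) != scala_version:
            continue
        # Compare versions numerically, so that 0.16.1 sorts above 0.9.2.
        version = tuple(int(part) for part in match.group(3).split("."))
        if best is None or version > best[0]:
            best = (version, match.group(1))
    return best[1] if best else None


# Two lines copied from the listing above.
sample = """\
  33796043  2020-06-10T17:20:55Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.16.0.jar#1591809655640723  metageneration=1
  33796003  2020-06-11T15:31:55Z  gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.16.1.jar#1591889515913264  metageneration=1
"""
print(newest_jar(sample, "2.11"))
```

In practice the listing text would come from the gsutil command shown above. The -latest aliases are skipped on purpose, since they do not encode a version.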

0 reactions
sagarkbhatt commented, Jul 1, 2020

Instead of -latest.jar, can you provide the gs:// bucket path for a specific version?
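
The listing in this thread already shows versioned objects next to the -latest aliases, for example spark-bigquery-with-dependencies_2.11-0.16.1.jar. As a rough Python sketch, a pinned path and the matching spark-submit arguments could be assembled as below. The helper names pinned_jar_path and spark_submit_args and the script name job.py are hypothetical, the naming pattern is taken from the 0.16.x entries in the listing (older releases carry a -beta suffix and would not match), and it is assumed that the cluster can read gs:// paths, as on Dataproc.

```python
from typing import List


def pinned_jar_path(scala_version: str, connector_version: str) -> str:
    """Build the bucket path of a versioned connector jar (0.16.x naming)."""
    return (
        "gs://spark-lib/bigquery/"
        f"spark-bigquery-with-dependencies_{scala_version}-{connector_version}.jar"
    )


def spark_submit_args(jar: str, app: str) -> List[str]:
    """Return a spark-submit command line that passes the jar via --jars."""
    return ["spark-submit", "--jars", jar, app]


jar = pinned_jar_path("2.11", "0.16.1")
# job.py is a placeholder for the user's own Spark application.
print(" ".join(spark_submit_args(jar, "job.py")))
```

Pinning the version this way keeps a job on the same connector build even when the -latest alias is updated.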
