Bulk Write fails with dependency issue
Spark connector version: azure-cosmosdb-spark_2.2.0_2.11-1.1.0
Spark version: 2.2.0 (Scala 2.11)
Environment: Azure Databricks
Bulk write fails with a dependency error: java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.elapsed()Ljava/time/Duration;
A simple call to write to Cosmos DB with the following writeConfig triggers the failure:
import com.microsoft.azure.cosmosdb.spark.config.Config

val writeConfig = Config(Map(
"Endpoint" -> "https://foo.documents.azure.com:443/",
"Masterkey" -> "mysecret",
"Database" -> "mydatabase",
"PreferredRegions" -> "West US;East US;",
"Collection" -> "mydata",
"SamplingRatio" -> "1.0",
"BulkImport" -> "true",
"WritingBatchSize" -> "1000",
"ConnectionMaxPoolSize" -> "100",
"Upsert" -> "true"
))
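For completeness, a sketch of the kind of call that exercises the bulk import path, based on the classes visible in the stack trace (the DataFrame name df is an assumption, not from the original report):

import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark

// df is an existing DataFrame to persist. With "BulkImport" -> "true", the save
// goes through CosmosDBSpark.bulkImport and DocumentBulkExecutor.importAll,
// which is where the NoSuchMethodError below is raised.
CosmosDBSpark.save(df, writeConfig)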
Here is the full stack trace:
Caused by: java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.elapsed()Ljava/time/Duration;
at com.microsoft.azure.documentdb.bulkexecutor.DocumentBulkExecutor.executeBulkImportAsyncImpl(DocumentBulkExecutor.java:619)
at com.microsoft.azure.documentdb.bulkexecutor.DocumentBulkExecutor.executeBulkImportInternal(DocumentBulkExecutor.java:479)
at com.microsoft.azure.documentdb.bulkexecutor.DocumentBulkExecutor.importAll(DocumentBulkExecutor.java:445)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$$anonfun$bulkImport$1.apply(CosmosDBSpark.scala:257)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$$anonfun$bulkImport$1.apply(CosmosDBSpark.scala:241)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$.bulkImport(CosmosDBSpark.scala:241)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$.savePartition(CosmosDBSpark.scala:439)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$.com$microsoft$azure$cosmosdb$spark$CosmosDBSpark$$saveFilePartition(CosmosDBSpark.scala:343)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$$anonfun$1.apply(CosmosDBSpark.scala:183)
at com.microsoft.azure.cosmosdb.spark.CosmosDBSpark$$anonfun$1.apply(CosmosDBSpark.scala:177)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:853)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:853)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:332)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:296)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:110)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:349)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
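For context (my analysis, not part of the original report): the no-argument Stopwatch.elapsed() returning java.time.Duration was only added in Guava 22.0, while Spark 2.x clusters typically carry a much older Guava (14.x) on the classpath, so the DocumentBulkExecutor inside the connector resolves the old class and fails. A quick diagnostic sketch to confirm which Guava wins on the cluster:

// Diagnostic sketch: print where the Stopwatch class is loaded from and which
// elapsed() overloads it exposes. A pre-22.0 Guava only has elapsed(TimeUnit),
// which matches the NoSuchMethodError above.
val stopwatchClass = classOf[com.google.common.base.Stopwatch]
println(stopwatchClass.getProtectionDomain.getCodeSource.getLocation)
stopwatchClass.getMethods.filter(_.getName == "elapsed").foreach(println)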
Top GitHub Comments
This also happens with the latest version, com.microsoft.azure:azure-cosmosdb-spark_2.3.0_2.11:1.2.7, including when following https://docs.microsoft.com/en-us/azure/cosmos-db/spark-connector
Using the uber jar (azure-cosmosdb-spark_2.2.0_2.11-1.1.1-uber.jar) fixed the issue. On Databricks you may need to restart the cluster after attaching the library.
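The uber jar bundles the connector together with compatible versions of its transitive dependencies, which side-steps the Guava clash. A minimal sketch for checking, after the restart, that the required method now resolves (my addition, assuming the uber jar does not relocate the com.google.common package):

// If this throws NoSuchMethodException, the old Guava still wins on the classpath
// and the bulk import will keep failing.
val elapsed = classOf[com.google.common.base.Stopwatch].getMethod("elapsed")
println(s"elapsed() resolves, return type: ${elapsed.getReturnType.getName}")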