Writing into EventHubs, Spark executors fail silently when getting throttled
I have a job running in Databricks which writes the result of a batch query (which returns 7 million rows) into an eventhub with 1 throughput unit (TU). The job completes successfully, but when I inspect the eventhub I see only around 2 million events published.
On further inspection of the executor logs I see com.microsoft.azure.eventhubs.ServerBusyException being thrown, but this error does not abort the Spark job. It is therefore possible for the write operation to return without any exception and still fail to publish all the events.
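Sample code:
import org.apache.spark.eventhubs.EventHubsConf

// peersEhConnStr: Event Hubs connection string; peersFrame: result of the batch query
val ehWriteConf = EventHubsConf(peersEhConnStr)
val peersLAFrame = peersFrame.select($"body", $"properties")
peersLAFrame.write
  .format("eventhubs")
  .options(ehWriteConf.toMap)
  .save()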
Bug Report:
- Actual behavior: Throttled eventhub write/ingress operations fail silently.
- Expected behavior: When the eventhub operation is throttled, that error should abort the job.
- Spark version: Databricks 5.5 LTS (includes Apache Spark 2.4.3, Scala 2.11)
- spark-eventhubs artifactId and version: azure-eventhubs-spark_2.11:2.3.13
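To illustrate the expected behavior, here is a minimal sketch of "retry, then fail loudly" in plain Scala. It is not the connector's implementation: `ThrottledException`, `FailLoudly`, and `sendWithRetry` are hypothetical names, and `ThrottledException` only stands in for `com.microsoft.azure.eventhubs.ServerBusyException` so the snippet runs without the Event Hubs client on the classpath.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for com.microsoft.azure.eventhubs.ServerBusyException.
final class ThrottledException(msg: String) extends RuntimeException(msg)

object FailLoudly {
  // Retry `send` with exponential backoff while it is throttled.
  // Once the attempts are exhausted, the last error is rethrown so the
  // caller (for example a Spark task) fails instead of dropping the event.
  @tailrec
  def sendWithRetry[A](maxAttempts: Int, backoffMs: Long)(send: () => A): A =
    Try(send()) match {
      case Success(result) => result
      case Failure(_: ThrottledException) if maxAttempts > 1 =>
        Thread.sleep(backoffMs)
        sendWithRetry(maxAttempts - 1, backoffMs * 2)(send)
      case Failure(error) => throw error // never swallow the failure
    }
}
```

The point of the design is the last case: a throttling error that survives all retries propagates to the caller, so a job cannot report success while events are missing.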
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
@nyaghma Yes. Just tested it; throws an exception with 2.3.16!
@rvoak Thanks for letting us know. Can you please try again with version 2.3.16? You should be able to see the exception in this version. Please let me know if the issue still happens with version 2.3.16.
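For anyone picking up the fix, a sketch of the version bump, assuming the project is built with sbt (the thread does not say how the library was installed; on Databricks the same Maven coordinate can be attached as a cluster library instead):

```scala
// build.sbt fragment: move from 2.3.13 to the release that surfaces the exception.
// Scala 2.11 artifact, matching Databricks 5.5 LTS.
libraryDependencies += "com.microsoft.azure" % "azure-eventhubs-spark_2.11" % "2.3.16"
```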