Can't get spark-streaming-eventhubs_2.11 to work with Jupyter in HDInsight 3.6 (Spark 2.1)


I can launch spark-shell with

spark-shell --packages com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.5

which works fine, but when I use the same package in Jupyter with

%%configure -f
{ "conf": {"spark.jars.packages": "com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.5" }}

it always fails with this cryptic error:

The code failed because of a fatal error: Session 31 unexpectedly reached final status ‘dead’. See logs:

Or sometimes with this one:

The code failed because of a fatal error: Status ‘shutting_down’ not supported by session…

Interestingly, the streaming-eventhubs_2.10 package does work in Jupyter.
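For readers comparing the two artifacts: by Scala convention, the `_2.10` / `_2.11` suffix on an artifact ID names the Scala binary version the artifact was built for. The sketch below pulls that suffix out of a Maven coordinate so two packages can be compared at a glance. `scala_suffix` is a hypothetical helper written for illustration; it is not part of Spark or sparkmagic, and it says nothing about why one variant fails here.

```python
def scala_suffix(coordinate):
    """Return the Scala binary-version suffix of a Maven coordinate, or None.

    Assumes the usual groupId:artifactId:version layout and the Scala
    convention of appending _<scalaVersion> to the artifact ID.
    """
    group, artifact, version = coordinate.split(":")
    # rpartition splits on the last underscore; sep is empty if there is none
    name, sep, suffix = artifact.rpartition("_")
    return suffix if sep else None


if __name__ == "__main__":
    # Coordinate taken from the question above
    print(scala_suffix("com.microsoft.azure:spark-streaming-eventhubs_2.11:2.0.5"))
```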

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

4 reactions
romitgirdhar commented, Nov 30, 2017

Done! I just submitted a PR with a walkthrough of using the JAR in Jupyter with the PySpark3 kernel. Please review and approve at your earliest convenience.

2 reactions
syedhassaanahmed commented, Nov 29, 2017

@sabeegrewal This is how I imported the package in a Jupyter notebook on HDInsight:

%%configure -f
{
    "conf": { 
        "spark.jars.packages": "com.microsoft.azure:azure-eventhubs-spark_2.11:2.1.6",
        "spark.jars.excludes": "org.scala-lang:scala-reflect,org.apache.spark:spark-tags_2.11"
    }
}
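A cell like the one above could also be generated rather than typed by hand, which makes it harder to mistype a coordinate. This is a minimal Python sketch that assumes only the two Spark properties shown in the comment (`spark.jars.packages` and `spark.jars.excludes`, both comma-separated lists). `build_configure_cell` is a hypothetical helper, not part of sparkmagic or HDInsight.

```python
import json


def build_configure_cell(packages, excludes=()):
    """Return the text of a `%%configure -f` cell for the given coordinates.

    packages: Maven coordinates in groupId:artifactId:version form.
    excludes: groupId:artifactId pairs to exclude from dependency resolution.
    """
    for coord in packages:
        if len(coord.split(":")) != 3:
            raise ValueError("expected groupId:artifactId:version, got %r" % coord)
    for coord in excludes:
        if len(coord.split(":")) != 2:
            raise ValueError("expected groupId:artifactId, got %r" % coord)

    # Spark reads both properties as comma-separated strings
    conf = {"spark.jars.packages": ",".join(packages)}
    if excludes:
        conf["spark.jars.excludes"] = ",".join(excludes)
    return "%%configure -f\n" + json.dumps({"conf": conf}, indent=4)


if __name__ == "__main__":
    # Values copied from the comment above
    print(build_configure_cell(
        ["com.microsoft.azure:azure-eventhubs-spark_2.11:2.1.6"],
        ["org.scala-lang:scala-reflect", "org.apache.spark:spark-tags_2.11"],
    ))
```

The printed text can be pasted into the first cell of the notebook, before any Spark code runs.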

