Connection issues while using on Databricks

Any idea how to fix this?

: java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at http://spark.apache.org/third-party-projects.html

conf = SparkConf() \
    .setAppName(appName) \
    .setMaster(master) \
    .set("spark.driver.extraClassPath","C:/Users/XXXX//mssql-jdbc-8.3.1.jre14-preview.jar")\
    .set("spark.sql.execution.arrow.enabled", True)
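
One thing worth checking here: the configuration above puts only the Microsoft JDBC driver jar on the driver classpath, while the format name `com.microsoft.sqlserver.jdbc.spark` comes from the separate Spark connector jar. A minimal sketch, assuming both jars exist locally, joins them into a single classpath value. The jar paths and the `build_classpath` helper are hypothetical and not part of PySpark or the connector.

```python
import os


def build_classpath(jar_paths, sep=os.pathsep):
    """Join jar paths into one classpath string, skipping blanks and duplicates."""
    seen = []
    for path in jar_paths:
        path = path.strip()
        if path and path not in seen:
            seen.append(path)
    return sep.join(seen)


# Hypothetical jar locations; replace them with the real paths on your machine.
jars = [
    "C:/jars/mssql-jdbc.jar",
    "C:/jars/spark-mssql-connector.jar",
]

# ";" is the classpath separator on Windows; os.pathsep picks the local default.
classpath = build_classpath(jars, sep=";")
print(classpath)

# The value would then be set once, for example:
# SparkConf().set("spark.driver.extraClassPath", classpath)
```

Setting the key once with the joined value avoids a later `.set` call on the same key replacing an earlier one.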

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

zacqed commented, Jun 28, 2020

I switched to a local setup to try the possible combinations. The outcome is as follows:

Spark: 3.0.0
Environment: Windows

Code:

from pyspark import SparkContext, SparkConf, SQLContext

appName = "PySpark SQL Server"
master = "local[*]"
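# Note: the two spark.driver.extraClassPath calls below use the same key,
# so the second value replaces the first.
conf = SparkConf() \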
    .setAppName(appName) \
    .setMaster(master) \
    .set("spark.driver.extraClassPath","c:/users/xxxx/mssql-jdbc-8.2.1.jre8.jar") \
    .set("spark.driver.extraClassPath","C:/Users/xxxx/apache-spark-sql-connector.jar") \
    .set("spark.sql.execution.arrow.enabled", True)
sc = SparkContext.getOrCreate(conf=conf)
sqlContext = SQLContext(sc)
spark = sqlContext.sparkSession

hostname = "hostname"
database = "database"
port = "port"
table = "table"
user = "user"
password  = "password"

# The code as recommended in readme
jdbcDF = spark.read \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .option("url", f"jdbc:sqlserver://{hostname}:{port};databaseName={database}") \
        .option("dbtable", table) \
        .option("user", user) \
        .option("password", password).load()
# The above code leads to the following exception:
# An error occurred while calling o75.load.
# : java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class
# 	at com.microsoft.sqlserver.jdbc.spark.DefaultSource.<init>(DefaultSource.scala:19)
# Tried a combination to check
jdbcDF = spark.read \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .option("url", f"jdbc:sqlserver://{hostname}:{port};databaseName={database}") \
        .option('driver', 'com.microsoft.sqlserver.jdbc.SQLServerDriver') \
        .option("dbtable", table) \
        .option("user", user) \
        .option("password", password).load()
# The above code leads to the following exception:
# An error occurred while calling o85.load.
# : java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class
# 	at com.microsoft.sqlserver.jdbc.spark.DefaultSource.<init>(DefaultSource.scala:19)
# The code that we generally use in our environment
jdbcDF = spark.read.format("jdbc") \
    .option("url", f"jdbc:sqlserver://{hostname}:{port};databaseName={database}") \
    .option("dbtable", table) \
    .option("user", user) \
    .option("password", password) \
    .option('driver', 'com.microsoft.sqlserver.jdbc.SQLServerDriver')\
    .load()

This works fine.
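
The `Logging$class` name in the traces above is typical of a jar compiled for Scala 2.11 running on a Scala 2.12 build of Spark, so the connector jar and Spark 3.0.0 may simply not match. That is an inference, not something the thread confirms. In the meantime, the generic `jdbc` reader is the path reported to work. The sketch below keeps its options in one place; `jdbc_options` is a hypothetical helper, not part of PySpark, and the host, port and credentials are placeholders.

```python
def jdbc_options(hostname, port, database, table, user, password):
    """Return the options used with the generic "jdbc" reader shown above."""
    return {
        "url": f"jdbc:sqlserver://{hostname}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


# Placeholder values only; 1433 is used here as an example port.
opts = jdbc_options("hostname", "1433", "database", "table", "user", "password")
print(opts["url"])

# With an existing SparkSession named `spark`, the dictionary can be passed as:
# jdbcDF = spark.read.format("jdbc").options(**opts).load()
```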

drapn commented, Feb 18, 2022

@zacqed Thank you very much! Your information helped me a lot!
