
Spark 3.1 support

See original GitHub issue

Databricks has introduced the 8-series runtimes, which are built upon Spark 3.1.1, as shown in the image below. The com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha works fine on Spark 3.0.x but unfortunately does not work on Spark 3.1.x.

If possible, it would be great if the Spark 3 connector could work with all Spark 3.x.x releases. If the updates required for Spark 3.1 are not backward compatible with Spark 3.0, it would be great if a separate Spark 3.1.x-compatible connector could be introduced.

I would be happy to help, but in this case I have no clue where to start.
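One possible starting point: the failure below comes from the connector calling a private Spark helper, `JdbcUtils.schemaString`, whose signature changed in Spark 3.1, so a fix would likely involve version-aware dispatch inside the connector. A minimal sketch of that idea, in Python for illustration only (the real connector is Scala, and the exact shape of the 3.1 overload shown here is an assumption):

```python
def pick_schema_string_variant(spark_version: str) -> str:
    """Return which (hypothetical) JdbcUtils.schemaString overload to call,
    based on the running Spark version. Spark 3.1 changed this private API,
    so a runtime version check like this is one way a single connector jar
    could support both the 3.0.x and 3.1.x lines."""
    major, minor = (int(p) for p in spark_version.split(".")[:2])
    if (major, minor) >= (3, 1):
        # Assumed 3.1-style overload (schema-based) -- not verified here.
        return "schemaString(schema, caseSensitive, url, createTableColumnTypes)"
    # 3.0-style overload, matching the descriptor in the NoSuchMethodError.
    return "schemaString(df, url, createTableColumnTypes)"

print(pick_schema_string_variant("3.0.1"))  # schemaString(df, url, createTableColumnTypes)
print(pick_schema_string_variant("3.1.1"))  # schemaString(schema, caseSensitive, url, createTableColumnTypes)
```

In Scala, the same dispatch would typically be done via reflection or separately compiled shim modules, since both overloads cannot be linked statically at once.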

Error logs on Databricks Runtime 8.1 ML:

Py4JJavaError: An error occurred while calling o483.save.
: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.schemaString(Lorg/apache/spark/sql/Dataset;Ljava/lang/String;Lscala/Option;)Ljava/lang/String;
	at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.mssqlCreateTable(BulkCopyUtils.scala:506)
	at com.microsoft.sqlserver.jdbc.spark.SingleInstanceConnector$.createTable(SingleInstanceConnector.scala:46)
	at com.microsoft.sqlserver.jdbc.spark.Connector.write(Connector.scala:73)
	at com.microsoft.sqlserver.jdbc.spark.DefaultSource.createRelation(DefaultSource.scala:64)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:73)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:71)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:94)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:196)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:240)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:236)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:192)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:163)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:162)
	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:1079)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:126)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:267)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:217)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:1079)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:468)
	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:438)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:311)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Thread.java:748)
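The `NoSuchMethodError` above encodes exactly which signature the connector expected: a JVM method descriptor taking a `Dataset`, a `String`, and a `scala.Option`, and returning a `String`. In Spark 3.1 the private `schemaString` helper no longer exists with that shape, so linkage fails at runtime. A small, self-contained decoder for such descriptors (illustrative only, not part of the connector):

```python
def parse_jvm_descriptor(desc: str):
    """Decode a JVM method descriptor, e.g. the one in the NoSuchMethodError
    above, into human-readable parameter types and a return type."""
    params_part, ret_part = desc[1:].split(")")

    def decode(seq):
        primitives = {"V": "void", "Z": "boolean", "B": "byte", "C": "char",
                      "S": "short", "I": "int", "J": "long", "F": "float",
                      "D": "double"}
        types, i = [], 0
        while i < len(seq):
            dims = 0
            while seq[i] == "[":          # array dimensions
                dims += 1
                i += 1
            if seq[i] == "L":             # object type: Lpkg/Class;
                end = seq.index(";", i)
                t = seq[i + 1:end].replace("/", ".")
                i = end + 1
            else:                          # primitive type
                t = primitives[seq[i]]
                i += 1
            types.append(t + "[]" * dims)
        return types

    return decode(params_part), decode(ret_part)[0]

desc = ("(Lorg/apache/spark/sql/Dataset;Ljava/lang/String;Lscala/Option;)"
        "Ljava/lang/String;")
params, ret = parse_jvm_descriptor(desc)
print(params)  # ['org.apache.spark.sql.Dataset', 'java.lang.String', 'scala.Option']
print(ret)     # java.lang.String
```

This matches the Spark 3.0-era `JdbcUtils.schemaString(df, url, createTableColumnTypes)` the connector was compiled against.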

(Image: supported Databricks runtime releases and support schedule)

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 17
  • Comments: 9

Top GitHub Comments

4 reactions
dsu4rez commented, Aug 20, 2021

@shivsood Thanks a lot for this new version! Has it been published to Maven already?

2 reactions
zegor commented, Jul 27, 2021

Yes please! As time goes on, this compatibility becomes more important.

Read more comments on GitHub >

Top Results From Across the Web

Overview - Spark 3.1.1 Documentation - Apache Spark
Spark runs on Java 8/11, Scala 2.12, Python 3.6+ and R 3.5+. Java 8 prior to version 8u92 support is deprecated as of...

Introducing Apache Spark™ 3.1 - The Databricks Blog
Learn more about the latest release of Apache Spark, version 3.1, ... Python typing support in PySpark was initiated as a third party ...

Spark 3.1 is now Generally Available on HDInsight
Accordingly Jupyter and Zeppelin notebook's only support Python 3.8 from now on. This should be taken note of while migrating to Spark 3.1.2...

Introducing AWS Glue 3.0 with optimized Apache Spark 3.1 ...
To start using AWS Glue 3.0 in AWS Glue Studio, choose the version Glue 3.0 – Supports spark 3.1, Scala 2, Python 3...

Snowflake Connector for Spark
Snowflake supports three versions of Spark: Spark 3.1, Spark 3.2, and Spark 3.3. There is a separate version of the Snowflake connector for...
