Spark 3.1 support
Databricks has introduced the 8 series runtimes, which are built upon Spark 3.1.1, as shown in the image below.
The com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha connector works perfectly on Spark 3.0.x but unfortunately does not work on Spark 3.1.x.
If possible, it would be great if the Spark 3 connector could work with all Spark 3.x.x versions. If the updates required for Spark 3.1 are not backward compatible with Spark 3.0, it would be great if a Spark 3.1.x compatible connector could be introduced.
I would be happy to help, but in this case I have no clue where to start.
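Until a compatible build exists, a job can fail fast with a readable message instead of a deep JVM error. The following is only a minimal sketch: the function names and the `CONNECTOR_BUILT_FOR` constant are hypothetical, and the only compatibility fact it encodes is the one reported in this issue (the 1.0.0-alpha connector works on Spark 3.0.x and not on 3.1.x).

```python
# Hypothetical guard, not part of the connector. It encodes only what this
# issue reports: the 1.0.0-alpha connector built for Spark 3.0 works on 3.0.x.
CONNECTOR_BUILT_FOR = (3, 0)


def spark_major_minor(version):
    """Return (major, minor) from a Spark version string such as '3.1.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_connector_compatible(version):
    """True when the running Spark line matches the line the connector targets."""
    return spark_major_minor(version) == CONNECTOR_BUILT_FOR


def require_compatible(version):
    """Raise a clear error before any write is attempted on an untested Spark line."""
    if not is_connector_compatible(version):
        raise RuntimeError(
            "Connector targets Spark %d.%d.x but the cluster runs Spark %s"
            % (CONNECTOR_BUILT_FOR + (version,))
        )
```

In a notebook, the version string would typically come from `spark.version`, for example `require_compatible(spark.version)` before calling `df.write`.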
Error logs on Databricks Runtime 8.1 ML:
Py4JJavaError: An error occurred while calling o483.save.
: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.schemaString(Lorg/apache/spark/sql/Dataset;Ljava/lang/String;Lscala/Option;)Ljava/lang/String;
at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.mssqlCreateTable(BulkCopyUtils.scala:506)
at com.microsoft.sqlserver.jdbc.spark.SingleInstanceConnector$.createTable(SingleInstanceConnector.scala:46)
at com.microsoft.sqlserver.jdbc.spark.Connector.write(Connector.scala:73)
at com.microsoft.sqlserver.jdbc.spark.DefaultSource.createRelation(DefaultSource.scala:64)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:73)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:71)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:94)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:196)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:240)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:236)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:192)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:163)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:162)
at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:1079)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:126)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:267)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:217)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:1079)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:468)
at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:438)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:311)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
at py4j.Gateway.invoke(Gateway.java:295)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:251)
at java.lang.Thread.run(Thread.java:748)
[Image: supported Databricks runtime releases and support schedule]
Issue Analytics
- State:
- Created 2 years ago
- Reactions:17
- Comments:9
Top GitHub Comments
@shivsood Thanks a lot for this new version! Has it been published to Maven already?
Yes please! As time goes on, this compatibility becomes more important.