Reading an Excel file in Azure Databricks
I tried to use spark-excel in Azure Databricks, but I am running into an error. I earlier tried the same with a SQL Server Big Data Cluster, but I was unable to.
Current Behavior
I'm getting the error java.lang.NoSuchMethodError: org.apache.commons.io.IOUtils.byteArray(I)[B
I first loaded the library through its Maven coordinates and got the error. I later followed the link and loaded the JAR files instead, and still got the same error, as shown in the screenshot.
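The method signature in the error message uses JVM descriptor notation, which can be hard to read at a glance. The sketch below is a small, hypothetical helper (describe_method is not part of spark-excel or Spark) that decodes such a signature into Java-style syntax. For this issue it shows that the missing method is byteArray taking an int and returning a byte[]. A NoSuchMethodError of this kind generally means the class found on the classpath at runtime is a different version from the one the library was compiled against.

```python
import re

# JVM primitive type codes used in method descriptors.
_PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float", "I": "int",
    "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _parse_type(desc, pos):
    """Parse one type starting at desc[pos]; return (java_name, next_pos)."""
    dims = 0
    while desc[pos] == "[":  # each '[' adds one array dimension
        dims += 1
        pos += 1
    code = desc[pos]
    if code == "L":  # object type: L<internal/class/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:
        name = _PRIMITIVES[code]
        pos += 1
    return name + "[]" * dims, pos


def describe_method(signature):
    """Turn 'pkg.Class.method(I)[B' into 'byte[] pkg.Class.method(int)'."""
    match = re.fullmatch(r"(.+)\((.*)\)(.+)", signature)
    if match is None:
        raise ValueError(f"not a method signature: {signature!r}")
    owner, params, ret = match.groups()
    args = []
    pos = 0
    while pos < len(params):
        name, pos = _parse_type(params, pos)
        args.append(name)
    ret_name, _ = _parse_type(ret, 0)
    return f"{ret_name} {owner}({', '.join(args)})"


print(describe_method("org.apache.commons.io.IOUtils.byteArray(I)[B"))
```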
Steps to Reproduce (for bugs)
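# input_file_name must be imported before it can be used in withColumn
from pyspark.sql.functions import input_file_name

df = spark.read.format("excel") \
    .option("header", True) \
    .option("inferSchema", True) \
    .load("dbfs:/FileStore/tables/users.xls") \
    .withColumn("file_name", input_file_name())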
Your Environment
Azure Databricks
Top GitHub Comments
I had a lot of trouble running spark-excel because of incompatible dependencies. With these library versions I was finally able to write and read Excel files, so I am sharing them here.
I faced the same error with versions 0.16 and 0.16.1 of this library. I then tried an older version (com.crealytics:spark-excel_2.12:0.14.0), and it now works like a charm.
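For anyone who prefers to pin that version through automation instead of the cluster UI, here is a minimal sketch that builds the request body for the Databricks Libraries API install call (POST /api/2.0/libraries/install). It only constructs and prints the payload; it does not send anything. The helper name and the cluster ID placeholder are assumptions for illustration. The _2.12 suffix in the coordinate is the Scala binary version and should match the cluster's Scala version.

```python
import json


def maven_install_payload(cluster_id, coordinates):
    """Build the JSON body for installing one Maven library on a cluster."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"maven": {"coordinates": coordinates}}],
    }


# "<your-cluster-id>" is a placeholder; substitute the real cluster ID.
payload = maven_install_payload(
    "<your-cluster-id>",
    "com.crealytics:spark-excel_2.12:0.14.0",
)
print(json.dumps(payload, indent=2))
```

After installing a different version, the cluster usually needs a restart so that the previously attached library is no longer on the classpath.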