[BUG] partitionBy not working as expected
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
I am trying to write an Excel file:

```java
processedData.repartition(1).write()
    .partitionBy(getPartitions())
    .format("excel")
    .mode(SaveMode.Append)
    .option("dataAddress", sheetName)
    .option("useHeader", "true")
    .save(getSinkPath() + "/report.xlsx");
```
I expected it to create a subdirectory for each partition value and write a file into each, the same as the csv format does. Instead, all rows are written to the same file.
Expected Behavior
It should behave the same as the csv format: one subdirectory per partition value.
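For comparison, this is the kind of layout a `partitionBy` write produces with the csv format (the partition column name `year` and file names here are hypothetical, for illustration only):

```
report/
├── year=2021/
│   └── part-00000-...-c000.csv
└── year=2022/
    └── part-00000-...-c000.csv
```

The expectation is that the excel format would produce the same `column=value` directory structure, with one `.xlsx` file per partition directory.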
Steps To Reproduce
```java
processedData.repartition(1).write()
    .partitionBy(getPartitions())
    .format("excel")
    .mode(SaveMode.Append)
    .option("dataAddress", sheetName)
    .option("useHeader", "true")
    .save(getSinkPath() + "/report.xlsx");
```
Environment
- Spark version: 2.4.8
- Spark-Excel version: 2.4.8_0.17.1
- OS: macOS
- Apache POI: 5.2.2
- poi-ooxml: 5.2.2
- Cluster environment: Google Cloud
- Java: 8
Anything else?
No response
Issue Analytics
- State:
- Created a year ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
Please check these potential duplicates:
Thanks @christianknoepfle - comes from https://github.com/crealytics/spark-excel#excel-api-based-on-datasourcev2
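Based on the maintainer's pointer to the DataSourceV2-based API, the write would look roughly like the sketch below. This is a sketch only, not a confirmed fix: the partition column name is hypothetical, and it assumes the V2 `excel` source (per the linked README section), under which the save path is a directory rather than a single `.xlsx` file:

```java
// Sketch assuming spark-excel's DataSourceV2 "excel" source, which the
// maintainer's README link describes. With a V2 source, partitionBy()
// creates one subdirectory per partition value, as with csv.
processedData.repartition(1).write()
    .partitionBy("year")              // hypothetical partition column
    .format("excel")
    .mode(SaveMode.Append)
    .option("dataAddress", sheetName)
    .option("header", "true")         // option name under V2 (assumption)
    .save(getSinkPath() + "/report"); // a directory, not a single .xlsx path
```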
I’m going to close this because it works as expected. Feel free to reopen if there is anything else to discuss.