[Question]: Read multiple parquet files at once from Azure Data lake Gen 2

  • How do I read all Parquet files in a folder into a DataFrame?
  • How do I read/write data from Azure Data Lake Gen2?

In PySpark, you would do it this way:

    df = spark.read.parquet("abfss://sharedFolder@abc.dfs.core.windows.net/shared/compact.parquet/")
    display(df)

How do we do the same in a .NET for Apache Spark job that runs in Azure Databricks?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
anand035 commented, Oct 31, 2019

I tried several things, but this one worked:

    SparkSession spark = SparkSession
        .Builder()
        .AppName(".NET Spark SQL basic example")
        .Config("spark.some.config.option", "some-value")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.secret", "abcd")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.endpoint", @"xyz")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.id", "abcd")
        .Config("spark.hadoop.fs.azure.account.auth.type", "OAuth")
        .Config("spark.hadoop.fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
        .GetOrCreate();
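The configuration above only sets up authentication; it does not show the read itself. A minimal sketch of the remaining step, assuming the Microsoft.Spark NuGet package and a session built with the same OAuth `.Config(...)` calls (omitted here for brevity), might look like the following. The input path is the one from the question; the output path and the app name are hypothetical and only there for illustration.

```csharp
using Microsoft.Spark.Sql;

class Program
{
    static void Main(string[] args)
    {
        // Assumption: add the OAuth .Config(...) calls from the comment above
        // to this builder; they are left out here to keep the sketch short.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("read-parquet-folder") // hypothetical app name
            .GetOrCreate();

        // Folder path from the question. Pointing the reader at the folder
        // loads every Parquet part file inside it into one DataFrame.
        string inputPath =
            "abfss://sharedFolder@abc.dfs.core.windows.net/shared/compact.parquet/";
        DataFrame df = spark.Read().Parquet(inputPath);

        // Print the first rows (the .NET counterpart of display(df) in a notebook).
        df.Show();

        // Hypothetical output location, shown only to illustrate writing back.
        string outputPath =
            "abfss://sharedFolder@abc.dfs.core.windows.net/shared/output.parquet/";
        df.Write().Mode("overwrite").Parquet(outputPath);
    }
}
```

`Parquet` on the reader also accepts several paths (`spark.Read().Parquet(pathA, pathB)`), which may help when the files live in more than one folder.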

0 reactions
rapoth commented, Nov 1, 2019

Thanks for the update!


