
[Question]: Read multiple parquet files at once from Azure Data lake Gen 2


How do I read all Parquet files in a folder into a DataFrame?
How do I read and write data from Azure Data Lake Gen2?

In PySpark, you would do it this way:

    df = spark.read.parquet("abfss://sharedFolder@abc.dfs.core.windows.net/shared/compact.parquet/")
    display(df)

How do we do the same in a .NET for Apache Spark job that runs on Azure Databricks?

Issue Analytics

Created 4 years ago
Comments: 6 (4 by maintainers)

Top GitHub Comments

anand035 commented, Oct 31, 2019

I tried several things, but this one worked:

    SparkSession spark = SparkSession
        .Builder()
        .AppName(".NET Spark SQL basic example")
        .Config("spark.some.config.option", "some-value")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.secret", "abcd")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.endpoint", @"xyz")
        .Config("spark.hadoop.fs.azure.account.oauth2.client.id", "abcd")
        .Config("spark.hadoop.fs.azure.account.auth.type", "OAuth")
        .Config("spark.hadoop.fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
        .GetOrCreate();
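The comment above covers only the session configuration. As a rough sketch of the remaining step (not something posted in the thread), reading the folder from the question might look like the following. It assumes the Microsoft.Spark package is referenced and that the OAuth settings shown above are supplied, either through the same `.Config(...)` calls or through the cluster's Spark configuration. The input path is the one from the PySpark example; the output path is a hypothetical placeholder.

```csharp
using Microsoft.Spark.Sql;

class Program
{
    static void Main(string[] args)
    {
        // Assumption: the ADLS Gen2 OAuth settings from the comment above are
        // already in effect (added here via .Config(...) or set on the cluster).
        SparkSession spark = SparkSession
            .Builder()
            .AppName("Read parquet from ADLS Gen2")
            .GetOrCreate();

        // Path copied from the PySpark example in the question. Pointing Spark
        // at a folder reads every Parquet file inside it into one DataFrame.
        string inputPath =
            "abfss://sharedFolder@abc.dfs.core.windows.net/shared/compact.parquet/";
        DataFrame df = spark.Read().Parquet(inputPath);

        // Show() prints the first rows to the driver output; it stands in for
        // the notebook-only display(df) used in the PySpark version.
        df.Show();

        // Hypothetical output location, for illustration only. With the default
        // save mode the write fails if the target already exists.
        string outputPath =
            "abfss://sharedFolder@abc.dfs.core.windows.net/shared/output.parquet/";
        df.Write().Parquet(outputPath);
    }
}
```

`Parquet` on the reader also accepts several paths at once, so separate folders can be passed as additional arguments if the files do not share a parent directory.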

rapoth commented, Nov 1, 2019

Thanks for the update!