ADLS Gen2 FileSystemClient.get_paths() returns only 5000 paths (1 page in PageIterator)

  • Package Name: azure.storage.filedatalake
  • Package Version: 12.2.2
  • Operating System: Azure Databricks, Ubuntu 16.04.6LTS
  • Python Version: 3.7.3

Describe the bug
I obtain a FileSystemClient for an ADLS Gen2 container that holds more than 5000 files and folders, then call get_paths() to retrieve every path in it. The returned iterator yields only 5000 PathProperties items, or only 1 page when I use the by_page() method.

To Reproduce
Steps to reproduce the behavior:

  1. Connect to the ADLS Gen2 storage account. In my case I used DataLakeServiceClient with the storage account key as the credential.
  2. Get the FileSystemClient of a container that contains >5000 files and folders using the get_file_system_client() method.
  3. Use the get_paths() method to get the PathProperties iterator.
  4. Check the number of retrieved paths after converting the iterator to a list.
  5. Optional: Check the number of pages of retrieved paths using the by_page() method.
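
The steps above can be sketched roughly as follows. This is a minimal reproduction sketch, not the reporter's actual script: the helper names `connect` and `count_paths_and_pages`, the account URL pattern, and all argument values are assumptions made for illustration. Only `DataLakeServiceClient`, `get_file_system_client()`, `get_paths()`, and `by_page()` come from the issue itself.

```python
from typing import Any, Tuple


def connect(account_name: str, account_key: str, container: str) -> Any:
    """Steps 1-2: build a FileSystemClient for one container (hypothetical helper)."""
    # Imported lazily so the counting helper below can be used without the SDK installed.
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        # Assumed URL pattern for the account's DFS endpoint.
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=account_key,
    )
    return service.get_file_system_client(container)


def count_paths_and_pages(file_system_client: Any) -> Tuple[int, int]:
    """Steps 3-5: count paths by plain iteration and pages via by_page()."""
    # Step 3-4: iterate the paged result and count every PathProperties item.
    total_paths = sum(1 for _ in file_system_client.get_paths())
    # Step 5: request a fresh listing and count its pages instead of its items.
    total_pages = sum(1 for _ in file_system_client.get_paths().by_page())
    return total_paths, total_pages
```

Calling `count_paths_and_pages(connect(name, key, container))` against a container with more than 5000 paths would, per this report, return exactly 5000 paths and 1 page on version 12.2.2.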

Expected behavior
I expect to get an iterator of PathProperties with the correct number of items for the container (>5000). Optional: I expect to get a page iterator with the correct number of pages (>1) when the container holds >5000 paths.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

1 reaction
tasherif-msft commented, Feb 9, 2021

Hi @siarblack, datalake 12.2.3 has been released!

1 reaction
tasherif-msft commented, Feb 9, 2021

Hi @siarblack, the fix got merged and we will be doing a patch release for this very soon. I will keep you updated.
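
Going by the two comments above, the fix appears to have shipped in 12.2.3. A small sketch for checking whether an environment is at or past that release is shown below. The helper names are hypothetical, the `FIXED_IN` value is taken from the comments rather than from a changelog, and the distribution name `azure-storage-file-datalake` is assumed to be the one installed via pip.

```python
import re
from typing import Tuple

# Assumption: 12.2.3 is the first release containing the fix, per the comments.
FIXED_IN = (12, 2, 3)


def parse_version(version: str) -> Tuple[int, ...]:
    """Return up to three leading numeric components, ignoring pre-release suffixes."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_pagination_fix(version: str) -> bool:
    """True when the given version is at or after the assumed fix release."""
    return parse_version(version) >= FIXED_IN


def installed_version() -> str:
    """Look up the installed SDK version (requires Python 3.8+ and the package)."""
    from importlib.metadata import version

    return version("azure-storage-file-datalake")
```

For example, `has_pagination_fix(installed_version())` can guard code that relies on `get_paths()` returning more than one page.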

Top Results From Across the Web

azure.storage.filedatalake.FileSystemClient class
Returns all user-defined metadata and system properties for the specified file system. The data returned does not include the file system's list of...

Microsoft Azure Data Lake Storage (Tech preview) operation
Create one or more new filesystems in a given ADLS Gen2 storage account. ... Restriction: The List Path operation can fetch up to...

How do I retrieve all directory paths from Azure Data Lake ...
get_paths method retrieves paths to both files and directories. Is there an efficient workaround to retrieve or filter only directory paths?