Wildcard retrieval of files
Original GitHub issue: Issue #322 · aws/aws-sdk-pandas
I am attempting to access data in our S3 data lake. Production systems write the data I want alongside other files that I don't want, so a simple prefix is not enough to select the data I need.
Dask allows this sort of wildcarding, for example:
import dask.dataframe as dd
dd.read_parquet(f's3://{bucket}/prod-system-*/*/parquet/*.parquet')
Using awswrangler for the above task isn’t viable. I know that boto3 doesn’t allow for wildcard filtering, but surely it must be doable if dask is able to implement that functionality?
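One way to get this behaviour without extra dependencies is to list keys under the literal part of the pattern and filter them client-side. The sketch below is only an illustration of that idea, not how awswrangler or Dask implement it. The helper names (`literal_prefix`, `glob_to_regex`, `list_matching_keys`) are hypothetical, and it assumes `*` and `?` should not cross `/` boundaries, as in shell globbing.

```python
import re


def literal_prefix(pattern):
    """Return the part of the pattern before the first wildcard character."""
    match = re.search(r"[*?]", pattern)
    return pattern if match is None else pattern[:match.start()]


def glob_to_regex(pattern):
    """Translate a simple glob into a regex; '*' and '?' never match '/'."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")


def list_matching_keys(s3_client, bucket, pattern):
    """List keys under the literal prefix, then keep those matching the glob.

    s3_client is expected to behave like boto3.client("s3").
    """
    regex = glob_to_regex(pattern)
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    # S3 can only filter by prefix server-side, so the wildcard part
    # of the pattern is applied here, after listing.
    for page in paginator.paginate(Bucket=bucket, Prefix=literal_prefix(pattern)):
        for obj in page.get("Contents", []):
            if regex.match(obj["Key"]):
                keys.append(obj["Key"])
    return keys
```

With a real client this would be called as `list_matching_keys(boto3.client("s3"), bucket, "prod-system-*/*/parquet/*.parquet")`. Every key under `prod-system-` is still listed, so the approach can be slow when the literal prefix is short and the bucket is large.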
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:15 (12 by maintainers)
@igorborgest I'm happy to take it on! I'd love any guidance you can give on how to do this, though. I'm assuming we want to avoid additional imports/dependencies.
Sorry @igorborgest. I am unassigning myself from this because my team gave me a heavy task today, so I won't be able to contribute for the next 3-4 weeks. Please re-assign. I will pick up a new task once I am free. I apologize again for the inconvenience caused.