Lightweight python commit reader
We are developing a simple client library in Python to identify which files to read within hoodie datasets. We have two main use cases at this point:
- Determine the latest version for every dataset file, in order to read a complete dataset as it exists at a given timestamp.
- Determine the latest version for only files that have changed since a given timestamp, in order to read changes incrementally.
At the moment, this library only reads completed commits, since it’s for a project that consumes data downstream. Another important distinction is that it does not read the files (i.e. generate a DataFrame), but instead produces a list of files for the client to use and (presumably) read in their own way. We are open to further development and suggestions.
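To make the two use cases concrete, here is a minimal sketch of the selection logic. It is not the library's actual code. It assumes a simplified, hypothetical file naming scheme of `<file_id>_<commit_time>.parquet`, assumes commit times are fixed-width timestamp strings that sort lexicographically, and takes the set of completed commits as an input rather than reading it from the dataset. The names `parse_data_file` and `latest_files` are illustrative only.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def parse_data_file(name: str) -> Tuple[str, str]:
    """Split a hypothetical '<file_id>_<commit_time>.parquet' name."""
    stem = name.rsplit(".", 1)[0]
    file_id, commit_time = stem.rsplit("_", 1)
    return file_id, commit_time


def latest_files(
    names: Iterable[str],
    completed_commits: Set[str],
    as_of: Optional[str] = None,
    since: Optional[str] = None,
) -> List[str]:
    """Return the latest version of each file, as a sorted list of names.

    Only versions written by completed commits are considered.
    as_of: ignore versions written after this commit time (snapshot read).
    since: keep only files whose latest version is newer than this
           commit time (incremental read).
    """
    latest: Dict[str, Tuple[str, str]] = {}
    for name in names:
        file_id, commit_time = parse_data_file(name)
        if commit_time not in completed_commits:
            continue  # skip in-flight or failed commits
        if as_of is not None and commit_time > as_of:
            continue  # version did not exist yet at the requested time
        # Assumption: fixed-width timestamps, so string comparison is enough.
        if file_id not in latest or commit_time > latest[file_id][0]:
            latest[file_id] = (commit_time, name)
    return sorted(
        name
        for commit_time, name in latest.values()
        if since is None or commit_time > since
    )


names = [
    "a_20170101.parquet",
    "a_20170103.parquet",
    "b_20170102.parquet",
    "b_20170104.parquet",  # written by a commit that has not completed
]
completed = {"20170101", "20170102", "20170103"}

snapshot = latest_files(names, completed, as_of="20170102")
changed = latest_files(names, completed, since="20170102")
```

The function returns file names only and never opens the files, which mirrors the point above that reading is left to the client.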
Is this something you would be interested in including in the project? And if so, do you have any requirements or suggestions before we make a PR?
Issue Analytics
- State:
- Created 7 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
Closing since we are not pursuing this anymore.
Did you solve this, @DSouzaM?