Lightweight python commit reader

Original GitHub issue: apache/hudi #107

We are developing a simple client library in Python to identify which files to read within hoodie datasets. We have two main use cases at this point:

  1. Determine the latest version for every dataset file, in order to read a complete dataset as it exists at a given timestamp.
  2. Determine the latest version for only files that have changed since a given timestamp, in order to read changes incrementally.

At the moment, this library only reads completed commits, since it's for a project that consumes data downstream. Another important distinction is that it does not read the files itself (i.e., it does not generate a DataFrame); instead, it produces a list of files for the client to use and (presumably) read in their own way. We are open to further development and suggestions.
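
To make the two use cases concrete, here is a minimal sketch of the selection logic only. It is not the library's actual code and does not parse real hoodie metadata: the `FileVersion` record, the function names, the file paths, and the idea that commit timestamps compare correctly as strings are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class FileVersion(NamedTuple):
    """Hypothetical record describing one written version of a dataset file."""
    file_id: str      # logical identifier shared by all versions of a file
    commit_time: str  # commit timestamp; assumed to sort lexicographically
    path: str         # where this version lives


def _latest_completed(versions: Iterable[FileVersion],
                      completed_commits: Set[str],
                      as_of: str) -> Dict[str, FileVersion]:
    """Newest version per file, ignoring incomplete commits and commits after as_of."""
    latest: Dict[str, FileVersion] = {}
    for v in versions:
        if v.commit_time not in completed_commits or v.commit_time > as_of:
            continue
        current = latest.get(v.file_id)
        if current is None or v.commit_time > current.commit_time:
            latest[v.file_id] = v
    return latest


def latest_versions(versions: Iterable[FileVersion],
                    completed_commits: Set[str],
                    as_of: str) -> List[str]:
    """Use case 1: paths making up the complete dataset as it existed at as_of."""
    latest = _latest_completed(versions, completed_commits, as_of)
    return sorted(v.path for v in latest.values())


def changed_since(versions: Iterable[FileVersion],
                  completed_commits: Set[str],
                  since: str) -> List[str]:
    """Use case 2: latest paths of only those files written after since."""
    # "~" sorts after every digit, so it acts as "no upper bound" here.
    latest = _latest_completed(versions, completed_commits, as_of="~")
    return sorted(v.path for v in latest.values() if v.commit_time > since)


# Made-up sample data: commit 20180104 never completed, so it is ignored.
versions = [
    FileVersion("a", "20180101", "a_20180101.parquet"),
    FileVersion("a", "20180103", "a_20180103.parquet"),
    FileVersion("b", "20180102", "b_20180102.parquet"),
    FileVersion("b", "20180104", "b_20180104.parquet"),
]
completed = {"20180101", "20180102", "20180103"}

snapshot = latest_versions(versions, completed, as_of="20180102")
incremental = changed_since(versions, completed, since="20180102")
```

Both functions return plain lists of paths rather than loaded data, matching the intent that the client reads the files in its own way.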

Is this something you would be interested in including in the project? And if so, do you have any requirements or suggestions before we make a PR?

cc @zqureshi @dterror

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
vinothchandar commented, Oct 23, 2018

Closing since we are not pursuing this anymore

0 reactions
tooptoop4 commented, Oct 19, 2022

did u solve @DSouzaM ?


