source: file: Compression
DFFML is hoping to participate in Google Summer of Code (GSoC) under the Python Software Foundation umbrella. You can read all about what this means at http://python-gsoc.org/. This issue, and any others tagged gsoc and project, are not generally available bugs, but are related to project ideas for GSoC.
Project Idea: File Source Compression
Project description:
DFFML’s initial release includes a FileSource which saves and loads data from files using the load_fd and dump_fd methods.
JSON Example
For the open method of FileSource: allow reading and writing the following file formats transparently (that is, without subclasses having to do anything) for any source which is a subclass of FileSource.
- gzip (by @yashlamba)
- bz2
- lzma
- zip
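One way the transparent handling above could look is a single helper that picks an opener from the filename suffix. This is only a sketch using the standard library: the name open_transparent, the OPENERS table, the suffix choices (.xz for lzma), and the convention of storing one member per zip archive named after the archive stem are all assumptions for illustration, not part of DFFML.

```python
import bz2
import gzip
import io
import lzma
import zipfile
from contextlib import contextmanager
from pathlib import Path

# Hypothetical suffix-to-opener table; each opener is a stdlib function
# that accepts the same (path, mode, encoding=...) arguments as open().
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


@contextmanager
def open_transparent(filename, mode="r"):
    """Yield a text file object, (de)compressing based on the suffix."""
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")
    path = Path(filename)
    suffix = path.suffix.lower()
    if suffix == ".zip":
        # Assumption: one member per archive. When writing, the member is
        # named after the archive stem (data.json.zip -> data.json); when
        # reading, the first member in the archive is used.
        with zipfile.ZipFile(path, mode, compression=zipfile.ZIP_DEFLATED) as archive:
            member = path.stem if mode == "w" else archive.namelist()[0]
            with archive.open(member, mode) as raw:
                with io.TextIOWrapper(raw, encoding="utf-8") as fd:
                    yield fd
    else:
        # Unknown suffixes fall back to the builtin open().
        opener = OPENERS.get(suffix, open)
        with opener(path, mode + "t", encoding="utf-8") as fd:
            yield fd
```

Because the helper always yields a text file object, code written in the style of load_fd and dump_fd would not need to know which format is on disk.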
Skills: Python, git
Difficulty level: Easy
Related Readings/Links:
See https://docs.python.org/3/library/archiving.html for documentation
Potential mentors: @pdxjohnny
Getting Started: Figure out how to do one of the file types, probably gzip (as that is probably as simple as using https://docs.python.org/3/library/gzip.html#gzip.GzipFile if the filename ends in .gz), then move on to the rest. For now just make modifications directly to the FileSource class. We may have you split out the logic later, but don’t worry about another class for now.
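A minimal sketch of that first step, assuming a simplified stand-in class rather than DFFML’s real FileSource: SketchFileSource, _open_fd, load, and dump are hypothetical names, and the JSON bodies of load_fd and dump_fd are only there so the example runs. It uses gzip.open, which wraps gzip.GzipFile in a text layer.

```python
import gzip
import json


class SketchFileSource:
    """Hypothetical stand-in for FileSource; not DFFML's actual class."""

    def __init__(self, filename):
        self.filename = filename
        self.records = {}

    def _open_fd(self, mode):
        # Only names ending in .gz go through gzip; gzip.open in text mode
        # returns a TextIOWrapper around a gzip.GzipFile.
        if self.filename.endswith(".gz"):
            return gzip.open(self.filename, mode + "t", encoding="utf-8")
        return open(self.filename, mode, encoding="utf-8")

    def load(self):
        with self._open_fd("r") as fd:
            self.load_fd(fd)

    def dump(self):
        with self._open_fd("w") as fd:
            self.dump_fd(fd)

    # In the design described above these two methods belong to subclasses;
    # JSON is used here purely for illustration.
    def load_fd(self, fd):
        self.records = json.load(fd)

    def dump_fd(self, fd):
        json.dump(self.records, fd)
```

The point of the design is that load_fd and dump_fd stay identical for compressed and uncompressed files; only the code that opens the file looks at the filename.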
What we want to see in your application: Describe how you intend to solve the problem, and give us some “stretch goals”, such as implementing a remote file source which reads from URLs. Don’t forget to include some time for building appropriate tests.
Issue Analytics
- Created 5 years ago
- Comments:23 (23 by maintainers)
Sweet! Just ping me if there’s anywhere you need clarification.
On Wed, Mar 27, 2019 at 11:03:08AM -0700, Yash Lamba wrote:
Hi Yash! Sorry, I am still working on a reply to your email. I think this is pretty much done. I don’t think tar support is needed right now. If you want to document what’s been implemented in relation to this, that would be awesome. Thank you!