Allow saving to compressed files
Is there any way to use gzip (or bzip2 or xz) compression when saving a pandas DataFrame to a feather file? Something like feather.write_dataframe(df, path, compression='gzip'), the way you can do with DataFrame.to_csv(...)?
I work a lot with files stored on an NFS network drive, and feather.write_dataframe stresses our network and creates files which are much larger than the equivalent gzip-compressed CSVs. I tried running gzip -9 on a feather file and got ~10x reduction in size.
I know compression has been mentioned already (#24, #76), but those seem to deal with changing the feather format to support block compression. I just want to gzip the entire file at the same time as I am saving it (and not afterwards using gzip).
Thank you!
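The workaround asked about above, gzipping the whole file as it is saved rather than in a separate step, can be sketched with the standard library alone. This is only an illustration, not part of the feather API: save_gzipped and load_gzipped are hypothetical helper names, and the commented pyarrow usage assumes that pyarrow.feather.write_feather and read_feather accept file-like objects.

```python
import gzip
import io
from typing import Any, BinaryIO, Callable


def save_gzipped(write_fn: Callable[[BinaryIO], Any], path: str,
                 compresslevel: int = 9) -> None:
    """Serialize into memory via write_fn, then gzip the bytes to path."""
    buf = io.BytesIO()
    write_fn(buf)  # the writer sees an ordinary seekable binary buffer
    with gzip.open(path, "wb", compresslevel=compresslevel) as f:
        f.write(buf.getvalue())


def load_gzipped(read_fn: Callable[[BinaryIO], Any], path: str) -> Any:
    """Decompress path into memory and hand a seekable buffer to read_fn."""
    with gzip.open(path, "rb") as f:
        buf = io.BytesIO(f.read())
    return read_fn(buf)


# Hypothetical usage (assumes pyarrow is installed and that its feather
# functions accept file-like objects):
#   from pyarrow import feather
#   save_gzipped(lambda buf: feather.write_feather(df, buf), "df.feather.gz")
#   df2 = load_gzipped(feather.read_feather, "df.feather.gz")
```

Buffering in memory keeps the writer independent of whether the gzip stream is seekable, at the cost of holding the uncompressed file in RAM. A file compressed this way also cannot be memory-mapped when it is read back.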
Issue Analytics
- State:
- Created 7 years ago
- Reactions:1
- Comments:14 (9 by maintainers)
Top GitHub Comments
Hi @fbrundu, we've set up the compression toolchain within Apache Arrow, and I'm standing by to add compression to the Feather format as soon as we are able to ship R bindings for the Arrow C++ libraries (and port the data.frame conversion code from this repo over there as part of the R-Arrow interop layer).
This is done in Feather V2 (lz4 and zstd compression available), coming in Arrow 0.17.0.