Trouble saving CSR matrix to S3 using scipy.sparse.save_npz
I'm running into an issue when trying to save a CSR matrix to AWS S3 using the test code below:
import numpy as np
import pandas as pd
from scipy import sparse
import s3fs
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
csr = sparse.csr_matrix(df.values)
s3 = s3fs.S3FileSystem(anon=False)
s3_path = "<an_aws_s3_path>"
f = s3.open(s3_path, 'wb')
sparse.save_npz(f, csr)
If this is not currently supported, could you suggest a good way to achieve my goal?
Thanks.
Error trace:
...
File "/Users/yangzhou/code/sml/core/util/csr_matrix_wrapper.py", line 37, in save
sparse.save_npz(f, self.csr)
File "/usr/local/lib/python3.7/site-packages/scipy/sparse/_matrix_io.py", line 78, in save_npz
np.savez_compressed(file, **arrays_dict)
File "/usr/local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 667, in savez_compressed
_savez(file, args, kwds, True)
File "/usr/local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 695, in _savez
zipf = zipfile_factory(file, mode="w", compression=compression)
File "/usr/local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
return zipfile.ZipFile(file, *args, **kwargs)
File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1214, in __init__
self.fp.seek(self.start_dir)
File "/usr/local/lib/python3.7/site-packages/s3fs/core.py", line 1278, in seek
raise ValueError('Seek only available in read mode')
ValueError: Seek only available in read mode
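The traceback ends with `zipfile.ZipFile.__init__` calling `seek()` on the handle returned by `s3.open(..., 'wb')`, which s3fs rejects for files opened for writing. One possible workaround, sketched here rather than taken from the thread, is to let `save_npz` write into a seekable in-memory buffer and then upload the resulting bytes in a single sequential write. The helper names `csr_to_npz_bytes` and `write_npz` and the `open_for_write` hook are hypothetical, introduced only for illustration.

```python
import io

import numpy as np
from scipy import sparse


def csr_to_npz_bytes(matrix):
    """Serialize a sparse matrix to .npz bytes via a seekable in-memory buffer."""
    buffer = io.BytesIO()
    # BytesIO supports seek() in write mode, so zipfile has what it needs.
    sparse.save_npz(buffer, matrix)
    return buffer.getvalue()


def write_npz(matrix, open_for_write):
    """Write the serialized matrix through a caller-supplied opener.

    `open_for_write` is a hypothetical hook: any zero-argument callable that
    returns a binary, writable file object usable as a context manager.
    With s3fs it could be `lambda: s3.open(s3_path, "wb")` (assumption:
    `s3` and `s3_path` are defined as in the snippet above).
    """
    payload = csr_to_npz_bytes(matrix)
    with open_for_write() as f:
        # A single sequential write; no seek() is required on the target.
        f.write(payload)
    return len(payload)


if __name__ == "__main__":
    csr = sparse.csr_matrix(np.random.randint(0, 100, size=(100, 4)))
    data = csr_to_npz_bytes(csr)
    # Round-trip check without touching S3.
    restored = sparse.load_npz(io.BytesIO(data))
    print(restored.shape, (restored != csr).nnz)
```

The trade-off is that the whole archive is held in memory before upload, which may not suit very large matrices.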
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:11 (5 by maintainers)
Top GitHub Comments
I suppose you’d prefer
and then you don’t have to edit zipfile
There you go, PR incoming: https://github.com/intake/filesystem_spec/pull/238