build_index can't handle empty numpy files
Hello,
I'm currently running a workflow in Argo that generates several embedding files in parallel, based on a database search. If no data is found, the workflow writes an empty NumPy file:
np.save(os.path.join(output, "features", filename), np.empty(0, np.float32))
Sadly, build_index is not capable of handling those files:
Using 4 omp threads (processes), consider increasing --nb_cores if you have more
Launching the whole pipeline 04/08/2022, 09:54:53
Reading total number of vectors and dimension 04/08/2022, 09:54:53
0%| | 0/16 [00:00<?, ?it/s]
19%|█▉ | 3/16 [00:00<00:00, 29.92it/s]
56%|█████▋ | 9/16 [00:00<00:00, 87.73it/s]
>>> Finished "Reading total number of vectors and dimension" in 0.1517 secs
>>> Finished "Launching the whole pipeline" in 0.1517 secs
Traceback (most recent call last):
  File "/usr/local/bin/autofaiss", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/autofaiss/external/quantize.py", line 395, in main
    fire.Fire({"build_index": build_index, "tune_index": tune_index, "score_index": score_index})
  File "/usr/local/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/autofaiss/external/quantize.py", line 143, in build_index
    nb_vectors, vec_dim = read_total_nb_vectors_and_dim(
  File "/usr/local/lib/python3.8/site-packages/autofaiss/readers/embeddings_iterators.py", line 258, in read_total_nb_vectors_and_dim
    for c in p.imap_unordered(file_to_line_count, file_paths):
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.8/site-packages/autofaiss/readers/embeddings_iterators.py", line 252, in file_to_line_count
    return matrix_reader.get_row_count()
  File "/usr/local/lib/python3.8/site-packages/autofaiss/readers/embeddings_iterators.py", line 101, in get_row_count
    return self.get_shape()[0]
It would be great if build_index could handle this case, either by just logging a warning or by offering a flag to allow it.
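Until that is supported, one possible workaround is to screen the files before running build_index. The sketch below is not part of autofaiss: `read_npy_shape` and `usable_embedding_files` are hypothetical helper names, and it assumes each embedding file is expected to hold a 2-D array with one vector per row. It reads only the .npy header, so large files are not loaded into memory, and it logs a warning for every file it skips, including zero-byte files.

```python
import logging
import os

import numpy as np
from numpy.lib import format as npy_format

logger = logging.getLogger(__name__)


def read_npy_shape(path):
    """Return the array shape stored in a .npy header without loading the data."""
    with open(path, "rb") as f:
        major, _minor = npy_format.read_magic(f)
        if major == 1:
            shape, _fortran_order, _dtype = npy_format.read_array_header_1_0(f)
        else:
            shape, _fortran_order, _dtype = npy_format.read_array_header_2_0(f)
    return shape


def usable_embedding_files(folder):
    """Hypothetical helper: list the .npy files in `folder` that hold at least one vector."""
    kept = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".npy"):
            continue
        path = os.path.join(folder, name)
        # A zero-byte file has no header at all, so check the size first.
        if os.path.getsize(path) == 0:
            logger.warning("Skipping zero-byte file: %s", path)
            continue
        shape = read_npy_shape(path)
        # Assumption: a usable file is 2-D (rows x dimension) with at least one row.
        if len(shape) != 2 or shape[0] == 0:
            logger.warning("Skipping file without vectors (shape %s): %s", shape, path)
            continue
        kept.append(path)
    return kept
```

The files that are not returned could then be moved out of the embeddings folder before the index is built. Alternatively, the workflow step could simply not write a file when the database search returns nothing.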
Top GitHub Comments
Given that stack trace, you're using an outdated version of autofaiss. Can you try updating it?
It also fails if the file is just empty.