
About efficiency of the model


Hi,

Thanks for the great repo, I’ve enjoyed exploring it a lot! However, when I ran the code from the “Use BLINK in your codebase” section of the README, the model was quite slow (in fast=False mode). Specifically, when I execute main_dense.run, the first stage proceeds relatively slowly (~2.5 seconds per item), while the later stage (the one printing “Evaluation”) processes ~5 items per second. I also tried adding an index, as below.

config = {
    ...
    "faiss_index": "flat",
    "index_path": models_path + "faiss_flat_index.pkl",
}

However, with the index the first stage became even slower (~20 seconds per item). I’m wondering whether I’ve misconfigured something (especially the faiss index) that is causing the low speed. Are there any fixes or ways to speed this up? Thanks for your help! (I’ll post the performance logs below if needed!)
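
For reference, my understanding is that a flat index like the one requested above can be built directly with faiss, roughly as below (a minimal sketch, not BLINK’s own build script; the encoding filename is the all_entities_large.t7 file from the model download and may differ in your setup):

import faiss
import numpy as np
import torch

# Precomputed entity encodings, shape (num_entities, hidden_dim)
candidate_encoding = torch.load(models_path + "all_entities_large.t7")
xb = np.ascontiguousarray(candidate_encoding.numpy(), dtype=np.float32)

# Exact (brute-force) inner-product search, matching the bi-encoder's dot-product score
index = faiss.IndexFlatIP(xb.shape[1])
index.add(xb)
faiss.write_index(index, models_path + "faiss_flat_index.pkl")

Note that a flat index still scans every entity vector per query, so it cannot reduce the per-query work; only approximate indexes (e.g. IVF/PQ) do that.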

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 21 (1 by maintainers)

Top GitHub Comments

4 reactions
AOZMH commented, May 6, 2021

“Roughly, we can know that T(cross+bi) : T(bi) = 3:1, ignoring the other process”

  • Thanks for the theoretical analysis! That’s a bit different from the case I encountered, in which the bi-encoder phase was quite slow and adding an index turned out to be even slower.
  • Actually, reading through the code, I found that the bi-encoder phase runs entirely on CPU (both the transformer encoding of the sentence and the matrix multiplication that scores every entity), which makes it slow compared with the cross-encoder, which runs entirely on GPU.
  • I manually changed the code to run the bi-encoder phase on GPU, which caused a GPU out-of-memory error due to the large (hidden_dim × num_entities) multiplication. As far as I can tell, it needs at least 24GB of GPU memory in fp32, which exceeds the 16GB of my P100.
  • To work around that, I switched all calculations to fp16, which avoided the OOM but raised a warning that large fp16 matmuls have bugs (see this). Indeed, the fp16 bi-encoder results were completely wrong, e.g. a few incorrect entities received high scores.
  • Finally, I manually pruned the entity set to fit in 16GB at fp32 and everything was fine: the bi-encoder and cross-encoder took ~20ms and ~100ms respectively, which is reasonable for me. (A chunked-scoring sketch that would avoid both the OOM and the pruning follows my summary below.)

To wrap up, I conclude that:

  1. I suspect it was the high memory cost that led the contributors to run the bi-encoder phase on CPU (which uses RAM instead of GPU memory), but that significantly hurts performance; as we all know, transformers on CPU are far slower than on GPU.
  2. To run BLINK smoothly and fully on one GPU, I think we need more than 24GB of GPU memory, which I guess is not scarce at FAIR, but poses a difficulty for students like me 😃
  3. What I still haven’t solved is the slow execution of the faiss index, which was even slower than the pure-CPU bi-encoder. Maybe someone can offer some additional comments?
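
For anyone hitting the same out-of-memory error, here is a minimal sketch (my own workaround idea, not code from the BLINK repo) of chunked GPU scoring that avoids both the OOM and the pruning:

import torch

def score_candidates_chunked(context_emb, candidate_enc, chunk_size=500_000):
    # context_emb:   (batch, hidden_dim) tensor already on the GPU
    # candidate_enc: (num_entities, hidden_dim) fp32 tensor kept in CPU RAM
    scores = []
    for start in range(0, candidate_enc.size(0), chunk_size):
        # Move one slice of the entity table to the GPU at a time
        chunk = candidate_enc[start:start + chunk_size].to(context_emb.device)
        scores.append(context_emb @ chunk.T)   # (batch, chunk_rows)
    return torch.cat(scores, dim=1)            # (batch, num_entities)

At hidden_dim = 1024, each 500k-row fp32 chunk is about 2GB, so the full entity table never has to fit on the card; the cost is the repeated host-to-device copies.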

Thanks for all the help! I’ll be happy to follow any updates.

0 reactions
abhinavkulkarni commented, Oct 27, 2022

Hi,

You may want to make some changes to the codebase to add support for more index types; currently, the BLINK codebase only supports flat indices.

I am currently using a compressed OPQ32_768,IVF4096,PQ32x8 index built on the candidate encodings, and the speed improvement is significant.

For example, this is what my faiss_indexer.py looks like.
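
Since the file itself isn’t reproduced here, a rough equivalent using faiss directly would be the following (a sketch: the input path and the nprobe value are hypothetical, and my actual faiss_indexer.py differs in the details):

import faiss
import numpy as np

xb = np.load("candidate_encodings.npy").astype(np.float32)  # (num_entities, d)

index = faiss.index_factory(xb.shape[1], "OPQ32_768,IVF4096,PQ32x8",
                            faiss.METRIC_INNER_PRODUCT)
index.train(xb)   # IVF/PQ indexes must be trained before vectors are added
index.add(xb)

# Trade recall for speed: how many of the 4096 clusters to scan per query
faiss.extract_index_ivf(index).nprobe = 16

faiss.write_index(index, "index_opq32_768_ivf4096_pq32x8.faiss")

In the factory string, OPQ32_768 rotates and reduces the vectors to 768 dimensions for better quantization, IVF4096 partitions them into 4096 clusters so each query scans only a few, and PQ32x8 compresses every vector to 32 bytes; that is where the speedup over a flat scan comes from.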

This is how I load the models.

config = {
    "interactive": False,
    "fast": False,
    "top_k": 8,
    "biencoder_model": models_path + "biencoder_wiki_large.bin",
    "biencoder_config": models_path + "biencoder_wiki_large.json",
    "crossencoder_model": models_path + "crossencoder_wiki_large.bin",
    "crossencoder_config": models_path + "crossencoder_wiki_large.json",
    "entity_catalogue": models_path + "entities_aliases_with_ids.jsonl",
    "entity_encoding": models_path + "all_entities_aliases.t7",
    "faiss_index": "OPQ32_768,IVF4096,PQ32x8",
    "index_path": models_path + "index_opq32_768_ivf4096_pq32x8.faiss",
    "output_path": "logs/",  # logging directory
}

import argparse
import blink.main_dense as main_dense

self.args = argparse.Namespace(**config)  # main_dense expects an argparse-style namespace

logger.info("Loading BLINK model...")
self.models = main_dense.load_models(self.args, logger=logger)
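
Inference then follows the README’s “Use BLINK in your codebase” pattern (the mention and contexts below are placeholders):

data_to_link = [{
    "id": 0,
    "label": "unknown",
    "label_id": -1,
    "context_left": "".lower(),
    "mention": "Shakespeare".lower(),
    "context_right": "'s account of the Roman general".lower(),
}]

_, _, _, _, _, predictions, scores = main_dense.run(
    self.args, logger, *self.models, test_data=data_to_link
)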