
Is there any way to reduce the GPU memory usage and enhance the inference speed?

See original GitHub issue

The M-LSD pred_lines call takes longer than I expected, running at only about 6 Hz (including other stuff; M-LSD-tiny only seems to reach about 10 Hz).

It also takes about 2 GB of GPU memory.

Is there a way to reduce the GPU memory usage and enhance the inference speed? (including TensorRT, etc.)

Please give me some advice, as I’m not an expert in this.

Thanks!
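
For reference, numbers like the ones quoted above can be measured with a short PyTorch benchmark; the sketch below assumes a PyTorch port of M-LSD, and `model` / `image` are placeholders rather than names from the repo.

```python
import time
import torch


@torch.no_grad()
def benchmark(model: torch.nn.Module, image: torch.Tensor, warmup: int = 10, iters: int = 100):
    """Time GPU inference and report peak memory for one model/input pair."""
    model = model.eval().cuda()
    image = image.cuda()
    torch.cuda.reset_peak_memory_stats()

    for _ in range(warmup):            # warm-up runs keep CUDA init out of the timing
        model(image)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        model(image)
    torch.cuda.synchronize()           # wait for all kernels before stopping the clock
    elapsed = time.perf_counter() - start

    print(f"throughput: {iters / elapsed:.1f} Hz")
    print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```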

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
JinraeKim commented, Sep 15, 2022

@rhysdg Thank you for the detailed explanation! Yeah, I’m looking at deployment with Nvidia Jetson, and on my personal laptops for practice as well.

It gave me a really nice insight! Thank you again!

1 reaction
rhysdg commented, Sep 12, 2022

@JinraeKim @lhwcv Apologies for the late reply, busy times! For sure, the main goal with TensorRT is to reduce latency, and therefore increase inference speed pretty significantly, with minimal reduction in quality at FP16. Given a successful conversion you should also see a significant reduction in memory allocation overhead.
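
As a rough sketch of that conversion path (not the exact setup from this repo): export a PyTorch version of the model to ONNX, then let `trtexec` build an FP16 engine. The input shape, tensor names, and file paths below are illustrative.

```python
import torch


def export_onnx(model: torch.nn.Module, onnx_path: str = "mlsd.onnx"):
    """Export a PyTorch M-LSD-style model to ONNX so TensorRT can build an engine from it."""
    model = model.eval().cuda()
    dummy = torch.randn(1, 3, 512, 512, device="cuda")  # input resolution is illustrative
    torch.onnx.export(
        model,
        dummy,
        onnx_path,
        input_names=["image"],
        output_names=["lines"],
        opset_version=13,
    )


# Then, on the target device (a Jetson, or inside the NGC PyTorch container):
#   trtexec --onnx=mlsd.onnx --fp16 --saveEngine=mlsd_fp16.engine
```

FP16 halves the storage for weights and activations and is accelerated on GPUs with Tensor Cores (including Jetson-class devices), which is where the latency and memory savings come from.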

It’s worth bearing in mind that the setup I have here was developed for Jetson-series devices, although my understanding is that it plays nice with Nvidia’s NGC PyTorch Docker container. I’m hoping to start bringing in a TensorRT Python API/PyCUDA version shortly that should work across a wider range of devices. What were you hoping to deploy with, @JinraeKim?
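
A minimal sketch of such a TensorRT Python API + PyCUDA inference path, using the generic pre-8.5 binding pattern rather than anything from this repo, and assuming the engine's first binding is the preprocessed image input:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)


def load_engine(path: str = "mlsd_fp16.engine") -> trt.ICudaEngine:
    """Deserialize a prebuilt TensorRT engine from disk."""
    with open(path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())


def infer(engine: trt.ICudaEngine, image: np.ndarray):
    """Run one inference and return the host-side output arrays."""
    context = engine.create_execution_context()
    stream = cuda.Stream()
    host_bufs, dev_bufs, bindings, outputs = [], [], [], []

    for name in engine:  # iterating an engine yields its binding names
        shape = engine.get_binding_shape(name)
        dtype = trt.nptype(engine.get_binding_dtype(name))
        host = cuda.pagelocked_empty(trt.volume(shape), dtype)
        dev = cuda.mem_alloc(host.nbytes)
        host_bufs.append(host)
        dev_bufs.append(dev)
        bindings.append(int(dev))
        if not engine.binding_is_input(name):
            outputs.append((host, dev, tuple(shape)))

    # Assumes binding 0 is the image input, already resized/normalized to the engine's shape.
    np.copyto(host_bufs[0], image.ravel())
    cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    for host, dev, _ in outputs:
        cuda.memcpy_dtoh_async(host, dev, stream)
    stream.synchronize()

    return [host.reshape(shape) for host, _, shape in outputs]
```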

Read more comments on GitHub >

Top Results From Across the Web

Memory and speed - Hugging Face
We present some techniques and ideas to optimize Diffusers inference for memory or speed. As a general rule, we recommend the use of...
Read more >
Monitor and Improve GPU Usage for Training Deep Learning ...
It doesn't work in every case, but one simple way to possibly increase GPU utilization is to increase batch size. Gradients for a...
Read more >
Maximizing Deep Learning Inference Performance with ...
Optimized hardware usage—Examine GPU memory requirements to run more models on less hardware. Rather than optimizing for throughput, you can use ...
Read more >
How to decrease GPU Inference time and Increase its ...
Test case 1: Created a tf session using tf.Session(graph=tf. · Test case 2: Then we restricted the GPU usage for each TF session...
Read more >
Improving PyTorch inference performance on GPUs with a few ...
There are a few complementary ways to achieve this in practice: use relatively wide models (where the non-batched dimensions are large), use ...
Read more >
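
The TensorFlow result above describes capping per-session GPU memory; if you are running a TensorFlow build of M-LSD, the TF2 equivalent is roughly the sketch below (TensorFlow otherwise pre-allocates most of the GPU by default).

```python
import tensorflow as tf

# This must run before the first GPU op in the process.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Option 1: allocate GPU memory on demand instead of grabbing it all up front.
    tf.config.experimental.set_memory_growth(gpus[0], True)

    # Option 2 (alternative): hard-cap this process at ~1 GB on the first GPU.
    # tf.config.set_logical_device_configuration(
    #     gpus[0], [tf.config.LogicalDeviceConfiguration(memory_limit=1024)]
    # )
```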
