Is there any way to reduce the GPU memory usage and enhance the inference speed?
M-LSD's pred_lines is slower than I expected, running at about 6 Hz end to end (including other processing; M-LSD-tiny only seems to reach about 10 Hz).
It also uses about 2 GB of GPU memory.
Is there a way to reduce the GPU memory usage and enhance the inference speed? (including TensorRT, etc.)
Please give me some advice, as I'm not an expert in this.
Thanks!
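(For reference, a minimal sketch of the cheap framework-level savings, assuming the PyTorch port of M-LSD where pred_lines wraps a model forward pass; `model` and the 512x512 input below are placeholders, not values taken from this repo.)

```python
import torch

# Placeholder: `model` stands in for whichever M-LSD variant you load.
model = model.eval().cuda().half()             # FP16 weights roughly halve memory
img = torch.randn(1, 3, 512, 512, device="cuda").half()  # placeholder input shape

torch.backends.cudnn.benchmark = True          # let cuDNN pick faster kernels for a fixed input size

with torch.inference_mode():                   # no autograd bookkeeping -> less memory and overhead
    out = model(img)
```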

@rhysdg Thank you for the detailed explanation! Yeah, I'm looking to deploy with Nvidia Jetson as well, and with my personal laptop for practice.
It gave me really nice insight! Thank you again!
@JinraeKim @lhwcv Apologies for the late reply, busy times! For sure, the main goal with TensorRT is to reduce latency, and therefore increase inference speed pretty significantly, with minimal reduction in quality at FP16. Given a successful conversion you should also see a significant reduction in memory allocation overhead.
It's worth bearing in mind that the setup I have here was developed for Jetson-series devices, although my understanding is that it plays nicely with Nvidia's NGC PyTorch Docker container. I'm hoping to bring in a TensorRT Python API / PyCuda version shortly that should work across a wider range of devices. What were you hoping to deploy with, @JinraeKim?
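(For anyone following the ONNX-to-TensorRT route described above, a rough sketch, not the exact setup from this repo: it assumes a PyTorch M-LSD `model` object and the TensorRT 8.x Python API; the file names, input shape, and opset version are placeholders.)

```python
import torch
import tensorrt as trt

# 1) Export the PyTorch model to ONNX (placeholder input shape and file name).
model = model.eval().cuda()
dummy = torch.randn(1, 3, 512, 512, device="cuda")
torch.onnx.export(model, dummy, "mlsd.onnx", opset_version=11,
                  input_names=["input"], output_names=["output"])

# 2) Build an FP16 TensorRT engine from the ONNX graph.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("mlsd.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)   # FP16 is where most of the speed/memory gain comes from
engine = builder.build_serialized_network(network, config)
with open("mlsd_fp16.engine", "wb") as f:
    f.write(engine)
```

A serialized engine like this can later be deserialized with trt.Runtime (or benchmarked with trtexec) to check the actual latency and memory use on the target device.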