What's kernel launch time?


The performance summary shows that my model spends ~50% of its time in the “kernel launch” step. I find the other items easy to understand, but I have no idea what “kernel launch” is or how I can reduce the time it takes. I do complicated preprocessing on my data using the tf.data.Dataset APIs, but the summary shows that I spend no time on it. Could the preprocessing be the real cause of the high “kernel launch” time?


Thanks!

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 30

Top GitHub Comments

2 reactions
ckluk commented, Jan 16, 2021

Hi Slyne,

Usually, high kernel launch overhead is not caused by a particular op. Instead, the ops executed on the device do too little computation to amortize the cost of launching each kernel. So the suggestions from one of my previous replies may still help: (1) Use the XLA compiler, which can fuse multiple TF ops into a single kernel (see https://www.tensorflow.org/xla). (2) Rewrite your TF code so that each op launched on the GPU does a larger chunk of computation. In particular, have you tried (1)?
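For suggestion (1), here is a minimal sketch of what enabling XLA on a single function might look like. The function `scaled_residual` and the tensor shapes are made up for illustration and are not from the original model; `jit_compile=True` is the `tf.function` argument in recent TF 2.x releases (older releases called it `experimental_compile`). Whether XLA actually fuses these ops into one kernel depends on the device and TF version.

```python
import tensorflow as tf


def scaled_residual(x, w, b):
    # Hypothetical example: several small ops in a row. Without fusion,
    # each of them may be dispatched as a separate GPU kernel.
    y = tf.matmul(x, w) + b
    y = tf.nn.relu(y)
    return y * 0.5 + x


# Ask TensorFlow to compile the whole function with XLA, which gives the
# compiler a chance to fuse the elementwise ops into fewer kernels.
fused_fn = tf.function(scaled_residual, jit_compile=True)

x = tf.random.normal([64, 128])
w = tf.random.normal([128, 128])
b = tf.zeros([128])

out = fused_fn(x, w, b)
print(out.shape)
```

Comparing the profiler's step-time breakdown before and after this change should show whether the kernel launch share drops for your model.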

On Wed, Jan 13, 2021 at 4:35 AM Slyne Deng notifications@github.com wrote:

@ckluk Hi, I also encountered this problem. After checking the uploaded profiles, it’s still difficult for me to find the exact operations that I should change in my code. It seems there are frequent ops between host and device, but I’m not sure if my guess is correct. Could you help me? Uploading 2021_01_13_19_07_07.zip… [image: StepTime] https://user-images.githubusercontent.com/6286804/104453007-be601280-55de-11eb-8498-8a4d4649cd85.JPG


2 reactions
ckluk commented, Apr 21, 2020

Thanks for providing the log. After inspecting it, I believe that the kernel-launch time actually overlaps with the tf.data preprocessing time on the CPU, so the tf.data preprocessing may be interfering with kernel launching. You can’t do much about kernel launching itself, but you can try to reduce the interference from tf.data preprocessing by: (1) using more parallelism in tf.data preprocessing (https://www.tensorflow.org/guide/data_performance), (2) using a more powerful CPU, or (3) offloading some of the data preprocessing to the GPU.
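For suggestion (1) here, a rough sketch of a more parallel tf.data pipeline could look like the following. The `preprocess` function and the `Dataset.range(1000)` source are placeholders standing in for the real preprocessing and data, which the issue does not show; `tf.data.AUTOTUNE` is available from TF 2.4 (earlier versions expose it as `tf.data.experimental.AUTOTUNE`).

```python
import tensorflow as tf


def preprocess(x):
    # Placeholder for the expensive per-example preprocessing.
    x = tf.cast(x, tf.float32)
    return tf.sqrt(x) / 10.0


dataset = (
    tf.data.Dataset.range(1000)
    # Run preprocessing on several CPU threads; AUTOTUNE lets tf.data
    # pick the degree of parallelism at runtime.
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    # Prepare upcoming batches in the background while the current
    # training step runs.
    .prefetch(tf.data.AUTOTUNE)
)

first_batch = next(iter(dataset))
print(first_batch.shape)
```

This only illustrates the API shape. How much it helps depends on how many CPU cores are free, since the comment above suspects the preprocessing threads are competing with the thread that launches kernels.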


