Benchmarking methodology used is not quite correct
Hi @1ytic,
Thank you for this work on optimizing the warp RNN-T operation, which is becoming increasingly useful for many speech recognition acoustic models. We have studied your implementation and have the following observations:
- When we run the benchmark script in your repo, we get the run times below, where "new" refers to this repo and "baseline" is what we currently have in the RNN-T reference model. We used B=32, T=U=200, V=29, which is a typical case in our dataset. From the output of the benchmark script, the new loss function does appear to run faster:
new: 1.76 ms baseline: 6.10 ms
- However, in the benchmark script that the author provided, the run time was measured as:
t = timer()
costs = loss(xs, ys, xn, yn)
elapsed_time += timer() - t
The problem with this way of measuring is that CUDA kernels are launched asynchronously, so the CPU can run ahead of the GPU and stop the timer before the kernel has completed. After adding synchronization as below,
torch.cuda.synchronize()  # sync before starting the timer
t = timer()
costs = loss(xs, ys, xn, yn)
torch.cuda.synchronize()  # sync before stopping the timer
elapsed_time += timer() - t
the run time we get is:
new: 15.82 ms baseline: 6.12 ms
- This is similar to the run time we got from the GPU profiler nsys:
new: 14.38 ms baseline: 4.75 ms
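To illustrate why the unsynchronized numbers come out so low, here is a minimal CPU-only sketch of the same effect. It is an analogy, not the repo's benchmark: `FakeDevice` and `time_call` are hypothetical names introduced for illustration, a background thread stands in for the GPU, and the 20 ms "kernel" duration is made up.

```python
import threading
import time
from timeit import default_timer as timer


class FakeDevice:
    """Hypothetical stand-in for a GPU: launch() returns immediately,
    and the work finishes later on a background thread."""

    def __init__(self):
        self._pending = []

    def launch(self, seconds):
        # Like a CUDA kernel launch: start the work and return at once.
        worker = threading.Thread(target=time.sleep, args=(seconds,))
        worker.start()
        self._pending.append(worker)

    def synchronize(self):
        # Plays the role of torch.cuda.synchronize(): block until all
        # previously launched work has finished.
        for worker in self._pending:
            worker.join()
        self._pending.clear()


def time_call(fn, sync=None, iters=5):
    """Mean wall-clock seconds per call of fn. If sync is given, it is
    called before starting and before stopping the timer."""
    elapsed = 0.0
    for _ in range(iters):
        if sync is not None:
            sync()  # drain earlier work so it is not billed to this call
        t = timer()
        fn()
        if sync is not None:
            sync()  # wait for the asynchronous work to finish
        elapsed += timer() - t
    return elapsed / iters


device = FakeDevice()


def kernel():
    device.launch(0.02)  # pretend the kernel takes 20 ms


unsynced = time_call(kernel)  # measures launch overhead only
device.synchronize()          # clean up work left over from that run
synced = time_call(kernel, sync=device.synchronize)

print(f"without sync: {unsynced * 1e3:.2f} ms")
print(f"with sync:    {synced * 1e3:.2f} ms")
```

Without the sync, the timer only sees how long it takes to hand the work off, which is the same trap as in the benchmark script. On a real GPU the same helper could presumably be called as `time_call(lambda: loss(xs, ys, xn, yn), sync=torch.cuda.synchronize)`.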
In summary, the alternative loss function does not appear to run faster than what we have. The speedup claimed in the repo is likely caused by a flawed benchmarking methodology.
Can you share your thought process on this?
Thanks, Ashish
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
It was tested on a 2080 Ti.
I added one more explanation (2944982) of the performance issue with NVIDIA Profiler.