[ViacomCBS] Samsung fatal stall on preroll to first content period.
Have you read the FAQ and checked for duplicate open issues? Yes completely.
What version of Shaka Player are you using? Tested in Shaka 2.5.12, 2.5.11, 2.5.5 and 2.5.1 - Same results on all
Can you reproduce the issue with our latest release version? Yes
Can you reproduce the issue with the latest code from master? Yes
Are you using the demo app or your own custom app? A raw, simple demo setup. Page URLs are included in the support email, along with the content and LA (license) URLs.
If custom app, can you reproduce the issue using our demo app? yes
What browser and OS are you using? This issue has only been reported and reproduced by our team on SOME Samsung Tizen-based Smart TVs, mainly 2017 and 2018 models. 2019 and 2020 models seem unaffected. Some 2017 and 2018 models also work as expected!
For embedded devices (smart TVs, etc.), what model and firmware version are you using? <MODEL numbers here and if they fail or pass which test. >
- 2017 UN50MU6070 Fails 100%
- 2018 UN55NU710D Fails 100%
- 2018 UN32N5300AF PASSES
- 2018 UN49NU8000 PASSES
What are the manifest and license server URIs? Sent over on shaka-player-issues@google.com
What did you do?
- Play from a 0 start time and let the preroll finish. On the first transition to a content period, the player stalls.
- When a Samsung model year has the issue, it happens 100% of the time.
- Does not happen with the exact same encode of the stream when it is not passed through DAI.
- Does not happen with the exact same encode passed through DAI but with no DRM.
- In the logs, both DRM key systems seem to init fine, even when playback fails. I tested WV and PR here.
- Tests with a non-zero start time are interesting.
- If you skip the pre-roll and start past that second in the stream, all mid-roll DAI period transitions thereafter work as expected.
- Stream is fine on all TVs we tested.
- Dash.js does play the same DAI/DRM stream without issue on all the tested devices.
- Shaka gives a much better overall quality of experience, so it is preferred on Samsung.
I am also including logs in the support email with content. These logs will be from both fail and pass captures in a few different versions of Shaka mentioned above.
What did you expect to happen? The first content period initializes and plays out.
What actually happened? The content period fails to init and play properly.
Issue Analytics
- State:
- Created 3 years ago
- Comments:53 (51 by maintainers)
Top GitHub Comments
The fix is out in the master branch. Please test in your applications!
These fixes can be backported to v3.0.x and potentially to v2.5.x as well.
Notes from the day’s investigations. TL;DR: Good news, everyone! I may have a workaround!
I changed the test as follows:
Playback still stalls out after the seek.
We buffer up to time 44 while playing to time 31. Then we seek to time 60.
Since we didn’t clear the buffer or reset StreamingEngine state, it kept streaming linearly. Effectively, the playhead seeked, but not StreamingEngine. This means streaming and buffering continued without interruption from 0 to 72, while the stall occurred at time 60. It remained stalled out for 20 seconds before the test timed out.
Looking for possible recovery steps that could shed light on the hang, I added one more step to the test. After the stall, I seeked back to time 24, a segment boundary and a time that not only was buffered, but had already been played. It remained stalled out at 24 for 20 seconds before the test timed out.
I noticed that the timing of the “appended media segment” messages in the logs speeds up significantly after the seek. Not sure what that means. Maybe they aren’t being decoded because they are behind the playhead?
Next I tried recovering by calling pause and play, with 1 second delays before each of them. This worked!
Next I tried making the test call pause & play immediately after seeking, rather than waiting for the stall/timeout, still with 1-second delays. This worked, too!
Next I tried calling play only, not pause, 1 second after seeking. This doesn’t work, which makes some kind of sense, given that the state of the video element was `paused: false` at that time.
So the delay seems to be important. It’s not yet clear how large a delay is necessary between pause and play. (It may be a thing that we have to poll state on before calling play.) I also have lots of other hacks in place, which now need to be unwound one by one to determine if any of them are helping, or if pause/play is doing the job all on its own. Finally, if this turns out to be the big winner, it’s not clear when this workaround should be called. (On seek, on stall, just on Tizen, on all TVs, on all platforms, etc.)
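As a rough illustration of the pause/play workaround described above, here is a minimal sketch. The helper name `pauseThenPlay` and its `schedule` parameter are hypothetical and are not part of Shaka Player; the 1-second delays come from the experiment above, and the right value for any given device is still unknown.

```javascript
// Hypothetical helper for illustration; not a Shaka Player API.
// Waits delayMs, calls pause(), waits delayMs again, then calls play().
// `schedule` defaults to setTimeout and can be swapped out in tests.
function pauseThenPlay(video, delayMs, schedule = setTimeout) {
  schedule(() => {
    video.pause();
    schedule(() => {
      const result = video.play();
      // Newer browsers return a Promise from play(); older ones may not.
      // Rejections are ignored here to keep the sketch short.
      if (result && typeof result.catch === 'function') {
        result.catch(() => {});
      }
    }, delayMs);
  }, delayMs);
}
```

In a page, this might be invoked right after a seek as `pauseThenPlay(document.querySelector('video'), 1000)`.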
Calling `video.pause()`, then polling state on `video.paused` to call `video.play()`… sadly doesn’t work. After calling `video.pause()`, `video.paused` is immediately true, which follows from the HTML video spec. But some hidden state deeper within Tizen does not seem to update immediately, so some hard-coded delay will be necessary. More work will be required to determine what that is.
After this, I noticed that one difference between the successful 800ms and the failing 500ms was the timing of appending the segments. At 500ms, the content at the target time was not yet buffered. At 800ms, it was buffered. So the workaround really only seems to help after content is appended to the position of the playhead. Perhaps instead of applying this workaround on seek, it should be applied by the stall detector, which only fires when the playhead is buffered. It could be a more effective stall recovery mechanism than seeking, particularly on TV platforms (which seem to need stall detection more than desktops anyway).
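The idea of applying the workaround only once the playhead sits in buffered content could look roughly like the sketch below. Both function names are hypothetical and are not Shaka Player APIs; `buffered` is assumed to behave like the standard `HTMLMediaElement.buffered` TimeRanges object. The sketch calls pause and play back to back, and whether a delay between them is needed on a given device is left open.

```javascript
// Hypothetical helpers for illustration; not Shaka Player APIs.

// Returns true if `time` falls inside one of the buffered ranges.
// `buffered` is a TimeRanges-like object: length, start(i), end(i).
function isTimeBuffered(buffered, time) {
  for (let i = 0; i < buffered.length; i++) {
    if (time >= buffered.start(i) && time < buffered.end(i)) {
      return true;
    }
  }
  return false;
}

// Applies the pause/play workaround only when the playhead position is
// buffered. Returns true if the workaround was applied, false otherwise.
function recoverFromStall(video) {
  if (!isTimeBuffered(video.buffered, video.currentTime)) {
    return false;  // Content has not reached the playhead yet.
  }
  video.pause();
  const result = video.play();
  // play() may return a Promise; rejections are ignored in this sketch.
  if (result && typeof result.catch === 'function') {
    result.catch(() => {});
  }
  return true;
}
```

With the numbers from the earlier test, a playhead at time 60 with content buffered only up to 44 would be left alone, while the same playhead with content buffered through 72 would get the pause/play treatment.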
I reverted all of my other hacks, and I patched the stall detector so that when `streaming.stallSkip` is configured to `0`, we call pause & play (with no delay) instead of seeking. Then I changed the default config to `0`. The simple, unencrypted DASH content now passes the automated test!
Next steps: