Usage of fibers causes random .NET runtime crashes

Version
All

Description
In its main loop (ScriptMain), SHVDN makes use of fibers and also calls into unmanaged code that is implemented using fibers. This is not supported by .NET and will inevitably lead to random crashes, most notably by making the .NET exception handler in CLRVectoredExceptionHandler assume that there is no stack space left. The Microsoft docs state: “The .NET threading model does not support fibers. You should not call into any unmanaged function that is implemented by using fibers. Such calls may result in a crash of the .NET runtime.” With my limited knowledge of the project, it is unclear to me why a fiber-based approach was chosen here, as it seems like a grave design mistake, but I did not study the entire project.

Crash Details
Since a fiber is not its own thread but behaves like one in certain ways, it has its own stack space. This means that a single thread will have different lower and upper stack bounds depending on whether a fiber is running or not.

Whenever an exception occurs in .NET, the handler in CLRVectoredExceptionHandler gets called. (I am referencing the initial coreclr commit here, as it is closer to what .NET Framework uses than the current .NET Core implementation, which does not use stack probing at all and hence is not affected.) The call to Thread::IsStackSpaceAvailable returns false when on a fiber stack, since its internal call to GetLastNormalStackAddress uses a cached stack limit. Naturally this should be using the fiber’s stack bounds, but due to the caching it does not. While this is definitely not handled ideally by .NET Framework (and is fixed in .NET Core), the issue remains that fibers are not supported.

Unfortunately, this means that any exception, whether it is a C++ exception as mentioned in #936 or a .NET exception, will cause the runtime to panic and call DontCallDirectlyForceStackOverflow, subsequently terminating the process. Please note that this crash does not occur on every machine and occurs seemingly at random, but since I had access to a user machine where it happened on every single exception, it was very easy to debug and pinpoint.

Example stack trace where the offending line is a NullReferenceException in .NET wrapped by try-catch (which is not hit for the reasons outlined above):

0:000> !dumpstack
OS Thread Id: 0x882c (0)
Current frame: clr!DontCallDirectlyForceStackOverflow+0x10
Child-SP         RetAddr          Caller, Callee
00000011620f31f0 00007ffc7bbe9a11 clr!CLRVectoredExceptionHandler+0xa8, calling clr!DontCallDirectlyForceStackOverflow
00000011620f3220 00007ffc7ba2f96e clr!SaveCurrentExceptionInfo+0x72, calling clr!ClrFlsSetValue
00000011620f3250 00007ffc7ba2fdcf clr!CLRVectoredExceptionHandlerShim+0xa3, calling clr!CLRVectoredExceptionHandler
00000011620f3280 00007ffc97e883dc ntdll!RtlpCallVectoredHandlers+0x108, calling ntdll!guard_dispatch_icall_nop
00000011620f3320 00007ffc97e5b406 ntdll!RtlDispatchException+0x66, calling ntdll!RtlpCallVectoredHandlers
00000011620f3350 00007ffc7b8d882a clr!invokeCompileMethod+0x97, calling clr!invokeCompileMethodHelper
00000011620f33c0 00007ffc7b8d875e clr!CallCompileMethodWithSEHWrapper+0xe5
00000011620f33f0 00007ffc97e25d21 ntdll!RtlFreeHeap+0x51, calling ntdll!RtlpFreeHeapInternal
00000011620f3430 00007ffc7b8c5809 clr!EEHeapFreeInProcessHeap+0x45, calling KERNEL32!HeapFreeStub
00000011620f3460 00007ffc7b8d864d clr!UnsafeJitFunction+0x81b, calling clr!_security_check_cookie
00000011620f3530 00007ffc97eafe3e ntdll!KiUserExceptionDispatch+0x2e, calling ntdll!RtlDispatchException
00000011620f4330 00007ffc1ca6dc66 (MethodDesc 00007ffc1c87e010 +0x26 Rage.Attributes.PluginAttribute.get_Name()) ====> Exception Code c0000005 cxr@00000011620f3540 exr@00000011620f3a30

Resolution
I was able to fix this issue by removing the fiber logic from ScriptMain and by no longer relying on SHV’s scriptRegister. Unfortunately, you will have to provide your own script VM tick hook, because SHV itself uses fibers and just removing the fiber logic from SHVDN is not enough. For a simple PoC, I hooked a native and ticked SHVDN from there, and it worked fine: no more fiber-related crashes! According to my own testing, you should still be able to use SHV to receive keyboard callbacks.

If you have any questions or feedback as to why fibers were used (or must be used), please let me know.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 34 (25 by maintainers)

Top GitHub Comments

1 reaction
LMSDev commented, Feb 15, 2023

The reason RPH reimplements the main script loop is to allow execution of certain RPH functionality, such as console commands (which may run natives) even when the game is paused or is not ticking script threads for another reason. At its heart, the tick for plugins is still handled via a normal scrThread in the list and you probably want to do the same.

0 reactions
kagikn commented, Jul 24, 2023

Ah, I just remembered you made #1181. Since that PR didn’t mention that you can close a thread handle (to avoid a thread handle leak, which will result in the thread persisting, IIRC), maybe only a few things need to be changed and we can reuse the majority of it? For the thread handle leak, I have a few cases to show:
https://github.com/Lyall/IshinFix/issues/5
https://github.com/kagikn/ExeIntegrityBypassAgainstRGL/blob/40c18129692a5316ff0634a1ce736092bb765769/ExeIntegrityBypassAgainstRGL/dllmain.cpp#L145

Also, crosire didn’t mention in said PR that scripts used .NET threads as foreground threads, which can prevent the game from exiting (I changed this in da63afd so that scripts run in background threads).
