Sporadic AccessViolationException on iteration GetNext
We are using FASTER 2.0.22 (we cannot update to 2.1.0 right now due to NuGet dependencies).
We have a custom compaction method which scans the log as follows:
fasterKv.Log.FlushAndEvict(true);
var firstDataAddress = default(long?);
var session = GetPooledSession();
try {
    var alive = 0L;
    using (var iter = fasterKv.Log.Scan(fasterKv.Log.BeginAddress, fasterKv.Log.TailAddress)) {
        while (iter.GetNext(out var recordInfo) && !recordInfo.Tombstone) {
            var spanByteAndMemory = iter.GetKey().Deserialize().ToSpanByteAndMemory();
            keyConverter.FromSpanByte(ref spanByteAndMemory, out var key);
            if (!isDeleted(in key)) {
                if (!firstDataAddress.HasValue) {
                    firstDataAddress = fasterKv.Log.TailAddress;
                }
                session.Upsert(ref iter.GetKey(), ref iter.GetValue());
                alive++;
            } else {
                session.Delete(ref iter.GetKey());
            }
        }
    }
    logger.Info("{count} alive cache entries found", alive);
}
finally {
    ReleasePooledSession(session);
}
fasterKv.Log.ShiftBeginAddress(Math.Min(fasterKv.Log.SafeReadOnlyAddress, Math.Max(firstDataAddress.GetValueOrDefault(), fasterKv.Log.SafeReadOnlyAddress - (maxSizeBytes - maxSizeBytes / 5))), true);
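For comparison, here is a hypothetical variant of the same scan loop (not code from the issue, reusing the same session, keyConverter, isDeleted, firstDataAddress and alive as above) in which tombstoned records are skipped and the scan continues, rather than the whole loop ending at the first tombstone as the condition above does:

using (var iter = fasterKv.Log.Scan(fasterKv.Log.BeginAddress, fasterKv.Log.TailAddress)) {
    while (iter.GetNext(out var recordInfo)) {
        if (recordInfo.Tombstone) {
            continue; // record already deleted in the log: skip it but keep scanning
        }
        var spanByteAndMemory = iter.GetKey().Deserialize().ToSpanByteAndMemory();
        keyConverter.FromSpanByte(ref spanByteAndMemory, out var key);
        if (!isDeleted(in key)) {
            if (!firstDataAddress.HasValue) {
                firstDataAddress = fasterKv.Log.TailAddress;
            }
            session.Upsert(ref iter.GetKey(), ref iter.GetValue());
            alive++;
        } else {
            session.Delete(ref iter.GetKey());
        }
    }
}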
The compaction above runs concurrently with other operations on the log in an ASP.NET MVC application. On the server, under load, we get sporadic (but fairly frequent) AccessViolationExceptions in the iterator's GetNext:
Application: w3wp.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.AccessViolationException
at FASTER.core.SpanByteVarLenStruct.GetLength(FASTER.core.SpanByte ByRef)
at FASTER.core.VariableLengthBlittableAllocator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetRecordSize(Int64)
at FASTER.core.VariableLengthBlittableScanIterator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetNext(FASTER.core.RecordInfo ByRef)
at Seram.Web.Cache.FasterCache`2[[Seram.Area.Indicators.Values.ValueCacheKey, Seram.Area.Indicators, Version=2.2790.8356.3393, Culture=neutral, PublicKeyToken=null],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Compact(Seram.Web.Cache.KeyPredicate`1<Seram.Area.Indicators.Values.ValueCacheKey>)
at Seram.Area.Indicators.CacheMaintenance+<CacheCleanupInternal>d__2.MoveNext()
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Seram.Area.Indicators.CacheMaintenance+<CacheCleanupInternal>d__2, Seram.Area.Indicators, Version=2.2790.8356.3393, Culture=neutral, PublicKeyToken=null]](<CacheCleanupInternal>d__2 ByRef)
at Seram.Area.Indicators.CacheMaintenance.CacheCleanupInternal(Sirius.Mvc.IServiceLocator, Boolean, System.Threading.CancellationToken)
at Seram.Area.Indicators.CacheMaintenance.CacheCleanup(Sirius.Mvc.IServiceLocator, System.Threading.CancellationToken)
at Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1+<<PerformMaintenance>b__0>d.MoveNext()
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1+<<PerformMaintenance>b__0>d, Seram.Area.TenantManager, Version=2.2790.8356.3394, Culture=neutral, PublicKeyToken=null]](<<PerformMaintenance>b__0>d ByRef)
at Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1.<PerformMaintenance>b__0()
at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].InnerInvoke()
at System.Threading.Tasks.Task.Execute()
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
at System.Threading.Tasks.Task.ExecuteEntry(Boolean)
at System.Threading.ThreadPoolWorkQueue.Dispatch()
Are we somehow causing this through incorrect memory handling in our code, or is this a problem in FASTER?
Top GitHub Comments
Thank you for the reply. As I noted, the problem occurred only under load, i.e. with a high degree of concurrent reads and writes while the scan was running. We also noticed runaway memory usage (with an OutOfMemoryException after more than 60 GB were used) and really poor performance.
Upon further inspection it seemed that TakeHybridLogCheckpointAsync() was performing poorly when invoked during high load, and that the scan may have been failing while TakeHybridLogCheckpointAsync() was still running in the background. At least the issue seems to be gone now that we only take the checkpoint when SystemState.Phase == Phase.REST, i.e. we avoid checkpointing under load.
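A minimal sketch of that kind of guard, assuming the FasterKV instance exposes its SystemState as referenced above; the method name, CheckpointType.FoldOver and the logging are placeholders rather than code from the issue:

// Sketch (assumption): take the hybrid log checkpoint only while the store is in
// the REST phase, i.e. no other checkpoint/state-machine operation is in flight.
private async Task TryCheckpointAsync() {
    if (fasterKv.SystemState.Phase != Phase.REST) {
        // Store is busy (under load or mid-checkpoint): skip this maintenance cycle.
        logger.Info("Skipping checkpoint, store not at rest (Phase = {phase})", fasterKv.SystemState.Phase);
        return;
    }
    // CheckpointType.FoldOver is an assumption; the issue does not state which type is used.
    var (success, _) = await fasterKv.TakeHybridLogCheckpointAsync(CheckpointType.FoldOver);
    if (!success) {
        logger.Info("Checkpoint could not be started (another checkpoint may be in progress)");
    }
}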
…and we still get AVs: