Sporadic AccessViolationException on iteration GetNext

Using FASTER 2.0.22 (we cannot update to 2.1.0 right now due to NuGet dependencies).

We have a custom compaction method which scans the log as follows:

fasterKv.Log.FlushAndEvict(true);
var firstDataAddress = default(long?);
var session = GetPooledSession();
try {
	var alive = 0L;
	using (var iter = fasterKv.Log.Scan(fasterKv.Log.BeginAddress, fasterKv.Log.TailAddress)) {
		while (iter.GetNext(out var recordInfo) && !recordInfo.Tombstone) {
			var spanByteAndMemory = iter.GetKey().Deserialize().ToSpanByteAndMemory();
			keyConverter.FromSpanByte(ref spanByteAndMemory, out var key);
			if (!isDeleted(in key)) {
				if (!firstDataAddress.HasValue) {
					firstDataAddress = fasterKv.Log.TailAddress;
				}
				session.Upsert(ref iter.GetKey(), ref iter.GetValue());
				alive++;
			} else {
				session.Delete(ref iter.GetKey());
			}
		}
	}
	logger.Info("{count} alive cache entries found", alive);
}
finally {
	ReleasePooledSession(session);
}
fasterKv.Log.ShiftBeginAddress(Math.Min(fasterKv.Log.SafeReadOnlyAddress, Math.Max(firstDataAddress.GetValueOrDefault(), fasterKv.Log.SafeReadOnlyAddress - (maxSizeBytes - maxSizeBytes / 5))), true);

This runs concurrently with other operations on the log in an ASP.NET MVC application.
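
The argument passed to ShiftBeginAddress on the last line of the compaction method is dense, so here is a minimal sketch of the same arithmetic as a standalone function. The class and method names (CompactionMath, NewBeginAddress) are hypothetical and not part of FASTER or of the original code, and the small numbers used below are illustrative only, not real log addresses.

```csharp
using System;

// Hypothetical helper, not part of FASTER: reproduces only the address arithmetic
// from the ShiftBeginAddress call in the compaction method above.
public static class CompactionMath
{
    public static long NewBeginAddress(long safeReadOnlyAddress, long? firstDataAddress, long maxSizeBytes)
    {
        // Integer arithmetic: roughly 80% of maxSizeBytes is retained.
        long retained = maxSizeBytes - maxSizeBytes / 5;

        // Lowest address that still keeps "retained" bytes below SafeReadOnlyAddress.
        long lowerBound = safeReadOnlyAddress - retained;

        // GetValueOrDefault() yields 0 when no live record was seen during the scan.
        long candidate = Math.Max(firstDataAddress.GetValueOrDefault(), lowerBound);

        // Never shift past SafeReadOnlyAddress.
        return Math.Min(safeReadOnlyAddress, candidate);
    }
}
```

With safeReadOnlyAddress = 1000 and maxSizeBytes = 500, the retained size is 400 and the lower bound is 600. A firstDataAddress of 300 therefore yields 600, a value of 900 yields 900, and a value of 1200 is capped at 1000. If no live record was found (null), the result is also 600.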

Under load on the server, we get sporadic (but fairly frequent) AccessViolationExceptions in the iterator’s GetNext:

Application: w3wp.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.AccessViolationException
   at FASTER.core.SpanByteVarLenStruct.GetLength(FASTER.core.SpanByte ByRef)
   at FASTER.core.VariableLengthBlittableAllocator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetRecordSize(Int64)
   at FASTER.core.VariableLengthBlittableScanIterator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetNext(FASTER.core.RecordInfo ByRef)
   at Seram.Web.Cache.FasterCache`2[[Seram.Area.Indicators.Values.ValueCacheKey, Seram.Area.Indicators, Version=2.2790.8356.3393, Culture=neutral, PublicKeyToken=null],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Compact(Seram.Web.Cache.KeyPredicate`1<Seram.Area.Indicators.Values.ValueCacheKey>)
   at Seram.Area.Indicators.CacheMaintenance+<CacheCleanupInternal>d__2.MoveNext()
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Seram.Area.Indicators.CacheMaintenance+<CacheCleanupInternal>d__2, Seram.Area.Indicators, Version=2.2790.8356.3393, Culture=neutral, PublicKeyToken=null]](<CacheCleanupInternal>d__2 ByRef)
   at Seram.Area.Indicators.CacheMaintenance.CacheCleanupInternal(Sirius.Mvc.IServiceLocator, Boolean, System.Threading.CancellationToken)
   at Seram.Area.Indicators.CacheMaintenance.CacheCleanup(Sirius.Mvc.IServiceLocator, System.Threading.CancellationToken)
   at Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1+<<PerformMaintenance>b__0>d.MoveNext()
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1+<<PerformMaintenance>b__0>d, Seram.Area.TenantManager, Version=2.2790.8356.3394, Culture=neutral, PublicKeyToken=null]](<<PerformMaintenance>b__0>d ByRef)
   at Seram.Area.TenantManager.MultiTenantProvider+<>c__DisplayClass9_1.<PerformMaintenance>b__0()
   at System.Threading.Tasks.Task`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
   at System.Threading.Tasks.Task.ExecuteEntry(Boolean)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Are we somehow causing this through incorrect memory handling in our own code, or is this a problem in FASTER?

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
avonwyss commented, Dec 8, 2022

Thank you for the reply. As I noted, the problem occurred only under load, e.g. with a high degree of concurrent reads and writes while the scan was started. We also noticed runaway memory usage (with an out-of-memory exception after more than 60 GB were used) and really poor performance.

Upon further inspection, it seemed that TakeHybridLogCheckpointAsync() performed poorly when invoked under high load, and that the scan may have been failing while TakeHybridLogCheckpointAsync() was still running in the background. At least, the issue seems to be gone now that we only take the checkpoint when SystemState.Phase == Phase.REST, i.e. we avoid checkpointing under load.
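
The gating described in this comment could look roughly like the sketch below. It is deliberately independent of FASTER: CheckpointGate and TryCheckpointWhenAtRestAsync are hypothetical names, and the two delegates stand in for the SystemState.Phase == Phase.REST check and the TakeHybridLogCheckpointAsync() call mentioned above. How the commenter actually wired this up is not shown in the thread.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of FASTER: runs a checkpoint delegate only
// when a caller-supplied "store is at rest" predicate holds.
public static class CheckpointGate
{
    // Returns false if the checkpoint was skipped, otherwise the delegate's result.
    public static async Task<bool> TryCheckpointWhenAtRestAsync(
        Func<bool> isAtRest,
        Func<Task<bool>> takeCheckpoint)
    {
        if (isAtRest == null) throw new ArgumentNullException(nameof(isAtRest));
        if (takeCheckpoint == null) throw new ArgumentNullException(nameof(takeCheckpoint));

        // Check-then-act: the state can still change between this check and the call below.
        if (!isAtRest())
        {
            return false;
        }

        return await takeCheckpoint().ConfigureAwait(false);
    }
}
```

Because the check and the checkpoint are two separate steps, this only reduces the chance of checkpointing at a bad moment and does not rule it out. The follow-up comment below reports that the access violations still occurred.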

0 reactions
avonwyss commented, Jan 10, 2023

…and we still get AVs:

System.AccessViolationException
   at FASTER.core.SpanByteVarLenStruct.GetLength(FASTER.core.SpanByte ByRef)
   at FASTER.core.VariableLengthBlittableAllocator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetRecordSize(Int64)
   at FASTER.core.VariableLengthBlittableScanIterator`2[[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2],[FASTER.core.SpanByte, FASTER.core, Version=1.0.0.0, Culture=neutral, PublicKeyToken=eb19722ac09e9af2]].GetNext(FASTER.core.RecordInfo ByRef)
   at Seram.Web.Cache.FasterCache`2[[Seram.Area.Indicators.Values.ValueCacheKey, Seram.Area.Indicators, Version=2.2804.8409.20143, Culture=neutral, PublicKeyToken=null],[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Compact(Seram.Web.Cache.KeyPredicate`1<Seram.Area.Indicators.Values.ValueCacheKey>, Boolean)
   at Seram.Area.Indicators.CacheMaintenance+<CacheCleanupInternal>d__2.MoveNext()