Execution-despatch optimization (reducing heap allocations)

Hi. I’ve been looking at the internal source code and thinking about optimization, in the spirit of .NET’s recent push towards low-allocation code. This is not intended as criticism (Polly is awesome), just a suggestion and an offer to contribute some work.

Anyway.

The way Polly currently handles work-to-be-executed is, at the simplest level, as a Func<TResult>. Any state the operation needs must therefore be captured in closures, which incur heap allocations. There’s a move in some of the .NET Core projects towards passing arguments along with delegates instead, to avoid these closure allocations, so I thought I’d have a go at that in the context of Polly. So, for example, instead of

public string Get(int id) {
  return policy.Execute(() => id.ToString());
}

you would have

public string Get(int id) {
  return policy.Execute(n => n.ToString(), id);
}
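
To make the allocation difference concrete, here is a hand-written sketch of roughly what the compiler does with each version. `IdClosure`, `ClosureDemo` and both `Execute` stand-ins are hypothetical names for illustration only; the real compiler-generated class has a name you cannot write in source, and none of this is Polly's actual code.

```csharp
using System;

// Hypothetical stand-in for the compiler-generated "display class"
// that holds captured variables. Illustration only.
public sealed class IdClosure
{
    public int id;
    public string Invoke() => id.ToString();
}

public static class ClosureDemo
{
    // Stand-in for an Execute(Func<TResult>) entry point; not Polly's code.
    public static TResult Execute<TResult>(Func<TResult> action) => action();

    // Stand-in for an overload that takes the argument alongside the delegate.
    public static TResult Execute<T1, TResult>(Func<T1, TResult> action, T1 arg1) => action(arg1);

    // Roughly what `Execute(() => id.ToString())` turns into:
    // one heap object for the captured state plus one delegate, on every call.
    public static string GetWithClosure(int id)
    {
        var closure = new IdClosure { id = id };            // allocation 1
        return Execute(new Func<string>(closure.Invoke));   // allocation 2
    }

    // This lambda captures nothing, so the compiler can create the
    // delegate once and reuse the cached instance on later calls.
    public static string GetWithArgument(int id)
    {
        return Execute(n => n.ToString(), id);
    }
}
```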

Except it turns out that’s a bit of a bugger, because in the internal code you have to multiply all the internal engines’ Execute and ExecuteAndCapture by however many overloads you have (Func<T1,TResult>, Func<T1,T2,TResult>, etc), and that’s not fun for anybody.

So then I thought about wrapping the delegate and its argument(s) in a struct which implemented a single, consistent interface, like this:

public struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
	private readonly Func<T1, TResult> _action;
	private readonly T1 _arg1;

	public PollyAction(Func<T1, TResult> action, T1 arg1)
	{
		_action = action;
		_arg1 = arg1;
	}

	public TResult Execute() => _action(_arg1);
}

public interface IPollyAction<TResult>
{
	TResult Execute();
}

Then all the Execute and ExecuteAndCapture calls can just take an instance of that interface:

public TResult Execute<TResult>(IPollyAction<TResult> action) {
  while (true) {
    try {
      return action.Execute();
    } catch {
      // Whatever exception handling
    }
  }
}

Except, that also causes allocations because the struct gets boxed to the IPollyAction<TResult> interface.
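
The boxing can be observed directly. The following is a minimal sketch, assuming a runtime that provides `GC.GetAllocatedBytesForCurrentThread` (.NET Core 3.0 or later); `BoxingDemo` and `MeasureInterfaceCalls` are hypothetical names, and the `NoInlining` attribute is there only so the JIT cannot optimize the box away in such a tiny example.

```csharp
using System;
using System.Runtime.CompilerServices;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public readonly struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;

    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }

    public TResult Execute() => _action(_arg1);
}

public static class BoxingDemo
{
    // Cached delegate, so the delegate itself is not part of the measurement.
    private static readonly Func<int, int> Doubler = n => n * 2;

    // Interface-typed parameter: a struct argument has to be boxed to be passed here.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static TResult ExecuteViaInterface<TResult>(IPollyAction<TResult> action)
        => action.Execute();

    // Returns the number of bytes allocated on this thread by `calls` invocations.
    public static long MeasureInterfaceCalls(int calls)
    {
        var action = new PollyAction<int, int>(Doubler, 2);
        ExecuteViaInterface(action); // warm-up so one-off JIT work is not counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < calls; i++)
        {
            ExecuteViaInterface(action); // implicit struct-to-interface conversion boxes here
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Calling `BoxingDemo.MeasureInterfaceCalls(1000)` should report a non-zero number of bytes, since each call creates a fresh box for the struct.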

But if you change the Execute method to this:

public TResult Execute<TAction, TResult>(TAction action)
  where TAction : IPollyAction<TResult>
{
  while (true) {
    try {
      // TAction is the concrete struct type here, so nothing is boxed
      return action.Execute();
    } catch {
      // Whatever exception handling
    }
  }
}

then the generic method operates on the struct directly, so no boxing, and hence no GC allocation, occurs. The only downside is that callers have to specify the generic type arguments explicitly, because C#’s type inference does not use the constraint on TAction to work out TResult, but it’s not too bad and will be hidden inside the library code.
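
One way the explicit type arguments could be hidden inside the library is a thin public overload that builds the struct and forwards it. This is only a sketch under assumptions: `Engine` and `ExecuteCore` are hypothetical names, not anything in Polly.

```csharp
using System;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public readonly struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;

    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }

    public TResult Execute() => _action(_arg1);
}

public static class Engine
{
    // Core method: TAction is a concrete struct type at each call site,
    // so Execute() is invoked on the struct itself and nothing is boxed.
    public static TResult ExecuteCore<TAction, TResult>(TAction action)
        where TAction : IPollyAction<TResult>
        => action.Execute();

    // Public-facing overload: callers get ordinary type inference,
    // and the explicit type arguments are written once, here.
    public static TResult Execute<T1, TResult>(Func<T1, TResult> func, T1 arg1)
        => ExecuteCore<PollyAction<T1, TResult>, TResult>(
            new PollyAction<T1, TResult>(func, arg1));
}
```

A caller would then write `Engine.Execute(n => n * 2, 21)` and never see `PollyAction` at all. One such overload is still needed per delegate arity, but only at the public surface; the internal engines see a single `TAction` shape.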

I threw together a very quick BenchmarkDotNet benchmark comparing this approach to the current closure one:

using System;
using BenchmarkDotNet.Attributes;

public class Benchmarks
{
	[Benchmark]
	public int Closure()
	{
		int x = 2;
		return Execute(() => x * 2);
	}

	[Benchmark]
	public int PollyAction()
	{
		int x = 2;
		return Execute<PollyAction<int, int>, int>(new PollyAction<int, int>(n => n * 2, x));
	}

	private static T Execute<T>(Func<T> action) => action();

	private static TResult Execute<TAction, TResult>(TAction action)
		where TAction : IPollyAction<TResult>
		=> action.Execute();
}


BenchmarkDotNet=v0.10.8, OS=Windows 7 SP1 (6.1.7601)
Processor=Intel Xeon CPU E5-2660 0 2.20GHz, ProcessorCount=2
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
dotnet cli version=1.0.0
  [Host]     : .NET Core 4.6.25009.03, 64bit RyuJIT
  DefaultJob : .NET Core 4.6.25009.03, 64bit RyuJIT

Method	Mean	Error	StdDev	Gen 0	Allocated
Closure	20.653 ns	0.4642 ns	1.1561 ns	0.0139	88 B
PollyAction	8.418 ns	0.1315 ns	0.1027 ns	-	0 B

Using the struct wrapper eliminates heap allocations and cuts execution time by more than 50% (from about 20.7 ns to 8.4 ns), even for this simple example. Polly currently creates multiple closures per call.

Question is, does this matter enough, and are the maintainers interested in this level of optimization? If it does and they are, I’m happy to spike something more representative, like redoing RetryPolicy to use this approach, to enable further discussion.
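
As a rough idea of what such a spike might look like, here is a minimal retry loop built on the struct-action approach. It is a sketch only: `RetrySketch` and its `maxRetries` parameter are hypothetical, and it is not Polly's RetryPolicy, which also handles exception filtering, delays, callbacks and context.

```csharp
using System;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public readonly struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;

    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }

    public TResult Execute() => _action(_arg1);
}

public static class RetrySketch
{
    // Runs the action, retrying up to maxRetries times after a failure.
    // The action stays a struct throughout, so the loop itself allocates nothing.
    public static TResult Execute<TAction, TResult>(TAction action, int maxRetries)
        where TAction : IPollyAction<TResult>
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return action.Execute();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                // Swallow and go round again. Once the retries are used up,
                // the filter no longer matches and the exception propagates.
            }
        }
    }
}
```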

Issue Analytics

State:
Created 6 years ago
Reactions: 1
Comments: 13 (9 by maintainers)

Top GitHub Comments

2 reactions

martintmk commented, Jun 19, 2023

Hey folks, this is addressed in V8, where the ResilienceStrategy allows passing a TState to the lambda. This way, you can use static lambdas that take the state they need as an argument instead of capturing it.

https://github.com/App-vNext/Polly/blob/6fcf0f68a116868f18d5be97e32d7a442d85f140/bench/Polly.Core.Benchmarks/ResilienceStrategyBenchmark.cs#L21

@martincostello, I think we can close this one.

cc @SimonCropp
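
The state-passing pattern described above can be sketched without any library. `MiniPipeline` below is a hypothetical stand-in written for this example; it is not Polly's ResilienceStrategy and does not reproduce its signatures. The sketch assumes C# 9 or later for the `static` lambda modifier.

```csharp
using System;

// Hypothetical stand-in for a pipeline that forwards caller state to the callback.
public sealed class MiniPipeline
{
    public TResult Execute<TState, TResult>(Func<TState, TResult> callback, TState state)
        => callback(state);
}

public static class StateDemo
{
    private static readonly MiniPipeline Pipeline = new MiniPipeline();

    public static string Format(int id, string prefix)
    {
        // `static` makes the compiler reject any accidental capture of id or prefix,
        // so the delegate can be cached. The value tuple carries both values to the
        // callback without a heap allocation of its own.
        return Pipeline.Execute(
            static s => s.prefix + s.id.ToString(),
            (id: id, prefix: prefix));
    }
}
```

With this shape, `StateDemo.Format(7, "id-")` yields `"id-7"`, and everything the lambda needs arrives through `s`.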

2 reactions

reisenberger commented, Jan 25, 2020

Polly v8.0.0 will also add new Execute overloads that pass strongly-typed input objects to the delegates being executed, without using a closure:

void Execute<T1>(Action<Context, CancellationToken, T1> action, Context context, CancellationToken cancellationToken, T1 input1)
void Execute<T1, T2>(Action<Context, CancellationToken, T1, T2> action, Context context, CancellationToken cancellationToken, T1 input1, T2 input2)

TResult Execute<T1, TResult>(Func<Context, CancellationToken, T1, TResult> func, Context context, CancellationToken cancellationToken, T1 input1)
TResult Execute<T1, T2, TResult>(Func<Context, CancellationToken, T1, T2, TResult> func, Context context, CancellationToken cancellationToken, T1 input1, T2 input2)

Task ExecuteAsync<T1>(Func<Context, CancellationToken, bool, T1, Task> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1)
Task ExecuteAsync<T1, T2>(Func<Context, CancellationToken, bool, T1, T2, Task> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1, T2 input2)

Task<TResult> ExecuteAsync<T1, TResult>(Func<Context, CancellationToken, bool, T1, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1)
Task<TResult> ExecuteAsync<T1, T2, TResult>(Func<Context, CancellationToken, bool, T1, T2, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1, T2 input2)

(and similar .ExecuteAndCapture/Async(...) variants)