Execution-despatch optimization (reducing heap allocations)
Hi. I’ve been looking at the internal source code and thinking about optimization, in the spirit of new .NET low-allocations and everything. This is not intended as criticism (Polly is awesome), just a suggestion and an offer to contribute some work.
Anyway.
The way Polly currently handles work-to-be-executed is, at the simplest level, as a Func<TResult>. Any state necessary for the operation must therefore be provided as closures, incurring heap allocations. There’s a move in some of the Core projects towards instead passing arguments along with delegates to avoid these closure allocations, so I thought I’d have a go at that in the context of Polly. So, for example, instead of
    public string Get(int id) {
        return policy.Execute(() => id.ToString());
    }
you would have
    public string Get(int id) {
        return policy.Execute(n => n.ToString(), id);
    }
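To make the difference concrete, here is a hand-written approximation of what the compiler does with the capturing lambda, next to the argument-passing form. This is only a sketch: `CapturedId`, `ClosureSketch` and both `Execute` helpers are hypothetical stand-ins (the real compiler-generated type has an unspeakable name, and these are not Polly's methods).

```csharp
using System;

// Hypothetical stand-in for the compiler-generated closure class
// behind `() => id.ToString()`.
sealed class CapturedId
{
    public int id;
    public string Invoke() => id.ToString();
}

static class ClosureSketch
{
    // Stand-in for policy.Execute(Func<TResult>); not Polly's code.
    public static TResult Execute<TResult>(Func<TResult> action) => action();

    // Stand-in for the suggested overload that carries the argument alongside.
    public static TResult Execute<T1, TResult>(Func<T1, TResult> action, T1 arg1) => action(arg1);

    public static string GetWithClosure(int id)
    {
        var state = new CapturedId { id = id };          // allocation: captured state
        return Execute(new Func<string>(state.Invoke));  // allocation: delegate over that state
    }

    public static string GetWithArgument(int id)
    {
        // The lambda captures nothing, so the compiler can cache one delegate
        // instance and reuse it; the argument travels as a plain parameter.
        return Execute(n => n.ToString(), id);
    }
}
```

Both methods return the same string; the difference is only in what gets allocated on the way there.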
Except it turns out that’s a bit of a bugger, because in the internal code you have to multiply all the internal engines’ Execute and ExecuteAndCapture by however many overloads you have (Func<T1,TResult>, Func<T1,T2,TResult>, etc), and that’s not fun for anybody.
So then I thought about wrapping the delegate and its argument(s) in a struct which implemented a single, consistent interface, like this:
    public struct PollyAction<T1, TResult> : IPollyAction<TResult>
    {
        private readonly Func<T1, TResult> _action;
        private readonly T1 _arg1;
        public PollyAction(Func<T1, TResult> action, T1 arg1)
        {
            _action = action;
            _arg1 = arg1;
        }
        public TResult Execute() => _action(_arg1);
    }
    public interface IPollyAction<TResult>
    {
        TResult Execute();
    }
Then all the Execute and ExecuteAndCapture calls can just take an instance of that interface:
    public TResult Execute<TResult>(IPollyAction<TResult> action) {
        while (true) {
            try {
                return action.Execute();
            } catch {
                // Whatever exception handling
            }
        }
    }
Except, that also causes allocations because the struct gets boxed to the IPollyAction<TResult> interface.
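A rough way to check the boxing yourself is to measure allocated bytes around a loop of calls. The sketch below assumes a runtime that has `GC.GetAllocatedBytesForCurrentThread` (.NET Core 3.0 or later); `BoxingCheck`, `ExecuteViaInterface` and `Measure` are hypothetical names. The exact number depends on the runtime, and a newer JIT may be able to elide the box in a case this simple, so treat the output as indicative only.

```csharp
using System;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;
    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }
    public TResult Execute() => _action(_arg1);
}

public static class BoxingCheck
{
    // Interface-typed parameter: passing a struct here converts it to a
    // reference type, which normally means a boxed copy on the heap.
    public static TResult ExecuteViaInterface<TResult>(IPollyAction<TResult> action)
        => action.Execute();

    // Returns the bytes allocated on this thread by `iterations` calls.
    public static long Measure(int iterations)
    {
        Func<int, int> twice = n => n * 2;  // created once, outside the measured loop
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            ExecuteViaInterface<int>(new PollyAction<int, int>(twice, i));
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Calling `Console.WriteLine(BoxingCheck.Measure(1000))` prints the bytes attributed to the loop.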
But if you change the Execute method to this:
    public TResult Execute<TAction, TResult>(TAction action)
        where TAction : IPollyAction<TResult>
    {
        while (true) {
            try {
                // Return the result; a bare call followed by break would leave the method without a return value
                return action.Execute();
            } catch {
                // Whatever exception handling
            }
        }
    }
then the generic method acts on the struct directly, so no boxing, and hence no GC allocation, occurs. The only downside is that the call to this method has to explicitly specify the generic type parameters, because C#'s type inference doesn’t use the constraint to work out TResult from TAction, but it’s not too bad and will be hidden inside the library code.
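One way the explicit type parameters could stay hidden inside the library is a thin public overload that builds the struct and forwards to the constrained generic method. This is a minimal sketch, not Polly's code: `Engine`, `ExecuteCore`, the `maxAttempts` parameter and the retry-on-any-exception handling are all assumptions made for illustration.

```csharp
using System;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;
    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }
    public TResult Execute() => _action(_arg1);
}

public static class Engine
{
    // Internal engine: TAction is the concrete struct type, so the call to
    // action.Execute() is made on the struct itself, without boxing.
    private static TResult ExecuteCore<TAction, TResult>(TAction action, int maxAttempts)
        where TAction : IPollyAction<TResult>
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action.Execute();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Placeholder handling: swallow and retry until attempts run out.
            }
        }
    }

    // Public overload: T1 and TResult are inferred from the arguments, and the
    // explicit type arguments for ExecuteCore are written once, here.
    public static TResult Execute<T1, TResult>(Func<T1, TResult> action, T1 arg1, int maxAttempts = 3)
        => ExecuteCore<PollyAction<T1, TResult>, TResult>(
            new PollyAction<T1, TResult>(action, arg1), maxAttempts);
}
```

A caller then writes `Engine.Execute(n => n.ToString(), id)` with no type arguments. The cost is one such forwarding overload per arity, but the engine itself exists only once.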
I threw together a very quick BenchmarkDotNet benchmark comparing this approach to the current closure one:
    public class Benchmarks
    {
        [Benchmark]
        public int Closure()
        {
            int x = 2;
            return Execute(() => x * 2);
        }
        [Benchmark]
        public int Action()
        {
            int x = 2;
            return Execute<PollyAction<int, int>, int>(new PollyAction<int, int>(n => n * 2, x));
        }
        private static T Execute<T>(Func<T> action) => action();
        private static TResult Execute<TAction, TResult>(TAction action)
            where TAction : IPollyAction<TResult>
            => action.Execute();
    }
BenchmarkDotNet=v0.10.8, OS=Windows 7 SP1 (6.1.7601)
Processor=Intel Xeon CPU E5-2660 0 2.20GHz, ProcessorCount=2
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
dotnet cli version=1.0.0
[Host] : .NET Core 4.6.25009.03, 64bit RyuJIT
DefaultJob : .NET Core 4.6.25009.03, 64bit RyuJIT
Method | Mean | Error | StdDev | Gen 0 | Allocated |
---|---|---|---|---|---|
Closure | 20.653 ns | 0.4642 ns | 1.1561 ns | 0.0139 | 88 B |
PollyAction | 8.418 ns | 0.1315 ns | 0.1027 ns | - | 0 B |
Using the struct wrapper eliminates heap allocations and cuts execution time by more than 50%, just for this simple example. Polly currently creates multiple closures per call.
Question is, does this matter enough, and are the maintainers interested in this level of optimization? If it does and they are, I’m happy to spike something more representative, like redoing RetryPolicy to use this approach, to enable further discussion.
Top GitHub Comments
Hey folks, this is addressed in V8, where ResilienceStrategy allows passing TState to the lambda. This way, you can use static lambdas that use the state for executions: https://github.com/App-vNext/Polly/blob/6fcf0f68a116868f18d5be97e32d7a442d85f140/bench/Polly.Core.Benchmarks/ResilienceStrategyBenchmark.cs#L21
@martincostello , I think we can close this one.
cc @SimonCropp
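For readers unfamiliar with the static-lambda pattern mentioned in this comment, a minimal sketch follows. It assumes C# 9 or later, and `StateSketch.Execute` is a hypothetical stand-in for a state-passing execute method, not the actual Polly V8 signature (see the linked benchmark for the real usage).

```csharp
using System;

static class StateSketch
{
    // Hypothetical state-passing execute method; not Polly's API.
    public static TResult Execute<TState, TResult>(Func<TState, TResult> callback, TState state)
        => callback(state);

    public static string Get(int id)
    {
        // The `static` modifier makes any capture of locals or `this` a
        // compile-time error, so the lambda cannot quietly become a closure.
        return Execute(static n => n.ToString(), id);
    }
}
```

Writing `static () => id.ToString()` instead would fail to compile, which is what makes the modifier useful as a guard against accidental closures.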
Polly v8.0.0 will also add new execute overloads allowing passing strongly-typed input objects to delegates to be executed, without using a closure:
(and similar .ExecuteAndCapture/Async(...) variants)