
Execution-despatch optimization (reducing heap allocations)

See original GitHub issue

Hi. I’ve been looking at the internal source code and thinking about optimization, in the spirit of new .NET low-allocations and everything. This is not intended as criticism (Polly is awesome), just a suggestion and an offer to contribute some work.

Anyway.

The way Polly currently handles work-to-be-executed is, at the simplest level, as a Func<TResult>. Any state necessary for the operation must therefore be provided as closures, incurring heap allocations. There’s a move in some of the Core projects towards instead passing arguments along with delegates to avoid these closure allocations, so I thought I’d have a go at that in the context of Polly. So, for example, instead of

public string Get(int id) {
  return policy.Execute(() => id.ToString());
}

you would have

public string Get(int id) {
  return policy.Execute(n => n.ToString(), id);
}
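For context, here is roughly what the compiler lowers the closure in the first snippet to; the class and member names are illustrative (the real ones are compiler-generated), but the heap allocations are real, and they are exactly what the second form avoids:

```csharp
using System;

// Sketch of the compiler-generated "display class" for `() => id.ToString()`.
sealed class DisplayClass
{
    public int id;                            // captured local, hoisted to the heap
    public string Invoke() => id.ToString();
}

static class ClosureDemo
{
    // Each call allocates one DisplayClass and one delegate instance.
    public static string Get(int id)
    {
        var closure = new DisplayClass { id = id };
        Func<string> action = closure.Invoke;  // delegate allocation
        return action();
    }
}
```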

Except it turns out that’s a bit of a bugger, because in the internal code you have to multiply all the internal engines’ Execute and ExecuteAndCapture by however many overloads you have (Func<T1,TResult>, Func<T1,T2,TResult>, etc), and that’s not fun for anybody.

So then I thought about wrapping the delegate and its argument(s) in a struct which implemented a single, consistent interface, like this:

public struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
	private readonly Func<T1, TResult> _action;
	private readonly T1 _arg1;

	public PollyAction(Func<T1, TResult> action, T1 arg1)
	{
		_action = action;
		_arg1 = arg1;
	}

	public TResult Execute() => _action(_arg1);
}

public interface IPollyAction<TResult>
{
	TResult Execute();
}
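To see the boxing concretely, here is a minimal sketch (reproducing the two types above so it compiles standalone): any call that passes the struct through a parameter typed as the interface forces a heap-allocated boxed copy:

```csharp
using System;

public interface IPollyAction<TResult>
{
    TResult Execute();
}

public readonly struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;

    public PollyAction(Func<T1, TResult> action, T1 arg1)
    {
        _action = action;
        _arg1 = arg1;
    }

    public TResult Execute() => _action(_arg1);
}

public static class BoxingDemo
{
    // The struct is boxed at the call boundary: `action` is a reference
    // to a heap copy of the struct, so each call still allocates.
    public static TResult ExecuteBoxed<TResult>(IPollyAction<TResult> action)
        => action.Execute();
}
```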

Then all the Execute and ExecuteAndCapture calls can just take an instance of that interface:

public TResult Execute<TResult>(IPollyAction<TResult> action) {
  while (true) {
    try {
      return action.Execute();
    } catch {
      // Whatever exception handling
    }
  }
}

Except, that also causes allocations because the struct gets boxed to the IPollyAction<TResult> interface.

But if you change the Execute method to this:

public TResult Execute<TAction, TResult>(TAction action)
  where TAction : IPollyAction<TResult>
{
  while (true) {
    try {
      return action.Execute();
    } catch {
      // Whatever exception handling
    }
  }
}

then the generic method acts directly on the struct, so no boxing, and hence no GC allocation, occurs. The only downside is that the call site has to specify the generic type parameters explicitly, because C# can't infer TResult through the interface constraint alone, but that's not too bad and will be hidden inside the library code.
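Concretely, the earlier Get(int) example dispatched through the constrained-generic Execute looks like this; note both type arguments spelled out at the call site (types repeated here so the sketch compiles standalone):

```csharp
using System;

// IPollyAction / PollyAction as defined earlier.
public interface IPollyAction<TResult> { TResult Execute(); }

public readonly struct PollyAction<T1, TResult> : IPollyAction<TResult>
{
    private readonly Func<T1, TResult> _action;
    private readonly T1 _arg1;
    public PollyAction(Func<T1, TResult> action, T1 arg1) { _action = action; _arg1 = arg1; }
    public TResult Execute() => _action(_arg1);
}

public static class PolicyEngine
{
    // Generic over the concrete struct type: the constraint is satisfied
    // without converting to the interface, so nothing is boxed.
    public static TResult Execute<TAction, TResult>(TAction action)
        where TAction : IPollyAction<TResult>
        => action.Execute();
}

public static class CallSiteDemo
{
    public static string Get(int id)
        // Both type arguments are explicit: C# can't infer TResult
        // through the interface constraint alone.
        => PolicyEngine.Execute<PollyAction<int, string>, string>(
               new PollyAction<int, string>(n => n.ToString(), id));
}
```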

I threw together a very quick BenchmarkDotNet comparing this approach to the current closure one:

using System;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class Benchmarks
{
	[Benchmark]
	public int Closure()
	{
		int x = 2;
		return Execute(() => x * 2);
	}

	[Benchmark]
	public int PollyAction()
	{
		int x = 2;
		return Execute<PollyAction<int, int>, int>(new PollyAction<int, int>(n => n * 2, x));
	}

	private static T Execute<T>(Func<T> action) => action();

	private static TResult Execute<TAction, TResult>(TAction action)
		where TAction : IPollyAction<TResult>
		=> action.Execute();
}

BenchmarkDotNet=v0.10.8, OS=Windows 7 SP1 (6.1.7601)
Processor=Intel Xeon CPU E5-2660 0 2.20GHz, ProcessorCount=2
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
dotnet cli version=1.0.0
  [Host]     : .NET Core 4.6.25009.03, 64bit RyuJIT
  DefaultJob : .NET Core 4.6.25009.03, 64bit RyuJIT


| Method      |      Mean |     Error |    StdDev |  Gen 0 | Allocated |
|------------ |----------:|----------:|----------:|-------:|----------:|
| Closure     | 20.653 ns | 0.4642 ns | 1.1561 ns | 0.0139 |      88 B |
| PollyAction |  8.418 ns | 0.1315 ns | 0.1027 ns |      - |       0 B |

Using the struct wrapper eliminates heap allocations and cuts execution time by more than half, even in this simple example. Polly currently creates multiple closures per call.

Question is, does this matter enough, and are the maintainers interested in this level of optimization? If it does and they are, I’m happy to spike something more representative, like redoing RetryPolicy to use this approach, to enable further discussion.

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 1
  • Comments: 13 (9 by maintainers)

Top GitHub Comments

2 reactions
martintmk commented, Jun 19, 2023

Hey folks, this is addressed in V8, where the ResilienceStrategy allows passing a TState to the lambda. This way, you can use static lambdas that read the state during execution.
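The pattern being described can be sketched generically (this is not the exact v8 API; see the linked benchmark for real usage): a `static` lambda can't capture locals, so state must flow in through the TState argument, and no closure is ever allocated:

```csharp
using System;

public static class StateDemo
{
    // Strategy-style Execute that threads caller state to the callback,
    // in the spirit of the v8 TState overloads (a sketch, not Polly's
    // actual signature).
    public static TResult Execute<TState, TResult>(Func<TState, TResult> callback, TState state)
        => callback(state);
}
```

The `static` modifier on the lambda makes the compiler reject any capture, e.g. `StateDemo.Execute(static x => x * 2, 21)`; such a lambda compiles to a cached delegate rather than a per-call closure.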

https://github.com/App-vNext/Polly/blob/6fcf0f68a116868f18d5be97e32d7a442d85f140/bench/Polly.Core.Benchmarks/ResilienceStrategyBenchmark.cs#L21

@martincostello , I think we can close this one.

cc @SimonCropp

2 reactions
reisenberger commented, Jan 25, 2020

Polly v8.0.0 will also add new execute overloads allowing passing strongly-typed input objects to delegates to be executed, without using a closure:

void Execute<T1>(Action<Context, CancellationToken, T1> action, Context context, CancellationToken cancellationToken, T1 input1)
void Execute<T1, T2>(Action<Context, CancellationToken, T1, T2> action, Context context, CancellationToken cancellationToken, T1 input1, T2 input2)

TResult Execute<T1, TResult>(Func<Context, CancellationToken, T1, TResult> func, Context context, CancellationToken cancellationToken, T1 input1)
TResult Execute<T1, T2, TResult>(Func<Context, CancellationToken, T1, T2, TResult> func, Context context, CancellationToken cancellationToken, T1 input1, T2 input2)

Task ExecuteAsync<T1>(Func<Context, CancellationToken, bool, T1, Task> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1)
Task ExecuteAsync<T1, T2>(Func<Context, CancellationToken, bool, T1, T2, Task> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1, T2 input2)

Task<TResult> ExecuteAsync<T1, TResult>(Func<Context, CancellationToken, bool, T1, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1)
Task<TResult> ExecuteAsync<T1, T2, TResult>(Func<Context, CancellationToken, bool, T1, T2, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, T1 input1, T2 input2)

(and similar .ExecuteAndCapture/Async(...) variants)
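A minimal sketch of the shape of the first Func overload above, with a stand-in Context type so it compiles standalone; the input travels as a strongly-typed argument instead of being captured by a closure:

```csharp
using System;
using System.Threading;

public class Context { }    // stand-in for Polly's Context, for illustration only

public static class OverloadSketch
{
    // Mirrors Execute<T1, TResult>(Func<Context, CancellationToken, T1, TResult>, ...):
    // the delegate can be static because `input1` arrives as a parameter.
    public static TResult Execute<T1, TResult>(
        Func<Context, CancellationToken, T1, TResult> func,
        Context context, CancellationToken cancellationToken, T1 input1)
        => func(context, cancellationToken, input1);
}
```

A caller would write, for example, `OverloadSketch.Execute(static (ctx, ct, id) => id.ToString(), new Context(), CancellationToken.None, 5)`.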
