Reduce allocations when manually creating expressions that take params or IEnumerable arguments
I’ve been working on a library that does various things with expression trees, and a few changes I made there improved performance and reduced allocations. Neither of these optimisations appears to be done inside the EF Core repo, so I thought I’d open an issue to point them out in case they lead to improved performance in EF Core.
The problem
When using the helper methods for creating expressions such as Expression.Block or Expression.Lambda, if an IEnumerable or params array is passed in, the method will create a defensive copy of it, thus resulting in additional allocations.
To see places where a defensive copy is created, you can look through the System.Linq.Expressions source code to find instances of .ToReadOnly(). A good example is Expression.Block, which creates defensive copies for both the variables and expressions arguments (source). At the time of writing, the source code for ToReadOnly is as follows:
public static ReadOnlyCollection<T> ToReadOnly<T>(this IEnumerable<T>? enumerable)
{
    if (enumerable == null)
        return EmptyReadOnlyCollection<T>.Instance;
    if (enumerable is TrueReadOnlyCollection<T> troc)
        return troc;
    if (enumerable is ReadOnlyCollectionBuilder<T> builder)
        return builder.ToReadOnlyCollection();
    T[] array = enumerable.ToArray();
    return array.Length == 0 ? EmptyReadOnlyCollection<T>.Instance : new TrueReadOnlyCollection<T>(array);
}
From this code we can see that for most enumerables, this is going to result in a call to .ToArray(), which will always allocate a new array and then copy the arguments into it. When I was doing some digging into improving the perf of my library I saw that calls to ToReadOnly accounted for about 3% of my total CPU usage, as almost every single time I created an expression it resulted in an allocation of an array.
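One way to observe the defensive copy from the outside is to mutate the input arrays after the call and check that the block is unaffected. This is only a minimal sketch written as a top-level C# program; the variable names are made up for illustration, and it shows that a copy was taken, not how many allocations it cost.

```csharp
using System;
using System.Linq.Expressions;

ParameterExpression x = Expression.Variable(typeof(int), "x");
ParameterExpression y = Expression.Variable(typeof(int), "y");

// Plain arrays: these reach ToReadOnly() and get copied.
ParameterExpression[] variables = { x, y };
Expression[] body =
{
    Expression.Assign(x, Expression.Constant(1)),
    Expression.Assign(y, Expression.Constant(2)),
    Expression.Add(x, y),
};

BlockExpression block = Expression.Block(variables, body);

// Mutating the caller's arrays afterwards must not change the block,
// which is exactly why the factory method copies them.
variables[0] = y;
body[2] = Expression.Constant(0);

bool variablesUntouched = ReferenceEquals(block.Variables[0], x);
bool bodyUntouched = block.Expressions[2].NodeType == ExpressionType.Add;

int result = Expression.Lambda<Func<int>>(block).Compile()();

Console.WriteLine(variablesUntouched); // True
Console.WriteLine(bodyUntouched);      // True
Console.WriteLine(result);             // 3
```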
Solutions
Empty Array/IEnumerable
If the enumerable/array being passed in has no elements, it would instead be better to pass in null rather than an empty array or enumerable. Not only will this help prevent allocating an unnecessary empty array/enumerable in the first place, but it prevents a defensive copy being allocated afterwards. So changing to null should save two allocations.
In a few methods such as Expression.Lambda and Expression.Call, the arguments are passed in using a nullable params array. For example, this is what the Expression.Lambda method signature looks like:
public static LambdaExpression Lambda(Expression body, params ParameterExpression[]? parameters)
If you were to just do Expression.Lambda(body), then this will result in parameters being passed in as an empty array. If instead you called Expression.Lambda(body, null), then this avoids two empty array allocations. I had a look through the EF Core repo and there are 8 places (excluding unit tests) where Expression.Lambda is being called without any parameters.
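As a small sketch of the two call shapes, the program below builds the same parameterless lambda both ways and checks that the results behave identically. The explicit cast on null is only there to make the chosen overload obvious to a reader; it is not required. The sketch does not measure allocations, so whether the null form actually saves anything depends on the compiler and runtime in use.

```csharp
#nullable enable
using System;
using System.Linq.Expressions;

Expression body = Expression.Constant(42);

// Relies on the compiler to supply the (empty) params array.
LambdaExpression implicitEmpty = Expression.Lambda(body);

// Passes null for the params array, so ToReadOnly() takes its null branch.
LambdaExpression explicitNull = Expression.Lambda(body, (ParameterExpression[]?)null);

Console.WriteLine(implicitEmpty.Parameters.Count);         // 0
Console.WriteLine(explicitNull.Parameters.Count);          // 0
Console.WriteLine(explicitNull.Compile().DynamicInvoke()); // 42
```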
Non-Empty Array/IEnumerable
In the code I posted above, TrueReadOnlyCollection is an internal type and so it can’t be used, but ReadOnlyCollectionBuilder is a public sealed type and by using this we can avoid the call to IEnumerable.ToArray. Here is the source code for ReadOnlyCollectionBuilder at the time of writing to help explain how it should be used: ReadOnlyCollectionBuilder.cs
For any benefits to be gained, you have to pass in a correct capacity value into the constructor that is exactly equal to the number of elements in the collection and you need to add the elements in using the .Add method/collection initializer. If we pass in a capacity that is less than the number of items, then it will result in additional allocations to resize the internal array. If we pass in a capacity that is more than the number of items, then the call to ToReadOnlyCollection will not re-use the internal array and will instead allocate a new array of the correct length and copy the items into it.
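The capacity rules above can be checked through the public Count and Capacity properties. The following is a rough sketch with illustrative names (exact, oversized); it only inspects the observable state of the builder and does not prove which backing array ends up inside the returned collection. One thing worth noticing is that ToReadOnlyCollection resets the builder, so a builder should not be reused after it has been handed to an expression factory method.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;

// Capacity matches the element count: the case where, per the description
// above, the internal array can be handed over without a copy.
var exact = new ReadOnlyCollectionBuilder<Expression>(2)
{
    Expression.Constant(1),
    Expression.Constant(2),
};
bool exactFit = exact.Count == exact.Capacity;

ReadOnlyCollection<Expression> items = exact.ToReadOnlyCollection();
int countAfterHandOver = exact.Count; // the builder is reset by the call

// Capacity larger than the element count: the case where a trimmed copy
// has to be made when the read-only collection is created.
var oversized = new ReadOnlyCollectionBuilder<Expression>(8)
{
    Expression.Constant(1),
    Expression.Constant(2),
};
bool oversizedFit = oversized.Count == oversized.Capacity;

Console.WriteLine(exactFit);           // True
Console.WriteLine(items.Count);        // 2
Console.WriteLine(countAfterHandOver); // 0
Console.WriteLine(oversizedFit);       // False
```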
As an example of how this optimisation could be used, we could rewrite the following
Expression.Block(new [] { var1, var2, var3 }, new [] { expr1, expr2, expr3, expr4 });
into
Expression.Block(
    new ReadOnlyCollectionBuilder<ParameterExpression>(3) { var1, var2, var3 },
    new ReadOnlyCollectionBuilder<Expression>(4) { expr1, expr2, expr3, expr4 });
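For completeness, here is a self-contained version of the same pattern that can be compiled and run. The variables a and b and the arithmetic are placeholders chosen for illustration; only the way the builders are passed to Expression.Block matters.

```csharp
using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;

ParameterExpression a = Expression.Variable(typeof(int), "a");
ParameterExpression b = Expression.Variable(typeof(int), "b");

// Capacities are exactly equal to the number of items added.
var variables = new ReadOnlyCollectionBuilder<ParameterExpression>(2) { a, b };
var statements = new ReadOnlyCollectionBuilder<Expression>(3)
{
    Expression.Assign(a, Expression.Constant(20)),
    Expression.Assign(b, Expression.Constant(22)),
    Expression.Add(a, b),
};

// Both builders go through the ReadOnlyCollectionBuilder branch of ToReadOnly().
BlockExpression block = Expression.Block(variables, statements);

int result = Expression.Lambda<Func<int>>(block).Compile()();

Console.WriteLine(block.Variables.Count);   // 2
Console.WriteLine(block.Expressions.Count); // 3
Console.WriteLine(variables.Count);         // 0, the builder was consumed
Console.WriteLine(result);                  // 42
```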
Overall, I am not really sure how much this would improve perf in EF Core. It’s a very large project with many moving parts, so it might not be worth investing the time to go through and fix every single scenario. The reduced allocations may still be appealing, though, since this could remove many hundreds of allocations of tiny arrays whenever a query is compiled.
@CameronAavik thanks for this!
For context, EF Core generally only constructs expression trees in places which aren’t perf-sensitive. For example, we use expression trees to generate the shaper, which is the piece of code that reads back results from the database and materializes user .NET types out of them; this generation happens once when a query is first seen, and is then cached for later queries.
This generally makes our expression tree usage non-perf-sensitive, and I suspect that is the case in many other libraries (or at least it should be). I’d be curious to know how your library uses expression trees and understand the perf improvements that these optimizations yielded. This doesn’t mean we don’t care about expression tree perf - we do - and I think we’d definitely accept PRs making the improvements above and not hurting readability too much (though /cc @smitpatel who is the query pipeline architect).
For empty arrays specifically, it may be best to optimize the expression tree code itself, i.e. to add code there which specifically checks for an empty array, and avoids copying if so. You could submit a PR directly to https://github.com/dotnet/runtime/ for that. Note that the compiler already optimizes cases where a params X[] method is invoked without parameters - an empty array isn’t allocated (see code).
For the non-empty case, using ReadOnlyCollectionBuilder indeed looks like it could save some allocations… Even if you don’t know the capacity upfront, it seems like it could remove the allocations and copying when the expression node is constructed, even if during construction some allocations remain because of resizing (that seems unavoidable if the total length isn’t known in advance).
@CameronAavik - We can make the change even if the perf benefit is minor, as long as it doesn’t reduce the readability and maintainability of the code by a huge amount. It would remain low priority, though, and would go into a future release. Would love to see a PR. As for the split between this repo and runtime: if the behavior can be optimized in the implementation of the method, then it should go to the runtime repo. While the code for expression building is almost frozen to avoid breaking changes, as long as this doesn’t cause breaking changes, I believe we can make it happen.