Improving performance of LINQ’s Select operation.
This is a follow up on my exploration into improving LINQ performance. This is a work in progress. Go to my previous article to find what triggered this.
The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq
Note: In other programming languages, it’s usually called
Its implementation is very simple. Given a
IEnumerable<TSource> and a
selector function that transforms a
TSource into a
TResult, it applies the transformation to each element of the source sequence and returns them as an
Note: This implementation uses a local function to avoid a side-effect caused by the use of the
yieldkeyword. This way, the parameter validation is immediately executed, while the transformation loop is lazy executed.
The implementation found in LINQ is way more complex. But, if you check on ShapLab.io the result of the expansion of the above code by the compiler, and if you remove the special cases in the LINQ version, you’ll find that they are very similar. Both return an instance of a class that implements the interfaces
Value type enumerator
As I explained in a previous article, using a value type enumerator instead, can substantially improve performance. Replacing the
yield return by a hand-made enumerator, looks like this:
Things to notice:
Select()now returns an instance of the value type
SelectEnumerable, without casting it to
IEnumerable<T>which would cause it, and the enumerator, to be boxed.
- Now, there is no need to use a local function for immediate/lazy execution separation, but methods that throw exceptions cannot be inlined by the JIT compiler. Using local functions to throw the exceptions is a good alternative, making
Select()a candidate for inlining.
- The public
SelectEnumerable.GetEnumerator()method returns the type
Enumerator, to avoid boxing of the enumerator. The other methods that cause boxing are implemented explicitly so that they are only used when the enumerable is cast to any of the interfaces derived from
IEnumerable<T>that it implements.
- The enumerable is passed by reference to the enumerator constructor using the
inkeyword. This is the equivalent to a read-only
ref. Both the enumerable and the enumerator are declared as
readonly struct, avoiding defensive copies made by the compiler.
- Every time a new
Enumeratorinstance is created, it creates a “twin” enumerator from the source enumerable.
Enumeratormethods simply call the methods from the source enumerator, with the exception of
Current, that transforms the value by calling the
Note: This implementation still doesn’t avoid boxing of the source enumerator. I’ll leave that for later.
IReadOnlyCollection and IReadOnlyList interfaces
Select() applies the transformation to every element of the original sequence. This means that the size of the sequence does not change. It also means that it’s possible to implement random access.
Two overloads can be added for
Select() and the compiler automatically uses them when it finds sequences that implement these interfaces. Here’s the implementation for
.NET does not support inheritance for value types so new definitions for the enumerable and the enumerator have to be added. The only difference is that this enumerable now implements a
Count property and an indexer. The indexer transforms the value by calling the
This implementation allows the direct use of the
Count property, which has complexity O(1), instead of the
Count() operation, which has complexity O(n).
This also allows the use of
for loops instead of
foreach, which can be seen in the benchmarks from the previous article, has better performance, even when using value type enumerators.
Generics interface constraints
Generics constraints allow the use of interfaces without boxing the value types. To avoid boxing the source enumerator, the code has to be changed to use interface constraints:
Notice the changes:
- Parameters of types
IEnumerator<T>were replaced by
TEnumeratorthat are constrained to the respective interfaces by using the
- Use of different static partial classes to avoid ambiguous method calls.
sourceparameter can now be a value type so
is nullcannot be used. The
==operator has to be used instead.
TEnumeratorcan be a value type not defined as
enumeratorfield cannot now be defined as
readonly. Otherwise, the compiler will make defensive copies when its methods are called, breaking the enumeration behavior. This implies that the
Enumeratorstruct also cannot be defined as
readonlybut that’s not a big issue.
Note: You can find in the repository an optimized version of the above code where
SelectReadOnlyList.Enumeratoruses the indexer to enumerate the source sequence instead of using an instance of the source enumerator.
Generic types inference
When using the above code, I got this error:
error CS0411: The type arguments for method ‘Enumerable.Select<TEnumerable, TEnumerator, TSource, TResult>(TEnumerable, Func<TSource, TResult>)’ cannot be inferred from the usage. Try specifying the type arguments explicitly.
A thing I’ve always loved about the C# compiler is that it’s very good at inferring types. Most of the time I don’t even notice I’m using methods with generics. Unfortunately, this is not true when we start using this type of constraints. This is a known issue…
But, even if this issue would be fixed in a future version of the compiler, it would be able to infer the type of the enumerable but not the type of the enumerator. It doesn’t have access to that information. A new
IEnumerable interface that contains the type of enumerator would be required. That’s exactly what Ben Adams defines on his benchmark. Unfortunately, this means that all collections would have to be refactored. While there’s no new .NET framework that includes this feature, I come up with an alternative solution, define an overload for each collection in the .NET framework that defines a value type enumerator.
Here’s an extract of the code:
- When applied to a
List<T>, the third overload is used.
- If it’s not any of the known collections, but implements
IReadOnlyList<T>, then the second overload is used.
- Otherwise, the first overload is used…
Low-level enumeration interfaces
Up until now, I’ve been assuming that the highest-level enumeration interface is used and we can take advantage of its features. If everybody followed the rules I stated in my previous article on .NET enumeration, this wouldn’t be an issue but unfortunately, it’s not always the case.
So, what happens when a type is cast to a lower-level interface? e.g. the
Select() operation is performed on a
List<T> but was called from a reference to it of type
IEnumerable<T>. With the current implementation, it will use the operation defined for
IEnumerable<T> which may not be the best performant for
The use of generics to avoid boxing makes this issue much more complex. I’m still researching possible solutions so I’ll leave it for a future article.
So, how does this implementation of the
Select() operation compares to the one in LINQ?
The following graphs show the mean time to calculate, using
foreach, the sum of a sequence with 0, 100 and 10000 integer items:
It’s also interesting to check the performance of applying a Count() operation after a Select() operation:
Notice that it has complexity O(1) when the source is a
List<T> or an array.
I still have to figure out if it’s possible to reduce the time for an empty enumerable that is more than twice it takes in LINQ.
Due to the limitations on inference with constrained interfaces, I struggled some time to find a solution that is usable. I think this one is quite acceptable, fully compliant with the current LINQ API and with better performance.