NetFabric.Hyperlinq — Select Operation

Improving performance of LINQ’s Select operation.

Tiles by aalmada

Introduction

The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq

Select Operation

Note: In other programming languages, it’s usually called map().

Its implementation is very simple. Given a IEnumerable<TSource> and a selector function that transforms a TSource into a TResult, it applies the transformation to each element of the source sequence and returns them as an IEnumerable<TResult>:

Note: This implementation uses a local function to avoid a side-effect caused by the use of the yield keyword. This way, the parameter validation is immediately executed, while the transformation loop is lazy executed.

The implementation found in LINQ is way more complex. But, if you check on ShapLab.io the result of the expansion of the above code by the compiler, and if you remove the special cases in the LINQ version, you’ll find that they are very similar. Both return an instance of a class that implements the interfaces IEnumerable<TSource> and IEnumerator<TSource>.

Value type enumerator

Things to notice:

  • Select() now returns an instance of the value type SelectEnumerable, without casting it to IEnumerable<T> which would cause it, and the enumerator, to be boxed.
  • Now, there is no need to use a local function for immediate/lazy execution separation, but methods that throw exceptions cannot be inlined by the JIT compiler. Using local functions to throw the exceptions is a good alternative, making Select() a candidate for inlining.
  • The public SelectEnumerable.GetEnumerator() method returns the type Enumerator, to avoid boxing of the enumerator. The other methods that cause boxing are implemented explicitly so that they are only used when the enumerable is cast to any of the interfaces derived from IEnumerable<T> that it implements.
  • The enumerable is passed by reference to the enumerator constructor using the in keyword. This is the equivalent to a read-only ref. Both the enumerable and the enumerator are declared as readonly struct, avoiding defensive copies made by the compiler.
  • Every time a new Enumerator instance is created, it creates a “twin” enumerator from the source enumerable.
  • The Enumerator methods simply call the methods from the source enumerator, with the exception of Current, that transforms the value by calling theselector function.

Note: This implementation still doesn’t avoid boxing of the source enumerator. I’ll leave that for later.

IReadOnlyCollection and IReadOnlyList interfaces

Two overloads can be added for Select() and the compiler automatically uses them when it finds sequences that implement these interfaces. Here’s the implementation for IReadOnlyList<T>:

.NET does not support inheritance for value types so new definitions for the enumerable and the enumerator have to be added. The only difference is that this enumerable now implements a Count property and an indexer. The indexer transforms the value by calling theselector function.

This implementation allows the direct use of the Count property, which has complexity O(1), instead of the Count() operation, which has complexity O(n).

Note: There can also be an overload for the Count() operation, for when the source is IReadOnlyCollection<T>, that simply returns Count.

This also allows the use of for loops instead of foreach, which can be seen in the benchmarks from the previous article, has better performance, even when using value type enumerators.

Generics interface constraints

Notice the changes:

  • Parameters of types IEnumerable<T> and IEnumerator<T> were replaced by TEnumerable and TEnumerator that are constrained to the respective interfaces by using the where keyword.
  • Use of different static partial classes to avoid ambiguous method calls.
  • source parameter can now be a value type so is null cannot be used. The == operator has to be used instead.
  • TEnumerator can be a value type not defined as readonly so the enumerator field cannot now be defined as readonly. Otherwise, the compiler will make defensive copies when its methods are called, breaking the enumeration behavior. This implies that the Enumerator struct also cannot be defined as readonly but that’s not a big issue.

Note: You can find in the repository an optimized version of the above code where SelectReadOnlyList.Enumerator uses the indexer to enumerate the source sequence instead of using an instance of the source enumerator.

Generic types inference

A thing I’ve always loved about the C# compiler is that it’s very good at inferring types. Most of the time I don’t even notice I’m using methods with generics. Unfortunately, this is not true when we start using this type of constraints. This is a known issue

But, even if this issue would be fixed in a future version of the compiler, it would be able to infer the type of the enumerable but not the type of the enumerator. It doesn’t have access to that information. A new IEnumerable interface that contains the type of enumerator would be required. That’s exactly what Ben Adams defines on his benchmark. Unfortunately, this means that all collections would have to be refactored. While there’s no new .NET framework that includes this feature, I come up with an alternative solution, define an overload for each collection in the .NET framework that defines a value type enumerator.

Here’s an extract of the code:

  • When applied to a List<T>, the third overload is used.
  • If it’s not any of the known collections, but implements IReadOnlyList<T>, then the second overload is used.
  • Otherwise, the first overload is used…

Low-level enumeration interfaces

So, what happens when a type is cast to a lower-level interface? e.g. the Select() operation is performed on a List<T> but was called from a reference to it of type IEnumerable<T>. With the current implementation, it will use the operation defined for IEnumerable<T> which may not be the best performant for List<T>.

You can see in the LINQ implementation of the Select() operation that the is operator is used to check if the sequence is of another type that it has a better implementation for, and then applies it.

The use of generics to avoid boxing makes this issue much more complex. I’m still researching possible solutions so I’ll leave it for a future article.

Benchmarks

The following graphs show the mean time to calculate, using foreach, the sum of a sequence with 0, 100 and 10000 integer items:

It’s also interesting to check the performance of applying a Count() operation after a Select() operation:

Notice that it has complexity O(1) when the source is a Range, a List<T> or an array.

I still have to figure out if it’s possible to reduce the time for an empty enumerable that is more than twice it takes in LINQ.

Conclusions

Principal Engineer @ Farfetch - Future Retail Lab https://about.me/antao.almada