NetFabric.Hyperlinq — Select Operation

9 min readFeb 5, 2019

Improving performance of LINQ’s Select operation.

Introduction

This is a follow up on my exploration into improving LINQ performance. This is a work in progress. Go to my previous article to find what triggered this.

The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq

Select Operation

Select() is a projection operation, it creates a new sequence by applying a transformation function to each element of the original sequence.

Note: In other programming languages, it’s usually called map().

Its implementation is very simple. Given a IEnumerable<TSource> and a selector function that transforms a TSource into a TResult, it applies the transformation to each element of the source sequence and returns them as an IEnumerable<TResult>:

public static partial class Enumerable
{
    public static IEnumerable<TResult> Select<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector) 
    {
        if(source is null) throw new ArgumentNullException(nameof(source));
        if(selector is null) throw new ArgumentNullException(nameof(selector));

        return PerformSelect();

        IEnumerable<TResult> PerformSelect()
        {
            foreach (var item in source)
            {
                yield return selector(item);
            }
        }
    }
}

Note: This implementation uses a local function to avoid a side-effect caused by the use of the yield keyword. This way, the parameter validation is immediately executed, while the transformation loop is lazy executed.

The implementation found in LINQ is way more complex. But, if you check on ShapLab.io the result of the expansion of the above code by the compiler, and if you remove the special cases in the LINQ version, you’ll find that they are very similar. Both return an instance of a class that implements the interfaces IEnumerable<TSource> and IEnumerator<TSource>.

Value type enumerator

As I explained in a previous article, using a value type enumerator instead, can substantially improve performance. Replacing the yield return by a hand-made enumerator, looks like this:

public static partial class Enumerable
{
    public static SelectEnumerable<TSource, TResult> Select<TSource, TResult>(
        this IEnumerable<TSource> source, 
        Func<TSource, TResult> selector) 
    {
        if(source is null) ThrowSourceNull();
        if(selector is null) ThrowSelectorNull();

        return new SelectEnumerable<TSource, TResult>(source, selector);

        void ThrowSourceNull() => throw new ArgumentNullException(nameof(source));
        void ThrowSelectorNull() => throw new ArgumentNullException(nameof(selector));
    }

    public readonly struct SelectEnumerable<TSource, TResult> : IEnumerable<TResult>
    {
        readonly IEnumerable<TSource> source;
        readonly Func<TSource, TResult> selector;
 
        internal SelectEnumerable(IEnumerable<TSource> source, Func<TSource, TResult> selector)
        {
            this.source = source;
            this.selector = selector;
        }

        public Enumerator GetEnumerator() => new Enumerator(in this);
        IEnumerator<TResult> IEnumerable<TResult>.GetEnumerator() => new Enumerator(in this);
        IEnumerator IEnumerable.GetEnumerator() => new Enumerator(in this);

        public readonly struct Enumerator : IEnumerator<TResult>
        {
            readonly IEnumerator<TSource> enumerator;
            readonly Func<TSource, TResult> selector;

            internal Enumerator(in SelectEnumerable<TSource, TResult> enumerable)
            {
                enumerator = enumerable.source.GetEnumerator();
                selector = enumerable.selector;
            }

            public TResult Current => selector(enumerator.Current);
            object IEnumerator.Current => selector(enumerator.Current);

            public bool MoveNext() => enumerator.MoveNext();

            public void Reset() => enumerator.Reset();

            public void Dispose() => enumerator.Dispose();
        }
    }
}

Things to notice:

Select() now returns an instance of the value type SelectEnumerable, without casting it to IEnumerable<T> which would cause it, and the enumerator, to be boxed.
Now, there is no need to use a local function for immediate/lazy execution separation, but methods that throw exceptions cannot be inlined by the JIT compiler. Using local functions to throw the exceptions is a good alternative, making Select() a candidate for inlining.
The public SelectEnumerable.GetEnumerator() method returns the type Enumerator, to avoid boxing of the enumerator. The other methods that cause boxing are implemented explicitly so that they are only used when the enumerable is cast to any of the interfaces derived from IEnumerable<T> that it implements.
The enumerable is passed by reference to the enumerator constructor using the in keyword. This is the equivalent to a read-only ref. Both the enumerable and the enumerator are declared as readonly struct, avoiding defensive copies made by the compiler.
Every time a new Enumerator instance is created, it creates a “twin” enumerator from the source enumerable.
The Enumerator methods simply call the methods from the source enumerator, with the exception of Current, that transforms the value by calling theselector function.

Note: This implementation still doesn’t avoid boxing of the source enumerator. I’ll leave that for later.

IReadOnlyCollection and IReadOnlyList interfaces

Select() applies the transformation to every element of the original sequence. This means that the size of the sequence does not change. It also means that it’s possible to implement random access.

Two overloads can be added for Select() and the compiler automatically uses them when it finds sequences that implement these interfaces. Here’s the implementation for IReadOnlyList<T>:

public static partial class ReadOnlyList
{
    public static SelectReadOnlyList<TSource, TResult> Select<TSource, TResult>(
        this IReadOnlyList<TSource> source, 
        Func<TSource, TResult> selector) 
    {
        if(source == null) ThrowSourceNull();
        if(selector == null) ThrowSelectorNull();

        return new SelectReadOnlyList<TSource, TResult>(source, selector);

        void ThrowSourceNull() => throw new ArgumentNullException(nameof(source));
        void ThrowSelectorNull() => throw new ArgumentNullException(nameof(selector));
    }

    public readonly struct SelectReadOnlyList<TSource, TResult> : IReadOnlyList<TResult>
    {
        readonly IReadOnlyList<TSource> source;
        readonly Func<TSource, TResult> selector;
 
        internal SelectReadOnlyList(IReadOnlyList<TSource> source, Func<TSource, TResult> selector)
        {
            this.source = source;
            this.selector = selector;
        }

        public Enumerator GetEnumerator() => new Enumerator(in this);
        IEnumerator<TResult> IEnumerable<TResult>.GetEnumerator() => new Enumerator(in this);
        IEnumerator IEnumerable.GetEnumerator() => new Enumerator(in this);

        public int Count => source.Count;

        public TResult this[int index] => selector(source[index]);

        public readonly struct Enumerator : IEnumerator<TResult>
        {
            readonly IEnumerator<TSource> enumerator;
            readonly Func<TSource, TResult> selector;

            internal Enumerator(in SelectReadOnlyList<TSource, TResult> enumerable)
            {
                enumerator = enumerable.source.GetEnumerator();
                selector = enumerable.selector;
            }

            public TResult Current => selector(enumerator.Current);
            object IEnumerator.Current => selector(enumerator.Current);

            public bool MoveNext() => enumerator.MoveNext();

            public void Reset() => enumerator.Reset();

            public void Dispose() => enumerator.Dispose();
        }
    }
}

.NET does not support inheritance for value types so new definitions for the enumerable and the enumerator have to be added. The only difference is that this enumerable now implements a Count property and an indexer. The indexer transforms the value by calling theselector function.

This implementation allows the direct use of the Count property, which has complexity O(1), instead of the Count() operation, which has complexity O(n).

Note: There can also be an overload for the Count() operation, for when the source is IReadOnlyCollection<T>, that simply returns Count.

This also allows the use of for loops instead of foreach, which can be seen in the benchmarks from the previous article, has better performance, even when using value type enumerators.

Generics interface constraints

Generics constraints allow the use of interfaces without boxing the value types. To avoid boxing the source enumerator, the code has to be changed to use interface constraints:

using System;
using System.Collections;
using System.Collections.Generic;

public static partial class ReadOnlyList
{
    public static SelectReadOnlyList<TEnumerable, TEnumerator, TSource, TResult> Select<TEnumerable, TEnumerator, TSource, TResult>(
        this TEnumerable source, 
        Func<TSource, TResult> selector) 
        where TEnumerable : IReadOnlyList<TSource>
        where TEnumerator : IEnumerator<TSource>
    {
        if (source == null) ThrowSourceNull();
        if (selector is null) ThrowSelectorNull();

        return new SelectReadOnlyList<TEnumerable, TEnumerator, TSource, TResult>(source, selector);

        void ThrowSourceNull() => throw new ArgumentNullException(nameof(source));
        void ThrowSelectorNull() => throw new ArgumentNullException(nameof(selector));
    }

    public readonly struct SelectReadOnlyList<TEnumerable, TEnumerator, TSource, TResult> : IReadOnlyList<TResult>
        where TEnumerable : IReadOnlyList<TSource>
        where TEnumerator : IEnumerator<TSource>
    {
        readonly TEnumerable source;
        readonly Func<TSource, TResult> selector;
 
        internal SelectReadOnlyList(TEnumerable source, Func<TSource, TResult> selector)
        {
            this.source = source;
            this.selector = selector;
        }

        public Enumerator GetEnumerator() => new Enumerator(in this);
        IEnumerator<TResult> IEnumerable<TResult>.GetEnumerator() => new Enumerator(in this);
        IEnumerator IEnumerable.GetEnumerator() => new Enumerator(in this);

        public int Count => source.Count;

        public TResult this[int index] => selector(source[index]);

        public struct Enumerator : IEnumerator<TResult>
        {
            TEnumerator enumerator;
            readonly Func<TSource, TResult> selector;

            internal Enumerator(in SelectReadOnlyListEnumerable<TEnumerable, TEnumerator, TSource, TResult> enumerable)
            {
                enumerator = (TEnumerator)enumerable.source.GetEnumerator();
                selector = enumerable.selector;
            }

            public TResult Current => selector(enumerator.Current);
            object IEnumerator.Current => selector(enumerator.Current);

            public bool MoveNext() => enumerator.MoveNext();

            public void Reset() => enumerator.Reset();

            public void Dispose() => enumerator.Dispose();
        }
    }
}

Notice the changes:

Parameters of types IEnumerable<T> and IEnumerator<T> were replaced by TEnumerable and TEnumerator that are constrained to the respective interfaces by using the where keyword.
Use of different static partial classes to avoid ambiguous method calls.
source parameter can now be a value type so is null cannot be used. The == operator has to be used instead.
TEnumerator can be a value type not defined as readonly so the enumerator field cannot now be defined as readonly. Otherwise, the compiler will make defensive copies when its methods are called, breaking the enumeration behavior. This implies that the Enumerator struct also cannot be defined as readonly but that’s not a big issue.

Note: You can find in the repository an optimized version of the above code where SelectReadOnlyList.Enumerator uses the indexer to enumerate the source sequence instead of using an instance of the source enumerator.

Generic types inference

When using the above code, I got this error: error CS0411: The type arguments for method ‘Enumerable.Select<TEnumerable, TEnumerator, TSource, TResult>(TEnumerable, Func<TSource, TResult>)’ cannot be inferred from the usage. Try specifying the type arguments explicitly.

A thing I’ve always loved about the C# compiler is that it’s very good at inferring types. Most of the time I don’t even notice I’m using methods with generics. Unfortunately, this is not true when we start using this type of constraints. This is a known issue…

But, even if this issue would be fixed in a future version of the compiler, it would be able to infer the type of the enumerable but not the type of the enumerator. It doesn’t have access to that information. A new IEnumerable interface that contains the type of enumerator would be required. That’s exactly what Ben Adams defines on his benchmark. Unfortunately, this means that all collections would have to be refactored. While there’s no new .NET framework that includes this feature, I come up with an alternative solution, define an overload for each collection in the .NET framework that defines a value type enumerator.

Here’s an extract of the code:

public static SelectEnumerable<IEnumerable<TSource>, IEnumerator<TSource>, TSource, TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source, 
    Func<TSource, TResult> selector) =>
        Select<IEnumerable<TSource>, IEnumerator<TSource>, TSource, TResult>(source, selector);

public static SelectReadOnlyList<IReadOnlyList<TSource>, IEnumerator<TSource>, TSource, TResult> Select<TSource, TResult>(
    this IReadOnlyList<TSource> source, 
    Func<TSource, TResult> selector) =>
        Select<IReadOnlyList<TSource>, IEnumerator<TSource>, TSource, TResult>(source, selector);

public static SelectReadOnlyList<List<TSource>, List<TSource>.Enumerator, TSource, TResult> Select<TSource, TResult>(
    this List<TSource> source, 
    Func<TSource, TResult> selector) =>
        Select<List<TSource>, List<TSource>.Enumerator, TSource, TResult>(source, selector);

When applied to a List<T>, the third overload is used.
If it’s not any of the known collections, but implements IReadOnlyList<T>, then the second overload is used.
Otherwise, the first overload is used…

Low-level enumeration interfaces

Up until now, I’ve been assuming that the highest-level enumeration interface is used and we can take advantage of its features. If everybody followed the rules I stated in my previous article on .NET enumeration, this wouldn’t be an issue but unfortunately, it’s not always the case.

So, what happens when a type is cast to a lower-level interface? e.g. the Select() operation is performed on a List<T> but was called from a reference to it of type IEnumerable<T>. With the current implementation, it will use the operation defined for IEnumerable<T> which may not be the best performant for List<T>.

You can see in the LINQ implementation of the Select() operation that the is operator is used to check if the sequence is of another type that it has a better implementation for, and then applies it.

The use of generics to avoid boxing makes this issue much more complex. I’m still researching possible solutions so I’ll leave it for a future article.

Benchmarks

So, how does this implementation of the Select() operation compares to the one in LINQ?

The following graphs show the mean time to calculate, using foreach, the sum of a sequence with 0, 100 and 10000 integer items:

It’s also interesting to check the performance of applying a Count() operation after a Select() operation:

Notice that it has complexity O(1) when the source is a Range, a List<T> or an array.

I still have to figure out if it’s possible to reduce the time for an empty enumerable that is more than twice it takes in LINQ.

Conclusions

Due to the limitations on inference with constrained interfaces, I struggled some time to find a solution that is usable. I think this one is quite acceptable, fully compliant with the current LINQ API and with better performance.