NetFabric.Hyperlinq — Generation Operations

Antão Almada
5 min readJan 18, 2019

--

Improving performance of LINQ generation operations.

Beach Sand by aalmada

Introduction

This is a follow up on my exploration into improving LINQ performance. This is a work in progress. Go to my previous article to find what triggered this.

The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq

Generation Operations

LINQ is a library of operations on IEnumerable<T>. These operations can be grouped into categories. One of these categories is the generation operations that includes all the operations that create a new sequence of values.

In this category we can find the following operations:

The last two operations are part of the Interactive Extensions (Ix for short). These are developed by the Reactive Extensions team and are available as a NuGet package.

Note: I’m going to ignore DefaultIfEmpty for the moment. Unlike the other operations in this category, it converts an enumerable into another one.

I copied the descriptions from the documentation. The terms collection, sequence and enumerable are used loosely as all these operations return IEnumerable<T>. There’s no ICollection<T> or IReadOnlyCollection<T> involved.

Note: I use the term enumerable for entities that have a public GetEnumerator() method and the term enumerator for entities that have a MoveNext() method and a Current property. Not only implementation of IEnumerable and IEnumerator. This is aligned with the foreach behavior.

What these operations differ from the framework collections (Array, List, Dictionary, HashSet, etc.) is that they don’t allocate memory to store all its items, these are generated dynamically using deferred execution. Meaning that, the items are generated when MoveNext() is called explicitly or, implicitly by aforeach or an aggregation operation.

I’m going to start this series of articles by the generation operations as these allow simpler optimizations and can be used as the source of values for all other operations.

Click/tap on the following links to find my implementations:

Enumerables and enumerators

As mentioned above, all original implementations of the generation operations return IEnumerable<T> . LINQ uses classes derived from an internal Iterator<T> class, while Ix uses the yield keyword. Both result in a reference type that implements both IEnumerable<T> and IEnumerator<T> interfaces. GetEnumerator() returns a “clone” of itself, with the state reset.

The objective of this project is exactly to explore if it’s possible to improve LINQ’s performance by replacing these patterns with value types.

For each operation and its overrides, I created a value type enumerable. I like to think of the enumerables as factories of enumerators. To separate concerns, I keep enumerable and enumerator implementations as separate structs, keeping the close relation by declaring the enumerator as an inner struct. (If there’s an advantage on having a single struct, please let me know!)

GetEnumerator() returns a new instance of the enumerator using its type, not IEnumerator<T>, avoiding boxing of the enumerator.

The interfaces IEnumerable<T> and IEnumerator<T> are implemented explicitly so that these implementations can also be used where required.

Readonly struct

All enumerables are immutable so, the readonly keyword is used on their declaration. This is a feature introduced in C# 7.2, that allows the compiler not to generate defensive copies, improving performance.

Enumerators keep the state of the enumeration which usually mutates when MoveNext() is called so, they cannot always be declared as readonly. In some special cases there is no mutation. That’s the case for EmptyEnumerable<TSource>.Enumerator and RepeatEnumerable<TResult>.Enumerator so, these are also declared as readonly.

The use of the keyword in all these cases resulted in major improvements in some of the benchmarking scenarios.

IReadOnlyCollection and IReadOnlyList

Another objective of this project is to introduce LINQ to “new” interfaces like IReadOnlyCollection<T> and IReadOnlyList<T>.

Interestingly, I found that the enumerables of most generation operations can implement IReadOnlyList<T>. Count is well known and the value returned by the indexer can be inferred from the index parameter.

This means that the Count property can be used on the result of these operations, which has O(1) complexity, instead of the Count() operation, which has O(n) complexity.

Note: Although Count() can be O(1) for some cases, it’s not the case for generation operations. It doesn’t change even for this new implementation as Count() still hasn’t been updated to be aware of IReadOnlyCollection<T>. Like I mentioned in a previous article, it’s better to expose these as public API than as internal runtime optimizations.

The implementation of IReadOnlyList<T> also means that random access and for loops are now supported.

Benchmarks

The benchmark calculates the sum of the items, using foreach and for loops, with a number of items of 0, 100 and 10000. (The original implementations only support foreach).

The graphs show the mean time that it takes to calculate the sum. Shorter is better.

The new implementation results in major performance improvements.

The new implementation is only slower for Range and Repeat, when using foreach and when empty. This is because the original implementation, by returning an interface, can return different implementations given the parameter values. When count is 0, it returns Empty() which is more efficient. This should not be a big issue as empty enumerables are less frequent and its MoveNext() method is called only once.

Conclusions

There’s still a lot of work required to fully take advantage of these performance improvements. foreach can use the value types without boxing but all the methods that take IEnumerable<T> parameters need to use interface constraints.

Still, it shows that these small changes in the code can make a huge difference in the performance.

--

--

No responses yet