NetFabric.Hyperlinq — Generation Operations

Improving performance of LINQ generation operations.

Beach Sand by aalmada

Introduction

The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq

Generation Operations

In this category we can find the following operations:

The last two operations are part of the Interactive Extensions (Ix for short). These are developed by the Reactive Extensions team and are available as a NuGet package.

Note: I’m going to ignore DefaultIfEmpty for the moment. Unlike the other operations in this category, it converts an enumerable into another one.

I copied the descriptions from the documentation. The terms collection, sequence and enumerable are used loosely as all these operations return IEnumerable<T>. There’s no ICollection<T> or IReadOnlyCollection<T> involved.

Note: I use the term enumerable for entities that have a public GetEnumerator() method and the term enumerator for entities that have a MoveNext() method and a Current property. Not only implementation of IEnumerable and IEnumerator. This is aligned with the foreach behavior.

What these operations differ from the framework collections (Array, List, Dictionary, HashSet, etc.) is that they don’t allocate memory to store all its items, these are generated dynamically using deferred execution. Meaning that, the items are generated when MoveNext() is called explicitly or, implicitly by aforeach or an aggregation operation.

I’m going to start this series of articles by the generation operations as these allow simpler optimizations and can be used as the source of values for all other operations.

Click/tap on the following links to find my implementations:

Enumerables and enumerators

The objective of this project is exactly to explore if it’s possible to improve LINQ’s performance by replacing these patterns with value types.

For each operation and its overrides, I created a value type enumerable. I like to think of the enumerables as factories of enumerators. To separate concerns, I keep enumerable and enumerator implementations as separate structs, keeping the close relation by declaring the enumerator as an inner struct. (If there’s an advantage on having a single struct, please let me know!)

GetEnumerator() returns a new instance of the enumerator using its type, not IEnumerator<T>, avoiding boxing of the enumerator.

The interfaces IEnumerable<T> and IEnumerator<T> are implemented explicitly so that these implementations can also be used where required.

Readonly struct

Enumerators keep the state of the enumeration which usually mutates when MoveNext() is called so, they cannot always be declared as readonly. In some special cases there is no mutation. That’s the case for EmptyEnumerable<TSource>.Enumerator and RepeatEnumerable<TResult>.Enumerator so, these are also declared as readonly.

The use of the keyword in all these cases resulted in major improvements in some of the benchmarking scenarios.

IReadOnlyCollection and IReadOnlyList

Interestingly, I found that the enumerables of most generation operations can implement IReadOnlyList<T>. Count is well known and the value returned by the indexer can be inferred from the index parameter.

This means that the Count property can be used on the result of these operations, which has O(1) complexity, instead of the Count() operation, which has O(n) complexity.

Note: Although Count() can be O(1) for some cases, it’s not the case for generation operations. It doesn’t change even for this new implementation as Count() still hasn’t been updated to be aware of IReadOnlyCollection<T>. Like I mentioned in a previous article, it’s better to expose these as public API than as internal runtime optimizations.

The implementation of IReadOnlyList<T> also means that random access and for loops are now supported.

Benchmarks

The graphs show the mean time that it takes to calculate the sum. Shorter is better.

The new implementation results in major performance improvements.

The new implementation is only slower for Range and Repeat, when using foreach and when empty. This is because the original implementation, by returning an interface, can return different implementations given the parameter values. When count is 0, it returns Empty() which is more efficient. This should not be a big issue as empty enumerables are less frequent and its MoveNext() method is called only once.

Conclusions

Still, it shows that these small changes in the code can make a huge difference in the performance.

Principal Engineer @ Farfetch - Future Retail Lab https://about.me/antao.almada