NetFabric.Hyperlinq — Generation Operations
Improving performance of LINQ generation operations.
Introduction
This is a follow up on my exploration into improving LINQ performance. This is a work in progress. Go to my previous article to find what triggered this.
The source code is available at https://github.com/NetFabric/NetFabric.Hyperlinq
Generation Operations
LINQ is a library of operations on IEnumerable<T>
. These operations can be grouped into categories. One of these categories is the generation operations that includes all the operations that create a new sequence of values.
In this category we can find the following operations:
DefaultIfEmpty<TSource>(TSource defaultValue)
— Replaces an empty collection with a default valued singleton collection.Empty<TResult>()
— Returns an empty collection.Range(int start, int count)
— Generates a collection that contains a sequence of numbers.Repeat<TResult>(TResult element, int count)
—Generates a collection that contains one repeated value a given number of times.Repeat<TResult>(TResult value)
—Generates a sequence by repeating the given value infinitely.Create<TResult>(Func<IEnumerator<TResult>> getEnumerator)
—Creates an enumerable sequence based on an enumerator factory function.
The last two operations are part of the Interactive Extensions (Ix for short). These are developed by the Reactive Extensions team and are available as a NuGet package.
Note: I’m going to ignore
DefaultIfEmpty
for the moment. Unlike the other operations in this category, it converts an enumerable into another one.
I copied the descriptions from the documentation. The terms collection, sequence and enumerable are used loosely as all these operations return IEnumerable<T>
. There’s no ICollection<T>
or IReadOnlyCollection<T>
involved.
Note: I use the term enumerable for entities that have a public
GetEnumerator()
method and the term enumerator for entities that have aMoveNext()
method and aCurrent
property. Not only implementation of IEnumerable and IEnumerator. This is aligned with theforeach
behavior.
What these operations differ from the framework collections (Array
, List
, Dictionary
, HashSet
, etc.) is that they don’t allocate memory to store all its items, these are generated dynamically using deferred execution. Meaning that, the items are generated when MoveNext()
is called explicitly or, implicitly by aforeach
or an aggregation operation.
I’m going to start this series of articles by the generation operations as these allow simpler optimizations and can be used as the source of values for all other operations.
Click/tap on the following links to find my implementations:
Enumerables and enumerators
As mentioned above, all original implementations of the generation operations return IEnumerable<T>
. LINQ uses classes derived from an internal Iterator<T>
class, while Ix uses the yield
keyword. Both result in a reference type that implements both IEnumerable<T>
and IEnumerator<T>
interfaces. GetEnumerator()
returns a “clone” of itself, with the state reset.
For each operation and its overrides, I created a value type enumerable. I like to think of the enumerables as factories of enumerators. To separate concerns, I keep enumerable and enumerator implementations as separate structs, keeping the close relation by declaring the enumerator as an inner struct. (If there’s an advantage on having a single struct, please let me know!)
GetEnumerator()
returns a new instance of the enumerator using its type, not IEnumerator<T>
, avoiding boxing of the enumerator.
The interfaces IEnumerable<T>
and IEnumerator<T>
are implemented explicitly so that these implementations can also be used where required.
Readonly struct
All enumerables are immutable so, the readonly
keyword is used on their declaration. This is a feature introduced in C# 7.2, that allows the compiler not to generate defensive copies, improving performance.
Enumerators keep the state of the enumeration which usually mutates when MoveNext()
is called so, they cannot always be declared as readonly
. In some special cases there is no mutation. That’s the case for EmptyEnumerable<TSource>.Enumerator
and RepeatEnumerable<TResult>.Enumerator
so, these are also declared as readonly
.
The use of the keyword in all these cases resulted in major improvements in some of the benchmarking scenarios.
IReadOnlyCollection and IReadOnlyList
Another objective of this project is to introduce LINQ to “new” interfaces like IReadOnlyCollection<T>
and IReadOnlyList<T>
.
Interestingly, I found that the enumerables of most generation operations can implement IReadOnlyList<T>
. Count
is well known and the value returned by the indexer can be inferred from the index
parameter.
This means that the
Count
property can be used on the result of these operations, which has O(1) complexity, instead of theCount()
operation, which has O(n) complexity.
Note: Although
Count()
can be O(1) for some cases, it’s not the case for generation operations. It doesn’t change even for this new implementation asCount()
still hasn’t been updated to be aware ofIReadOnlyCollection<T>
. Like I mentioned in a previous article, it’s better to expose these as public API than as internal runtime optimizations.
The implementation of IReadOnlyList<T>
also means that random access and for
loops are now supported.
Benchmarks
The benchmark calculates the sum of the items, using foreach
and for
loops, with a number of items of 0, 100 and 10000. (The original implementations only support foreach
).
The graphs show the mean time that it takes to calculate the sum. Shorter is better.
The new implementation results in major performance improvements.
The new implementation is only slower for Range
and Repeat
, when using foreach
and when empty. This is because the original implementation, by returning an interface, can return different implementations given the parameter values. When count
is 0, it returns Empty()
which is more efficient. This should not be a big issue as empty enumerables are less frequent and its MoveNext()
method is called only once.
Conclusions
There’s still a lot of work required to fully take advantage of these performance improvements. foreach
can use the value types without boxing but all the methods that take IEnumerable<T>
parameters need to use interface constraints.
Still, it shows that these small changes in the code can make a huge difference in the performance.