LINQ Internals: Speed Optimizations

Antão Almada
Aug 29, 2023

Behind the elegant facade of LINQ lies a meticulously engineered core designed for more than just simplicity. This article delves into the technical intricacies of LINQ, focusing on the “speed optimizations” that enhance its execution efficiency. As we’ll see, these optimizations also come with shortcomings.

LINQ

LINQ is short for “Language Integrated Query”. Its name comes from the ability to write queries directly in C#:

var query =
    from person in people
    where person.Age > 20
    select person.Name;

The advantage is that all the query components can be validated by the compiler at compile time, instead of having the application fail at runtime. Autocompletion is also supported.

SharpLab is a great tool to see what the compiler actually does to your code. You can see there that the query is actually converted to something equivalent to the following:

var query = Enumerable.Select(
    Enumerable.Where(people, person => person.Age > 20),
    person => person.Name);

Enumerable is a static class declared in the System.Linq namespace. This means all its methods are static, including the Select() and Where() methods used in this query.

NOTE: Enumerable is a huge class with hundreds of methods. It is actually a great case for the use of the partial keyword, which allows the class to be broken up into multiple files. Feel free to check its source code at https://github.com/dotnet/dotnet/tree/main/src/runtime/src/libraries/System.Linq/src/System/Linq

All the methods in Enumerable are actually declared as extension methods for IEnumerable<T>. This means that the query can be rewritten as follows:

var query = people
    .Where(person => person.Age > 20)
    .Select(person => person.Name);

This makes the query easier to understand: people is a collection that implements IEnumerable<Person>; Where() applies a predicate function and returns an instance of an object that implements IEnumerable<Person>; Select() then applies a selector function and returns an instance of another object that implements IEnumerable<string>. This means that query keeps a reference to the IEnumerable<string> returned by Select().
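A quick sketch of this behavior (assuming a simple Person record, which is my own illustration): the query only holds references to the composed enumerables, and nothing is executed until it is enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record Person(string Name, int Age);

static class DeferredExecutionSample
{
    public static void Main()
    {
        var people = new List<Person> { new("Ana", 25), new("Rui", 15) };

        // No enumeration happens here; query only keeps a reference
        // to the enumerable returned by Select().
        var query = people
            .Where(person => person.Age > 20)
            .Select(person => person.Name);

        // Adding to the source before enumerating affects the result,
        // showing that the query executes lazily.
        people.Add(new("Maria", 40));

        foreach (var name in query)
            Console.WriteLine(name); // prints "Ana" then "Maria"
    }
}
```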

The important thing to get here is that LINQ is supported by the methods declared in the Enumerable class. The performance of LINQ is defined by the performance of these methods. In this article I’ll try to explain many of the tricks used in LINQ to improve its performance.

IQueryable<T>

LINQ actually provides one other query mechanism, supported by the IQueryable<T> interface, also defined in the System.Linq namespace. In this case, the compiler converts the query into an expression tree, which a provider translates, at runtime, into a query that some other engine can execute.

You may find several “LINQ to SQL” providers that convert the query into an SQL query that can be executed by a database. You may also find other providers like LinqToExcel, LinqToTwitter, LinqToCsv, LINQ-to-BigQuery, LINQ-to-GameObject-for-Unity, ElasticLINQ, GpuLinq and many more. These can execute the same query in very different engines.

When using a provider, the query is executed somewhere else and the performance depends on many factors. For this reason, this type of query execution is out of scope for this article.
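Still, it’s useful to see the mechanism itself. A minimal illustration, using the in-memory provider that AsQueryable() supplies (the Person record is my own assumption): Queryable.Where and Queryable.Select take Expression<Func<...>> parameters, so the query is stored as an inspectable expression tree rather than compiled delegates:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

record Person(string Name, int Age);

static class QueryableSample
{
    public static void Main()
    {
        var people = new List<Person> { new("Ana", 25) };

        // The lambdas are compiled into expression trees, not delegates.
        IQueryable<string> query = people
            .AsQueryable()
            .Where(person => person.Age > 20)
            .Select(person => person.Name);

        // A provider can inspect and translate this tree at runtime.
        Console.WriteLine(query.Expression);
    }
}
```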

Implicit enumerator collapse

LINQ operations are composable, meaning that the output of an operation can be the input of another. We’ve already seen above that a Select() can be used after a Where().

The implementation of Where() and Select(), without any optimizations, is something similar to the following:

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    foreach (var item in source)
    {
        if (predicate(item))
            yield return item;
    }
}

public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    foreach (var item in source)
    {
        yield return selector(item);
    }
}

Each of the operations returns an enumerable. This means that, to traverse the result of their composition, an instance of each enumerator has to be created; each item is “pulled” from one enumerator, which “pulls” it from another, which “pulls” it from the source enumerator. Composition of operations is one of the big advantages of using LINQ but also one of its biggest performance issues. Fortunately, this is avoided in some cases.

The use of Select() after Where() is a very common usage pattern. To improve performance, the enumerable returned by Where() has an overload for Select() that returns an enumerable that takes a predicate and a selector as parameters. This is equivalent to executing the following:

public static IEnumerable<TResult> WhereSelect<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate,
    Func<TSource, TResult> selector)
{
    foreach (var item in source)
    {
        if (predicate(item))
            yield return selector(item);
    }
}

It applies both the predicate and the selector in one single foreach loop. This means that there’s one less layer of enumerators to “pull” the items which results in better performance.
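You can observe the collapse by inspecting the runtime type of the composed query. The exact type name is an internal implementation detail that may change between .NET versions, so the example below only asserts on the enumerated values:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CollapseSample
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };

        var query = numbers
            .Where(n => n % 2 == 0)
            .Select(n => n * 10);

        // A single combined iterator, not a Select enumerable wrapping
        // a Where enumerable. (Internal name, e.g. "WhereSelectListIterator`2".)
        Console.WriteLine(query.GetType().Name);

        foreach (var n in query)
            Console.WriteLine(n); // prints 20 then 40
    }
}
```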

NOTE: Check my other article “Efficient Data Processing: Leveraging C#’s foreach Loop” to understand how foreach works.

The same happens when you have two Where() or two Select() operations in a row. It would be equivalent to executing one of the following:

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate1,
    Func<TSource, bool> predicate2)
    => source.Where(item => predicate1(item) && predicate2(item));

public static IEnumerable<TResult> Select<TSource, TMiddle, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TMiddle> selector1,
    Func<TMiddle, TResult> selector2)
    => source.Select(item => selector2(selector1(item)));

All these enumerator optimizations happen implicitly. You don’t need to change anything in your code for them to happen.

Explicit enumerator collapse

There are cases where you have to explicitly use a certain overload so that enumerators are collapsed. That’s the case for operations like First(), FirstOrDefault(), Single(), SingleOrDefault(), Last(), LastOrDefault(), Any(), and Count(). All these operations have an overload that takes a predicate as a parameter. It has exactly the same result as preceding the parameterless overload with a Where() using that same predicate. This means that source.Where(predicate).Count() returns the same as source.Count(predicate), but the second may have better performance.
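For example, both of the following produce the same count, but the second avoids the extra Where() enumerator:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CountSample
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        // Two enumerators: one for Where(), one pulled by Count().
        var composed = numbers.Where(n => n > 2).Count();

        // One enumerator: Count() applies the predicate itself.
        var collapsed = numbers.Count(n => n > 2);

        Console.WriteLine(composed == collapsed); // prints True (both are 3)
    }
}
```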

Here’s a possible implementations, without other optimizations, for the Count() operation:

public static int Count<TSource>(
    this IEnumerable<TSource> source)
{
    var count = 0;
    using (var enumerator = source.GetEnumerator())
    {
        checked
        {
            while (enumerator.MoveNext())
                count++;
        }
    }
    return count;
}

public static int Count<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    var count = 0;
    foreach (var item in source)
    {
        checked
        {
            if (predicate(item))
                count++;
        }
    }
    return count;
}

The second overload filters and counts in one single foreach loop. This saves the extra enumerator that a separate Where() would introduce.

Please note that implicit collapse will not work for these overloads.

Source types

All LINQ methods are extension methods for IEnumerable<T>. As I already explained in several of my previous articles, IEnumerable<T> is the worst case scenario in terms of performance when traversing a collection. There are several collection types that provide much better performance.

The LINQ library assumes that a source collection may have been cast to a different type. For example, a List<T> can be cast to IEnumerable<T>. Once that happens, the compile-time type no longer exposes the Count property, so it cannot be called explicitly to find how many items the collection contains.

As an example, to find if a collection is empty, we should use the Any() operation. Its implementation, without optimizations, is as follows:

public static bool Any<TSource>(this IEnumerable<TSource> source)
{
    using (var enumerator = source.GetEnumerator())
    {
        return enumerator.MoveNext();
    }
}

It has to create an instance of the enumerator and then call the method MoveNext(). If the method returns true, the collection contains at least one item.

Many collections, like List<T>, internally store the number of items. This can usually be retrieved using the Length property, in the case of arrays and spans, or the Count property, in other cases. These collections usually implement the interfaces IReadOnlyCollection<T> and ICollection<T>, which provide the Count property. Calling these properties should be faster than using the enumerator.

NOTE: None of the LINQ operations check for the types IReadOnlyCollection<T> and IReadOnlyList<T>. As I explained in my previous articles, collections that implement these interfaces also have to implement ICollection<T> and IList<T>, respectively, so that LINQ can optimize their traversal.

If you check the implementation of Any(), you’ll find that it calls an internal method, TryGetNonEnumeratedCount(), to find out if the collection provides a Count property. This method checks if the source collection is null and whether it implements ICollection<T>, IIListProvider<T>, or the non-generic ICollection. If the method returns true, Any() uses the Count property; otherwise it falls back to the enumerator. This is all done at runtime, consuming CPU time.
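Since .NET 6 there is also a public counterpart, Enumerable.TryGetNonEnumeratedCount(), that exposes the same check, so you can perform it in your own code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NonEnumeratedCountSample
{
    public static void Main()
    {
        IEnumerable<int> list = new List<int> { 1, 2, 3 };
        IEnumerable<int> lazy = Enumerable.Range(0, 3).Where(n => n > 0);

        // Succeeds without enumerating: List<T> implements ICollection<T>.
        Console.WriteLine(list.TryGetNonEnumeratedCount(out var count)); // True
        Console.WriteLine(count); // 3

        // Fails: a Where() enumerable can't know its count up front.
        Console.WriteLine(lazy.TryGetNonEnumeratedCount(out _)); // False
    }
}
```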

NOTE: Iterator<T>, IIListProvider<T> and IPartition<T> are internal interfaces that only the internal enumerables can implement, so only these can take advantage of these optimizations.

This initialization time is mostly noticeable in operations like Any() or when the collections have very few items. For collections with a large amount of items, this initialization time is well worth it because saving a small amount of time on each item will result in massive savings in time for the full traversal of the collection.

These optimizations present a lot of issues. First, not all methods have them; and the ones that do are not consistent, and none of it is documented. You have to check the source code of each operation and each combination.

Value-type enumerators

Where() uses a custom WhereListIterator<T> for the case where the source is a List<T>. This “iterator” guarantees that the value-type enumerator provided by List<T> is not boxed.

NOTE: Check my other article “Performance of value-type vs reference-type enumerators in C#” to understand why value-type enumerators are important.

On the other hand, all the collections provided by .NET have a value-type enumerator, but LINQ doesn’t optimize for them, as this solution is not generic; it only applies to List<T>.

Also, LINQ doesn’t provide any value-type enumerator of its own, as all operations that return an enumerable return IEnumerable<T>, not an explicit enumerator type.
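A quick sketch of the boxing issue: List<T> exposes a struct enumerator directly, but once the list is accessed through IEnumerable<T>, GetEnumerator() returns the interface and the struct is boxed onto the heap:

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorSample
{
    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };

        // Direct call: returns the value-type List<int>.Enumerator, no allocation.
        List<int>.Enumerator direct = list.GetEnumerator();
        Console.WriteLine(direct.GetType().IsValueType); // True

        // Through the interface: the same struct, but boxed into an
        // IEnumerator<int> reference, causing a heap allocation.
        IEnumerable<int> asInterface = list;
        IEnumerator<int> boxed = asInterface.GetEnumerator();
        Console.WriteLine(boxed is List<int>.Enumerator); // True: a boxed copy
    }
}
```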

Generic math

Implementing math operations in .NET has always been a challenge because math operators could not be defined in interfaces. This resulted in LINQ needing a specific implementation of Sum() and Average() for each numeric type provided by .NET. .NET 7 introduced generic math, which allows operators to be declared in interfaces such as INumber<T>.

Unfortunately, .NET 8 makes only a faint attempt to use this new feature. It uses it internally and still provides all the overloads for the built-in numeric types. This means that third-party numeric types still cannot use the methods Sum() and Average() efficiently.

NOTE: Check my other article “Generic math in .NET” to understand how optimize for your own numeric types.
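To make the idea concrete, here’s a minimal sketch of a generic Sum() using INumber<T>. This is my own illustration, not the LINQ implementation; one method covers all built-in numeric types and any third-party type implementing the interface:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

static class GenericMathSample
{
    // A single implementation for int, long, float, double, decimal
    // and any custom type implementing INumber<T>.
    public static T Sum<T>(this IEnumerable<T> source)
        where T : INumber<T>
    {
        var sum = T.Zero;
        foreach (var item in source)
            sum += item;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3 }));     // 6
        Console.WriteLine(Sum(new[] { 1.5, 2.5 }));    // 4
    }
}
```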

SIMD

LINQ in .NET 8 finally makes use of SIMD. Unfortunately, it’s used in very limited scenarios; for example, Sum() is vectorized only when the source is an int[] or a List<int>. Only these scenarios guarantee backwards compatibility, as vectorizing the floating-point types could change results when dealing with NaN and Infinity.

NOTE: Check my other article “Single Instruction, Multiple Data (SIMD) in .NET” to understand what is SIMD and how you can use it in your own code.
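A sketch of a vectorized sum using Vector<T>, similar in spirit to what LINQ does internally but simplified and assuming .NET 6+ (for Vector.Sum) with no overflow handling:

```csharp
using System;
using System.Numerics;

static class SimdSample
{
    public static int Sum(ReadOnlySpan<int> source)
    {
        var sum = Vector<int>.Zero;
        var index = 0;

        // Process Vector<int>.Count items per iteration.
        for (; index <= source.Length - Vector<int>.Count; index += Vector<int>.Count)
            sum += new Vector<int>(source.Slice(index, Vector<int>.Count));

        // Horizontal add of the vector lanes, then the scalar remainder.
        var result = Vector.Sum(sum);
        for (; index < source.Length; index++)
            result += source[index];

        return result;
    }

    public static void Main()
        => Console.WriteLine(Sum(new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 })); // 45
}
```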

Conclusions

The abstractions provided by IEnumerable<T> make LINQ a great library, allowing data processing to be easily implemented and maintained. At the same time, it’s one of the biggest causes of lack of performance in .NET applications.

.NET performance has been improving mostly due to improvements in the JIT compiler. The framework has also been improving with new APIs. Yet, these cannot be used for the sake of keeping LINQ 100% backwards compatible. Many of the internal optimizations are inconsistent with the latest developments.

There are several alternative implementations to the official LINQ. I’ve been tracking and benchmarking these for several years. You can find the latest benchmarks at https://github.com/NetFabric/LinqBenchmarks
