Enumeration in .NET IV — Finding an item

Antão Almada

4 min readJul 23, 2018

This is part of a series of articles:

Enumeration in .NET
Enumeration in .NET II — Count()
Enumeration in .NET III — Enumerable.Empty<T>()
Enumeration in .NET IV— Finding an item
Enumeration in .NET V — ToList() or not ToList()

Finding an item

I already mentioned in the first article of this series that IEnumerable:

doesn’t allow random access.
doesn’t give access to any other collection information.

This implies high costs when trying to find an item. This article focus on the methods First() and Single(), commonly used to search for an item, and how its real cost can be hidden.

First() and Single()

First() and Single() are extension methods for IEnumerable<T> that return the first element of a sequence. The sequence is usually filtered using an Where() operation and we are interested only in the first item.

Both methods throw an exception when the sequence is empty. Single() also throws an exception if the sequence contains more than one element. Their implementation is very simple:

public static T First<T>(this IEnumerable<T> enumerable)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        if(!enumerator.MoveNext())
          throw new InvalidOperationException("Sequence contains no elements");

        return enumerator.Current;
    }
}

public static T Single<T>(this IEnumerable<T> enumerable)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        if(!enumerator.MoveNext())
          throw new InvalidOperationException("Sequence contains no elements");

        var first = enumerator.Current;

        if(enumerator.MoveNext())
          throw new InvalidOperationException("Sequence contains more than one element");

        return first;
    }
}

NOTE: The LINQ implementation contains optimizations for certain scenarios but I’m not going to consider them. As I explained in a previous article, these are misleading.

Both methods get an instance of IEnumerator and call MoveNext(). If it returns false, it means it’s empty so, they throw an InvalidOperationException, otherwise, First() simply returns the value.

Single() calls MoveNext() a second time to check if there are more elements, throwing an InvalidOperationException if it returns true. This makes Single() more expensive than First().

You should call Single() only if you really need to check if the sequence has more than one element.

FirstOrDefault() and SingleOrDefault()

Throwing an exception is a very expensive operation. FirstOrDefault() and SingleOrDefault() are alternatives that return the default value when the sequence is empty. Checking if the returned value is the default is much cheaper than catching the exception.

If empty sequences are an expected result then you should call FirstOrDefault() or SingleOrDefault() instead.

Their implementation is very similar to the previous methods but returning default instead of throwing:

public static T FirstOrDefault<T>(this IEnumerable<T> enumerable)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        if(!enumerator.MoveNext())
          return default;

        return enumerator.Current;
    }
}

public static T SingleOrDefault<T>(this IEnumerable<T> enumerable)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        if(!enumerator.MoveNext())
          return default;

        var first = enumerator.Current;

        if(enumerator.MoveNext())
          throw new InvalidOperationException("Sequence contains more than one element");

        return first;
    }
}

None of the methods loops through the sequence which means they have O(1) complexity. This should be good but, not so quickly…

Where()

All the methods from above have a second overload that requires a predicate function. This is a more efficient equivalent to calling Where() followed by any of the previous methods as only one enumerator is created. The predicate argument is the filter function that would have been passed to the Where() method. Check at SharpLab.io for a working code example which shows that both options have the same behavior.

Their implementations can be the following:

public static T First<T>(this IEnumerable<T> enumerable, Func<T, bool> predicate)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        while(enumerator.MoveNext())
        {
            var current = enumerator.Current;
            if (predicate(current))
                return current;
        }
        throw new InvalidOperationException("Sequence contains no elements");
    }
}

public static T Single<T>(this IEnumerable<T> enumerable, Func<T, bool> predicate)
{
    using(var enumerator = enumerable.GetEnumerator())
    {
        while(enumerator.MoveNext())
        {
            var current = enumerator.Current;
            if (predicate(current))
            {
                // found first, keep going until end or find second
                while (enumerator.MoveNext())
                {
                    if (predicate(enumerator.Current))
                        throw new InvalidOperationException("Sequence contains more than one element");
                }
                return current;
            }
        }
        throw new InvalidOperationException("Sequence contains no elements");
    }
}

The methods now have to loop through the sequence until predicate() returns true. Single() has to keep going through the sequence to find if there’s a second item that makes predicate() return true.

These methods have a O(n) complexity. For First(), n is the position of the first item for which predicate() returns true, while for Single(), n is the position of the second item for which predicate() returns true. Worst case scenario for both, n is the length of the sequence.

Searching for an element with an unique identifier, is the usual scenario for using Single(). Actually, this is the worst case scenario. Even if the element is the first one on the sequence, it will go up til the end of the sequence just to prove that it’s unique…

When using Single() or SingleOrDefault(), you may be paying a huge performance penalty to validate data when you simply want to query it. Unless you are unit testing, you should validate it elsewhere.

This is the same complexity for when using the first version of these methods after an Where() operation. Each call to MoveNext() is possibly hiding many MoveNext() calls upstream. This is why it’s not enough to analyze complexity looking solely into the extension method implementation, generating some misconceptions on the performance of these methods.

First(), Single(), FirstOrDefault() and SingleOrDefault(), most of the time are associated with an Where() operation. The resulting complexity is O(n). That’s the expected complexity of searching for an item on an IEnumerable.

When T is a value-type, multiple copies of each item are made…

NOTE: These rules may not apply when using IQueryable as these operations will be performed by more efficient engines with indices and query plans, e.g. in a database.

Conclusion

To avoid O(n) complexity when getting an item, you should consider data structures that supply more efficient methods to find the elements. The Dictionary is commonly used but there are many more options. Search on the web for “data structures and algorithms” to learn more about the alternatives.

Enumeration in .NET IV — Finding an item

Finding an item

First() and Single()

FirstOrDefault() and SingleOrDefault()

Where()

Conclusion

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Antão Almada

No responses yet

More from Antão Almada

How to use Span<T> and Memory<T>

NOTE: I highly suggest to also check how to use System.Threading.Channels. It’s a feature introduced after this article was published and…

Enumeration in .NET V — ToList(), or not ToList(), that is the question

EDIT: Updated to .NET 8 and improved content.

Single Instruction, Multiple Data (SIMD) in .NET

What is SIMD?

ImmutableArray<T> iteration performance in C#

‘foreach’ on an ImmutableArray<T> is surprisingly fast. Equivalent to an array. Here I explain how is this possible.

Recommended from Medium

Hangfire in .NET

1. What is Hangfire?

Secure and Structured Logging in .NET: Best Practices with Serilog to Prevent Sensitive Data Leaks

When I started implementing logging in .NET, I realized the risk of unintentionally exposing sensitive data. That’s when I discovered the…

Filters vs. Middleware in .NET Core: Key Differences, Use Cases, and Top Interview Questions

Introduction

Seamless File Management in ASP.NET Core: Azure Blob Storage with Configurable Local Mode

Managing file uploads in a web application can be a challenging task, especially when balancing between development environments and…

Implementing Health Checks in .NET 8

Health checks are crucial for monitoring the status of your applications and services. They provide a quick and automated way to verify…

Using Singleton in Multithreading in .NET

A singleton is a class designed to create only one instance of itself and typically provides an easy way to access that instance.