Enumeration in .NET IV — Finding an item
This is part of a series of articles:
- Enumeration in .NET
- Enumeration in .NET II —
- Enumeration in .NET III —
- Enumeration in .NET IV— Finding an item
- Enumeration in .NET V —
Finding an item
I already mentioned in the first article of this series that
- doesn’t allow random access.
- doesn’t give access to any other collection information.
This implies high costs when trying to find an item. This article focus on the methods
Single(), commonly used to search for an item, and how its real cost can be hidden.
First() and Single()
Single() are extension methods for
IEnumerable<T> that return the first element of a sequence. The sequence is usually filtered using an
Where() operation and we are interested only in the first item.
Both methods throw an exception when the sequence is empty.
Single() also throws an exception if the sequence contains more than one element. Their implementation is very simple:
NOTE: The LINQ implementation contains optimizations for certain scenarios but I’m not going to consider them. As I explained in a previous article, these are misleading.
Both methods get an instance of
IEnumerator and call
MoveNext(). If it returns
false, it means it’s empty so, they throw an
First() simply returns the value.
MoveNext() a second time to check if there are more elements, throwing an
InvalidOperationException if it returns
true. This makes
Single() more expensive than
You should call Single() only if you really need to check if the sequence has more than one element.
FirstOrDefault() and SingleOrDefault()
Throwing an exception is a very expensive operation.
SingleOrDefault() are alternatives that return the default value when the sequence is empty. Checking if the returned value is the default is much cheaper than catching the exception.
If empty sequences are an expected result then you should call FirstOrDefault() or SingleOrDefault() instead.
Their implementation is very similar to the previous methods but returning
default instead of throwing:
None of the methods loops through the sequence which means they have O(1) complexity. This should be good but, not so quickly…
All the methods from above have a second overload that requires a predicate function. This is a more efficient equivalent to calling
Where() followed by any of the previous methods as only one enumerator is created. The predicate argument is the filter function that would have been passed to the
Where() method. Check at SharpLab.io for a working code example which shows that both options have the same behavior.
Their implementations can be the following:
The methods now have to loop through the sequence until
Single() has to keep going through the sequence to find if there’s a second item that makes
These methods have a O(n) complexity. For
First(), n is the position of the first item for which
true, while for
Single(), n is the position of the second item for which
true. Worst case scenario for both, n is the length of the sequence.
Searching for an element with an unique identifier, is the usual scenario for using
Single(). Actually, this is the worst case scenario. Even if the element is the first one on the sequence, it will go up til the end of the sequence just to prove that it’s unique…
When using Single() or SingleOrDefault(), you may be paying a huge performance penalty to validate data when you simply want to query it. Unless you are unit testing, you should validate it elsewhere.
This is the same complexity for when using the first version of these methods after an
Where() operation. Each call to
MoveNext() is possibly hiding many
MoveNext() calls upstream. This is why it’s not enough to analyze complexity looking solely into the extension method implementation, generating some misconceptions on the performance of these methods.
First(), Single(), FirstOrDefault() and SingleOrDefault(), most of the time are associated with an Where() operation. The resulting complexity is O(n). That’s the expected complexity of searching for an item on an IEnumerable.
T is a value-type, multiple copies of each item are made…
NOTE: These rules may not apply when using
IQueryableas these operations will be performed by more efficient engines with indices and query plans, e.g. in a database.
To avoid O(n) complexity when getting an item, you should consider data structures that supply more efficient methods to find the elements. The
Dictionary is commonly used but there are many more options. Search on the web for “data structures and algorithms” to learn more about the alternatives.