Enumeration in .NET V — ToList() or not ToList()?

Plaza de Armas, Seville by aalmada


ToList()

I very frequently see a ToList() at the end of every LINQ query. Most of the time it is not necessary and it can have a huge impact on performance.

Let’s analyze a small example:

This code writes to the console the even numbers between 0 and 10. You can see in SharpLab.io that it does work.
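The original listing is not reproduced here, so the following is only a minimal sketch of what such a query could look like. The class and method names are my own, and I am assuming the sequence comes from Enumerable.Range(0, 10), which yields 0 to 9.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EvenNumbersWithToList
{
    // Hypothetical helper: filters the even numbers and materializes them with ToList().
    public static List<int> EvenNumbers(int count) =>
        Enumerable.Range(0, count)
            .Where(number => number % 2 == 0)
            .ToList();

    static void Main()
    {
        // Enumerates the materialized list and writes each item to the console.
        foreach (var number in EvenNumbers(10))
            Console.WriteLine(number);
    }
}
```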

Let’s expand the ToList() call into equivalent code:

NOTE: The ToList() and ToArray() implementations have some great optimizations, but you won’t know what triggers them unless you look into their source code.

Check in SharpLab.io that the output is the same. But is this what you really wanted? ToList() hides an extra List<T> allocation, one more foreach loop and a copy of each element into the list.
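The three hidden costs listed above can be made visible with a hand-written expansion. This is a simplified sketch under my own naming, not the actual ToList() implementation, which takes optimized paths for some source types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ExpandedToList
{
    // Hypothetical helper: does by hand roughly what ToList() does in the general case.
    public static List<int> EvenNumbers(int count)
    {
        var query = Enumerable.Range(0, count)
            .Where(number => number % 2 == 0);

        var list = new List<int>();      // the extra List<T> allocation
        foreach (var number in query)    // the extra foreach loop
            list.Add(number);            // the copy of each element into the list
        return list;
    }

    static void Main()
    {
        // A second loop is still needed to write the items to the console.
        foreach (var number in EvenNumbers(10))
            Console.WriteLine(number);
    }
}
```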

Actually, the ToList() is unnecessary for this case…

Check in SharpLab.io that the code without ToList() outputs exactly the same result.
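As a sketch of the version without the conversion (again with hypothetical names and the same Enumerable.Range(0, 10) assumption), the query can be enumerated directly. The items are produced lazily, one at a time, and no list is allocated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EvenNumbersWithoutToList
{
    // Hypothetical helper: returns the lazy query itself, nothing is materialized here.
    public static IEnumerable<int> EvenNumbers(int count) =>
        Enumerable.Range(0, count)
            .Where(number => number % 2 == 0);

    static void Main()
    {
        // A single loop: each item is filtered and written as it is produced.
        foreach (var number in EvenNumbers(10))
            Console.WriteLine(number);
    }
}
```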

Performance

Running the code on BenchmarkDotNet, for ranges of 0, 500 and 1000 elements, shows the following results.

NOTE: For the benchmark I calculate the total sum of the sequence elements, instead of outputting to the console.

NOTE: ToArray() is similar to ToList() but returns an array instead of a list, so I included it in the benchmarks.
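The benchmark code itself is not reproduced here. A rough sketch of the three variants being compared, following the two notes above, might look like this. The class and method names are hypothetical, and the BenchmarkDotNet harness is deliberately left out so that the sketch runs on its own.

```csharp
using System;
using System.Linq;

static class SumVariants
{
    // Sums the even numbers by enumerating the query directly.
    public static int SumNoConversion(int count)
    {
        var sum = 0;
        foreach (var value in Enumerable.Range(0, count).Where(n => n % 2 == 0))
            sum += value;
        return sum;
    }

    // Same sum, but the query is first copied into a List<int>.
    public static int SumToList(int count)
    {
        var sum = 0;
        foreach (var value in Enumerable.Range(0, count).Where(n => n % 2 == 0).ToList())
            sum += value;
        return sum;
    }

    // Same sum, but the query is first copied into an int[].
    public static int SumToArray(int count)
    {
        var sum = 0;
        foreach (var value in Enumerable.Range(0, count).Where(n => n % 2 == 0).ToArray())
            sum += value;
        return sum;
    }

    static void Main()
    {
        // The three variants must agree for every range size used in the article.
        foreach (var count in new[] { 0, 500, 1000 })
            Console.WriteLine($"{count}: {SumNoConversion(count)} {SumToList(count)} {SumToArray(count)}");
    }
}
```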

Not surprisingly, all the implementations are O(n), meaning the processing time increases linearly with the number of items in the sequence.

Surprisingly, ToArray() is faster than not using any conversion. It must be triggering one of the optimizations.

Memory is where not using a conversion makes a huge difference. It’s not possible to optimize the conversions further in this respect, as the data has to be in memory. That’s what they’re for…

Conversion requires heap memory allocations, with the amount required increasing directly with the number of elements in the sequence.
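If you want to see this without a full benchmark, here is a rough sketch that compares the bytes allocated by each approach using GC.GetAllocatedBytesForCurrentThread(). The class and helper names are my own, and the exact numbers will vary with the runtime version, so treat the output as indicative only.

```csharp
using System;
using System.Linq;

static class AllocationDemo
{
    public static long SumNoConversion(int count)
    {
        long sum = 0;
        foreach (var value in Enumerable.Range(0, count).Where(n => n % 2 == 0))
            sum += value;
        return sum;
    }

    public static long SumToList(int count)
    {
        long sum = 0;
        foreach (var value in Enumerable.Range(0, count).Where(n => n % 2 == 0).ToList())
            sum += value;
        return sum;
    }

    // Hypothetical helper: bytes allocated on this thread by one call to 'sum'.
    public static long AllocatedBytes(Func<int, long> sum, int count)
    {
        sum(count); // warm-up, so one-time allocations are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        sum(count);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        foreach (var count in new[] { 0, 500, 1000 })
        {
            Console.WriteLine(
                $"{count} items: no conversion = {AllocatedBytes(SumNoConversion, count)} bytes, " +
                $"ToList() = {AllocatedBytes(SumToList, count)} bytes");
        }
    }
}
```

Without the conversion, only the enumerators are allocated, so that figure should stay roughly constant while the ToList() figure grows with the range.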

Heap allocations will eventually trigger garbage collection. If you allocate small amounts and keep them for a short period of time, they will be handled by the Gen 0 collection, which is fast but not free. If you allocate a big object (85,000 bytes or more), it goes directly into the LOH (Large Object Heap), which is only collected together with Gen 2 and is not compacted by default, so it can become fragmented and slow.
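A small sketch can show where the result of a conversion ends up. It relies on GC.GetGeneration(), which reports LOH objects as belonging to the highest generation. The class name and the element counts are my own choice: 1,000 ints take about 4 KB, while 25,000 ints take about 100 KB, which is above the LOH threshold.

```csharp
using System;
using System.Linq;

static class LargeObjectHeapDemo
{
    // Hypothetical helper: materializes a range with ToArray() and reports
    // the GC generation the resulting array was placed in.
    public static int GenerationOfArray(int elementCount)
    {
        int[] array = Enumerable.Range(0, elementCount).ToArray();
        return GC.GetGeneration(array);
    }

    static void Main()
    {
        // A freshly allocated small array is normally reported as generation 0.
        Console.WriteLine($"1,000 ints: generation {GenerationOfArray(1_000)}");

        // A large array is allocated on the LOH, which is reported as GC.MaxGeneration.
        Console.WriteLine($"25,000 ints: generation {GenerationOfArray(25_000)}");
    }
}
```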

Conclusion

ToList() and ToArray() allocate heap memory, which triggers the GC more often and risks an OutOfMemoryException when the project scales up.

You should only use these when strictly necessary!

Principal Engineer @ Farfetch - Future Retail Lab https://about.me/antao.almada
