Slicing managed arrays using Span<T>
(Updated to .NET Core 2.1 official release version)
My previous articles on Span<T> explained how it can be used for handling all types of memory allocations and for p/invoking native code. I hope they are helpful, but I understand these are not typical use cases.
Span<T> usage will be much more common with managed arrays. The .NET type system contains one that we deal with every day: System.String (string for short in C#).
A string in .NET is nothing more than an immutable array of System.Char (char for short in C#). Immutable means that, once created, its content cannot be modified. You can hold a reference to it and be sure that the string always stays the same. Otherwise, you’d have to clone it. On the downside, it means that many operations on it require memory allocations and copies.
That’s easy to understand for Concat() but “hard to swallow” for Substring(), as the characters are already lined up in memory and there is no intention to change them.
Substring() creates a new string, allocating the necessary memory and copying each character into it.
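To make the cost concrete, here is a minimal sketch of the Substring() approach. The FirstWord() helper and the sample text are made up for illustration.

```csharp
using System;

public static class SubstringExample
{
    // Hypothetical helper: returns the first word of the text.
    // Substring() allocates a new string and copies the characters into it.
    public static string FirstWord(string text)
    {
        int space = text.IndexOf(' ');
        return space < 0 ? text : text.Substring(0, space);
    }

    public static void Main()
    {
        string text = "Hello World!";
        string word = FirstWord(text);

        Console.WriteLine(word); // prints "Hello"

        // The result is a separate object with its own copy of the characters.
        Console.WriteLine(ReferenceEquals(text, word)); // prints "False"
    }
}
```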
A single call to Substring() is cheap enough, but imagine the performance issue in a common scenario like parsing text files (CSV, XML, JSON, YAML and so on), where it’s called thousands of times.
A string can easily be converted into a ReadOnlySpan<char> using the AsSpan() extension method (named AsReadOnlySpan() in pre-release builds). The resulting span is read-only, preserving the immutability of the string.
You can then use Slice() to get a reference to a portion of the string without copying it.
Slice() is a method that returns another span over the same buffer but with different boundaries.
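The conversion and the slice can be sketched as below. This assumes the released .NET Core 2.1 API, where the string extension method is named AsSpan(). The FirstWord() helper is again hypothetical.

```csharp
using System;

public static class SliceExample
{
    // Hypothetical helper: returns a view over the first word.
    // No characters are copied; the span points into the original string.
    public static ReadOnlySpan<char> FirstWord(string text)
    {
        ReadOnlySpan<char> span = text.AsSpan();
        int space = span.IndexOf(' ');
        return space < 0 ? span : span.Slice(0, space);
    }

    public static void Main()
    {
        ReadOnlySpan<char> word = FirstWord("Hello World!");

        Console.WriteLine(word.Length); // prints 5
        Console.WriteLine(word[0]);     // prints H
    }
}
```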
The same extension method has overloads that convert to a span and slice it in one single step.
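A short sketch comparing the two-step and one-step forms, assuming the AsSpan(start, length) overload of the released .NET Core 2.1 API. The helper names and the offsets are illustrative only.

```csharp
using System;

public static class AsSpanOverloads
{
    // Two steps: wrap the whole string, then slice it.
    public static ReadOnlySpan<char> TwoSteps(string text, int start, int length)
        => text.AsSpan().Slice(start, length);

    // One step: pass start and length to AsSpan() directly.
    public static ReadOnlySpan<char> OneStep(string text, int start, int length)
        => text.AsSpan(start, length);

    public static void Main()
    {
        string text = "Hello World!";

        // Both spans view the same five characters, "World".
        ReadOnlySpan<char> a = TwoSteps(text, 6, 5);
        ReadOnlySpan<char> b = OneStep(text, 6, 5);

        Console.WriteLine(a.SequenceEqual(b)); // prints "True"
    }
}
```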
Please note that you’ll have major gains only if you never convert the span back to a string, as this results in a memory allocation and a copy. Exactly what we are trying to avoid.
For this reason, the .NET framework developers went through the Herculean task of adding overloads that accept ReadOnlySpan<char> to many of the methods that accept string parameters. There is also an implicit converter from string to ReadOnlySpan<char>, keeping the code simple.
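Here is a sketch of how this looks in practice. Count() is a hypothetical helper written for this example, while int.Parse() is one of the framework methods that, to my knowledge, gained a ReadOnlySpan<char> overload in .NET Core 2.1.

```csharp
using System;

public static class SpanParameters
{
    // Hypothetical helper: counts the occurrences of a character.
    // Taking ReadOnlySpan<char> lets callers pass a string or a slice.
    public static int Count(ReadOnlySpan<char> text, char value)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (c == value) count++;
        }
        return count;
    }

    public static void Main()
    {
        string line = "2018,05,30";

        // The string argument is converted implicitly; no copy is made.
        Console.WriteLine(Count(line, ',')); // prints 2

        // Parse the first four characters without creating a substring.
        int year = int.Parse(line.AsSpan(0, 4));
        Console.WriteLine(year); // prints 2018
    }
}
```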
Console.WriteLine() is still missing this treatment, so I have to call ToArray() to be able to use it. Let’s hope this is fixed in a future release.
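The workaround can be sketched as follows. ToPrintable() is a hypothetical helper. The second call is an assumption on my part: as far as I know, TextWriter has a WriteLine(ReadOnlySpan<char>) overload since .NET Core 2.1, and Console.Out is a TextWriter, so it may be a way to avoid the copy.

```csharp
using System;

public static class PrintSpanExample
{
    // Hypothetical helper: copies the characters of the span into a new array.
    public static char[] ToPrintable(ReadOnlySpan<char> span) => span.ToArray();

    public static void Main()
    {
        ReadOnlySpan<char> word = "Hello World!".AsSpan(0, 5);

        // Console.WriteLine(char[]) exists, so copy the slice into an array.
        Console.WriteLine(ToPrintable(word)); // prints "Hello"

        // Assumption: TextWriter.WriteLine(ReadOnlySpan<char>) is available.
        Console.Out.WriteLine(word); // prints "Hello"
    }
}
```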
Getting a substring is much slower than getting a slice. While the performance of slices is independent of length, substrings are strongly affected by it:
- 16x slower for 10 characters
- 38x slower for 100 characters
- 253x slower for 1000 characters
Slices use no heap allocations so there is also no time wasted in garbage collection.
Use ReadOnlySpan<char> for slicing strings. Use it also as the argument type so that no conversion back to string is required.
You should extend these rules to any managed array type.
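As a sketch of the same idea applied to another managed array type, the example below slices an int[] using the AsSpan(start, length) extension method for arrays. DoubleAll() is a hypothetical helper. Unlike a string, an array is mutable, so a Span<int> can write through to the original elements.

```csharp
using System;

public static class ArraySliceExample
{
    // Hypothetical helper: doubles every element of the span, in place.
    public static void DoubleAll(Span<int> values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] *= 2;
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // A view over three elements starting at index 1; nothing is copied.
        Span<int> middle = numbers.AsSpan(1, 3);
        DoubleAll(middle);

        // The changes are visible in the original array.
        Console.WriteLine(string.Join(",", numbers)); // prints "1,4,6,8,5"
    }
}
```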