LINQ: Flatten nested collections with SelectMany

Last time we looked at some of the possible ways of using LINQ to merge two collections into one.

Today we are going to look at another method that can be used to combine collections: SelectMany.

As an example of how multiple LINQ methods can be combined for interesting results, we will use SelectMany and Zip to quickly implement a merging operation that LINQ does not have a method for: interweaving elements from multiple collections.

Spoiler: Our interweaving implementation with SelectMany will be nice and short, but it will be sub-optimal with regard to performance. We will tackle that problem in a follow-up post.

Merging more than two collections

All three methods we looked at recently are limited to joining two collections into one, using different kinds of operations. From simple concatenation to combining elements based on their position – or a selected key value – we have got a lot of possibilities covered.

However, sometimes we want to merge more than two collections into a single one.

Of course, with the previous methods we can already do so. If the number of input collections is known, we can simply chain a few calls to the appropriate combining methods. If the number is unknown, we can use a foreach loop to do the work for us.

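// Known number of collections: chain the calls.
var result = one.Concat(two).Concat(three);

// Unknown number of collections: build the chain in a loop instead.
var result = Enumerable.Empty<Type>();
foreach (var list in collections)
{
    result = result.Concat(list);
}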

Still, we do not want to write code like this every single time we want to merge a number of collections.

Also, by stacking multiple calls of Concat in this way, we nest one iterator inside another for every collection we add. Fetching an element may then have to pass through that whole chain of calls, which could cause the enumeration to run much slower than it has to. Not to mention the use of Enumerable.Empty, which would be much nicer to avoid.

Fortunately, for the case of Concat, LINQ has a great shortcut: SelectMany.

SelectMany()

SelectMany does exactly what we need. It takes a sequence of sequences and flattens it by concatenating the inner sequences one after the other into one large resulting sequence.

It has a couple of different overloads, but we will only look at the simplest here.

That overload takes a sequence and a delegate that maps an element of that sequence to a generic enumerable type.

public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> selector
)

As the signature of the method implies, this delegate is applied to each item of the input sequence and returns a new intermediate sequence for that item.

All those intermediate sequences are then concatenated by SelectMany and returned as a single sequence.
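To make that description concrete, here is a minimal sketch of how such a flattening method could be written by hand with two nested loops. FlattenWith is a made-up name for illustration, not part of LINQ, and this is not the framework's actual implementation. The real SelectMany also validates its arguments, which I have left out here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenSketch
{
    // Hypothetical stand-in for SelectMany, for illustration only.
    public static IEnumerable<TResult> FlattenWith<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, IEnumerable<TResult>> selector)
    {
        foreach (var item in source)
        {
            // The selector turns one input item into an intermediate sequence...
            foreach (var inner in selector(item))
            {
                // ...whose elements are handed out one by one, in order.
                yield return inner;
            }
        }
    }
}
```

Called as new[] { 1, 2 }.FlattenWith(n => new[] { n, n * 10 }), this enumerates 1, 10, 2, 20. Because of yield return, nothing is evaluated until the result is enumerated, which matches the deferred execution of SelectMany.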

If our input sequence already contains collections, we can simply pass an identity lambda function to merge all those collections into a single one.

For example:

var arrays = new int[][]
{
    new int[] { 1, 2, 3 },
    new int[] { 4, 5, 6 },
    new int[] { 7, 8, 9 },
};

foreach (var n in arrays.SelectMany(array => array))
{
    // enumerates 1, 2, 3, 4, 5, 6, 7, 8, 9
}

But of course we can use the same method for more complicated operations as well. As the name suggests, we can use it similarly to Select to transform or extract values from a sequence, except that we can transform each input item into multiple output items and return a flattened list of them.

For example, we may have a list of projects, each with a number of todo items, and we want to extract the full list of all todo items.

var allTodos = projects.SelectMany(p => p.Todos);
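For a version of this that can be run as is, here is a small sketch. The Project class, its Name and Todos properties, and the AllTodos helper are all assumptions made up for this example, since the real types could look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model type, assumed for this example only.
public class Project
{
    public string Name { get; set; } = "";
    public List<string> Todos { get; set; } = new List<string>();
}

public static class TodoExample
{
    // Collects the todo items of all projects into one flat list,
    // keeping the order of the projects and of the items within each project.
    public static List<string> AllTodos(IEnumerable<Project> projects)
    {
        return projects.SelectMany(p => p.Todos).ToList();
    }
}
```

A project with an empty Todos list simply contributes nothing to the result, so no special handling is needed for it.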

Whether it fits depends on the situation, but whenever each item of a collection holds a collection of its own, SelectMany lets us extract all of that nested data with a single call.

Any time you need to combine a number of collections from different sources, take a moment to consider if SelectMany might do the trick.

Interweaving with Zip and SelectMany

One combined use of Zip and SelectMany I saw the other day was to interweave/interleave items from two lists.

What I mean by that is to take two input collections like 1, 2, 3 and 4, 5, 6 and return a result that takes elements from each input in an alternating fashion: 1, 4, 2, 5, 3, 6.

Using the two LINQ methods, this can be implemented as follows.

var result = first
    .Zip(second, (f, s) => new Type[] { f, s })
    .SelectMany(x => x);

As you can see, we first use Zip to take one element from each of the lists, and combine them into a small array of two elements, and then flatten all those arrays using SelectMany.

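This gives us exactly the result we want, as long as both inputs have the same length. Zip stops at the end of the shorter input, so any surplus elements of the longer one are dropped.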
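Here is the same idea as a small, self-contained sketch, wrapped in a generic method so it can be tried out directly. The class and method names are my own choice for this example, and it keeps the per-pair array of the snippet above, so it is not an optimised version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterweaveExample
{
    // Hypothetical helper: alternates elements of two sequences.
    public static IEnumerable<T> Interweave<T>(
        IEnumerable<T> first,
        IEnumerable<T> second)
    {
        return first
            // Pair up elements by position; each pair becomes a two-element array.
            .Zip(second, (f, s) => new[] { f, s })
            // Flatten the sequence of small arrays into one sequence.
            .SelectMany(pair => pair);
    }
}
```

Interweaving new[] { 1, 2, 3 } with new[] { 4, 5, 6 } enumerates 1, 4, 2, 5, 3, 6. With inputs of different lengths, such as { 1, 2, 3 } and { 4 }, only 1, 4 comes out, because Zip ends with the shorter input.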

I think this is a great example of how we can use the extension methods LINQ provides for us and combine them to solve new and more complex problems.

In fact, while slightly verbose, I consider the above clear enough that I might use it if I had to quickly interweave elements from two collections.

However, if I were to reuse the same code in multiple places, or use it in a performance-critical piece of code, I would extract it into a separate method, and most likely optimise it.

The main issue regarding performance is the creation of a new array for each pair of items, even though these arrays are not actually needed at the end.

Further, this approach is not very flexible when it comes to dealing with more than two collections.

That, however, is a topic for a future post, in which I will show how we can not only combine multiple LINQ methods, but also write our own to extend the functionality of the framework.

Conclusion

In this post we saw how SelectMany can be used both as a form of multi-concatenation, and also to extract and flatten a collection of collections.

Further, I gave a small example of how we can combine multiple LINQ methods to achieve new and interesting functionality with only a few lines of code.

As always, feel free to let me know if this has been interesting or useful to you, and make sure to drop me a comment or email if you have specific questions, or ideas for future posts.

Enjoy the pixels!
