August 2009

Volume 24 Number 08

.NET Matters - Aggregating Exceptions

By Stephen Toub | August 2009

Exceptions in .NET are the fundamental mechanism by which errors and other exceptional conditions are communicated. Exceptions are used not only to store information about such issues but also to propagate that information in object-instance form through a call stack. Based on the Windows structured exception handling (SEH) model, only one .NET exception can be "in flight" at any particular time on any particular thread, a restriction that is usually nary a thought in developers' minds. After all, one operation typically yields only one exception, and thus in the sequential code we write most of the time, we need to be concerned about only one exception at a time. However, there are a variety of scenarios in which multiple exceptions might result from one operation. This includes (but is not limited to) scenarios involving parallelism and concurrency.

Consider the raising of an event in .NET:

public event EventHandler MyEvent; protected void OnMyEvent() { EventHandler handler = MyEvent; if (handler != null) handler(this, EventArgs.Empty); }

Multiple delegates can be registered with MyEvent, and when the handler delegate in the previous code snippet is invoked, the operation is equivalent to code like the following:

foreach(var d in handler.GetInvocationList()) { ((EventHandler)d)(this, EventArgs.Empty); }

Each delegate that makes up the handler multicast delegate is invoked one after the other. However, if any exception is thrown from any of the invocations, the foreach loop ceases processing, which means that some delegates might not be executed in the case of an exception. As an example, consider the following code:

MyEvent += (s, e) => Console.WriteLine("1"); MyEvent += (s, e) => Console.WriteLine("2"); MyEvent += (s, e) => { throw new Exception("uh oh"); }; MyEvent += (s, e) => Console.WriteLine("3"); MyEvent += (s, e) => Console.WriteLine("4");

If MyEvent is invoked now, "1" and "2" are output to the console, an exception is thrown, and the delegates that would have output "3" and "4" will not be invoked.

To ensure that all the delegates are invoked even in the face of an exception, we can rewrite our OnMyEvent method as follows:

protected void OnMyEvent() { EventHandler handler = MyEvent; if (handler != null) { foreach (var d in handler.GetInvocationList()) { try { ((EventHandler)d)(this, EventArgs.Empty); } catch{} } } }

Now we're catching all the exceptions that escape from a registered handler, allowing delegates that come after an exception to still be invoked. If you rerun the earlier example, you'll see that "1", "2", "3", and "4" are output, even though an exception is thrown from one of the delegates. Unfortunately, this new implementation is also eating the exceptions, a practice that is greatly frowned on. Those exceptions could indicate a serious issue with the application, an issue that is not being handled because the exceptions are ignored.

What we really want is to capture any exceptions that might emerge, and then once we've finished invoking all the event handlers, throw again all the exceptions that escaped from a handler. Of course, as already mentioned, only one exception instance can be thrown on a given thread at a given time. Enter AggregateException.

In the .NET Framework 4, System.AggregateException is a new exception type in mscorlib. While a relatively simple type, it enables a plethora of scenarios by providing central and useful exception-related functionality.

AggregateException is itself an exception (deriving from System.Exception) that contains other exceptions. The base System.Exception class already has the notion of wrapping a single Exception instance, referred to as the "inner exception." The inner exception, exposed through the InnerException property on Exception, represents the cause of the exception and is often used by frameworks that layer functionality and that use exceptions to elevate the information being provided. For example, a component that parses input data from a stream might encounter an IOException while reading from the stream. It might then create a CustomParserException that wraps the IOException instance as the InnerException, providing higher-level details about what went wrong in the parse operation while still providing the IOException for the lower-level and underlying details.

AggregateException simply extends that support to enable wrapping of inner exceptions—plural. It provides constructors that accept params arrays or enumerables of these inner exceptions (in addition to the standard constructor that accepts a single inner exception), and it exposes the inner exceptions through an InnerExceptions property (in addition to the InnerException property from the base class). See Figure 1 for an overview of AggregateException's public surface area.

Figure 1 System.AggregateException

[Serializable] [DebuggerDisplay("Count = {InnerExceptions.Count}")] public class AggregateException : Exception { public AggregateException(); public AggregateException(params Exception[] innerExceptions); public AggregateException(IEnumerable<Exception> innerExceptions); public AggregateException(string message); public AggregateException(string message, Exception innerException); public AggregateException(string message, params Exception[] innerExceptions); public AggregateException(string message, IEnumerable<Exception> innerExceptions); public AggregateException Flatten(); public void Handle(Func<Exception, bool> predicate); public ReadOnlyCollection<Exception> InnerExceptions { get; } }

If the AggregateException doesn't have any inner exceptions, InnerException returns null and InnerExceptions returns an empty collection. If the AggregateException is provided with a single exception to wrap, InnerException returns that instance (as you'd expect), and InnerExceptions returns a collection with just that one exception. And if the AggregateException is provided with multiple exceptions to wrap, InnerExceptions returns all those in the collection, and InnerException returns the first item from that collection.

Now, with AggregateException, we can augment our .NET event-raising code as shown in Figure 2, and we're able to have our cake and eat it, too. Delegates registered with the event continue to run even if one throws an exception, and yet we don't lose any of the exceptional information because they're all wrapped into an aggregate and thrown again at the end (only if any of the delegates fail, of course).

Figure 2 Using AggregateException when Raising Events

protected void OnMyEvent() { EventHandler handler = MyEvent; if (handler != null) { List<Exception> exceptions = null; foreach (var d in handler.GetInvocationList()) { try { ((EventHandler)d)(this, EventArgs.Empty); } catch (Exception exc) { if (exceptions == null) exceptions = new List<Exception>(); exceptions.Add(exc); } } if (exceptions != null) throw new AggregateException(exceptions); } }

Events provide a solid example of where exception aggregation is useful for sequential code. However, AggregateException is also of prime importance for the new parallelism constructs in .NET 4 (and, in fact, even though AggregateException is useful for non-parallel code, the type was created and added to the .NET Framework by the Parallel Computing Platform team at Microsoft).

Consider the new Parallel.For method in .NET 4, which is used for parallelizing a for loop. In a typical for loop, only one iteration of that loop can execute at any one time, which means that only one exception can occur at a time. With a parallel "loop," however, multiple iterations can execute in parallel, and multiple iterations can throw exceptions concurrently. A single thread calls the Parallel.For method, which can logically throw multiple exceptions, and thus we need a mechanism through which those multiple exceptions can be propagated onto a single thread of execution. Parallel.For handles this by gathering the exceptions thrown and propagating them wrapped in an AggregateException. The rest of the methods on Parallel (Parallel.ForEach and Parallel.Invoke) handle things similarly, as does Parallel LINQ (PLINQ), also part of .NET 4. In a LINQ-to-Objects query, only one user delegate is invoked at a time, but with PLINQ, multiple user delegates can be invoked in parallel, those delegates might throw exceptions, and PLINQ deals with that by gathering them into an AggregateException and propagating that aggregate.

As an example of this kind of parallel execution, consider Figure 3, which shows a method that uses the ThreadPool to invoke multiple user-provided Action delegates in parallel. (A more robust and scalable implementation of this functionality exists in .NET 4 on the Parallel class.) The code uses QueueUserWorkItem to run each Action. If the Action delegate throws an exception, rather than allowing that exception to propagate and go unhandled (which, by default, results in the process being torn down), the code captures the exception and stores it in a list shared by all the work items. After all the asynchronous invocations have completed (successfully or exceptionally), an AggregateException is thrown with the captured exceptions, if any were captured. (Note that this code could be used in OnMyEvent to run all delegates registered with an event in parallel.)

Figure 3 AggregateException in Parallel Invocation

public static void ParallelInvoke(params Action[] actions) { if (actions == null) throw new ArgumentNullException("actions"); if (actions.Any(a => a == null)) throw new ArgumentException ("actions"); if (actions.Length == 0) return; using (ManualResetEvent mre = new ManualResetEvent(false)) { int remaining = actions.Length; var exceptions = new List<Exception>(); foreach (var action in actions) { ThreadPool.QueueUserWorkItem(state => { try { ((Action)state)(); } catch (Exception exc) { lock (exceptions) exceptions.Add(exc); } finally { if (Interlocked.Decrement(ref remaining) == 0) mre.Set(); } }, action); } mre.WaitOne(); if (exceptions.Count > 0) throw new AggregateException(exceptions); } }

The new System.Threading.Tasks namespace in .NET 4 also makes liberal use of AggregateExceptions. A Task in .NET 4 is an object that represents an asynchronous operation. Unlike QueueUserWorkItem, which doesn't provide any mechanism to refer back to the queued work, Tasks provides a handle to the asynchronous work, enabling a large number of important operations to be performed, such as waiting for a work item to complete or continuing from it to perform some operation when the work completes. The Parallel methods mentioned earlier are built on top of Tasks, as is PLINQ.

Furthering the discussion of AggregateException, an easy construct to reason about here is the static Task.WaitAll method. You pass to WaitAll all the Task instances you want to wait on, and WaitAll "blocks" until those Task instances have completed. (I've placed quotation marks around "blocks" because the WaitAll method might actually assist in executing the Tasks so as to minimize resource consumption and provide better efficiency than just blocking a thread.) If the Tasks all complete successfully, the code goes on its merry way. However, multiple Tasks might have thrown exceptions, and WaitAll can propagate only one exception to its calling thread, so it wraps the exceptions into a single AggregateException and throws that aggregate.

Tasks use AggregateExceptions in other places as well. One that might not be as obvious is in parent/child relationships between Tasks. By default, Tasks created during the execution of a Task are parented to that Task, providing a form of structured parallelism. For example, Task A creates Task B and Task C, and in doing so Task A is considered the parent of both Task B and Task C. These relationships come into play primarily in regard to lifetimes. A Task isn't considered completed until all its children have completed, so if you used Wait on Task A, that instance of Wait wouldn't return until both B and C had also completed. These parent/child relationships not only affect execution in that regard, but they're also visible through new debugger tool windows in Visual Studio 2010, greatly simplifying the debugging of certain types of workloads.

Consider code like the following:

var a = Task.Factory.StartNew(() => { var b = Task.Factory.StartNew(() => { throw new Exception("uh"); }); var c = Task.Factory.StartNew(() => { throw new Exception("oh"); }); }); ... a.Wait();

Here, Task A has two children, which it implicitly waits for before it is considered complete, and both of those children throw unhandled exceptions. To account for this, Task A wraps its children's exceptions into an AggregateException, and it's that aggregate that's returned from A's Exception property and thrown out of a call to Wait on A.

As I've demonstrated, AggregateException can be a very useful tool. For usability and consistency reasons, however, it can also lead to designs that might at first be counterintuitive. To clarify what I mean, consider the following function:

public void DoStuff() { var inputNum = Int32.Parse(Console.ReadLine()); Parallel.For(0, 4, i=> { if (i < inputNum) throw new MySpecialException(i.ToString()); }); }

Here, depending on user input, the code contained in the parallel loop might throw 0, 1, or more exceptions. Now consider the code you'd have to write to handle those exceptions. If Parallel.For wrapped exceptions in an AggregateException only when multiple exceptions were thrown, you, as the consumer of DoStuff, would need to write two separate catch handlers: one for the case in which only one MySpecialException occurred, and one for the case in which an AggregateException occurred. The code for handling the AggregateException would likely search the AggregateException's InnerExceptions for a MySpecialException and then run the same handling code for that individual exception that you would have in the catch block dedicated to MySpecialException. As you start dealing with more exceptions, this duplication problem grows. To address this problem as well as to provide consistency, methods in .NET 4 like Parallel.For that need to deal with the potential for multiple exceptions always wrap, even if only one exception occurs. That way, you need to write only one catch block for AggregateException. The exception to this rule is that exceptions that may never occur in a concurrent scope will not be wrapped. So, for example, exceptions that might result from Parallel.For due to it validating its arguments and finding one of them to be null will not be wrapped. That argument validation occurs before Parallel.For spins off any asynchronous work, and thus it's impossible that multiple exceptions could occur.

Of course, having exceptions wrapped in an AggregateException can also lead to some difficulties in that you now have two models to deal with: unwrapped and wrapped exceptions. To ease the transition between the two, AggregateException provides several helper methods to make working with these models easier.

The first helper method is Flatten. As I mentioned, AggregateException is itself an Exception, so it can be thrown. This means, however, that AggregateException instances can wrap other AggregateException instances, and, in fact, this is a likely occurrence, especially when dealing with recursive functions that might throw aggregates. By default, AggregateExceptions retains this hierarchical structure, which can be helpful when debugging because the hierarchical structure of the contained aggregates will likely correspond to the structure of the code that threw those exceptions. However, this can also make aggregates more difficult to work with in some cases. To account for that, the Flatten method removes the layers of contained aggregates by creating a new AggregateException that contains the non-AggregateExceptions from the whole hierarchy. As an example, let's say I had the following structure of exception instances:

  • AggregateException
  • InvalidOperationException
  • ArgumentOutOfRangeException
  • AggregateException
  • IOException
  • DivideByZeroException
  • AggregateException
  • FormatException
  • AggregateException
  • TimeZoneException

If I call Flatten on the outer AggregateException instance, I get a new AggregateException with the following structure:

  • AggregateException
  • InvalidOperationException
  • ArgumentOutOfRangeException
  • IOException
  • DivideByZeroException
  • FormatException
  • TimeZoneException

This makes it much easier for me to loop through and examine the InnerExceptions of the aggregate, without having to worry about recursively traversing contained aggregates.

The second helper method, Handle, makes such traversal easier. Handle has the following signature:

public void Handle(Func<Exception,bool> predicate);

Here's an approximation of its implementation:

public void Handle(Func<Exception,bool> predicate) { if (predicate == null) throw new ArgumentNullException("predicate"); List<Exception> remaining = null; foreach(var exception in InnerExceptions) { if (!predicate(exception)) { if (remaining == null) remaining = new List<Exception>(); remaining.Add(exception); } } if (remaining != null) throw new AggregateException(remaining); }

Handle iterates through the InnerExceptions in the AggregateException and evaluates a predicate function for each. If the predicate function returns true for a given exception instance, that exception is considered handled. If, however, the predicate returns false, that exception is thrown out of Handle again as part of a new AggregateException containing all the exceptions that failed to match the predicate. This approach can be used to quickly filter out exceptions you don't care about; for example:

try { MyOperation(); } catch(AggregateException ae) { ae.Handle(e => e is FormatException); }

That call to Handle filters out any FormatExceptions from the AggregateException that is caught. If there are exceptions besides FormatExceptions, only those exceptions are thrown again as part of the new AggregateException, and if there aren't any non-FormatException exceptions, Handle returns successfully with nothing being thrown again. In some cases, it might also be useful to first flatten the aggregates, as you see here:

ae.Flatten().Handle(e => e is FormatException);

Of course, at its core an AggregateException is really just a container for other exceptions, and you can write your own helper methods to work with those contained exceptions in a manner that fits your application's needs. For example, maybe you care more about just throwing a single exception than retaining all the exceptions. You could write an extension method like the following:

public static void PropagateOne(this AggregateException aggregate) { if (aggregate == null) throw new ArgumentNullException("aggregate"); if (aggregate.InnerException != null) throw aggregate.InnerException; // just throw one }

which you could then use as follows:

catch(AggregateException ae) { ae.PropagateOne(); }

Or maybe you want to filter to show only those exceptions that match a certain criteria and then aggregate information about those exceptions. For example, you might have an AggregateException containing a whole bunch of ArgumentExceptions, and you want to summarize which parameters caused the problems:

AggregateException aggregate = ...; string [] problemParameters = (from exc in aggregate.InnerExceptions let argExc = exc as ArgumentException where argExc != null && argExc.ParamName != null select argExc.ParamName).ToArray();

All in all, the new System.AggregateException is a simple but powerful tool, especially for applications that can't afford to let any exception go unnoticed. For debugging purposes, AggregateException's ToString implementation outputs a string rendering all the contained exceptions. And as you can see back in Figure 1, there's even a DebuggerDisplayAttribute on AggregateException to help you quickly identify how many exceptions an AggregateException contains.

Send your questions and comments for Stephen to netqa@microsoft.com.

Stephen Toub is a Senior Program Manager Lead on the Parallel Computing Platform team at Microsoft. He is also a Contributing Editor for MSDN Magazine.