In this post I describe an infinite-loop scenario that can occur on .NET Framework when a ThreadAbortException is raised. I describe when you might run into this scenario, why it happens (it's a bug in the runtime), and how you can avoid it. Finally I show a Roslyn Analyzer that you can use to automatically flag problematic code.
Throwing a ThreadAbortException with Thread.Abort()
When you're doing parallel/concurrent programming in .NET, and you want to do two things at once, you typically use the Task Parallel Library, Task, Task<T>, async/await, and all that modern goodness. However, you can also manage threads yourself "manually", by calling Thread.Start() etc.
These days, in practice, you should almost never be working directly with threads. Use Task et al. wherever possible so that you're using the ThreadPool to schedule jobs and async/await to handle continuations.
If you have a running thread, and you want to stop it running, you would typically try to use cooperative cancellation, using CancellationTokens or something similar. However, in some cases that's not possible; maybe the thread is running third-party code out of your control, for example. In .NET Framework you have a "kill it with fire" option: Thread.Abort().
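For comparison, here is a minimal sketch of the cooperative approach. The CooperativeWorker class and its Run method are invented for illustration; the point is only that the worker checks the token itself and leaves its loop voluntarily, so nothing has to be forced from the outside.

```csharp
using System;
using System.Threading;

public static class CooperativeWorker
{
    // Hypothetical worker: loops until cancellation is requested,
    // then returns the number of iterations it completed
    public static int Run(CancellationToken token)
    {
        var iterations = 0;
        while (!token.IsCancellationRequested)
        {
            iterations++;
            // Wait up to 100ms, waking early if cancellation is requested
            token.WaitHandle.WaitOne(100);
        }

        return iterations;
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            var completed = 0;
            var thread = new Thread(() => completed = Run(cts.Token));
            thread.Start();

            Thread.Sleep(300);
            cts.Cancel();  // Ask the worker to stop
            thread.Join(); // The worker leaves its loop on its own

            Console.WriteLine($"Worker stopped after {completed} iterations");
        }
    }
}
```

This only works when the code running on the thread is yours to change, which is exactly the limitation that pushes people towards Thread.Abort().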
Note that Thread.Abort() only applies to .NET Framework. The Abort() method is not supported on .NET Core and throws a PlatformNotSupportedException to the caller instead.
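If you have code that multi-targets both runtimes, that difference can be handled explicitly. The following is a small sketch under that assumption; ThreadAbortHelper and TryAbort are hypothetical names, not part of any library.

```csharp
using System;
using System.Threading;

public static class ThreadAbortHelper
{
    // Hypothetical helper: returns true if the abort was requested,
    // false if the runtime doesn't support Thread.Abort()
    public static bool TryAbort(Thread thread)
    {
        try
        {
#pragma warning disable SYSLIB0006 // Thread.Abort() is marked obsolete on .NET 5+
            thread.Abort();
#pragma warning restore SYSLIB0006
            return true;
        }
        catch (PlatformNotSupportedException)
        {
            // .NET Core and later: the call throws to the caller instead
            return false;
        }
    }
}
```

On .NET Framework a true result only means the abort was requested; you would still typically call Join() to wait for the thread to exit.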
Calling Abort() on a thread causes the runtime to throw a ThreadAbortException in the thread's code. ThreadAbortException is special, in that you can catch it in application code (unlike some other exceptions such as StackOverflowException which can't be caught), but the runtime automatically re-throws the ThreadAbortException at the end of the catch block.
It is possible to "cancel" the exception by calling ResetAbort() but I'm not going to go into that in this post.
Just to give a concrete example, the following is a small .NET Framework program that starts a Thread, which starts doing some work, and then calls Abort().
using System;
using System.Threading;

// Start a new thread, which runs the DoWork method
var myThread = new Thread(new ThreadStart(DoWork));
myThread.Start();
Thread.Sleep(300);
Console.WriteLine("Main - aborting thread");
myThread.Abort(); // Trigger a ThreadAbortException
myThread.Join(); // Wait for the thread to exit
Console.WriteLine("Main ending");
static void DoWork()
{
try
{
for (var i = 0; i < 100; i++)
{
Console.WriteLine($"Thread - working {i}");
Thread.Sleep(100);
}
}
catch (ThreadAbortException e)
{
Console.WriteLine($"Thread - caught ThreadAbortException: {e.Message}");
// Even though we caught the exception, the runtime re-throws it
}
// This is never called
Console.WriteLine("Thread - outside the catch block");
}
When you run the program, the output looks something like this:
Thread - working 0
Thread - working 1
Thread - working 2
Main - aborting thread
Thread - caught ThreadAbortException: Thread was being aborted.
Main ending
As you can see, even though we caught the ThreadAbortException, the thread exited, as the exception was re-thrown. Now we'll look at a scenario where that doesn't quite work as you expect.
Infinite loops and ThreadAbortException
The issue I'm going to describe is based on a real issue we ran into in the Datadog .NET Tracer shortly before I joined in January 2021. The issue occurred during IIS AppDomain recycles (among other cases) and would result in the apps not shutting down. As you might expect given the preamble, the problem was related to ThreadAbortException.
We can demonstrate the problem easily if we make a slight tweak to the example above. Instead of using a for loop inside a try-catch, we're going to change to a try-catch inside a while loop. The rest of the program remains the same, so I've only shown the DoWork() method:
static void DoWork()
{
var i = 0;
while (true)
{
try
{
Console.WriteLine($"Thread - working {i}");
i++;
Thread.Sleep(100);
}
catch (ThreadAbortException e)
{
Console.WriteLine($"Thread - caught ThreadAbortException {e.Message}");
// Even though we caught the exception, the runtime _should_ re-throw it
}
}
// This is never called
Console.WriteLine("Thread - outside the catch block.");
}
Now, theoretically, there should be no difference here. Abort() is called, the exception is caught in the catch block, and the runtime should re-throw the exception, exiting the while loop and the thread. However, if we run the app in the Release configuration we have a problem. We get stuck in an infinite loop in the catch block:
Thread - working 0
Thread - working 1
Thread - working 2
Main - aborting thread
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
Thread - caught ThreadAbortException Thread was being aborted.
...
This is clearly Not Good™, and ultimately comes down to a bug in the JIT. The explanation of the bug is somewhat complex (and is largely due to a workaround for a different bug) but this comment has all the gory details if you want to dig in.
The bug is present in the RyuJIT compiler, but not in the legacy JIT, so you can also work around the bug by setting <useLegacyJit enabled="1" /> in your app.config or web.config.
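For context, the element belongs inside the runtime section of the configuration file. A minimal app.config containing only this setting might look like the following; a real file would have your other settings alongside it.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <!-- Use the legacy 64-bit JIT compiler instead of RyuJIT -->
    <useLegacyJit enabled="1" />
  </runtime>
</configuration>
```

Bear in mind that this switches the JIT for the whole application, not just for the affected loop, so it is a much broader change than re-throwing in one catch block.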
The bug is triggered specifically when you have a "tight" loop with a try-catch directly inside a while loop:
while(true)
{
try
{
// ...
}
catch
{
// ...
}
}
Adding a Console.WriteLine() (for example) inside the while loop but outside the try-catch causes the bug to be avoided, as does using a for loop, so it's this specific pattern you need to watch out for. Adding a finally block also fixes the issue.
Ultimately, Microsoft decided not to fix this bug, so the workaround is to ensure you always "manually" re-throw a ThreadAbortException if you find yourself with the problematic pattern.
Unfortunately, it's not obvious that the pattern is problematic just by looking at it, so it's a great candidate for a Roslyn Analyzer to do the spotting for you.
Creating an analyzer to detect the pattern
In this section I show the Roslyn Analyzer I wrote to make sure we don't accidentally introduce this code into the Datadog library.
If you're building a .NET Core-only application then you don't need to worry about this, because .NET Core doesn't support ThreadAbortExceptions. However, if you're building a library that multi-targets .NET Core and .NET Framework, or uses netstandard2.0 to do so, then you might want to consider using it.
I'm not going into detail about how to create an analyzer in this post (I covered this some time ago in a previous post). Instead I'm just going to focus on the analyzer code itself.
As a reminder, we are trying to detect code that looks something like this:
while(...)
{
try
{
// ...
}
catch
{
// ...
}
}
and advise you to update it to manually re-throw the exception. The simplest fix might look like this:
while(...)
{
try
{
// ...
}
catch
{
// ...
throw; // Required to avoid the infinite loop
}
}
We'll add a code fix provider to automatically make that basic fix later.
Creating the analyzer
We'll start by looking at the analyzer itself. This derives from DiagnosticAnalyzer, defines a diagnostic ID, and registers a SyntaxNodeAction that looks for while loops. If the while loop contains a try-catch that has a problematic catch clause, we raise the issue.
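using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]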
public class ThreadAbortAnalyzer : DiagnosticAnalyzer
{
public const string DiagnosticId = "ABRT0001";
private static readonly DiagnosticDescriptor Rule = new(
DiagnosticId,
title: "Potential infinite loop on ThreadAbortException",
messageFormat: "Potential infinite loop - you should rethrow Exception in catch block",
category: "Reliability",
defaultSeverity: DiagnosticSeverity.Error,
isEnabledByDefault: true,
description: "While blocks are vulnerable to infinite loop on ThreadAbortException due to a bug in the runtime. The catch block should rethrow a ThreadAbortException, or use a finally block");
public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics { get; } = ImmutableArray.Create(Rule);
public override void Initialize(AnalysisContext context)
{
// Don't bother checking generated code
context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
context.EnableConcurrentExecution();
context.RegisterSyntaxNodeAction(AnalyseSyntax, SyntaxKind.WhileStatement);
}
private void AnalyseSyntax(SyntaxNodeAnalysisContext context)
{
if (context.Node is WhileStatementSyntax whileStatement
&& ThreadAbortSyntaxHelper.FindProblematicCatchClause( // shown below
whileStatement, context.SemanticModel) is { } problematicCatch)
{
// If we're in a while statement, and there's a problematic catch
// clause, then create a diagnostic
var diagnostic = Diagnostic.Create(Rule, problematicCatch.GetLocation());
context.ReportDiagnostic(diagnostic);
}
}
}
The ThreadAbortSyntaxHelper performs the analysis of the while block, looking explicitly for a while block with the following characteristics:
- The body of the while is a BlockSyntax
- The body contains only one statement, which is a TryStatementSyntax
- The TryStatementSyntax contains a CatchClauseSyntax which catches a ThreadAbortException (or its ancestors)
- The CatchClauseSyntax does not call throw;
If all of these conditions are matched, the analyzer flags the catch as problematic. The code of the helper is shown below:
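using System;
using System.Linq;
using System.Threading;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

internal static class ThreadAbortSyntaxHelper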
{
public static CatchClauseSyntax FindProblematicCatchClause(WhileStatementSyntax whileStatement, SemanticModel model)
{
if (whileStatement.Statement is not BlockSyntax blockSyntax)
{
return null;
}
var innerStatements = blockSyntax.Statements;
if (innerStatements.Count != 1)
{
// only applies when try directly nested under while and only child
return null;
}
if (innerStatements[0] is not TryStatementSyntax tryCatchStatement)
{
// Not a try catch nested in a while
return null;
}
CatchClauseSyntax catchClause = null;
var willCatchThreadAbort = false;
var willRethrowThreadAbort = false;
foreach (var catchSyntax in tryCatchStatement.Catches)
{
catchClause = catchSyntax;
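// A bare catch { } has no declaration (Declaration is null) and catches everything
var exceptionTypeSyntax = catchSyntax.Declaration?.Type;
if (exceptionTypeSyntax is null || CanCatchThreadAbort(exceptionTypeSyntax, model))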
{
willCatchThreadAbort = true;
// We're in the catch block that will catch the ThreadAbort
// Make sure that we re-throw the exception
// This is a very basic check, in that it doesn't check control flow etc
// It requires that you have a throw; in the catch block
willRethrowThreadAbort = catchSyntax.Block.Statements
.OfType<ThrowStatementSyntax>()
.Any();
break;
}
}
if (willCatchThreadAbort && !willRethrowThreadAbort)
{
return catchClause;
}
return null;
}
private static bool CanCatchThreadAbort(TypeSyntax syntax, SemanticModel model)
{
var exceptionType = model.GetSymbolInfo(syntax).Symbol as INamedTypeSymbol;
var exceptionTypeName = exceptionType?.ToString();
return exceptionTypeName == typeof(ThreadAbortException).FullName
|| exceptionTypeName == typeof(SystemException).FullName
|| exceptionTypeName == typeof(Exception).FullName;
}
}
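To make the four conditions concrete, the following sketch shows a few loop shapes and how the helper above should treat each one, going by a reading of its code. The AnalyzerExamples class and its method names are invented for illustration, and the loops are bounded so that the methods terminate.

```csharp
using System;

public static class AnalyzerExamples
{
    // Flagged: the try-catch is the only statement in the while body, and
    // catch (Exception) can catch a ThreadAbortException without rethrowing
    public static int Flagged(int iterations)
    {
        var caught = 0;
        while (iterations-- > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (Exception) { caught++; }
        }
        return caught;
    }

    // Not flagged: the while body contains a second statement
    public static int ExtraStatement(int iterations)
    {
        var caught = 0;
        while (iterations > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (Exception) { caught++; }
            iterations--;
        }
        return caught;
    }

    // Not flagged: InvalidOperationException is not ThreadAbortException,
    // SystemException, or Exception, so the helper ignores this catch
    public static int NarrowCatch(int iterations)
    {
        var caught = 0;
        while (iterations-- > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (InvalidOperationException) { caught++; }
        }
        return caught;
    }

    // Not flagged: the catch block has a throw statement as a direct child
    public static int Rethrows(int iterations)
    {
        var completed = 0;
        while (iterations-- > 0)
        {
            try { completed++; }
            catch (Exception) { throw; }
        }
        return completed;
    }
}
```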
There are clearly a bunch of limitations to this analysis, but I'll go through those later. When you run the analyzer, you can see that it works, flagging the exception in a problematic scenario:
Now that we have the analyzer, let's create a simple code fix provider for it.
Creating the code fix provider
The CodeFixProvider is registered as a fixer for the ThreadAbortAnalyzer we defined above. It takes the diagnostic location provided and registers a code fix which simply adds a throw statement to the end of the first catch block that would catch the ThreadAbortException.
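using System.Collections.Immutable;
using System.Composition;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeFixes;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

[ExportCodeFixProvider(LanguageNames.CSharp, Name = nameof(ThreadAbortCodeFixProvider))]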
[Shared]
public class ThreadAbortCodeFixProvider : CodeFixProvider
{
public sealed override ImmutableArray<string> FixableDiagnosticIds => ImmutableArray.Create(ThreadAbortAnalyzer.DiagnosticId);
public sealed override FixAllProvider GetFixAllProvider() => WellKnownFixAllProviders.BatchFixer;
public sealed override async Task RegisterCodeFixesAsync(CodeFixContext context)
{
var root = await context.Document.GetSyntaxRootAsync(context.CancellationToken).ConfigureAwait(false);
var diagnostic = context.Diagnostics.First();
var diagnosticSpan = diagnostic.Location.SourceSpan;
// Find the catch block catch declaration identified by the diagnostic.
var catchClause = root.FindToken(diagnosticSpan.Start)
.Parent
.AncestorsAndSelf()
.OfType<CatchClauseSyntax>().First();
// Register a code action that will invoke the fix.
context.RegisterCodeFix(
CodeAction.Create(
title: "Rethrow exception",
createChangedDocument: c => AddThrowStatement(context.Document, catchClause, c),
equivalenceKey: nameof(ThreadAbortCodeFixProvider)),
diagnostic);
}
private static async Task<Document> AddThrowStatement(Document document, CatchClauseSyntax catchBlock, CancellationToken cancellationToken)
{
// This messes with the whitespace, but meh, it's simple
var throwStatement = SyntaxFactory.ThrowStatement();
var statements = catchBlock.Block.Statements.Add(throwStatement);
var newCatchBlock = catchBlock.Block.WithStatements(statements);
// replace the syntax and return updated document
var root = await document.GetSyntaxRootAsync(cancellationToken).ConfigureAwait(false);
root = root.ReplaceNode(catchBlock.Block, newCatchBlock);
return document.WithSyntaxRoot(root);
}
}
Now when the analyzer flags an issue, you get a suggestion of how to fix it with one click:
This is clearly a crude fix (as I describe in the next section) but I've not found it to be a big issue in practice. The important thing is that it draws attention to the issue and shows a possible fix.
Limitations of the analyzer and the code fix
The analyzer I show in this post is not particularly sophisticated. It does only very basic analysis of the while and try-catch statements. The limitations include:
- Assumes an infinite while loop. For simplicity, the analyzer doesn't check the expression in the while loop, and assumes it will loop infinitely. That's a conservative approach, and will flag some cases that won't trigger the bug, but it's good enough for our purposes.
- Exception filters are not considered. For simplicity, I've ignored exception filters on the catch block. That means we might assume an exception is caught when it is not, and in that case we might also incorrectly assume an exception is rethrown when it is not.
- Doesn't consider finally blocks. In practice, the presence of a finally block can avoid the bug, so the catch block doesn't need to explicitly rethrow. The analyzer does not consider this, and takes a more conservative approach, requiring the rethrow.
- Doesn't check flow control in catch clause. In some cases, a catch clause might be calling throw;, but if it's not a direct child of the catch block, the analyzer will ignore it. Again, this is a conservative approach.
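Some of these limitations are easier to see in code. The sketch below shows three loops that the analyzer should still flag, going by a reading of the helper, even though each one differs from the basic problematic pattern. The AnalyzerLimitations class and its method names are hypothetical, and the loops are bounded so that the methods terminate.

```csharp
using System;
using System.Threading;

public static class AnalyzerLimitations
{
    // Flagged, although the finally block should avoid the bug
    public static int WithFinally(int iterations)
    {
        var cleanups = 0;
        while (iterations-- > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (Exception) { }
            finally { cleanups++; }
        }
        return cleanups;
    }

    // Flagged, although the filter means this clause
    // doesn't handle a ThreadAbortException
    public static int WithFilter(int iterations)
    {
        var caught = 0;
        while (iterations-- > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (Exception ex) when (ex is InvalidOperationException) { caught++; }
        }
        return caught;
    }

    // Flagged, although it does rethrow: the throw is nested
    // inside an if statement rather than a direct child of the catch
    public static int NestedThrow(int iterations)
    {
        var caught = 0;
        while (iterations-- > 0)
        {
            try { throw new InvalidOperationException(); }
            catch (Exception ex)
            {
                if (ex is ThreadAbortException) { throw; }
                caught++;
            }
        }
        return caught;
    }
}
```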
In terms of the code fix provider, it's unlikely that you would actually want to call throw; inside a catch(Exception) block. A better approach would likely be to introduce an additional catch clause for ThreadAbortException specifically, and to re-throw only in that clause.
For example, if you have this:
while(true)
{
try
{
Console.WriteLine("Looping");
Thread.Sleep(100);
}
catch(Exception)
{
Console.WriteLine("Exception!");
}
}
then instead of the code suggested by the analyzer:
while(true)
{
try
{
Console.WriteLine("Looping");
Thread.Sleep(100);
}
catch(Exception)
{
Console.WriteLine("Exception!");
throw; // Added by code fix provider
}
}
you might want to do something like this instead:
while(true)
{
try
{
Console.WriteLine("Looping");
Thread.Sleep(100);
}
catch(ThreadAbortException) // catch ThreadAbortException explicitly
{
Console.WriteLine("ThreadAbortException!");
throw; // Avoid the bug
}
catch(Exception)
{
Console.WriteLine("Exception!");
// No need to throw in this block
}
}
This avoids the bug by re-throwing when you have a ThreadAbortException specifically, and means you don't rethrow for just any Exception. In practice, I wasn't going to bother writing a code fix provider at all, so I went for the simplest solution at the time. If I wanted to be more robust I would almost certainly try to use this pattern instead.
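As a rough idea of what that more targeted fix could involve, here is a sketch of the syntax manipulation. It is not what the code fix provider in this post does, and RethrowFixSketch and AddThreadAbortCatch are hypothetical names. It assumes the flagged clause belongs to the given try statement, and it ignores the case where the flagged clause is itself a catch(ThreadAbortException), where adding a second identical clause would not compile.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class RethrowFixSketch
{
    // Returns a copy of the try statement with a
    // catch (System.Threading.ThreadAbortException) { throw; } clause
    // inserted immediately before the clause the analyzer flagged
    public static TryStatementSyntax AddThreadAbortCatch(
        TryStatementSyntax tryStatement, CatchClauseSyntax flaggedCatch)
    {
        var declaration = SyntaxFactory.CatchDeclaration(
            SyntaxFactory.ParseTypeName("System.Threading.ThreadAbortException"));

        // No exception filter (null), and a block containing only throw;
        var abortCatch = SyntaxFactory.CatchClause(
                declaration,
                null,
                SyntaxFactory.Block(SyntaxFactory.ThrowStatement()))
            .NormalizeWhitespace();

        // Insert before the flagged clause so the more specific catch comes first
        var index = tryStatement.Catches.IndexOf(flaggedCatch);
        return tryStatement.WithCatches(tryStatement.Catches.Insert(index, abortCatch));
    }
}
```

The result could then be swapped into the document with ReplaceNode, in the same way AddThrowStatement does. NormalizeWhitespace only tidies the new clause itself, so the surrounding layout would probably still need attention, for example by attaching Formatter.Annotation to the new node.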
Summary
In this post I described a bug in the .NET Framework runtime that can cause a ThreadAbortException to get stuck in an infinite loop. The bug only occurs when you have a try-catch block tightly nested in a while block. Normally if you catch a ThreadAbortException the runtime automatically re-throws the exception after the catch block has executed. However, the bug means that the catch block gets stuck re-executing infinitely.
In the second half of the post I showed a Roslyn Analyzer I created that can detect the problematic pattern and includes a code fix provider that adds a throw; statement to break out of the infinite loop. It's a relatively crude analyzer, but I know it's saved us at least once from introducing the issue!