Compile-time stack validation

By Jonathan Corbet
September 30, 2015
An occasionally heard horror story about the kernel development community concerns developers who are told that, in order to get their code upstream, they must first invest considerable effort into fixing a related subsystem. As with many such stories, this is not an experience many kernel developers have had, but there is also a grain of truth behind it. The ongoing live-patching effort, and the extra work that has been required to push that work forward, is a case in point.

Live patching's rough patch

In one sense, the live-patching work has been quiet for much of this year; when LWN last looked at this work in February, the core code had been merged, but the "consistency model" code remained out-of-tree. This code's job is to ensure that a patch is only applied to a live kernel if it is safe to do so; that job includes checking to be sure that the affected functions are not executing at the time the patch is applied. Without this assurance, only relatively trivial patches can be applied with any degree of safety. This is important: the appeal of live patching is the ability to avoid rebooting, so a patch application that crashes the kernel (or, worse, results in data corruption) defeats the whole purpose.

One way of ensuring that a given function is not executing is to freeze all processes on the system, then examine the call stack of each to see which functions are active at the time. This is the approach that was taken when the kpatch and kGraft consistency models were unified in the February patch set. That work ran into strong opposition at the time for a simple reason: the information in the kernel's call stack is often not reliable. The biggest culprit here is assembly-language code, which can easily dispense with the call-stack discipline observed by code compiled from C. Kernel developers see the results regularly in crash reports, where a misleading stack trace makes it hard to determine the sequence of calls that led to the problem.
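The stack-examination idea can be illustrated with a small sketch. This is a toy model in Python, not the code from the February patch set; the function name, the task IDs, and the address ranges are all made up for illustration. It assumes that each frozen task's stack trace is available as a list of code addresses.

```python
def tasks_blocking_patch(task_stacks, patched_ranges):
    """Return the IDs of tasks whose stack trace touches a patched function.

    task_stacks:    dict mapping a task ID to the list of code addresses
                    found in that task's stack trace (hypothetical input)
    patched_ranges: list of (start, end) half-open address ranges covering
                    the functions the patch will replace
    """
    blocking = []
    for task_id, addresses in task_stacks.items():
        if any(start <= addr < end
               for addr in addresses
               for start, end in patched_ranges):
            blocking.append(task_id)
    return blocking


if __name__ == "__main__":
    stacks = {
        101: [0x1010, 0x2040],   # nowhere near the patched function
        102: [0x1010, 0x3050],   # has a return address inside it
    }
    patched = [(0x3000, 0x3100)]
    blocking = tasks_blocking_patch(stacks, patched)
    print("safe to patch" if not blocking else f"blocked by tasks {blocking}")
```

The check itself is simple; its answer is only as good as the traces it is given. If a function is active but missing from a task's trace, a check like this will report that patching is safe when it is not.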

It's one thing for an unreliable stack trace to make kernel developers scratch their heads more; it's another if that information can fool a live-patching utility into applying a patch at an inopportune time. The risk of that happening was deemed high enough to block the merging of the proposed consistency code. This code, it was said, could only be used if kernel stack traces were known to be 100% reliable.

At the time, 100% reliable stack traces were not widely seen as an attainable goal. It is certainly possible to fix up all of the assembly code that does not set up proper stack frames (assuming it could all be found), but, since nothing in the kernel's normal operation depends on good call-stack information, there was nothing preventing things from breaking again at any time. In the absence of some sort of ongoing assurance that the kernel's call stack will always remain valid, it is hard to be confident that a live-patching system won't do the wrong thing.

Validating the call stack

Some developers might have given up at this point. Josh Poimboeuf, instead, set out to find a way to make the call stack valid at all times and keep it that way; the result is the "compile-time stack metadata validation" patch set, in its 13th revision as of this writing. This work adds a new tool (called stacktool) that checks the entire kernel as part of the build process to be sure that all code obeys the rules for maintaining the call stack.

The rules are, for the most part, relatively straightforward. For example, every function in assembly code must be marked as a callable function (using the ELF function type). There are some handy macros (ENTRY and ENDPROC) that do this annotation now, but not all assembly functions use them. A clear sign that the rules are not being followed is a ret instruction outside of a function block, so stacktool will complain about those.
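As a rough illustration of the ret rule, the following Python sketch flags ret instructions that are not covered by any symbol marked as a function. It is a simplified model and not stacktool's actual implementation: the Symbol and Insn records and the rets_outside_functions() helper are hypothetical, standing in for information a real tool would read from the ELF symbol table and a disassembly of the object file.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for ELF symbols and decoded instructions.
Symbol = namedtuple("Symbol", "name start end is_func")
Insn = namedtuple("Insn", "addr mnemonic")


def rets_outside_functions(symbols, insns):
    """Return the addresses of ret instructions not inside any function symbol."""
    func_ranges = [(s.start, s.end) for s in symbols if s.is_func]
    return [
        insn.addr
        for insn in insns
        if insn.mnemonic == "ret"
        and not any(start <= insn.addr < end for start, end in func_ranges)
    ]


if __name__ == "__main__":
    symbols = [
        Symbol("annotated_func", 0x100, 0x120, True),   # marked as a function
        Symbol("bare_label", 0x120, 0x130, False),      # code with no function type
    ]
    insns = [Insn(0x11F, "ret"), Insn(0x128, "mov"), Insn(0x12F, "ret")]
    for addr in rets_outside_functions(symbols, insns):
        print(f"warning: ret at {addr:#x} is outside any function")
```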

The primary source of call-stack problems is assembly code that calls another function (possibly a C function) without setting up a new stack frame first. Such calls work, but they will trip up code that is trying to make sense out of the call stack. The validation tool checks to make sure that function calls are surrounded by the appropriate frame-maintenance code. There are currently assembly macros to do this work, but they are unused; Josh's patch renames them to FRAME_BEGIN and FRAME_END and puts them into use. Versions of these macros for inline assembly in C code have also been added; they can be found in <asm/frame.h>.
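To see why a missing frame matters, consider a toy model of frame-pointer unwinding on x86_64. The Python sketch below is an illustration, not kernel code: the Stack class, the function labels, and the call chain are invented, and the frame_begin() method only mimics the usual "push %rbp; mov %rsp, %rbp" prologue that a macro like FRAME_BEGIN is meant to provide. Return addresses are represented by the name of the function they point into.

```python
class Stack:
    """Toy x86_64 stack: 8-byte slots, growing toward lower addresses."""

    def __init__(self, top=0x1000):
        self.mem = {}
        self.rsp = top
        self.rbp = 0  # a saved %rbp of 0 terminates the frame chain

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def call(self, return_label):
        # A call instruction pushes the return address (here, a label
        # naming the function that contains the return address).
        self.push(return_label)

    def frame_begin(self):
        # Models: push %rbp; mov %rsp, %rbp
        self.push(self.rbp)
        self.rbp = self.rsp


def unwind(stack, current):
    """Follow the chain of saved %rbp values, collecting return addresses."""
    trace = [current]
    rbp = stack.rbp
    while rbp != 0:
        trace.append(stack.mem[rbp + 8])  # return address sits above saved %rbp
        rbp = stack.mem[rbp]
    return trace


def run(asm_sets_up_frame):
    """Simulate start -> main -> c_caller -> asm_helper -> c_callee."""
    s = Stack()
    s.call("start")
    s.frame_begin()          # main's prologue
    s.call("main")
    s.frame_begin()          # c_caller's prologue
    s.call("c_caller")
    if asm_sets_up_frame:
        s.frame_begin()      # asm_helper creates its own frame
    s.call("asm_helper")
    s.frame_begin()          # c_callee's prologue
    return unwind(s, "c_callee")


if __name__ == "__main__":
    print(run(True))   # ['c_callee', 'asm_helper', 'c_caller', 'main', 'start']
    print(run(False))  # ['c_callee', 'asm_helper', 'main', 'start']
```

In the second run the call still works, but c_caller has vanished from the trace: its return address is on the stack, just not where the frame-pointer chain leads. A consistency check that relied on this trace would conclude that c_caller is not active.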

There are also some rules about dynamic jumps; for the most part, they are only allowed as part of a C switch statement. The one exception is "sibling calls," where the end of one function jumps to the beginning of another and the frame pointer hasn't changed. These rules make it possible for stacktool to follow the control flow in all cases and ensure that the call stack is always maintained.

If the STACK_VALIDATION configuration option is set, stacktool will be run on the kernel's object files as part of the build process. This pass, Josh says, causes a kernel build to take about three seconds longer (he doesn't say whether that's a kernel with a focused configuration or a distribution kitchen-sink configuration). Three seconds is probably an acceptable delay, even for impatient kernel developers, but Josh suggests that some optimization work could probably reduce that figure anyway.

What might be harder for developers to get used to are the complaints emitted by stacktool when it finds a problem. Such complaints go out as warnings in the current patch set, but the intent is to turn them into hard errors once most of the current problems have been fixed. Even if a given developer doesn't enable stack validation, others will, so changes that break the call stack will be returned for repairs in short order. The documentation file included with the patch set gives examples of the types of errors that may be reported and how to respond to them.

The current version of the patch set only supports the x86_64 architecture; evidently provisions have been made for adding other architectures, but the nature of the task ensures that a lot of the work will have to be done over again to support something else. Even with a single supported architecture, though, the stack validation work should help to bring an end to the long era where stack traces could not really be trusted. That is good for live patching, but any developer trying to figure out an oops will also benefit from this work. The live-patching developers may not have wanted to take this digression, but the kernel as a whole will be better off as a result of it.

Index entries for this article
Kernel: Live patching



set up proper stack frames

Posted Oct 1, 2015 9:05 UTC (Thu) by ballombe (subscriber, #9523) [Link] (1 responses)

But surely setting up a proper stack frame when it is otherwise unnecessary has a cost?

set up proper stack frames

Posted Oct 1, 2015 13:04 UTC (Thu) by jpoimboe (subscriber, #23893) [Link]

Yes, setting up frame pointers has a performance cost in both C and asm code. This tool just makes sure that asm code honors CONFIG_FRAME_POINTER so that the option consistently does what it's advertised to do.

Eventually we hope to have an x86 DWARF unwinder which will allow frame pointers to be disabled. This tool can then be extended to do a similar validation of DWARF CFI stack metadata.

Compile-time stack validation

Posted Oct 2, 2015 16:56 UTC (Fri) by karthik_s1 (subscriber, #60525) [Link] (4 responses)

What about inline functions? They don't appear in the stack trace either.

Compile-time stack validation

Posted Oct 3, 2015 2:24 UTC (Sat) by jreiser (subscriber, #11027) [Link] (3 responses)

Recent libgcc_s.Unwind*() code, such as used by glibc.backtrace(), can traceback correctly through most inline functions, as long as the compiler provides the proper DWARF-4 info to map pc to source line.

Compile-time stack validation

Posted Oct 4, 2015 22:51 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

That doesn't mean that doing DWARF-4 parsing in the kernel is remotely sane. The DWARF debugging info for the kernel is *huge*, and even for DWARF 4, insanely duplicated, and cannot by any stretch of the imagination be loaded into nonswappable kernel memory (as would be required for its use by reliable traceback in the sort of situations in which tracebacks often occur).

Compile-time stack validation

Posted Oct 5, 2015 3:55 UTC (Mon) by skissane (subscriber, #38675) [Link] (1 responses)

I assume that the kernel could put sufficient information in the traceback, that an offline tool could analyse that traceback using DWARF-4 data for that kernel binary?

Compile-time stack validation

Posted Oct 6, 2015 18:50 UTC (Tue) by nix (subscriber, #2304) [Link]

I think it would need to dump the whole stack to do that, which could easily include sensitive information. The problem is that even figuring out where the stack frames *are* in the stack requires really quite a lot of the DWARF.

The right approach is probably to deduplicate the DWARF at compile time, write it out in a much more compact form suitable for the stacktracer, and link it into the kernel, perhaps compressed. This is what DTrace does for kernel type resolution (though obviously it is doing this to a different bit of the DWARF, the type info, which needs more deduplication but doesn't need any kernel runtime support at all, since the userspace tool does all the necessary processing: I don't think this is possible for stack backtraces, as I note above).

Compile-time stack validation

Posted Oct 7, 2015 15:02 UTC (Wed) by malor (guest, #2973) [Link]

>As with many such stories, this is not an experience many kernel developers have had

Possibly worth pointing out: this may be because the people who've experienced it didn't become kernel developers.


Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds