
The first kpatch submission

By Jonathan Corbet
May 7, 2014
It is spring in the northern hemisphere, so a young kernel developer's thoughts naturally turn to … dynamic kernel patching. Last week saw the posting of SUSE's kGraft live-patching mechanism; shortly thereafter, developers at Red Hat came forward with their competing kpatch mechanism. The approaches taken by the two groups show some interesting similarities, but also some significant differences.

Like kGraft, kpatch replaces entire functions within a running kernel. A kernel patch is processed to determine which functions it changes; the kpatch tools (not included with the patch, but available in this repository) then use that information to create a loadable kernel module containing the new versions of the changed functions. A call to the new kpatch_register() function within the core kpatch code will use the ftrace function tracing mechanism to intercept calls to the old functions, redirecting control to the new versions instead. So far, it sounds a lot like kGraft, but that resemblance fades a bit once one looks at the details.

KGraft goes through a complex dance during which both the old and new versions of a replaced function are active in the kernel; this is done in order to allow each running process to transition to the "new universe" at a (hopefully) safe time. Kpatch is rather less subtle: it starts by calling stop_machine() to bring all other CPUs in the system to a halt. Then, kpatch examines the stack of every process running in kernel mode to ensure that none are running in the affected function(s); should one of the patched functions be active, the patch-application process will fail. If things are OK, instead, kpatch patches out the old functions completely (or, more precisely, it leaves an ftrace handler in place that routes around the old function). There is no tracking of whether processes are in the "old" or "new" universe; instead, everybody is forced to the new universe immediately if it is possible.

There are some downsides to this approach. stop_machine() is a massive sledgehammer of a tool; kernel developers prefer to avoid it if at all possible. If kernel code is running inside one of the target functions, kpatch will simply fail; kGraft, instead, will work to slowly patch the system over to the new function, one process at a time. Some functions (examples would include schedule(), do_wait(), or irq_thread()) are always running somewhere in the kernel, so kpatch cannot be used to apply a patch that modifies them. On a typical system, there will probably be a few dozen functions that can block a live patch in this way — a pretty small subset of the thousands of functions in the kernel.

While kpatch, with its use of stop_machine(), may seem heavy-handed, there are some developers who would like to see it take an even stronger approach initially: Ingo Molnar suggested that it should use the process freezer (normally used when hibernating the system) to be absolutely sure that no processes have any running state within the kernel. That would slow live kernel patching even more, but, as he put it:

Well, if distros are moving towards live patching (and they are!), then it looks rather necessary to me that something scary as flipping out live kernel instructions with substantially different code should be as safe as possible, and only then fast.

The hitch with this approach, as noted by kpatch developer Josh Poimboeuf, is that there are a lot of unfreezable kernel threads. Frederic Weisbecker suggested that the kernel thread parking mechanism could be used instead. Either way, Ingo thought, kernel threads that prevented live patching would be likely to be fixed in short order. There was not a consensus in the end on whether freezing or parking kernel threads was truly necessary, but opinion did appear to be leaning in the direction of being slow and safe early on, then improving performance later.

The other question that has come up has to do with patches that change the format or interpretation of in-kernel data. KGraft tries to handle simple cases with its "universe" mechanism but, in many situations, something more complex will be required. According to kGraft developer Jiri Kosina, there is a mechanism in place to use a "band-aid function" that understands both forms of a changed data structure until all processes have been converted to the new code. After that transition has been made, the code that writes the older version of the changed data structure can be patched out, though it may be necessary to retain code that reads older data structures until the next reboot.

On the kpatch side, instead, there is currently no provision for making changes to data structures at all. The plan for the near future is to add a callback that can be packaged with a live patch; its job would be to search out and convert all affected data structures while the system is stopped and the patch is being applied. This approach has the potential to work without the need for maintaining the ability to cope with older data structures, but only if all of the affected structures can be located at patching time — a tall order, in many cases.

The good news is that few patches (of the type that one would consider for live patching) make changes to kernel data structures. As Jiri put it:

We've done some very light preparatory analysis and went through patches which would make most sense to be shipped as hot/live patches without enough time for proper downtime scheduling (i.e. CVE severity high enough (local root), etc). Most of the time, these turn out to be a one-or-few liners, mostly adding extra check, fixing bounds, etc. There were just one or two in a few years history where some extra care would be needed.

So the question of safely handling data-related changes can likely be deferred for now while the question of how to change the code in a running kernel is answered. There have already been suggestions that this topic should be discussed at the 2014 Kernel Summit in August. It is entirely possible, though, that the developers involved will find a way to combine their approaches and get something merged before then. There is no real disagreement over the end goal, after all; it's just a matter of finding the best approach for the implementation of that goal.

Index entries for this article
Kernel: Live patching



The first kpatch submission

Posted May 8, 2014 3:40 UTC (Thu) by flewellyn (subscriber, #5047) [Link] (1 responses)

It seems like what they (kGraft) want is a C-based kernelspace version of Common Lisp's UPDATE-INSTANCE-FOR-REDEFINED-CLASS. At least, that's what this reminded me of the most.

The first kpatch submission

Posted May 8, 2014 10:50 UTC (Thu) by NAR (subscriber, #1313) [Link]

kGraft also looks quite similar to Erlang's hot code loading. Actually, Erlang gen_servers have an explicit code_change function, which is called at upgrade time to change the internal data structures if necessary.

The first kpatch submission

Posted May 8, 2014 3:55 UTC (Thu) by nevets (subscriber, #11875) [Link] (3 responses)

It is entirely possible, though, that the developers involved will find a way to combine their approaches and get something merged before then.

I sure hope not. Kernel patching is extremely volatile and not something that should be taken lightly. In fact, I've set up a session at Linux Plumbers under the tracing mini-summit to talk about best approaches as well. And that's not going to happen until October.

Please let's not rush into this, otherwise we may end up with a solution that will need band-aid fixes for the foreseeable future (kind of like cpu hotplug).

The first kpatch submission

Posted May 8, 2014 7:34 UTC (Thu) by mjthayer (guest, #39183) [Link] (2 responses)

> Please let's not rush into this, otherwise we may end up with a solution that will need band-aid fixes for the foreseeable future (kind of like cpu hotplug).

I wonder if that can be avoided. Chances are that many of the issues involved will only be fully understood once patching is out in the wild anyway, so perhaps planning for something iterative from the start would be wisest. I suspect that people gaining experience in creating fixes which also work well as live patches will be part of the process.

The first kpatch submission

Posted May 8, 2014 10:47 UTC (Thu) by nevets (subscriber, #11875) [Link] (1 responses)

Sure, we can't make it perfect before it gets usage. There will always be something out there we didn't expect. That doesn't mean that we should still rush into this because "oh well, we can't predict what will go wrong anyway".

There's also the issue that all of this relies on the ftrace infrastructure, which isn't fully there yet to support live patching. I'm currently working on fixing that, but I also need to make sure that I don't break the current users of ftrace (function tracing) in the meantime. I'm taking this step by step, doing small changes each release for the exact reason you state. Every small update will have corner cases that I didn't expect but won't know about until it's in the wild. Thus, I do one step and see how it works. Then the next. Doing too many steps per release may cause more problems and perhaps hide more bugs.

The first kpatch submission

Posted May 8, 2014 19:21 UTC (Thu) by gerdesj (subscriber, #5446) [Link]

nevets: I like your approach.

I already enjoy rubbing Windows sysadmins' noses in it with uptime comparisons. Watching a live patching session blow up would be a little embarrassing 8)

Besides, I've waited a long time for this amazing potential feature and can wait longer.

Please take all the time you need for this.

Cheers
Jon

The first kpatch submission

Posted May 8, 2014 12:47 UTC (Thu) by freemars (subscriber, #4235) [Link]

The 2014 version of garbage collection?

Use cases for kernel patching

Posted May 8, 2014 22:35 UTC (Thu) by jhhaller (guest, #56103) [Link]

There are several use cases for kernel patching, each with different sensitivity to how the kernel is patched:
1. Systems with very long-running processes which would lose work if they needed to be shut down. Presumably they have some checkpointing already, since servers can stop at some random point for reasons other than needing to patch the OS (hardware failure, for example), and work would be lost in those instances without periodic checkpoints. Patching schemes don't seem to affect this use case.

2. Systems where applying patches live requires less administrative overhead and is therefore desirable. Again, this is not sensitive to the patching scheme.

3. Systems which are highly available, but without strict latency requirements. As long as the patching can be done without exceeding latency requirements, there is not much sensitivity.

4. Systems which are highly available with strict (sub-millisecond) latency requirements. These systems may be affected by patching schemes which bring the system to a halt while quiescing all processes.

Given that the first 3 use cases are not sensitive to patching mechanisms, if the patching mechanism can meet the 4th use case, it would serve most needs. I understand that the 4th use case may not be the majority, but is still fairly widespread.


Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds