Bounded loops in BPF for the 5.3 kernel
BPF programs have gained significantly in capabilities over the last few years and can now perform many useful operations. That said, BPF developers have had to work around an annoying limitation until recently: they could not use loops. This restriction was recently lifted by a patch set from Alexei Starovoitov that was merged for Linux 5.3. In addition to adding support for loops, it also greatly decreases the load time of most BPF programs.
The problem
Before a BPF program runs, it needs to be checked to ensure that it cannot cause harm to the system. For example, if it does not complete in a bounded time, it could be used to carry out a denial-of-service attack on the system. The task of checking programs is performed by the BPF verifier. Until recently, the verifier could not handle loops, meaning that all programs with loops were rejected.
Since loops are one of the basic ingredients of a computer program, this limitation was often hit by developers, who worked around it by unrolling loops (either by hand or using a compiler pragma). It comes as no surprise that there have been a number of attempts to change the situation by allowing loops that provably have a limited (and reasonable) number of iterations. Those include a proof-of-concept proposal from Edward Cree. Another attempt by John Fastabend tried to solve the problem with loop analysis: identifying each loop's induction variable and verifying its use. Fastabend covered the theoretical background and the possible solutions in a presentation at the 2018 Linux Plumbers Conference. The kernel developers decided not to take either solution at that time.
In parallel, important work has been put into optimizing the BPF verifier, as shown in a set of Linux Storage, Filesystem and Memory Management Summit slides by Jakub Kicinski. One important outcome of this work was increasing the size limitation for BPF programs in the 5.2 kernel; instead of 4096 instructions, a program can execute up to one million. These optimizations allow a more direct — even brute-force — approach to the bounded-loop-checking problem: instead of adding special handling for loops, the verifier can simply simulate the iterations of a loop as a collection of states no different from any others.
The BPF verifier
When the BPF verifier analyzes a program, it creates a model in the form of a state machine. It checks each state for incorrect behavior; while it does so, it records the states it has already verified. Later on, when it reaches a state that is equivalent to one that was already verified, it can conclude there is nothing more to do on that path, which can thus be pruned. This pruning significantly reduces the amount of work that the verifier must do.
State-based pruning is essential for verifier performance, but it imposes its own cost in the form of comparing states and copying the safe ones. Recent analysis of a set of networking-related BPF programs showed that 80% of the saved states will never be matched and, thus, will never prune a future search. As a result, there may be performance gains to be had by reducing the number of states and pruning points maintained by the verifier.
The old optimizer worked by placing pruning points both before and after each jump instruction, resulting in pruning points being added approximately every four instructions. It turns out that simply placing a pruning point every ten instructions, regardless of the instruction type, improves performance considerably. The verifier was also modified to aggressively drop saved states that do not actually prune paths. These changes improve verifier performance by up to 20% and increase the length of a program that can be verified in a reasonable time by one-third.
Adding bounded loops
Starovoitov's patch set introducing bounded loops builds on that earlier work. It also includes a number of other improvements. This is partly the result of a big performance regression introduced by the first patch in the series that is a necessary building block for the final solution.
That regression takes the form of a performance degradation when adding variable tracking on the stack. The compiler often puts variables on the stack (or "spills" them) when it needs to free up some registers. The verifier should be able to track such variables, since it may need to make decisions based on their contents. Until now it did not do so, resulting in certain programs being incorrectly rejected. It turns out that loop induction variables are often spilled, so fixing the tracking of their contents was necessary to be able to verify loop termination.
On the other hand, tracking specific variable values (as opposed to ranges of possible values) creates more states, decreasing the effectiveness of state pruning; tracking them on the stack makes the problem even worse. The number of states increases, so the verification time also increases. When debugging this issue, Starovoitov found another effect; the performance penalty is aggravated by changes in the Clang compiler. It turns out that newer Clang versions spill fewer variables onto the stack, reducing state pruning even in the absence of complete value tracking. The two problems together caused an important degradation of verifier performance, with the exact results available in the commit message.
Another feature needed by the bounded-loop support is extending the way the verifier handles conditional branches. If a comparison takes place between two constants, the verifier can easily determine which branch will be taken. Comparisons involving variable values are clearly harder, which is why, until now, the verifier supported only comparisons of a register with a constant. In this patch set, Starovoitov added support for tests comparing two registers as well. This enhancement was necessary to be able to simulate the execution of loop tests.
The third and final piece consists of adding a parent-child relationship between the states. When the verifier needs to explore two branches in the program, it considers them both as child states of the parent state. It also counts the number of branches to explore so that it knows how many are left. With those features, and more aggressive heuristics in the exploration state pruning, the verifier can support bounded loops. It does it simply by simulating all possible iterations of their execution.
With bounded-loop support added, only one item remained: the regression introduced at the beginning of the series. The solution comes in the last patch of the series. Based on the improvements added before (especially the parentage tracking), Starovoitov added tracking of precise scalar values. Those values are stored in registers and will be modified during program execution. The verifier needs to have precise values to analyze branches correctly, but it need not track the value of every register precisely; only those that control branching require that precision. So the verifier does not incur the cost of tracking all registers precisely; instead, when the need arises, it backtracks through the use of a register in the code to generate a precise value if possible.
Summary
The BPF verifier has undergone a number of changes in this patch set. The
resulting code not only adds support for bounded loops, but also a number of
important optimizations. Writing BPF programs for the kernel should be
rather easier in the 5.3 release, though undoubtedly BPF developers will
still have ample opportunity to complain about the remaining hoops they
have to jump through to convince the verifier that their programs are
safe.
Index entries for this article | |
---|---|
Kernel | BPF/Loops |
GuestArticles | Rybczynska, Marta |
Bounded loops in BPF for the 5.3 kernel
Posted Jul 31, 2019 21:51 UTC (Wed)
by mtaht (subscriber, #11087)
[Link] (17 responses)
Posted Jul 31, 2019 21:51 UTC (Wed) by mtaht (subscriber, #11087) [Link] (17 responses)
Bounded loops in BPF for the 5.3 kernel
Posted Jul 31, 2019 22:16 UTC (Wed)
by johill (subscriber, #25196)
[Link] (12 responses)
Posted Jul 31, 2019 22:16 UTC (Wed) by johill (subscriber, #25196) [Link] (12 responses)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 15:29 UTC (Thu)
by Karellen (subscriber, #67644)
[Link] (11 responses)
Posted Aug 1, 2019 15:29 UTC (Thu) by Karellen (subscriber, #67644) [Link] (11 responses)
Note that this means that some useful, safe programs could be rejected. That is unfortunate, but not an argument that verifiers are impossible, or even useless.
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 17:27 UTC (Thu)
by excors (subscriber, #95769)
[Link] (10 responses)
Posted Aug 1, 2019 17:27 UTC (Thu) by excors (subscriber, #95769) [Link] (10 responses)
The pre-verification language may be Turing-complete (if it had unbounded loops or equivalent). The post-verification language cannot be, assuming the verifier only accepts programs that it can prove will halt (and rejects ones where it can't decide), because there will always exist a computation that halts but that cannot be implemented in a program that the verifier will accept. The post-verification language may still be useful in practice (as you say), but it's not Turing-complete.
In practice, Turing-completeness seems a pretty useless concept and a big distraction. Turing machines only care about the boundary between infinite and non-infinite, so a computation that requires more time than exists in the universe is just as good as one that completes almost immediately. A practical software engineer cares about a boundary measured in seconds or milliseconds, and a program that takes years is just as bad as one that never halts, so they need a totally different set of tools to analyse that problem.
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 19:27 UTC (Thu)
by johill (subscriber, #25196)
[Link]
Posted Aug 1, 2019 19:27 UTC (Thu) by johill (subscriber, #25196) [Link]
The language itself is fairly obviously Turing complete (not that I've written down an emulator), but of course that is a theoretical concept anyway, as you say.
And in any case, going back to the OP's comment, BPF is almost certainly already reading email somewhere, in the form of DPI :-)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 19:48 UTC (Thu)
by Karellen (subscriber, #67644)
[Link] (8 responses)
Posted Aug 1, 2019 19:48 UTC (Thu) by Karellen (subscriber, #67644) [Link] (8 responses)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 2, 2019 13:39 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (3 responses)
Posted Aug 2, 2019 13:39 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)
int* p = NULL;
int i = *p;
Bounded loops in BPF for the 5.3 kernel
Posted Aug 5, 2019 16:40 UTC (Mon)
by shane (subscriber, #3335)
[Link] (2 responses)
Hm... turning your code into a C program:
Posted Aug 5, 2019 16:40 UTC (Mon) by shane (subscriber, #3335) [Link] (2 responses)
#include <stddef.h> int main () { int* p = NULL; int i = *p; return i; }We discover that clang doesn't warn or complain in any way:
$ clang --version clang version 8.0.0-3 (tags/RELEASE_800/final) Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/bin $ clang -Weverything foo.c $ ./a.out Segmentation fault (core dumped)Likewise by default GCC doesn't:
$ gcc --version gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ gcc -Wextra -Wall -Wpedantic foo.c $ ./a.out Segmentation fault (core dumped)We can use a feature disabled by default:
$ gcc -Werror=null-dereference -O foo.c foo.c: In function ‘main’: foo.c:7:9: error: null pointer dereference [-Werror=null-dereference] int i = *p; ^ cc1: some warnings being treated as errorsEnabling this warning apparently breaks the GCC self-bootstrapping build (as well as things like building the Linux kernel). The ticket which added the warning is here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16351
Bounded loops in BPF for the 5.3 kernel
Posted Aug 5, 2019 17:09 UTC (Mon)
by excors (subscriber, #95769)
[Link] (1 responses)
Posted Aug 5, 2019 17:09 UTC (Mon) by excors (subscriber, #95769) [Link] (1 responses)
GCC still knows that function must never be executed, so it replaces it with the "ud2" instruction (at least with gcc -O2).
Bounded loops in BPF for the 5.3 kernel
Posted Aug 6, 2019 9:27 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
Posted Aug 6, 2019 9:27 UTC (Tue) by farnz (subscriber, #17727) [Link]
As an aside, examining the code in Matt Godbolt's Compiler Explorer shows that gcc 9.1 leaves an unnecessary MOV before UD2, while clang goes straight to RET. MSVC compiles to clear RAX, load EAX from *RAX, and return 0, while ICC generates code to set up SSE2 the way it wants it, loads AL from *NULL, and then calls an "abort" subroutine.
Bounded loops in BPF for the 5.3 kernel
Posted Aug 6, 2019 11:39 UTC (Tue)
by massimiliano (subscriber, #3048)
[Link] (2 responses)
Posted Aug 6, 2019 11:39 UTC (Tue) by massimiliano (subscriber, #3048) [Link] (2 responses)
The way I see it, the semantics of the BPF is that a program needs to be verified, and then executed. The verification step is an integral part of the execution, because if affects the final result of the high level invocation of a BPF program (the invocation might fail because the verification fails).
My take is that the verifier, being implemented in a Turing complete language (C), is subject to the halting problem and in principle could never terminate.
It is like if every BPF program had an initial implicit "verify" instruction opcode. A BPF program without verification step is Turing complete (thanks to its loops), and one with the verification step... maybe is not Turing complete by itself (it number of states is bounded so in the end it is a a finite state machine), but it is still subject to the halting problems because of the verifier!
Of course, as said in other comments, nothing of this matters in practice. But it is one way of understanding how BPF programs are not violating the "halting problem" principle: it's just that kernel developers are trusting the verifier to always terminate :-)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 7, 2019 7:39 UTC (Wed)
by laarmen (subscriber, #63948)
[Link] (1 responses)
Posted Aug 7, 2019 7:39 UTC (Wed) by laarmen (subscriber, #63948) [Link] (1 responses)
Anyone aware of such work ?
Bounded loops in BPF for the 5.3 kernel
Posted Aug 8, 2019 19:27 UTC (Thu)
by johill (subscriber, #25196)
[Link]
The verifier always halts, simply because it gives up after a while. It only processes a maximum of 1e6 instructions (BPF_COMPLEXITY_LIMIT_INSNS) before giving up:
Posted Aug 8, 2019 19:27 UTC (Thu) by johill (subscriber, #25196) [Link]
if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { verbose(env, "BPF program is too large. Processed %d insn\n", env->insn_processed); return -E2BIG; }
Bounded loops in BPF for the 5.3 kernel
Posted Aug 7, 2019 15:48 UTC (Wed)
by fuhchee (guest, #40059)
[Link]
Posted Aug 7, 2019 15:48 UTC (Wed) by fuhchee (guest, #40059) [Link]
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 10:59 UTC (Thu)
by pizza (subscriber, #46)
[Link] (2 responses)
Posted Aug 1, 2019 10:59 UTC (Thu) by pizza (subscriber, #46) [Link] (2 responses)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 13:56 UTC (Thu)
by abatters (✭ supporter ✭, #6932)
[Link] (1 responses)
Posted Aug 1, 2019 13:56 UTC (Thu) by abatters (✭ supporter ✭, #6932) [Link] (1 responses)
Bounded loops in BPF for the 5.3 kernel
Posted Aug 3, 2019 14:06 UTC (Sat)
by sdalley (subscriber, #18550)
[Link]
Posted Aug 3, 2019 14:06 UTC (Sat) by sdalley (subscriber, #18550) [Link]
Bounded loops in BPF for the 5.3 kernel
Posted Aug 1, 2019 14:22 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link]
Posted Aug 1, 2019 14:22 UTC (Thu) by mathstuf (subscriber, #69389) [Link]
Should verification be done in the kernel?
Posted Aug 1, 2019 0:12 UTC (Thu)
by anadav (subscriber, #99427)
[Link] (1 responses)
Posted Aug 1, 2019 0:12 UTC (Thu) by anadav (subscriber, #99427) [Link] (1 responses)
Verifying byte code for safety guarantees is a known problem with practical and proven solutions. Since eBPF verification is not usually on the hot-path (or at least it should not be), it raises the question: why does eBPF verification have to be performed in the kernel? Wouldn’t it be better to perform it in userspace using standard tools instead?
It may be beneficial just to introduce this option to users, by separating the binary rewriting and verification stages and provide hooks for userspace to register an external verifier. IMA and other security measures can then be used to ensure the verifier itself is not compromised. Having such a verifier can also assists in the adoption of eBPF in other environments (e.g., DPDK).
As a reference [1], my colleagues have developed an eBPF verifier prototype that uses the well-known verification technique called abstract interpretation and shows promising results (good asymptotic scalability, very low rate of false positives). It verifies loop safety out of the box and can be extended to also check that programs with loops terminate.
[1] https://github.com/vbpf/ebpf-verifier
Should verification be done in the kernel?
Posted Aug 5, 2019 17:16 UTC (Mon)
by ecree (guest, #95790)
[Link]
Posted Aug 5, 2019 17:16 UTC (Mon) by ecree (guest, #95790) [Link]
There are some problems which are non-trivial, at least, with trying to extend PREVAIL to cover the full feature set of the existing verifier, including:
* the zone domain can only support relational difference constraints _or_ relational integer congruence constraints (Miné 2004 §5.4.6), not both at once (the paper just states this is 'straightforward' without further explanation) which is needed for alignment checking.
* similarly, the paper doesn't actually explain how abstract interpretation can check loop termination; once you've got a fixed-point state set, how do you determine that the program doesn't go round in circles within that set? Just stating that it's possible doesn't guide implementors much.
* the verifier today has checks more complicated than simple 'safety of execution', for instance it can detect certain speculative-execution vulnerabilities (don't ask me how this works). It's not clear that an abstract semantics can be written that deals with this.
* sure your runtime complexity is good, but that memory usage!
Our conclusion was that the best use we could make of PREVAIL was to incorporate some of its detail ideas, while rejecting the overall architecture. In particular, I was intrigued by the concept (§7.3) of having userspace supply (essentially) a safety proof which the kernel merely validates; the equivalent of this in the existing verifier's architecture would be to specify explored_states at pruning points which are both sufficient to prove safety downstream and provable from the upstream code (which currently are discovered by the verifier and often overly precise, leading to path explosion); this would be particularly useful in the form of loop invariants as the kernel verifier would only need to prove (a) holds at loop entry, (b) maintained by loop and (c) sufficient to prove safety from loop exit. I haven't had the time to dig into this myself (and expect to be far too busy in the near future).
Bounded loops in BPF for the 5.3 kernel
Posted Aug 5, 2019 17:21 UTC (Mon)
by ecree (guest, #95790)
[Link]
Posted Aug 5, 2019 17:21 UTC (Mon) by ecree (guest, #95790) [Link]
So the verifier does not incur the cost of tracking all registers precisely; instead, when the need arises, it backtracks through the use of a register in the code to generate a precise value if possible.Alexei can correct me, but I don't think this is quite accurate. AIUI, the verifier always calculates precise value bounds, but normally that precision is ignored for pruning. Only if the path from a pruning point to an exit uses the precise value does the precise value need to be checked for pruning safety. So the backtracking is marking parent states as "needs precision", not actually generating new values.