A Fork in The Road (Fork-Hotos19)
2 HISTORY: FORK BEGAN AS A HACK

Although the term originates with Conway, the first implementation of a fork operation is widely credited to the Project Genie time-sharing system [61]. Ritchie and Thompson [70] themselves claimed that Unix fork was present “essentially as we implemented it” in Genie. However, the Genie monitor’s fork call was more flexible than that of Unix: it permitted the parent process to specify the address space and machine context for the new child process [49, 71]. By default, the child shared the address space of its parent (somewhat like a modern thread); optionally, the child could be given an entirely different address space of memory blocks to which the user had access; presumably, in order to run a different program. Crucially, however, there was no facility to copy the address space, as was done unconditionally by Unix.

Ritchie [69] later noted that “it seems reasonable to suppose that it exists in Unix mainly because of the ease with which fork could be implemented without changing much else.” He goes on to describe how the first fork was implemented in 27 lines of PDP-7 assembly, and consisted of copying the current process out to swap and keeping the child resident in memory.¹ Ritchie also noted that a combined Unix fork-exec “would have been considerably more complicated, if only because exec as such did not exist; its function was already performed, using explicit IO, by the shell.”

The TENEX operating system [18] yields a notable counter-example to the Unix approach. It was also influenced by Project Genie, but evolved independently of Unix. Its designers also implemented a fork call for process creation; however, more similarly to Genie, the TENEX fork either shared the address space between parent and child, or else created the child with an empty address space [19]. There was no Unix-style copying of the address space, likely because virtual memory hardware was available.²

Unix fork was not a necessary “inevitability” [61]. It was an expedient PDP-7 implementation shortcut that, for 50 years, has pervaded modern OSes and applications.

¹ Sharing memory between parent and child (as in Genie) was impractical, because the PDP-7 lacked virtual memory hardware; instead, Unix implemented multiprocessing by swapping full processes to disk.
² TENEX also supported copy-on-write memory, but this does not appear to have been used by fork [20].

3 ADVANTAGES OF THE FORK API

When Unix was rewritten for the PDP-11 (with memory translation hardware permitting multiple processes to remain resident), copying the process’s entire memory only to immediately discard it in exec was already, arguably, inefficient. We suspect that copying fork survived the early years of Unix mainly because programs and memory were small (eight 8 KiB pages on the PDP-11), memory access was fast relative to instruction execution, and it provided a compelling abstraction. There are two main aspects to this:

Fork was simple. As well as being easy to implement, fork simplified the Unix API. Most obviously, fork needs no arguments, because it provides a simple default for all the state of a new process: inherit it from the parent. In stark contrast, the Windows CreateProcess() API takes explicit parameters specifying every aspect of the child’s kernel state: 10 parameters and many optional flags.

More significantly, creating a process with fork is orthogonal to starting a new program, and the space between fork and exec serves a useful purpose. Since fork duplicates the parent, the same system calls that permit a process to modify its kernel state can be reused in the child prior to exec: the shell opens, closes, and remaps file descriptors prior to command execution, and programs can reduce permissions or alter the namespace of a child to run it in restricted context.

Fork eased concurrency. In the days before threads or asynchronous IO, fork without exec provided an effective form of concurrency. In the days before shared libraries, it enabled a simple form of code reuse. A program could initialise, parse its configuration files, and then fork multiple copies of itself that ran either different functions from the same binary or processed different inputs. This design lives on in pre-forking servers; we return to it in §6.

4 FORK IN THE MODERN ERA

At first glance, fork still seems simple. We argue that this is a deceptive myth, and that fork’s effects cause modern applications more harm than good.

Fork is no longer simple. Fork’s semantics have infected the design of each new API that creates process state. The POSIX specification now lists 25 special cases in how the parent’s state is copied to the child [63]: file locks, timers, asynchronous IO operations, tracing, etc. In addition, numerous system call flags control fork’s behaviour with respect to memory mappings (Linux madvise() flags MADV_DONTFORK/DOFORK/WIPEONFORK, etc.), file descriptors (O_CLOEXEC, FD_CLOEXEC) and threads (pthread_atfork()). Any non-trivial OS facility must document its behaviour across a fork, and user-mode libraries must be prepared for their state to be forked at any time. The simplicity and orthogonality of fork is now a myth.

Fork doesn’t compose. Because fork duplicates an entire address space, it is a poor fit for OS abstractions implemented in user-mode. Buffered IO is a classic example: a user must explicitly flush IO prior to fork, lest output be duplicated [73].

Fork isn’t thread-safe. Unix processes today support threads, but a child created by fork has only a single thread (a copy of the calling thread). Unless the parent serialises fork with respect to its other threads, the child address space may
A fork() in the road HotOS ’19, May 13–15, 2019, Bertinoro, Italy
end up as an inconsistent snapshot of the parent. A simple but common case is one thread doing memory allocation and holding a heap lock, while another thread forks. Any attempt to allocate memory in the child (and thus acquire the same lock) will immediately deadlock waiting for an unlock operation that will never happen.

Programming guides advise not using fork in a multi-threaded process, or calling exec immediately afterwards [64, 76, 77]. POSIX only guarantees that a small list of “async-signal-safe” functions can be used between fork and exec, notably excluding malloc() and anything else in standard libraries that may allocate memory or acquire locks. Real multi-threaded programs that fork are plagued by bugs arising from the practice [24–26, 66].

It is hard to imagine a new proposed syscall with these properties being accepted by any sane kernel maintainer.

Fork is insecure. By default, a forked child inherits everything from its parent, and the programmer is responsible for explicitly removing state that the child does not need by: closing file descriptors (or marking them as close-on-exec), scrubbing secrets from memory, isolating namespaces using unshare() [52], etc. From a security perspective, the inherit-by-default behaviour of fork violates the principle of least privilege. Furthermore, programs that fork but don’t exec render address-space layout randomisation ineffective, since each process has the same memory layout [17].

Fork is slow. In the decades since Thompson first implemented fork, memory size and relative access cost have grown continuously. Even by 1979 (when the third BSD Unix introduced vfork() [15]) fork was seen as a performance problem, and only copy-on-write techniques [3, 72] kept its performance acceptable. Today, even the time to establish copy-on-write mappings is a problem: Chrome experiences delays of up to 100 ms in fork [28], and Node.js applications can be blocked for seconds while forking prior to exec [56].

high price. Figure 1 plots the time to fork and exec from a process of varying size under Ubuntu 16.04.3 on an Intel i7-6850K CPU at 3.6 GHz. The dirty line shows the cost of forking a process with dirty pages, which must be downgraded to read-only for copy-on-write mappings. In the fragmented case, the parent dirties only its stack, but simulates memory layout in a complex application using shared libraries, address space randomisation, and just-in-time compilation, by allocating alternating read-only and read-write pages. By contrast, posix_spawn() takes the same time (around 0.5 ms) regardless of the parent’s size or memory layout.

Figure 1: Cost of fork() + exec() vs. posix_spawn(), plotted against parent process size (MiB).

Fork doesn’t scale. In Linux, the memory management operations needed to set up fork’s copy-on-write mappings are known to hurt scalability [22, 82], but the true problem lies deeper: as Clements et al. [29] observed, the mere specification of the fork API introduces a bottleneck, because (unlike spawn) it fails to commute with other operations on the process. Other factors further impede a scalable implementation of fork. Intuitively, the way to make a system scale is to avoid needless sharing. A forked process starts sharing everything with its parent. Since fork duplicates every aspect of a process’s OS state, it encourages centralisation of that state in a monolithic kernel where it is cheap to copy and/or reference count. This then makes it hard to implement, e.g., kernel compartmentalisation for security or reliability.

Fork encourages memory overcommit. The implementer of fork faces a difficult choice when accounting for memory used by copy-on-write page mappings. Each such page represents a potential allocation: if any copy of the page is modified, a new page of physical memory will be needed to resolve the page fault. A conservative implementation therefore fails the fork call unless there is sufficient backing store to satisfy all potential copy-on-write faults [55]. However, when a large process performs fork and exec, many copy-on-write page mappings are created but never modified, particularly if the exec’ed child is small, and having fork fail because the worst-case allocation (double the virtual size of the process) could not be satisfied is undesirable.

An alternative approach, and the default on Linux, is to overcommit virtual memory: operations that establish virtual address mappings, which includes fork’s copy-on-write clone of an address space, succeed immediately regardless of whether sufficient backing store exists. A subsequent page fault (e.g. a write to a forked page) can fail to allocate required memory, invoking the heuristic-based “out-of-memory killer” to terminate processes and free up memory.

To be clear, Unix does not require overcommit, but we argue that the widespread use of copy-on-write fork (rather
HotOS ’19, May 13–15, 2019, Bertinoro, Italy Andrew Baumann, Jonathan Appavoo, Orran Krieger, and Timothy Roscoe
than a spawn-like facility) strongly encourages it. Real applications are unprepared to handle apparently-spurious out-of-memory errors in fork [27, 37, 57]. Redis, which uses fork for persistence, explicitly advises against disabling memory overcommit [67]; otherwise, Redis would have to be restricted to only half the total virtual memory to avoid the risk of being killed in an out-of-memory situation.

Summary. Fork today is a convenient API for a single-threaded process with a small memory footprint and simple memory layout that requires fine-grained control over the execution environment of its children but does not need to be strongly isolated from them. In other words, a shell. It’s no surprise that the Unix shell was the first program to fork [69], nor that defenders of fork point to shells as the prime example of its elegance [4, 7]. However, most modern programs are not shells. Is it still a good idea to optimise the OS API for the shell’s convenience?

5 IMPLEMENTING FORK

While it is hard to quantify the cost of implementing fork on existing systems, there is clear evidence that supporting fork limits changes in OS architecture, and restricts the ability of OSes to adapt with hardware evolution.

Fork is incompatible with a single address space. Many modern contexts restrict execution to a single address space, including picoprocesses [42], unikernels [53], and enclaves [14]. Despite the fact that a much larger community of OS researchers work with and on Unix systems, researchers working with systems not based on fork have had a much easier time adapting them to these environments.

For example, the Drawbridge libOS [65] implements a binary-compatible Windows runtime environment within an isolated user-mode address space, known as a picoprocess. Drawbridge supports multiple “virtual processes” within the same shared address space; CreateProcess() is implemented by loading the new binary and libraries in a different portion of the address space, and then creating a separate thread to begin execution of the child, while ensuring cross-process system calls function as expected. Needless to say, there is no security isolation between these processes; the meaningful security boundary is the host picoprocess. However, this model has been used, for example, to support a full multi-process Windows environment inside an SGX enclave [14], enabling complex applications that involve multiple processes and programs to be deployed in an enclave.

In contrast, fork is unimplementable within a single address space [23] without complex compiler and linker modifications [81]. As a result, Unikernels derived from Unix systems do not support internal multi-process environments [44, 45] and running multi-process Linux applications in an enclave is much more complicated. SCONE and SGX-LKL support only single-process applications [6, 50]. Graphene-SGX [79] implements fork by creating a new enclave in a new host process, then copying the parent’s memory over an encrypted RPC stream; this can take seconds.

Fork is incompatible with heterogeneous hardware. Fork conflates the abstraction of a process with the hardware address space that contains it. In effect, fork restricts the definition of a process to a single address space and (as we saw earlier) a single thread running on some core.

Modern hardware, and the programs that run on it, just don’t look like this. Hardware is increasingly heterogeneous, and a process using, say, DPDK with a kernel-bypass NIC [12], or OpenCL with a GPU, cannot safely fork since the OS cannot duplicate the process state on the NIC/GPU. This appears to have been a continuing source of bafflement among GPU programmers for a decade at least [58–60, 74]. As future systems-on-chip incorporate more and more stateful accelerators, this is only going to get worse.

Fork infects an entire system. The mere choice to support fork places significant constraints on the system’s design and runtime environment. An efficient fork at any layer requires a fork-based implementation at all layers below it. For example, Cygwin is a POSIX compatibility environment for Windows; it implements fork in order to run Linux applications. Since the Win32 API lacks fork, Cygwin emulates it on top of CreateProcess() [31, 47]: it creates a new process running the same program as the parent and copies all writable pages (data sections, heap, stack, etc.) before resuming the child. This is neither fast nor reliable and can fail for many reasons, most often when memory addresses in parent and child differ due to address-space layout randomisation. Ironically, the NT kernel natively supports fork; only the Win32 API on which Cygwin depends does not (user-mode libraries and system services are not fork-aware, so a forked Win32 process crashes). As an abstraction, fork fails to compose: unless every layer supports fork, it cannot be used.

Fork in a research OS: the K42 experience

Many research operating systems have faced the dilemma of whether (and if so, how) to implement fork, with the authors having direct experience of six [13, 36, 41, 48, 51, 80]. This choice has significant implications. Implementing fork opens the door to a large class of Unix-derived applications, first among them shells and build tools that ease the construction of a complete system. However, it also ties the researchers’ hands: we conjecture that a system that implements fork, particularly one that attempts to do so efficiently, or early in its life, inexorably converges to a Unix-like design.

K42 [48] built on our experience with Tornado [36] that demonstrated the value of a multi-processor-friendly object-oriented approach, per-application customisable objects, and microkernel architecture [5] to enable pervasive locality and
helper function controlling every possible aspect of process state. It is infeasible for a single OS API to give complete control over the initial state of a new process. In Unix today, the only fallback for advanced use-cases remains code executed after fork, but clean-slate designs [e.g., 40, 43] have demonstrated an alternative model where system calls that modify per-process state are not constrained to merely the current process, but rather can manipulate any process to which the caller has access. This yields the flexibility and orthogonality of the fork/exec model, without most of its drawbacks: a new process starts as an empty address space, and an advanced user may manipulate it in a piecemeal fashion, populating its address-space and kernel context prior to execution, without needing to clone the parent nor run code in the context of the child. ExOS [43] implemented fork in user-mode atop such a primitive. Retrofitting cross-process APIs into Unix seems at first glance challenging, but may also be productive for future research.

Alternative: clone(). This syscall underlies all process and thread creation on Linux. Like Plan 9’s rfork() which preceded it, it takes separate flags controlling the child’s kernel state: address space, file descriptor table, namespaces, etc. This avoids one problem of fork: that its behaviour is implicit or undefined for many abstractions. However, for each resource there are two options: either share the resource between parent and child, or else copy it. As a result, clone suffers most of the same problems as fork (§4–5).

Fork-only use-cases. There exist special cases where fork is not followed by exec, that rely on duplicating the parent.

Multi-process servers. Traditionally the standard way to build a concurrent server was to fork off processes. However, the reasons that motivated multi-process servers are long gone: OS libraries are thread-safe, and the scalability bottlenecks that plagued early threaded or event-driven servers are fixed [10]. While process boundaries may have value from a fault isolation perspective, we believe that it makes more sense to use a spawn API to start those processes. The performance advantage of the shared initial state created by fork is less relevant when most concurrency is handled by threads, and modern operating systems deduplicate memory. Finally, with fork, all processes share the same address-space layout and are vulnerable to Blind ROP attacks [17].

Copy-on-write memory. Modern implementations of fork use copy-on-write to reduce the overhead of copying memory that is often soon discarded [72]. A number of applications have since taken a dependency on fork merely to gain access to copy-on-write memory. One common pattern involves forking from a pre-initialised process, to reduce startup overhead and memory footprint of a worker process, as in the Android Zygote [39, 62] and Chrome site isolation on Linux [4]. Another pattern uses fork to capture a consistent snapshot of a running process’s address space, allowing the parent to continue execution; this includes persistence support in Redis [68], and some reverse debuggers [21].

POSIX would benefit from an API for using copy-on-write memory independently of forking a new process. Bittau [16] proposed checkpoint() and resume() calls to take copy-on-write snapshots of an address space, thus reducing the overhead of security isolation. More recently, Xu et al. [82] observed that fork time dominates the performance of fuzzing tools, and proposed a similar snapshot() API. These designs are not yet general enough to cover all the use-cases outlined above, but perhaps can serve as a starting point. We note that any new copy-on-write memory API must tackle the issue of memory overcommit described in §4, but decoupling this problem from fork should make it much simpler.

7 GET THE FORK OUT OF MY OS!

We’ve described how fork is a relic of the past that harms applications and OS design. There are three things we must do to rectify the situation.

Deprecate fork. Thanks to the success of Unix, future systems will be stuck supporting fork for a long time; nevertheless, an implementation hack of 50 years ago should not be permitted to dictate the design of future OSes. We should therefore strongly discourage the use of fork in new code, and seek to remove it from existing apps. Once fork is gone from performance-critical paths, it can be removed from the core of the OS and reimplemented on top as needed. If future systems supported fork only in limited cases, such as a single-threaded process [2], it would remain possible to run legacy software without needless implementation complexity.

Improve the alternatives. For too long, fork has been the generic process creation mechanism on Unix-like systems, with other abstractions layered on top. Thankfully, this has begun to change [32, 38], but there is more to do (§6).

Fix our teaching. Clearly, students need to learn about fork; however, at present most textbooks (and we presume instructors) introduce process creation with fork [7, 35, 78]. This not only perpetuates fork’s use, it is counterproductive: the API is far from intuitive. Just as a programming course would not today begin with goto, we suggest teaching either posix_spawn() or CreateProcess(), and then introducing fork as a special case with its historic context (§2).

ACKNOWLEDGEMENTS

We thank all who provided feedback, including: Tom Anderson, Remzi Arpaci-Dusseau, Marc Auslander, Bill Bolosky, Ulrich Drepper, Chris Hawblitzel, Eddie Kohler, Petros Maniatis, Mathias Payer, Michael Stumm, Robbert Van Renesse, and the anonymous reviewers.
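
As an illustration of the spawn-first style suggested in §7, the following is a minimal Python sketch, not a definitive implementation, that runs the same command once with fork followed by exec and once with posix_spawn, redirecting the child's standard output to a pipe in both cases. The helper names (`run_with_fork_exec`, `run_with_spawn`, `capture`) are hypothetical and introduced only for this example. It assumes a Unix-like system with an `sh` on the PATH and Python 3.8 or later, where `os.posix_spawnp()` is available.

```python
import os


def _exit_code(status):
    """Translate a waitpid() status into an exit code (negative if signalled)."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def run_with_fork_exec(argv, stdout_fd):
    """Traditional pattern: duplicate the parent, fix up state, then exec."""
    pid = os.fork()
    if pid == 0:
        # Child: a copy of the whole parent. Do as little as possible here.
        try:
            os.dup2(stdout_fd, 1)      # remap stdout, as a shell would
            os.execvp(argv[0], argv)   # does not return on success
        finally:
            os._exit(127)              # reached only if dup2 or exec failed
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


def run_with_spawn(argv, stdout_fd):
    """Spawn pattern: describe the child's initial state up front."""
    file_actions = [(os.POSIX_SPAWN_DUP2, stdout_fd, 1)]
    pid = os.posix_spawnp(argv[0], argv, os.environ, file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    return _exit_code(status)


def capture(runner, argv):
    """Run argv via runner with stdout sent to a pipe; return (code, output).

    The parent waits before reading, so this only suits small outputs that
    fit in the pipe buffer.
    """
    read_fd, write_fd = os.pipe()
    try:
        code = runner(argv, write_fd)
    finally:
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        return code, reader.read()


if __name__ == "__main__":
    command = ["sh", "-c", "echo hello from the child"]
    for runner in (run_with_fork_exec, run_with_spawn):
        print(runner.__name__, capture(runner, command))
```

In the spawn variant the redirection is passed as data (a list of file actions) rather than as code executed in a copy of the parent, so no interpreter code runs in the child before the new program starts.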
REFERENCES

[1] The BeBook: The Kernel Kit: load_image(). ACCESS Co., 1.0 edition, March 2008. URL https://www.haiku-os.org/legacy-docs/bebook/TheKernelKit_Images.html#load_image.
[2] The BeBook: Threads and Teams. ACCESS Co., 1.0 edition, March 2008. URL https://www.haiku-os.org/legacy-docs/bebook/TheKernelKit_ThreadsAndTeams_Overview.html.
[3] Mike Accetta, Robert Baron, William Bolosky, David Golub, Richard Rashid, Avadis Tevanian, and Michael Young. Mach: A new kernel foundation for UNIX development. In USENIX Summer Conference, pages 93–113, June 1986.
[4] Thomas Anderson and Michael Dahlin. Operating Systems: Principles and Practice. Recursive Books, 2nd edition, 2014. ISBN 978-0-9856735-2-9.
[5] Jonathan Appavoo, Dilma Da Silva, Orran Krieger, Marc Auslander, Michal Ostrowski, Bryan Rosenburg, Amos Waterland, Robert W. Wisniewski, Jimi Xenidis, Michael Stumm, and Livio Soares. Experience distributing objects in an SMMP OS. ACM Transactions on Computer Systems, 25(3), August 2007. ISSN 0734-2071. doi: 10.1145/1275517.1275518.
[6] Sergei Arnautov, Bohdan Trach, Franz Gregor, Thomas Knauth, Andre Martin, Christian Priebe, Joshua Lind, Divya Muthukumaran, Dan O’Keeffe, Mark L. Stillwell, David Goltzsche, Dave Eyers, Rüdiger Kapitza, Peter Pietzuch, and Christof Fetzer. SCONE: Secure Linux containers with Intel SGX. In 12th USENIX Symposium on Operating Systems Design and Implementation, pages 689–703. USENIX Association, 2016. ISBN 978-1-931971-33-1. URL https://www.usenix.org/conference/osdi16/technical-sessions/presentation/arnautov.
[7] Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau. Operating Systems: Three Easy Pieces, chapter 5. Arpaci-Dusseau Books, 1.00 edition, March 2018. URL http://www.ostep.org/.
[8] Vaggelis Atlidakis, Jeremy Andrus, Roxana Geambasu, Dimitris Mitropoulos, and Jason Nieh. POSIX abstractions in modern operating systems: The old, the new, and the missing. In EuroSys Conference, pages 19:1–19:17. ACM, 2016. ISBN 978-1-4503-4240-7. doi: 10.1145/2901318.2901350.
[9] Jean Bacon and Tim Harris. Operating Systems: Concurrent and Distributed Software Design. Addison Wesley, 2003. ISBN 0-321-11789-1.
[10] Gaurav Banga and Jeffrey C. Mogul. Scalable kernel performance for Internet servers under realistic loads. In 1998 USENIX Annual Technical Conference. USENIX Association, 1998. URL https://www.usenix.org/legacy/publications/library/proceedings/usenix98/banga.html.
[11] Amnon Barak, Shai Guday, and Richard G. Wheeler. The MOSIX Distributed Operating System: Load Balancing for UNIX. Springer-Verlag Berlin Heidelberg, 1993. doi: 10.1007/3-540-56663-5.
[12] Dotan Barak. Libibverbs Programmer’s Manual: ibv_fork_init(3), October 2006. URL https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/man/ibv_fork_init.3.md.
[13] Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schuepbach, and Akhilesh Singhania. The multikernel: a new OS architecture for scalable multicore systems. In 22nd ACM Symposium on Operating Systems Principles. ACM, October 2009. doi: 10.1145/1629575.1629579.
[14] Andrew Baumann, Marcus Peinado, and Galen Hunt. Shielding applications from an untrusted cloud with Haven. In 11th USENIX Symposium on Operating Systems Design and Implementation, pages 267–283, October 2014. ISBN 978-1-931971-16-4. URL https://www.usenix.org/conference/osdi14/technical-sessions/presentation/baumann.
[15] 2.9.1 BSD System Calls Manual: vfork(2). Berkeley Software Distribution, Berkeley, CA, USA, 1983. URL https://www.freebsd.org/cgi/man.cgi?query=vfork&manpath=2.9.1+BSD.
[16] Andrea Bittau. Toward Least-Privilege Isolation for Software. PhD thesis, Department of Computer Science, University College London, November 2009. URL http://www.scs.stanford.edu/~sorbo/bittau-phd.pdf.
[17] Andrea Bittau, Adam Belay, Ali Mashtizadeh, David Mazières, and Dan Boneh. Hacking blind. In IEEE Symposium on Security and Privacy, pages 227–242. IEEE Computer Society, 2014. ISBN 978-1-4799-4686-0. doi: 10.1109/SP.2014.22.
[18] Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy, and Raymond S. Tomlinson. TENEX, a paged time sharing system for the PDP-10. In 3rd ACM Symposium on Operating Systems Principles. ACM, 1971. doi: 10.1145/800212.806492.
[19] TENEX JSYS Manual. Bolt Beranek and Newman, Cambridge, MA, USA, 2nd edition, September 1973. URL http://www.bitsavers.org/pdf/bbn/tenex/TenexJSYSMan_Sep73.pdf.
[20] TENEX 1.33 source code, CFORK system call. Bolt Beranek and Newman, 1975. URL https://github.com/PDP-10/tenex/blob/master/133-tenex/forks.mac#L208.
[21] Bob Boothe. Efficient algorithms for bidirectional debugging. In ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 299–310. ACM, 2000. ISBN 1-58113-199-2. doi: 10.1145/349299.349339.
[22] Silas Boyd-Wickizer, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. OpLog: a library for scaling update-heavy data structures. Technical Report MIT-CSAIL-TR-2014-019, MIT CSAIL, September 2014. URL http://hdl.handle.net/1721.1/89653.
[23] Jeffrey S. Chase, Henry M. Levy, Michael J. Feeley, and Edward D. Lazowska. Sharing and protection in a single-address-space operating system. ACM Transactions on Computer Systems, 12(4):271–307, November 1994. ISSN 0734-2071. doi: 10.1145/195792.195795.
[24] Chromium Project. Bug 36678, 2010. URL https://crbug.com/36678.
[25] Chromium Project. Bug 56596, 2010. URL https://crbug.com/56596.
[26] Chromium Project. Bug 177218, 2013. URL https://crbug.com/177218.
[27] Chromium Project. Bug 856535, 2018. URL https://crbug.com/856535.
[28] Chromium Project. Bug 819228, 2018. URL https://crbug.com/819228.
[29] Austin T. Clements, M. Frans Kaashoek, Nickolai Zeldovich, Robert T. Morris, and Eddie Kohler. The scalable commutativity rule: Designing scalable software for multicore processors. ACM Transactions on Computer Systems, 32(4):10:1–10:47, January 2015. ISSN 0734-2071. doi: 10.1145/2699681.
[30] OpenVMS System Services Reference Manual: $CREPRC. Compaq Computer Corporation, Houston, TX, USA, April 2001. URL http://h30266.www3.hpe.com/odl/vax/opsys/vmsos73/vmsos73/4527/4527pro_018.html#jun_147. Document number ZK4527.
[31] Cygwin 2.11 User’s Guide. Cygwin, November 2018. URL https://cygwin.com/cygwin-ug-net/highlights.html#ov-hi-process.
[32] Casper Dik. posix_spawn() as an actual system call. Oracle Solaris Blog, February 2018. URL https://blogs.oracle.com/solaris/posix_spawn-as-an-actual-system-call.
[33] D. Eastlake, R. Greenblatt, J. Holloway, T. Knight, and S. Nelson. ITS 1.5 Reference Manual. MIT Artificial Intelligence Laboratory, Cambridge, MA, USA, July 1969. URL https://hdl.handle.net/1721.1/6165. Memo number AIM-161A.
[34] Rich Felker. vfork considered dangerous. October 2012. URL https://ewontfix.com/7.
[35] Greg Gagne, Abraham Silberschatz, and Peter B. Galvin. Operating Systems Concepts. John Wiley & Sons, 9th edition, 2012. ISBN 978-1-118-06333-0.
[36] Ben Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm. Tornado: Maximizing locality and concurrency in a shared memory multiprocessor operating system. In 3rd USENIX Symposium on Operating Systems Design and Implementation, February 1999. URL https://www.usenix.org/legacy/events/osdi99/gamsa.html.
[37] GNOME Project. Merge request 95, 2018. URL https://gitlab.gnome.org/GNOME/glib/merge_requests/95.
[38] GNU C Library. Bug 10354, 2016. URL https://sourceware.org/bugzilla/show_bug.cgi?id=10354.
[39] Android Developer Documentation: Overview of memory management. Google, 2018. URL https://developer.android.com/topic/performance/memory-overview#SharingRAM.
[40] Gernot Heiser and Kevin Elphinstone. L4 microkernels: The lessons from 20 years of research and deployment. ACM Transactions on Computer Systems, 34(1):1:1–1:29, April 2016. ISSN 0734-2071. doi: 10.1145/2893177.
[41] Gernot Heiser, Kevin Elphinstone, Jerry Vochteloo, Stephen Russell, and Jochen Liedtke. The Mungi single-address-space operating system. Software, Practice and Experience, 28(9):901–928, 1998.
[42] Jon Howell, Bryan Parno, and John R. Douceur. How to run POSIX apps in a minimal picoprocess. In 2013 USENIX Annual Technical Conference, pages 321–332. USENIX Association, 2013. URL https://www.usenix.org/conference/atc13/technical-sessions/presentation/howell.
[43] M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Hector M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. Application performance and flexibility on exokernel systems. In 16th ACM Symposium on Operating Systems Principles, pages 52–65, 1997. ISBN 0-89791-916-5. doi: 10.1145/268998.266644.
[44] Antti Kantee. On rump kernels and the Rumprun unikernel, August 2015. URL https://xenproject.org/2015/08/06/on-rump-kernels-and-the-rumprun-unikernel/.
[45] Avi Kivity, Dor Laor, Glauber Costa, Pekka Enberg, Nadav Har’El, Don Marti, and Vlad Zolotarov. OSv—optimizing the operating system for virtual machines. In 2014 USENIX Annual Technical Conference, pages 61–72, 2014. ISBN 978-1-931971-10-2. URL https://www.usenix.org/conference/atc14/technical-sessions/presentation/kivity.
[46] Eddie Kohler. Harvard University CS 61 problem set 4: WeensyOS, October 2018. URL https://cs61.seas.harvard.edu/site/2018/WeensyOS/. See also https://twitter.com/xexd/status/951977086331359232.
[47] David G. Korn. Porting UNIX to Windows NT. In 1997 USENIX Annual Technical Conference, January 1997. URL https://www.usenix.org/legacy/publications/library/proceedings/ana97/korn.html.
[48] Orran Krieger, Marc Auslander, Bryan Rosenburg, Robert W. Wisniewski, Jimi Xenidis, Dilma Da Silva, Michal Ostrowski, Jonathan Appavoo, Maria Butrico, Mark Mergen, Amos Waterland, and Volkmar Uhlig. K42: Building a complete operating system. In EuroSys Conference, pages 133–145. ACM, 2006. ISBN 1-59593-322-0. doi: 10.1145/1217935.1217949.
[49] Butler W. Lampson. SDS 940 lectures. June 1966. URL http://archive.computerhistory.org/resources/text/SDS/sds.lampson.SDS_940_lectures.1966.102634499.pdf.
[50] SGX-LKL. Large-Scale Data & Systems Group, Imperial College London, 2018. URL https://github.com/lsds/sgx-lkl.
[51] Ian Leslie, Derek McAuley, Richard Black, Timothy Roscoe, Paul Barham, David Evers, Robin Fairbairns, and Eoin Hyden. The design and implementation of an operating system to support distributed multimedia applications. IEEE Journal on Selected Areas in Communications, 14(7):1280–1297, September 1996. doi: 10.1109/49.536480.
[52] Linux Programmer’s Manual: unshare(2). Linux man-pages project, March 2019. URL http://man7.org/linux/man-pages/man2/unshare.2.html.
[53] Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David Scott, Balraj Singh, Thomas Gazagnaire, Steven Smith, Steven Hand, and Jon Crowcroft. Unikernels: Library operating systems for the cloud. In 18th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 461–472. ACM, 2013. ISBN 978-1-4503-1870-9. doi: 10.1145/2451116.2451167.
[54] Windows API: CreateProcessW function. Microsoft, April 2018. URL https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-createprocessw.
[55] Greg Nakhimovsky. Minimizing memory usage for creating application subprocesses. Sun Microsystems, May 2006. URL https://www.oracle.com/technetwork/server-storage/solaris10/subprocess-136439.html.
[56] Node.js. Issue 14917, 2018. URL https://github.com/nodejs/node/issues/14917.
[57] Node.js. Issue 25382, 2019. URL https://github.com/nodejs/node/issues/25382.
[58] Nvidia Developer Forum. CUDA and fork(), December 2007. URL https://devtalk.nvidia.com/default/topic/382954/cuda-programming-and-performance/cuda-and-fork-/.
[59] Nvidia Developer Forum. Linux fork() and CUDA OOM possible bug, March 2009. URL https://devtalk.nvidia.com/default/topic/453458/linux-fork-and-cuda-oom-possible-bug-/.
[60] Nvidia Developer Forum. (CUDA8.0 BUG?) Child process forked after cuInit() get CUDA_ERROR_NOT_INITIALIZED on cuInit(), October 2016. URL https://devtalk.nvidia.com/default/topic/973477/-cuda8-0-bug-child-process-forked-after-cuinit-get-cuda_error_not_initialized-on-cuinit-/.
[61] Linus Nyman and Mikael Laakso. Notes on the history of fork and join. IEEE Annals of the History of Computing, 38(3):84–87, July 2016. ISSN 1058-6180. doi: 10.1109/MAHC.2016.34.
[62] Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. SOCK: Rapid task provisioning with serverless-optimized containers. In 2018 USENIX Annual Technical Conference, pages 57–70, 2018. ISBN 978-1-931971-44-7. URL https://www.usenix.org/conference/atc18/presentation/oakes.
[63] Base Specifications POSIX.1-2017. The Open Group, San Francisco, CA, USA, 2018. URL http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html. IEEE Std 1003.1-2017.
[64] Damian Pietras. Threads and fork(): think twice before mixing them. June 2009. URL https://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them.
[65] Donald E. Porter, Silas Boyd-Wickizer, Jon Howell, Reuben Olinsky, and Galen C. Hunt. Rethinking the library OS from the top down. In 16th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 291–304. ACM, 2011. ISBN 978-1-4503-0266-1. doi: 10.1145/1950365.1950399.
[66] Python Project. Issue 27126, 2016. URL https://bugs.python.org/issue27126.
[67] Redis FAQ: Background saving fails with a fork() error under Linux. Redis, 2018. URL https://redis.io/topics/faq.
[68] Redis Persistence. Redis, 2018. URL https://redis.io/topics/persistence.
[69] Dennis M. Ritchie. The evolution of the Unix time-sharing system. In Jeffrey M. Tobias, editor, Language Design and Programming Methodology, volume 79 of Lecture Notes in Computer Science, pages 25–35. Springer, 1980. ISBN 978-3-540-38579-0. doi: 10.1007/3-540-09745-7_2.
[70] Dennis M. Ritchie and Ken Thompson. The UNIX time-sharing system. Communications of the ACM, 17(7):365–375, July 1974. ISSN 0001-0782. doi: 10.1145/361011.361061.
[71] SDS 940 Time-Sharing System Technical Manual. Scientific Data Systems, Santa Monica, CA, USA, November 1967. URL http://bitsavers.org/pdf/sds/9xx/940/901116A_940_TimesharingTechMan_Nov67.pdf. Publication number 90 11 16A.
[72] Jonathan M. Smith and Gerald Q. Maguire, Jr. Effects of copy-on-write memory management on the response time of UNIX fork operations. Computing Systems: The Journal of the USENIX Association, 1(3):255–278, 1988. URL https://www.usenix.org/legacy/publications/