Rethinking Classical
Concurrency Patterns
Bryan C. Mills
[email protected]
GH: @bcmills
Hi, I'm Bryan Mills. I work on the Go Open Source Project at Google.
In this talk we're going to rethink some classical concurrency patterns you may have
encountered in Go programs.
“You cannot flip a brain from zero to one simply by praising the one. You must start at
the zero, extoll its virtues, explore its faults, exhort your listeners to look beyond it. To
weigh the zero against the one, the listener must have both in mind together. Only
when they have freely chosen the one will they abandon the zero.”
― The Codeless Code, Case 196: Fee
This talk covers two principles that you will hopefully find familiar.
We're going to apply them to some concurrency patterns that are hopefully also
familiar.
The two principles relate to the Go concurrency primitives: goroutines and channels.
Start goroutines
when you have
concurrent work.
The first principle is, “Start goroutines when you have concurrent work.”
Share
by communicating.
If you understand the implications of these two principles, I have nothing more to
teach you in this talk.
But I've written about a hundred slides trying to understand them myself, and I'd
appreciate your feedback.
Introduction
Asynchronous APIs
Condition Variables
Worker Pools
Recap
First we'll examine the basic “asynchronous” patterns — futures and queues — which
function as the concurrency primitives in some other languages.
Then we'll do a deep dive on condition variables. If you've been scuba diving, you
know that the deeper you dive, the less time you can spend at depth: the slides will
have a lot of code, but we won't spend much time on the details. The slides will be
available after the talk and there are Playground links in the notes.
In the third section, we'll apply what we've learned to analyze the worker-pool pattern.
Asynchronous
APIs
First, let's talk about asynchronous APIs.
You've heard from Rob Pike that “Concurrency is not Parallelism”. Concurrency is not
Asynchronicity either.
DEFINITION
An asynchronous API
returns to the caller
before its result is ready.
ASYNCHRONOUS APIS
For the purpose of this talk, an asynchronous API is one that returns to the calling
function early.
This is not how we write Go. (You likely know that already.)
ASYNCHRONOUS APIS
The problems with asynchronous callbacks are well-described already. You may be
thinking, "Why are we talking about callbacks? This is Go, and Go programmers know
to use channels and goroutines instead!"
I agree: please don't use asynchronous callbacks, and we won't discuss them further.
But that brings us to two other asynchronous patterns: Futures, and
Producer–Consumer Queues.
https://play.golang.org/p/jlrZshjnJ9z
FUTURE: API
// Fetch immediately returns a channel, then fetches
// the requested item and sends it on the channel.
// If the item does not exist,
// Fetch closes the channel without sending.
func Fetch(name string) <-chan Item {
	c := make(chan Item, 1)
	go func() {
		[…]
		c <- item
	}()
	return c
}
ASYNCHRONOUS APIS
In the Future pattern, instead of returning the result, the function returns a proxy
object that allows the caller to wait for the result at some later point.
You may also know “futures” by the name “async and await” in languages that have
built-in support for the pattern.
https://play.golang.org/p/v_IGf8tU3UT
FUTURE: CALL SITE
Yes:

a := Fetch("a")
b := Fetch("b")
consume(<-a, <-b)

No:

a := <-Fetch("a")
b := <-Fetch("b")
consume(a, b)
To use Futures for concurrency, the caller must set up concurrent work
before retrieving results.
ASYNCHRONOUS APIS
Callers of a Future-based API set up the work, then retrieve the results.
If they retrieve the results too early, the program executes sequentially instead of
concurrently.
https://play.golang.org/p/wVzp2Cou54I
PRODUCER–CONSUMER QUEUE: API
// Glob finds all items with names matching pattern
// and sends them on the returned channel.
// It closes the channel when all items have been sent.
func Glob(pattern string) <-chan Item {
	c := make(chan Item)
	go func() {
		defer close(c)
		for […] {
			[…]
			c <- item
		}
	}()
	return c
}
ASYNCHRONOUS APIS
A producer–consumer queue also returns a channel, but the channel receives any
number of results and is typically unbuffered.
https://play.golang.org/p/GpnC3KgwlT0
PRODUCER–CONSUMER QUEUE: CALL SITE
ASYNCHRONOUS APIS
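The call-site code is missing from this extraction. A typical consumer just ranges over the returned channel; here is a minimal sketch (handle is a stand-in, and the pattern string is made up):

for item := range Glob("[ab]*") {
	handle(item)
}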
CLASSICAL BENEFIT
Responsiveness.
Avoid blocking UI and network threads.
ASYNCHRONOUS APIS
Now that we know what an asynchronous API looks like, let's examine the reasons
we might want to use them.
Most other languages don't multiplex across OS threads, and kernel schedulers can
be unpredictable. So some popular languages and frameworks keep all of the UI or
network logic on a single thread. If that thread makes a call that blocks for too long,
the UI becomes choppy, or network latency spikes.
Since calls to asynchronous APIs by definition don't block, they help to keep
single-threaded programs responsive.
CLASSICAL BENEFIT
Efficiency.
Reduce idle threads.
ASYNCHRONOUS APIS
Languages that don't multiplex over threads can use asynchronous APIs to keep
threads busy, reducing the total number of threads — and context-switches —
needed to run the program.
EFFECTIVE GO
The runtime manages threads for us, so there is no single “UI” or “network” thread to
block, and we don't have to touch the kernel to switch goroutines.
Kavya gave a lot more detail about that in her excellent talk this morning.
The runtime also resizes and relocates thread stacks as needed, so goroutine stacks
can be very small — and don't need to fragment the address space with guard pages.
Today, a goroutine stack starts around two kilobytes — half the size of the smallest
amd64 page.
https://golang.org/doc/effective_go.html#goroutines
CLASSICAL BENEFIT
Efficiency.
Reclaim stack frames.
ASYNCHRONOUS APIS
An asynchronous call may allow the caller to return from arbitrarily many frames of the
stack.
That frees up the memory containing those stack frames for other uses,¹ and allows
the Go runtime to collect any other allocations that are only reachable from those
frames.
Furthermore, the compiler can already prune out any stack allocations that it knows
are unreachable: it can move large allocations to the heap, and the garbage collector
can ignore dead references.
Finally, the benefit of this optimization depends on the specific call site: if the caller
doesn't have a lot of data on the stack in the first place, then making the call
asynchronous won't help much.
When we take all that into account, asynchronicity as an optimization is very subtle: it
requires careful benchmarks for the impact on specific callers, and the impact may
change or even reverse from one version of the runtime to the next. It's not the sort of
optimization we want to build a stable API around!
https://golang.org/doc/faq#stack_or_heap
GO BENEFIT
Concurrency.
Initiate concurrent work.
ASYNCHRONOUS APIS
When an asynchronous function returns, the caller can immediately make further calls
to start other concurrent work.
Concurrency can be especially important for network RPCs, where the CPU cost of a
call is very low compared to its latency.
ASYNCHRONOUS APIS
Caller-side
ambiguity
Unfortunately, that benefit comes at the cost of making the caller side of the API much
less clear.
Pop quiz!
ASYNCHRONOUS APIS
Let's look at some examples. Suppose that we come across an asynchronous call
while we're debugging or doing a code review.
a := Fetch("a")
b := Fetch("b")
if err := […] {
	return err
}
consume(<-a, <-b)
ASYNCHRONOUS APIS
If we return without waiting for the futures to complete, how long will they continue
using resources?
Might we start Fetches faster than we can retire them, and run out of memory?
QUIZ TIME!
a := Fetch(ctx, "a")
b := Fetch(ctx, "b")
[…]
consume(<-a, <-b)
ASYNCHRONOUS APIS
Will Fetch keep using the passed-in context after it has returned?
If so, what happens if we cancel it and then try to read from the channel?
Will we receive a zero-value, some other sentinel value, or block?
QUIZ TIME!
ASYNCHRONOUS APIS
If we return without draining the channel from Glob, will we leak a goroutine?
QUIZ TIME!
ASYNCHRONOUS APIS
Will Glob keep using the passed-in Context as we iterate over the results?
If so, what happens if we cancel it? Will we still get results?
When, if ever, will the channel be closed?
ASYNCHRONOUS APIS
These asynchronous APIs raise a lot of questions, and to answer those questions we
would have to go digging around in the documentation — if the answers are even
there.
So let's rethink this pattern: how can we get the benefits of asynchronicity without this
ambiguity?
Way back to the drawing board. We're using goroutines to implement these
asynchronous APIs, but what is a goroutine, anyway?
https://golang.org/doc/effective_go.html#goroutines
Start goroutines
when you have
concurrent work.
The benefit of asynchronicity is that it allows the caller to initiate other work. But how
do we know that the caller even has any other work?
Functions like Fetch and Glob shouldn't need to know what other work their callers
may be doing. That's not their job.
ASYNCHRONOUS ≡ SYNCHRONOUS
func Async(x In) <-chan Out {
	c := make(chan Out, 1)
	go func() {
		c <- Synchronous(x)
	}()
	return c
}

func Synchronous(x In) Out {
	c := Async(x)
	return <-c
}
ASYNCHRONOUS APIS
https://play.golang.org/p/FxSsaTToIHe
¹ See http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/.
Add concurrency on
the caller side
of the API.
If we keep the API synchronous, we may need to add concurrency at the call site.
CALLER-SIDE CONCURRENCY: SYNCHRONOUS API
ASYNCHRONOUS APIS
ASYNCHRONOUS APIS
The caller can use whatever pattern they like to add concurrency.
In many cases, they won't even need to go through channels, so the questions about
channel usage won't arise.
Here, we're using the golang.org/x/sync/errgroup package and writing the results
directly into local variables.
https://play.golang.org/p/zbP4cko_rTI
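The slide's code is missing from this extraction, so here is a sketch of the shape I mean, assuming a synchronous Fetch(ctx, name) that returns (Item, error) and using golang.org/x/sync/errgroup:

var a, b Item
g, ctx := errgroup.WithContext(ctx)
g.Go(func() error {
	var err error
	a, err = Fetch(ctx, "a") // write the result directly into a local variable
	return err
})
g.Go(func() error {
	var err error
	b, err = Fetch(ctx, "b")
	return err
})
if err := g.Wait(); err != nil {
	return err
}
consume(a, b)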
Make concurrency
an internal detail.
As long as we present a simple, synchronous API to the caller, they don't need to
care how many concurrent calls its implementation makes.
INTERNAL CONCURRENCY: SYNCHRONOUS API
ASYNCHRONOUS APIS
ASYNCHRONOUS APIS
Internally it can Fetch all of its items concurrently and stream them to a channel, but
the caller doesn't need to know that.
And because the channel is local to the function, we can see both the sender and
receiver locally.
That makes the answers to our channel questions obvious:
● Since the send is unconditional, the receive loop must drain the channel.
● In case of error, the err variable is set and the channel is still closed.
https://play.golang.org/p/XU1kBmfkKl3
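The slide's code is also missing here. A sketch of the shape described above, assuming a synchronous Fetch and a hypothetical match helper that expands the pattern to a list of names (it differs in detail from the slide's version; in particular, the send here is skipped on error):

func Glob(ctx context.Context, pattern string) ([]Item, error) {
	names, err := match(pattern) // hypothetical helper
	if err != nil {
		return nil, err
	}

	c := make(chan Item)
	g, ctx := errgroup.WithContext(ctx)
	for _, name := range names {
		name := name
		g.Go(func() error {
			item, err := Fetch(ctx, name)
			if err == nil {
				c <- item // the receive loop below keeps draining until close
			}
			return err
		})
	}
	go func() {
		err = g.Wait() // set err, then close: the close publishes err to the receiver
		close(c)
	}()

	var items []Item
	for item := range c {
		items = append(items, item)
	}
	if err != nil {
		return nil, err
	}
	return items, nil
}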
Concurrency
is not
Asynchronicity.
We can call synchronous APIs concurrently, and they're clearer at the call site.
We don't need to pay the cost of asynchronicity to get the benefits of concurrency.
Condition
Variables
Condition variables — our next classical pattern — are part of a larger concurrency
pattern called Monitors, but the phrase “condition variable” appears in the Go
standard library, whereas “monitor” (in this sense) does not, so that's what this section
is called.
The concept of monitors dates to 1973,¹ and condition variables to 1974,² so this is a
fairly old pattern.
CONDITION VARIABLES
https://play.golang.org/p/8m1i4IgeaIw
CONDITION VARIABLE: WAIT AND SIGNAL
func (q *Queue) Get() Item {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		q.itemAdded.Wait()
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item
}

func (q *Queue) Put(item Item) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
	q.itemAdded.Signal()
}
CONDITION VARIABLES
The two basic operations on condition variables are Wait, and Signal.
Wait atomically unlocks the mutex and suspends the calling goroutine.
Signal wakes up a waiting goroutine, which relocks the mutex before proceeding.
In our queue, we can use Wait to block on the availability of enqueued items, and
Signal to indicate when another item has been added.
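The slides never show the Queue declaration. A plausible one (hedged: the fields are inferred from the methods above) pairs the condition with its mutex at construction time:

type Queue struct {
	mu        sync.Mutex
	itemAdded sync.Cond
	items     []Item
}

func NewQueue() *Queue {
	q := new(Queue)
	q.itemAdded.L = &q.mu // a sync.Cond needs a Locker before first use
	return q
}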
CONDITION VARIABLE: BROADCAST
type Queue struct {
	[…]
	closed bool
}
CONDITION VARIABLES
Broadcast is usually for events that affect all waiters, such as marking the end of the
queue...
CONDITION VARIABLE: BROADCAST
func (q […]) GetMany(n int) []Item {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) < n {
		q.itemAdded.Wait()
	}
	items := q.items[:n:n]
	q.items = q.items[n:]
	return items
}

func (q *Queue) Put(item Item) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
	q.itemAdded.Broadcast()
}
Since we don't know which of the GetMany calls may be ready after a Put,
we can wake them all up and let them decide.
CONDITION VARIABLES
Here, we've changed Get to GetMany. After a Put, one of the waiting GetMany calls
may be ready to complete, but Put has no way of knowing which one to wake, so it
must wake all of them.
¹ These sloppy wakeups, as it turns out, are pretty much the original use-case for the
Broadcast operation.
See Lampson & Redell, Experience with Processes and Monitors in Mesa, 1980.
CONDITION VARIABLES
Condition variables have a lot of different use-cases that we'll want to focus on one at
a time, but the downsides are similar for all of them, so we'll start there.
CONDITION VARIABLES
Spurious
wakeups
For events that aren't really global, Broadcast may wake up too many waiters. For
example, one call to Put wakes up all of the GetMany callers, even though at most
one of them will actually be able to complete.
Even Signal can result in spurious wakeups: if Put used Signal instead of Broadcast,
it could wake up a caller that is not yet ready instead of one that is. If it does that
repeatedly, it could strand items in the queue without corresponding wakeups.
If we're very careful, we can minimize or avoid spurious wakeups — but that generally
adds even more complexity and subtlety to the code.
CONDITION VARIABLES
Forgotten
signals
If we prune out spurious signals too aggressively, we risk going too far and dropping
some that are actually necessary.
And since the condition variable decouples the signal from the data, it's easy to add
some new code to update the data and forget to signal the condition.
CONDITION VARIABLES
Starvation
Even if we don't forget a signal, if the waiters are not uniform, the pickier ones can
starve.
Suppose that we have one call to GetMany(3000), and one caller executing
GetMany(3) in a tight loop. The two waiters will be about equally likely to wake up, but
the GetMany(3) call will be able to consume three items, whereas GetMany(3000)
won't have enough ready. The queue will remain drained and the larger call will block
forever.
Unresponsive
cancellation
The whole point of condition variables is to put a goroutine to sleep while we wait for
something to happen.
But while we're waiting for the condition, we may miss some other event that we
ought to notice too. For example, the caller might decide they don't want to wait that
long and cancel a passed-in Context, expecting us to notice and return more-or-less
immediately.
Unfortunately, condition variables only let us wait for events associated with their own
mutex, so we can't select on a condition and a cancellation at the same time.¹
Even if the caller cancels, our call will block until the next time the condition is
signalled.
EFFECTIVE GO
Let's look at the use-cases for condition variables and rethink them in terms of
communication. Perhaps we'll spot a pattern.
GO BENEFIT
Sharing.
Release resources.
CONDITION VARIABLES
CONDITION VARIABLES
Our resources for this example will be net.Conns in a pool, and we'll start with the
condition-variable version for reference.
We've got a limit on the total number of connections, plus a pool of idle connections,
and a condition variable that tells us when the set of connections changes.
When we're done with a connection, we can either release it back into the idle pool, or
hijack it so that it no longer counts against the limit.
https://play.golang.org/p/cpDHxvzTMSp
CONDITION VARIABLE: RESOURCE POOL
func (p *Pool) Acquire() (net.Conn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.idle) == 0 && p.numConns >= p.limit {
		p.cond.Wait()
	}
	if len(p.idle) > 0 {
		c := p.idle[len(p.idle)-1]
		p.idle = p.idle[:len(p.idle)-1]
		return c, nil
	}
	c, err := dial()
	if err == nil {
		p.numConns++
	}
	return c, err
}
Loop until a resource is available, then extract it from the shared state.
CONDITION VARIABLES
Now let's rethink. Let's share resources by communicating the resources themselves.
Resource limits
are resources too!
And the limit is a resource too: in particular, an available slot toward the limit is a thing
we can consume.
A buffered channel can be
used like a semaphore […].
The capacity of the channel
buffer limits the number of
simultaneous calls to
process.
EFFECTIVE GO
Effective Go has a hint for that. It mentions another classical concurrency pattern: the
semaphore, which was described by Dijkstra in the early 1960s.
COMMUNICATION: RESOURCE POOL
type Pool struct {
	sem  chan token
	idle chan net.Conn
}

type token struct{}

func NewPool(limit int) *Pool {
	sem := make(chan token, limit)
	idle := make(chan net.Conn, limit)
	return &Pool{sem, idle}
}

func (p *Pool) Release(c net.Conn) {
	p.idle <- c
}

func (p *Pool) Hijack(c net.Conn) {
	<-p.sem
}
CONDITION VARIABLES
So we'll have a channel for the limit tokens, and one for the idle-connection
resources. A send on the semaphore channel will communicate that we have
consumed a slot toward the limit, and the idle channel will communicate the actual
connections as they are idled.
Now Release and Hijack have become trivial: Release literally puts the connection
back in the pool, and Hijack releases a token from the semaphore. They've dropped
from four-line bodies to one line each: instead of locking, storing the resource,
signalling, and unlocking, they simply communicate the resource.
If we really wanted to, we could use a single channel for this instead of two: we could
use a nil net.Conn to represent “permission to create a new connection”. Personally, I
think the code is clearer with separate channels.
https://play.golang.org/p/j_OmiKuyWo8
COMMUNICATION: RESOURCE POOL
func (p *Pool) Acquire(ctx context.Context) (net.Conn, error) {
	select {
	case conn := <-p.idle:
		return conn, nil
	case p.sem <- token{}:
		conn, err := dial()
		if err != nil {
			<-p.sem
		}
		return conn, err
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}
CONDITION VARIABLES
Acquire ends up a lot simpler too — and even cancellation is just one more case in
the select.
GO BENEFIT
Synchronization.
Indicate the existence of new data.
CONDITION VARIABLES
Conditions can also indicate the existence of new data for processing.
CONDITION VARIABLE: ONE ITEM PER SIGNAL
func (q *Queue) Get() Item {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		q.itemAdded.Wait()
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item
}

func (q *Queue) Put(item Item) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
	q.itemAdded.Signal()
}
CONDITION VARIABLES
Let's go back to our queue example. For the single-item Get and Put, a signal
indicates the availability of an item of data...
CONDITION VARIABLE: ZERO OR ONE ITEMS?
func ([…]) GetMany(n int) []Item {
	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) < n {
		q.itemAdded.Wait()
	}
	items := q.items[:n:n]
	q.items = q.items[n:]
	return items
}

func (q *Queue) Put(item Item) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, item)
	q.itemAdded.Broadcast()
}
Each Put wakes up all GetMany, but at most one will consume the item.
CONDITION VARIABLES
...while in the GetMany version it indicates potential availability of an item that some
other goroutine may have already consumed.
That imprecise targeting is the cause of both spurious wakeups and starvation.
Share data
by communicating
the data.
To avoid spurious wakeups, we should signal only the goroutine that will actually
consume the data. But if we know which goroutine will consume the data, we may as
well send the data along too.
Sending the data makes it much easier to see whether the signal is spurious: if we
resend the exact same data to the same receiver, or if the caller explicitly ignores the
channel-receive — for example, by executing a continue in a range loop — then we
probably didn't need to send it in the first place.
Sending the data also makes signals harder to forget: we'll very likely notice if we
compute data and then don't send it anywhere, although we do still have to be careful
to send it to all interested receivers.
Metadata
are data too!
The information about “who needs which data” is also data. We can communicate that
too!
COMMUNICATION: QUEUE
CONDITION VARIABLES
We need two channels: one to communicate the items, and another to communicate
whether any items even exist.
Both will have a buffer size of one: the items channel functions like a mutex, while the
empty channel is a like a one-token semaphore.
https://play.golang.org/p/uvx8vFSQ2f0
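The slide's type declaration is missing here; a sketch consistent with the Get and Put on the next slide (reconstructed from their channel usage, so treat the details as assumptions):

type Queue struct {
	items chan []Item // holds a non-empty batch; buffer size one
	empty chan bool   // holds a token while the queue is empty; buffer size one
}

func NewQueue() *Queue {
	q := &Queue{
		items: make(chan []Item, 1),
		empty: make(chan bool, 1),
	}
	q.empty <- true // the queue starts out empty
	return q
}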
COMMUNICATION: QUEUE
func (q *Queue) Get() Item {
	items := <-q.items
	item := items[0]
	items = items[1:]
	if len(items) == 0 {
		q.empty <- true
	} else {
		q.items <- items
	}
	return item
}

func (q *Queue) Put(item Item) {
	var items []Item
	select {
	case items = <-q.items:
	case <-q.empty:
	}
	items = append(items, item)
	q.items <- items
}
CONDITION VARIABLES
This time, we really do need the two separate channels: Put needs to know when
there are no items so that it can start a new slice, but Get wants only non-empty
items.
COMMUNICATION: QUEUE CANCELLATION
func (q *Queue) Get(ctx context.Context) (Item, error) {
	var items []Item
	select {
	case <-ctx.Done():
		return 0, ctx.Err()
	case items = <-q.items:
	}
	item := items[0]
	if len(items) == 1 {
		q.empty <- true
	} else {
		q.items <- items[1:]
	}
	return item, nil
}
To support cancellation, select on operations that block indefinitely.
Operations that we know will not block do not need to select.
CONDITION VARIABLES
We don't need to select on the sends at the end because we know they won't block:
when we received the items, we also received the information that our goroutine owns
those items.
SPECIFIC COMMUNICATION: QUEUE
type waiter struct {
	n int
	c chan []Item
}

type state struct {
	items []Item
	wait  []waiter
}

type Queue struct {
	s chan state
}

func NewQueue() *Queue {
	s := make(chan state, 1)
	s <- state{}
	return &Queue{s}
}
CONDITION VARIABLES
To figure out whether we should wake a GetMany caller, we need to know how many
items it wants. Then we need a channel on which we can send those items to that
particular caller.
We'll put the items and the metadata together in one “queue state” struct and, just for
good measure, we'll share that state by communicating it too. A channel with a
one-element buffer functions much like a selectable mutex.
https://play.golang.org/p/rzSXpophC_p
SPECIFIC COMMUNICATION: QUEUE
func ([…]) GetMany(n int) []Item {
	s := <-q.s
	if len(s.wait) == 0 && len(s.items) >= n {
		items := s.items[:n:n]
		s.items = s.items[n:]
		q.s <- s
		return items
	}
	c := make(chan []Item)
	s.wait = append(s.wait, waiter{n, c})
	q.s <- s
	return <-c
}

func (q *Queue) Put(item Item) {
	s := <-q.s
	s.items = append(s.items, item)
	for len(s.wait) > 0 {
		w := s.wait[0]
		if len(s.items) < w.n {
			break
		}
		w.c <- s.items[:w.n:w.n]
		s.items = s.items[w.n:]
		s.wait = s.wait[1:]
	}
	q.s <- s
}
Put sends to the next waiter if — and only if — it has enough items
for that waiter.
CONDITION VARIABLES
To get a run of items, we first check the current state for sufficient items. If there aren't
enough, we add an entry to the metadata.
To put an item to the queue, we append it to the current state and check the metadata
to see whether that makes enough items to send to the next waiter. When we don't
have enough items left, we'll stop sending items and send back the updated state.
GO BENEFIT
Coordination.
Mark transitions.
CONDITION VARIABLES
Broadcast on a condition may signal a transition from one state to another: for
example, it may indicate that the program has finished loading its initial configuration,
or that a communication stream has been terminated.
CONDITION VARIABLE: REPEATING TRANSITION
type Idler struct {
	mu    sync.Mutex
	idle  sync.Cond
	busy  bool
	idles int64
}

func NewIdler() *Idler {
	i := new(Idler)
	i.idle.L = &i.mu
	return i
}

func (i *Idler) AwaitIdle() {
	i.mu.Lock()
	defer i.mu.Unlock()
	idles := i.idles
	for i.busy && idles == i.idles {
		i.idle.Wait()
	}
}

func (i *Idler) SetBusy(b bool) {
	i.mu.Lock()
	defer i.mu.Unlock()
	wasBusy := i.busy
	i.busy = b
	if wasBusy && !i.busy {
		i.idles++
		i.idle.Broadcast()
	}
}
Since awakened goroutines must wait to reacquire the mutex,
the waiter must be robust to subsequent changes.
CONDITION VARIABLES
You might think we would only need to store the current state — the “busy” boolean
— but that turns out to be a very subtle decision. If AwaitIdle looped only until it saw a
non-busy state, it would be possible to transition from busy to idle and back before
AwaitIdle got the chance to check, and we would miss short idle events.
Go's condition variables — unlike pthread condition variables — don't have spurious
wakeups, so in theory we could return from AwaitIdle unconditionally after the first
Wait call.
However, it's common for condition-based code to intentionally over-signal — for
example, as a workaround for an undiagnosed deadlock — so to avoid introducing
subtle problems later it's best to keep the code robust to spurious wakeups.
Instead, we can track the cumulative count of events, and wait until either we catch
the idle event in progress or observe its effect on the counter.
https://play.golang.org/p/HYiRtJcyaX9
Share completion
by completing
communication.
CONDITION VARIABLES
https://play.golang.org/p/4WG59Juxjch
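The slide's code is missing from this extraction. One way to realize the slogan (a hedged sketch, not necessarily the playground's exact code) is to represent each busy period by a channel that SetBusy closes on the busy-to-idle transition:

type state struct {
	busy bool
	idle chan struct{} // closed whenever busy is false
}

type Idler struct {
	st chan state // buffer size one: holds the current state
}

func NewIdler() *Idler {
	idle := make(chan struct{})
	close(idle) // start out idle
	st := make(chan state, 1)
	st <- state{busy: false, idle: idle}
	return &Idler{st}
}

func (i *Idler) AwaitIdle() {
	st := <-i.st
	i.st <- st
	<-st.idle // completes even if a new busy period has started since
}

func (i *Idler) SetBusy(b bool) {
	st := <-i.st
	if b && !st.busy {
		st.idle = make(chan struct{}) // a fresh channel for the new busy period
	} else if !b && st.busy {
		close(st.idle) // completing the channel communicates the transition
	}
	st.busy = b
	i.st <- st
}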
GO BENEFIT
Fanout.
Broadcast events.
CONDITION VARIABLES
Broadcast may also signal ephemeral events, such as configuration reload requests.
Events can be data.
We can treat broadcast events like data updates, and send them individually to each
interested subscriber.¹
The os/signal package in the standard library takes that approach, so that waiters can
receive multiple events on the same channel.
Alternatively, we can treat the event as the completion of the “hasn't happened yet”
state — and indicate it by closing a channel.¹
That typically results in fewer channel allocations, but when we have closed the
channel we can't communicate any additional data about the event.
We started with asynchronous patterns, which deal with goroutines. Then, we looked
at condition variables, which sometimes deal with resources.
Now, let's put them together. The Worker Pool is a pattern that treats a set of
goroutines as resources.¹
Just a note on terminology: in other languages the pattern is usually called a “thread
pool”,² but in Go we're working with goroutines, so we just call them workers.
² I've had some trouble tracking down the origin of this term: it's attested by 1999, in
Doug Lea's Concurrent Programming in Java, second edition, but I suspect it's older
than that. (The phrase “thread pool” appears much earlier — the first reference I could
find is in The 7th International Conference on Distributed Computing Systems, Berlin,
West Germany, September 21-25, 1987 — but it's not clear to me when it began to
refer to this specific pattern.)
WORKER POOL
Start the workers:

work := make(chan Task)
for n := limit; n > 0; n-- {
	go func() {
		for task := range work {
			perform(task)
		}
	}()
}
WORKER POOLS
In the Worker Pool pattern, we start up a fixed number of “worker” goroutines that
each read and perform tasks from a channel.
Another goroutine — often the same one that started the workers — sends the tasks
to the workers. The sender blocks until a worker is available to receive the next task.
CLASSICAL BENEFIT
Efficiency.
Distribute work across threads.
WORKER POOLS
In languages with heavyweight threads, the worker pool pattern allows us to reuse
threads for multiple tasks, avoiding the overhead of creating and destroying threads
for small amounts of work.
GO BENEFIT
Flow control.
Limit work in flight.
WORKER POOLS
The benefit that worker pools do provide in Go is to limit the amount of concurrent
work in flight.
If each task needs to use some limited resource — such as file handles, network
bandwidth, or even a nontrivial amount of RAM — a worker pool can bound the peak
resource usage of the program.
WORKER POOLS: COST
Worker
lifetimes
The simple worker pool I showed earlier has a problem. It leaks the workers forever.
If the API we're implementing is synchronous — and remember what we said before
about asynchronous APIs? — or, if we want to be able to reset the worker state for a
unit-test, then we need to be able to shut down the workers and know when they've
finished.
WORKER POOL: CLEANING UP
Start the workers:

work := make(chan Task)
var wg sync.WaitGroup
for n := limit; n > 0; n-- {
	wg.Add(1)
	go func() {
		for task := range work {
			perform(task)
		}
		wg.Done()
	}()
}
WORKER POOLS
https://play.golang.org/p/zz7LkM6WqX0
WORKER POOL: CLEANING UP
Send the work:

for _, task := range hugeSlice {
	work <- task
}
WORKER POOLS
...then, after we send the work, we can close the channel and wait for the workers to
exit.
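In code, that cleanup is just:

close(work)
wg.Wait()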
WORKER POOLS: COST
Idle workers
But we may have another problem: even if we remember to clean up the workers
when we're done, we may leave them idle for a long time — especially toward the end
of work — and “the end of work” may be forever if we've accidentally deadlocked
something.
Assuming we've remembered to clean up, if we have a deadlock our tests will hang
instead of passing. So at least we can get a goroutine dump to help debug, right?
WORKER POOL: DEBUGGING
WORKER POOLS
Hmm. All of those idle workers are still hanging around in our goroutine dump.
That will make the interesting goroutines a lot harder to find — especially if our
program happens to be a large service implemented with several different pools. It
will also be a problem if we want to use a goroutine dump to debug other issues, such
as crashes or memory leaks.
This is an actual goroutine dump from a test failure involving a deadlock between a
worker pool and the goroutine that feeds it. This one is just a toy: there's only one
pool, and it only has a hundred workers. Even so, one of the goroutines involved in
the deadlock ended up all the way at the bottom of page four — and these are long
pages!
https://play.golang.org/p/zz7LkM6WqX0
QUIZ TIME!
2 kilobytes
x N workers
= ?
And remember, goroutines are lightweight, not free: those idle workers still have a
resource cost, too, and for large pools that cost may not be completely negligible.
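To put numbers on it: at two kilobytes apiece, ten thousand idle workers hold roughly
twenty megabytes of stack space, and a million hold around two gigabytes, before any of
those stacks grows.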
WORKER POOLS
So let's rethink this pattern: how can we get the same benefits as worker pools
without the complexity of workers and their lifetimes?
Start goroutines
when you have
concurrent work
to do now.
We want to start the goroutines only when we're actually ready to do the work, and let
them exit as soon as the work is done.
Let's do just that part and see where we end up.
WAITGROUP: DISTRIBUTING (UNLIMITED) WORK
Start the work:

var wg sync.WaitGroup
for _, task := range hugeSlice {
	wg.Add(1)
	go func(task Task) {
		perform(task)
		wg.Done()
	}(task)
}
WORKER POOLS
If we only need to distribute work across threads, we can omit the worker pool and its
channel and use only the WaitGroup.
This code is a lot simpler, but now we need to figure out how to limit the in-flight work
again.
Share resources
by communicating
the resources.
WORKER POOLS
The semaphore example in Effective Go acquires a token inside the goroutine, but
we'll acquire it earlier — right where we had the WaitGroup Add call. We don't want a
lot of goroutines sitting around doing nothing, and this way we have only one idle
goroutine instead of many. Recall that we acquire this semaphore by sending a token,
and we release it by discarding a token.
This semaphore fits pretty nicely in place of the WaitGroup, and that's no accident:
sync.WaitGroup is very similar to a semaphore.¹ The only major difference is that the
WaitGroup allows further Add calls during Wait, whereas the wait loop on our
semaphore channel does not. Fortunately, that doesn't usually matter, as in this case.
¹ #20687 notwithstanding.
SEMAPHORE CHANNEL: INVERTED WORKER POOL?
Semaphore:

sem := make(chan token, limit)
for _, task := range hugeSlice {
	sem <- token{}
	go func(task Task) {
		perform(task)
		<-sem
	}(task)
}

Worker pool:

work := make(chan Task)
for n := limit; n > 0; n-- {
	go func() {
		for task := range work {
			perform(task)
		}
	}()
}
The semaphore pattern has the same number of lines as the worker pool,
but no leaked workers!
WORKER POOLS
Remember our first worker pool with the two loops, and how we leaked all those idle
workers forever?
If you look carefully, these are the same two loops swapped around – we've
eliminated the leak without adding any net lines of code.
Recap
So let's recap.
But before that, I have one last note to add: in this talk I've focused on making the
code clear and robust.
The patterns I'm recommending here should all be reasonably efficient — generally
the right asymptotic complexity and reasonable constant factors — but I don't promise
that they provide optimal performance. I haven't benchmarked them.
If you have, you may find that performance is better with one of the patterns I've
cautioned against. You may take the downsides of those patterns into account and
decide to use the patterns anyway. If you do, please remember to document your
reasoning, and check in the benchmarks that support it. The Go language itself
doesn't change much at the moment, but the implementation certainly does.
Start goroutines
when you have
concurrent work.
Opaque signals about shared memory make it entirely too easy to send the signals to
the wrong place, or miss them entirely.
Instead, communicate where things need to go, and then communicate to send them
there.
GOPHERCON 2018
Rethinking Classical
Concurrency Patterns
Bryan C. Mills
[email protected]
GH: @bcmills
I'd appreciate any feedback you may have, in person here at GopherCon or by email
to the golang-nuts list or to [email protected].
Backup Slides
FUTURE (THE PRECISE WAY)
API:

func Fetch(name string) (func() Item) {
	item := new(Item)
	ready := make(chan struct{})
	[…]
	return func() Item {
		<-ready
		return *item
	}
}
ASYNCHRONOUS APIS
For completeness, here's the alternative. It doesn't fit on a slide very well, but trust me
that it has similar problems.
CONDITION VARIABLES
https://play.golang.org/p/Fw04CeyIifx
COMMUNICATION: EVENT FANOUT (CHANNELS)
type Notifier struct {
	st chan state
}
CONDITION VARIABLES
We can send a description of the event on multiple channels, one per waiter.
https://play.golang.org/p/uRwV_i0v13T
COMMUNICATION: EVENT FANOUT
func (n *Notifier) AwaitChange(ctx context.Context, seq int64) (newSeq int64) {
	c := make(chan int64, 1)
	st := <-n.st
	if st.seq == seq {
		st.wait = append(st.wait, c)
	} else {
		c <- st.seq
	}
	n.st <- st
	select {
	case <-ctx.Done():
		return seq
	case newSeq = <-c:
		return newSeq
	}
}

func (n *Notifier) NotifyChange() {
	st := <-n.st
	for _, c := range st.wait {
		c <- st.seq + 1
	}
	n.st <- state{st.seq + 1, nil}
}
CONDITION VARIABLES
Because we're using multiple channels in place of a single condition variable, the
code is a bit more verbose — but now we can support cancellation.
COMMUNICATION: EVENT FANOUT (CLOSURE)
type Notifier struct {
	st chan state
}
CONDITION VARIABLES
We can allocate the channels upfront, or wait for the first waiter. Here, we allocate
eagerly.
https://play.golang.org/p/tWVvXOs87HX
COMMUNICATION: EVENT FANOUT
func (n *Notifier) AwaitChange(ctx context.Context, seq int64) (newSeq int64) {
	st := <-n.st
	n.st <- st
	if st.seq != seq {
		return st.seq
	}
	select {
	case <-ctx.Done():
		return seq
	case <-st.changed:
		return seq + 1
	}
}

func (n *Notifier) NotifyChange() {
	st := <-n.st
	close(st.changed)
	n.st <- state{
		st.seq + 1,
		make(chan struct{}),
	}
}
If we close a channel to broadcast, we can select on cancellation
but must convey any additional data out-of-band.
CONDITION VARIABLES
This approach doesn't need as many channel allocations if we have multiple waiters,
but requires a fresh channel for each event — and prevents us from sending any
additional data along with the signal.
CHANNEL SEMAPHORE: CACHING RESOURCES
Set up the semaphores:

sem := make(chan token, limit)
idle := make(chan Conn, limit)
CONDITION VARIABLES
CONDITION VARIABLES
CHANNEL SEMAPHORE: CACHING RESOURCES
Drain the pool:

for n := limit; n > 0; n-- {
	select {
	case conn := <-idle:
		conn.Close()
	case sem <- token{}:
	}
}
CONDITION VARIABLES
CHANNEL SEMAPHORE: CACHING RESOURCES
Prune idle resources:

empty := false
for !empty {
	select {
	case conn := <-idle:
		conn.Close()
		<-sem // Remove conn's token.
	default:
		empty = true
	}
}
CONDITION VARIABLES
CHANNEL SEMAPHORE: LIMITING RESOURCES
Use the pool:

var wg sync.WaitGroup
for _, task := range hugeSlice {
	wg.Add(1)
	go func(task Task) {
		prepare(task)
		conn := <-idle
		perform(task, conn)
		idle <- conn
		wg.Done()
	}(task)
}
wg.Wait()
If only part of the work requires the resource,
we can limit only that part (and use a WaitGroup for the rest).
CONDITION VARIABLES
CHANNEL SEMAPHORE: LIMITING RESOURCES
Create the tokens:

idle := make(chan Conn, limit)
for len(idle) < cap(idle) {
	idle <- newConn()
}
CONDITION VARIABLES
If our “tokens” also need to be closed when no longer in use, we may need to close
and drain the channel when we're done too.
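For example (a hedged sketch that mirrors the drain loop above, and assumes every Conn has been returned to the channel):

for n := cap(idle); n > 0; n-- {
	conn := <-idle
	conn.Close()
}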
WORKER POOL: TASK-BASED SYNCHRONIZATION
Start the workers:

work := make(chan Task)
done := make(map[Task]chan struct{})
for n := limit; n > 0; n-- {
	go func() {
		for task := range work {
			perform(task)
			close(done[task])
		}
	}()
}
WORKER POOLS
WORKER POOLS
SEMAPHORE CHANNEL: DYNAMIC TASK SPLITTING
sem := make(chan token, limit)
var children sync.WaitGroup
walk = func(n *Node) {
	[…]
	for _, child := range n.Children {
		child := child // capture the loop variable for the goroutine below
		select {
		case sem <- token{}:
			children.Add(1)
			go func() {
				walk(child)
				<-sem
				children.Done()
			}()
		default:
			walk(child)
		}
	}
}

sem <- token{}
walk(root)
<-sem
children.Wait()
WORKER POOLS
For example, we can try to acquire another token without blocking, and add
concurrency as we go:
This pattern would be much more complicated with a conventional worker pool:
because we can't do any of the work before we send the task, we need to add the
tasks to the WaitGroup before we try to send them on the channel.
WORKER POOLS
Because we can't check whether we have an available worker without sending the
work to it, we have to add every node to the WaitGroup.
A NEW API: WORKER.POOL?
package worker
WORKER POOLS
By now we've got a fair amount of boilerplate. It's mostly straightforward, but at least a
little bit subtle.
There are tons of these in godoc.org. Here's my version: it seems to cover the basics,
it supports Context cancellation, and the implementation — including TryAdd and
MustAdd variants — fits in a hundred lines of code.
Unfortunately, this API isn't very much simpler than the code it's replacing: the caller
still has to remember to call Finish, even if they wait for each individual task to
complete (for example, by reading from a channel of errors). Otherwise, they'll leak
worker goroutines.
But if they call Finish too soon, Add will fail.
https://play.golang.org/p/i81YDGrhK0C
A NEW API: WORKER.POOL?
Start the workers:

work := worker.NewPool(limit)
WORKER POOLS
It does clean up the code a bit, though. It's an improvement, but is it enough of one?
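For reference, one plausible core for such a package (a hedged sketch; the playground version differs and also provides the TryAdd and MustAdd variants):

type Pool struct {
	work chan func()
	wg   sync.WaitGroup
}

func NewPool(limit int) *Pool {
	p := &Pool{work: make(chan func())}
	for n := limit; n > 0; n-- {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for f := range p.work {
				f()
			}
		}()
	}
	return p
}

// Add blocks until a worker accepts f or ctx is done.
func (p *Pool) Add(ctx context.Context, f func()) error {
	select {
	case p.work <- f:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Finish stops the workers and waits for them to exit.
// (In this sketch, Add must not be called after Finish.)
func (p *Pool) Finish() {
	close(p.work)
	p.wg.Wait()
}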
A “NEW” API: SEMAPHORE?
package semaphore
WORKER POOLS
https://play.golang.org/p/OJZsRoaFLtN
A “NEW” API: SEMAPHORE?
Start the work:

work := semaphore.New(limit)
for _, task := range hugeSlice {
	task := task
	work.Add(ctx, func() {
		perform(task)
	})
}
WORKER POOLS
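And a hedged sketch of what such a package can look like inside (New and Add are assumed from the usage above; Wait matches the one-shot wait loop from earlier in the talk):

package semaphore

import "context"

type token struct{}

type Semaphore struct {
	sem chan token
}

func New(limit int) *Semaphore {
	return &Semaphore{sem: make(chan token, limit)}
}

// Add acquires a slot, or gives up when ctx is done, then runs f
// on its own goroutine, releasing the slot when f returns.
func (s *Semaphore) Add(ctx context.Context, f func()) error {
	select {
	case s.sem <- token{}:
	case <-ctx.Done():
		return ctx.Err()
	}
	go func() {
		defer func() { <-s.sem }()
		f()
	}()
	return nil
}

// Wait blocks until all added functions have returned.
// (One-shot: it fills the semaphore, so no further Adds can proceed.)
func (s *Semaphore) Wait() {
	for n := cap(s.sem); n > 0; n-- {
		s.sem <- token{}
	}
}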
A NEW API: ERRGROUP ADDITIONS?
package errgroup
WORKER POOLS
A NEW API: ERRGROUP ADDITIONS?
work, ctx := errgroup.WithContext(ctx)
work.SetLimit(limit)
for _, task := range hugeSlice {
	if ctx.Err() != nil {
		work.Go(ctx.Err)
		break
	}
	task := task
	work.Go(func() error {
		return perform(ctx, task)
	})
}
err := work.Wait()
WORKER POOLS