Scaling Memcache at Facebook - Slides

The document summarizes Facebook's approach to scaling their Memcache infrastructure to support over 1 billion requests per second. Key aspects include: 1) Using Memcache as a front end to databases to handle the heavy read load. 2) Partitioning data and servers into multiple Memcache clusters to improve read throughput and allow independent scaling. 3) Synchronizing data between Memcache clusters and databases by tailing database commit logs to invalidate cached entries. 4) Distributing Memcache clusters across data centers for availability and low latency access from different geographic regions.


Scaling Memcache

at Facebook

Presenter: Rajesh Nishtala ([email protected])


Co-authors: Hans Fugal, Steven Grimm, Marc
Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy,
Mike Paleczny, Daniel Peek, Paul Saab, David Stafford,
Tony Tung, Venkateshwaran Venkataramani
Infrastructure Requirements
for Facebook
1.  Near real-time communication
2.  Aggregate content on-the-fly from multiple sources
3.  Be able to access and update very popular shared content
4.  Scale to process millions of user requests per second
Design Requirements
Support a very heavy read load
•  Over 1 billion reads / second
•  Insulate backend services from high read rates
Geographically Distributed
Support a constantly evolving product
•  System must be flexible enough to support a variety of use cases
•  Support rapid deployment of new features
Persistence handled outside the system
•  Support mechanisms to refill after updates
memcached
•  Basic building block for a distributed key-value store for Facebook
•  Trillions of items
•  Billions of requests / second
•  Network attached in-memory hash table
•  Supports LRU based eviction
Roadmap

1.  Single front-end cluster
•  Read heavy workload
•  Wide fanout
•  Handling failures

2.  Multiple front-end clusters
•  Controlling data replication
•  Data consistency

3.  Multiple Regions
•  Data consistency

[Diagram: two geo regions, each a front-end cluster (web servers + memcache) over a storage cluster; storage replication flows from the master storage cluster to the replica]
Pre-memcache
Just a few databases are enough to support the load
•  Data sharded across the databases

[Diagram: web servers querying a set of sharded databases directly]
Why Separate Cache?
High fanout and multiple rounds of data fetching

[Diagram: data dependency DAG for a small request]
Scaling memcache in 4 easy steps
10s of servers & millions of operations per second

0  No memcache servers
1  A few memcache servers
2  Many memcache servers in one cluster
3  Many memcache servers in multiple clusters
4  Geographically distributed clusters


Need more read capacity

•  Two orders of magnitude more reads than writes
•  Solution: Deploy a few memcache hosts to handle the read capacity
•  How do we store data?
•  Demand-filled look-aside cache
•  Common case is data is available in the cache

[Diagram: web server 1. Get(key) to memcache, 2. Miss(key), 3. DB lookup, 4. Set(key) back into memcache]
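The four-step look-aside read path can be sketched in a few lines of Python. This is an illustrative sketch only: `Cache` and `Database` are toy in-memory stand-ins, not Facebook's actual client or storage tiers.

```python
class Cache:
    """Toy in-memory stand-in for a memcache client (illustrative only)."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def set(self, key, value):
        self.data[key] = value

class Database:
    """Toy stand-in for the sharded database tier."""
    def __init__(self, rows):
        self.rows = rows
    def lookup(self, key):
        return self.rows[key]

def lookaside_get(mc, db, key):
    """Demand-filled look-aside cache: check memcache first,
    fall back to the database on a miss, then fill the cache."""
    value = mc.get(key)          # 1. Get(key)
    if value is None:            # 2. Miss(key)
        value = db.lookup(key)   # 3. DB lookup
        mc.set(key, value)       # 4. Set(key) so later reads hit
    return value
```

The first read of a key pays the database round trip; every later read is served from the cache until the entry is evicted or invalidated.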
Handling updates

•  Memcache needs to be invalidated after DB write
•  Prefer deletes to sets
    •  Idempotent update
    •  Demand filled
•  Up to web application to specify which keys to invalidate after database update

[Diagram: web server 1. Database update, 2. Delete from memcache]
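The matching write path is a sketch of the "prefer deletes to sets" rule; `Cache` and `Database` are hypothetical in-memory stand-ins, not real client code.

```python
class Cache:
    """Toy in-memory stand-in for a memcache client (illustrative only)."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value
    def delete(self, key):
        self.data.pop(key, None)   # deleting an absent key is a no-op

class Database:
    """Toy stand-in for the sharded database tier."""
    def __init__(self):
        self.rows = {}
    def update(self, key, value):
        self.rows[key] = value

def update_and_invalidate(db, mc, key, value):
    """Update the database first, then delete (not set) the cached
    entry: deletes are idempotent, and the look-aside read path
    refills the cache on demand."""
    db.update(key, value)   # 1. Database update
    mc.delete(key)          # 2. Delete from memcache
```

Because deletes are idempotent, replaying an invalidation is always safe, which matters later when invalidations are driven from the commit log.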
Problems with look-aside caching
Stale Sets

•  Extend memcache protocol with leases
•  Return and attach a lease-id with every miss
•  Lease-id is invalidated inside server on a delete
•  Disallow set if the lease-id is invalid at the server

[Diagram: web server A does 1. Read(A) and misses; the database row is 2. Updated to (B); web server B does 3. Read(B) and 4. Set(B); web server A's delayed 5. Set(A) then overwrites the fresh value, leaving MC & DB inconsistent]
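The lease mechanism can be sketched as server-side state: every miss hands out a lease-id, a delete invalidates it, and a set is rejected unless its lease is still valid. This is a sketch of the idea, not the real memcached protocol extension.

```python
import itertools

class LeasingCache:
    """Toy memcache server with leases (illustrative sketch only)."""
    def __init__(self):
        self.data = {}
        self.leases = {}                   # key -> currently valid lease-id
        self._ids = itertools.count(1)

    def get(self, key):
        """Return (value, None) on a hit, (None, lease_id) on a miss."""
        if key in self.data:
            return self.data[key], None
        lease = next(self._ids)            # attach a lease-id to every miss
        self.leases[key] = lease
        return None, lease

    def delete(self, key):
        self.data.pop(key, None)
        self.leases.pop(key, None)         # delete invalidates outstanding lease

    def set(self, key, value, lease):
        if self.leases.get(key) != lease:  # disallow set on a stale lease
            return False
        del self.leases[key]
        self.data[key] = value
        return True
```

Replaying the race from the slide: server A's delayed set carries a lease-id that was invalidated by the delete, so the stale write is refused.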
Problems with look-aside caching
Thundering Herds

•  Memcache server arbitrates access to database
•  Small extension to leases
•  Clients given a choice of using a slightly stale value or waiting

[Diagram: many web servers request the same missing key at once; the memcache server lets only the lease holder go to the database]
Scaling memcache in 4 easy steps
100s of servers & 10s of millions of operations per second

0  No memcache servers
1  A few memcache servers
2  Many memcache servers in one cluster
3  Many memcache servers in multiple clusters
4  Geographically distributed clusters
Need even more read capacity

•  Items are distributed across memcache servers by using consistent hashing on the key
•  Individual items are rarely accessed very frequently, so over-replication doesn't make sense
•  All web servers talk to all memcache servers
•  Accessing 100s of memcache servers to process a user request is common

[Diagram: every web server connected to every memcache server in the cluster]
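Consistent hashing is what lets a cluster grow without remapping most keys. A minimal sketch, assuming MD5 as the hash function and virtual nodes for load smoothing (real memcache clients differ in both details):

```python
import bisect
import hashlib

def _hash(s):
    """Stable 64-bit hash (illustrative choice, not what memcache uses)."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Toy consistent-hash ring mapping keys to memcache servers.
    Adding or removing a server only remaps a small fraction of keys."""
    def __init__(self, servers, replicas=100):
        self.ring = sorted(
            (_hash(f"{srv}#{i}"), srv)
            for srv in servers
            for i in range(replicas)       # virtual nodes smooth the load
        )
        self.points = [h for h, _ in self.ring]

    def server_for(self, key):
        """Walk clockwise from the key's hash to the next server point."""
        idx = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]
```

Compared with `hash(key) % num_servers`, growing the ring from N to N+1 servers moves only about 1/(N+1) of the keys instead of nearly all of them.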
Incast congestion

•  Many simultaneous responses overwhelm shared networking resources
•  Solution: Limit the number of outstanding requests with a sliding window
•  Larger windows result in more congestion
•  Smaller windows result in more round trips to the network

[Diagram: a web server issues Get key1 … Get keyN to many memcache servers at once; the 5-10kB value responses all arrive together and packets are dropped]
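The window idea can be sketched with a semaphore that bounds requests in flight; `fetch` is a hypothetical stand-in for a real memcache get, and the threading model here is illustrative rather than how the actual client is built.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def fetch_with_window(keys, fetch, window=8):
    """Fetch many keys while keeping at most `window` requests in
    flight, so responses don't all arrive at once and overwhelm the
    shared link (a sketch of sliding-window flow control)."""
    gate = threading.Semaphore(window)   # counts outstanding requests

    def one(key):
        with gate:                       # blocks while the window is full
            return key, fetch(key)

    with ThreadPoolExecutor(max_workers=window * 2) as pool:
        return dict(pool.map(one, keys))
```

Tuning `window` is exactly the trade-off on the slide: a larger window packs more responses into the same instant (congestion), a smaller one serializes the fetches into more round trips.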
Scaling memcache in 4 easy steps
1000s of servers & 100s of millions of operations per second

0  No memcache servers
1  A few memcache servers
2  Many memcache servers in one cluster
3  Many memcache servers in multiple clusters
4  Geographically distributed clusters

Multiple clusters

•  All-to-all limits horizontal scaling
•  Multiple memcache clusters front one DB installation
•  Have to keep the caches consistent
•  Have to manage over-replication of data

[Diagram: two front-end clusters (web servers + memcache) sharing one storage cluster (master)]
Databases invalidate caches

•  Cached data must be invalidated after database updates
•  Solution: Tail the MySQL commit log and issue deletes based on transactions that have been committed
•  Allows caches to be resynchronized in the event of a problem

[Diagram: McSqueal on each MySQL storage server tails the commit log and sends deletes to the memcache servers in every front-end cluster]
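The commit-log tailing idea reduces to: for each committed transaction, extract the keys it invalidates and issue deletes. A minimal sketch; the `invalidate:` log annotation and `extract_keys` rule are hypothetical, not MySQL's binlog format or McSqueal's actual parser.

```python
import re

# Hypothetical log convention: committed entries name the cache keys
# they invalidate, e.g. "txn42 invalidate:user:1 invalidate:user:2".
DELETE_TAG = re.compile(r"invalidate:(\S+)")

def extract_keys(txn_entry):
    """Pull the memcache keys a committed transaction invalidates."""
    return DELETE_TAG.findall(txn_entry)

def tail_commit_log(entries, mc_delete):
    """Replay committed entries, issuing a delete per affected key.
    Re-running the tail after a failure is safe because deletes
    are idempotent."""
    for entry in entries:
        for key in extract_keys(entry):
            mc_delete(key)
```

Driving invalidations from the committed log, rather than from web servers, is also what allows a cluster's caches to be resynchronized after a problem: replay the log from a known-good point.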
Invalidation pipeline
Too many packets

•  Aggregating deletes reduces packet rate by 18x
•  Makes configuration management easier
•  Each stage buffers deletes in case downstream component is down

[Diagram: McSqueal on each DB sends batched deletes to memcache routers, which fan them out to the memcache servers in each front-end cluster]
Scaling memcache in 4 easy steps
1000s of servers & > 1 billion operations per second

0 No memcache servers

1 A few memcache servers

2 Many memcache servers in one cluster

3 Many memcache servers in multiple clusters

4 Geographically distributed clusters


Geographically distributed clusters

Writes in non-master region
Database update directly in master
•  Race between DB replication and subsequent DB read

[Diagram: 1. A web server in the replica region writes to the master DB; 2. Delete from memcache; 3. A later read misses and goes to the replica DB before MySQL replication has caught up; 4. A potentially stale value is set into memcache. Race!]
Remote markers
Set a special flag that indicates whether a race is likely

Read miss path:
    if marker set
        read from master DB
    else
        read from replica DB

[Diagram: 1. Set remote marker; 2. Write to master; 3. Delete from memcache; 4. MySQL replication to the replica; 5. Delete remote marker]
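The marker protocol can be sketched directly from the numbered steps. Plain dicts stand in for memcache and the master/replica databases, and the `marker:` key prefix is a hypothetical convention:

```python
def write_in_non_master(mc, master_db, key, value):
    """Write path from a non-master region (sketch)."""
    mc["marker:" + key] = 1          # 1. Set remote marker
    master_db[key] = value           # 2. Write to master
    mc.pop(key, None)                # 3. Delete from memcache
    # 4. MySQL replication later copies the row to the replica,
    # 5. after which the remote marker is deleted.

def read_miss(mc, master_db, replica_db, key):
    """Read-miss path: if the marker is set, replication may not have
    caught up, so read from the master instead of the local replica."""
    if mc.get("marker:" + key):
        return master_db[key]
    return replica_db[key]
```

The marker trades a cross-region read (expensive but correct) for the risk of caching a stale value, and only for keys that were recently written from this region.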
Putting it all together

1.  Single front-end cluster
•  Read heavy workload
•  Wide fanout
•  Handling failures

2.  Multiple front-end clusters
•  Controlling data replication
•  Data consistency

3.  Multiple Regions
•  Data consistency

[Diagram: two geo regions, each a front-end cluster (web servers + memcache) over a storage cluster; storage replication flows from the master storage cluster to the replica]
Lessons Learned
•  Push complexity into the client whenever possible
•  Operational efficiency is as important as performance
•  Separating cache and persistent store allows them to be scaled independently
Thanks! Questions?
http://www.facebook.com/careers
