PostgreSQL Distributed Architectures and Best Practices
Many different distributed PostgreSQL architectures exist, each with different trade-offs.
Single machine PostgreSQL?
No network latency
Millions of IOPS
Microsecond disk latency
Low cost / fast hardware
Can co-locate application server
PostgreSQL on a single machine comes with operational hazards
The cloud enables flexible distributed setups, with resources shared between customers for high efficiency and resiliency.
Goals of distributed database architecture
Goal: Offer the same functionality and transactional semantics as a single-node RDBMS, with superior properties.
[Diagram: multiple application servers connecting to a single PostgreSQL server]
The number of client connections is limited by the application architecture; the number of PostgreSQL processes is limited by memory and contention.
Network-attached block storage
[Diagram: PostgreSQL running in a VM on a multi-tenant hypervisor, connected over the network to network-attached block storage within a single AZ/DC]
Pros:
Higher durability (replication)
Higher uptime (replace VM, reattach)
Fast backups and replica creation (snapshots)
Disk is resizable
Cons:
Higher disk latency (~20μs -> ~1000μs)
Lower IOPS (~1M -> ~10k IOPS)
Crash recovery on restart takes time
Cost can be high
General guideline: Always use; durability & availability are more important than performance.
Read replicas
Readable replicas can help you scale read throughput, reduce latency through cross-region replication, and improve availability through auto-failover.
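For example, a physical read replica can be created from a base backup of the primary (host, user, and data directory below are placeholders; a sketch, not a full HA setup):

# Create a streaming replica from a base backup of the primary.
# -R writes standby.signal and primary_conninfo so the server starts as a standby;
# -X stream streams WAL while the backup is taken.
pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/data -R -X stream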
Scaling read throughput
Readable replicas can help you scale read throughput (when reads are CPU or I/O
bottlenecked) by load balancing queries across replicas.
[Diagram: clients connect through a load-balancing layer (several options exist) to the primary and read replicas; scale out by adding more replicas]
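As one of those options, libpq's multi-host connection strings can prefer standbys and spread sessions across them (target_session_attrs=prefer-standby requires PostgreSQL 14+, load_balance_hosts=random requires 16+; host and database names are illustrative):

psql "postgresql://replica1,replica2,primary/app?target_session_attrs=prefer-standby&load_balance_hosts=random"

A dedicated proxy such as PgBouncer or HAProxy is another common way to distribute read queries.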
Eventual read-your-writes consistency
Read replicas can lag behind the primary, so you cannot always read your own writes.
[Diagram: a client's reads are load-balanced between the primary (lsn=9) and Replica B (lsn=7); a read routed to Replica B misses the latest write]
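One mitigation (a sketch using built-in functions, available since PostgreSQL 10): have the client remember the primary's WAL position after a write and only read from a replica that has replayed past it.

-- On the primary, right after the write:
SELECT pg_current_wal_lsn();
-- On a candidate replica, before serving the read:
SELECT pg_last_wal_replay_lsn();
-- Serve the read from this replica only if its replay LSN >= the LSN captured
-- on the primary; otherwise wait, try another replica, or fall back to the primary.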
No monotonic read consistency
Load-balancing across read replicas will cause you to go back-and-forth in time.
[Diagram: after an INSERT, successive SELECT count(*) queries from the same client are load-balanced across Replica A (lsn=9) and Replica B (lsn=7), so a later read can observe an earlier state]
Poor cache usage
If all replicas are treated the same, they all end up with the same data in cache.
[Diagram: queries for id=1 and id=2 are load-balanced across Replica A and Replica B, so both replicas end up caching the same rows (id=1, id=2, …)]
If working set >> memory, all replicas get bottlenecked on disk I/O.
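A rough way to spot this in practice, using the built-in statistics views (the threshold for "low" is workload-dependent):

-- Fraction of reads served from PostgreSQL's buffer cache; a low ratio suggests
-- the working set no longer fits in memory and reads are hitting disk.
SELECT datname,
       round(blks_hit::numeric / NULLIF(blks_hit + blks_read, 0), 3) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();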
Read scaling trade-offs
Pros:
Read throughput scales linearly
Low latency stale reads if read replica is closer than primary
Lower load on primary
Sharding
[Diagram: users (u1–u6) and items (i1–i6) tables distributed across shards by the shard key (user_id)]
Tables can be co-located by the shard key to enable local joins, foreign keys, etc.
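For example, with the Citus extension (one possible sharding implementation; table and column names are illustrative):

SELECT create_distributed_table('users', 'user_id');
SELECT create_distributed_table('items', 'user_id', colocate_with => 'users');
-- users and items rows with the same user_id land on the same shard,
-- so joins and foreign keys on user_id stay local to a shard.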
Single-shard queries for operational workloads
Scale capacity to handle a high rate of single-shard-key queries:
insert into items (user_id, …) values (123, …);
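The router typically hashes the shard key to pick the target shard. With a Citus-style setup (illustrative), the shard a given key maps to can be inspected:

SELECT get_shard_id_for_distribution_column('items', 123);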
Compute-heavy queries
Compute-heavy queries (shard key joins, json, vector, …) get the most relative benefit
select compute_stuff(…) from users join items using (user_id) where user_id = 123 …
Multi-shard queries for analytical workloads
Parallel multi-shard queries can quickly answer analytical queries across shard keys:
select country, count(*) from items, users where … group by 1 order by 2 desc limit 10;
Multi-shard queries for operational workloads
Multi-shard queries add significant overhead for simple non-shard-key queries:
select * from items where item_id = 87;
Multi-shard queries for analytical workloads
Snapshot isolation is a challenge (involves trade-offs):
select country, count(*) from items, users where … group by 1 order by 2 desc limit 10;
↔ BEGIN;
← INSERT INTO items VALUES (123, …);
→ INSERT INTO items VALUES (456, …);
↔ COMMIT;
A concurrent multi-shard query can observe the commit on one shard but not yet on the other, unless the system implements a distributed snapshot.
Sharding trade-offs
Pros:
Scale throughput for reads & writes (CPU & IOPS)
Scale memory for large working sets
Parallelize analytical queries, batch operations
Cons:
High read and write latency
Data model decisions have high impact on performance
Snapshot isolation concessions
General guideline: Use for multi-tenant apps; otherwise use for a large working set (>100GB) or compute-heavy queries.
Active-active
Like BDR, pgactive, pgEdge, …
Active-active / n-way replication
Accept writes from any node, use logical replication to asynchronously exchange and
consolidate writes.
[Diagram: three PostgreSQL nodes, each acting as a primary for reads and writes, exchanging changes via asynchronous replication; two nodes concurrently execute UPDATE counters SET val = val + 1]
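A minimal sketch of the idea using native logical replication (PostgreSQL 16+, which supports origin = none so a node does not re-publish changes it received from its peer); node and object names are illustrative, and products like BDR, pgactive, or pgEdge add conflict handling on top:

-- On node A:
CREATE PUBLICATION active_pub FOR ALL TABLES;
CREATE SUBSCRIPTION sub_from_b
  CONNECTION 'host=node-b dbname=app user=repl'
  PUBLICATION active_pub
  WITH (copy_data = false, origin = none);
-- On node B: create the same publication and a subscription pointing at node A.
-- Concurrent updates to the same row (like the counter update above) still
-- conflict and must be resolved by the application or the replication system.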
Active-active / n-way replication
All nodes can survive network partitions by accepting writes locally, but there is no single linear history (CAP).
Active-active trade-offs
Pros:
Very high read and write availability
Low read and write latency
Read throughput scales linearly
Cons:
Many internal operations incur high latency
No local joins in current implementations
Less mature and optimized than PostgreSQL
General guideline: Just use PostgreSQL ;) but for simple apps, the availability benefits can be useful.
Conclusion
PostgreSQL can be distributed at different layers.