pg_crdt is an experimental extension adding support for conflict-free replicated data types (CRDTs) in Postgres.
CRDTs are decentralized data structures that can safely be replicated and synchronized across multiple computers/nodes. They are the enabling technology for collaborative editing applications like Notion.
Our goal was to evalaute if we could leverage a Postgres-backed CRDT and Supabase's existing realtime API for change-data-capture to enable development of collaborative apps on the Supabase platform.
The pg_crdt extension is a proof-of-concept that wraps rust's yrs and automerge libraries using the pgx framework to add a Postgres native CRDT, crdt.ydoc. The extension supports creating a new crdt.ydoc and merging crdt.ydocs. For a full list of available methods see API.
The experiment was sucessful in that it enabled a proof-of-concept CRDT-as-a-service. Through that experience we found that there are significant technical hurdles to using Postgres as a CRDT source-of-truth with updates broadcasted to collaborators using supabase/realtime.
For example:
-
Frequently updated CRDTs produce a lot of WAL and dead tuples
-
Large CRDT types in Postgres generate significant serialization/deserialization overhead on-update
-
supabase/realtimebroadcasts database changes from the Postgres write ahead log (WAL). The WAL includes a complete copy of the the underlying data so small updates cause the entire document to broadcast to all collaborators
While many of these challenges are solvable, CRDT support does not trivially drop-in to our stack using this approach. In the short term we're opening this repo to allow others to continue building on the experiment. We're also brainstorming new ways to bring these technologies together to make supabase the go-to choice for collaborative apps.
The database's internal representation of a CRDT Doc is the Doc's state vector encoded as an update. This format can be rehydrated to apply updates. It is also what new clients need first when they join the shared data structure.
- When clients join a Doc, initial state is queried from a table
- When updates from remote clients ocurr, realtime can broadcast the changes to subscribers
Create the extension
create extension pg_crdt;Define a table with a CRDT column
create table posts (
id serial primary key,
content crdt.ydoc default crdt.new_ydoc()
);Insert a row into the table
insert into posts (content)
values (crdt.new_ydoc())
returning id;Update the CRDT by merging an update, where change is a CRDT doc change
update posts
set content = crdt.merge(content, change)
where id = 1;There is a convenience || infix operator that is equivalent to crdt.merge. To use it, make sure your search path includes the crdt schema:
set search_path to public,crdt;
update posts
set content = content || crdt.new_ydoc()
where id = 1;Creates a new, empty, Yjs document (YDoc)
Merges two documents into one.
Synonymous to the || operator available in the crdt schema.
Applies an update to a document. YUpdate can be created by casting byte array (bytea) to crdt.yupdate
Synonymous to the || operator available in the crdt schema.
Creates a new, empty, Automerge document
Merges two documents into one.
Synonymous to the || operator available in the crdt schema.
Applies an update to a document. AutoChange can be created by casting byte array (bytea) to crdt.autochange
Synonymous to the || operator available in the crdt schema.
git clone https://github.com/supabase/pg_crdt.git
cd pg_crdt
docker-compose upThe Postgres database is available at postgresql://postgres:password@localhost:5582/sdb
To enable the extension in Postgres, execute the create extension statement
create extension pg_crdt;First, install pgx
Then clone the repo and install using
git clone https://github.com/supabase/pg_crdt.git
cd pg_crdt
cargo pgx run pg14
Which starts a psql prompt.
To enable the extension in Postgres, execute the create extension statement
create extension pg_crdt;