
Unicity Infrastructure: the Aggregation Layer


Risto Laanoja
Unicity Labs

June 28, 2025

Abstract

Unicity is a novel blockchain protocol with the ambitious goal of enabling token transactions to occur off-chain and, when necessary, offline. This premise requires supporting infrastructure to guarantee that there are no parallel states of assets, or more specifically, that there is no double-spending; a property we term unicity. It turns out that the lack of globally shared state and ordering reduces the blockchain overhead considerably. In designing this infrastructure, no compromises were made regarding its trust assumptions. This paper details the design of the Aggregation Layer, the component responsible for producing Proofs of Inclusion and Non-inclusion to the users. We analyze its design for efficiency and evaluate the robustness of its trust and security model, and the gains offered by cryptographic zero-knowledge tools.

1 Motivation

The foundational principle of the Unicity Network [8] is to minimize the volume of on-chain data. This is based on the observation that shared ("on-chain") state is unavoidable only to prevent double-spending.¹ The core tenets of Unicity also include minimizing trust requirements, enhancing user privacy, and providing linear scalability.

¹ Assuming no centrally controlled, non-transparent technologies such as trusted hardware wallets or Trusted Execution Environments (TEEs); and that anyone can be a recipient.

In a hierarchical trustless system, the principle is that the base layer (e.g., an L1 blockchain) provides decentralization, while the layers below it (e.g., rollups) present cryptographic proofs of the correctness of their operation. In scaling Unicity, we have designed efficient data structures to prove the correctness of the Aggregation Layer's operation to the Consensus Layer. Based on cryptographic hashes alone, the consistency proof grows linearly with respect to the number of user transactions. This imposes a hard limit of approximately 10 000 transactions per second (tx/s), beyond which the networking bandwidth of the Consensus Layer becomes the bottleneck.

To scale further, we must use cryptographic zero-knowledge proofs (ZKPs) to compress the size of the consistency proofs. As an application of ZKPs, this use case is fundamentally more efficient than using ZKPs to process the transaction data itself, as is done in many privacy coins and ZK-rollups.

In this paper, we show how to scale the Aggregation Layer to 10 000 tx/s per shard. This figure represents the proving throughput achievable on a single consumer-class computer. Due to the small proof size and efficient verification, the Consensus Layer can support a practically unlimited number of such trustless shards. Table 1 compares different ZKP technologies. We have subjectively picked the most appropriate ZK schemes and supporting front-ends ("stacks"). The estimated implementation effort reflects the perceived maturity and learning curve, and the difficulty of producing safe implementations.
Table 1: Comparison of zero-knowledge proof technologies for compression of non-deletion proofs.

ZK Stack            | Hash Function | Proving Speed (tx/s) | Proof Size | Proof Size Asymptotics | Trusted Setup | Impl. Effort
None ("hash based") | SHA-256       | 10 000*              | 10 MB      | O(n)                   | No            | N/A
CIRCOM + Groth16    | Poseidon      | 25                   | 250 b      | O(1)                   | Yes           | Lower
Gnark + Groth16     | Poseidon      | 30                   | 250 b      | O(1)                   | Yes           | Low
SP1 zkVM            | SHA-256       | 1.5                  | 2 MB       | O(log n)               | No            | Lowest
Cairo 0 + STwo      | Poseidon      | 60†                  | 2.4 MB     | O(log n)               | No            | Medium
Cairo + STwo        | Poseidon      | 100                  | 2.4 MB     | O(log n)               | No            | Medium
AIR + Plonky3‡      | Poseidon2     | 10 000               | 1.7 MB     | O(log n)               | No            | High
AIR + Plonky3       | Poseidon2     | 2500                 | 0.7 MB     | O(log n)               | No            | High
AIR + Plonky3       | Blake3        | 250                  | 1.7 MB     | O(log n)               | No            | High

* Bandwidth-limited, no verification effort reduction.
† Trace generation before proving is impractically slow.
‡ See Section 7.5 for details.

2 System Architecture

To prevent double-spending of tokens, the Unicity Infrastructure permanently² records a unique identifier for every spent token state. This identifier is the cryptographic hash of the token state data. If a user attempts to double-spend a token, the resulting identifier will be identical to the one already recorded, making it impossible to obtain a new Proof of Unicity. A transaction is considered invalid unless it is accompanied by a valid Proof of Unicity.

² Permanent from the perspective of a token, meaning for a duration exceeding the token's lifetime.

The rest of the processing—executing transactions, running smart contracts, etc.—can happen at the client layer, executed by users or "agents". Agents are themselves the interested parties in data availability and transaction validation, and they choose the ordering of incoming messages for processing. Thus, the Unicity Infrastructure is relieved of these duties, removing a major scaling bottleneck of traditional L1 blockchains.

The Unicity Infrastructure operates in a trust-minimized way by utilizing distributed authenticated data structures and cryptographic zero-knowledge tools (SNARKs) for extra succinctness of messages and tokens. The Proof of Unicity is a fresh proof of inclusion of the token state being spent. This can be efficiently generated based on a Merkle tree data structure. The proof size is logarithmic with respect to the tree's capacity, making it highly efficient. If the root of the tree is securely fixed, the integrity of the rest of the tree can be verified trustlessly: it is computationally infeasible to generate a valid inclusion proof for an element not present in the tree without changing the root or breaking the underlying cryptographic assumptions. The infrastructure also supports non-inclusion proofs, making it possible to prove to other parties that a particular token state has not yet been spent. The Unicity Infrastructure can thus be conceptualized as a large-scale, distributed Sparse Merkle Tree (SMT). Specifically, the tree is implemented as an indexed variant with some optimizations. In this paper, without loss of generality, we model the distributed tree as an SMT.

Furthermore, an SMT is straightforward to shard: the tree is partitionable vertically into slices. Leaves remain at their deterministically computed positions, as an SMT is an indexed data structure. Each leaf's identifier encodes its address in the tree, and the leaf's shard address is a prefix of the identifier.
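To make the recording rule concrete, the following Rust sketch models the behaviour described above: the identifier of a spent token state is its hash, a state can be recorded only once, and the shard is chosen by a prefix of the identifier. The type names, the use of SHA-256, and the in-memory map are illustrative assumptions, not the production data structure (which is a distributed SMT).

```rust
use std::collections::HashMap;
use sha2::{Digest, Sha256};

/// Identifier of a spent token state: the hash of the token state data.
type StateId = [u8; 32];

/// Hypothetical in-memory stand-in for one shard of the append-only store;
/// the real structure is a slice of a distributed Sparse Merkle Tree.
struct Shard {
    spent: HashMap<StateId, Vec<u8>>, // identifier -> recorded metadata
}

impl Shard {
    fn new() -> Self {
        Shard { spent: HashMap::new() }
    }

    /// Record a spent token state. Fails if the identifier already exists,
    /// which is exactly the condition that blocks a double-spend: a second
    /// spend hashes to the same identifier and cannot obtain a new Proof of Unicity.
    fn record(&mut self, token_state: &[u8], metadata: Vec<u8>) -> Result<StateId, StateId> {
        let id: StateId = Sha256::digest(token_state).into();
        if self.spent.contains_key(&id) {
            return Err(id);
        }
        self.spent.insert(id, metadata);
        Ok(id)
    }
}

/// Shard selection by identifier prefix: with 2^prefix_bits shards, the shard
/// address is the top prefix_bits bits of the identifier (prefix_bits <= 8 here).
fn shard_index(id: &StateId, prefix_bits: u32) -> usize {
    assert!(prefix_bits >= 1 && prefix_bits <= 8);
    (id[0] >> (8 - prefix_bits)) as usize
}
```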
The Aggregation Layer connects to the Consensus Layer. For fully trustless operation, each request is accompanied by a cryptographic proof of SMT consistency.

Figure 1: Layered architecture of the Unicity Network (Consensus Layer, Aggregation Layer, Execution Layer).

2.1 Consensus Layer

Decentralization is achieved by a Proof-of-Work (PoW) blockchain instance which manages consensus, including the validator selection for the BFT finality gadget, implementing the native token, executing the tokenomics plan, and handling the validator incentives. PoW is specifically robust during the bootstrapping of a decentralized system: when the number of validators fluctuates, the financial value of tokens is low, and token distribution is relatively concentrated. PoW shows great liveness properties. At the same time, PoW chains do not provide fast and deterministic finality: many blocks of confirmations are needed to achieve a reasonable level of certainty. In Unicity, this is mitigated by including a BFT "finality gadget" which runs rather fast, and the finality of transactions below is defined by the consensus of the BFT cluster.

The PoW layer provides permissionlessness, a core property of decentralized blockchains. Any validator can actively participate in mining, and blocks are chosen based on the longest-chain rule. By selecting a PoW mining puzzle that is resistant to acceleration by GPUs and ASICs (specifically: RandomX [7]), we aim to further democratize participation in the network.

PoW chains encounter rollbacks ("reorgs") when alternative chains with greater cumulative PoW work emerge. Limiting the maximum length of alternative chains creates the risk of involuntary forking—both alternative chains may be too long for a rollback. This risk is specifically mitigated by a finality gadget. On the other hand, PoW chains are extremely robust. If any number of validators leave or join the network, the chain continues to grow, and the block rate eventually adjusts to the new total mining power. In short, PoW trades safety for liveness.

The purpose of the BFT consensus layer is twofold: 1) to provide deterministic (one-block) finality for the layers below, and 2) to achieve a fast and predictable block rate. BFT consensus trades liveness for safety: it is more fragile, as its liveness depends on a supermajority (e.g., two thirds) of validators being online and cooperative at any moment.

The usual way to achieve permissionless BFT consensus is to use a Proof-of-Stake (PoS) setup. This can be delicate, especially during the launch of a blockchain protocol: there are known weaknesses like the "nothing at stake" attack, and a risk of centralization. PoW-based protocols (and longest-chain-rule protocols in general) are more robust and well-suited for achieving a wide initial token distribution and establishing token value for effective decentralization.

By combining a PoW chain with a BFT consensus layer, Unicity leverages the desirable properties of both mechanisms. The PoW chain provides decentralization, robustness, and high security for the base currency, while the BFT layer provides fast, deterministic finality for the Aggregation Layer.

In Unicity, the BFT layer operates at a much higher block rate than the PoW chain. Validators for the BFT Consensus Layer are selected infrequently from a pool of recent, high-performing PoW miners, based on a deterministic algorithm and PoW chain content; anyone can execute the algorithm to verify the selection. PoW validators may also delegate their BFT layer validation rights.

Consensus Layer validators receive their block rewards at the ends of epochs. It is possible to increase economic security by implementing slashing based on
withheld PoW and Consensus Layer block rewards.

2.1.1 Consensus Roadmap

The introduction of economic security mechanisms is a logical step toward evolving the Consensus Layer into a full Proof-of-Stake (PoS) system, once the chain is stable and the token distribution is reasonably diversified. A PoS system would provide stronger economic security for the BFT nodes while being more energy-efficient and environmentally responsible than PoW mining.

The switch to PoS includes the following steps: 1) introducing the staking mechanism to create economic security for the BFT layer, 2) setting up an alternative ledger for the native token, securing and decentralizing the system and executing the tokenomics plan there, 3) selecting BFT validators based on stake, 4) adjusting incentives (block rewards, optional slashing), 5) migrating the token balances, and 6) sunsetting the PoW chain.

2.2 Aggregation Layer

The Aggregation Layer implements a global, append-only key-value store that immutably records every spent token state. More specifically, it provides the following services: 1) recording of key-value tuples, where the key identifies a token state and the value records some metadata, 2) returning inclusion proofs for keys, 3) returning non-inclusion proofs for keys not present in the store.

The Aggregation Layer periodically has its state authenticator certified by the Consensus Layer.

The Aggregation Layer is sharded based on keyspace slices and can be made hierarchical, as shown in Figure 2.

Proof of non-deletion: Once a key is set, it has to remain there forever. Every state change of the Aggregation Layer (or a slice thereof) is accompanied by a cryptographic proof establishing that pre-existing keys have not been removed and their values have not been altered; only new keys were added. The size of this proof is logarithmic with respect to the tree's capacity and linear with respect to the size of the inclusion batch. This can be reduced to a constant size using a SNARK. Assuming correct validation of the non-deletion proof and chaining of the Aggregation Layer's state roots by the Consensus Layer, the Aggregation Layer can be considered trustless.

2.3 Execution Layer

The Execution Layer, also known as the Agent Layer, is responsible for executing transactions and other business logic, using the services of the Aggregation Layer and Unicity in general.

3 Security Model of the Aggregation Layer

The Aggregation Layer implements a distributed, authenticated, append-only dictionary data structure. It authenticates incoming state transfer certification requests by verifying that the sender possesses the private key corresponding to the public key that identifies the current token owner. The specific authentication protocol is beyond the scope of this paper.

Definition 1 (Consistency) An append-only accumulator operates in batches B = (k1, k2, ..., kj), accepting new keys. The append-only accumulator is consistent if 1) during the insertion of a batch of updates, no existing element was deleted or modified; 2) it is possible to generate inclusion proofs π^inc_{k ∈ {B1,...,Bi}} = (vk ⇝ r, c) for all previously inserted elements, but not for non-existent elements; 3) it is possible to generate non-inclusion proofs π^inc_{k ∉ {B1,...,Bi}} = (∅k ⇝ r, c) for all elements not so far inserted into the accumulator, and not for those already inserted.

When instantiated as a Sparse Merkle Tree (SMT), vk ⇝ r is the hash chain from the value at the k-th position to the root r, and ∅k ⇝ r denotes the hash chain from the "empty" value at the k-th position to the root r.

After each batch of additions, the new root of the Aggregation Layer's SMT is certified by the BFT finality gadget, ensuring its uniqueness and immutability. This provides a secure trust anchor for all consistency, inclusion, and non-inclusion proofs.
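Definition 1 can be read as an interface contract. The following Rust sketch expresses it as a trait; the names (AppendOnlyAccumulator, InclusionProof, NonInclusionProof) and the exact signatures are illustrative assumptions rather than the implemented API.

```rust
/// Illustrative types; in the SMT instantiation a proof is a hash chain to the
/// root r, returned together with the Consensus Layer certificate c.
pub struct InclusionProof {/* hash chain v_k ⇝ r plus certificate c */}
pub struct NonInclusionProof {/* hash chain ∅_k ⇝ r plus certificate c */}
pub struct ConsistencyProof {/* non-deletion proof π_i for one batch */}

pub type Key = [u8; 32];
pub type Value = Vec<u8>;
pub type Root = [u8; 32];

/// Interface implied by Definition 1 (sketch).
pub trait AppendOnlyAccumulator {
    /// Insert a batch B_i of new keys; returns (r_{i-1}, r_i) and the
    /// consistency (non-deletion) proof for the round.
    fn insert_batch(&mut self, batch: &[(Key, Value)]) -> (Root, Root, ConsistencyProof);

    /// Must succeed for every previously inserted key, and only for those.
    fn prove_inclusion(&self, key: &Key) -> Option<InclusionProof>;

    /// Must succeed for every key not yet inserted, and only for those.
    fn prove_non_inclusion(&self, key: &Key) -> Option<NonInclusionProof>;
}
```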
Figure 2: Sharded architecture of the Aggregation Layer.

The idealized Consensus Layer is modeled as Algorithm 1. For efficiency reasons, client requests are processed in batches; the tree is re-calculated and the tree root is certified when a batch is closed. A batch of client requests is denoted as Bi. At the end of each batch, the Aggregation Layer produces its summary root hash ri and sends it to the Consensus Layer for certification. A certification request (ri, ri−1, π) includes: 1) the previous state root hash, 2) the new state root hash, 3) a consistency proof of the changes made during the batch, and 4) an authenticator that identifies the operator.

Figure 3: Security model of the Aggregation Layer. The Aggregation Layer's SMT accepts insertion batches B = (k1, k2, ..., kj) from token users and serves them inclusion proofs π^inc_{k ∈ {B1,...,Bi}} = (vk ⇝ r, c) and non-inclusion proofs π^inc_{k ∉ {B1,...,Bi}} = (∅k ⇝ r, c); it sends certification requests (ri, ri−1, π) to the Consensus Layer and receives certificates c = (i, ri, ri−1; scl).

The Consensus Layer certifies the request only if it uniquely extends a previously certified state root and the consistency proof is valid. It returns a certificate c = (i, ri, ri−1; scl), where scl is a signature from the Consensus Layer (e.g., a threshold signature from the consensus nodes or a proof of inclusion in a finalized block).

Each state can be extended only once, which prevents forks within the Aggregation Layer. Each subsequent round extends the most recently certified state. We model the Consensus Layer as an oracle, as shown in Algorithm 1.

The SMT provides users with inclusion and non-inclusion proofs. Each proof is anchored to a state root certified by the Consensus Layer.

The Consensus Layer must guarantee data availability. If recent state roots were lost, it would become impossible to reject duplicate state transition requests, potentially allowing malicious actors to double-spend against an old, un-extendable state.
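Algorithm 1 (below) formalizes this oracle. As a complement, here is a minimal Rust sketch of the same bookkeeping; the types (CertRequest, Certificate), the opaque proof bytes, and the placeholder consistency check and signing function are assumptions of the sketch, not the actual Consensus Layer interface.

```rust
type Root = [u8; 32];

/// (r_i, r_{i-1}, π): the new root, the previous root, and the consistency proof.
struct CertRequest {
    new_root: Root,
    prev_root: Option<Root>,    // None encodes ⊥ for the very first round
    consistency_proof: Vec<u8>, // opaque here; checked by verify_consistency
}

/// c = (i, r_i, r_{i-1}; s_cl): round number, roots, Consensus Layer signature.
struct Certificate {
    round: u64,
    new_root: Root,
    prev_root: Option<Root>,
    signature: Vec<u8>,
}

/// Idealized oracle state: the most recently certified root r⁻ and the round counter i.
struct ConsensusOracle {
    last_root: Option<Root>,
    round: u64,
}

impl ConsensusOracle {
    fn new() -> Self {
        ConsensusOracle { last_root: None, round: 0 }
    }

    /// Mirrors CertificationRequest of Algorithm 1: accept only a unique
    /// extension of the last certified root with a valid consistency proof.
    fn certification_request(&mut self, req: &CertRequest) -> Option<Certificate> {
        if req.prev_root != self.last_root
            || !verify_consistency(&req.consistency_proof, req.prev_root, req.new_root)
        {
            return None;
        }
        self.last_root = Some(req.new_root);
        self.round += 1;
        Some(Certificate {
            round: self.round,
            new_root: req.new_root,
            prev_root: req.prev_root,
            signature: sign(self.round, req.new_root, req.prev_root),
        })
    }
}

// Placeholders standing in for the real non-deletion check and the BFT signing.
fn verify_consistency(_proof: &[u8], _prev: Option<Root>, _new: Root) -> bool { true }
fn sign(_round: u64, _new: Root, _prev: Option<Root>) -> Vec<u8> { Vec::new() }
```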
Algorithm 1 Consensus Layer modeled as an oracle
function Initialize()
    r⁻ ← ⊥
    i ← 0
end function
function CertificationRequest(ri, ri−1, π)
    if (ri−1 ≠ r⁻) ∨ ¬valid(π, ri, ri−1) then
        return ⊥
    end if
    r⁻ ← ri
    i ← i + 1
    scl ← sigcl(i, ri, ri−1)
    return c = (i, ri, ri−1; scl)
end function

The Aggregation Layer itself does not require an internal consensus mechanism; protocols like Raft could be used for replication and coordination among its redundant nodes. The decentralized consensus is provided by the external Consensus Layer.

If each state transition is accompanied by a cryptographic proof of non-deletion (see Section 4), the Aggregation Layer can be considered trustless.

3.1 "Maximalist" Security Assumptions

In this model, we assume that users are capable of validating all aspects of system operation that are relevant to their own assets. This level of trustlessness is close to the strong guarantees introduced by Bitcoin [3], where each "client" functions as a full validator, starting from downloading and verifying the blockchain from the genesis block.

The Root of Trust is the PoW blockchain. A maximalist user maintains a full node of this chain. This is relatively lightweight, as the "utility" transactions are executed at the Execution Layer. Upon receiving a token, the user must be able to efficiently verify the following:

1. The token is valid (as elaborated elsewhere),
2. The Aggregation Layer has not forked,
3. The Aggregation Layer has not certified conflicting states of the same token.

The second point is addressed by validating a unique state root snapshot embedded in the PoW block header. Since the cumulative state snapshot appears with a delay, the block can only be considered final after a snapshot publishing and block confirmation period; hence, maximalist verification is not instantaneous.

The third point is addressed by auditing the operation of the Aggregation Layer—specifically, ensuring that no Inclusion Proofs have been generated for the token that are not reflected in its recorded history. To achieve this, all non-deletion proofs from the token's genesis up to its current state must be validated. This is made efficient through the use of recursive zero-knowledge proofs (ZKPs), which show that each round's non-deletion proof is valid and that no rounds were skipped from verification. These recursive proofs are generated periodically and are made available with some latency.

3.2 Practical Security Assumptions

If we relax the model by assuming that a majority of BFT consensus nodes exhibit economically rational behavior and do not collude maliciously with the Aggregation Layer, the user can enjoy significantly more practical operational parameters. BFT layer forking (case 2 above) or certifying conflicting states (case 3 above) produces strong cryptographic evidence which is processed out of the critical path of serving users.

In this scenario, a transaction is finalized, and an inclusion proof is returned, within a few seconds, allowing the transaction to be independently verified—without consulting external data³—within the same timeframe.

³ Previously obtained Root of Trust is used to validate future transactions.

The Root of Trust is the set of epoch change records of the BFT consensus layer. These records grow slowly (a few aggregated signatures per week). When transitioning to Proof-of-Stake (PoS) consensus (see Section 2.1.1), the Root of Trust remains the same.
4 Non-deletion Proof

A non-deletion proof is a cryptographic construction that validates one round of operation of the append-only accumulator.

We have the i-th batch of insertions Bi = (k1, k2, ..., kj), where k is an inserted item; all insertions are applied within a single operational round. The root hash before the round is ri−1, and after the round is ri. The accumulator is implemented as a Sparse Merkle Tree (SMT).

The non-deletion proof generation for batch Bi works as follows:

1. The new leaves in batch Bi are inserted into the SMT.

2. For each newly inserted leaf, the sibling nodes on the path from the leaf to the root are collected. Siblings present in, or computable from, other leaves in the batch are discarded. Siblings can be further organized by dividing them into layers, for more efficient verification. We denote the set as πi.

3. Record (Bi, ri−1, ri, πi).

Proof verification works as follows:

1. Verify the authenticity of the state roots ri−1 and ri (e.g., by checking their certification by the Consensus Layer).

2. Build an incomplete SMT: for each item in Bi, insert the value of an empty leaf at the appropriate position.

3. All non-computable siblings needed to compute the root are available in πi. Compute the root and compare it with ri−1; if they are not equal, the proof is not valid.

4. Build again an incomplete SMT; for each item in Bi, insert the value of the key into the appropriate position.

5. Compute the root based on the siblings in πi. If the root is not equal to ri, the proof is not valid.

6. The proof is valid if the checks above passed.

A valid proof demonstrates that, given authentic roots ri−1 and ri, the keys in Bi corresponded to empty leaves prior to the update, that after the update the values in Bi were recorded at the positions defined by their respective keys, and that there were no other changes.

The complete verification algorithm is presented as Algorithm 2. Note that there are several assumptions: the batch is sorted by keys, and the proof is an array of arrays of tuples, where the outer array divides siblings into depth layers and each inner array is sorted by keys (the first element of each tuple).

Due to the sparseness of the SMT we can further improve the encoding. For example, instead of checking whether a node's sibling is the next item among the layer's computed nodes, the next item in the proof array, or the empty element otherwise, we can simply record a number: how many of the next siblings are empty elements (frequent close to the leaves when the SMT is sparsely populated), and likewise how many come from the proof siblings (frequent close to the root).

5 (ZK)-SNARKs

By using an appropriate cryptographic SNARK system, the size of the non-deletion proof can be reduced to a constant.

The statement to be proven in zero-knowledge is the correct execution of the non-deletion proof verification algorithm described in the previous section. The public inputs to the proof (the instance) are the pre- and post-update roots (ri−1, ri). The private input (the witness) ω is the insertion batch Bi and the set of sibling nodes (proof) πi. While ZK-SNARKs can hide the witness, this zero-knowledge property is not a requirement for our use case; we are primarily interested in the proof's succinctness.

In an experiment [5], the statement is implemented as a constraint system R using the CIRCOM domain-specific language. The witness is generated based on πi and Bi, and is supplemented by control wires that define how individual hashing blocks in the circuit are connected to the previous layer and to the inputs. If all constraints are satisfied, the proof is valid.
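For reference, the Rust sketch below shows the shape of the computation that the constraint system encodes: the same sibling set πi is used twice, once with the batch positions set to the empty value (checked against ri−1) and once with the actual batch values (checked against ri). It assumes the batch and each proof layer are sorted by key and that the proof supplies one (possibly empty) layer per tree level; the zero-valued EMPTY constant and the SHA-256 compression are simplifications of the sketch. Algorithm 2 gives the corresponding pseudocode.

```rust
use sha2::{Digest, Sha256};

type Hash = [u8; 32];
const EMPTY: Hash = [0u8; 32]; // value of an empty leaf/subtree (illustrative)

/// 2-to-1 compression used for internal nodes (SHA-256 here for the sketch).
fn h(left: &Hash, right: &Hash) -> Hash {
    let mut hasher = Sha256::new();
    hasher.update(left);
    hasher.update(right);
    hasher.finalize().into()
}

/// Compute the root from a sorted leaf layer (key, value) and the per-layer
/// sibling sets, mirroring ComputeForest of Algorithm 2.
fn compute_forest(proof: &[Vec<(u64, Hash)>], leaves: &[(u64, Hash)]) -> Hash {
    let mut p: Vec<(u64, Hash)> = leaves.to_vec();
    for layer in proof {
        let mut parents: Vec<(u64, Hash)> = Vec::new();
        let mut n = 0; // index into this layer's proof siblings
        let mut m = 0; // index into the current layer's computed nodes
        while m < p.len() {
            let (k, v) = p[m];
            let kp = k / 2;                                  // parent key
            let is_right = k % 2 == 1;
            let ks = 2 * kp + if is_right { 0 } else { 1 };  // sibling key
            let vs = if !is_right && m + 1 < p.len() && p[m + 1].0 == ks {
                m += 1;          // sibling is the next computed node: consume it
                p[m].1
            } else if n < layer.len() && layer[n].0 == ks {
                n += 1;          // sibling is supplied by the proof
                layer[n - 1].1
            } else {
                EMPTY            // otherwise the sibling is an empty subtree
            };
            let vp = if is_right { h(&vs, &v) } else { h(&v, &vs) };
            parents.push((kp, vp));
            m += 1;
        }
        p = parents;
    }
    assert_eq!(p.len(), 1, "all paths must merge into a single root");
    p[0].1
}

/// The two-pass non-deletion check encoded by the SNARK statement.
fn verify_non_deletion(
    proof: &[Vec<(u64, Hash)>],
    batch: &[(u64, Hash)], // sorted by key
    prev_root: &Hash,
    new_root: &Hash,
) -> bool {
    let empties: Vec<(u64, Hash)> = batch.iter().map(|(k, _)| (*k, EMPTY)).collect();
    compute_forest(proof, &empties) == *prev_root && compute_forest(proof, batch) == *new_root
}
```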
The proving system used is Groth16 [1], which is known for its small proof size. The proving time depends on the depth of the SMT (logarithmic in its capacity) and the maximum size of the insertion batch. Importantly, the proving effort does not depend on the total capacity of the SMT, enabling fairly large instantiations.

When the Consensus Layer verifies these succinct proofs, the Aggregation Layer operates trustlessly. However, certain redundancy is still required to ensure data availability of the SMT itself.

6 Circuit-Based SNARK Definition

Due to the limited expressivity of an arithmetic circuit (e.g., no data-dependent loops or real branching), the entire computation flow must be fixed at circuit-creation time. It is therefore helpful to preprocess the inputs to create a fixed execution trace. This pre-processing generates a "wiring" signal, which is supplied as part of the witness. This signal dictates the data flow between the hashing units within the circuit.

To preprocess the proof:

1. The hash forest, which includes the proof's sibling nodes and the new batch leaves, is flattened.

2. The nodes are sorted first by layer (from leaves to root) and then lexicographically within each layer.

3. A wiring signal is generated to control the multiplexers (MUXes) at the input of each hashing unit in the circuit.

Let the maximum batch size be kmax and the SMT depth be d. Since the arithmetic circuit is static, it must be designed to accommodate the maximum possible batch size, kmax.

The circuit has two halves, both controlled by the same wiring signal. It is critical to security that the control signal and the proof are the same for both halves. The first half of the circuit computes the pre-update root by treating all leaves in the insertion batch as zero (the value of an empty leaf). The second half computes the post-update root using the actual values from the batch. The number of hashing units in each half of the circuit is approximately O(kmax · d). Each hashing unit takes its inputs either from the outputs of the previous layer's units or from the set of sibling nodes provided in the proof. The pre-processing step encodes the positions of batch and proof elements into these control signals, which are then supplied as part of the witness.

Each hashing cell in the circuit, as depicted in Figure 5, is a template consisting of two input multiplexers and one 2-to-1 compressing hash function.

The MUX inputs for the leaf layer of the first half are connected to a vector containing:

• The "empty" leaf value (0).
• All new leaves in the batch, which are mapped to "empty" (0).
• The "proof" or sibling hashes (πi).

The MUX inputs for the leaf layer of the second half are connected to a vector containing:

• The "empty" leaf value (0).
• The batch of new leaves (I).
• The identical "proof" or sibling hashes (πi).

The MUXes for internal layers are connected to a vector containing:

• The "empty" leaf value (0).
• Output hashes from the previous layer's cells.
• The "proof" or sibling hashes (πi).

Both halves' MUXes are controlled by the same wiring signal.

6.1 Performance Indication

Initial benchmarks on a consumer laptop (Apple M1) using the Poseidon hash function indicate a proving throughput of up to 25 transactions per second.
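As a concrete illustration of preprocessing steps 1–2 above, the sketch below flattens the hash forest and sorts it by (layer, key); the node representation is an assumption of the sketch, not the encoding used in the CIRCOM experiment [5].

```rust
/// One node of the flattened hash forest: (layer index counted from the
/// leaves, key within the layer, hash value). Layer 0 holds the leaves.
type Node = (usize, u64, [u8; 32]);

/// Steps 1–2 of the preprocessing: flatten the forest (the proof's sibling
/// nodes plus the new batch leaves) and sort it by layer, then by key.
fn flatten_and_sort(
    proof_layers: &[Vec<(u64, [u8; 32])>],
    batch: &[(u64, [u8; 32])],
) -> Vec<Node> {
    let mut nodes: Vec<Node> = Vec::new();
    for (k, v) in batch {
        nodes.push((0, *k, *v)); // new leaves sit at layer 0
    }
    for (layer, siblings) in proof_layers.iter().enumerate() {
        for (k, v) in siblings {
            nodes.push((layer, *k, *v));
        }
    }
    // Step 2: by layer (leaves to root), then lexicographically within a layer.
    nodes.sort_by_key(|&(layer, key, _)| (layer, key));
    nodes
}
```

Step 3 would then walk this sorted list layer by layer (as ComputeForest does in Algorithm 2) and record, for every hashing-unit input, which entry of its MUX input vector to select.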
Figure 4: Circuit structure.

Figure 5: One hashing cell of the circuit.

7 Execution Trace-Based STARK

An alternative to a bespoke arithmetic circuit is to use a general-purpose zero-knowledge virtual machine (zkVM). In this approach, the verification logic is written as a traditional imperative program (e.g., in Rust). The zkVM then generates a proof of correct execution for that program.

We have implemented the non-deletion proof verification algorithm as a Rust program [4] to be proved by the SP1 zkVM [6]. As a commitment to the "right" program we use a prover key, generated during program setup. Its contents are: a commitment to the preprocessed traces; the starting Program Counter register; the starting global digest of the program, after incorporating the initial memory; the chip information; the chip ordering; and the prover configuration.

For verification, we obtain the prover key hash and authenticate it out-of-band. After verifying the proof (calling the verifier with &proof and &vk), we can be sure that the proof, an SP1ProofWithPublicValues, is valid. The proof data structure embeds its validated "instance", or public parameters. Based on these parameters we check that indeed the right thing was executed. In our case the instance is defined by the old root hash and the new root hash, which must be authenticated independently (i.e., using the certificate from the Consensus Layer).
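The sketch below illustrates that verification flow using the sp1-sdk types mentioned above. The exact method names and signatures differ across SP1 releases, and the order in which the guest program commits the two roots is an assumption of the sketch; treat it as illustrative rather than the reference integration.

```rust
use sp1_sdk::{ProverClient, SP1ProofWithPublicValues, SP1VerifyingKey};

/// Check an SP1 proof of the non-deletion verifier and compare its embedded
/// instance (the public values) against independently authenticated roots.
/// Assumes the guest commits the pre-update root first, the post-update root second.
fn check_non_deletion_proof(
    client: &ProverClient,
    proof: &SP1ProofWithPublicValues,
    vk: &SP1VerifyingKey,
    certified_prev_root: [u8; 32], // from the Consensus Layer certificate
    certified_new_root: [u8; 32],
) -> bool {
    // 1) The proof must verify against the out-of-band authenticated key.
    if client.verify(proof, vk).is_err() {
        return false;
    }
    // 2) The instance inside the proof must match the certified roots.
    let mut public_values = proof.public_values.clone();
    let prev_root: [u8; 32] = public_values.read();
    let new_root: [u8; 32] = public_values.read();
    prev_root == certified_prev_root && new_root == certified_new_root
}
```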
The privacy of the witness (the zero-knowledge property) is not a requirement for this application. The primary goal is to achieve computational integrity and succinctness. Therefore, while the underlying technology is often referred to as "ZK", we are using it as a Scalable Transparent ARgument of Knowledge (STARK).

7.1 zkVM Performance

On a 10-core Apple M1 CPU, proving a 500-transaction batch using SHA-256 within the SP1 zkVM takes approximately 5 minutes. However, the SP1 framework is robust and designed for scalability, supporting distributed prover networks, industrial-grade GPUs, proof chunking and recursion, and other advanced features to tackle larger problems with brute force.

7.2 Optimization Ideas

The ZK proving performance is dominated by the cryptographic hashing primitive used by the program.

At the time of writing, the SP1 zkVM⁴ offers precompiles for standard hash functions like SHA-256, which accelerates their execution compared to a direct RISC-V implementation. The use of these precompiles (also known as coprocessors or chips) can be observed in the prover's output, which details the number of calls to each specialized circuit (e.g., SHA_EXTEND, SHA_COMPRESS). However, even with acceleration, proving SHA-256 is computationally expensive.

⁴ [Link]

A possible optimization is to use "ZK-friendly" hash functions. These functions are highly efficient when implemented directly in arithmetic circuits, where there is direct access to the native field elements. Their performance advantage in a RISC-V zkVM is more nuanced, as there is an overhead in translating between the VM's 32-bit integer registers and the underlying finite field elements used by the prover. Operations like range-checking, which are necessary to prevent overflows, are expensive in ZK. There are attempts to create precompiles for ZK-friendly hash functions⁵, with limited real-world effect.

⁵ [Link]

7.3 More on ZK and Hash Functions

Standardized cryptographic hash algorithms like SHA-2 were optimized mostly for minimal physical chip area, a design choice driven by NIST. Others, like the Blake family, were designed for fast execution on CPUs. They all include numerous bitwise operations (e.g., rotations, XOR) that are native to silicon logic but are notoriously inefficient to prove in ZK. Proving such operations is expensive, because a full field element (e.g., a 254-bit value on the BN254 curve) must be used to represent a single bit.⁶ ZK provers are most efficient with arithmetic operations native to the underlying finite field, such as addition and multiplication (and lookups on some ZK stacks). Other operations must be implemented indirectly.

⁶ See e.g. [Link]master/circuits/sha256/[Link]

There are newer cryptographic hash functions designed specifically with ZK efficiency in mind. Functions like Poseidon and Poseidon2 are gaining acceptance but are still relatively new. Some perform better on large fields (e.g., Reinforced Concrete), some on smaller fields (e.g., Monolith), depending also on the proof system's lookup-table support. Even newer examples exhibiting even higher performance are Griffin and Anemoi. Some, like GMiMC, offer a compromise with better performance on silicon CPUs.

A key advantage of these hashes is that they operate directly on field elements, avoiding the costly translation from integer representations. The security level is defined by the underlying field and instantiation parameters. While some VMs, like the Cairo VM used by Starknet, provide direct access to field elements, they are often highly specialized for particular use cases, such as L2 rollups.

7.4 Performance Roadmap

The overall approach is sound: the proving time depends on the size of the addition batch and, notably, it does not have a linear relationship to the total capacity of the data structure. The verification algorithm is tight.
To overcome the performance bottleneck, a ZK-friendly hash function is essential. The ideal proving framework would provide direct access to the native field elements of its arithmetization layer, a feature not typically available in general-purpose zkVMs. Execution trace generation must be highly efficient (a criterion that excludes older frameworks like Cairo 0). The prover itself must be fast. The state of the art uses small prime fields (e.g., BabyBear, Mersenne-31) and FRI-based polynomial commitment schemes, like Circle-STARKs [2]. Promising implementations are Plonky3⁷ and STwo⁸. Considering the need for maturity, modularity, and an open-source license, Plonky3 emerges as the strongest option.

⁷ [Link]
⁸ [Link]

To utilize the Plonky3 framework, the verification logic must be implemented as a custom AIR (Algebraic Intermediate Representation) circuit rather than a general-purpose program.

7.5 Custom AIR Circuit

Extrapolating from benchmarks of similar computations⁹ using the Plonky3 framework, the Poseidon2 hash function, and a small finite field, the projected performance of such a stack on a 10-core CPU is approximately 10 000 tx/s. The "blowup factor" parameter is 2^1, resulting in a 1.7 MB proof. A more conservative configuration with a blowup factor of 2^3 would yield approximately 2500 tx/s with a 0.7 MB proof and higher memory requirements for the prover.

⁹ Experiment with iterative hashing and a hash-based signature scheme [Link]

These figures indicate that operating a very large-scale Aggregation Layer in a trustless manner is economically feasible.

We note that the Poseidon family of hash functions is relatively new and has undergone less cryptographic analysis than traditional hash functions like the SHA-2 or SHA-3 families. However, among the new class of ZK-friendly arithmetic hash functions, Poseidon has undergone the most public scrutiny and can be tentatively considered secure for this type of application. It offers an estimated 50× improvement in proving performance compared to an efficient standard hash function like Blake3.

8 Summary

Zero-knowledge proof systems offer a powerful method for creating succinct proofs of performing some computation, in our case, checking consistency proofs of a distributed cryptographic data structure. For use cases with small changesets, a simple hash-based proof, whose size is linear in the batch size, is optimal. However, as batch sizes increase and bandwidth becomes a constraint, the constant or near-constant size proofs generated by ZK systems become more advantageous.

Different proof systems offer different trade-offs. The relevant properties are: proving effort, the necessity of a trusted setup, the generality of the trusted setup, interactivity, proof recursion-friendliness, and of course properties like availability of tooling, maturity, and trustworthiness. Some, like STARKs, are relatively fast to prove but have fairly large proofs, and avoid undesirable properties such as trusted setup. Others, like Groth16, produce small proofs but require more proving effort and a circuit-specific trusted setup. For more complex applications, hybrid approaches and proof recursion can be employed. Figure 6 illustrates the proof size trade-off.

Figure 6: Proof size vs. use of ZK compression. The plot shows proof size against inclusion batch size for the hash-based consistency proof, a STARK, and a SNARK (e.g. Groth16). Dotted line is the bandwidth limit, dashed line is the compute limit (ZK scheme specific). Not to scale.

Algorithm 2 Verification of non-deletion proof
function VerifyNonDeletion(π, ri−1, ri, P)
    ▷ Proof π is a by-layer array of sorted arrays of k-v tuples
    ▷ Insertion batch P is a sorted array of k-v tuples
    p∅ ← {(k, ∅) | (k, v) ∈ P}                ▷ Empty leaves
    r∅ ← ComputeForest(π, p∅)
    assert r∅ = ri−1
    ▷ Same with the batch's leaves populated
    rB ← ComputeForest(π, P)
    assert rB = ri
    return 1                                  ▷ Success
end function

function ComputeForest(π, p)
    for ℓ ∈ tree depth do
        p′ ← [ ]                              ▷ Computed nodes of the parent layer
        m ← 0; n ← 0                          ▷ Indices
        while m < |p| do
            (k, v) ← p[m]
            kp ← ⌊k/2⌋                        ▷ Parent key
            is_right ← k mod 2
            ks ← 2kp + (1 − is_right)         ▷ Sibling key
            if ¬is_right ∧ |p| > m + 1 ∧ p[m + 1].k = ks then
                vs ← p[m + 1].v               ▷ Right sibling is the next computed node
                m ← m + 1                     ▷ Jump over it
            else if |π[ℓ]| > n ∧ π[ℓ][n].k = ks then
                vs ← π[ℓ][n].v                ▷ Sibling supplied by the proof
                n ← n + 1
            else
                vs ← ∅                        ▷ Empty sibling
            end if
            vp ← h(vs, v) if is_right else h(v, vs)
            p′ ← p′ ∥ (kp, vp)
            m ← m + 1
        end while
        p ← p′
    end for
    assert |p| = 1                            ▷ One root!
    return p[0].v                             ▷ Value of the root
end function

References

[1] Jens Groth. On the size of pairing-based non-interactive arguments. Cryptology ePrint Archive, Paper 2016/260, 2016.
[2] Ulrich Haböck, David Levit, and Shahar Papini. Circle STARKs. Cryptology ePrint Archive, Paper 2024/278, 2024.
[3] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. 2009.
[4] ristik. SP1 zkVM-based consistency proof. [Link]zkvm-ndsmt, 2025.
[5] ristik. Trustless SMT accumulator. https://[Link]/unicitynetwork/nd-smt, 2025.
[6] Succinct Labs. SP1. [Link]succinctlabs/sp1, 2025.
[7] Tevador. RandomX: Experimental proof-of-work algorithm based on random code execution. [Link], 2025.
[8] The Unicity Developers. Unicity whitepaper. [Link]whitepaper/releases/tag/latest, 2025.
