
[EN Performance] Optimize checkpoint serialization for -37GB operational RAM, -2.7 minutes duration, -19.6 million allocs (50% fewer allocs) #2964

@fxamacker

Description


Problem

Although PR #2792 reduces the peak memory used by checkpointing by reusing ledger state, we can further reduce peak memory during checkpoint serialization by over 35GB.

Updates #1744

Proposed Solution

Replace the largest data structure used for checkpoint serialization and process subtries instead of the entire trie. Also use preallocation where feasible.

Optionally, add a flag to specify the number of subtrie levels. Specifying 4 levels uses 2^4 = 16 subtries, a reasonable default for impactful memory savings and faster serialization.

This proposed change also makes it easier to serialize data in parallel, but that is outside the scope of this issue.

Preliminary Results Using Levels=4 (16 Subtries)

Using August 12 mainnet checkpoint file with Go 1.18.5:

  • -37GB peak RAM (top command), -23GB RAM (go bench B/op)
  • -19.6 million (-50%) allocs/op in serialization phase
  • -2.7 minutes duration
```
Before:    625746 ms    88320868048 B/op    39291999 allocs/op
After:     461937 ms    64978613264 B/op    19671410 allocs/op
```
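The headline deltas follow directly from the Before/After numbers (using decimal GB, i.e. 10^9 bytes):

```go
package main

import "fmt"

func main() {
	// Raw Before/After figures quoted above.
	beforeMS, afterMS := 625746.0, 461937.0
	beforeB, afterB := 88320868048.0, 64978613264.0
	beforeAllocs, afterAllocs := 39291999.0, 19671410.0

	fmt.Printf("duration: -%.1f minutes\n", (beforeMS-afterMS)/1000/60)
	fmt.Printf("B/op:     -%.1f GB\n", (beforeB-afterB)/1e9)
	fmt.Printf("allocs:   -%.1f million (-%.0f%%)\n",
		(beforeAllocs-afterAllocs)/1e6,
		100*(beforeAllocs-afterAllocs)/beforeAllocs)
}
```

This reproduces the -2.7 minutes, roughly -23GB B/op, and -19.6 million (-50%) allocs/op summarized above; the larger -37GB figure is peak process RSS observed via `top`, which go bench B/op does not capture.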

No benchstat comparisons (n=5+) yet due to the benchmark's duration and memory requirements (it needs the big benchnet-dev-004 server).

EDIT: added more details after reading PR #3050 review comments.
