This page documents the PendingUpdate base class and the concrete high-level update implementations that represent user-facing table operations. It also covers the lower-level TableUpdate hierarchy, which represents atomic metadata changes applied during commit. For how updates are batched and committed atomically, see Transaction System (5.2). For the requirements that guard commits, see 5.4.
The update system has two distinct layers:
| Layer | Base Class | Role | Location |
|---|---|---|---|
| High-level (user API) | PendingUpdate | Builder-pattern API for expressing intent | src/iceberg/update/ |
| Low-level (metadata changes) | TableUpdate | Discrete, serializable changes applied to TableMetadataBuilder | src/iceberg/table_update.h |
PendingUpdate implementations collect user input, validate it, and produce structured results. Those results are translated by Transaction into one or more TableUpdate objects, which are then serialized and sent to the catalog via Catalog::UpdateTable.
Update Operation Lifecycle
Sources: src/iceberg/update/pending_update.h1-94 src/iceberg/transaction.cc94-161
PendingUpdate is the abstract base for all user-facing update operations. It is declared in src/iceberg/update/pending_update.h1-94
class PendingUpdate : public ErrorCollector
Every PendingUpdate is associated with a Transaction at construction time and cannot be copied. All created instances are tracked (via weak_ptr) by the owning Transaction.
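The ownership arrangement described above can be sketched as follows. This is a minimal illustration, not the project's code: SketchTransaction, SketchUpdate, NewUpdate, and LiveUpdates are hypothetical names, and holding the transaction through a shared_ptr is an assumption made only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

class SketchTransaction;  // hypothetical stand-in for Transaction

// Hypothetical stand-in for PendingUpdate: bound to a transaction at
// construction time and not copyable.
class SketchUpdate {
 public:
  explicit SketchUpdate(std::shared_ptr<SketchTransaction> txn)
      : txn_(std::move(txn)) {}
  SketchUpdate(const SketchUpdate&) = delete;
  SketchUpdate& operator=(const SketchUpdate&) = delete;

 private:
  std::shared_ptr<SketchTransaction> txn_;  // assumption: shared ownership
};

static_assert(!std::is_copy_constructible<SketchUpdate>::value,
              "updates must not be copyable");

class SketchTransaction
    : public std::enable_shared_from_this<SketchTransaction> {
 public:
  // Creates an update and remembers it without extending its lifetime.
  std::shared_ptr<SketchUpdate> NewUpdate() {
    auto update = std::make_shared<SketchUpdate>(shared_from_this());
    tracked_.push_back(update);  // stored as weak_ptr
    return update;
  }

  // Counts tracked updates that are still alive.
  std::size_t LiveUpdates() const {
    std::size_t live = 0;
    for (const auto& weak : tracked_) {
      if (!weak.expired()) ++live;
    }
    return live;
  }

 private:
  std::vector<std::weak_ptr<SketchUpdate>> tracked_;
};
```

Because the transaction only holds weak_ptr entries, dropping the last shared_ptr to an update destroys it even while the transaction is still alive.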
Each concrete type reports its identity via kind():
| Kind | Concrete Class |
|---|---|
| kExpireSnapshots | ExpireSnapshots |
| kSetSnapshot | SetSnapshot |
| kUpdateLocation | UpdateLocation |
| kUpdatePartitionSpec | UpdatePartitionSpec |
| kUpdatePartitionStatistics | UpdatePartitionStatistics |
| kUpdateProperties | UpdateProperties |
| kUpdateSchema | UpdateSchema |
| kUpdateSnapshot | SnapshotUpdate (and subclasses) |
| kUpdateSnapshotReference | UpdateSnapshotReference |
| kUpdateSortOrder | UpdateSortOrder |
| kUpdateStatistics | UpdateStatistics |
Sources: src/iceberg/update/pending_update.h44-56
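One plausible use of kind() is to route an update to its matching Transaction::Apply... handler. The sketch below illustrates that idea with three of the kinds listed above. Whether the real Transaction dispatches with a switch is an assumption; PendingUpdateSketch, UpdateLocationSketch, UpdateSortOrderSketch, and DispatchTarget are hypothetical names.

```cpp
#include <cassert>
#include <string>

// A reduced copy of the Kind enumeration, limited to three entries.
enum class Kind { kUpdateLocation, kUpdateProperties, kUpdateSortOrder };

// Hypothetical stand-in for PendingUpdate.
class PendingUpdateSketch {
 public:
  virtual ~PendingUpdateSketch() = default;
  virtual Kind kind() const = 0;
};

class UpdateLocationSketch : public PendingUpdateSketch {
 public:
  Kind kind() const override { return Kind::kUpdateLocation; }
};

class UpdateSortOrderSketch : public PendingUpdateSketch {
 public:
  Kind kind() const override { return Kind::kUpdateSortOrder; }
};

// Returns the name of the handler that would process the update.
// Assumption: dispatch is a plain switch over kind().
std::string DispatchTarget(const PendingUpdateSketch& update) {
  switch (update.kind()) {
    case Kind::kUpdateLocation:
      return "ApplyUpdateLocation";
    case Kind::kUpdateProperties:
      return "ApplyUpdateProperties";
    case Kind::kUpdateSortOrder:
      return "ApplyUpdateSortOrder";
  }
  return "unknown";  // unreachable for valid enumerators
}
```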
| Method | Description |
|---|---|
| kind() | Returns the Kind of this update (pure virtual) |
| Commit() | Delegates to transaction_->Apply(*this), then triggers commit if auto_commit is set |
| Finalize(optional<Error>) | Called after commit; implementations clean up resources (e.g., delete written files on failure) |
The protected base() accessor returns const TableMetadata& representing the unmodified starting metadata.
All builder-pattern methods (e.g., Set(), AddSortField()) use ICEBERG_BUILDER_CHECK to defer validation errors, which are collected via ErrorCollector. When Apply() is called, CheckErrors() runs first: if any deferred errors exist, they are returned immediately as a ValidationFailed error.
Sources: src/iceberg/update/pending_update.h42-93 src/iceberg/update/update_properties.cc48-68 src/iceberg/update/update_sort_order.cc48-73
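The deferred-validation pattern can be sketched without the real macro. In this simplified version, builder methods record problems instead of failing, and Apply() reports them all at once. ErrorCollectorSketch, RetentionSketch, and their methods are hypothetical; the real ErrorCollector and ICEBERG_BUILDER_CHECK may differ in shape and error type.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ErrorCollector.
class ErrorCollectorSketch {
 public:
  void AddError(std::string message) { errors_.push_back(std::move(message)); }

  // Returns all recorded errors joined by "; ", or nullopt if there are none.
  std::optional<std::string> CheckErrors() const {
    if (errors_.empty()) return std::nullopt;
    std::string joined;
    for (const auto& error : errors_) {
      if (!joined.empty()) joined += "; ";
      joined += error;
    }
    return joined;
  }

 private:
  std::vector<std::string> errors_;
};

// Hypothetical builder: every setter returns *this so calls can be chained,
// even when the input is invalid.
class RetentionSketch : public ErrorCollectorSketch {
 public:
  RetentionSketch& SetName(const std::string& name) {
    if (name.empty()) {
      AddError("name must not be empty");
      return *this;
    }
    name_ = name;
    return *this;
  }

  RetentionSketch& SetRetentionDays(int days) {
    if (days <= 0) {
      AddError("retention days must be positive");
      return *this;
    }
    days_ = days;
    return *this;
  }

  // Returns the deferred errors (if any) before doing any other work.
  std::optional<std::string> Apply() const { return CheckErrors(); }

 private:
  std::string name_;
  int days_ = 1;
};
```

Deferring errors keeps the fluent call chain intact: the caller checks a single result at Apply() time rather than after every setter.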
Class hierarchy diagram
Sources: src/iceberg/type_fwd.h191-205 src/iceberg/update/pending_update.h1-94 src/iceberg/transaction.h65-110
Header: src/iceberg/update/update_properties.h1-81 Implementation: src/iceberg/update/update_properties.cc1-108
Modifies the key-value property map on table metadata.
Builder methods:
| Method | Description |
|---|---|
| Set(key, value) | Adds or updates a property. Rejects reserved keys and keys already marked for removal. |
| Remove(key) | Marks a property for removal. Rejects keys already marked for update. |
ApplyResult structure:
If format-version is set as a property key, it is intercepted: the version is parsed, validated against TableMetadata::kSupportedTableFormatVersion, and returned separately as format_version. It is not included in the updates map.
After computing the new effective property map, MetricsConfig::VerifyReferencedColumns is called to ensure any column-level metrics properties still reference valid schema columns.
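The conflict rules and the format-version interception can be illustrated with a reduced model. This is a sketch under assumptions: ApplyResultSketch mirrors the updates, removals, and format_version names used on this page, but the field types are guesses. Reserved-key rejection, version validation, and the metrics check are omitted, and conflicts are reported with a bool instead of deferred errors.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical result shape; field types are assumptions.
struct ApplyResultSketch {
  std::map<std::string, std::string> updates;
  std::set<std::string> removals;
  std::optional<int> format_version;
};

class UpdatePropertiesSketch {
 public:
  // Returns false if the key is already marked for removal.
  bool Set(const std::string& key, const std::string& value) {
    if (removals_.count(key) > 0) return false;
    updates_[key] = value;
    return true;
  }

  // Returns false if the key is already marked for update.
  bool Remove(const std::string& key) {
    if (updates_.count(key) > 0) return false;
    removals_.insert(key);
    return true;
  }

  // Splits "format-version" out of the update map.
  // Note: std::stoi throws on non-numeric input; real code would validate.
  ApplyResultSketch Apply() const {
    ApplyResultSketch result;
    result.removals = removals_;
    for (const auto& [key, value] : updates_) {
      if (key == "format-version") {
        result.format_version = std::stoi(value);
      } else {
        result.updates.emplace(key, value);
      }
    }
    return result;
  }

 private:
  std::map<std::string, std::string> updates_;
  std::set<std::string> removals_;
};
```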
Transaction application: src/iceberg/transaction.cc206-218
Transaction::ApplyUpdateProperties
-> UpdateProperties::Apply() -> ApplyResult
-> metadata_builder_->SetProperties(updates)
-> metadata_builder_->RemoveProperties(removals)
-> metadata_builder_->UpgradeFormatVersion(version)
Sources: src/iceberg/update/update_properties.h1-81 src/iceberg/update/update_properties.cc1-108 src/iceberg/transaction.cc206-218
Header: src/iceberg/update/update_sort_order.h1-80 Implementation: src/iceberg/update/update_sort_order.cc1-107
Replaces the default sort order with a newly built one.
Builder methods:
| Method | Description |
|---|---|
| AddSortField(term, direction, null_order) | Adds a sort field. The term must be an unbound NamedReference or UnboundTransform. It is immediately bound against the current schema. |
| AddSortFieldByName(name, direction, null_order) | Convenience method; creates a NamedReference then calls AddSortField. |
| CaseSensitive(bool) | Controls case sensitivity when resolving field names. Default is true. |
Apply() returns a shared_ptr<SortOrder>. If no fields were added, it returns SortOrder::Unsorted() (ID = 0). Otherwise a placeholder ID of -1 is used; the actual ID is assigned by TableMetadataBuilder::AddSortOrder during commit.
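The ID handling can be sketched as two small functions. All names here are hypothetical, and the assignment rule in AssignOrderId (one past the highest existing ID) is an assumption for illustration only; the actual TableMetadataBuilder::AddSortOrder logic may, for example, reuse the ID of an equivalent existing order.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct SortFieldSketch {
  std::string name;
  bool ascending;
};

struct SortOrderSketch {
  int order_id;
  std::vector<SortFieldSketch> fields;
};

constexpr int kUnsortedOrderId = 0;      // ID of the unsorted order
constexpr int kPlaceholderOrderId = -1;  // replaced during commit

// Mirrors the Apply() behavior described above: no fields means "unsorted",
// otherwise the order carries a placeholder ID.
SortOrderSketch BuildSortOrder(std::vector<SortFieldSketch> fields) {
  if (fields.empty()) return SortOrderSketch{kUnsortedOrderId, {}};
  return SortOrderSketch{kPlaceholderOrderId, std::move(fields)};
}

// Assumption: a placeholder is resolved to one past the highest known ID.
int AssignOrderId(const SortOrderSketch& order, int highest_existing_id) {
  if (order.order_id == kPlaceholderOrderId) return highest_existing_id + 1;
  return order.order_id;
}
```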
Transaction application: src/iceberg/transaction.cc274-278
Transaction::ApplyUpdateSortOrder
-> UpdateSortOrder::Apply() -> SortOrder
-> metadata_builder_->SetDefaultSortOrder(sort_order)
Sources: src/iceberg/update/update_sort_order.h1-80 src/iceberg/update/update_sort_order.cc1-107 src/iceberg/transaction.cc274-278
Header: src/iceberg/update/update_schema.h
Implementation: src/iceberg/update/update_schema.cc
Builds an updated Schema and sets it as the current schema.
Apply() returns an ApplyResult containing:
- schema: the new shared_ptr<Schema>
- new_last_column_id: the updated last-assigned field ID

Transaction application: src/iceberg/transaction.cc220-225
Transaction::ApplyUpdateSchema
-> UpdateSchema::Apply() -> ApplyResult
-> metadata_builder_->SetCurrentSchema(schema, new_last_column_id)
The AddSchema and SetCurrentSchema TableUpdate objects are generated by TableMetadataBuilder. See 4.2 for schema evolution semantics.
Sources: src/iceberg/transaction.cc220-225 src/iceberg/table_update.h154-197
Header: src/iceberg/update/update_partition_spec.h
Implementation: src/iceberg/update/update_partition_spec.cc
Adds or replaces the table's default partition specification.
Apply() returns an ApplyResult with:
- spec: the new shared_ptr<PartitionSpec>
- set_as_default: whether this spec should become the default

Transaction application: src/iceberg/transaction.cc196-204
Transaction::ApplyUpdatePartitionSpec
-> UpdatePartitionSpec::Apply() -> ApplyResult
-> metadata_builder_->SetDefaultPartitionSpec(spec) [if set_as_default]
-> metadata_builder_->AddPartitionSpec(spec) [otherwise]
Sources: src/iceberg/transaction.cc196-204 src/iceberg/table_update.h199-262
Header: src/iceberg/update/update_location.h
Implementation: src/iceberg/update/update_location.cc
Sets the table's base storage location.
Apply() returns a string with the new location.
Transaction application: src/iceberg/transaction.cc190-194
Transaction::ApplyUpdateLocation
-> UpdateLocation::Apply() -> string
-> metadata_builder_->SetLocation(location)
Sources: src/iceberg/transaction.cc190-194 src/iceberg/table_update.h495-511
Header: src/iceberg/update/expire_snapshots.h
Implementation: src/iceberg/update/expire_snapshots.cc
Identifies and removes expired snapshots based on retention policies.
Apply() returns an ApplyResult containing sets of IDs to remove:
- snapshot_ids_to_remove
- refs_to_remove
- partition_spec_ids_to_remove
- schema_ids_to_remove

Transaction application: src/iceberg/transaction.cc163-181
Transaction::ApplyExpireSnapshots
-> ExpireSnapshots::Apply() -> ApplyResult
-> metadata_builder_->RemoveSnapshots(ids)
-> metadata_builder_->RemoveRef(name) [for each ref]
-> metadata_builder_->RemovePartitionSpecs(ids)
-> metadata_builder_->RemoveSchemas(ids)
Sources: src/iceberg/transaction.cc163-181 src/iceberg/table_update.h352-391
Header: src/iceberg/update/update_snapshot_reference.h
Implementation: src/iceberg/update/update_snapshot_reference.cc
Manages named snapshot references (branches and tags). Corresponds to the low-level SetSnapshotRef and RemoveSnapshotRef TableUpdate types.
Apply() returns an ApplyResult with:
- to_set: map from name to SnapshotRef to create or update
- to_remove: list of reference names to delete

Transaction application: src/iceberg/transaction.cc263-272
Transaction::ApplyUpdateSnapshotReference
-> UpdateSnapshotReference::Apply() -> ApplyResult
-> metadata_builder_->RemoveRef(name) [for each removal]
-> metadata_builder_->SetRef(name, ref) [for each addition]
Sources: src/iceberg/transaction.cc263-272 src/iceberg/table_update.h373-449
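The application order shown above (removals first, then additions) can be sketched against a plain map. SnapshotRefSketch, RefChangesSketch, and ApplyRefChanges are hypothetical names; only to_set and to_remove come from the description, and their container types are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced form of a snapshot reference.
struct SnapshotRefSketch {
  std::int64_t snapshot_id;
  bool is_branch;  // false means the reference is a tag
};

using RefMap = std::map<std::string, SnapshotRefSketch>;

struct RefChangesSketch {
  RefMap to_set;
  std::vector<std::string> to_remove;
};

// Applies removals before additions and returns the resulting references.
RefMap ApplyRefChanges(RefMap refs, const RefChangesSketch& changes) {
  for (const auto& name : changes.to_remove) {
    refs.erase(name);
  }
  for (const auto& [name, ref] : changes.to_set) {
    refs.insert_or_assign(name, ref);
  }
  return refs;
}
```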
Headers: src/iceberg/update/update_statistics.h, src/iceberg/update/update_partition_statistics.h
Manage StatisticsFile and PartitionStatisticsFile records on the table.
UpdateStatistics::Apply() returns an ApplyResult with to_set and to_remove maps keyed by snapshot_id.
Transaction application: src/iceberg/transaction.cc280-299
Transaction::ApplyUpdateStatistics
-> UpdateStatistics::Apply() -> ApplyResult
-> metadata_builder_->SetStatistics(file)
-> metadata_builder_->RemoveStatistics(snapshot_id)
Sources: src/iceberg/transaction.cc280-299 src/iceberg/table_update.h462-566
Headers: src/iceberg/update/set_snapshot.h, src/iceberg/update/snapshot_update.h
SetSnapshot handles rollback to an existing snapshot or setting the current branch. Apply() returns int64_t — the target snapshot ID.
SnapshotUpdate is the base class for all snapshot-producing operations (like FastAppend). Its Apply() returns a result containing the new Snapshot and target branch name.
The transaction application for SnapshotUpdate is more complex: it uses a temporary TableMetadataBuilder to check whether the metadata actually changed before committing (src/iceberg/transaction.cc227-261).
Header: src/iceberg/update/fast_append.h
Implementation: src/iceberg/update/fast_append.cc
A SnapshotUpdate subclass that appends data files without rewriting manifests. Uses RollingManifestWriter to write new manifest files and produces a new Snapshot pointing to a new manifest list.
Finalize is overridden to clean up written manifest files if the commit fails.
Sources: src/iceberg/update/meson.build18-36 src/iceberg/CMakeLists.txt89-101
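The cleanup-on-failure behavior of Finalize can be sketched with an in-memory file store. Everything below is hypothetical (ErrorSketch, FakeFileStore, AppendSketch, WriteManifest); it only illustrates the idea that files written before a failed commit are deleted, while files from a successful commit are kept.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

struct ErrorSketch {
  std::string message;
};

// In-memory stand-in for file storage.
class FakeFileStore {
 public:
  void Write(const std::string& path) { paths_.insert(path); }
  void Delete(const std::string& path) { paths_.erase(path); }
  bool Exists(const std::string& path) const { return paths_.count(path) > 0; }

 private:
  std::set<std::string> paths_;
};

// Hypothetical append operation that remembers what it wrote.
class AppendSketch {
 public:
  explicit AppendSketch(FakeFileStore& store) : store_(store) {}

  void WriteManifest(const std::string& path) {
    store_.Write(path);
    written_.push_back(path);
  }

  // Called after the commit attempt; a present error means the commit failed.
  void Finalize(const std::optional<ErrorSketch>& commit_error) {
    if (commit_error.has_value()) {
      for (const auto& path : written_) store_.Delete(path);
    }
    written_.clear();
  }

 private:
  FakeFileStore& store_;
  std::vector<std::string> written_;
};
```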
Header: src/iceberg/update/snapshot_manager.h
Implementation: src/iceberg/update/snapshot_manager.cc
Manages snapshot references (branches and tags) outside the standard PendingUpdate flow. It has its own commit logic and is not added to the transaction's pending update list.
Sources: src/iceberg/transaction.cc426-429 src/iceberg/table.cc226-228
TableUpdate (src/iceberg/table_update.h42-106) is the low-level, serializable representation of a single atomic metadata mutation. Each instance is applied to a TableMetadataBuilder via ApplyTo(), and generates TableRequirement instances via GenerateRequirements().
TableUpdate concrete types
Sources: src/iceberg/table_update.h42-106 src/iceberg/table_update.cc1-568
Each TableUpdate calls methods on TableUpdateContext from GenerateRequirements() to register the requirements that must pass before the commit is accepted:
| TableUpdate | Requirement generated |
|---|---|
| AddSchema | RequireLastAssignedFieldIdUnchanged() |
| SetCurrentSchema | RequireCurrentSchemaIdUnchanged() |
| AddPartitionSpec | RequireLastAssignedPartitionIdUnchanged() |
| SetDefaultPartitionSpec | RequireDefaultSpecIdUnchanged() |
| RemovePartitionSpecs | RequireDefaultSpecIdUnchanged(), RequireNoBranchesChanged() |
| RemoveSchemas | RequireCurrentSchemaIdUnchanged(), RequireNoBranchesChanged() |
| SetDefaultSortOrder | RequireDefaultSortOrderIdUnchanged() |
| SetSnapshotRef | AssertRefSnapshotID (optimistic concurrency on the branch) |
| AssignUUID, UpgradeFormatVersion, AddSortOrder, AddSnapshot, RemoveSnapshots, RemoveSnapshotRef, SetProperties, RemoveProperties, SetLocation, statistics types | No requirements |
Sources: src/iceberg/table_update.cc37-567
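The mapping in the table can be sketched as a visitor-like pass in which each update registers its requirements on a shared context. The classes below are hypothetical reductions of four rows from the table. Collecting requirements in a set, so that a requirement registered by two updates appears once, is an assumption about how the context behaves.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for TableUpdateContext.
class ContextSketch {
 public:
  void Require(const std::string& name) { requirements_.insert(name); }
  const std::set<std::string>& requirements() const { return requirements_; }

 private:
  std::set<std::string> requirements_;  // assumption: duplicates collapse
};

// Hypothetical stand-in for TableUpdate.
class TableUpdateSketch {
 public:
  virtual ~TableUpdateSketch() = default;
  virtual void GenerateRequirements(ContextSketch& context) const = 0;
};

class AddSchemaSketch : public TableUpdateSketch {
 public:
  void GenerateRequirements(ContextSketch& context) const override {
    context.Require("LastAssignedFieldIdUnchanged");
  }
};

class SetCurrentSchemaSketch : public TableUpdateSketch {
 public:
  void GenerateRequirements(ContextSketch& context) const override {
    context.Require("CurrentSchemaIdUnchanged");
  }
};

class RemoveSchemasSketch : public TableUpdateSketch {
 public:
  void GenerateRequirements(ContextSketch& context) const override {
    context.Require("CurrentSchemaIdUnchanged");
    context.Require("NoBranchesChanged");
  }
};

class SetPropertiesSketch : public TableUpdateSketch {
 public:
  // Property changes register no requirements.
  void GenerateRequirements(ContextSketch&) const override {}
};

// Runs every update against one context and returns what was registered.
std::set<std::string> CollectRequirements(
    const std::vector<std::unique_ptr<TableUpdateSketch>>& updates) {
  ContextSketch context;
  for (const auto& update : updates) update->GenerateRequirements(context);
  return context.requirements();
}
```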
The Table class provides factory methods that create a short-lived auto-commit Transaction, then delegate to it:
When UpdateProperties::Commit() is called, it delegates to transaction_->Apply(*this). Because auto_commit=true, Transaction::Apply calls Transaction::Commit() immediately after applying the update.
For multi-operation transactions (auto_commit=false), the user calls Transaction::Commit() manually after configuring all updates.
Sources: src/iceberg/table.cc163-224 src/iceberg/transaction.cc94-161 src/iceberg/update/pending_update.h62-76
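The difference between the two modes can be sketched with a toy transaction. TransactionSketch and its members are hypothetical; the only behavior taken from the description above is that Apply triggers Commit immediately when auto_commit is true, and otherwise leaves changes pending until the caller commits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Transaction; changes are plain strings.
class TransactionSketch {
 public:
  explicit TransactionSketch(bool auto_commit) : auto_commit_(auto_commit) {}

  // Stages a change and commits right away in auto-commit mode.
  void Apply(const std::string& change) {
    pending_.push_back(change);
    if (auto_commit_) Commit();
  }

  // Moves all pending changes into the committed list as one commit.
  void Commit() {
    committed_.insert(committed_.end(), pending_.begin(), pending_.end());
    pending_.clear();
    ++commit_count_;
  }

  int commit_count() const { return commit_count_; }
  std::size_t committed_changes() const { return committed_.size(); }
  std::size_t pending_changes() const { return pending_.size(); }

 private:
  bool auto_commit_;
  int commit_count_ = 0;
  std::vector<std::string> pending_;
  std::vector<std::string> committed_;
};
```

In the manual mode, several staged changes land in a single commit, which is what makes multi-operation transactions atomic from the caller's point of view.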
All public headers for the update subsystem are installed under iceberg/update/:
| File | Class |
|---|---|
| update/pending_update.h | PendingUpdate |
| update/expire_snapshots.h | ExpireSnapshots |
| update/fast_append.h | FastAppend |
| update/set_snapshot.h | SetSnapshot |
| update/snapshot_manager.h | SnapshotManager |
| update/snapshot_update.h | SnapshotUpdate |
| update/update_location.h | UpdateLocation |
| update/update_partition_spec.h | UpdatePartitionSpec |
| update/update_partition_statistics.h | UpdatePartitionStatistics |
| update/update_properties.h | UpdateProperties |
| update/update_schema.h | UpdateSchema |
| update/update_snapshot_reference.h | UpdateSnapshotReference |
| update/update_sort_order.h | UpdateSortOrder |
| update/update_statistics.h | UpdateStatistics |
The low-level TableUpdate hierarchy lives at iceberg/table_update.h.
Sources: src/iceberg/update/meson.build18-36 src/iceberg/CMakeLists.txt89-101