This document describes the testing infrastructure and continuous integration/continuous deployment (CI/CD) pipelines for Apache Iceberg C++. It covers the test frameworks, GitHub Actions workflows, multi-platform testing strategies, sanitizer testing, code quality checks, and the release process.
For information about building the project locally, see Building from Source. For details about the library architecture and modular organization, see Library Architecture.
Apache Iceberg C++ uses Google Test (GTest) and Google Mock (GMock) as the testing framework. Tests are organized into multiple test executables, each grouping related functionality.
Google Test is fetched and configured in src/iceberg/test/CMakeLists.txt18-25:
- Repository: https://github.com/google/googletest.git
- Commit: b514bdc898e2951020cbdca1304b75f5950d1f59 (release-1.15.2)
- Method: FetchContent with optional system package fallback

For Meson builds, GTest is declared as a dependency in src/iceberg/test/meson.build18:
gmock_main_dep = dependency('gmock_main')
Tests are organized into focused executables based on subsystem. Each executable is linked against either iceberg_static (core tests) or iceberg_bundle_static (tests requiring Arrow/Avro/Parquet) and GTest::gmock_main.
Test Executable Summary
| Test Executable | Link Target | Test Sources | Primary Coverage |
|---|---|---|---|
| schema_test | iceberg_static | assign_id_visitor_test.cc, schema_test.cc, type_test.cc | Type system, Schema class, SchemaField operations, Transform functions |
| table_test | iceberg_static | table_metadata_builder_test.cc, snapshot_test.cc, requirement_test.cc | TableMetadataBuilder, Snapshot, TableRequirement validation |
| expression_test | iceberg_static | expression_test.cc, evaluator_test.cc, predicate_test.cc | Expression tree, ExpressionEvaluator, PredicateProjection |
| json_serde_test | iceberg_static | json_serde_test.cc, metadata_serde_test.cc | JSON serialization/deserialization, metadata persistence |
| util_test | iceberg_static | uuid_test.cc, decimal_test.cc, url_encoder_test.cc | UUID generation, Decimal128 operations, URL encoding |
| roaring_test | iceberg_static | roaring_test.cc | CRoaring bitmap operations |
| avro_test | iceberg_bundle_static | avro_test.cc, avro_schema_test.cc, avro_data_test.cc | AvroReader/AvroWriter, schema conversion, Avro C API |
| arrow_test | iceberg_bundle_static | arrow_test.cc, arrow_fs_file_io_test.cc, metadata_io_test.cc | ArrowFileSystemFileIO, Arrow C Data Interface |
| parquet_test | iceberg_bundle_static | parquet_test.cc, parquet_schema_test.cc, parquet_data_test.cc | ParquetReader/ParquetWriter, schema mapping |
| manifest_test | iceberg_bundle_static | manifest_reader_test.cc, manifest_writer_versions_test.cc | ManifestReader/ManifestWriter, manifest file format |
| scan_test | iceberg_bundle_static | table_scan_test.cc, file_scan_task_test.cc | TableScan, FileScanTask, ManifestGroup |
| table_update_test | iceberg_bundle_static | fast_append_test.cc, update_schema_test.cc, transaction_test.cc | FastAppend, UpdateSchema, Transaction operations |
| data_writer_test | iceberg_bundle_static | data_writer_test.cc | Data file writing operations |
| eval_expr_test | iceberg_bundle_static | eval_expr_test.cc, evaluator_test.cc | Expression evaluation on Arrow record batches |
| catalog_test | iceberg_bundle_static | in_memory_catalog_test.cc | InMemoryCatalog implementation |
| rest_catalog_test | iceberg_rest_static | auth_manager_test.cc, endpoint_test.cc, rest_json_serde_test.cc | RestCatalog, AuthManager, REST endpoints |
| rest_catalog_integration_test | iceberg_rest_static | rest_catalog_test.cc, docker_compose_util.cc | Live REST catalog integration via Docker Compose |
Sources: src/iceberg/test/CMakeLists.txt59-194 src/iceberg/test/meson.build30-144
Tests are added via the custom add_iceberg_test() function defined in src/iceberg/test/CMakeLists.txt31-57:
Function Signature and Arguments:
Implementation Details:
- add_executable(${test_name}) - Creates the test executable target
- target_sources(${test_name} PRIVATE ${ARG_SOURCES}) - Adds source files
- If the USE_BUNDLE flag is present: links iceberg_bundle_static (includes Arrow/Avro/Parquet)
- Otherwise: links iceberg_static (core library only)
- Links GTest::gmock_main (provides main() function and GMock matchers)
- add_test(NAME ${test_name} COMMAND ${test_name}) - Registers the executable with CTest
- Adds the /bigobj compiler flag on MSVC to handle large object files

Example Usage:
Sources: src/iceberg/test/CMakeLists.txt31-57
Tests are executed via CTest when building with CMake. Test registration occurs automatically during CMake configuration when ICEBERG_BUILD_TESTS is enabled src/iceberg/CMakeLists.txt247-249:
Run tests from build directory:
Sources: src/iceberg/CMakeLists.txt247-249 .github/workflows/test.yml144-146
Meson tests are defined via a dictionary in src/iceberg/test/meson.build30-102 and registered in a loop src/iceberg/test/meson.build134-144:
This creates executable and test registration in a single pass.
Run tests:
Sources: src/iceberg/test/meson.build134-144 .github/workflows/test.yml144-146
Tests reference resource files via configured path src/iceberg/test/CMakeLists.txt27-29:
This generates test_config.h with ICEBERG_TEST_RESOURCES macro pointing to the resources directory containing sample Iceberg metadata files, Avro schemas, and Parquet files used in tests.
Sources: src/iceberg/test/CMakeLists.txt27-29 src/iceberg/test/meson.build20-28
The project uses GitHub Actions for continuous integration and deployment. Workflows are located in .github/workflows/ and are organized into distinct responsibilities:
Workflow Trigger Matrix
| Workflow | Push | Pull Request | Tag | Main Only |
|---|---|---|---|---|
| test.yml | ✓ | ✓ | ✓ | - |
| sanitizer_test.yml | ✓ | ✓ | ✓ | - |
| pre-commit.yml | ✓ | ✓ | - | - |
| cpp-linter.yml | - | ✓ | - | - |
| license_check.yml | - | ✓ | - | - |
| rc.yml | - | - | ✓ (RC tags) | - |
| docs.yml | - | - | - | ✓ |
Sources: .github/workflows/test.yml20-27 .github/workflows/cpp-linter.yml20-27 .github/workflows/rc.yml19-22 .github/workflows/sanitizer_test.yml20-27 .github/workflows/pre-commit.yml20-25 .github/workflows/docs.yml20-26 .github/workflows/license_check.yml20
The test.yml workflow defines four job types that execute builds across three platforms:
Platform-Specific Configuration
| Job Name | Runner | Compiler | Dependencies | Build Cache | Test Framework | Timeout |
|---|---|---|---|---|---|---|
| ubuntu | ubuntu-24.04 | gcc-14/g++-14 | libcurl4-openssl-dev via apt-get | - | CMake | 30 min |
| macos | macos-26 | System default (Apple Clang) | System libraries | - | CMake | 30 min |
| windows | windows-2025 | MSVC 2022 via vcvarsall.bat | zlib, nlohmann-json, nanoarrow, roaring, cpr via vcpkg | sccache | CMake | 60 min |
| meson (Ubuntu) | ubuntu-24.04 | gcc-14/g++-14 | System packages | - | Meson | 30 min |
| meson (Windows) | windows-2025 | MSVC 2022 via --vsenv | vcpkg | - | Meson | 30 min |
| meson (macOS) | macos-26 | System default | System libraries | - | Meson | 30 min |
Sources: .github/workflows/test.yml40-106 .github/workflows/test.yml107-147
Both CMake and Meson builds are driven by shell scripts in ci/scripts/:
ci/scripts/build_iceberg.sh - Main library build script
Signature:
Parameters:
- source_dir: Path to iceberg-cpp source root (required)
- rest_integration_tests: ON to enable REST integration tests (requires Docker), defaults to OFF
- use_sccache: ON to enable sccache build caching (Windows only), defaults to OFF

Build Steps:
1. mkdir -p $ICEBERG_HOME/build - Create build directory (uses ICEBERG_HOME=/tmp/iceberg from environment)
2. cmake -S $source_dir -B $ICEBERG_HOME/build -G Ninja -DICEBERG_BUILD_TESTS=ON -DICEBERG_BUILD_BUNDLE=ON -DICEBERG_BUILD_REST=ON - Configure with Ninja generator
3. cmake --build $ICEBERG_HOME/build - Build all targets
4. ctest --test-dir $ICEBERG_HOME/build --output-on-failure - Run tests
5. cmake --install $ICEBERG_HOME/build --prefix $ICEBERG_HOME - Install libraries and headers

ci/scripts/build_example.sh - Example application build
Signature:
Purpose: Verifies that the installed library can be consumed by downstream CMake projects using find_package(iceberg).
Workflow Usage Examples:
| Workflow Job | Script Call | Configuration |
|---|---|---|
| ubuntu .github/workflows/test.yml57 | ci/scripts/build_iceberg.sh $(pwd) ON | REST integration tests enabled |
| macos .github/workflows/test.yml75 | ci/scripts/build_iceberg.sh $(pwd) | Default configuration |
| windows .github/workflows/test.yml100 | bash -c "ci/scripts/build_iceberg.sh $(pwd) OFF ON" | sccache enabled |
| All platforms .github/workflows/test.yml64-104 | ci/scripts/build_example.sh $(pwd)/example | Verify installation |
Sources: .github/workflows/test.yml52-63 .github/workflows/test.yml69-78 .github/workflows/test.yml88-106
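The parameter defaults and build steps described above for ci/scripts/build_iceberg.sh can be sketched as a dry run. This is an illustrative sketch, not the script's actual contents: the function name plan_iceberg_build is hypothetical, it only prints the documented commands instead of running them, and it does not show how the two optional flags change the CMake invocation because that mapping is not spelled out here.

```shell
# Hypothetical dry-run helper: prints the documented steps instead of executing them.
plan_iceberg_build() {
  source_dir="${1:?source_dir is required}"   # required positional argument
  rest_integration_tests="${2:-OFF}"          # documented default: OFF
  use_sccache="${3:-OFF}"                     # documented default: OFF
  iceberg_home="${ICEBERG_HOME:-/tmp/iceberg}"
  build_dir="${iceberg_home}/build"

  echo "# rest_integration_tests=${rest_integration_tests} use_sccache=${use_sccache}"
  echo "mkdir -p ${build_dir}"
  echo "cmake -S ${source_dir} -B ${build_dir} -G Ninja -DICEBERG_BUILD_TESTS=ON -DICEBERG_BUILD_BUNDLE=ON -DICEBERG_BUILD_REST=ON"
  echo "cmake --build ${build_dir}"
  echo "ctest --test-dir ${build_dir} --output-on-failure"
  echo "cmake --install ${build_dir} --prefix ${iceberg_home}"
}

# Same argument order as the Windows job: source dir, REST tests OFF, sccache ON.
plan_iceberg_build "$(pwd)" OFF ON
```

Printing the plan first makes it easy to compare the argument order against the three workflow calls in the table before running anything.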
The Meson-specific job uses a matrix strategy to test across all three platforms with platform-specific configurations defined in .github/workflows/test.yml111-124:
Meson Setup Arguments by Platform:
| Platform | meson-setup-args | Effect |
|---|---|---|
| Ubuntu 24.04 | -Drest_integration_test=enabled | Enables REST catalog integration tests with Docker |
| Windows 2025 | --vsenv | Auto-configures Visual Studio environment (cl.exe, link.exe paths) |
| macOS 26 | (empty) | Uses default Meson configuration |
The matrix is defined in .github/workflows/test.yml111-124 with conditional compiler environment variables set in .github/workflows/test.yml135-139:
Test execution with .github/workflows/test.yml144-146:
The --timeout-multiplier 0 option disables test timeouts, which avoids spurious failures on resource-constrained CI runners.
Sources: .github/workflows/test.yml107-147
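The per-platform setup arguments from the table above can be sketched as a lookup. This is a hedged illustration rather than the workflow's own matrix definition: the function name meson_setup_args and the build directory name builddir are assumptions, and the commands are printed instead of executed.

```shell
# Hypothetical helper mapping a runner label to the documented meson-setup-args.
meson_setup_args() {
  case "$1" in
    ubuntu-24.04) printf '%s\n' "-Drest_integration_test=enabled" ;;  # REST integration tests via Docker
    windows-2025) printf '%s\n' "--vsenv" ;;                          # activate the Visual Studio environment
    macos-26)     printf '%s\n' "" ;;                                 # default Meson configuration
    *)            echo "unknown runner: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the setup and test commands for each runner in the matrix.
for runner in ubuntu-24.04 windows-2025 macos-26; do
  echo "${runner}: meson setup builddir $(meson_setup_args "${runner}")"
  echo "${runner}: meson test -C builddir --timeout-multiplier 0"
done
```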
The sanitizer_test.yml workflow runs a dedicated sanitizer build to detect memory errors, leaks, and undefined behavior. It runs as the sanitizer-test job on ubuntu-24.04 with the gcc-14/g++-14 compilers.
Sanitizer Build Flow:
CMake Configuration .github/workflows/sanitizer_test.yml48-55:
The workflow configures CMake with sanitizer flags:
Compiler Instrumentation:
- ICEBERG_ENABLE_ASAN=ON - Adds -fsanitize=address flag, links asan runtime
- ICEBERG_ENABLE_UBSAN=ON - Adds -fsanitize=undefined flag, links ubsan runtime
- CMAKE_BUILD_TYPE=Debug - Disables optimizations, enables debug symbols for accurate stack traces

The CMake options are processed in the build system to inject appropriate compiler and linker flags for sanitizer instrumentation.
Sanitizers are configured via environment variables set in .github/workflows/sanitizer_test.yml59-61:
ASAN_OPTIONS (Address Sanitizer):
| Option | Value | Purpose |
|---|---|---|
| log_path | out.log | Write reports to build/test/out.log.* files |
| detect_leaks | 1 | Enable Leak Sanitizer (LSAN) integration |
| symbolize | 1 | Resolve addresses to function names in stack traces |
| strict_string_checks | 1 | Detect out-of-bounds in string operations |
| halt_on_error | 1 | Stop on first detected error (fail-fast) |
| detect_container_overflow | 0 | Disable STL container overflow detection (reduces false positives) |
LSAN_OPTIONS (Leak Sanitizer):
References suppression file at .github/lsan-suppressions.txt to ignore known leaks in third-party libraries or acceptable allocations.
UBSAN_OPTIONS (Undefined Behavior Sanitizer):
| Option | Value | Purpose |
|---|---|---|
| log_path | out.log | Write reports to build/test/out.log.* files |
| halt_on_error | 1 | Stop on first undefined behavior |
| print_stacktrace | 1 | Include full stack traces in reports |
| suppressions | .github/ubsan-suppressions.txt | Suppress known UB in third-party code |
Test output logs are uploaded as artifacts .github/workflows/sanitizer_test.yml64-69 using actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f:
This captures all out.log.* files generated by ASAN/UBSAN for post-mortem analysis if tests fail.
Sources: .github/workflows/sanitizer_test.yml37-70
The cpp-linter.yml workflow enforces code style using cpp-linter-action with clang-format version 22. This runs as job cpp-linter on ubuntu-24.04:
Linter Workflow:
Configuration Details .github/workflows/cpp-linter.yml49-70:
| Parameter | Value | Purpose |
|---|---|---|
| style | file | Use .clang-format from repository root |
| tidy-checks | '' | Disable clang-tidy (format-only mode) |
| version | 22 | clang-format version 22 |
| files-changed-only | true | Only check files modified in PR |
| lines-changed-only | true | Only check modified lines within files |
| thread-comments | true | Post inline PR review comments |
| ignore | build\|cmake_modules\|ci | Exclude generated/infrastructure files |
| database | build | Use build/compile_commands.json for include paths |
| verbosity | debug | Detailed logging for troubleshooting |
Extra Compilation Arguments .github/workflows/cpp-linter.yml65:
- -std=c++23: C++23 standard (required by codebase)
- -I$PWD/src: Include source directory
- -I$PWD/build/src: Include generated headers (e.g., version.h)
- -fno-builtin-std-forward_like: Workaround for LLVM issue #101614

The workflow fails if steps.linter.outputs.checks-failed != 0 .github/workflows/cpp-linter.yml66-70, preventing merge of improperly formatted code.
Sources: .github/workflows/cpp-linter.yml30-71
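The failure condition on checks-failed can be illustrated with a small shell gate. This is only a sketch of the logic: the function name fail_if_checks_failed is hypothetical, and the real workflow step reads the value from steps.linter.outputs.checks-failed rather than from a positional argument.

```shell
# Hypothetical stand-in for the workflow's final gate on the checks-failed output.
fail_if_checks_failed() {
  checks_failed="$1"
  if [ "${checks_failed}" -ne 0 ]; then
    echo "cpp-linter reported ${checks_failed} failed check(s)" >&2
    return 1   # a non-zero status is what fails a CI step
  fi
  echo "cpp-linter checks passed"
}

fail_if_checks_failed 0
```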
The pre-commit.yml workflow executes pre-commit hooks on all pushes and pull requests (excluding dependabot/** branches):
Pre-commit Flow:
The workflow .github/workflows/pre-commit.yml28-34 executes three steps:
1. actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd - Clone repository
2. actions/setup-python@v6 - Install Python runtime for pre-commit
3. pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd - Execute hooks

Hook configuration resides in .pre-commit-config.yaml at repository root, which defines linting rules, format checks, and other automated validations.
Sources: .github/workflows/pre-commit.yml18-34
The license_check.yml workflow verifies Apache license headers using SkyWalking Eyes:
License Check Flow:
Configuration .github/workflows/license_check.yml26-35:
The action:
- Uses the .github/.licenserc.yaml configuration

This ensures ASF licensing compliance before code is merged.
Sources: .github/workflows/license_check.yml18-35
The rc.yml workflow automates release candidate creation and verification using three jobs: archive, verify, and upload:
The workflow triggers on tags matching pattern *-rc* (e.g., v1.0.0-rc1, v1.2.3-rc2).
Tag parsing logic .github/workflows/rc.yml39-48:
Environment variables are set for subsequent jobs:
This produces variables like VERSION=1.0.0 and RC=1 used in archive naming and verification.
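One way to derive these two values from a tag name uses shell parameter expansion. This is a minimal sketch under the assumption that tags look like the examples above (an optional leading v, then the version, then -rc and a number); the function name parse_rc_tag is hypothetical and the workflow's own parsing logic may differ.

```shell
# Hypothetical helper: split an RC tag such as v1.0.0-rc1 into VERSION and RC.
parse_rc_tag() {
  tag="$1"
  case "${tag}" in
    *-rc*) ;;                                        # same shape as the *-rc* trigger pattern
    *) echo "not an RC tag: ${tag}" >&2; return 1 ;;
  esac
  stripped="${tag#v}"          # drop an optional leading "v"
  VERSION="${stripped%-rc*}"   # text before the last "-rc"
  RC="${stripped##*-rc}"       # text after the last "-rc"
}

parse_rc_tag "v1.0.0-rc1"
echo "VERSION=${VERSION} RC=${RC}"   # prints: VERSION=1.0.0 RC=1
```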
The archive job creates source distribution .github/workflows/rc.yml50-56:
This produces:
- apache-iceberg-cpp-1.0.0.tar.gz - Source archive with version prefix in paths
- apache-iceberg-cpp-1.0.0.tar.gz.sha512 - SHA-512 checksum file

The --prefix ensures extracted files are in a versioned directory (e.g., apache-iceberg-cpp-1.0.0/), not directly in the working directory.
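A hedged sketch of how such an archive and checksum could be produced with git archive. The helper name make_source_archive is hypothetical and is defined but not invoked here, since it must be called from inside a git checkout of the tagged commit; the exact commands used by the workflow may differ.

```shell
VERSION="1.0.0"                                  # example value from the text
archive_prefix="apache-iceberg-cpp-${VERSION}"
archive_file="${archive_prefix}.tar.gz"

# Hypothetical helper: call it from inside a git checkout of the tagged commit.
make_source_archive() {
  # --prefix places every entry under apache-iceberg-cpp-<version>/ on extraction;
  # the tar.gz format is inferred from the output file name.
  git archive --prefix="${archive_prefix}/" --output="${archive_file}" HEAD
  # sha512sum is the GNU coreutils tool; other platforms may need "shasum -a 512".
  sha512sum "${archive_file}" > "${archive_file}.sha512"
}

echo "would create ${archive_file} and ${archive_file}.sha512"
```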
Before verification, Apache RAT (Release Audit Tool) scans the archive .github/workflows/rc.yml58-60:
Apache RAT:
This ensures ASF release policy compliance before archiving and distribution.
The verify job executes dev/release/verify_rc.sh .github/workflows/rc.yml104-121:
The script performs:
1. Extracts apache-iceberg-cpp-${version}.tar.gz
2. Configures the build with -DICEBERG_BUILD_TESTS=ON -DICEBERG_BUILD_BUNDLE=ON
3. Runs ctest --output-on-failure
4. Runs make install to verify installation layout

Environment variables skip optional verifications:
- VERIFY_SIGN=0 - Skip GPG signature verification (no signing in CI)
- VERIFY_DOWNLOAD=0 - Skip download testing (archive already present)

The verification matrix runs on:
- ubuntu-24.04 with gcc-14/g++-14
- macos-26 with default toolchain

After successful verification, the upload job creates a GitHub release .github/workflows/rc.yml138-148:
Flags:
- --prerelease - Marks as pre-release (not production-ready)
- --title - Sets release title (e.g., "Apache Iceberg C++ v1.0.0-rc1")
- --generate-notes - Auto-generates changelog from commits
- --verify-tag - Ensures tag exists before creating release

Attached assets:
- apache-iceberg-cpp-1.0.0.tar.gz - Source distribution
- apache-iceberg-cpp-1.0.0.tar.gz.sha512 - Checksum file

The job requires contents: write permission .github/workflows/rc.yml128-129 and only runs if github.ref_type == 'tag' .github/workflows/rc.yml124.
Sources: .github/workflows/rc.yml18-149
The docs.yml workflow builds and deploys documentation to GitHub Pages on pushes to main that affect mkdocs/** or src/** paths. This runs as job docs on ubuntu-24.04:
Documentation Pipeline:
Makefile Targets (defined in repository root Makefile):
| Target | Purpose | Implementation |
|---|---|---|
| make install-deps | Install Python dependencies for documentation | Runs pip install -r requirements.txt (mkdocs, mkdocs-material, plugins) |
| make build-api-docs | Generate Doxygen API documentation | Runs doxygen Doxyfile, outputs to docs/api/ |
| make build-docs | Build MkDocs static site | Runs mkdocs build, outputs to mkdocs/site/, integrates Doxygen output |
Orphan Branch Deployment .github/workflows/docs.yml64-75:
The workflow creates a clean history on gh-pages branch:
1. git checkout --orphan gh-pages-tmp - New branch with no parent commits
2. git rm --quiet -rf . - Remove all tracked files
3. cp -r /tmp/site/* . - Copy built documentation
4. echo "cpp.iceberg.apache.org" > CNAME - Configure custom domain
5. git commit -m "Publish docs from commit ${{ github.sha }}" - Commit site
6. git push -f origin gh-pages-tmp:gh-pages - Force push to gh-pages

This keeps documentation build artifacts out of the main repository history. Because each deployment force-pushes a fresh orphan commit, the gh-pages branch holds only the most recently published site.
Sources: .github/workflows/docs.yml18-76
Build and run all tests:
Run specific test:
Run tests with verbose output:
Build and run all tests:
Run specific test:
Run tests with verbose output:
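The commands for the items above did not survive in this copy of the page. The following is a hedged sketch of typical invocations for both build systems, presumably CMake first and Meson second. The helper names and the build directory names (build, builddir) are assumptions, the test name schema_test is taken from the executable table, and the helpers are defined without being run.

```shell
# CMake + CTest (build directory name "build" is an assumption).
cmake_test_all()     { cmake --build build && ctest --test-dir build --output-on-failure; }
cmake_test_one()     { ctest --test-dir build -R "$1" --output-on-failure; }  # $1: test name regex
cmake_test_verbose() { ctest --test-dir build --verbose; }

# Meson (build directory name "builddir" is an assumption).
meson_test_all()     { meson compile -C builddir && meson test -C builddir; }
meson_test_one()     { meson test -C builddir "$1"; }                         # $1: registered test name
meson_test_verbose() { meson test -C builddir --verbose; }

# Example of how a single test executable would be selected.
echo "run with: cmake_test_one schema_test   or   meson_test_one schema_test"
```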
Enable sanitizers during CMake configuration:
Set runtime options before running tests:
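A minimal sketch of both steps, using the ICEBERG_ENABLE_ASAN and ICEBERG_ENABLE_UBSAN options and the option values listed in the sanitizer tables earlier in this document. The build directory name and resolving the suppression files against the current directory (assumed to be the repository root) are assumptions; the configure command is printed rather than executed.

```shell
# Configure-time flags named in the sanitizer workflow description (printed, not run).
echo "cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug -DICEBERG_ENABLE_ASAN=ON -DICEBERG_ENABLE_UBSAN=ON"

# Assumption: this snippet is run from the repository root.
repo_root="${PWD}"

# Runtime options, joined with ":" as the sanitizer runtimes expect.
ASAN_OPTIONS="log_path=out.log:detect_leaks=1:symbolize=1:strict_string_checks=1:halt_on_error=1:detect_container_overflow=0"
LSAN_OPTIONS="suppressions=${repo_root}/.github/lsan-suppressions.txt"
UBSAN_OPTIONS="log_path=out.log:halt_on_error=1:print_stacktrace=1:suppressions=${repo_root}/.github/ubsan-suppressions.txt"
export ASAN_OPTIONS LSAN_OPTIONS UBSAN_OPTIONS

echo "ASAN_OPTIONS=${ASAN_OPTIONS}"
```

Exporting the variables in the same shell session that later runs the tests makes the instrumented binaries pick them up at startup.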
Workflows use concurrency groups to prevent duplicate runs and optimize CI resource usage:
Standard Concurrency Group Pattern .github/workflows/test.yml29-31 .github/workflows/sanitizer_test.yml29-31 .github/workflows/rc.yml24-26:
Concurrency Key Components:
- github.repository - Repository name (e.g., apache/iceberg-cpp)
- github.head_ref || github.sha - Branch name for PRs (e.g., feature-branch), commit SHA for direct pushes
- github.workflow - Workflow name (e.g., Test, ASAN and UBSAN Tests)

Behavior with cancel-in-progress: true:
Documentation Workflow Exception .github/workflows/docs.yml28-30:
Concurrency Key Components:
- github.workflow - Workflow name (Release Docs)
- github.ref - Full ref path (e.g., refs/heads/main)

Behavior with cancel-in-progress: false:
- main → queues documentation builds
- gh-pages branch (orphan branch force-push)

Concurrency Group Table:
| Workflow | Concurrency Group | Cancel In-Progress | Rationale |
|---|---|---|---|
| test.yml | repo-branch/sha-Test | Yes | Cancel outdated test runs on new commits |
| sanitizer_test.yml | repo-branch/sha-ASAN and UBSAN Tests | Yes | Cancel outdated sanitizer runs on new commits |
| rc.yml | repo-branch/sha-RC | Yes | Cancel outdated RC builds on new tags |
| pre-commit.yml | repo-branch/sha-pre-commit | Yes | Cancel outdated hook runs on new commits |
| cpp-linter.yml | (no concurrency group) | N/A | Each PR gets independent linting |
| license_check.yml | (no concurrency group) | N/A | Each PR gets independent license check |
| docs.yml | Release Docs-refs/heads/main | No | Sequential deploys to prevent gh-pages corruption |
Sources: .github/workflows/test.yml29-31 .github/workflows/sanitizer_test.yml29-31 .github/workflows/rc.yml24-26 .github/workflows/docs.yml28-30 .github/workflows/pre-commit.yml20-25
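The way the standard group key falls back from branch name to commit SHA can be mimicked in shell. This is an illustrative sketch only: the function name concurrency_group is hypothetical, and the "-" separators are inferred from the table above rather than copied from the workflow files.

```shell
# Hypothetical helper mirroring repository, (head_ref || sha), workflow.
concurrency_group() {
  repository="$1"; head_ref="$2"; sha="$3"; workflow="$4"
  # ${head_ref:-$sha} falls back to the SHA when head_ref is empty,
  # analogous to the expression `github.head_ref || github.sha`.
  printf '%s-%s-%s\n' "${repository}" "${head_ref:-$sha}" "${workflow}"
}

concurrency_group "apache/iceberg-cpp" "feature-branch" "0123abc" "Test"  # pull request
concurrency_group "apache/iceberg-cpp" "" "0123abc" "Test"                # direct push
```

Two runs of the same workflow on the same pull request branch produce the same key, which is what lets a newer run cancel an older one.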
Common environment variables used across workflows:
| Variable | Value | Purpose | Workflows |
|---|---|---|---|
| ICEBERG_HOME | /tmp/iceberg | Installation directory | test.yml |
| CC | gcc-14 | C compiler (Ubuntu) | test.yml, cpp-linter.yml, sanitizer_test.yml |
| CXX | g++-14 | C++ compiler (Ubuntu) | test.yml, cpp-linter.yml, sanitizer_test.yml |
| SCCACHE_GHA_ENABLED | "true" | Enable sccache GitHub Actions cache | test.yml (Windows) |
| ASAN_OPTIONS | See section above | Address Sanitizer options | sanitizer_test.yml |
| LSAN_OPTIONS | See section above | Leak Sanitizer options | sanitizer_test.yml |
| UBSAN_OPTIONS | See section above | UB Sanitizer options | sanitizer_test.yml |
Ubuntu: Explicitly uses gcc-14/g++-14 via environment variables
macOS: Uses system default compiler (Apple Clang)
Windows: Uses MSVC 2022 via vcvarsall.bat activation
Sources: .github/workflows/test.yml36-37 .github/workflows/test.yml54-56 .github/workflows/sanitizer_test.yml50-51
The Windows job in test.yml uses Mozilla's sccache-action for build caching .github/workflows/test.yml92-93:
After the build completes, cache statistics are displayed .github/workflows/test.yml101:
Caching compiled objects shortens Windows builds, which are given the longest timeout of the CMake jobs (60 minutes versus 30 minutes on the other platforms).
GitHub Actions are pinned to specific commit SHAs for security and reproducibility (the actions/setup-python@v6 step in pre-commit.yml is referenced by version tag instead), for example:
- actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

This follows GitHub security best practices and reduces the risk of supply chain attacks.
Sources: .github/workflows/test.yml92-101