Tags: init4tech/storage
fix(storage): keep !Send MDBX write tx out of async state machine (#41)

* fix(storage): keep !Send MDBX write tx out of async state machine

  Extract synchronous hot-storage unwind logic from `drain_above` into `unwind_hot_above` so the `!Send` MDBX write transaction never appears in the async generator's state machine. This makes the future returned by `drain_above` `Send`, unblocking use from `Send`-bounded executors like `reth::install_exex`.

  Adds compile-time `Send` canaries for `drain_above` and `cold_lag`, parameterized over `DatabaseEnv`, to prevent regressions.

  Closes ENG-2080

* chore(storage): add Send compile canaries to builder and replay_to_cold

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
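The extraction pattern behind this fix can be sketched in plain Rust, with `Rc` standing in for the `!Send` MDBX write transaction. All names and bodies below are illustrative, not the crate's real signatures: the key point is that the `!Send` value is created and dropped entirely inside a synchronous helper, so it never lives across an `await` and never enters the future's state machine.

```rust
use std::rc::Rc;

// Stand-in for the !Send MDBX write transaction (Rc is !Send).
struct WriteTx(Rc<Vec<u64>>);

// Synchronous helper: the !Send transaction lives and dies entirely
// inside this function, so it cannot appear in an async state machine.
fn unwind_hot_above(blocks: &[u64], height: u64) -> usize {
    let tx = WriteTx(Rc::new(blocks.to_vec()));
    tx.0.iter().filter(|&&b| b > height).count()
}

// Async wrapper: no await while WriteTx is alive, so the returned
// future only captures Send data and is itself Send.
async fn drain_above(blocks: Vec<u64>, height: u64) -> usize {
    unwind_hot_above(&blocks, height)
}

// Compile-time Send canary: compilation fails if the future regresses.
fn assert_send<T: Send>(_: &T) {}

fn main() {
    let fut = drain_above(vec![1, 5, 9], 4);
    assert_send(&fut); // regression guard, checked at compile time
    drop(fut);
    println!("unwound: {}", unwind_hot_above(&[1, 5, 9], 4));
}
```

Had `WriteTx` instead been constructed inside `drain_above` and held across an `await`, the future would stop being `Send` and the canary would fail to compile.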
fix(cold): drain in-flight reads before executing writes (#39)

The task runner's select loop executed writes immediately, without waiting for spawned read tasks to complete, allowing reads and writes to race on the backend. Close and drain the TaskTracker before each write to ensure exclusive access, then reopen it for subsequent reads.

Closes ENG-1980

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
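The fix uses tokio_util's TaskTracker; the same close-drain-then-write idea can be sketched with stdlib join handles. Everything below is a hypothetical stand-in, not the crate's code:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Minimal stand-in for the tracker: collects handles of in-flight reads.
struct Tracker(Vec<thread::JoinHandle<()>>);

impl Tracker {
    fn spawn_read(&mut self, log: Arc<Mutex<Vec<&'static str>>>) {
        self.0.push(thread::spawn(move || {
            log.lock().unwrap().push("read");
        }));
    }
    // Drain: wait for every spawned read to finish before returning.
    fn drain(&mut self) {
        for h in self.0.drain(..) {
            h.join().unwrap();
        }
    }
}

fn run() -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let mut tracker = Tracker(Vec::new());
    tracker.spawn_read(log.clone());
    tracker.spawn_read(log.clone());
    // Before a write, drain all in-flight reads for exclusive access...
    tracker.drain();
    log.lock().unwrap().push("write");
    // ...then the tracker is free again for subsequent reads.
    tracker.spawn_read(log.clone());
    tracker.drain();
    let out = log.lock().unwrap().clone();
    out
}

fn main() {
    println!("{:?}", run());
}
```

Because `drain` joins every read before the write executes, the write is guaranteed to observe no concurrent reads, which is the exclusivity the fix establishes.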
refactor(storage): introduce connector-based architecture (#35)

* feat(storage): add configuration and builder for flexible storage instantiation

  Add a configuration layer to the signet-storage crate enabling three deployment modes:

  - Hot-only (MDBX only, no cold storage)
  - Hot + Cold MDBX (both hot and cold use MDBX)
  - Hot MDBX + Cold SQL (hot uses MDBX, cold uses PostgreSQL/SQLite)

  Changes:

  - Add config module with StorageMode enum and environment variable parsing
  - Add builder module with StorageBuilder and StorageInstance types
  - Add StorageInstance enum to safely distinguish hot-only vs unified storage
  - Update error types to support configuration and backend errors
  - Add factory methods and into_hot() to UnifiedStorage
  - Add sql and sqlite features for PostgreSQL and SQLite support
  - Add comprehensive unit tests for configuration and builder

  The implementation is fully backward compatible with existing code.

* chore: bump version to 0.6.4

* refactor(storage): introduce connector-based architecture

  Replace the mode-based builder with trait-based connectors to decouple backend instantiation. Adds HotConnect and ColdConnect traits, a unified MdbxConnector implementing both, and SqlConnector for auto-detecting PostgreSQL/SQLite. Removes the StorageMode and StorageInstance enums, simplifying the API to always return UnifiedStorage. Users now pass connector objects to the builder's fluent API (.hot().cold().build()), enabling flexible backend composition without tight coupling.

* refactor(storage): simplify EitherCold with dispatch_async! macro

  Reduces EitherCold boilerplate from 286 lines to 212 by introducing a dispatch_async! macro that generates the repetitive match-and-forward pattern for all 16 ColdStorage trait methods. Method signatures remain explicit for documentation while the async match blocks are generated by the macro, maintaining zero runtime overhead.

* refactor(storage): use connector-specific errors, remove Config variants

  Creates dedicated error types for connector initialization:

  - MdbxConnectorError in the cold-mdbx crate
  - SqlConnectorError in the cold-sql crate

  Removes the Config(String) variants from MdbxError and SqlColdError, which were only used for from_env() methods. The new connector error types are more specific and properly handle missing environment variables. ConfigError now uses From implementations to convert from connector errors, simplifying error handling in the builder.

* fix(storage): desugar async fn in ColdConnect trait to specify Send bound

  Addresses the async-fn-in-trait clippy lint by explicitly desugaring async fn to return impl Future + Send. This makes the Send bound explicit and lets callers know the future can be sent across threads.

  Changes:

  - Desugar ColdConnect::connect to return impl Future + Send
  - Add #[allow(clippy::manual_async_fn)] to the MDBX and EitherCold impls
  - Fix redundant closures in error mapping
  - Add Debug derive to StorageBuilder
  - Add serial_test to the workspace for test isolation

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
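The fluent connector API (.hot().cold().build()) can be sketched as a hypothetical miniature. The real HotConnect and ColdConnect traits are async and fallible; the types below are illustrative stand-ins that only show how connector traits plus a typestate builder compose backends without coupling:

```rust
// Hypothetical miniatures of the connector traits.
trait HotConnect {
    type Hot;
    fn connect_hot(&self) -> Self::Hot;
}
trait ColdConnect {
    type Cold;
    fn connect_cold(&self) -> Self::Cold;
}

// One connector can serve both roles, like the unified MdbxConnector.
struct MdbxConnector;
impl HotConnect for MdbxConnector {
    type Hot = &'static str;
    fn connect_hot(&self) -> &'static str { "mdbx-hot" }
}
impl ColdConnect for MdbxConnector {
    type Cold = &'static str;
    fn connect_cold(&self) -> &'static str { "mdbx-cold" }
}

struct SqlConnector;
impl ColdConnect for SqlConnector {
    type Cold = &'static str;
    fn connect_cold(&self) -> &'static str { "sql-cold" }
}

struct StorageBuilder<H, C> { hot: H, cold: C }
struct UnifiedStorage<H, C> { hot: H, cold: C }

impl StorageBuilder<(), ()> {
    fn new() -> Self { StorageBuilder { hot: (), cold: () } }
}
impl<C> StorageBuilder<(), C> {
    fn hot<H: HotConnect>(self, h: H) -> StorageBuilder<H, C> {
        StorageBuilder { hot: h, cold: self.cold }
    }
}
impl<H> StorageBuilder<H, ()> {
    fn cold<C: ColdConnect>(self, c: C) -> StorageBuilder<H, C> {
        StorageBuilder { hot: self.hot, cold: c }
    }
}
// build() is only available once both connectors are supplied.
impl<H: HotConnect, C: ColdConnect> StorageBuilder<H, C> {
    fn build(self) -> UnifiedStorage<H::Hot, C::Cold> {
        UnifiedStorage {
            hot: self.hot.connect_hot(),
            cold: self.cold.connect_cold(),
        }
    }
}

fn main() {
    let s = StorageBuilder::new().hot(MdbxConnector).cold(SqlConnector).build();
    println!("{} + {}", s.hot, s.cold);
}
```

Any type implementing the traits slots into the builder, which is the decoupling the refactor describes.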
fix(hot): MemKvRwTx read cursor now sees queued writes (#32)

`HotKvRead for MemKvRwTx` previously returned `MemKvCursor`, which only sees committed data. This caused read cursors on write transactions to miss pending writes, creating an inconsistency with `raw_get`, which already checked queued ops. Change the `Traverse` type to `MemKvCursorMut`, which merges committed data with queued writes.

Also fix a pre-existing bug in `MemKvCursorMut::last_of_k1` where the k1 comparison used the full MAX_KEY_SIZE bytes against the shorter encoded key, causing it to always return None.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
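The merged-view behavior can be sketched with two `BTreeMap`s, queued writes shadowing committed data. This is a hypothetical simplification (real queued ops also include deletes, and the real cursor is incremental rather than materialized):

```rust
use std::collections::BTreeMap;

// Sketch of a read view on a write transaction: committed data
// overlaid with queued (uncommitted) writes.
struct MemRwView {
    committed: BTreeMap<Vec<u8>, Vec<u8>>,
    queued: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl MemRwView {
    // Point lookup that, like raw_get, checks queued ops first.
    fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.queued.get(key).or_else(|| self.committed.get(key))
    }

    // Cursor-style scan merging both maps, queued taking precedence,
    // so traversal sees pending writes, not just committed data.
    fn scan(&self) -> Vec<(Vec<u8>, Vec<u8>)> {
        let mut merged = self.committed.clone();
        merged.extend(self.queued.clone());
        merged.into_iter().collect()
    }
}

fn demo_view() -> MemRwView {
    let mut committed = BTreeMap::new();
    committed.insert(b"a".to_vec(), b"1".to_vec());
    let mut queued = BTreeMap::new();
    queued.insert(b"a".to_vec(), b"2".to_vec()); // pending overwrite
    queued.insert(b"b".to_vec(), b"3".to_vec()); // pending insert
    MemRwView { committed, queued }
}

fn main() {
    let v = demo_view();
    println!("a = {:?}", v.get(b"a"));
    println!("scan = {:?}", v.scan());
}
```

Before the fix, the scan path would have read only `committed`, returning the stale `a -> 1` and missing `b` entirely, while `get` already saw the queued values.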
feat(cold): add stream_logs for incremental log streaming (#30)

* feat(cold): add stream_logs for incremental log streaming

  Add `stream_logs` to the `ColdStorage` trait, enabling incremental streaming of log results via bounded `mpsc` channels instead of materializing entire result sets into `Vec<T>`. This bounds memory usage, limits MDBX read transaction and SQL connection hold times, and enables active deadline enforcement independent of consumer polling behavior.

  Key design decisions:

  - Spawned producer task + bounded channel (`ReceiverStream`)
  - Per-block resource acquisition (short-lived MDBX txs / SQL queries)
  - Fixed anchor hash on the `to` block, re-checked every block for reorg detection
  - `tokio::time::timeout_at` on sends for active deadline enforcement
  - Backend-owned semaphore (8 permits) limiting concurrent streams

  Implements for all three backends: in-memory, MDBX, and SQL (SQLite + PostgreSQL). Includes streaming conformance tests. Bumps workspace version to 0.6.0 (new required trait method).

* fix: resolve broken rustdoc link for LogStream

  Use fully qualified `crate::LogStream` in the doc comment to fix `-D rustdoc::broken-intra-doc-links` in CI.

* refactor: address PR review — DRY streaming, functional rewrites, cleanup

  Move the streaming loop from individual backends into the shared ColdStorageTask, replacing `stream_logs` on the ColdStorage trait with two simpler primitives (`get_block_hash`, `get_logs_block`). This eliminates ~200 lines of duplicated streaming logic across backends.

  - Rewrite collect_logs_block in the MDBX backend functionally
  - Extract a check_block_hash helper in the MDBX backend
  - Extract an append_filter_clause utility in the SQL backend
  - Remove an unnecessary collect in SQL get_logs_block
  - Remove unused tokio/tokio-stream deps from backend crates
  - Update conformance tests to run through ColdStorageHandle

* refactor: delegate produce_log_stream to backends for snapshot consistency

  Move the streaming loop from the task runner into ColdStorage::produce_log_stream, with a default implementation using get_block_hash + get_logs_block per block. The MDBX and SQL backends override it with single-transaction implementations for MVCC consistency and fewer round-trips. Add a caller-supplied deadline clamped to the task's configured maximum.

* fix: snapshot isolation, O(1) log index, and streaming correctness

  - Remove anchor_hash from the produce_log_stream trait method; backends that hold a consistent snapshot (MDBX Tx<Ro>, PG REPEATABLE READ) no longer need external reorg detection. Extract produce_log_stream_default for backends without snapshot semantics.
  - Add a first_log_index column to receipts, replacing the O(n*k) correlated subquery with an O(1) JOIN for block_log_index.
  - Split SQL produce_log_stream by backend: PostgreSQL uses REPEATABLE READ with row-level streaming; SQLite delegates to the default implementation to avoid single-connection starvation.
  - Document partial-delivery semantics on LogStream and stream_logs.

* refactor: remove avoidable Vec allocations in log streaming

  - MDBX: iterate receipt cursors inline instead of collecting into an intermediate Vec before processing. Both collect_logs_block and produce_log_stream_blocking now process receipts as they are read.
  - SQL: write filter placeholders directly into the clause string instead of collect/join. Accept iterators in append_filter_clause to avoid an intermediate Vec<&[u8]> in build_log_filter_clause.

* refactor: remove get_block_hash and get_logs_block from ColdStorage trait

  These per-block methods were only used by the default produce_log_stream_default implementation. Replace them with existing get_header and get_logs calls, reducing the trait surface area.

* refactor: make DatabaseEnv cloneable, eliminate SQL row allocations

  - Derive Clone on DatabaseEnv (all fields are already cheaply cloneable)
  - Remove the Arc<DatabaseEnv> wrapper from MdbxColdBackend; clone the env directly into spawn_blocking for produce_log_stream
  - Uncomment the MDBX produce_log_stream override (single-txn MVCC path)
  - Change the blob/opt_blob helpers to return borrowed &[u8] instead of Vec<u8>, eliminating per-row heap allocations for fixed-size fields
  - Add a b256_col helper for direct B256 extraction from rows
  - Update decode_u128_required and decode_access_list_or_empty to accept Option<&[u8]> instead of &Option<Vec<u8>>

* refactor: borrow filter params instead of cloning to Vec<Vec<u8>>

  Filter addresses and topics are already fixed-size slices living in the Filter struct. Borrow them as &[u8] instead of copying each one into a fresh Vec<u8>.

* refactor: remove produce_log_stream default, move helper to stream.rs

  Make produce_log_stream a required trait method — every backend must explicitly choose its streaming strategy. The reorg-detecting helper moves from traits.rs to stream.rs and remains exported as produce_log_stream_default for non-snapshot backends (mem, SQLite).

* fix: report correct max_logs in TooManyLogs during streaming

  The MDBX streaming and default streaming implementations reported the per-block `remaining` count instead of the original `max_logs` in TooManyLogs errors. When logs spanned multiple blocks, this caused the error to report a smaller limit than the caller configured. Add a multi-block conformance test that catches this.

* refactor: introduce StreamParams to reduce produce_log_stream arguments

  Bundle from, to, max_logs, sender, and deadline into a StreamParams struct, reducing the trait method from 7 parameters to 2 (filter + params). Remove the #[allow(clippy::too_many_arguments)] annotations.

* refactor: reduce streaming error-handling boilerplate with try_stream macro

  Add try_stream! macros in the MDBX (blocking) and SQL (async) streaming implementations to replace repeated match/send-error/return blocks. Also thread StreamParams through produce_log_stream_pg directly.

* refactor: borrow Filter in get_logs to eliminate per-block clones

  Change ColdStorage::get_logs from `filter: Filter` to `filter: &Filter`, since no implementation needs ownership. In the default streaming helper, clone the filter once before the loop and mutate block_option per iteration instead of cloning the full filter (address + topics arrays) on every block.

* style: tighten code style across cold storage crates

  - Merge duplicate alloy::consensus import paths in conformance tests
  - Use a mut parameter instead of let-rebinding in collect_stream
  - Unwrap directly in tests instead of checking is_some() first
  - Replace closures with function references in header_from_row
  - Remove duplicate doc comments on append_filter_clause
  - Condense truncate_above with a loop over table names

* refactor: use alloy-primitives sqlx impls for direct row decoding

  Add alloy-primitives with the sqlx feature to get native sqlx Type/Encode/Decode impls for Address, B256, Bytes, and FixedBytes<N>. This eliminates manual from_slice/copy_from_slice calls on the read path and removes the b256_col helper and from_address converter.

* docs: add explanatory comments to log stream production methods

  Annotate the block-iteration loops, channel sends, deadline checks, reorg detection, and snapshot-isolation setup in both produce_log_stream_pg and produce_log_stream_default.

* docs: add explanatory comments to MDBX log stream production

  Annotate the blocking log stream method with comments covering MVCC snapshot semantics, cursor reuse, block iteration, filter matching, log limit enforcement, and blocking channel sends.

* chore: use u256 sqlx impl

* refactor: use alloy-primitives sqlx impls for direct row encoding

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
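The producer/bounded-channel shape of this feature can be sketched with the stdlib's `sync_channel` standing in for tokio's bounded `mpsc`. Block data below is fabricated, and the deadline, reorg check, and semaphore are omitted; the sketch only shows how a spawned producer streams per-block batches with backpressure instead of materializing one big `Vec`:

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

// Spawned producer sends per-block log batches through a bounded
// channel; a slow consumer blocks the send, bounding memory use.
fn stream_logs(from: u64, to: u64, capacity: usize) -> Receiver<(u64, Vec<String>)> {
    let (tx, rx) = sync_channel(capacity); // bounded: backpressure
    thread::spawn(move || {
        for block in from..=to {
            // Per-block acquisition: fetch only this block's logs
            // (fabricated here; the real code queries the backend).
            let logs = vec![format!("log@{block}")];
            if tx.send((block, logs)).is_err() {
                return; // consumer dropped the stream; stop producing
            }
        }
    });
    rx
}

fn main() {
    // Consumer pulls batches incrementally from the channel.
    for (block, logs) in stream_logs(10, 12, 2) {
        println!("block {block}: {} logs", logs.len());
    }
}
```

In the real trait the send is additionally wrapped in a deadline (`tokio::time::timeout_at`), so a stalled consumer cannot pin backend resources indefinitely.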
feat(cold): add max_logs parameter to get_logs (#29)

* feat(cold): add max_logs parameter to get_logs

  Add a `max_logs: usize` parameter to `ColdStorage::get_logs` to cap result sets at the trait level, preventing unbounded memory usage from wide `eth_getLogs` queries. Backend implementations short-circuit early:

  - In-memory/MDBX: break out of the log collection loop at the limit
  - SQL: append `LIMIT $N` to let the database engine stop scanning

  Bump workspace version to 0.5.0.

* refactor(cold): error on max_logs exceeded, optimize MDBX log drain

  Change `get_logs` semantics: return a `TooManyLogs` error when the query would produce more than `max_logs` results instead of silently truncating. Each backend checks efficiently:

  - In-memory: returns an error as soon as the count exceeds the limit
  - MDBX: collects up to max_logs+1 and the async wrapper checks and errors; also drains logs via `into_iter()` instead of cloning (it is owned)
  - SQL: runs a cheap `COUNT(*)` query first, erroring before fetching rows

* fix: docs

* fix(cold-mdbx): check max_logs limit during accumulation, not after

  Move the TooManyLogs check into the inner log collection loop so we bail out as soon as the limit is reached, instead of collecting all results first.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
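The bail-during-accumulation check from the final fix can be sketched as follows. Types and the method shape are hypothetical (the real trait method is async and filter-driven); the sketch only shows erroring with the caller-configured limit as soon as one log too many appears, instead of truncating or collecting everything first:

```rust
// Hypothetical error type carrying the caller-configured limit.
#[derive(Debug, PartialEq)]
enum ColdError {
    TooManyLogs { max_logs: usize },
}

// Accumulate logs, bailing inside the loop the moment the limit
// would be exceeded, rather than checking after full collection.
fn get_logs<I: IntoIterator<Item = u64>>(
    logs: I,
    max_logs: usize,
) -> Result<Vec<u64>, ColdError> {
    let mut out = Vec::new();
    for log in logs {
        if out.len() == max_logs {
            // Report the original max_logs, not a per-block remainder.
            return Err(ColdError::TooManyLogs { max_logs });
        }
        out.push(log);
    }
    Ok(out)
}

fn main() {
    println!("{:?}", get_logs(0..3, 5));
    println!("{:?}", get_logs(0..10, 5));
}
```

The same shape explains the earlier streaming bug fixed in #30: reporting a per-block `remaining` value at the error site would surface a smaller limit than the caller actually passed.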