Skip to content

Conversation

@youknowone
Copy link
Owner

@youknowone youknowone commented Jan 1, 2026

Summary by CodeRabbit

  • New Features

    • Full Python-facing gc module: collect, enable/disable, per-generation control, stats, referents/referrers, freeze/unfreeze, and gc.garbage/gc.callbacks.
  • Bug Fixes

    • Improved cycle detection/resolution, safer finalization, weakref handling, and cleaner interpreter shutdown with module cleanup.
  • Performance

    • Faster memory reclamation via epoch-based refcounting and deferred-drop support; reduced GC-related overhead.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 1, 2026

📝 Walkthrough
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The pull request title 'gc' is extremely vague and generic, failing to convey what specific garbage collection changes are being made. Provide a more descriptive title that summarizes the main change, such as 'Implement generational garbage collector with epoch-based reclamation' or 'Add GC state management and traversal support for cycle detection'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (16)
crates/vm/src/dict_inner.rs (1)

712-721: pop_all_entries keeps Dict invariants; consider reuse of clear semantics

The method correctly:

  • Obtains a mutable handle via get_mut() (no lock, requires &mut self).
  • Resets used/filled and marks all indices FREE before draining entries.
  • Returns an iterator of (PyObjectRef, T) for GC to consume, leaving the dict logically empty.

One behavioral difference vs clear() is that indices capacity is preserved instead of being reset to length 8. That’s probably fine for GC-focused use, but if this gets used on very large dicts in hot paths you might want to either reuse clear()’s resizing or document this capacity retention explicitly.

crates/common/src/ebr/sync/once_lock.rs (1)

10-17: Consider using Once::is_completed() instead of a separate AtomicBool.

The comment on line 12 mentions Once::is_completed requires Rust 1.43, but the workspace MSRV is 1.89.0 (per root Cargo.toml line 134). Using Once::is_completed() would eliminate the is_initialized field and the associated synchronization overhead.

🔎 Proposed simplification
 pub(crate) struct OnceLock<T> {
     once: Once,
-    // Once::is_completed requires Rust 1.43, so use this to track of whether they have been initialized.
-    is_initialized: AtomicBool,
     value: UnsafeCell<MaybeUninit<T>>,
-    // Unlike std::sync::OnceLock, we don't need PhantomData here because
-    // we don't use #[may_dangle].
 }

Then replace self.is_initialized() calls with self.once.is_completed().

crates/vm/src/object/core.rs (1)

298-305: Subtle double-decrement prevention logic - add clarifying comment.

The interaction between already_cleared and decrement_ref_count prevents double decrements when both GC and drop_slow_inner clear weakrefs. However, the logic is subtle and could benefit from a more explicit comment explaining the scenario.

             // Only decrement ref_count if requested AND not already cleared
+            // This prevents double-decrement when:
+            // 1. GC calls clear_for_gc (decrement_ref_count=false)
+            // 2. Later, drop_slow_inner calls clear (decrement_ref_count=true)
+            //    but obj is already None, so we skip the decrement
             if decrement_ref_count && !already_cleared {
crates/stdlib/src/gc.rs (1)

261-266: Silent error ignoring in callback invocation.

Errors from set_item and callback.call are silently discarded. While CPython also handles callback errors gracefully, completely ignoring them could make debugging difficult.

Consider logging errors or using run_unraisable for callback failures, similar to how __del__ errors are handled in core.rs.

-        let _ = info.set_item("generation", vm.ctx.new_int(generation).into(), vm);
-        let _ = info.set_item("collected", vm.ctx.new_int(collected).into(), vm);
-        let _ = info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm);
+        // These set_item calls should not fail for simple int values
+        info.set_item("generation", vm.ctx.new_int(generation).into(), vm).ok();
+        info.set_item("collected", vm.ctx.new_int(collected).into(), vm).ok();
+        info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm).ok();

         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) {
+                // Handle callback errors gracefully, similar to __del__ errors
+                vm.run_unraisable(e, Some("gc callback".to_owned()), callback);
+            }
         }
Cargo.toml (1)

208-209: Both dependencies are necessary for the EBR implementation.

The memoffset crate is actively used in crates/common/src/ebr/internal.rs (lines 45, 575, 582). While std::mem::offset_of! is available (MSRV 1.89.0 exceeds its 1.77 stabilization), consolidating on the standard library version could be a future refactoring. The atomic crate is essential—it provides Atomic<T> for arbitrary types, which std::sync::atomic doesn't offer (only concrete types like AtomicUsize).

crates/common/src/ebr/mod.rs (1)

23-23: Verify if internal should be publicly exposed.

The internal module is declared as pub mod internal, which exposes implementation details. This might be intentional for testing or advanced use cases within the rustpython workspace, but it's worth confirming that this is the desired visibility rather than pub(crate).

🔎 Consider restricting visibility if not needed externally

If the internal module is only used within this crate:

-pub mod internal;
+pub(crate) mod internal;

If it needs to be accessible to other rustpython crates but not external users, consider using pub(crate) or visibility modifiers like pub(in crate::ebr).

crates/vm/src/vm/mod.rs (1)

981-1052: Consider making the stdlib module list more maintainable.

The hardcoded list of stdlib module names (lines 1006-1029) and the path-based detection (pylib, Lib) are fragile:

  1. The stdlib list may become stale as modules are added/removed
  2. Path detection may fail on non-standard installations or alternative stdlib locations

Consider one of these alternatives:

  • Use a centralized constant or configuration for stdlib module names
  • Mark stdlib modules with a metadata flag during import
  • Check if the module originated from frozen modules or builtin inits

That said, this is a reasonable first implementation for breaking cycles during shutdown.

crates/vm/src/builtins/function.rs (2)

53-57: Consider traversing name and qualname fields.

The traverse method visits type_params, annotations, module, and doc, but not name and qualname. Since these can be PyStr subclasses (as noted in pop_edges), they may also participate in reference cycles and should be traversed for consistency.

🔎 Proposed addition to traverse
         self.type_params.lock().traverse(tracer_fn);
         self.annotations.lock().traverse(tracer_fn);
         self.module.lock().traverse(tracer_fn);
         self.doc.lock().traverse(tracer_fn);
+        self.name.lock().traverse(tracer_fn);
+        self.qualname.lock().traverse(tracer_fn);
     }

67-74: try_lock failure silently skips clearing defaults.

If try_lock() fails (returns None), the defaults and kwdefaults are not cleared. While this is likely intentional for GC safety (avoiding deadlock), it could lead to reference cycles not being broken if the lock is contended during collection. Consider adding a debug assertion or log to track this case.

crates/common/src/ebr/epoch.rs (1)

33-38: Verify wrapping_sub correctness for edge cases.

The optimization in wrapping_sub relies on the LSB difference being discarded by the right shift. The comment explains this, but consider adding a unit test to verify behavior at epoch boundaries (e.g., near usize::MAX).

crates/vm/src/stdlib/thread.rs (2)

445-469: Constructor uses indexing after length validation - safe but consider bounds-checked access.

The constructor validates seq.len() == 4 before indexing, so this is safe. However, using get() with pattern matching would be more idiomatic Rust and defensive against future refactoring.

🔎 Optional improvement using pattern matching
-            if seq.len() != 4 {
-                return Err(vm.new_type_error(format!(
-                    "_ExceptHookArgs expected 4 arguments, got {}",
-                    seq.len()
-                )));
-            }
-            Ok(Self {
-                exc_type: seq[0].clone(),
-                exc_value: seq[1].clone(),
-                exc_traceback: seq[2].clone(),
-                thread: seq[3].clone(),
-            })
+            match <[_; 4]>::try_from(seq) {
+                Ok([exc_type, exc_value, exc_traceback, thread]) => Ok(Self {
+                    exc_type,
+                    exc_value,
+                    exc_traceback,
+                    thread,
+                }),
+                Err(seq) => Err(vm.new_type_error(format!(
+                    "_ExceptHookArgs expected 4 arguments, got {}",
+                    seq.len()
+                ))),
+            }

499-522: Deep nesting in stderr fallback logic reduces readability.

The nested if let / if / else chain for getting stderr is hard to follow. Consider extracting to a helper function or using early returns.

🔎 Suggested refactor using a helper closure
+        // Helper to get stderr, with fallback to thread._stderr
+        let get_stderr = || -> Option<crate::PyObjectRef> {
+            if let Ok(stderr) = vm.sys_module.get_attr("stderr", vm) {
+                if !vm.is_none(&stderr) {
+                    return Some(stderr);
+                }
+            }
+            let thread = thread.as_ref()?;
+            if vm.is_none(thread) {
+                return None;
+            }
+            let thread_stderr = thread.get_attr("_stderr", vm).ok()?;
+            if vm.is_none(&thread_stderr) {
+                return None;
+            }
+            Some(thread_stderr)
+        };
+
-        // Get stderr - fall back to thread._stderr if sys.stderr is None
-        let stderr = match vm.sys_module.get_attr("stderr", vm) {
-            Ok(stderr) if !vm.is_none(&stderr) => stderr,
-            _ => {
-                // ... deeply nested logic ...
-            }
-        };
+        let stderr = match get_stderr() {
+            Some(s) => s,
+            None => return Ok(()),
+        };
crates/common/src/ebr/internal.rs (1)

55-58: Static mutable variables are unsafe and difficult to use correctly.

MAX_OBJECTS and MANUAL_EVENTS_BETWEEN_COLLECT are static mut, requiring unsafe blocks for every access. Consider using AtomicUsize or once_cell::sync::Lazy for safe interior mutability, especially if these might be configurable at runtime.

🔎 Suggested safer alternative
-/// Maximum number of objects a bag can contain.
-static mut MAX_OBJECTS: usize = 64;
-
-#[allow(dead_code)]
-static mut MANUAL_EVENTS_BETWEEN_COLLECT: usize = 64;
+use std::sync::atomic::AtomicUsize;
+
+/// Maximum number of objects a bag can contain.
+static MAX_OBJECTS: AtomicUsize = AtomicUsize::new(64);
+
+#[allow(dead_code)]
+static MANUAL_EVENTS_BETWEEN_COLLECT: AtomicUsize = AtomicUsize::new(64);

Then update usages:

-Bag(Vec::with_capacity(unsafe { MAX_OBJECTS }))
+Bag(Vec::with_capacity(MAX_OBJECTS.load(Ordering::Relaxed)))
crates/vm/src/gc_state.rs (2)

406-418: Holding read locks during entire collection may cause contention.

Lines 406-408 acquire read locks on all generation sets and hold them until line 513/523. During this time, track_object and untrack_object from other threads will block on write locks.

For a GC that may take significant time, this could cause allocation pauses. Consider copying the object set and releasing locks earlier.

🔎 Alternative: copy and release locks early
-        let gen_locks: Vec<_> = (0..=generation)
-            .filter_map(|i| self.generation_objects[i].read().ok())
-            .collect();
-
         let mut collecting: HashSet<GcObjectPtr> = HashSet::new();
-        for gen_set in &gen_locks {
-            for &ptr in gen_set.iter() {
+        for i in 0..=generation {
+            if let Ok(gen_set) = self.generation_objects[i].read() {
+                for &ptr in gen_set.iter() {
+                    let obj = unsafe { ptr.0.as_ref() };
+                    if obj.strong_count() > 0 {
+                        collecting.insert(ptr);
+                    }
+                }
+            } // Lock released here
+        }

This trades off snapshot consistency for reduced lock contention.


690-712: promote_survivors has O(n*m) complexity - can be optimized.

For each survivor, the code iterates through all generations 0..=from_gen to find which one contains it (lines 700-710). Since objects should be in exactly one generation, this is inefficient.

A more efficient approach would track which generation each object is in, or iterate the generation sets instead of survivors.

🔎 Sketch of more efficient approach
fn promote_survivors(&self, from_gen: usize, survivors: &HashSet<GcObjectPtr>) {
    if from_gen >= 2 {
        return;
    }
    let next_gen = from_gen + 1;
    
    // Process each generation once instead of per-survivor
    for gen_idx in 0..=from_gen {
        if let Ok(mut gen_set) = self.generation_objects[gen_idx].write() {
            let to_promote: Vec<_> = gen_set.intersection(survivors).copied().collect();
            for ptr in to_promote {
                gen_set.remove(&ptr);
                if let Ok(mut next_set) = self.generation_objects[next_gen].write() {
                    next_set.insert(ptr);
                }
            }
        }
    }
}
crates/common/src/ebr/collector.rs (1)

70-96: LocalHandle safely stores a raw pointer via handle_count tracking—add safety documentation to explain the invariants.

The unsafe code is sound: Local is heap-allocated via RawShared::from_owned() and lives longer than any LocalHandle due to the handle_count mechanism (initialized to 1, incremented/decremented on handle creation/drop, and only finalized when both handle_count and guard_count reach 0). The design correctly follows the epoch-based reclamation pattern.

Add a safety comment to LocalHandle documenting these invariants so maintainers understand why the unsafe code is safe.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b19312b and 9bb9097.

⛔ Files ignored due to path filters (11)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_super.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
  • Lib/test/test_with.py is excluded by !Lib/**
📒 Files selected for processing (40)
  • Cargo.toml
  • crates/common/Cargo.toml
  • crates/common/src/ebr/collector.rs
  • crates/common/src/ebr/default.rs
  • crates/common/src/ebr/deferred.rs
  • crates/common/src/ebr/epoch.rs
  • crates/common/src/ebr/guard.rs
  • crates/common/src/ebr/internal.rs
  • crates/common/src/ebr/mod.rs
  • crates/common/src/ebr/pointers.rs
  • crates/common/src/ebr/sync/list.rs
  • crates/common/src/ebr/sync/mod.rs
  • crates/common/src/ebr/sync/once_lock.rs
  • crates/common/src/ebr/sync/queue.rs
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/stdlib/src/fcntl.rs
  • crates/stdlib/src/gc.rs
  • crates/stdlib/src/ssl.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/super.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • crates/vm/src/warn.rs
🧰 Additional context used
🧬 Code graph analysis (24)
crates/vm/src/vm/context.rs (2)
crates/vm/src/object/core.rs (2)
  • new_ref (1290-1307)
  • default (342-344)
crates/vm/src/builtins/list.rs (1)
  • new_ref (91-93)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
  • gc_state (742-744)
crates/vm/src/builtins/dict.rs (2)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/object/traverse.rs (9)
  • traverse (34-34)
  • traverse (43-45)
  • traverse (49-51)
  • traverse (55-55)
  • traverse (60-64)
  • traverse (72-76)
  • traverse (84-88)
  • traverse (96-100)
  • pop_edges (39-39)
crates/common/src/ebr/default.rs (4)
crates/common/src/ebr/collector.rs (6)
  • new (30-32)
  • global_epoch (41-43)
  • drop (91-95)
  • drop (248-250)
  • drop (313-315)
  • drop (393-395)
crates/common/src/ebr/epoch.rs (1)
  • new (90-93)
crates/common/src/ebr/sync/once_lock.rs (2)
  • new (25-31)
  • drop (98-103)
crates/common/src/ebr/guard.rs (1)
  • drop (150-154)
crates/vm/src/warn.rs (6)
crates/vm/src/vm/mod.rs (1)
  • modules (990-990)
crates/vm/src/builtins/type.rs (1)
  • __dict__ (963-965)
crates/vm/src/stdlib/builtins.rs (1)
  • globals (421-423)
crates/vm/src/object/core.rs (1)
  • dict (732-734)
crates/vm/src/builtins/dict.rs (1)
  • dict (782-782)
crates/vm/src/frame.rs (1)
  • dict (801-801)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1932-1934)
  • try_pop_edges (1936-1938)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (19-19)
  • try_pop_edges (21-21)
crates/stdlib/src/fcntl.rs (1)
extra_tests/snippets/stdlib_ctypes.py (1)
  • c_ulong (94-95)
crates/common/src/ebr/guard.rs (2)
crates/common/src/ebr/internal.rs (4)
  • defer (349-359)
  • new (68-70)
  • new (160-166)
  • pin (392-454)
crates/common/src/ebr/deferred.rs (5)
  • mem (33-33)
  • mem (34-34)
  • mem (37-37)
  • mem (37-37)
  • new (32-70)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
  • debug_obj (87-93)
  • drop_dealloc_obj (84-86)
  • try_pop_edges_obj (103-106)
  • try_trace_obj (96-100)
crates/vm/src/builtins/function.rs (2)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/common/src/ebr/sync/queue.rs (1)
crates/common/src/ebr/guard.rs (1)
  • unprotected (173-177)
crates/common/src/ebr/collector.rs (2)
crates/common/src/ebr/internal.rs (9)
  • fmt (117-119)
  • default (101-103)
  • new (68-70)
  • new (160-166)
  • register (300-323)
  • pin (392-454)
  • drop (107-112)
  • release_handle (517-526)
  • collector (333-335)
crates/common/src/ebr/guard.rs (2)
  • fmt (158-160)
  • drop (150-154)
crates/vm/src/vm/thread.rs (1)
crates/common/src/ebr/default.rs (1)
  • cs (24-26)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • ensure_pinned (30-36)
  • drop_guard (55-59)
crates/stdlib/src/ssl.rs (2)
crates/vm/src/builtins/function.rs (1)
  • traverse (47-58)
crates/vm/src/builtins/dict.rs (1)
  • traverse (48-50)
crates/common/src/ebr/mod.rs (5)
crates/common/src/ebr/internal.rs (2)
  • collector (333-335)
  • default (101-103)
crates/common/src/ebr/default.rs (1)
  • collector (11-15)
crates/common/src/ebr/sync/list.rs (1)
  • default (133-137)
crates/common/src/ebr/collector.rs (1)
  • default (21-25)
crates/common/src/ebr/pointers.rs (1)
  • default (29-31)
crates/vm/src/builtins/tuple.rs (3)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/object/traverse.rs (10)
  • traverse (34-34)
  • traverse (43-45)
  • traverse (49-51)
  • traverse (55-55)
  • traverse (60-64)
  • traverse (72-76)
  • traverse (84-88)
  • traverse (96-100)
  • traverse (105-112)
  • pop_edges (39-39)
crates/vm/src/object/traverse.rs (4)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1932-1934)
  • try_pop_edges (1936-1938)
crates/vm/src/builtins/list.rs (1)
  • pop_edges (66-73)
crates/vm/src/builtins/dict.rs (1)
  • pop_edges (52-58)
crates/vm/src/builtins/tuple.rs (1)
  • pop_edges (47-51)
crates/vm/src/builtins/str.rs (1)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (19-19)
  • try_pop_edges (21-21)
crates/common/src/ebr/epoch.rs (1)
crates/common/src/ebr/internal.rs (2)
  • new (68-70)
  • new (160-166)
crates/common/src/refcount.rs (2)
crates/common/src/ebr/default.rs (2)
  • cs (24-26)
  • global_epoch (44-46)
crates/common/src/ebr/collector.rs (1)
  • global_epoch (41-43)
crates/vm/src/gc_state.rs (3)
crates/common/src/refcount.rs (2)
  • default (244-246)
  • with_deferred_drops (396-413)
crates/common/src/ebr/default.rs (1)
  • cs (24-26)
crates/common/src/ebr/guard.rs (1)
  • drop (150-154)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
  • reactivate_guard (44-50)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (18)
  • gc_state (742-744)
  • enable (164-166)
  • disable (169-171)
  • collect (368-370)
  • get_threshold (184-190)
  • set_threshold (64-66)
  • set_threshold (193-201)
  • get_count (204-210)
  • get_debug (174-176)
  • set_debug (179-181)
  • get_stats (213-219)
  • stats (68-75)
  • get_objects (297-338)
  • new (44-54)
  • new (134-156)
  • freeze (720-726)
  • unfreeze (729-732)
  • get_freeze_count (715-717)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Collect benchmark data
  • GitHub Check: Collect what is left data
  • GitHub Check: Collect code coverage data
  • GitHub Check: Collect regression test data
🔇 Additional comments (73)
crates/derive-impl/src/util.rs (1)

375-375: LGTM! The pop_edges attribute is properly integrated throughout the codebase.

The addition of "pop_edges" to ALLOWED_NAMES follows the pattern established by other attributes like "traverse" and enables #[pyclass(pop_edges)] to pass validation. The macro implementation already checks for this attribute using _has_key("pop_edges") and generates the appropriate HAS_POP_EDGES constant and try_pop_edges method based on its presence. Multiple builtin types (dict, list, tuple, function) already use this attribute with corresponding pop_edges method implementations, and the GC system successfully leverages it to extract child references for circular reference resolution.

crates/stdlib/src/fcntl.rs (1)

95-95: Conversion logic is correct; i64 choice is reasonable for the Python API.

The truncation-then-zero-extend pattern correctly handles both positive values and negative i32 values passed as i64 (preserving their bit patterns). The i64 parameter type is appropriate for a Python API—it accepts a wider range of Python integer inputs while still safely truncating to u32 before passing to the syscall, which aligns with CPython's approach of accepting platform-sized integers for ioctl request codes. No breaking changes to internal Rust callers were found.

crates/common/Cargo.toml (1)

34-38: EBR dependency block looks consistent with existing workspace patterns

The added crossbeam-utils, memoffset, scopeguard, and atomic workspace dependencies are grouped clearly under an EBR comment and match the style of existing entries; no issues from the manifest side.

crates/derive-impl/src/pyclass.rs (1)

608-643: MaybeTraverse generation for traverse/pop_edges looks consistent with GC design

The new MaybeTraverse impl:

  • Sets IS_TRACE based on has_traverse and only calls Traverse::traverse when requested.
  • Exposes a HAS_POP_EDGES flag and routes try_pop_edges to Traverse::pop_edges only when pop_edges is present.
  • Leaves both hooks as no‑ops when the attributes are absent, so all #[pyclass] types implement MaybeTraverse uniformly.

This matches the documented cases (manual vs derived traverse, optional pop_edges) and keeps the generated code straightforward; aside from the _optional_str error handling above, the structure looks solid.

Also applies to: 699-706

crates/stdlib/src/ssl.rs (1)

2296-2345: GC traversal configuration for _SSLSocket looks appropriate

Marking _SSLSocket as #[pyclass(..., traverse)] and selectively skipping fields with #[pytraverse(skip)] aligns well with what actually holds Python references:

  • Traced: sock, context, owner, session, incoming_bio, outgoing_bio, pending_context — all contain PyObjectRef/PyRef and should participate in GC.
  • Skipped: server_side, server_hostname, connection, handshake_done, session_was_reused, verified_chain, sni_state, client_hello_buffer, shutdown_state, deferred_cert_error — these are pure Rust state or TLS/certificate structures with no embedded Python objects.

This should give the collector visibility into real Python edges (including the _SSLSocket ↔ owner/session relationships) without forcing Traverse impls on the rustls plumbing types.

crates/common/src/ebr/sync/once_lock.rs (1)

97-103: LGTM: Drop implementation is correct.

The Drop implementation properly checks initialization before dropping the inner value, and uses drop_in_place correctly through a raw pointer cast.

crates/vm/src/object/core.rs (3)

870-888: Good: Proper finalization guard prevents double __del__ calls.

The is_finalized/mark_finalized pattern correctly prevents __del__ from being called twice on resurrected objects. This aligns with CPython's behavior.


1290-1307: LGTM: GC tracking during object creation.

The tracking logic correctly tracks objects with IS_TRACE or instance dict, matching the untracking condition in drop_slow. The maybe_collect call enables automatic GC triggering.


831-856: Resurrection detection logic is sound.

The RefCount::inc() behavior is correctly implemented as documented: when called on strong count 0, it performs a double increment (0→2) by executing two separate atomic operations when the first increment observes strong count = 0. This is intentional for thread safety, as confirmed in the RefCount implementation (crates/common/src/refcount.rs:273-276).

The resurrection detection correctly leverages this: after inc(), the expected strong count is 2. If after_del exceeds this, the __del__ method created new references and the object is resurrected; the second dec() is skipped to keep the object alive. Otherwise, both decrements run to return the count to 0, matching the double increment. The logic is correct.

crates/stdlib/src/gc.rs (1)

50-86: LGTM: collect function correctly implements generation-based collection.

The implementation properly validates generation bounds, invokes start/stop callbacks, forces collection even when GC is disabled (matching CPython behavior), and transfers garbage to the VM context.

crates/common/src/lib.rs (1)

17-17: LGTM: EBR module export.

The new ebr module is correctly added to the public API surface, enabling epoch-based reclamation functionality across the crate.

crates/vm/src/lib.rs (1)

80-80: LGTM: GC state module export.

The gc_state module is correctly exposed as a public module, enabling GC functionality to be accessed from the stdlib and other components.

crates/vm/src/builtins/str.rs (1)

1928-1938: LGTM: Correct traversal implementation for PyUtf8Str.

PyUtf8Str correctly implements MaybeTraverse:

  • IS_TRACE = true enables GC participation
  • HAS_POP_EDGES = false is correct since strings don't hold references to other Python objects
  • try_pop_edges is appropriately a no-op
crates/vm/src/object/traverse.rs (3)

16-17: LGTM!

The HAS_POP_EDGES flag follows the same pattern as IS_TRACE and provides a clean compile-time capability check for the pop_edges functionality.


21-21: LGTM!

The try_pop_edges method signature and default implementation are appropriate. The &mut self parameter correctly allows implementations to take ownership of child references during GC cleanup.


36-39: LGTM!

The pop_edges method is well-documented with clear semantics for circular reference resolution. The default no-op implementation is safe, and the relevant code snippets show that concrete types like PyDict, PyList, and PyTuple properly implement this method to extract their owned child references.

crates/vm/src/vm/context.rs (3)

55-57: LGTM!

The new GC state fields are appropriately typed as PyListRef and clearly documented. The public visibility is correct for VM-level GC integration.


336-338: LGTM!

The initialization of gc_callbacks and gc_garbage follows the correct pattern used for other context objects like empty_bytes. Both lists are properly tracked by the GC system via PyRef::new_ref.


360-361: LGTM!

The struct initialization correctly includes the new GC fields.

crates/common/src/ebr/sync/mod.rs (1)

1-5: LGTM!

The synchronization primitives module structure is clean and appropriately scoped with pub(crate) visibility for internal EBR use.

crates/common/src/ebr/mod.rs (2)

1-17: LGTM!

The module documentation provides a clear and comprehensive explanation of Epoch-Based Reclamation, including key concepts like pinning and critical sections.


27-30: LGTM!

The re-exports provide a clean public API surface, making core EBR types and functions easily accessible through the module root.

crates/vm/src/builtins/super.rs (2)

64-64: LGTM!

The type signature change from PyTypeRef to PyObjectRef provides more flexibility and aligns with Python's dynamic type system, where super() can accept any object that is then checked at runtime.


78-81: LGTM!

The downcast logic correctly validates that the first argument to super() is a type object. The error message is clear and matches Python conventions, and the error handling via map_err is appropriate.

crates/vm/src/builtins/list.rs (2)

60-74: LGTM on the Traverse implementation.

The unsafe impl Traverse correctly visits all owned PyObjectRefs. The pop_edges implementation using try_write() is appropriate for GC scenarios where the object should be unreachable.

One minor observation: if try_write() returns None (lock contention), elements won't be popped. The comment explains this is safe because objects should be unreachable during GC, but you may want to consider adding a debug assertion or logging for the failure case to catch unexpected contention during development.


27-33: Pyclass attributes look correct.

The traverse = "manual" and pop_edges attributes properly indicate that this type provides a manual Traverse implementation with edge-popping capability for GC, consistent with the pattern used for PyDict and PyTuple.

crates/vm/src/vm/thread.rs (4)

15-20: Thread-local EBR guard storage looks correct.

The RefCell<Option<Guard>> pattern is appropriate for thread-local state that needs interior mutability. The doc comment clearly explains this is for coarse-grained pinning.


24-36: LGTM on ensure_pinned().

The function correctly uses ebr::cs() to enter a critical section and stores the guard thread-locally. The idempotent check-then-set pattern is appropriate.


38-50: LGTM on reactivate_guard().

Correctly calls reactivate() on the existing guard to allow epoch advancement at safe points. The pattern of borrowing mutably only when a guard exists is efficient.


52-59: LGTM on drop_guard().

Setting the guard to None will drop the Guard, which unpins the thread as shown in the relevant snippet from guard.rs (the Drop impl calls local.unpin()).

crates/vm/src/vm/mod.rs (2)

475-480: Correct CPython-aligned behavior for unraisable exceptions during shutdown.

Suppressing unraisable exceptions when finalizing is true matches CPython's behavior where daemon thread exceptions and __del__ errors are silently ignored during interpreter shutdown.


533-538: Good placement for EBR guard reactivation.

Frame boundaries are appropriate safe points for reactivating the EBR guard. This allows the GC to advance epochs and free deferred objects while ensuring no object references are held across the reactivation.

The #[cfg(feature = "threading")] guard is correct since EBR is only relevant in threaded contexts.

crates/vm/src/vm/interpreter.rs (2)

134-145: Good shutdown sequence for thread cleanup.

Setting the finalizing flag before calling threading._shutdown() ensures that any exceptions from daemon threads during shutdown are suppressed. The error handling with let _ = shutdown.call((), vm) silently ignores failures, which is appropriate during finalization.


149-158: Code pattern verified — both gc_state() and collect_force() are properly exposed and functional.

The two-phase GC collection around module finalization is correct:

  1. First pass collects existing cycles before breaking module references
  2. Module cleanup breaks references to enable collection
  3. Second pass collects the newly unreachable cyclic garbage

Both crate::gc_state::gc_state() (public function returning &'static GcState) and collect_force() (public method accepting generation parameter) are properly defined and accessible.

crates/derive-impl/src/pystructseq.rs (1)

610-622: LGTM on MaybeTraverse implementation for struct sequences.

The implementation correctly:

  • Sets IS_TRACE = true to enable traversal
  • Sets HAS_POP_EDGES = false since edge popping is delegated to the inner PyTuple
  • Provides a no-op try_pop_edges with appropriate comment
  • Delegates try_traverse to the inner tuple (line 616)

This follows the same pattern as PyUtf8Str and other wrapper types.

crates/vm/src/builtins/tuple.rs (2)

40-52: Correct Traverse implementation for PyTuple.

The implementation is sound:

  • traverse properly visits all elements via delegation
  • pop_edges uses std::mem::take which replaces self.elements with an empty boxed slice and returns the original
  • Converting to Vec via into_vec() and extending is correct

Unlike PyList, no locking is needed here because tuples are conceptually immutable (the elements are fixed at construction), and during GC the object is unreachable anyway.


28-28: Pyclass attributes correctly configured.

The traverse = "manual" and pop_edges attributes match the manual Traverse implementation below.

crates/common/src/ebr/default.rs (3)

33-41: Good fallback handling in with_handle.

The try_with with unwrap_or_else fallback correctly handles the case where the thread-local HANDLE has been destroyed (e.g., during thread exit). By registering a new temporary handle with the collector, the function remains functional even during thread teardown.

This is validated by the pin_while_exiting test which demonstrates pinning after HANDLE is dropped doesn't panic.


11-26: Clean EBR default module design.

The module correctly implements lazy initialization:

  • OnceLock ensures the global Collector is initialized exactly once
  • Thread-local HANDLE is registered lazily on first use per thread
  • cs() provides a simple API for entering critical sections

48-76: Good test coverage for edge case.

The pin_while_exiting test validates that calling cs() during thread exit (after HANDLE is dropped but before other thread-locals like FOO) doesn't panic. This is an important safety property for the EBR system.

crates/vm/src/object/traverse_object.rs (2)

17-19: LGTM! Documentation and signature are clear.

The pop_edges hook is well-documented and correctly typed as an optional function pointer for extracting child references before deallocation.


34-40: LGTM! Conditional initialization follows the existing trace pattern.

The const block pattern mirrors the trace field initialization, maintaining consistency in how optional vtable hooks are configured based on type traits.

crates/common/src/ebr/deferred.rs (1)

74-77: call consumes self but doesn't prevent double-drop.

The call method consumes self, which is correct. However, since there's no Drop impl, ensure that Deferred instances are always consumed via call() and never dropped without invocation. The current test coverage suggests this is the intended usage pattern.

crates/common/src/ebr/guard.rs (2)

116-136: Good use of scopeguard::defer! for panic safety in reactivate_after.

The implementation correctly uses scopeguard::defer! to ensure the guard is re-pinned even if the provided function panics. The mem::forget(local.pin()) pattern properly transfers ownership of the new guard's lifetime to the existing Guard struct.


148-155: LGTM! Drop implementation correctly handles null local.

The as_ref() pattern safely handles both regular guards and unprotected guards (with null local pointer).

crates/common/src/ebr/epoch.rs (2)

16-20: LGTM! Clear documentation of the bit layout.

The LSB-as-pinned-flag design is well-documented and the data: usize representation is appropriate for atomic operations.


62-70: LGTM! successor correctly preserves pin state.

Adding 2 increments the epoch value (stored in bits 1+) while preserving the LSB (pin state). This is correct given the bit layout.

crates/common/src/ebr/sync/list.rs (4)

219-233: Drop uses unprotected() guard - verify thread safety assumptions.

The Drop implementation uses unprotected() which bypasses EBR protection. This is only safe if:

  1. The list is not being concurrently accessed by other threads
  2. All elements have already been marked as deleted (checked by assertion)

The assertion on line 227 (assert_eq!(succ.tag(), 1)) will panic if any element wasn't properly deleted before the list is dropped. Consider using debug_assert! or returning an error if this is a recoverable condition.


162-194: LGTM! Insert uses correct lock-free CAS pattern.

The insert operation correctly:

  1. Reads current head
  2. Sets new entry's next to current head
  3. Uses compare_exchange_weak in a retry loop
  4. Uses appropriate memory orderings (Release for success, Relaxed for retry)

239-296: Iterator correctly handles concurrent modifications.

The iterator properly:

  1. Detects deleted nodes via tag bit
  2. Attempts to unlink deleted nodes via CAS
  3. Restarts from head when predecessor is deleted
  4. Returns IterError::Stalled to signal restart

The logic is correct for a lock-free list traversal with concurrent modifications.


306-321: Test IsElement implementation uses raw pointer cast.

The finalize implementation correctly defers destruction via the guard. The unsafe block is properly scoped with a safety comment.

crates/vm/src/builtins/dict.rs (1)

46-59: No action required. The pop_all_entries method exists and is correctly implemented in dict_inner.rs at line 715. It returns an iterator of (PyObjectRef, T) pairs, which matches the expected usage in pop_edges. The Traverse implementation properly delegates to self.entries.traverse and correctly collects both keys and values in pop_edges.

crates/vm/src/builtins/function.rs (1)

85-92: No safety issue with Context::genesis() during GC.

The code is safe. Context::genesis() uses Rust's static_cell with get_or_init() pattern, which is thread-safe and idempotent—it initializes exactly once and returns a &'static reference. The method cannot panic, even on first access. Additionally, garbage collection only runs after the VirtualMachine is fully initialized, and VM initialization explicitly calls Context::genesis() before any object creation. The pop_edges method will always have a valid, initialized genesis context available.

crates/vm/src/stdlib/thread.rs (1)

351-376: EBR pinning correctly brackets thread execution.

The ensure_pinned() at thread start and drop_guard() at thread exit properly implements coarse-grained EBR pinning. This ensures the GC won't reclaim objects while the thread is active.

One consideration: if the thread panics between ensure_pinned() and drop_guard(), the guard may not be dropped. However, since EBR_GUARD is thread-local, it will be cleaned up when the thread exits anyway.

crates/common/src/ebr/collector.rs (2)

104-423: Comprehensive test coverage for EBR pinning and reclamation.

The tests cover critical scenarios: reentrant pinning, local bag flushing, garbage buffering, epoch advancement under pinning, stress tests with multiple threads, and array destruction. Good use of #[cfg(miri)] for reduced iteration counts during Miri testing.


15-16: Manual Send and Sync implementations are necessary and appropriately justified.

The Collector wraps Arc<Global>, which is not automatically Send + Sync. This is evidenced by the #[allow(clippy::arc_with_non_send_sync)] lint suppression in Collector::default(), which the Rust clippy linter uses to warn when Arc wraps types that don't implement Send or Sync. Since Global contains List<Local> with intrusive pointers to Local entries—which themselves contain UnsafeCell—the manual implementations are correctly required to affirm thread-safety. These impls should remain in place.

crates/common/src/ebr/sync/queue.rs (4)

40-42: Manual Send and Sync impls are appropriate for the queue.

The safety comment on line 40 is accurate: individual T values are not accessed concurrently since they're only read/written during push/pop which are atomic operations. The Send bound on T is correctly required.


141-169: pop_if_internal reads data before CAS - potential for reading invalid data if condition races.

The condition reads n.data.as_ptr() at line 151 before the CAS at line 153. If another thread pops this node between the read and the CAS:

  1. The CAS will fail (returning Err)
  2. But the condition was evaluated on potentially stale/freed data

However, since the guard is held during this operation, EBR guarantees the memory won't be reclaimed until the guard is dropped. The T: Sync bound ensures the read is safe.


201-213: Drop implementation correctly drains queue before destroying sentinel.

The unprotected() guard usage in Drop is safe because drop has exclusive access (&mut self). The sentinel node is properly destroyed after draining all data nodes.


249-257: Test wrapper's pop() spins indefinitely - acceptable for tests but would deadlock on empty queue.

The pop method in the test wrapper loops forever until an element is available. This is fine for test scenarios where producers are guaranteed to push, but the comment or implementation could note this is intentional for testing.

crates/common/src/ebr/internal.rs (4)

134-141: SealedBag expiration uses 3-epoch distance - matches EBR guarantees.

The is_expired check at line 139 ensures bags are only reclaimed when global_epoch.wrapping_sub(self.epoch) >= 3. This is correct for EBR: a pinned thread can observe at most one epoch advancement, so 3 epochs guarantees all pinned threads have unpinned since the bag was sealed.


390-454: Local::pin implements correct EBR pinning with x86 optimization.

The pinning logic:

  1. Increments guard_count (lines 395-396)
  2. On first pin (guard_count == 0), stores pinned epoch with proper synchronization
  3. Uses x86-specific compare_exchange optimization (lines 406-434) for performance
  4. Retries if global epoch advanced during pinning (lines 440-444)
  5. Resets advance counter on epoch change (lines 447-450)

The x86 hack using compare_exchange instead of mfence is a known optimization from crossbeam-epoch.


456-480: Local::unpin triggers collection when last guard is dropped.

The unpin logic at lines 460-470 triggers collection while still pinned (guard_count == 1). This is intentional - collection requires being pinned to safely traverse the object graph. The collecting flag prevents reentrancy.

Note: Line 472 decrements guard_count after the collection loop, which is correct - the ManuallyDrop guard at line 465 prevents double-decrement.


573-593: IsElement trait implementation uses offset_of! for intrusive list.

The entry_of and element_of implementations correctly convert between Local and Entry using offset_of!. The finalize method defers destruction of the Local node.

Line 590: The RawShared::from conversion from *const Local is safe because the caller guarantees the entry belongs to a valid Local.

crates/common/src/ebr/pointers.rs (3)

62-72: High tag constants and bit manipulation are correct.

HIGH_TAG_WIDTH = 4 uses the top 4 bits of the pointer. The high_bits() and high_bits_pos() calculations are correct:

  • high_bits_pos() = 64 - 4 = 60 (on 64-bit)
  • high_bits() = 0xF << 60 = 0xF000_0000_0000_0000

This reserves the top 4 bits for epoch tagging while preserving pointer validity in the middle bits.


148-156: low_bits<T>() correctly computes alignment-based tag mask.

The function (1 << align_of::<T>().trailing_zeros()) - 1 computes the maximum value that fits in the alignment bits. For example, if align_of::<T>() == 8, this returns 0b111 = 7.


258-306: RawShared provides safe wrappers around unsafe pointer operations.

The from_owned, drop, deref, and as_ref methods correctly document their safety requirements. The #[cfg(test)] on is_null keeps it out of production builds while available for testing.

crates/common/src/refcount.rs (4)

301-312: dec() correctly handles LEAKED objects.

The decrement returns false for leaked objects (line 307-309), preventing deallocation of interned objects. The check old.strong() == 1 (line 311) correctly returns true only when this was the last reference.


382-448: Deferred drop infrastructure is well-designed for deadlock prevention.

The thread-local IN_DEFERRED_CONTEXT and DEFERRED_QUEUE pattern allows deferring untrack calls during GC collection. Key points:

  • with_deferred_drops sets context and flushes on exit (lines 400-412)
  • try_defer_drop queues if in context, else executes immediately (lines 425-433)
  • flush_deferred_drops drains queue outside borrow (lines 440-447)

The comment at lines 378-380 explains the deadlock scenario this prevents.


461-468: defer_destruction is marked unsafe but doesn't validate guard.

The safety comment requires the guard to be valid, but there's no runtime check. This is acceptable since it's unsafe, but consider adding a debug assertion.


264-277: Clarify the strong==0 case and document abort rationale.

Line 273-276: The comment "The previous fetch_add created a permission to run decrement again" is cryptic and doesn't explain why a second fetch_add is needed when strong == 0. This appears to handle a race condition with concurrent decrements, but the logic should be documented explicitly.

Lines 268-272: The abort() call on a destructed object is a severe failure (kills the process). Document why this is preferred over panic!() or unreachable!(). Note that safe_inc() (lines 282-299) handles the destructed case more gracefully with compare-exchange; clarify which path is recommended and why inc() takes the fatal approach.

crates/vm/src/gc_state.rs (2)

85-92: GcObjectPtr manual Send/Sync requires careful justification.

The wrapper makes NonNull<PyObject> Send+Sync. The safety comment "proper synchronization is used" is vague. The actual safety relies on:

  1. Objects being accessed through PyObjectRef which manages synchronization
  2. GcObjectPtr only being used as a key (for hashing/equality), not for dereferencing without synchronization

Consider strengthening the safety comment to document these invariants.


735-744: Global singleton pattern using OnceLock is appropriate.

The GC_STATE singleton with OnceLock provides thread-safe lazy initialization. The gc_state() accessor is clean and idiomatic.

Comment on lines 39 to 43
/// Reads the global epoch, without issueing a fence.
#[inline]
pub fn global_epoch(&self) -> Epoch {
self.global.epoch.load(Ordering::Relaxed)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo in doc comment: "issueing" should be "issuing".

🔎 Proposed fix
-    /// Reads the global epoch, without issueing a fence.
+    /// Reads the global epoch, without issuing a fence.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Reads the global epoch, without issueing a fence.
#[inline]
pub fn global_epoch(&self) -> Epoch {
self.global.epoch.load(Ordering::Relaxed)
}
/// Reads the global epoch, without issuing a fence.
#[inline]
pub fn global_epoch(&self) -> Epoch {
self.global.epoch.load(Ordering::Relaxed)
}
🤖 Prompt for AI Agents
In crates/common/src/ebr/collector.rs around lines 39 to 43, the doc comment for
global_epoch contains a typo: "issueing" should be corrected to "issuing";
update the doc comment to read "Reads the global epoch, without issuing a
fence." so the documentation is spelled correctly.

Comment on lines 18 to 22
pub(crate) struct Deferred {
call: unsafe fn(*mut u8),
data: MaybeUninit<Data>,
_marker: PhantomData<*mut ()>, // !Send + !Sync
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing Drop implementation may cause memory leaks.

Deferred stores a FnOnce() closure inline or boxed, but has no Drop implementation. If a Deferred is dropped without call() being invoked, the stored closure (and any captured data) will be leaked. This is particularly concerning for the heap-allocated path where a Box<F> is stored.

Consider adding a Drop impl that invokes the closure or at least drops the stored data, unless leaking is intentional for the EBR design (in which case, document this behavior).

🔎 Proposed Drop implementation
+impl Drop for Deferred {
+    fn drop(&mut self) {
+        // Call the deferred function to ensure it's not leaked
+        // This is safe because we own self
+        let call = self.call;
+        unsafe { call(self.data.as_mut_ptr().cast::<u8>()) };
+    }
+}

Alternatively, if leaking is intentional, add documentation explaining why.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/common/src/ebr/deferred.rs around lines 18-22, Deferred currently
holds a closure (inline or boxed) but has no Drop implementation, so if a
Deferred is dropped without call() being invoked the stored closure/data will be
leaked; add a Drop impl that safely releases stored resources: on drop, check
the active variant/state and either invoke the stored destructor function (the
unsafe fn(*mut u8)) against the data buffer to run/drop the closure or, if
invoking is unsafe for EBR semantics, explicitly run the correct destructor for
the boxed path and drop the inline data to avoid leaks; ensure the Drop uses the
same unsafe logic as call(), avoids double-running if call() was already
executed (e.g. by clearing a flag or setting data to a neutral state), and
documents the chosen behavior if intentional leaking is preserved.

Comment on lines 32 to 44
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows.
///
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows. The closure must be `Send` as another thread may execute it.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicate safety documentation.

The safety documentation for defer_unchecked is duplicated (lines 32-39 and 40-44). Remove the duplicate section.

🔎 Proposed fix
     /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
     /// the closure must be `Send`.
-    /// # Safety
-    ///
-    /// The given function must not hold reference onto the stack. It is highly recommended that
-    /// the passed function is **always** marked with `move` in order to prevent accidental
-    /// borrows. The closure must be `Send` as another thread may execute it.
     pub unsafe fn defer_unchecked<F, R>(&self, f: F)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows.
///
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows. The closure must be `Send` as another thread may execute it.
/// # Safety
///
/// The given function must not hold reference onto the stack. It is highly recommended that
/// the passed function is **always** marked with `move` in order to prevent accidental
/// borrows.
///
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
pub unsafe fn defer_unchecked<F, R>(&self, f: F)
🤖 Prompt for AI Agents
crates/common/src/ebr/guard.rs lines 32-44: the safety documentation for
defer_unchecked is duplicated across two blocks; remove the redundant copy so
only a single Safety section remains, keeping the more concise/accurate wording
(ensure it states that the closure must not hold stack references, should be
marked move to avoid borrows, and must be Send), and adjust surrounding comment
spacing to preserve formatting.

Comment on lines 208 to 215
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
// HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
// The equality of the sizes is checked by `const_assert!`.
let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
let prev = inner.fetch_or(low_bits::<T>() & tag, order);
RawShared::from(prev as *const _)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

fetch_or uses unsafe transmute to AtomicUsize - missing size assertion.

The comment mentions "The equality of the sizes is checked by const_assert!" but no const_assert! is present in this file. This is a latent bug if Atomic<Tagged<T>> ever differs in size from AtomicUsize.

🔎 Add the missing size assertion
+use static_assertions::const_assert_eq;
+const_assert_eq!(core::mem::size_of::<Atomic<Tagged<()>>>(), core::mem::size_of::<AtomicUsize>());
+
 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
     // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
     // The equality of the sizes is checked by `const_assert!`.

Alternatively, add a debug_assert at runtime:

 pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
+    debug_assert_eq!(
+        core::mem::size_of::<Atomic<Tagged<T>>>(),
+        core::mem::size_of::<AtomicUsize>()
+    );
     let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
// HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
// The equality of the sizes is checked by `const_assert!`.
let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
let prev = inner.fetch_or(low_bits::<T>() & tag, order);
RawShared::from(prev as *const _)
}
}
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
debug_assert_eq!(
core::mem::size_of::<Atomic<Tagged<T>>>(),
core::mem::size_of::<AtomicUsize>()
);
// HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
// The equality of the sizes is checked by `const_assert!`.
let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };
let prev = inner.fetch_or(low_bits::<T>() & tag, order);
RawShared::from(prev as *const _)
}
}

Comment on lines 33 to 87
/// State wraps reference count + flags in a single 64-bit word
#[derive(Clone, Copy)]
pub struct State {
inner: u64,
}

impl State {
#[inline]
pub fn from_raw(inner: u64) -> Self {
Self { inner }
}

#[inline]
pub fn as_raw(self) -> u64 {
self.inner
}

#[inline]
pub fn strong(self) -> u32 {
((self.inner & STRONG) / COUNT) as u32
}

#[inline]
pub fn weak(self) -> u32 {
((self.inner & WEAK) / WEAK_COUNT) as u32
}

#[inline]
pub fn destructed(self) -> bool {
(self.inner & DESTRUCTED) != 0
}

#[inline]
pub fn weaked(self) -> bool {
(self.inner & WEAKED) != 0
}

#[inline]
pub fn leaked(self) -> bool {
(self.inner & LEAKED) != 0
}

#[inline]
pub fn epoch(self) -> u32 {
((self.inner & EPOCH) >> EPOCH_MASK_HEIGHT) as u32
}

#[inline]
pub fn with_epoch(self, epoch: usize) -> Self {
Self::from_raw((self.inner & !EPOCH) | (((epoch as u64) << EPOCH_MASK_HEIGHT) & EPOCH))
}

#[inline]
pub fn add_strong(self, val: u32) -> Self {
Self::from_raw(self.inner + (val as u64) * COUNT)
}

#[inline]
pub fn sub_strong(self, val: u32) -> Self {
debug_assert!(self.strong() >= val);
Self::from_raw(self.inner - (val as u64) * COUNT)
}

#[inline]
pub fn add_weak(self, val: u32) -> Self {
Self::from_raw(self.inner + (val as u64) * WEAK_COUNT)
}

#[inline]
pub fn with_destructed(self, dest: bool) -> Self {
Self::from_raw((self.inner & !DESTRUCTED) | if dest { DESTRUCTED } else { 0 })
}

#[inline]
pub fn with_weaked(self, weaked: bool) -> Self {
Self::from_raw((self.inner & !WEAKED) | if weaked { WEAKED } else { 0 })
}

#[inline]
pub fn with_leaked(self, leaked: bool) -> Self {
Self::from_raw((self.inner & !LEAKED) | if leaked { LEAKED } else { 0 })
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

State and PyState are identical - consider removing duplication.

PyState (lines 151-233) has the exact same fields and methods as State (lines 33-115). This violates DRY and creates maintenance burden. If they need to diverge in the future, document why. Otherwise, use a single type or a type alias.

🔎 Proposed fix - use type alias or single type
-/// PyState extends State with LEAKED support for RustPython
-#[derive(Clone, Copy)]
-pub struct PyState {
-    inner: u64,
-}
-
-impl PyState {
-    // ... 80+ lines of duplicated methods ...
-}
+/// PyState is an alias for State, used in RustPython integration
+pub type PyState = State;

If distinct types are truly needed for type safety, consider a generic or trait-based approach.

Also applies to: 151-233

🤖 Prompt for AI Agents
crates/common/src/refcount.rs lines 33-115 (and PyState at 151-233): the State
and PyState implementations are identical, so remove duplication by
consolidating into a single definition; either replace PyState with a type alias
(pub type PyState = State) and remove the duplicate struct/impl, or keep one
struct (State) and extract shared methods into a trait/generic used by both
names, updating all call sites to use the unified type; ensure visibility (pub)
and any module exports are preserved and add a short comment if you
intentionally keep a separate alias for future divergence.

Comment on lines 576 to 609
const ALLOWED_TRAVERSE_OPTS: &[&str] = &["manual"];
// try to know if it have a `#[pyclass(trace)]` exist on this struct
// TODO(discord9): rethink on auto detect `#[Derive(PyTrace)]`

// 1. no `traverse` at all: generate a dummy try_traverse
// 2. `traverse = "manual"`: generate a try_traverse, but not #[derive(Traverse)]
// 3. `traverse`: generate a try_traverse, and #[derive(Traverse)]
let (maybe_trace_code, derive_trace) = {
if class_meta.inner()._has_key("traverse")? {
let maybe_trace_code = quote! {
impl ::rustpython_vm::object::MaybeTraverse for #ident {
const IS_TRACE: bool = true;
fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
::rustpython_vm::object::Traverse::traverse(self, tracer_fn);
}
// Generate MaybeTraverse impl with both traverse and pop_edges support
//
// For traverse:
// 1. no `traverse` at all: IS_TRACE = false, try_traverse does nothing
// 2. `traverse = "manual"`: IS_TRACE = true, but no #[derive(Traverse)]
// 3. `traverse`: IS_TRACE = true, and #[derive(Traverse)]
//
// For pop_edges:
// 1. no `pop_edges`: HAS_POP_EDGES = false, try_pop_edges does nothing
// 2. `pop_edges`: HAS_POP_EDGES = true, try_pop_edges calls Traverse::pop_edges
let has_traverse = class_meta.inner()._has_key("traverse")?;
let has_pop_edges = class_meta.inner()._has_key("pop_edges")?;

let derive_trace = if has_traverse {
let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
bail_span!(
item,
"traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
);
}
assert_eq!(s, "manual");
quote! {}
} else {
quote! {#[derive(Traverse)]}
}
} else {
quote! {}
};

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Don’t ignore errors from _optional_str("traverse") when deriving Traverse

The new derive_trace logic is almost there, but this pattern:

let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
    // ...
} else {
    quote! {#[derive(Traverse)]}
}

silently treats any Err(_) from _optional_str as “no value, derive Traverse”, rather than surfacing a compile‑time error. That means malformed uses like #[pyclass(traverse(123))] could compile and unexpectedly auto‑derive Traverse instead of failing in the macro.

You can preserve the intended behavior while propagating parse errors by binding the Result with ? first:

Proposed fix for error propagation in traverse handling
-    let derive_trace = if has_traverse {
-        let value = class_meta.inner()._optional_str("traverse");
-        if let Ok(Some(s)) = value {
+    let derive_trace = if has_traverse {
+        let value = class_meta.inner()._optional_str("traverse")?;
+        if let Some(s) = value {
             if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
                 bail_span!(
                     item,
                     "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
                 );
             }
             assert_eq!(s, "manual");
             quote! {}
         } else {
             quote! {#[derive(Traverse)]}
         }
     } else {
         quote! {}
     };
🤖 Prompt for AI Agents
crates/derive-impl/src/pyclass.rs lines 576-607: the code currently ignores Err
from _optional_str("traverse") which can hide parse errors; change to propagate
the Result by using let value = class_meta.inner()._optional_str("traverse")?;
then match on value (Some(s) => validate s is "manual" and return quote!{} for
manual, None => return quote!{#[derive(Traverse)]}) so any Err is returned to
the caller instead of silently treating it as None.

Comment on lines +191 to 196
/// Return the list of objects that directly refer to any of the arguments.
#[pyfunction]
fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
Err(vm.new_not_implemented_error(""))
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
// This is expensive: we need to scan all tracked objects
// For now, return an empty list (would need full object tracking to implement)
let _ = args;
vm.ctx.new_list(vec![])
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Incomplete implementation: get_referrers always returns an empty list.

This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This could cause subtle bugs in code relying on this CPython feature.

Consider either:

  1. Implementing it properly by scanning all tracked objects
  2. Raising NotImplementedError to make the limitation explicit
  3. Adding a clear warning in the docstring
+    /// Return the list of objects that directly refer to any of the arguments.
+    /// 
+    /// NOTE: This is currently not implemented and always returns an empty list.
+    /// Implementing this requires scanning all tracked objects which is expensive.
     #[pyfunction]
     fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
-        // This is expensive: we need to scan all tracked objects
-        // For now, return an empty list (would need full object tracking to implement)
         let _ = args;
+        // TODO: Implement by iterating gc_state().get_objects(None) and checking referents
         vm.ctx.new_list(vec![])
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// Return the list of objects that directly refer to any of the arguments.
#[pyfunction]
fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult {
Err(vm.new_not_implemented_error(""))
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
// This is expensive: we need to scan all tracked objects
// For now, return an empty list (would need full object tracking to implement)
let _ = args;
vm.ctx.new_list(vec![])
}
/// Return the list of objects that directly refer to any of the arguments.
///
/// NOTE: This is currently not implemented and always returns an empty list.
/// Implementing this requires scanning all tracked objects which is expensive.
#[pyfunction]
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
let _ = args;
// TODO: Implement by iterating gc_state().get_objects(None) and checking referents
vm.ctx.new_list(vec![])
}
🤖 Prompt for AI Agents
In crates/stdlib/src/gc.rs around lines 191 to 198, get_referrers currently
returns an empty list which misrepresents its documented behavior; change it to
explicitly signal the limitation by returning/raising a NotImplementedError (or
the project-equivalent VM exception) instead of an empty list and update the
docstring to mention that get_referrers is not implemented on this platform;
ensure the function uses the VM to construct and return the proper exception so
callers receive an explicit error rather than an empty result.

Comment on lines 405 to 441
let (globals, filename, lineno) = if let Some(f) = f {
(f.globals.clone(), f.code.source_path, f.f_lineno())
} else if let Some(frame) = vm.current_frame() {
// We have a frame but it wasn't found during stack walking
(frame.globals.clone(), vm.ctx.intern_str("sys"), 1)
} else {
(vm.current_globals().clone(), vm.ctx.intern_str("sys"), 1)
// No frames on the stack - we're in interpreter shutdown or similar state
// Use __main__ globals if available, otherwise skip the warning
match vm
.sys_module
.get_attr(identifier!(vm, modules), vm)
.and_then(|modules| modules.get_item(vm.ctx.intern_str("__main__"), vm))
.and_then(|main_mod| main_mod.get_attr(identifier!(vm, __dict__), vm))
{
Ok(globals) => {
if let Ok(dict) = globals.downcast::<crate::builtins::PyDict>() {
(dict, vm.ctx.intern_str("sys"), 1)
} else {
// Cannot get globals, skip warning
return Ok((
vm.ctx.intern_str("sys").to_owned(),
1,
None,
vm.ctx.new_dict().into(),
));
}
}
Err(_) => {
// Cannot get __main__ globals, skip warning
return Ok((
vm.ctx.intern_str("sys").to_owned(),
1,
None,
vm.ctx.new_dict().into(),
));
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Shutdown fallback currently still emits warnings despite “skip warning” comment

In the final else branch of setup_context, both “Cannot get globals, skip warning” comments suggest the intent is to suppress the warning when there are no frames / __main__ globals. However, the function returns a normal (filename, lineno, module = None, registry = new_dict()) tuple; warn then still calls warn_explicit, which normalizes module from filename and proceeds to emit the warning.

If the goal is to truly skip warnings in these shutdown-like states, you’ll need a distinct signal (e.g., have setup_context return an Option and early‑return in warn, or return a sentinel registry/module that warn checks before calling warn_explicit). Otherwise, it’d be good to adjust or remove the “skip warning” comments to match the actual behavior and avoid confusion.

🤖 Prompt for AI Agents
In crates/vm/src/warn.rs around lines 405-441, the shutdown fallback branches
claim “skip warning” but still return a normal tuple so warn continues and emits
warnings; change the API so setup_context can signal “skip” (e.g., return
Option<(filename, lineno, module, registry)> or a Result with a distinct
variant) and update warn to early-return when setup_context returns None (or the
skip variant) so no warn_explicit is called during
interpreter-shutdown/no-globals cases; alternatively, if you prefer not to
change the signature, return a clear sentinel (like module
Some(vm.ctx.intern_str("__shutdown__")) or an empty registry flagged with a
special key) and add a check in warn to detect that sentinel and return
early—implement one of these approaches and remove or adjust the misleading
“skip warning” comments to match the new behavior.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/stdlib/src/json/machinery.rs (1)

199-246: Fix error position for malformed second \uXXXX escape.
The surrogate-pair branch uses next_char_i from the first escape, so DecodeError::pos is off when the second escape is invalid. Capture the actual char index from chars2.

Proposed fix
-                            if let Some(((_, (byte_pos2, _)), (_, _))) = chars2
+                            if let Some(((char_pos2, (byte_pos2, _)), (_, _))) = chars2
                                 .next_tuple()
                                 .filter(|((_, (_, c1)), (_, (_, c2)))| *c1 == '\\' && *c2 == 'u')
                             {
-                                let uni2 =
-                                    decode_unicode(&mut chars2, char_offset + next_char_i + 1)?;
+                                let uni2 = decode_unicode(&mut chars2, char_offset + char_pos2)?;
🤖 Fix all issues with AI agents
In @.claude/commands/upgrade-pylib.md:
- Around line 23-27: The fenced code block containing the three commands
(/upgrade-pylib inspect, /upgrade-pylib json, /upgrade-pylib asyncio) is missing
a language specifier; update that fenced block to include a language tag (e.g.,
"text") after the opening backticks so the block reads as a language-specified
code fence to enable proper syntax highlighting and rendering.

In `@crates/codegen/src/ir.rs`:
- Around line 259-266: Replace the unimplemented!() call for the unreachable
placeholder variants PseudoInstruction::JumpNoInterrupt, Reserved258,
SetupCleanup, SetupFinally, SetupWith, and StoreFastMaybeNull with a panic (or
other hard failure) that clearly states the instruction is unreachable and
includes the instruction debug info (e.g., panic!("Unreachable: {instr:?} should
never be emitted")). Update the match arm that currently contains
unimplemented!("Got a placeholder pseudo instruction ({instr:?})") so it uses
the new panic message to improve diagnostics when bytecode is corrupted or
manually constructed.

In `@crates/common/src/refcount.rs`:
- Around line 262-268: The header comment above the RefCount struct incorrectly
documents a 4-bit epoch while EPOCH_WIDTH is 0; update the comment (or make it
conditional on EPOCH_WIDTH) to reflect the real bit layout used by RefCount:
when EPOCH_WIDTH == 0 the layout should be "[1 bit: destructed] [1 bit: weaked]
[1 bit: leaked] [30 bits: weak_count] [31 bits: strong_count]" (or document both
variants and reference the EPOCH_WIDTH constant), so change the comment text
near the RefCount declaration and mention EPOCH_WIDTH so readers see the actual
mapping.
- Around line 30-42: The LEAKED flag overlaps the WEAK count because
TOTAL_COUNT_WIDTH subtracts only 2 reserved bits but there are 3 flag bits
(DESTRUCTED, WEAKED, LEAKED); update TOTAL_COUNT_WIDTH to subtract 3 (i.e.
TOTAL_COUNT_WIDTH = u64::BITS - EPOCH_WIDTH - 3) so
WEAK_WIDTH/STRONG_WIDTH/STRONG/WEAK/WEAK_COUNT/COUNT calculations shift
accordingly and the LEAKED bit (1 << (EPOCH_MASK_HEIGHT - 3)) no longer overlaps
the weak count; verify constants DESTRUCTED, WEAKED, LEAKED, WEAK, STRONG,
WEAK_COUNT remain consistent after this change.

In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 853-858: The match in function label_arg does not account for the
JumpNoInterrupt variant, so its target Arg<Label> is never returned; update
label_arg (in crate's instruction.rs) to match both Self::Jump { target: l } and
Self::JumpNoInterrupt { target: l } and return Some(*l) for both cases (keeping
the existing _ => None fallback) so label resolution handles JumpNoInterrupt
correctly.

In `@crates/derive-impl/src/pyclass.rs`:
- Around line 631-639: The impl of ::rustpython_vm::object::MaybeTraverse for
`#ident` uses the wrong type for the tracer: update the try_traverse signature to
take a lifetime on TraverseFn (match the trait) by changing &mut
::rustpython_vm::object::TraverseFn to &mut
::rustpython_vm::object::TraverseFn<'_>; modify the function header for fn
try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
... } in the impl for `#ident` so it matches the trait definition and compiles.

In `@crates/stdlib/src/json.rs`:
- Around line 229-236: The JSON parser currently calls Wtf8Buf::to_string() and
uses a HashMap<String, PyStrRef> for memoization which loses unpaired
surrogates; change parse_object (and the other similar sites) to preserve WTF-8
by constructing keys/values from the Wtf8Buf directly (or use PyStrRef that
preserves WTF-8) instead of calling Wtf8Buf::to_string(), and change the memo
map type to use the WTF-8 preserving key (e.g., HashMap<Wtf8Buf, PyStrRef> or
HashMap<PyStrRefKey, PyStrRef>) so keys like "\ud800" remain distinct and are
not corrupted.

In `@crates/stdlib/src/opcode.rs`:
- Around line 83-99: The has_name() matcher in opcode.rs currently treats some
PseudoInstruction variants as name-bearing but misses the super pseudo-opcodes;
update the pattern in has_name (the matches! over
Self::try_from(opcode).map(|op| op.inner())) to also consider
AnyInstruction::Pseudo(PseudoInstruction::LoadSuperMethod { .. }),
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperMethod { .. }), and
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperAttr { .. }) so those
pseudo-opcodes return true for has_name().

In `@crates/vm/src/gc_state.rs`:
- Around line 690-711: promote_survivors currently moves pointers between
generation_objects but never updates the corresponding GcGeneration.count,
causing get_count()/threshold logic to be wrong; modify promote_survivors so
that when you successfully remove ptr from the source generation (inside the
block that has mut gen_set) you also decrement
self.generation_objects[from_gen]’s GcGeneration.count and when you insert into
next_set (inside the block that has mut next_set) you increment
self.generation_objects[next_gen]’s GcGeneration.count (use the same write-locks
already acquired or the atomic/field update mechanism used by GcGeneration.count
to keep updates thread-safe), ensuring counts stay in sync with the sets
referenced by get_count() and threshold logic.
- Around line 719-732: freeze and unfreeze currently only adjust the numeric
counters (permanent.count and generations[*].count) leaving the actual object
pointers in generation_objects so frozen objects can still be collected and
get_objects(g) remains inconsistent; modify freeze and unfreeze to move object
handles between storage sets: maintain a permanent object container (e.g.
permanent.objects or permanent_set), on freeze iterate generations (or use
generation_objects) to remove each object's pointer from its generation_objects
entry and insert it into permanent.objects while adjusting counts atomically,
and on unfreeze move pointers from permanent.objects back into
generations[2].generation_objects and update counts; also update get_objects to
consult permanent.objects when g == Permanent (or to exclude permanent objects
from normal collection paths) and ensure any collector loops skip
permanent.objects.
♻️ Duplicate comments (5)
crates/common/src/refcount.rs (1)

170-260: State and PyState are identical - duplication concern.

This was flagged in a previous review. These two types have identical implementations. Consider consolidating with pub type PyState = State; or documenting why separate types are necessary.

crates/derive-impl/src/pyclass.rs (1)

590-603: Propagate errors from traverse attribute parsing.
The current Result is silently ignored, so malformed #[pyclass(traverse(...))] can compile unexpectedly. This is the same issue noted previously.

🛠️ Suggested fix
-    let value = class_meta.inner()._optional_str("traverse");
-    if let Ok(Some(s)) = value {
+    let value = class_meta.inner()._optional_str("traverse")?;
+    if let Some(s) = value {
         if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
             bail_span!(
                 item,
                 "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
             );
         }
         assert_eq!(s, "manual");
         quote! {}
     } else {
         quote! {#[derive(Traverse)]}
     }
crates/vm/src/object/core.rs (1)

1094-1106: Aliasing concern with &self to *mut cast remains.

This method casts &self to *mut PyObject which can violate Rust's aliasing rules even with the safety comment. The previous review flagged this issue.

crates/stdlib/src/gc.rs (1)

191-198: get_referrers stub not implemented - previously flagged.

This function always returns an empty list despite being documented to "return the list of objects that directly refer to any of the arguments."

crates/vm/src/gc_state.rs (1)

247-263: Track the correct generation when untracking.

untrack_object always decrements gen0, which will corrupt counts for objects in gen1/gen2. Please decrement the actual generation where the object was found.

🔧 Proposed fix (per-generation decrement)
 pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) {
     let gc_ptr = GcObjectPtr(obj);
+    let mut found_gen: Option<usize> = None;

     // Remove from all generation tracking lists
-    for generation in &self.generation_objects {
+    for (gen_idx, generation) in self.generation_objects.iter().enumerate() {
         if let Ok(mut gen_set) = generation.write() {
             if gen_set.remove(&gc_ptr) {
+                found_gen = Some(gen_idx);
                 break; // Object can only be in one generation
             }
         }
     }

-    // Update counts (simplified - just decrement gen0 for now)
-    let count = self.generations[0].count.load(Ordering::SeqCst);
-    if count > 0 {
-        self.generations[0].count.fetch_sub(1, Ordering::SeqCst);
+    // Decrement the correct generation's count
+    if let Some(gen_idx) = found_gen {
+        self.generations[gen_idx].count.fetch_sub(1, Ordering::SeqCst);
     }
🧹 Nitpick comments (8)
.claude/commands/upgrade-pylib.md (2)

8-20: Consider adding notes about validation and error handling.

The workflow is clear, but consider documenting:

  • How to validate that $ARGUMENTS is a valid library name before deletion
  • What to do if the library doesn't exist in cpython/Lib/
  • Recommended practice of working in a separate branch before making destructive changes
  • Brief explanation of what --quick-upgrade does

These additions would make the documentation more robust and helpful for users unfamiliar with the process.


29-32: Helpful contextual notes.

The notes provide useful context about the upgrade process. The suggested test command is appropriate.

Optionally, you could mention that during iterative development, omitting --release may provide faster compile times at the cost of slower test execution.

crates/common/src/refcount.rs (2)

419-440: Consider panic safety for deferred context.

If f() panics, in_ctx.set(was_in_context) and flush_deferred_drops() are never called. While thread unwinding often terminates the thread, if the panic is caught upstream, the deferred context flag remains set and queued operations are lost.

♻️ Suggested RAII guard approach
struct DeferredContextGuard<'a> {
    cell: &'a Cell<bool>,
    was_in_context: bool,
}

impl Drop for DeferredContextGuard<'_> {
    fn drop(&mut self) {
        self.cell.set(self.was_in_context);
        if !self.was_in_context {
            flush_deferred_drops();
        }
    }
}

pub fn with_deferred_drops<F, R>(f: F) -> R
where
    F: FnOnce() -> R,
{
    IN_DEFERRED_CONTEXT.with(|in_ctx| {
        let _guard = DeferredContextGuard {
            cell: in_ctx,
            was_in_context: in_ctx.replace(true),
        };
        f()
    })
}

463-475: Potential issue: panic in deferred op leaves remaining ops unexecuted.

If any op() panics, the remaining operations in the queue will not execute. Consider whether this is acceptable for your use case, or if std::panic::catch_unwind should be used to ensure all operations attempt execution.

crates/stdlib/src/gc.rs (1)

259-267: Silent error handling in invoke_callbacks may hide issues.

The set_item calls (lines 261-263) and callback invocation (line 266) use let _ = to ignore errors. While this prevents GC callbacks from crashing the collection process, errors in user-provided callbacks will be silently swallowed.

Consider logging or reporting these errors via vm.run_unraisable() similar to how __del__ errors are handled, so users are aware their callbacks are failing.

♻️ Suggested improvement
         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) {
+                vm.run_unraisable(e, None, callback.clone());
+            }
         }
crates/vm/src/object/core.rs (1)

873-898: Resurrection detection logic is correct - consider adding tests.

The logic correctly handles both normal destruction and resurrection:

  1. Increment ref_count to protect del from concurrent drops (1→2)
  2. Run del and capture ref_count before and after
  3. Decrement once to undo the increment
  4. If ref_count increased during del (resurrection detected), skip the second decrement and leave the object alive
  5. If no resurrection, decrement again to reach 0

No dedicated unit tests for resurrection scenarios exist in the Rust codebase. Consider adding tests covering:

  • Normal finalization without del side effects
  • Resurrection (where del creates new references)
  • Error handling during del

Minor: The comment on lines 15–16 stating "inc() from 0 does a double increment (0→2)" is misleading in this context, since inc() is called when ref_count=1. Clarify the comment to explain the thread-safety rationale for the increment/decrement pattern.

crates/compiler-core/src/bytecode/instruction.rs (2)

282-284: Track the pending migration of LoadClosure to PseudoInstruction.

The TODO comment indicates this instruction should be moved to the pseudo-instruction enum. Consider opening an issue to track this migration to avoid it being forgotten.

Would you like me to open a GitHub issue to track this migration task?


881-892: fmt_dis panics at runtime for pseudo-instructions.

Using unimplemented!() will cause a panic if pseudo-instructions are ever formatted for display, which could happen during debugging or error reporting. Consider providing a basic implementation.

Proposed implementation
     fn fmt_dis(
         &self,
-        _arg: OpArg,
-        _f: &mut fmt::Formatter<'_>,
-        _ctx: &impl InstrDisplayContext,
+        arg: OpArg,
+        f: &mut fmt::Formatter<'_>,
+        ctx: &impl InstrDisplayContext,
         _expand_code_objects: bool,
-        _pad: usize,
+        pad: usize,
         _level: usize,
     ) -> fmt::Result {
-        unimplemented!()
+        match self {
+            Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)),
+            Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)),
+            Self::LoadAttrMethod { idx } => {
+                let name = ctx.get_name(idx.get(arg) as usize);
+                write!(f, "{:pad$}({}, {})", "LOAD_ATTR_METHOD", idx.get(arg), name)
+            }
+            Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)),
+            Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)),
+            Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)),
+            Self::PopBlock => write!(f, "POP_BLOCK"),
+            Self::SetupCleanup => write!(f, "SETUP_CLEANUP"),
+            Self::SetupFinally => write!(f, "SETUP_FINALLY"),
+            Self::SetupWith => write!(f, "SETUP_WITH"),
+            Self::StoreFastMaybeNull => write!(f, "STORE_FAST_MAYBE_NULL"),
+            Self::Reserved258 => write!(f, "RESERVED_258"),
+        }
     }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9bb9097 and ef12e44.

⛔ Files ignored due to path filters (61)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/_opcode_metadata.py is excluded by !Lib/**
  • Lib/asyncio/__main__.py is excluded by !Lib/**
  • Lib/asyncio/base_events.py is excluded by !Lib/**
  • Lib/asyncio/base_subprocess.py is excluded by !Lib/**
  • Lib/asyncio/events.py is excluded by !Lib/**
  • Lib/asyncio/format_helpers.py is excluded by !Lib/**
  • Lib/asyncio/futures.py is excluded by !Lib/**
  • Lib/asyncio/locks.py is excluded by !Lib/**
  • Lib/asyncio/proactor_events.py is excluded by !Lib/**
  • Lib/asyncio/queues.py is excluded by !Lib/**
  • Lib/asyncio/runners.py is excluded by !Lib/**
  • Lib/asyncio/selector_events.py is excluded by !Lib/**
  • Lib/asyncio/sslproto.py is excluded by !Lib/**
  • Lib/asyncio/staggered.py is excluded by !Lib/**
  • Lib/asyncio/streams.py is excluded by !Lib/**
  • Lib/asyncio/taskgroups.py is excluded by !Lib/**
  • Lib/asyncio/tasks.py is excluded by !Lib/**
  • Lib/asyncio/timeouts.py is excluded by !Lib/**
  • Lib/asyncio/transports.py is excluded by !Lib/**
  • Lib/asyncio/unix_events.py is excluded by !Lib/**
  • Lib/asyncio/windows_events.py is excluded by !Lib/**
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test__opcode.py is excluded by !Lib/**
  • Lib/test/test_ast/test_ast.py is excluded by !Lib/**
  • Lib/test/test_bdb.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_bytes.py is excluded by !Lib/**
  • Lib/test/test_class.py is excluded by !Lib/**
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_configparser.py is excluded by !Lib/**
  • Lib/test/test_context.py is excluded by !Lib/**
  • Lib/test/test_descr.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_docxmlrpc.py is excluded by !Lib/**
  • Lib/test/test_enum.py is excluded by !Lib/**
  • Lib/test/test_future_stmt/test_future.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_http_cookiejar.py is excluded by !Lib/**
  • Lib/test/test_importlib/test_threaded_import.py is excluded by !Lib/**
  • Lib/test/test_io.py is excluded by !Lib/**
  • Lib/test/test_itertools.py is excluded by !Lib/**
  • Lib/test/test_json/test_decode.py is excluded by !Lib/**
  • Lib/test/test_logging.py is excluded by !Lib/**
  • Lib/test/test_marshal.py is excluded by !Lib/**
  • Lib/test/test_multiprocessing_fork/test_threads.py is excluded by !Lib/**
  • Lib/test/test_multiprocessing_spawn/test_threads.py is excluded by !Lib/**
  • Lib/test/test_signal.py is excluded by !Lib/**
  • Lib/test/test_socket.py is excluded by !Lib/**
  • Lib/test/test_sort.py is excluded by !Lib/**
  • Lib/test/test_sqlite3/test_dbapi.py is excluded by !Lib/**
  • Lib/test/test_str.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_tarfile.py is excluded by !Lib/**
  • Lib/test/test_thread.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/test/test_time.py is excluded by !Lib/**
  • Lib/test/test_timeout.py is excluded by !Lib/**
  • Lib/test/test_unittest/test_async_case.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
📒 Files selected for processing (30)
  • .claude/commands/upgrade-pylib.md
  • .github/copilot-instructions.md
  • .github/workflows/ci.yaml
  • Cargo.toml
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/common/Cargo.toml
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/compiler-core/src/bytecode/oparg.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/jit/src/instructions.rs
  • crates/stdlib/src/gc.rs
  • crates/stdlib/src/json.rs
  • crates/stdlib/src/json/machinery.rs
  • crates/stdlib/src/opcode.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
💤 Files with no reviewable changes (1)
  • .github/workflows/ci.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/common/src/lib.rs
  • Cargo.toml
  • crates/common/Cargo.toml
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/tuple.rs
🧰 Additional context used
🧬 Code graph analysis (12)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/object/core.rs (2)
  • traverse (161-163)
  • traverse (169-171)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/object/traverse.rs (2)
  • pop_edges (39-39)
  • try_traverse (19-19)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
  • gc_state (742-744)
crates/stdlib/src/json/machinery.rs (1)
crates/stdlib/src/json.rs (1)
  • scanstring (702-732)
crates/vm/src/builtins/list.rs (5)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/builtins/set.rs (1)
  • traverse (176-179)
crates/vm/src/builtins/mappingproxy.rs (1)
  • traverse (30-35)
crates/vm/src/object/traverse.rs (1)
  • pop_edges (39-39)
crates/codegen/src/ir.rs (2)
crates/compiler-core/src/bytecode.rs (1)
  • encode_exception_table (63-75)
crates/compiler-core/src/bytecode/instruction.rs (3)
  • encode_load_attr_arg (1108-1110)
  • encode_load_super_attr_arg (1122-1124)
  • marker (1056-1058)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1930-1932)
  • try_pop_edges (1934-1936)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (19-19)
  • try_pop_edges (21-21)
crates/vm/src/gc_state.rs (2)
crates/common/src/refcount.rs (4)
  • new (143-145)
  • new (278-283)
  • with_deferred_drops (423-440)
  • state (366-368)
crates/stdlib/src/gc.rs (9)
  • enable (26-28)
  • disable (32-34)
  • get_debug (126-128)
  • set_debug (132-134)
  • get_threshold (90-99)
  • get_count (113-122)
  • get_stats (138-155)
  • get_objects (165-176)
  • collect (50-86)
crates/vm/src/frame.rs (2)
crates/compiler-core/src/bytecode/instruction.rs (1)
  • get (1074-1076)
crates/compiler-core/src/bytecode/oparg.rs (1)
  • get (81-87)
crates/stdlib/src/json.rs (1)
crates/stdlib/src/json/machinery.rs (3)
  • scanstring (140-252)
  • new (112-115)
  • len (123-128)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (17)
  • gc_state (742-744)
  • enable (164-166)
  • disable (169-171)
  • collect (368-370)
  • get_threshold (184-190)
  • set_threshold (64-66)
  • set_threshold (193-201)
  • get_count (204-210)
  • get_debug (174-176)
  • set_debug (179-181)
  • get_stats (213-219)
  • stats (68-75)
  • get_objects (297-338)
  • new (44-54)
  • new (134-156)
  • freeze (720-726)
  • get_freeze_count (715-717)
crates/vm/src/object/core.rs (3)
crates/vm/src/object/traverse_object.rs (1)
  • try_pop_edges_obj (39-39)
crates/common/src/refcount.rs (3)
  • try_defer_drop (448-461)
  • new (143-145)
  • new (278-283)
crates/vm/src/gc_state.rs (3)
  • gc_state (742-744)
  • new (44-54)
  • new (134-156)
crates/compiler-core/src/bytecode.rs (1)
crates/compiler-core/src/bytecode/instruction.rs (4)
  • decode_load_attr_arg (1114-1118)
  • decode_load_super_attr_arg (1128-1133)
  • encode_load_attr_arg (1108-1110)
  • encode_load_super_attr_arg (1122-1124)
🪛 markdownlint-cli2 (0.18.1)
.claude/commands/upgrade-pylib.md

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Collect benchmark data
  • GitHub Check: Collect what is left data
  • GitHub Check: Collect regression test data
  • GitHub Check: Collect code coverage data
🔇 Additional comments (55)
.claude/commands/upgrade-pylib.md (1)

1-7: LGTM! Clear documentation structure.

The header and arguments section clearly defines the command's purpose and expected input format.

.github/copilot-instructions.md (2)

90-106: Clear testing guidance improvements.

The distinction between using debug mode for snippet tests (faster compilation) and release mode for unittest suites (better runtime performance) is helpful and aligns with typical Rust development workflows. The consolidated note at lines 105-106 reinforces this guidance effectively.


192-197: Unable to complete verification due to repository access limitations. Manual verification is required to:

  1. Confirm DEVELOPMENT.md contains the referenced "CPython Version Upgrade Checklist" section with venvlauncher build instructions
  2. Identify the 4 venvlauncher binary names and their current build/copy approach in Cargo.toml
  3. Assess whether more specific build instructions and examples should be added to .github/copilot-instructions.md
crates/jit/src/instructions.rs (1)

563-569: LGTM - The JIT correctly handles only resolved directional jump instructions.

The JIT only handles JumpBackward, JumpBackwardNoInterrupt, and JumpForward because these are the real instructions that reach JIT compilation. The pseudo-instruction PseudoInstruction::Jump (which is compiler-only, distinct from the Instruction enum) is resolved to either JumpForward or JumpBackward during IR lowering based on whether the target offset is forward or backward relative to the current position. Any pseudo-instructions would never reach the JIT—they are lowered to real instructions before bytecode generation.

crates/common/src/refcount.rs (8)

1-16: LGTM!

Module documentation and re-exports are clear. The aliasing of pin as cs maintains API compatibility while leveraging crossbeam-epoch.


44-134: State implementation is correct, pending constant fix.

The accessor and mutator methods correctly implement bit manipulation for the packed state. Once the TOTAL_COUNT_WIDTH constant is fixed, this will work correctly.


136-168: LGTM!

The modular arithmetic helper correctly handles epoch wraparound comparisons. The transformation to/from modular space ensures proper ordering even when values wrap around.


291-304: Confusing increment logic when strong count is zero.

The logic at lines 300-303 is unclear: if old.strong() == 0, an additional COUNT is added, resulting in a final count of 2 instead of 1. The comment mentions "created a permission to run decrement again" but doesn't explain the rationale.

Additionally, the fetch_add modifies state before checking destructed(), though abort() prevents further damage.

Consider using a CAS loop (like safe_inc) for consistency, or adding documentation explaining this specific recovery pattern.


306-326: LGTM!

safe_inc correctly uses a CAS loop to atomically check the destructed flag before incrementing. This is the preferred pattern for thread-safe reference counting.


328-362: LGTM!

The dec, leak, and is_leaked methods are correctly implemented:

  • dec properly returns true when strong count reaches zero, with LEAKED objects always returning false.
  • leak uses a CAS loop for atomic flag setting.
  • is_leaked uses appropriate Acquire ordering for reads.

364-399: LGTM!

The accessors and mark_destructed are well-implemented:

  • mark_destructed correctly uses a CAS loop and validates that strong count is 0 and object is not leaked before marking.
  • Exposing the raw AtomicU64 enables advanced operations when needed.

477-495: LGTM!

The defer_destruction wrapper correctly delegates to crossbeam-epoch's defer_unchecked with appropriate safety documentation. The 3-epoch guarantee explanation is helpful for understanding the EBR semantics.

crates/derive-impl/src/util.rs (1)

366-376: Allowing pop_edges in #[pyclass] meta looks good.
This keeps the macro validation aligned with the new traversal/edge‑popping hooks.

crates/derive-impl/src/pystructseq.rs (1)

608-618: Confirm HAS_POP_EDGES = false is intentional for struct sequences.
Since these wrap PyTuple (which now supports edge‑popping), please verify that skipping pop_edges won’t leave cycles unbroken in GC paths. If you intended delegation, consider:

🔧 Optional delegation to inner PyTuple
 impl ::rustpython_vm::object::MaybeTraverse for `#pytype_ident` {
     const IS_TRACE: bool = true;
-    const HAS_POP_EDGES: bool = false;
+    const HAS_POP_EDGES: bool = true;

     fn try_traverse(&self, traverse_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
         self.0.try_traverse(traverse_fn)
     }

-    fn try_pop_edges(&mut self, _out: &mut ::std::vec::Vec<::rustpython_vm::PyObjectRef>) {
-        // Struct sequences don't need pop_edges
+    fn try_pop_edges(&mut self, out: &mut ::std::vec::Vec<::rustpython_vm::PyObjectRef>) {
+        ::rustpython_vm::object::Traverse::pop_edges(&mut self.0, out);
     }
 }
crates/derive-impl/src/pyclass.rs (1)

696-701: Emitting #maybe_traverse_code alongside the class item looks good.
Ensures the new traversal behavior is consistently generated with the class definition.

crates/vm/src/lib.rs (1)

77-81: Publicly exporting gc_state looks good.
Clear API surface for the new GC state module.

crates/vm/src/dict_inner.rs (1)

728-737: pop_all_entries efficiently clears state and drains entries.
Good fit for GC edge‑popping, and the &mut self requirement keeps locking simple.

crates/vm/src/builtins/list.rs (1)

60-74: LGTM - Traverse implementation is consistent with the GC framework.

The implementation correctly delegates traverse to the inner elements and uses try_write() for pop_edges to handle the PyRwLock. The try_write() approach is appropriate here since during GC the object should be unreachable, and if the lock is somehow held, silently skipping is safer than blocking.

One minor observation: Unlike PyTuple which uses std::mem::take(&mut self.elements), PyList must use try_write() due to interior mutability. This is the correct approach given the struct design.

crates/vm/src/builtins/function.rs (2)

46-57: LGTM - Extended traversal coverage for PyFunction.

The traverse implementation now properly visits additional fields (type_params, annotations, module, doc) that may contain references. This ensures the GC can properly track all reachable objects from a function.


82-92: No changes needed—Context::genesis() is already optimized.

The implementation correctly uses a static cached context via the get_or_init pattern, so Context::genesis() returns a reference to a static value after initialization. This is a cheap operation and the same pattern is used throughout the codebase (interpreter initialization, exception_group, etc.). Additionally, empty_str is already a &'static PyStrInterned, so there's no allocation at runtime. The .to_owned() call simply increments the refcount, which is appropriate during garbage collection. The code matches CPython's func_clear behavior and performs well.

crates/vm/src/builtins/dict.rs (2)

46-59: LGTM - Dict traversal follows established patterns.

The Traverse implementation correctly delegates to the inner entries and collects both keys and values during edge popping. This ensures all references held by the dictionary are properly handled during GC.


52-58: The pop_all_entries method is correctly implemented and accessible.

The method exists on dict_inner::Dict (alias DictContentType) at line 731 of crates/vm/src/dict_inner.rs with the signature pub fn pop_all_entries(&mut self) -> impl Iterator<Item = (PyObjectRef, T)> + '_. The pop_edges implementation correctly calls this method and consumes the iterator, pushing both keys and values to the output vector.

crates/vm/src/object/core.rs (3)

271-330: Good fix: already_cleared flag prevents double ref_count decrement.

The separation of clear() and clear_for_gc() with the already_cleared check at line 281 prevents a subtle bug where both GC clearing and normal drop could decrement ref_count. This is a solid defensive fix.


944-976: Correct ordering: untrack → pop_edges → dealloc → drop children.

The destruction sequence properly:

  1. Untracking via try_defer_drop (handles GC lock contention)
  2. Extracts child references before deallocation
  3. Deallocates the object
  4. Drops children (may trigger recursive destruction)

This breaks cycles correctly and avoids use-after-free.


1325-1344: Object tracking triggers automatic GC collection.

Objects with IS_TRACE or instance dict are tracked, and maybe_collect() is called after tracking. This enables automatic generational collection. The implementation is lightweight when thresholds aren't met—it performs early returns for disabled GC and only executes atomic loads and comparisons in the common path where collection isn't triggered.

crates/stdlib/src/gc.rs (2)

49-86: LGTM - Well-structured collect implementation with proper validation and callbacks.

The collect function correctly:

  1. Validates generation bounds (0-2)
  2. Invokes callbacks with "start" and "stop" phases
  3. Uses collect_force to run collection even when GC is disabled
  4. Migrates garbage objects to the Python-visible list

136-155: LGTM - Stats exposure follows CPython convention.

The get_stats function returns a list of dicts with collections, collected, and uncollectable keys per generation, matching CPython's API.

crates/stdlib/src/json/machinery.rs (4)

112-115: Public DecodeError::new looks fine.
No concerns.


130-196: scanstring API/doc and fast/slow path split look solid.
The char_offset contract and bytes_consumed return are clear, and the early-return path keeps counts consistent.


259-266: decode_unicode accumulator change is OK.
A u16 accumulator fits \uXXXX and keeps the bit assembly straightforward.


33-33: memchr is properly declared in the stdlib crate.

The dependency is present in crates/stdlib/Cargo.toml (line 51) as memchr = { workspace = true }, pinned to version 2.7.4 in the root workspace manifest, and resolved to 2.7.6 in the lockfile. The import at line 33 is valid.

crates/compiler-core/src/bytecode/oparg.rs (1)

81-85: LGTM — clearer ExtendedArg guard.

The matches! form preserves semantics while making the intent explicit.

crates/stdlib/src/json.rs (7)

20-40: Helpers are clear and JSON‑spec compliant.

ASCII whitespace and prefix checks are straightforward and efficient.


92-188: Byte/char index tracking looks consistent.

The dual-index approach and early StopIteration handling are coherent.


388-503: Array/object assembly and hook finalization look solid.

Whitespace handling, delimiter checks, and hook priority match expected semantics.


505-526: Token dispatch in call_scan_once is comprehensive.

The explicit branches for literals/numbers/objects/arrays are clear.

Also applies to: 538-618


619-629: No issues in centralized decode error construction.


632-663: Entry-point byte index derivation is clean and contained.

The O(n) char→byte conversion is limited to the public entry.


701-732: scanstring byte-index conversion is correct and localized.

The conversion keeps error positioning consistent with the new byte/char split.

crates/codegen/src/ir.rs (3)

88-96: Good expansion of InstructionInfo metadata.

end_location and except_handler align with the later linetable/exception-table generation.


199-255: Pseudo‑to‑real lowering for attribute loads is clean.

Encoding once here keeps the emission path uniform.


305-322: Jump direction selection reads correctly.

Using target offset vs. current offset is the right split for forward/backward encoding.

crates/compiler-core/src/bytecode.rs (1)

17-22: Re‑exports align with the new instruction model.

Public surface now matches the refactor in instruction.rs.

crates/vm/src/frame.rs (1)

10-11: Instruction alias adoption looks consistent throughout execution paths.

The updated dispatch and reraise detection are coherent with the new Instruction enum usage.

Also applies to: 390-401, 442-450, 552-1756

crates/codegen/src/compile.rs (3)

277-297: emit! macro expansion looks good.

The struct/tuple/no‑arg variants are now covered cleanly and route through the generic emission helpers as intended.


6955-6983: Generic _emit/emit_arg plumbing is solid.

The Into<AnyInstruction> flow is consistent and keeps instruction/arg/target wiring centralized.


7455-7477: EmitArg trait update is clean.

The AnyInstruction return type aligns well with the new emission path.

crates/compiler-core/src/bytecode/instruction.rs (8)

286-354: LGTM!

The conversion implementations correctly handle the non-contiguous opcode space with explicit validation. The use of transmute is safe given the repr(u8) guarantee and the size assertion.


356-389: LGTM!

The label_arg and unconditional_branch implementations correctly identify the branching instructions. The pattern matching is comprehensive for all jump-related variants.


860-862: Consider including JumpNoInterrupt in unconditional_branch.

If JumpNoInterrupt represents an unconditional jump (without interrupt checks), it should likely return true from unconditional_branch. Otherwise, code flow analysis may be incorrect.

Please verify the intended semantics of JumpNoInterrupt - if it's an unconditional jump, update accordingly:

     fn unconditional_branch(&self) -> bool {
-        matches!(self, Self::Jump { .. },)
+        matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. })
     }

834-850: LGTM!

The TryFrom<u16> implementation correctly validates the contiguous discriminant range and uses transmute safely given the repr(u16) guarantee.


920-929: LGTM!

The TryFrom<u16> logic correctly routes values 0-255 to Instruction and values 256+ to PseudoInstruction, matching the discriminant design.


931-958: LGTM!

The inst_either! macro cleanly delegates trait methods to the inner variant, reducing boilerplate while maintaining type safety.


998-1049: LGTM!

The InstructionMetadata trait is well-designed with clear documentation and useful examples. The display helper using a closure wrapper is idiomatic.


1106-1133: LGTM!

The encode/decode helpers are symmetric, well-documented, and correctly implement the bit-packing scheme for opargs.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 23 to 27
```
/upgrade-pylib inspect
/upgrade-pylib json
/upgrade-pylib asyncio
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add language specifier to the code block.

The code block is missing a language specifier, which is a markdown best practice for proper syntax highlighting and rendering.

📝 Proposed fix
-```
+```text
 /upgrade-pylib inspect
 /upgrade-pylib json
 /upgrade-pylib asyncio

Based on static analysis hints.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
/upgrade-pylib inspect
/upgrade-pylib json
/upgrade-pylib asyncio
```
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In @.claude/commands/upgrade-pylib.md around lines 23 - 27, The fenced code
block containing the three commands (/upgrade-pylib inspect, /upgrade-pylib
json, /upgrade-pylib asyncio) is missing a language specifier; update that
fenced block to include a language tag (e.g., "text") after the opening
backticks so the block reads as a language-specified code fence to enable proper
syntax highlighting and rendering.

Comment on lines 259 to 266
PseudoInstruction::JumpNoInterrupt { .. }
| PseudoInstruction::Reserved258
| PseudoInstruction::SetupCleanup
| PseudoInstruction::SetupFinally
| PseudoInstruction::SetupWith
| PseudoInstruction::StoreFastMaybeNull => {
unimplemented!("Got a placeholder pseudo instruction ({instr:?})")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's look at the file structure and context
cd crates/codegen/src && wc -l ir.rs

Repository: youknowone/RustPython

Length of output: 74


🏁 Script executed:

#!/bin/bash
# Get the context around the mentioned lines
cd crates/codegen/src && sed -n '250,280p' ir.rs

Repository: youknowone/RustPython

Length of output: 1810


🏁 Script executed:

#!/bin/bash
# Search for all references to these pseudo-instructions
rg -n "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type rs

Repository: youknowone/RustPython

Length of output: 91


🏁 Script executed:

#!/bin/bash
# Look for lowering logic or where these pseudo-instructions are created/emitted
rg -n "PseudoInstruction::(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type rs -A 5 -B 5

Repository: youknowone/RustPython

Length of output: 91


🏁 Script executed:

#!/bin/bash
# Search for all references to these pseudo-instructions (corrected command)
rg -n "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type-list | head -5

Repository: youknowone/RustPython

Length of output: 454


🏁 Script executed:

#!/bin/bash
# Try simpler rg command without type filter
rg -n "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" crates/

Repository: youknowone/RustPython

Length of output: 1984


🏁 Script executed:

#!/bin/bash
# Look for PseudoInstruction enum definition to understand where these are defined
rg -n "enum PseudoInstruction" crates/ -A 50

Repository: youknowone/RustPython

Length of output: 4253


🏁 Script executed:

#!/bin/bash
# Search for where these instructions might be emitted/created
rg -n "PseudoInstruction::" crates/codegen/ | head -30

Repository: youknowone/RustPython

Length of output: 2856


🏁 Script executed:

#!/bin/bash
# Check if there's any lowering or validation that prevents these from being reached
rg -n "unimplemented" crates/codegen/src/ir.rs

Repository: youknowone/RustPython

Length of output: 159


🏁 Script executed:

#!/bin/bash
# Check if any code tries to emit or create these placeholder instructions
rg -n "SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull|JumpNoInterrupt|Reserved258" crates/codegen/src/compile.rs

Repository: youknowone/RustPython

Length of output: 208


🏁 Script executed:

#!/bin/bash
# Look at the full context of the match statement to understand the flow
cd crates/codegen/src && sed -n '200,280p' ir.rs

Repository: youknowone/RustPython

Length of output: 4532


🏁 Script executed:

#!/bin/bash
# Search for where PseudoInstructions are created to see all emission sites
rg -n "PseudoInstruction::" crates/codegen/src/compile.rs | grep -E "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)"

Repository: youknowone/RustPython

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Check if there's validation or error handling elsewhere that prevents these from being emitted
rg -n "placeholder" crates/codegen/ -i

Repository: youknowone/RustPython

Length of output: 184


🏁 Script executed:

#!/bin/bash
# Search across the entire codebase for any attempt to create these placeholder instructions
rg -n "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" --type-list 2>/dev/null || echo "Getting all file types"
rg "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" crates/ | grep -v "^crates/codegen/src/ir.rs:" | grep -v "^crates/codegen/src/compile.rs:" | grep -v "^crates/compiler-core/src/bytecode/instruction.rs:"

Repository: youknowone/RustPython

Length of output: 87


🏁 Script executed:

#!/bin/bash
# Verify the comments about not emitting SetupFinally
rg -n "SetupFinally" crates/codegen/src/compile.rs -B 5 -A 5

Repository: youknowone/RustPython

Length of output: 1286


🏁 Script executed:

#!/bin/bash
# Search for the emit! macro to see all pseudo instruction emissions in compile.rs
rg -n 'emit!\(.*PseudoInstruction::' crates/codegen/src/compile.rs

Repository: youknowone/RustPython

Length of output: 1665


🏁 Script executed:

#!/bin/bash
# Check if there's any documentation or comments about these being unreachable
rg -n "unreachable|placeholder|Placeholder" crates/compiler-core/src/bytecode/instruction.rs -i

Repository: youknowone/RustPython

Length of output: 1407


Improve error handling for unreachable placeholder pseudo-instructions.

These placeholder instructions (JumpNoInterrupt, Reserved258, SetupCleanup, SetupFinally, SetupWith, StoreFastMaybeNull) are never emitted by the compiler pipeline and are dead code. Rather than unimplemented!(), replace with a panic or error that clearly identifies what went wrong—e.g., panic!("Unreachable: {instr:?} should never be emitted"). This provides better diagnostics if bytecode is corrupted or manually constructed.

🤖 Prompt for AI Agents
In `@crates/codegen/src/ir.rs` around lines 259 - 266, Replace the
unimplemented!() call for the unreachable placeholder variants
PseudoInstruction::JumpNoInterrupt, Reserved258, SetupCleanup, SetupFinally,
SetupWith, and StoreFastMaybeNull with a panic (or other hard failure) that
clearly states the instruction is unreachable and includes the instruction debug
info (e.g., panic!("Unreachable: {instr:?} should never be emitted")). Update
the match arm that currently contains unimplemented!("Got a placeholder pseudo
instruction ({instr:?})") so it uses the new panic message to improve
diagnostics when bytecode is corrupted or manually constructed.

Comment on lines 853 to 1051
fn label_arg(&self) -> Option<Arg<Label>> {
match self {
Self::Jump { target: l } => Some(*l),
_ => None,
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

JumpNoInterrupt label is not returned by label_arg.

JumpNoInterrupt has a target: Arg<Label> field but is not included in label_arg. This will cause label resolution issues if this instruction is used.

Proposed fix
     fn label_arg(&self) -> Option<Arg<Label>> {
         match self {
-            Self::Jump { target: l } => Some(*l),
+            Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l),
             _ => None,
         }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn label_arg(&self) -> Option<Arg<Label>> {
match self {
Self::Jump { target: l } => Some(*l),
_ => None,
}
}
fn label_arg(&self) -> Option<Arg<Label>> {
match self {
Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l),
_ => None,
}
}
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 853 - 858, The
match in function label_arg does not account for the JumpNoInterrupt variant, so
its target Arg<Label> is never returned; update label_arg (in crate's
instruction.rs) to match both Self::Jump { target: l } and Self::JumpNoInterrupt
{ target: l } and return Some(*l) for both cases (keeping the existing _ => None
fallback) so label resolution handles JumpNoInterrupt correctly.

Comment on lines 229 to 236
fn parse_object(
&self,
pystr: PyStrRef,
start_char_idx: usize,
start_byte_idx: usize,
scan_once: &PyObjectRef,
memo: &mut HashMap<String, PyStrRef>,
vm: &VirtualMachine,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve WTF‑8 for JSON strings (lossy conversion breaks CPython compatibility).

Wtf8Buf::to_string() drops unpaired surrogates, so keys/values like "\ud800" are corrupted and key memoization can collapse distinct strings. Use a WTF‑8‑preserving constructor and memoize by Wtf8Buf (or PyStrRef) instead of String.

🔧 Suggested fix (preserve WTF‑8)
-            memo: &mut HashMap<String, PyStrRef>,
+            memo: &mut HashMap<Wtf8Buf, PyStrRef>,

-                let key_str = key_wtf8.to_string();
-                let key: PyObjectRef = match memo.get(&key_str) {
+                let key: PyObjectRef = match memo.get(&key_wtf8) {
                     Some(cached) => cached.clone().into(),
                     None => {
-                        let py_key = vm.ctx.new_str(key_str.clone());
-                        memo.insert(key_str, py_key.clone());
+                        let py_key = vm.ctx.new_str(key_wtf8.clone());
+                        memo.insert(key_wtf8.clone(), py_key.clone());
                         py_key.into()
                     }
                 };

-                    let py_str = vm.ctx.new_str(wtf8_result.to_string());
+                    let py_str = vm.ctx.new_str(wtf8_result);

Also applies to: 283-291, 527-535

🤖 Prompt for AI Agents
In `@crates/stdlib/src/json.rs` around lines 229 - 236, The JSON parser currently
calls Wtf8Buf::to_string() and uses a HashMap<String, PyStrRef> for memoization
which loses unpaired surrogates; change parse_object (and the other similar
sites) to preserve WTF-8 by constructing keys/values from the Wtf8Buf directly
(or use PyStrRef that preserves WTF-8) instead of calling Wtf8Buf::to_string(),
and change the memo map type to use the WTF-8 preserving key (e.g.,
HashMap<Wtf8Buf, PyStrRef> or HashMap<PyStrRefKey, PyStrRef>) so keys like
"\ud800" remain distinct and are not corrupted.

Comment on lines 83 to 99
matches!(
Self::try_from(opcode).map(|op| op.inner()),
Ok(AnyInstruction::Real(
Instruction::DeleteAttr { .. }
| Instruction::DeleteGlobal(_)
| Instruction::DeleteName(_)
| Instruction::ImportFrom { .. }
| Instruction::ImportName { .. }
| Instruction::LoadAttr { .. }
| Instruction::LoadGlobal(_)
| Instruction::LoadAttrMethod { .. }
| Instruction::LoadName(_)
| Instruction::StoreAttr { .. }
| Instruction::StoreGlobal(_)
| Instruction::StoreName(_))
)
| Instruction::StoreName(_)
) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. }))
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Include super pseudo-opcodes in has_name.

compile.rs emits PseudoInstruction::LoadSuperMethod, PseudoInstruction::LoadZeroSuperMethod, and PseudoInstruction::LoadZeroSuperAttr, all of which carry name indices. opcode.has_name should return true for those to keep co_names handling/disassembly consistent.

🛠️ Suggested fix
-                ) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. }))
+                )
+                | AnyInstruction::Pseudo(
+                    PseudoInstruction::LoadAttrMethod { .. }
+                        | PseudoInstruction::LoadSuperMethod { .. }
+                        | PseudoInstruction::LoadZeroSuperMethod { .. }
+                        | PseudoInstruction::LoadZeroSuperAttr { .. }
+                ))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
matches!(
Self::try_from(opcode).map(|op| op.inner()),
Ok(AnyInstruction::Real(
Instruction::DeleteAttr { .. }
| Instruction::DeleteGlobal(_)
| Instruction::DeleteName(_)
| Instruction::ImportFrom { .. }
| Instruction::ImportName { .. }
| Instruction::LoadAttr { .. }
| Instruction::LoadGlobal(_)
| Instruction::LoadAttrMethod { .. }
| Instruction::LoadName(_)
| Instruction::StoreAttr { .. }
| Instruction::StoreGlobal(_)
| Instruction::StoreName(_))
)
| Instruction::StoreName(_)
) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. }))
)
}
matches!(
Self::try_from(opcode).map(|op| op.inner()),
Ok(AnyInstruction::Real(
Instruction::DeleteAttr { .. }
| Instruction::DeleteGlobal(_)
| Instruction::DeleteName(_)
| Instruction::ImportFrom { .. }
| Instruction::ImportName { .. }
| Instruction::LoadAttr { .. }
| Instruction::LoadGlobal(_)
| Instruction::LoadName(_)
| Instruction::StoreAttr { .. }
| Instruction::StoreGlobal(_)
| Instruction::StoreName(_)
)
| AnyInstruction::Pseudo(
PseudoInstruction::LoadAttrMethod { .. }
| PseudoInstruction::LoadSuperMethod { .. }
| PseudoInstruction::LoadZeroSuperMethod { .. }
| PseudoInstruction::LoadZeroSuperAttr { .. }
))
)
}
🤖 Prompt for AI Agents
In `@crates/stdlib/src/opcode.rs` around lines 83 - 99, The has_name() matcher in
opcode.rs currently treats some PseudoInstruction variants as name-bearing but
misses the super pseudo-opcodes; update the pattern in has_name (the matches!
over Self::try_from(opcode).map(|op| op.inner())) to also consider
AnyInstruction::Pseudo(PseudoInstruction::LoadSuperMethod { .. }),
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperMethod { .. }), and
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperAttr { .. }) so those
pseudo-opcodes return true for has_name().

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@Cargo.toml`:
- Around line 199-200: The Cargo.toml currently pins the atomic crate at "0.5",
which is outdated; update the dependency entry for atomic (the "atomic" key in
Cargo.toml alongside memoffset) to "0.6" (e.g., change atomic = "0.5" to atomic
= "0.6"), then run cargo update -p atomic to refresh Cargo.lock and verify the
build and tests; leave memoffset unchanged unless you also want to bump it
separately.

In `@crates/common/src/refcount.rs`:
- Around line 297-313: with_deferred_drops currently sets the thread-local
IN_DEFERRED_CONTEXT, calls f, and resets the flag only after f returns, so a
panic in f leaves the flag set and prevents flush_deferred_drops; fix by
introducing an RAII guard type (e.g., DeferredDropsGuard) inside
with_deferred_drops that captures the previous IN_DEFERRED_CONTEXT value on
construction and, in its Drop impl, restores IN_DEFERRED_CONTEXT and calls
flush_deferred_drops if this guard is the outermost context; create the guard
before invoking f so the Drop runs on unwind and ensures proper reset/flush
regardless of panic.

In `@crates/vm/src/gc_state.rs`:
- Around line 226-234: track_object currently increments generations[0].count
and alloc_count before acquiring the generation_objects[0] write lock, which
races with collect_inner calling generations[0].count.store(0, ...) and can lose
concurrent allocations; to fix, ensure gen0 count is updated/cleared under the
same lock by replacing every unconditional generations[0].count.store(0, ...) in
collect_inner with a call to self.reset_gen0_count_from_set() so the count is
derived from the protected generation_objects[0] set, and keep track_object
as-is but rely on reset_gen0_count_from_set to reconcile counts atomically with
the gen0 write lock (apply the same replacement at the other occurrences
flagged: the collect_inner spots around lines noted and any other places that
zero gen0 directly).
♻️ Duplicate comments (9)
crates/derive-impl/src/pyclass.rs (2)

590-606: Don’t swallow malformed traverse attributes.

.ok().flatten() turns any parse error into “no value,” so invalid usages (e.g., non-string values) silently derive Traverse. Please handle the “flag-only” case explicitly while still propagating real parse errors.


632-639: TraverseFn lifetime missing in generated impl.

MaybeTraverse::try_traverse requires &mut TraverseFn<'_>. The current signature won’t satisfy the trait and should fail to compile.

🐛 Proposed fix
-                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
+                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
                     `#try_traverse_body`
                 }
crates/common/src/refcount.rs (2)

30-37: LEAKED bit overlaps weak count.

TOTAL_COUNT_WIDTH subtracts only 2 bits, but there are 3 flag bits (DESTRUCTED, WEAKED, LEAKED). With EPOCH_WIDTH = 0, LEAKED (bit 61) overlaps the weak-count range (bits 31–61). This corrupts counts.

🐛 Proposed fix
-pub const TOTAL_COUNT_WIDTH: u32 = u64::BITS - EPOCH_WIDTH - 2;
+pub const TOTAL_COUNT_WIDTH: u32 = u64::BITS - EPOCH_WIDTH - 3; // DESTRUCTED, WEAKED, LEAKED

136-139: State layout comment is stale.

The comment assumes 4 epoch bits and 28/29 counts, but EPOCH_WIDTH is 0 and the actual bit allocation differs. Please update the doc (or make it conditional on EPOCH_WIDTH).

crates/stdlib/src/gc.rs (1)

191-198: Incomplete implementation: get_referrers always returns an empty list.

This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This was flagged in a past review.

Consider either implementing it by scanning tracked objects, or raising NotImplementedError to make the limitation explicit rather than silently returning incorrect results.

crates/vm/src/object/core.rs (1)

1172-1177: Potential undefined behavior: casting &self to *mut violates aliasing rules.

gc_pop_edges takes &self but casts to *mut PyObject. Even with the safety comment about exclusive GC access, creating a mutable pointer from a shared reference can violate Rust's aliasing rules.

This was flagged in a previous review. Consider using only the gc_pop_edges_raw variant with pointers obtained through safe means, or using UnsafeCell for fields that need mutation during GC.

crates/vm/src/gc_state.rs (3)

247-265: Gen count decrement is incorrect for non‑gen0 objects.
untrack_object always decrements gen0, which desynchronizes counts for objects tracked in gen1/gen2.


704-723: Promotion doesn’t update generation counts.
Objects move between generation sets without adjusting GcGeneration.count, so thresholds and stats drift.


735-747: Freeze/unfreeze only changes counters; objects remain collectible.
Without moving pointers out of generation_objects, frozen objects can still be collected and get_objects stays inconsistent.

🧹 Nitpick comments (2)
crates/vm/src/vm/mod.rs (1)

1152-1219: Heuristic-based module classification may be fragile.

The module finalization logic relies on:

  1. Hardcoded stdlib module names (may become stale as stdlib evolves)
  2. Path substring checks ("pylib", "Lib") that may not work across all deployment scenarios (e.g., custom installations, virtual environments, frozen executables)

Consider:

  • Using a more robust marker (e.g., a flag set during import from frozen/builtin sources)
  • Documenting that the list needs maintenance when stdlib changes
  • Adding "itertools", "operator", "re" and other commonly-used stdlib modules to the skip list if they're needed during finalization

That said, this matches CPython's approach of best-effort module cleanup during shutdown.

crates/stdlib/src/gc.rs (1)

244-267: Consider reporting callback errors as unraisable exceptions.

Callback errors are silently ignored (let _ = callback.call(...)). In CPython, errors in GC callbacks are typically reported as unraisable exceptions rather than being completely silenced.

Consider using vm.run_unraisable() to report callback errors:

♻️ Suggested improvement
         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) {
+                vm.run_unraisable(e, Some("gc callback".to_owned()), callback);
+            }
         }
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef12e44 and de0bb10.

⛔ Files ignored due to path filters (10)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_concurrent_futures/test_process_pool.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_threading.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
📒 Files selected for processing (27)
  • Cargo.toml
  • crates/codegen/src/compile.rs
  • crates/common/Cargo.toml
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/stdlib/src/gc.rs
  • crates/stdlib/src/multiprocessing.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (8)
  • crates/vm/src/lib.rs
  • crates/derive-impl/src/util.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/builtins/str.rs
  • crates/common/src/lib.rs
  • crates/common/Cargo.toml
  • crates/vm/src/vm/thread.rs
🧰 Additional context used
🧬 Code graph analysis (12)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
  • reactivate_guard (56-62)
crates/stdlib/src/multiprocessing.rs (1)
crates/vm/src/stdlib/os.rs (2)
  • __reduce__ (715-717)
  • __reduce__ (775-777)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
  • debug_obj (87-93)
  • drop_dealloc_obj (84-86)
  • try_pop_edges_obj (128-131)
  • try_trace_obj (96-100)
crates/vm/src/builtins/tuple.rs (2)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/object/traverse.rs (2)
  • traverse (34-34)
  • pop_edges (39-39)
crates/vm/src/vm/context.rs (3)
crates/vm/src/builtins/list.rs (1)
  • new_ref (91-93)
crates/vm/src/object/core.rs (2)
  • new_ref (1398-1415)
  • default (420-422)
crates/vm/src/gc_state.rs (1)
  • default (128-130)
crates/vm/src/vm/interpreter.rs (2)
crates/vm/src/stdlib/atexit.rs (1)
  • _run_exitfuncs (36-47)
crates/vm/src/gc_state.rs (1)
  • gc_state (758-760)
crates/vm/src/builtins/dict.rs (2)
crates/vm/src/dict_inner.rs (2)
  • traverse (40-42)
  • traverse (84-94)
crates/vm/src/object/traverse.rs (2)
  • traverse (34-34)
  • pop_edges (39-39)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/object/traverse.rs (5)
  • traverse (34-34)
  • traverse (43-45)
  • traverse (49-51)
  • traverse (55-55)
  • pop_edges (39-39)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • ensure_pinned (42-48)
  • drop_guard (67-71)
crates/vm/src/builtins/list.rs (5)
crates/vm/src/builtins/function.rs (2)
  • traverse (47-58)
  • pop_edges (60-97)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/dict_inner.rs (1)
  • traverse (40-42)
crates/vm/src/builtins/mappingproxy.rs (1)
  • traverse (30-35)
crates/vm/src/object/traverse.rs (1)
  • pop_edges (39-39)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (16)
  • gc_state (758-760)
  • enable (164-166)
  • disable (169-171)
  • collect (370-372)
  • get_threshold (184-190)
  • set_threshold (64-66)
  • set_threshold (193-201)
  • get_count (204-210)
  • get_debug (174-176)
  • set_debug (179-181)
  • get_stats (213-219)
  • stats (68-75)
  • get_objects (299-340)
  • freeze (736-742)
  • unfreeze (745-748)
  • get_freeze_count (731-733)
crates/codegen/src/compile.rs (2)
crates/compiler-core/src/bytecode/instruction.rs (2)
  • marker (1056-1058)
  • new (1061-1063)
crates/compiler-core/src/bytecode/oparg.rs (2)
  • null (19-21)
  • null (42-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
🔇 Additional comments (78)
crates/vm/src/stdlib/thread.rs (3)

3-6: Re-export list looks fine.


661-760: Exception hook flow looks solid and robust.

Finalization suppression and stderr routing are well handled, and the traceback output path is clear.


429-458: Make EBR guard drop panic-safe via scopeguard.

The drop_guard() call runs only on normal exit. If func.invoke() or cleanup code panics, the guard won't be released until TLS teardown, which delays epoch advancement. Using scopeguard::defer! ensures cleanup happens regardless of panic.

♻️ Suggested change
 fn run_thread(func: ArgCallable, args: FuncArgs, vm: &VirtualMachine) {
     // Enter EBR critical section for this thread (Coarse-grained pinning)
     // This ensures GC won't free objects while this thread might access them
     crate::vm::thread::ensure_pinned();
+    scopeguard::defer! {
+        // Drop EBR guard even if we panic during thread execution/cleanup
+        crate::vm::thread::drop_guard();
+    }

     match func.invoke(args, vm) {
         Ok(_obj) => {}
         Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {}
         Err(exc) => {
@@
     crate::vm::thread::cleanup_current_thread_frames(vm);
     vm.state.thread_count.fetch_sub(1);

-    // Drop EBR guard when thread exits, allowing epoch advancement
-    crate::vm::thread::drop_guard();
+    // drop_guard handled by scopeguard
 }
crates/vm/src/stdlib/sys.rs (1)

829-832: Frame cloning here looks safe and intentional.

Using frame.as_object().to_owned() preserves the FrameRef while storing a proper owned PyObjectRef in the dict; semantics remain correct.

crates/vm/src/object/traverse_object.rs (1)

14-43: pop_edges vtable wiring looks consistent.

The conditional hook (HAS_POP_EDGEStry_pop_edges_obj::<T>) mirrors the trace path and keeps the vtable cohesive.

crates/vm/src/vm/context.rs (1)

55-58: GC state fields wired into Context cleanly.

The new gc_callbacks / gc_garbage lists are initialized alongside other core objects, which matches the GC module’s needs.

crates/stdlib/src/multiprocessing.rs (1)

550-555: LGTM - Consistent with existing unpicklable type patterns.

The __reduce__ implementation correctly prevents direct pickling of SemLock objects. The error message and pattern match similar implementations in os.rs for DirEntry and ScandirIterator.

crates/vm/src/dict_inner.rs (1)

727-737: LGTM - Clean implementation for GC edge collection.

The method correctly:

  1. Requires &mut self to ensure exclusive access during GC
  2. Resets internal state (used, filled, indices)
  3. Uses drain(..).flatten() to handle Option<DictEntry<T>> entries
  4. Returns an iterator yielding owned (key, value) pairs

This aligns well with the broader GC traversal framework introduced in this PR.

crates/vm/src/vm/mod.rs (2)

553-562: LGTM - Correct suppression of unraisable exceptions during finalization.

The early return when finalizing is set matches CPython's behavior of silently ignoring __del__ errors and daemon thread exceptions during interpreter shutdown. The Acquire memory ordering is correct for synchronizing with the Release store that sets the flag.


673-678: LGTM - Appropriate EBR guard reactivation at frame boundary.

Reactivating the EBR guard after a frame completes allows the garbage collector to advance epochs and free deferred objects. The frame boundary is a safe point where no Python objects should be in an inconsistent state.

crates/vm/src/builtins/list.rs (2)

27-33: LGTM - Correct pyclass configuration for manual traversal.

The traverse = "manual" and pop_edges attributes correctly signal that this type implements Traverse manually rather than via derive macro.


60-74: Silent failure on lock contention may leave edges uncollected.

The pop_edges implementation uses try_write() which silently returns if the lock cannot be acquired. While the comment explains that during GC the object should be unreachable, there's a potential edge case where edges remain uncollected if contention occurs.

For comparison, PyTuple::pop_edges (from relevant snippets) uses std::mem::take directly on &mut self.elements without lock concerns.

Consider either:

  1. Adding a debug assertion or log when try_write() fails during GC
  2. Documenting that this is expected behavior under certain conditions

Otherwise, the implementation is functionally correct for the typical GC scenario.

crates/vm/src/builtins/dict.rs (2)

33-39: LGTM - Correct pyclass configuration for manual traversal.

Consistent with the PyList changes - properly configured for manual Traverse implementation.


46-59: LGTM - Clean integration with Dict::pop_all_entries.

The Traverse implementation correctly:

  1. Delegates traverse to the underlying DictContentType
  2. Uses pop_all_entries() to drain all entries during GC
  3. Pushes both keys and values to the output vector

This properly integrates with the new pop_all_entries method added to dict_inner.rs.

crates/vm/src/builtins/tuple.rs (3)

6-6: LGTM!

Import of Traverse and TraverseFn is correctly added to support the new traversal implementation.


28-28: LGTM!

The pyclass attribute correctly declares manual traversal and pop_edges capability, consistent with the pattern used in PyDict and PyList.


40-52: LGTM!

The Traverse implementation is correct:

  • traverse properly delegates to elements.traverse() to visit contained objects
  • pop_edges correctly takes ownership via std::mem::take and extends the output vector
  • The SAFETY comment appropriately documents the safety invariants

This matches the established pattern for other container types (PyDict, PyList) in this PR.

crates/vm/src/vm/interpreter.rs (2)

113-120: LGTM!

The finalization step documentation accurately reflects the new shutdown sequence including the finalizing flag, threading._shutdown call, and two-pass GC collection around module cleanup.


152-161: LGTM!

The two-pass GC collection around module finalization follows CPython's shutdown semantics:

  1. First pass collects cycles before module cleanup
  2. Module finalization breaks references in module namespaces
  3. Second pass collects newly-unreachable cyclic garbage

Using collect_force(2) ensures full generation-2 collection regardless of GC enabled state, which is correct for shutdown.

crates/vm/src/builtins/function.rs (4)

12-12: LGTM!

Import and pyclass attribute changes are correct for enabling manual traversal and pop_edges capability.

Also applies to: 28-28


46-58: LGTM!

The traverse implementation correctly visits all fields that may contain object references:

  • globals, closure, defaults_and_kwdefaults (existing)
  • type_params, annotations, module, doc (newly added)

Locking before traversal ensures thread-safe access to the guarded values.


76-96: Well-documented edge clearing semantics.

The comments clearly explain the rationale for:

  • Not clearing annotations (shared dict concern)
  • Clearing name/qualname (str subclass cycle potential)
  • Not clearing globals/builtins/code (CPython func_clear requirement)

This follows CPython's func_clear semantics appropriately.


60-97: Verify Context::genesis() availability and try_lock() behavior during GC shutdown.

The code uses Context::genesis().empty_str (lines 86, 91) to replace name and qualname strings. Confirm that the genesis context is guaranteed to be available and valid during late-stage garbage collection and program shutdown. Additionally, verify whether silent failures from try_lock() (lines 67, 85, 89) are acceptable—while safe, unpopped edges from lock contention during GC may require fallback handling or documentation. The code comments justify clearing name/qualname due to str subclass cycles and explain why globals/builtins/code are not cleared; consider extending these comments to address the genesis context and try_lock() patterns.

crates/stdlib/src/gc.rs (3)

12-22: LGTM!

Debug flag constants correctly mirror the gc_state module definitions, providing Python-level access to GC debugging options.


49-86: LGTM!

The collect function correctly:

  • Validates generation bounds (0-2)
  • Invokes callbacks with "start" phase before collection
  • Uses collect_force which runs regardless of GC enabled state
  • Transfers garbage objects to the Python-visible gc.garbage list
  • Invokes callbacks with "stop" phase including collection stats

This matches CPython's gc.collect() behavior.


157-176: LGTM!

The get_objects implementation correctly:

  • Uses OptionalArg<Option<i32>> to handle both missing argument and explicit None
  • Validates generation bounds (0-2) with a clear error message
  • Delegates to gc_state().get_objects() for the actual work
crates/vm/src/object/core.rs (6)

127-131: LGTM!

The try_pop_edges_obj function correctly forwards to the payload's try_pop_edges method through proper pointer casting.


229-233: LGTM!

Good handling of the resurrection edge case where inner.obj was cleared by GC but the object was subsequently resurrected (e.g., during __del__). Restoring the pointer allows new weakrefs to be created correctly.


264-315: LGTM!

The two new GC-specific weakref clearing methods implement CPython's weakref semantics correctly:

  • clear_for_gc: Clears and calls callbacks but doesn't decrement ref_count (owner not being dropped)
  • clear_for_gc_collect_callbacks: Clears without callbacks, enabling the two-phase approach where ALL weakrefs are cleared BEFORE any callbacks run

This prevents callbacks from seeing inconsistent state where some objects have cleared weakrefs and others don't.


1396-1415: LGTM!

The tracking logic correctly handles both cases:

  • Objects with IS_TRACE = true (has trace function for traversal)
  • Objects with instance dicts (user-defined class instances)

Calling maybe_collect() after tracking ensures GC can run on allocation, matching CPython's behavior. The condition T::IS_TRACE || has_dict aligns with the untrack condition in drop_slow.


903-906: LGTM!

Making gc_finalized public enables the Python-facing gc.is_finalized() function to check if an object's __del__ has been called.


991-1029: Verify the SAFETY claim about untrack_object implementation.

The drop_slow implementation's overall structure is sound:

  1. Untrack from GC before deallocation (deferred if lock contention via try_defer_drop)
  2. Extract child edges via pop_edges
  3. Deallocate the object
  4. Drop child edges (triggers recursive destruction with cycles broken)

However, the SAFETY comment at lines 1009-1011 claims that untrack_object "only removes the pointer address from a HashSet" and "does NOT dereference the pointer." This is critical: if untrack_object actually performs any pointer dereference after the object is deallocated, this would be undefined behavior. Confirm that untrack_object truly only performs address-based operations (e.g., hashset removal) without dereferencing the pointer.

crates/vm/src/gc_state.rs (2)

33-83: GcGeneration API looks solid and thread-safe.
The atomic counters + mutexed stats give a clean, low-risk surface for callers.


158-219: Public GC state accessors are clear and consistent.
The enable/disable, debug, thresholds, counts, and stats APIs line up with expected GC control surfaces.

crates/codegen/src/compile.rs (44)

37-38: Import addition looks correct.


42-45: Bytecode import expansion aligns with the new emission path.


277-297: emit! macro coverage looks solid for struct/tuple/no-arg variants.


1188-1193: Confirm PopBlock pseudo-instruction lowering in unwind path.
This unwind branch now emits PseudoInstruction::PopBlock; please ensure the lowering phase handles it correctly for with/async-with cleanup.


1913-1936: Control-flow lowering to pseudo jump looks fine.


2520-2522: Pseudo jump to end block is appropriate here.


2551-2564: Please double-check stack shape before RERAISE.
This path restores prev_exc via Copy + PopExcept while leaving prev_exc on stack; confirm that the VM/exception table expects this layout.


2604-2605: Pseudo jump to else block looks correct.


2686-2690: Jump to handler_normal_exit via pseudo op is fine.


2734-2739: Pseudo jump to finally block looks correct.


2810-2812: Pseudo jump to end block looks fine.


2850-2863: Please re-verify PopExcept placement with lasti on stack.
Stack comments indicate [lasti, prev_exc, exc] remains after PopExcept; please confirm this matches exception table expectations.


2944-2945: Pseudo jump in try flow looks fine.*


3056-3058: Pseudo jump to next handler block looks fine.


3084-3086: Pseudo jump in handler-error path looks fine.


3108-3112: Pseudo jump to reraise_star block looks correct.


3162-3163: Pseudo jump to end block looks correct.


3202-3204: Pseudo jump to end block looks correct.


4043-4048: While-loop backedge via pseudo jump is fine.


4181-4185: Normal-exit jump in with-statement looks correct.


4253-4257: Suppress-path jump to after-block looks correct.


4341-4342: For-loop backedge via pseudo jump is fine.


4398-4403: Pattern fail-pop jump via pseudo op looks fine.


5414-5416: Cleanup jump in chained comparison looks correct.


5856-5858: Yield-from backedge via pseudo jump looks fine.


5931-5935: Confirm lowering for LoadZeroSuperAttr pseudo-op.
Please verify the lowering phase converts this to the correct real instruction with proper flags.


6121-6127: List-comp init via AnyInstruction looks good.


6148-6154: Set-comp init via AnyInstruction looks good.


6177-6182: Dict-comp init via AnyInstruction looks good.


6257-6262: Ternary expression jump via pseudo op looks fine.


6383-6388: Confirm lowering for LoadSuperMethod/LoadZeroSuperMethod pseudo-ops.
Ensure the lowered instruction preserves the method-call fast path semantics.


6393-6397: Confirm lowering for LoadAttrMethod pseudo-op.
Make sure the lowered instruction still produces the expected [method, self_or_null] stack layout.


6541-6543: compile_comprehension signature update looks consistent.


6687-6689: Comprehension loop backedge via pseudo jump looks fine.


6764-6766: compile_inlined_comprehension signature update looks consistent.


6885-6887: Inlined comprehension loop backedge via pseudo jump looks fine.


6899-6901: Inlined comprehension normal-path jump looks fine.


6973-6981: Generic _emit with AnyInstruction looks correct.


6989-6991: emit_no_arg generic helper looks good.


6993-6999: emit_arg generic helper looks good.


7021-7024: ReturnConst folding with AnyInstruction looks correct.


7472-7476: EmitArg trait signature update looks correct.


7478-7485: EmitArg impl for OpArgType looks correct.


7488-7494: EmitArg impl for BlockIdx looks correct.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Cargo.toml Outdated
Comment on lines 199 to 200
memoffset = "0.9"
atomic = "0.5"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

echo "=== Checking memoffset ==="
curl -s https://crates.io/api/v1/crates/memoffset | jq '{name: .crate.name, newest_version: .crate.newest_version, max_stable_version: .crate.max_stable_version}'

echo -e "\n=== Checking atomic ==="
curl -s https://crates.io/api/v1/crates/atomic | jq '{name: .crate.name, newest_version: .crate.newest_version, max_stable_version: .crate.max_stable_version}'

echo -e "\n=== Checking rustsec database for advisories ==="
# Check if rustsec cargo is available, if not use web API
if command -v cargo-tree &> /dev/null; then
  echo "Checking with cargo-audit..."
  # We'll use a simple approach to check published advisories
fi

# Check memoffset and atomic versions on crates.io for any yanked versions
echo -e "\n=== Checking memoffset versions ==="
curl -s https://crates.io/api/v1/crates/memoffset/versions | jq '.versions[] | select(.num == "0.9.0" or .num == "0.9") | {num: .num, yanked: .yanked, created_at: .created_at}' | head -5

echo -e "\n=== Checking atomic versions ==="
curl -s https://crates.io/api/v1/crates/atomic/versions | jq '.versions[] | select(.num == "0.5" or .num == "0.5.0") | {num: .num, yanked: .yanked, created_at: .created_at}' | head -5

Repository: youknowone/RustPython

Length of output: 584


Consider updating the atomic dependency to a newer version.

memoffset 0.9 is acceptable (latest is 0.9.1, a minor patch ahead). However, atomic 0.5 is significantly outdated with version 0.6.1 currently available. Update to atomic = "0.6" to benefit from bug fixes and improvements released over the past ~4 years.

🤖 Prompt for AI Agents
In `@Cargo.toml` around lines 199 - 200, The Cargo.toml currently pins the atomic
crate at "0.5", which is outdated; update the dependency entry for atomic (the
"atomic" key in Cargo.toml alongside memoffset) to "0.6" (e.g., change atomic =
"0.5" to atomic = "0.6"), then run cargo update -p atomic to refresh Cargo.lock
and verify the build and tests; leave memoffset unchanged unless you also want
to bump it separately.

Comment on lines 226 to 239
pub unsafe fn track_object(&self, obj: NonNull<PyObject>) {
let gc_ptr = GcObjectPtr(obj);
self.generations[0].count.fetch_add(1, Ordering::SeqCst);
self.alloc_count.fetch_add(1, Ordering::SeqCst);

// Add to generation 0 tracking (for correct gc_refs algorithm)
if let Ok(mut gen0) = self.generation_objects[0].write() {
gen0.insert(gc_ptr);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Gen0 count resets can lose concurrent allocations.
track_object increments before acquiring the gen0 write lock, while collection uses unconditional store(0). A concurrent allocation during collection can be dropped from the count, delaying future collections.

🛠️ Suggested fix: update counts under the same lock and reset from the set size
 impl GcState {
+    fn reset_gen0_count_from_set(&self) {
+        if let Ok(gen0) = self.generation_objects[0].read() {
+            self.generations[0].count.store(gen0.len(), Ordering::SeqCst);
+        }
+    }
+
     pub unsafe fn track_object(&self, obj: NonNull<PyObject>) {
         let gc_ptr = GcObjectPtr(obj);
-        self.generations[0].count.fetch_add(1, Ordering::SeqCst);
-        self.alloc_count.fetch_add(1, Ordering::SeqCst);
-
         // Add to generation 0 tracking (for correct gc_refs algorithm)
         if let Ok(mut gen0) = self.generation_objects[0].write() {
             gen0.insert(gc_ptr);
+            self.generations[0].count.fetch_add(1, Ordering::SeqCst);
+            self.alloc_count.fetch_add(1, Ordering::SeqCst);
         }
         ...
     }
 }

Apply self.reset_gen0_count_from_set() in place of each self.generations[0].count.store(0, ...) occurrence in collect_inner.

Also applies to: 422-426, 515-517, 545-547, 692-694

🤖 Prompt for AI Agents
In `@crates/vm/src/gc_state.rs` around lines 226 - 234, track_object currently
increments generations[0].count and alloc_count before acquiring the
generation_objects[0] write lock, which races with collect_inner calling
generations[0].count.store(0, ...) and can lose concurrent allocations; to fix,
ensure gen0 count is updated/cleared under the same lock by replacing every
unconditional generations[0].count.store(0, ...) in collect_inner with a call to
self.reset_gen0_count_from_set() so the count is derived from the protected
generation_objects[0] set, and keep track_object as-is but rely on
reset_gen0_count_from_set to reconcile counts atomically with the gen0 write
lock (apply the same replacement at the other occurrences flagged: the
collect_inner spots around lines noted and any other places that zero gen0
directly).

* Updated urllib + urllib tests

* Updated http + test

* Annotated failing tests

* Fixed issues in httpservers, robotparser and urllib2net

* Fixed windows only success

* Annotated flaky test in test_logging
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/vm/src/builtins/function.rs (1)

46-58: Missing traversal of code and builtins fields.

The traverse implementation doesn't visit self.code and self.builtins, both of which hold PyObjectRef/PyRef that may participate in reference cycles. While pop_edges intentionally skips these (matching CPython's func_clear), the traversal phase should still visit them so the GC can discover all reachable objects.

Proposed fix
 unsafe impl Traverse for PyFunction {
     fn traverse(&self, tracer_fn: &mut TraverseFn<'_>) {
+        self.code.lock().traverse(tracer_fn);
         self.globals.traverse(tracer_fn);
+        self.builtins.traverse(tracer_fn);
         if let Some(closure) = self.closure.as_ref() {
             closure.as_untyped().traverse(tracer_fn);
         }
         self.defaults_and_kwdefaults.traverse(tracer_fn);
         // Traverse additional fields that may contain references
         self.type_params.lock().traverse(tracer_fn);
         self.annotations.lock().traverse(tracer_fn);
         self.module.lock().traverse(tracer_fn);
         self.doc.lock().traverse(tracer_fn);
     }
crates/vm/src/vm/interpreter.rs (1)

113-152: Documentation and code order mismatch for finalization steps.

The doc comment (lines 116-117) states:

  1. Set finalizing flag (suppresses unraisable exceptions)
  2. Call threading._shutdown() to join non-daemon threads

But the actual code does the opposite:

  • Line 135: wait_for_thread_shutdown(vm) (calls threading._shutdown())
  • Line 139: vm.state.finalizing.store(true, Ordering::Release)

This means unraisable exceptions from threading._shutdown() are not suppressed. If this is intentional, update the doc comment to match. Otherwise, swap the order so exceptions are suppressed during shutdown.

Proposed fix (if suppression is intended)
-            // Wait for non-daemon threads (wait_for_thread_shutdown)
-            wait_for_thread_shutdown(vm);
-
             // Suppress unraisable exceptions from daemon threads and __del__
             // methods during shutdown.
             vm.state.finalizing.store(true, Ordering::Release);

+            // Wait for non-daemon threads (wait_for_thread_shutdown)
+            wait_for_thread_shutdown(vm);
+
             atexit::_run_exitfuncs(vm);
🤖 Fix all issues with AI agents
In `@crates/vm/src/gc_state.rs`:
- Around line 566-586: The loop calling weakref callbacks currently ignores the
Option returned by crate::vm::thread::with_vm, causing silent skips when no VM
is available; change the for loop that iterates over all_callbacks (tuples of
crate::PyRef<crate::object::PyWeak> and crate::PyObjectRef> produced by
gc_clear_weakrefs_collect_callbacks) to explicitly handle with_vm(&cb, |vm| {
... })’s return: if with_vm returns Some(vm) call cb.call((wr.clone(),), vm) and
handle its Result (log or propagate errors), and if with_vm returns None log a
warning or otherwise surface the missing-VM error instead of using let _ = and
silently ignoring it. Ensure you reference the with_vm invocation and the
cb.call invocation when implementing the handling.
♻️ Duplicate comments (4)
crates/vm/src/object/core.rs (1)

1135-1140: Potential aliasing issue remains in gc_pop_edges.

This method takes &self but immediately casts to *mut PyObject, which can violate Rust's aliasing rules. The past review flagged this issue - consider using only the raw pointer variant gc_pop_edges_raw and removing this &self wrapper, or document why this is safe in this specific context.

The safety comment mentions "exclusive access during GC", but the function signature doesn't enforce this guarantee.

crates/derive-impl/src/pyclass.rs (1)

632-645: Add missing lifetime parameter to TraverseFn in generated try_traverse signature.

The generated try_traverse signature at line 637 uses &mut ::rustpython_vm::object::TraverseFn without a lifetime parameter. However, TraverseFn is defined as pub type TraverseFn<'a> = dyn FnMut(&PyObject) + 'a; and the trait implementations consistently use TraverseFn<'_>. The generated code should match:

Suggested fix
-                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
+                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
crates/stdlib/src/gc.rs (1)

189-196: Incomplete implementation: get_referrers always returns an empty list.

This was flagged in a previous review and remains unaddressed. The function documents returning "the list of objects that directly refer to any of the arguments" but always returns empty.

crates/vm/src/gc_state.rs (1)

426-431: Gen0 count reset can still lose concurrent allocations.

The previous review flagged that generations[0].count.store(0, ...) can race with track_object. While track_object now increments under the write lock (lines 234-238), collect_inner still uses unconditional store(0) without holding the same lock.

Race scenario:

  1. collect_inner reads gen0 objects and determines they're all reachable
  2. track_object acquires write lock, inserts new object, increments count to 1
  3. collect_inner executes store(0), losing the new allocation's count

This delays future automatic collections since gen0 count won't accurately reflect tracked objects.

Also applies to: 517-525, 548-554, 698-701

🧹 Nitpick comments (4)
crates/vm/src/object/core.rs (1)

990-1022: Clarify comment about untrack_object safety.

The safety comment at lines 1002-1003 states:

// SAFETY: untrack_object only removes the pointer address from a HashSet.
// It does NOT dereference the pointer, so it's safe even after deallocation.

However, untrack_object is called before deallocation (line 1004), so the "after deallocation" part of the comment is misleading. The comment appears to be explaining why the deferred closure is safe (since it may run after deallocation), but this isn't clear.

Suggested clarification
-            // SAFETY: untrack_object only removes the pointer address from a HashSet.
-            // It does NOT dereference the pointer, so it's safe even after deallocation.
+            // SAFETY: The deferred closure may run after the object is deallocated.
+            // This is safe because untrack_object only removes the pointer address
+            // from a HashSet without dereferencing it.
crates/vm/src/vm/mod.rs (1)

1152-1219: Consider refining the stdlib detection heuristics.

The hardcoded stdlib module list (lines 1181-1201) and path-based detection (pylib, Lib at line 1213) are fragile:

  1. The module list may become stale as stdlib evolves
  2. Path substrings may match unintended directories (e.g., a user module in myLib/)

Consider using a more robust approach such as checking if the module is a builtin (module.__loader__ type) or using sys.stdlib_module_names (Python 3.10+).

That said, since this is defensive cleanup during shutdown where errors are acceptable, the current approach is functional.

💡 Alternative approach using loader type
-                    // Skip stdlib modules (starting with _ or known stdlib names)
-                    if name.starts_with('_')
-                        || matches!(
-                            name,
-                            "sys"
-                                | "builtins"
-                                ...
-                        )
-                    {
-                        continue;
-                    }
+                    // Skip stdlib modules by checking if they're builtin/frozen
+                    if name.starts_with('_') {
+                        continue;
+                    }
+                    if let Some(module) = value.downcast_ref::<PyModule>()
+                        && let Ok(loader) = module.dict().get_item("__loader__", self)
+                    {
+                        let loader_name = loader.class().name().to_string();
+                        if loader_name.contains("BuiltinImporter")
+                            || loader_name.contains("FrozenImporter")
+                        {
+                            continue;
+                        }
+                    }
crates/stdlib/src/gc.rs (1)

242-265: Consider logging errors from GC callbacks instead of silently ignoring them.

CPython uses PyErr_WriteUnraisable for callback errors, which logs them via sys.unraisablehook. Currently, all errors are silently swallowed with let _ =. This could make debugging difficult when callbacks fail.

💡 Suggestion to log callback errors
         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(exc) = callback.call((phase_str.clone(), info.clone()), vm) {
+                // Log via unraisable hook like CPython does
+                vm.write_unraisable(exc, Some("gc callback".to_owned()), None);
+            }
         }
crates/vm/src/gc_state.rs (1)

806-842: Test coverage is limited to basic configuration.

The tests verify configuration methods but don't test the core GC functionality:

  • Object tracking/untracking
  • Collection algorithm (cycle detection, resurrection handling)
  • Generation promotion
  • freeze/unfreeze with actual objects

Consider adding integration tests with mock or real PyObject instances to verify the collection algorithm works correctly.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de0bb10 and d51978e.

⛔ Files ignored due to path filters (14)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/json/__init__.py is excluded by !Lib/**
  • Lib/json/decoder.py is excluded by !Lib/**
  • Lib/json/encoder.py is excluded by !Lib/**
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_json/test_encode_basestring_ascii.py is excluded by !Lib/**
  • Lib/test/test_json/test_scanstring.py is excluded by !Lib/**
  • Lib/test/test_json/test_unicode.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
📒 Files selected for processing (30)
  • .cspell.json
  • crates/common/Cargo.toml
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/stdlib/src/gc.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/coroutine.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/generator.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (6)
  • crates/derive-impl/src/util.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/dict_inner.rs
  • crates/common/Cargo.toml
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/list.rs
🧰 Additional context used
🧬 Code graph analysis (16)
crates/vm/src/vm/context.rs (3)
crates/vm/src/object/core.rs (2)
  • new_ref (1361-1378)
  • default (413-415)
crates/vm/src/builtins/list.rs (1)
  • new_ref (91-93)
crates/vm/src/gc_state.rs (1)
  • default (130-132)
crates/vm/src/builtins/coroutine.rs (3)
crates/vm/src/builtins/asyncgenerator.rs (5)
  • traverse (29-32)
  • traverse (215-217)
  • traverse (271-274)
  • traverse (373-376)
  • traverse (532-535)
crates/vm/src/builtins/generator.rs (1)
  • traverse (24-26)
crates/vm/src/object/traverse.rs (7)
  • traverse (35-35)
  • traverse (44-46)
  • traverse (50-52)
  • traverse (56-56)
  • traverse (61-65)
  • traverse (73-77)
  • traverse (85-89)
crates/vm/src/coroutine.rs (2)
crates/vm/src/frame.rs (2)
  • traverse (101-103)
  • traverse (107-118)
crates/vm/src/object/traverse.rs (6)
  • traverse (35-35)
  • traverse (44-46)
  • traverse (50-52)
  • traverse (56-56)
  • traverse (61-65)
  • traverse (73-77)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/generator.rs (1)
  • traverse (24-26)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/object/traverse.rs (1)
  • pop_edges (40-40)
crates/vm/src/builtins/dict.rs (1)
  • pop_edges (52-58)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
  • gc_state (802-804)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
  • reactivate_guard (56-62)
crates/vm/src/builtins/tuple.rs (3)
crates/vm/src/builtins/dict.rs (2)
  • traverse (48-50)
  • pop_edges (52-58)
crates/vm/src/builtins/list.rs (2)
  • traverse (62-64)
  • pop_edges (66-73)
crates/vm/src/object/traverse.rs (1)
  • pop_edges (40-40)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
  • debug_obj (87-93)
  • drop_dealloc_obj (84-86)
  • try_pop_edges_obj (128-131)
  • try_trace_obj (96-100)
crates/vm/src/frame.rs (4)
crates/vm/src/builtins/asyncgenerator.rs (5)
  • traverse (29-32)
  • traverse (215-217)
  • traverse (271-274)
  • traverse (373-376)
  • traverse (532-535)
crates/vm/src/builtins/coroutine.rs (2)
  • traverse (22-24)
  • traverse (157-159)
crates/vm/src/object/traverse.rs (6)
  • traverse (35-35)
  • traverse (44-46)
  • traverse (50-52)
  • traverse (56-56)
  • traverse (61-65)
  • traverse (73-77)
crates/vm/src/exceptions.rs (2)
  • traverse (28-33)
  • traverse (1634-1652)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/object/traverse.rs (2)
  • try_traverse (20-20)
  • try_pop_edges (22-22)
crates/vm/src/builtins/str.rs (2)
  • try_traverse (1930-1932)
  • try_pop_edges (1934-1936)
crates/vm/src/vm/thread.rs (1)
crates/vm/src/vm/mod.rs (1)
  • new (145-260)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • ensure_pinned (42-48)
  • drop_guard (67-71)
crates/vm/src/object/traverse.rs (5)
crates/vm/src/builtins/str.rs (1)
  • try_pop_edges (1934-1936)
crates/vm/src/builtins/dict.rs (1)
  • pop_edges (52-58)
crates/vm/src/builtins/function.rs (1)
  • pop_edges (60-97)
crates/vm/src/builtins/list.rs (1)
  • pop_edges (66-73)
crates/vm/src/builtins/tuple.rs (1)
  • pop_edges (47-51)
crates/derive-impl/src/pyclass.rs (2)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/object/traverse_object.rs (3)
  • traverse (49-51)
  • traverse (56-70)
  • traverse (75-84)
crates/vm/src/builtins/generator.rs (3)
crates/vm/src/builtins/asyncgenerator.rs (5)
  • traverse (29-32)
  • traverse (215-217)
  • traverse (271-274)
  • traverse (373-376)
  • traverse (532-535)
crates/vm/src/builtins/coroutine.rs (2)
  • traverse (22-24)
  • traverse (157-159)
crates/vm/src/object/traverse.rs (6)
  • traverse (35-35)
  • traverse (44-46)
  • traverse (50-52)
  • traverse (56-56)
  • traverse (61-65)
  • traverse (73-77)
crates/vm/src/gc_state.rs (1)
crates/stdlib/src/gc.rs (6)
  • disable (32-34)
  • get_count (113-122)
  • get_stats (138-155)
  • is_finalized (207-210)
  • collect (50-86)
  • callbacks (238-240)
🪛 Biome (2.1.2)
.cspell.json

[error] 69-69: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 69-69: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 121-121: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 121-121: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 122-122: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 122-122: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 123-123: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 123-123: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 124-124: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 124-124: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 125-125: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 125-125: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 126-126: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 126-126: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 127-127: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 127-127: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
🔇 Additional comments (47)
.cspell.json (1)

69-127: LGTM!

The spell-check additions are appropriate for the GC terminology introduced in this PR. The static analysis hints from Biome are false positives - it's incorrectly treating JSONC (JSON with comments) as strict JSON.

crates/derive-impl/src/pyclass.rs (2)

590-608: Verify intentional error swallowing for traverse attribute parsing.

The code intentionally uses .ok().flatten() to treat Err as None, which the comment explains is for handling bare traverse attributes (without a value). However, this also silently ignores genuine parse errors like #[pyclass(traverse(123))].

Consider whether a more explicit match would be safer:

let value = match class_meta.inner()._optional_str("traverse") {
    Ok(v) => v,
    Err(_) if class_meta.inner()._has_key("traverse").unwrap_or(false) => None, // bare traverse
    Err(e) => return Err(e), // genuine parse error
};

This preserves the intended behavior for bare traverse while surfacing actual parse failures.


698-706: LGTM!

The MaybeTraverse implementation is properly generated and integrated into the final token stream. The structure correctly handles both traverse and pop_edges attributes with appropriate conditional logic.

crates/vm/src/vm/thread.rs (2)

25-31: LGTM!

The thread-local storage for async generator hooks and EBR guard is well-structured. Using RefCell is appropriate since these are only accessed from the owning thread.


36-71: LGTM!

The EBR guard management functions correctly implement the coarse-grained pinning strategy:

  • ensure_pinned() is idempotent and only pins if not already pinned
  • reactivate_guard() properly calls repin() to allow epoch advancement at safe points
  • drop_guard() cleanly unpins when the thread exits

The documentation clearly explains the purpose and usage context.

crates/vm/src/stdlib/thread.rs (4)

3-6: LGTM!

The self:: prefix makes the re-export path more explicit.


429-459: LGTM!

The EBR guard lifecycle is properly integrated:

  • ensure_pinned() at thread start protects object access during thread execution
  • drop_guard() is correctly placed last, after all cleanup that might access objects

525-530: LGTM!

Using is_some_and() is more idiomatic than map_or(false, ...) for this pattern.


676-682: LGTM!

The SystemExit handling now properly checks for subclasses using fast_issubclass(), which matches CPython's behavior of treating all SystemExit subclasses as exit signals. The is_some_and pattern correctly handles cases where exc_type might not be a PyType.

crates/vm/src/object/core.rs (7)

127-131: LGTM!

The try_pop_edges_obj function correctly handles payload-level edge extraction through raw pointers, matching the vtable-based dispatch pattern used elsewhere.


229-233: LGTM!

This correctly handles the resurrection edge case where a new weakref is created during __del__ after GC has cleared the object pointer. Restoring the pointer ensures the weakref functions correctly.


257-381: LGTM!

The weakref clearing refactoring correctly implements CPython-compatible GC semantics:

  • clear_for_gc_collect_callbacks() collects callbacks for batch invocation after all weakrefs are cleared
  • The already_cleared check at line 320 prevents double ref_count decrement when both GC and drop_slow_inner trigger clearing
  • The flags provide proper separation of concerns for different clearing scenarios

1359-1378: LGTM!

The GC tracking in new_ref correctly mirrors the untracking logic in drop_slow:

  • Same condition: IS_TRACE || has_dict
  • Immediate GC check via maybe_collect() after tracking

This ensures consistent object lifecycle management.


1042-1094: LGTM!

The new GC utility methods are well-implemented:

  • is_gc_tracked() correctly checks both trace capability and dict presence
  • try_call_finalizer() properly sets the finalized flag before calling __del__ to prevent double invocation
  • gc_get_referents() and gc_clear_weakrefs_collect_callbacks() are clean delegations

1096-1126: LGTM!

The raw pointer GC methods are properly marked unsafe with clear safety documentation. gc_pop_edges_raw is the preferred API as it makes the caller responsible for ensuring exclusive access.


910-972: Resurrection detection logic is sound and handles the edge cases correctly.

The logic works as intended:

  1. inc() from 0 performs a double increment (0→2) for thread safety
  2. After __del__ completes, after_del is compared to after_inc (which is 2)
  3. If after_del > after_inc, resurrection detected; otherwise, normal destruction

The concern about __del__ creating and dropping temporary references is already handled: the code measures after_del only after __del__ completes, so any temporary refs created and dropped during execution are no longer visible in the final count. Only external references held beyond __del__ would increase the count beyond the expected 2, correctly triggering resurrection detection.

crates/vm/src/lib.rs (1)

80-80: LGTM! New GC state module export.

The module is correctly placed alphabetically among other public modules. The unconditional export (no feature gate) is appropriate since GC state should be available regardless of configuration.

crates/common/src/lib.rs (1)

17-17: LGTM! Epoch-based reclamation re-export.

Re-exporting crossbeam_epoch as epoch provides a clean, consistent API for EBR primitives across the crate. This aligns with the broader GC/EBR integration in this PR.

crates/vm/src/vm/context.rs (2)

54-58: LGTM! GC-related fields for CPython compatibility.

The gc_callbacks and gc_garbage fields align with CPython's gc.callbacks and gc.garbage API. Using PyListRef provides thread-safe access through internal locking.


335-362: LGTM! GC fields initialization follows established patterns.

The initialization of gc_callbacks and gc_garbage is consistent with other Context fields (e.g., empty_tuple, int_cache_pool). Both are correctly initialized as empty lists.

crates/vm/src/vm/mod.rs (2)

552-562: LGTM! Suppress unraisable exceptions during finalization.

The early return when state.finalizing is true correctly mirrors CPython's behavior of silently ignoring __del__ errors and daemon thread exceptions during interpreter shutdown. The Acquire ordering is appropriate for reading a flag potentially set by another thread.


673-678: LGTM! EBR guard reactivation at frame boundary.

Frame boundaries are appropriate safe points for epoch advancement since borrowed references from the frame are no longer in use. The #[cfg(feature = "threading")] guard correctly scopes this to threaded builds where EBR is relevant.

crates/vm/src/coroutine.rs (1)

42-49: LGTM! Traverse implementation for coroutine state.

The implementation correctly traverses all fields containing object references (frame, name, qualname, exception) while skipping primitive fields (closed, running). All traversed fields have proper Traverse implementations: PyRef<T> delegates to the object, PyMutex<T> safely locks and traverses via try_lock() to avoid deadlocks, and Option<T> delegates to the inner type. This aligns with other Traverse implementations in the codebase.

crates/vm/src/builtins/generator.rs (1)

17-27: LGTM! Correct manual traversal implementation.

The traverse = "manual" annotation and corresponding unsafe impl Traverse correctly delegate to the inner Coro, consistent with the patterns in coroutine.rs and asyncgenerator.rs.

crates/derive-impl/src/pystructseq.rs (1)

608-619: LGTM! Proper MaybeTraverse scaffolding for struct sequences.

The constants IS_TRACE = true and HAS_POP_EDGES = false correctly indicate traversal is needed but edge popping is not required. The no-op try_pop_edges and delegation to the inner PyTuple via try_traverse aligns with other similar implementations like PyUtf8Str.

crates/vm/src/builtins/function.rs (1)

60-97: Good defensive use of try_lock() in pop_edges.

Using try_lock() instead of lock() prevents potential deadlocks during GC if another thread holds the mutex. The comments clearly document CPython's func_clear semantics and the rationale for not clearing annotations.

crates/vm/src/frame.rs (1)

100-119: LGTM! Comprehensive traversal of Frame and FrameState.

All fields containing PyObjectRef or related types are properly traversed. The implementation correctly covers code, func_obj, fastlocals, cells_frees, locals, globals, builtins, trace, state, and temporary_refs. Primitive fields like lasti and trace_lines are correctly excluded.

crates/vm/src/builtins/coroutine.rs (2)

14-25: LGTM! Consistent traversal implementation for PyCoroutine.

The manual traversal correctly delegates to self.inner.traverse(), matching the pattern established in generator.rs and asyncgenerator.rs.


148-160: LGTM! Proper traversal for PyCoroutineWrapper.

The traversal correctly visits the contained PyRef<PyCoroutine>. The AtomicCell<bool> field closed is a primitive and correctly excluded from traversal.

crates/vm/src/vm/interpreter.rs (1)

143-152: GC-driven cleanup sequence looks correct.

The two-phase GC approach (collect before module cleanup, then after) is a sound strategy for breaking cycles and collecting garbage that becomes eligible after module finalization.

crates/vm/src/object/traverse_object.rs (1)

14-45: VTable extension for edge popping looks correct.

The pop_edges field follows the same pattern as the existing trace field, using a const block to conditionally set the function pointer based on T::HAS_POP_EDGES. The approach is consistent with the established vtable design.

crates/vm/src/builtins/tuple.rs (1)

40-52: Traverse implementation for PyTuple is correct.

The implementation properly handles traversal and edge popping. into_vec() on Box<[T]> is stable and correctly converts the boxed slice to a Vec for extending the output.

Note: The implementation only applies to PyTuple<PyObjectRef> (the default generic parameter), which is correct since only the default tuple type needs GC traversal.

crates/vm/src/object/traverse.rs (2)

14-41: MaybeTraverse and Traverse trait extensions are well-designed.

The additions follow the established pattern:

  • HAS_POP_EDGES mirrors IS_TRACE for compile-time capability detection
  • try_pop_edges and pop_edges have safe default no-op implementations
  • Documentation clearly explains the purpose (edge extraction before deallocation)

104-114: BoxVec Traverse implementation is consistent.

The implementation follows the same pattern as Vec<T> and Box<[T]>, iterating through elements and calling traverse on each.

crates/vm/src/builtins/dict.rs (1)

46-59: Traverse implementation for PyDict is correct.

The implementation properly:

  1. Traverses entries during GC marking
  2. Pops all keys and values during edge extraction

The pop_all_entries() method exists on DictContentType and returns an iterator of (PyObjectRef, T) tuples, making the iteration and collection of keys and values correct.

crates/vm/src/builtins/asyncgenerator.rs (3)

17-33: LGTM! Traverse implementation for PyAsyncGen is correct.

The implementation properly traverses both the inner Coro (which contains the frame and other GC-relevant fields) and ag_finalizer (which holds a callback PyObjectRef). The running_async and ag_hooks_inited AtomicCell fields correctly don't need traversal as they contain primitive values.


493-511: LGTM! Proper state management in error paths.

Setting inner.closed.store(true) in both yield_close and check_error ensures the async generator is properly marked as closed when it terminates due to errors or ignored GeneratorExit. This is consistent with CPython's behavior.


210-218: Traverse implementations for wrapper types are correct.

All async generator helper types properly traverse their fields. The tuple type (PyObjectRef, PyObjectRef, PyObjectRef) for PyAsyncGenAThrow::value works correctly due to the blanket Traverse implementation for tuples (defined in crates/vm/src/object/traverse.rs line 153), which handles tuples of any number of Traverse types.

Applies to lines: 210-218, 262-275, 363-377, 523-536

crates/stdlib/src/gc.rs (1)

49-86: LGTM! The collect implementation correctly handles manual GC calls.

The implementation properly:

  1. Validates generation bounds (0-2)
  2. Invokes callbacks with "start" phase before collection
  3. Uses collect_force to run even when GC is disabled (matching CPython's gc.collect() behavior)
  4. Transfers garbage to vm.ctx.gc_garbage for DEBUG_SAVEALL support
  5. Invokes callbacks with "stop" phase and actual collection stats
crates/common/src/refcount.rs (4)

28-39: LGTM! Bit layout constants are correctly defined.

The previous bit overlap issue has been fixed. With TOTAL_COUNT_WIDTH = u64::BITS - EPOCH_WIDTH - 3, the layout is:

  • Bits 63: DESTRUCTED
  • Bits 62: WEAKED
  • Bits 61: LEAKED
  • Bits 31-60: weak_count (30 bits)
  • Bits 0-30: strong_count (31 bits)

No overlap between flags and count fields.


208-240: LGTM! Panic-safe deferred drop context implementation.

The DeferredDropGuard RAII pattern correctly addresses the previous review concern about panic safety. The guard restores the context flag and flushes deferred drops even on panic/unwind.


277-295: LGTM! The defer_destruction function is correctly documented and implemented.

The safety requirements are clearly documented, and the function correctly delegates to crossbeam-epoch's defer_unchecked. The Send + 'static bounds on F are appropriate since the closure may execute on a different thread after epoch advancement.


118-131: Clarify the comment for inc() double-increment behavior when strong count is zero.

The double-increment at lines 127-130 is intentional and correct—when strong count reaches 0 before inc(), it increments to 2. This provides a "permission" mechanism for thread safety during object resurrection, as documented in the calling code (crates/vm/src/object/core.rs:920-921).

However, the comment "created a permission to run decrement again" at line 128 is vague. Align it with the clearer explanation in the usage documentation: "double increment for thread safety" to help future maintainers understand why inc() from 0 produces strong=2 instead of strong=1.

crates/vm/src/gc_state.rs (4)

256-269: LGTM! untrack_object now correctly decrements the appropriate generation's count.

The fix addresses the previous review concern by iterating through all generations and decrementing the count only for the generation where the object was actually found.


711-746: LGTM! promote_survivors now correctly updates generation counts.

The fix addresses the previous review concern by decrementing the source generation's count (lines 726-731) and incrementing the target generation's count (lines 737-740) when moving objects between generations.


753-792: LGTM! freeze/unfreeze now properly move objects between sets.

The fix addresses the previous review concern. Both functions now:

  1. Actually move object pointers between the generation sets and permanent_objects
  2. Update the counts to reflect the actual set sizes

Frozen objects are now properly excluded from collection since they're removed from generation_objects.


394-402: LGTM! EBR integration for collection safety.

The implementation correctly:

  1. Enters an EBR critical section before analyzing the object graph (line 398)
  2. Uses a memory fence to ensure visibility of reference count updates from other threads (line 402)
  3. Flushes EBR deferred operations before exiting (line 706)

This ensures that objects being freed by other threads won't have their memory deallocated until the collection phase completes.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +566 to +586
// 6c: Clear existing weakrefs BEFORE calling __del__
// This invalidates existing weakrefs, but new weakrefs created during __del__
// will still work (WeakRefList::add restores inner.obj if cleared)
//
// CRITICAL: We use a two-phase approach to match CPython behavior:
// Phase 1: Clear ALL weakrefs (set inner.obj = None) and collect callbacks
// Phase 2: Invoke ALL callbacks
// This ensures that when a callback runs, ALL weakrefs to unreachable objects
// are already dead (return None when called).
let mut all_callbacks: Vec<(crate::PyRef<crate::object::PyWeak>, crate::PyObjectRef)> =
Vec::new();
for obj_ref in &unreachable_refs {
let callbacks = obj_ref.gc_clear_weakrefs_collect_callbacks();
all_callbacks.extend(callbacks);
}
// Phase 2: Now call all callbacks - at this point ALL weakrefs are cleared
for (wr, cb) in all_callbacks {
crate::vm::thread::with_vm(&cb, |vm| {
let _ = cb.call((wr.clone(),), vm);
});
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n "pub fn with_vm" --type rust -B 2 -A 15

Repository: youknowone/RustPython

Length of output: 1214


🏁 Script executed:

rg -n "pub fn with_vm" --type rust -A 25

Repository: youknowone/RustPython

Length of output: 1893


🏁 Script executed:

sed -n '576,590p' crates/vm/src/gc_state.rs | cat -n

Repository: youknowone/RustPython

Length of output: 820


Handle the fallible return from with_vm when invoking weakref callbacks.

At lines 583-586, the code ignores the Option returned by with_vm(&cb, |vm| ...). If no VM is available for the callback object, the callback silently fails to execute and the weakref cleanup is incomplete. Replace let _ = with explicit error handling:

Current code (lines 583-586)
        // Phase 2: Now call all callbacks - at this point ALL weakrefs are cleared
        for (wr, cb) in all_callbacks {
            crate::vm::thread::with_vm(&cb, |vm| {
                let _ = cb.call((wr.clone(),), vm);
            });
        }

Either propagate the error, log a warning, or panic if no VM is found—do not silently skip callback execution.

🤖 Prompt for AI Agents
In `@crates/vm/src/gc_state.rs` around lines 566 - 586, The loop calling weakref
callbacks currently ignores the Option returned by crate::vm::thread::with_vm,
causing silent skips when no VM is available; change the for loop that iterates
over all_callbacks (tuples of crate::PyRef<crate::object::PyWeak> and
crate::PyObjectRef> produced by gc_clear_weakrefs_collect_callbacks) to
explicitly handle with_vm(&cb, |vm| { ... })’s return: if with_vm returns
Some(vm) call cb.call((wr.clone(),), vm) and handle its Result (log or propagate
errors), and if with_vm returns None log a warning or otherwise surface the
missing-VM error instead of using let _ = and silently ignoring it. Ensure you
reference the with_vm invocation and the cb.call invocation when implementing
the handling.

ShaharNaveh and others added 10 commits January 16, 2026 23:08
* Add all other bytecodes

* Mark passing/failing tests
* correct buitins type

* Auto-format: ruff format

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
)

* fix_test to remove unexpected success

* rename fix_test

* Auto-format: ruff format

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@crates/common/src/refcount.rs`:
- Around line 118-131: Replace the cryptic comment inside inc() (around the
second self.state.fetch_add(COUNT, Ordering::SeqCst)) with an explicit
explanation of the 0→2 resurrection pattern: explain that when the previous
value's strong() == 0 the first fetch_add created a transient permission for a
concurrent decrement path, so inc() must perform a second fetch_add to raise the
strong count from 0 to 2 (not 1) to avoid races with concurrent drop/decrement;
reference the involved symbols (inc, State::from_raw, self.state.fetch_add,
COUNT, destructed, strong) and briefly state this is done for thread-safety so a
concurrent decrement cannot observe an intermediate 1 and prematurely destroy
the object.

In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 693-784: The specialized (BinaryOpAddFloat, CallBuiltinFast,
ForIterGen, UnpackSequenceList, InstrumentedCall, etc.) and instrumented
variants are being created via unsafe transmute in the TryFrom<u8> ranges
(specialized_start..=specialized_end and instrumented_start..=instrumented_end)
but their matches in the Instruction::stack_effect method all return 0, which
miscounts stack usage; fix by mapping those specialized/instrumented variants to
the equivalent generic variants (e.g., BinaryOpAddFloat -> BinaryOp,
CallBuiltinFast/CallPyExactArgs/etc. -> Call with correct nargs handling,
ForIterGen -> ForIter with same conditional logic, UnpackSequenceList ->
UnpackSequence, Instrumented* -> corresponding base instruction) before
computing stack effect or implement correct stack_effect arms that compute the
same values as their generic counterparts, ensuring code paths reference the
Instruction enum variants (BinaryOp, Call, ForIter, UnpackSequence,
Instrumented* mapping) and preserve existing semantics used by stack-size
verification.

In `@crates/vm/src/vm/interpreter.rs`:
- Around line 137-141: The finalizing flag is set before running atexit handlers
which causes run_unraisable to short-circuit and suppress errors from atexit;
update the shutdown sequence in interpreter.rs so that
atexit::_run_exitfuncs(vm) is called first and only after it completes set
vm.state.finalizing.store(true, Ordering::Release) (ensure any logic that relies
on finalizing remains correct after the move); reference vm.state.finalizing and
atexit::_run_exitfuncs when making the change.
♻️ Duplicate comments (2)
crates/derive-impl/src/pyclass.rs (1)

637-639: Add lifetime parameter to TraverseFn type.

The generated try_traverse method signature is missing the lifetime parameter on TraverseFn. According to the trait definition in traverse.rs, the signature should be fn try_traverse(&self, traverse_fn: &mut TraverseFn<'_>).

Proposed fix
-                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) {
+                fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
                     `#try_traverse_body`
                 }
crates/stdlib/src/gc.rs (1)

189-196: Incomplete get_referrers implementation.

This returns an empty list but the issue was already flagged in a previous review.

🧹 Nitpick comments (10)
scripts/auto_mark_test.py (1)

67-74: Class-level mutable defaults are a Python footgun.

tests = [] and unexpected_successes = [] at class level means all instances share the same list objects. While parse_results explicitly reinitializes them (lines 80-81), this pattern is error-prone if TestResult is instantiated elsewhere without reassignment.

♻️ Suggested fix using __init__
 class TestResult:
     tests_result: str = ""
-    tests = []
-    unexpected_successes = []  # Tests that passed but were marked as expectedFailure
     stdout = ""
+
+    def __init__(self):
+        self.tests = []
+        self.unexpected_successes = []  # Tests that passed but were marked as expectedFailure

Then in parse_results, remove the redundant reassignments:

 def parse_results(result):
     lines = result.stdout.splitlines()
     test_results = TestResult()
-    test_results.tests = []
-    test_results.unexpected_successes = []
     test_results.stdout = result.stdout
crates/derive-impl/src/pyclass.rs (1)

622-630: Consider validating that pop_edges requires traverse.

If a user specifies #[pyclass(pop_edges)] without traverse, the generated code will call Traverse::pop_edges on a type that doesn't implement Traverse, resulting in a compile error. While the error is caught at compile time, adding an early validation would provide a clearer error message at macro expansion time.

Optional: Add validation
// Add after line 588
if has_pop_edges && !has_traverse {
    bail_span!(
        item,
        "#[pyclass(pop_edges)] requires #[pyclass(traverse)] to be specified",
    );
}
crates/common/Cargo.toml (1)

34-35: Consider using workspace dependency management for consistency.

Most dependencies in this file use { workspace = true } for centralized version management. While some inline versions exist (e.g., lock_api, siphasher), adding crossbeam-epoch to the workspace's [workspace.dependencies] would improve consistency and simplify future version updates across the crate.

crates/common/src/lib.rs (1)

17-17: Full crate re-export exposes broad API surface.

Re-exporting the entire crossbeam_epoch crate publicly couples downstream consumers to its full API. If only specific types are needed externally (e.g., Guard, pin), consider re-exporting only those to minimize the public API surface and insulate consumers from upstream breaking changes.

// Example: selective re-export
pub mod epoch {
    pub use crossbeam_epoch::{pin, Guard};
}
crates/common/src/refcount.rs (1)

263-275: Consider panic resilience in flush_deferred_drops.

If one deferred operation panics, subsequent operations in the queue won't execute. If this is intentional (e.g., panics are unrecoverable anyway), consider documenting it. Otherwise, wrapping each op() in std::panic::catch_unwind would ensure all operations attempt execution.

💡 Optional: panic-resilient flush
 pub fn flush_deferred_drops() {
     DEFERRED_QUEUE.with(|q| {
         let ops: Vec<_> = q.borrow_mut().drain(..).collect();
         for op in ops {
-            op();
+            let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(op));
         }
     });
 }
crates/vm/src/vm/mod.rs (2)

1177-1204: The hardcoded stdlib module list is brittle and may cause issues.

The allowlist approach has maintenance concerns:

  1. Missing common stdlib modules (e.g., re, json, copy, operator, itertools)
  2. New stdlib modules added in future Python versions won't be automatically covered
  3. The contains("pylib") and contains("Lib") path checks may miss stdlib modules in non-standard locations

Consider a more robust approach, such as checking if __file__ starts with the known stdlib paths from sys.prefix or sys.base_prefix, or marking modules as stdlib during import.

♻️ Alternative approach using sys paths
// Get stdlib path from sys module once
let stdlib_path = self.sys_module
    .get_attr("base_prefix", self)
    .ok()
    .and_then(|p| p.downcast_ref::<PyStr>().map(|s| s.as_str().to_owned()));

// Then in the loop, check if file_path starts with stdlib_path
if let Some(ref stdlib) = stdlib_path {
    if file_path.starts_with(stdlib) {
        continue; // Skip stdlib module
    }
}

1164-1168: Consider handling the case where __main__ has no dict.

While PyModule should always have a dict, the silent skip via let Some(module) could mask issues during shutdown debugging. A debug assertion or trace log might help diagnose shutdown problems.

crates/vm/src/vm/thread.rs (1)

42-48: Potential panic from double borrow in ensure_pinned.

The function first calls guard.borrow() to check if None, then immediately calls guard.borrow_mut(). While this works because borrow() returns a temporary that's dropped before borrow_mut(), a more idiomatic approach would avoid this pattern.

♻️ Suggested refactor to avoid double borrow
 pub fn ensure_pinned() {
     EBR_GUARD.with(|guard| {
-        if guard.borrow().is_none() {
-            *guard.borrow_mut() = Some(rustpython_common::epoch::pin());
-        }
+        let mut g = guard.borrow_mut();
+        if g.is_none() {
+            *g = Some(rustpython_common::epoch::pin());
+        }
     });
 }
crates/stdlib/src/gc.rs (1)

262-264: Consider logging unraisable exceptions from GC callbacks.

Callback errors are silently ignored. While CPython also doesn't propagate these errors, it does report them via sys.unraisablehook. Consider using vm.run_unraisable for consistency.

♻️ Suggested improvement
         for callback in callbacks {
-            let _ = callback.call((phase_str.clone(), info.clone()), vm);
+            if let Err(exc) = callback.call((phase_str.clone(), info.clone()), vm) {
+                vm.run_unraisable(exc, Some("Exception ignored in gc callback".to_owned()), callback.clone());
+            }
         }
crates/vm/src/object/core.rs (1)

1135-1140: Aliasing concern documented but not eliminated.

The past review flagged the &self to *mut cast as potentially violating Rust's aliasing rules. While the safety comment explains that exclusive access is guaranteed during GC, the technical UB concern remains. The current approach works in practice but isn't strictly sound under Rust's memory model.

The safer alternative would be to only expose gc_pop_edges_raw and require callers to obtain the raw pointer through safe means.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d51978e and 4118c1c.

⛔ Files ignored due to path filters (30)
  • Cargo.lock is excluded by !**/*.lock
  • Lib/_opcode_metadata.py is excluded by !Lib/**
  • Lib/http/__init__.py is excluded by !Lib/**
  • Lib/http/client.py is excluded by !Lib/**
  • Lib/http/cookiejar.py is excluded by !Lib/**
  • Lib/http/server.py is excluded by !Lib/**
  • Lib/test/support/__init__.py is excluded by !Lib/**
  • Lib/test/test__opcode.py is excluded by !Lib/**
  • Lib/test/test_builtin.py is excluded by !Lib/**
  • Lib/test/test_dict.py is excluded by !Lib/**
  • Lib/test/test_gc.py is excluded by !Lib/**
  • Lib/test/test_generators.py is excluded by !Lib/**
  • Lib/test/test_http_cookiejar.py is excluded by !Lib/**
  • Lib/test/test_httplib.py is excluded by !Lib/**
  • Lib/test/test_httpservers.py is excluded by !Lib/**
  • Lib/test/test_logging.py is excluded by !Lib/**
  • Lib/test/test_robotparser.py is excluded by !Lib/**
  • Lib/test/test_ssl.py is excluded by !Lib/**
  • Lib/test/test_subprocess.py is excluded by !Lib/**
  • Lib/test/test_urllib.py is excluded by !Lib/**
  • Lib/test/test_urllib2.py is excluded by !Lib/**
  • Lib/test/test_urllib2_localnet.py is excluded by !Lib/**
  • Lib/test/test_urllib2net.py is excluded by !Lib/**
  • Lib/test/test_urllib_response.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
  • Lib/test/test_wsgiref.py is excluded by !Lib/**
  • Lib/urllib/error.py is excluded by !Lib/**
  • Lib/urllib/parse.py is excluded by !Lib/**
  • Lib/urllib/request.py is excluded by !Lib/**
  • Lib/urllib/robotparser.py is excluded by !Lib/**
📒 Files selected for processing (33)
  • .cspell.json
  • crates/common/Cargo.toml
  • crates/common/src/lib.rs
  • crates/common/src/refcount.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/derive-impl/src/pyclass.rs
  • crates/derive-impl/src/pystructseq.rs
  • crates/derive-impl/src/util.rs
  • crates/stdlib/src/gc.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/coroutine.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/generator.rs
  • crates/vm/src/builtins/list.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/lib.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/vm/context.rs
  • crates/vm/src/vm/interpreter.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs
  • extra_tests/snippets/builtins_module.py
  • scripts/auto_mark_test.py
🚧 Files skipped from review as they are similar to previous changes (11)
  • crates/vm/src/builtins/coroutine.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/object/traverse.rs
  • crates/vm/src/builtins/generator.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/coroutine.rs
  • crates/vm/src/gc_state.rs
  • crates/derive-impl/src/pystructseq.rs
🧰 Additional context used
🧬 Code graph analysis (14)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
  • gc_state (802-804)
crates/vm/src/builtins/list.rs (6)
crates/vm/src/builtins/function.rs (2)
  • traverse (47-58)
  • pop_edges (60-97)
crates/vm/src/builtins/str.rs (1)
  • traverse (274-277)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/builtins/set.rs (1)
  • traverse (176-179)
crates/vm/src/builtins/mappingproxy.rs (1)
  • traverse (30-35)
crates/vm/src/object/traverse.rs (1)
  • pop_edges (40-40)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/vm/src/object/traverse.rs (3)
  • pop_edges (40-40)
  • try_traverse (20-20)
  • try_pop_edges (22-22)
crates/vm/src/builtins/function.rs (1)
  • pop_edges (60-97)
crates/vm/src/vm/interpreter.rs (2)
crates/vm/src/stdlib/atexit.rs (1)
  • _run_exitfuncs (36-47)
crates/vm/src/gc_state.rs (1)
  • gc_state (802-804)
crates/vm/src/stdlib/sys.rs (6)
crates/vm/src/object/core.rs (1)
  • dict (803-805)
crates/vm/src/builtins/dict.rs (1)
  • dict (796-796)
crates/vm/src/builtins/str.rs (2)
  • key (1412-1412)
  • key (1418-1418)
crates/vm/src/vm/mod.rs (1)
  • key (1175-1175)
crates/vm/src/coroutine.rs (1)
  • frame (215-217)
crates/vm/src/stdlib/io.rs (1)
  • vm (3791-3793)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
  • debug_obj (87-93)
  • drop_dealloc_obj (84-86)
  • try_pop_edges_obj (128-131)
  • try_trace_obj (96-100)
crates/vm/src/vm/context.rs (3)
crates/vm/src/object/core.rs (2)
  • new_ref (1361-1378)
  • default (413-415)
crates/vm/src/builtins/dict.rs (1)
  • new_ref (77-79)
crates/vm/src/builtins/list.rs (1)
  • new_ref (91-93)
scripts/auto_mark_test.py (1)
scripts/lib_updater.py (3)
  • PatchSpec (81-122)
  • UtMethod (59-78)
  • apply_patches (273-283)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
  • ensure_pinned (42-48)
  • drop_guard (67-71)
extra_tests/snippets/builtins_module.py (1)
crates/vm/src/stdlib/builtins.rs (1)
  • isinstance (527-529)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (18)
  • gc_state (802-804)
  • enable (167-169)
  • disable (172-174)
  • collect (374-376)
  • get_threshold (187-193)
  • set_threshold (64-66)
  • set_threshold (196-204)
  • get_count (207-213)
  • get_debug (177-179)
  • set_debug (182-184)
  • get_stats (216-222)
  • stats (68-75)
  • get_objects (303-344)
  • new (44-54)
  • new (136-159)
  • freeze (754-773)
  • unfreeze (776-792)
  • get_freeze_count (749-751)
crates/vm/src/object/core.rs (2)
crates/vm/src/object/traverse_object.rs (1)
  • try_pop_edges_obj (39-39)
crates/vm/src/gc_state.rs (1)
  • gc_state (802-804)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/type.rs (2)
  • module (1780-1780)
  • name (592-597)
crates/vm/src/frame.rs (4)
  • name (1320-1320)
  • name (2695-2695)
  • traverse (101-103)
  • traverse (107-118)
crates/vm/src/builtins/generator.rs (1)
  • traverse (24-26)
crates/vm/src/builtins/tuple.rs (2)
  • traverse (43-45)
  • pop_edges (47-51)
crates/compiler-core/src/bytecode/instruction.rs (3)
crates/codegen/src/ir.rs (1)
  • from (55-57)
crates/compiler-core/src/bytecode/oparg.rs (2)
  • from (25-27)
  • from (68-70)
crates/compiler-core/src/bytecode.rs (2)
  • from (345-347)
  • from (351-353)
🪛 Biome (2.1.2)
.cspell.json

[error] 69-69: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 69-69: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 121-121: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 121-121: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 122-122: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 122-122: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 123-123: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 123-123: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 124-124: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 124-124: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 125-125: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 125-125: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 126-126: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 126-126: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 127-127: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 127-127: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (windows-2025)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (windows-2025)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (52)
crates/vm/src/stdlib/sys.rs (1)

829-834: LGTM!

The change from consuming the frame (.into()) to cloning its object reference (.as_object().to_owned()) is appropriate. This preserves the original FrameRef for GC traversal while still providing an owned value to the dictionary, aligning with the broader GC/ownership improvements in this PR.

extra_tests/snippets/builtins_module.py (1)

25-35: LGTM!

The test correctly validates that function.__builtins__ is a dict as per Python's datamodel. The assertion includes a helpful diagnostic message showing the actual type on failure.

scripts/auto_mark_test.py (5)

26-36: LGTM!

Imports are appropriate for the new AST-based functionality.


119-132: LGTM!

The parsing logic for unexpected successes follows the same pattern as the existing fail/error parsing and correctly extracts the test name and path.


142-160: LGTM!

The AST inspection correctly identifies methods that are pure super-delegation wrappers.


163-228: Solid implementation for removing expected failures.

The hybrid AST + line-based approach works well for this use case. One minor consideration: the string check "expectedFailure" not in line_content on line 208 could theoretically match false positives in comments, but since we're only examining decorator lines, this is acceptable.


304-338: LGTM!

The logic correctly handles both adding expectedFailure decorators for newly failing tests and removing them from tests that now pass. The order of operations (remove first, then apply patches) is correct.

crates/derive-impl/src/util.rs (1)

366-376: LGTM!

The addition of "pop_edges" to ClassItemMeta::ALLOWED_NAMES correctly enables the new pop_edges attribute for #[pyclass] macro, aligning with the traversal/edge-popping capabilities introduced in the VM.

crates/derive-impl/src/pyclass.rs (3)

590-608: LGTM with note on intentional error handling.

The .ok().flatten() pattern on line 593 intentionally converts errors to None to handle the case where traverse is specified without a value (e.g., #[pyclass(traverse)] vs #[pyclass(traverse = "manual")]). The inline comment documents this behavior well.

Note: The assert_eq!(s, "manual") on line 601 is redundant since ALLOWED_TRAVERSE_OPTS currently only contains "manual", but it serves as a safety check if more options are added later.


632-646: Well-structured MaybeTraverse implementation generation.

The generated impl correctly:

  • Sets IS_TRACE and HAS_POP_EDGES constants based on attribute presence
  • Conditionally generates method bodies that either delegate to Traverse or are no-ops
  • Uses fully qualified paths for type references

This aligns well with the trait definition in traverse.rs.


577-588: Clear documentation of the traverse/pop_edges behavior.

The comments clearly explain the three states for traverse (none, manual, auto-derive) and two states for pop_edges (none, present). This makes the macro behavior easy to understand for future maintainers.

crates/common/src/refcount.rs (8)

1-17: LGTM - Re-exports and module documentation are clear.

The EBR type re-exports and cs alias for pin() provide good API compatibility while integrating with crossbeam-epoch.


18-39: LGTM - Bit layout constants are correctly defined.

The 64-bit state layout is sound:

  • Bits 0-30: strong count (31 bits)
  • Bits 31-60: weak count (30 bits)
  • Bit 61: LEAKED
  • Bit 62: WEAKED
  • Bit 63: DESTRUCTED

The previous bit overlap issue has been properly addressed.


41-87: LGTM - State struct implementation is correct.

The bit manipulation methods correctly handle the state layout. The division by COUNT in strong() is a no-op (since COUNT = 1) but maintains consistency if COUNT changes in the future.


133-166: LGTM - safe_inc() and dec() implementations are correct.

The CAS loop in safe_inc() properly handles concurrent access, and dec() correctly returns true when the count drops to zero while respecting leaked objects.


168-190: LGTM - leak() and is_leaked() are correctly implemented.

The CAS loop properly handles concurrent flag setting. The Acquire ordering in is_leaked() is sufficient for reading the flag.


192-224: LGTM - Panic-safe deferred drop guard.

The DeferredDropGuard RAII pattern correctly restores state and flushes deferred drops on both normal exit and panic unwinding, addressing the previous panic-safety concern.


226-261: LGTM - Deferred drop context functions are well-designed.

The with_deferred_drops properly uses RAII for panic safety, and try_defer_drop correctly handles the thread-local context check. The lack of Send bound is intentional and documented.


277-295: LGTM - defer_destruction correctly wraps EBR deferral.

The safety requirements are well-documented, and the function properly delegates to guard.defer_unchecked() with appropriate bounds (Send + 'static).

.cspell.json (1)

69-69: LGTM - Spell-check additions align with GC terminology.

The added terms ("finalizers", "uncollectable", "weaked") correspond to GC concepts introduced in this PR. Note that "weaked" is non-standard terminology - consider documenting its meaning in the code where it's used if not already done.

The Biome static analysis errors are false positives - this file uses JSON5 format which supports trailing commas and comments.

Also applies to: 121-121, 127-127

crates/vm/src/vm/context.rs (2)

54-57: LGTM - GC module state fields properly declared.

The gc_callbacks and gc_garbage lists mirror CPython's gc module design, providing storage for callback functions and uncollectable garbage respectively.


335-338: LGTM - GC lists properly initialized during genesis.

The initialization using PyRef::new_ref with PyList::default() is consistent with other singleton objects created in init_genesis.

crates/vm/src/lib.rs (1)

80-80: LGTM - Public GC state module export.

This properly exposes the GC state API for use by the stdlib gc module and other VM components.

crates/vm/src/vm/mod.rs (2)

553-562: LGTM - Proper exception suppression during finalization.

The early return when finalizing is true correctly mirrors CPython's behavior of silently ignoring __del__ errors and daemon thread exceptions during shutdown. Using Acquire ordering ensures visibility of the flag set with Release in interpreter.rs.


673-677: LGTM - EBR guard reactivation at safe points.

Reactivating the guard at frame boundaries allows epoch-based reclamation to advance, preventing memory from being held indefinitely during long-running operations.

crates/vm/src/vm/interpreter.rs (1)

143-152: LGTM - GC passes around module cleanup follow a sound pattern.

The two-pass approach is appropriate:

  1. First pass collects cycles before module cleanup (generation 0-2, full collection)
  2. finalize_modules breaks references in module namespaces
  3. Second pass collects the newly-breakable cyclic garbage (generation 0-2, full collection)

This mirrors CPython's finalization strategy of clearing modules to break cycles. The collect_force(2) parameter correctly targets generation 2 (the oldest), collecting across all three generations (0-2) for a complete cycle collection each time.

crates/vm/src/vm/thread.rs (1)

25-32: Well-structured EBR pinning lifecycle.

The three-function API (ensure_pinned, reactivate_guard, drop_guard) provides a clean interface for managing EBR guards. The documentation clearly explains the coarse-grained pinning strategy and when each function should be called.

Also applies to: 36-48, 50-71

crates/vm/src/stdlib/thread.rs (3)

429-459: Correct EBR pinning lifecycle in thread execution.

The thread properly enters the EBR critical section at the start with ensure_pinned() and exits it with drop_guard() at the end. This ensures GC won't free objects while the thread might access them. The cleanup order (sentinels → thread-local data → frame tracking → thread count → EBR guard) is appropriate.


676-682: Correct SystemExit subclass handling.

The updated check properly handles SystemExit subclasses, matching CPython's behavior where threading's _excepthook silently ignores SystemExit and all its subclasses. The is_some_and with fast_issubclass is the right approach.


525-530: Good use of is_some_and for cleaner code.

The change from map_or(false, ...) to is_some_and(...) is more idiomatic and clearer in intent.

crates/stdlib/src/gc.rs (2)

50-86: Solid collect implementation with proper callback lifecycle.

The function correctly:

  1. Validates generation parameter
  2. Invokes callbacks with "start" phase before collection
  3. Uses collect_force to run even when GC is disabled (matching CPython's manual gc.collect behavior)
  4. Transfers garbage objects to vm.ctx.gc_garbage for DEBUG_SAVEALL
  5. Invokes callbacks with "stop" phase after collection

164-174: Clean generation validation in get_objects.

The validation correctly handles the optional generation parameter and provides a clear error message matching CPython's format.

crates/vm/src/object/core.rs (5)

990-1023: Correct cycle-breaking order in drop_slow.

The implementation correctly:

  1. Untracks the object from GC before deallocation
  2. Extracts child references via pop_edges before deallocation
  3. Deallocates the object
  4. Drops child references after deallocation

This order ensures circular references are broken because children are dropped after the parent is freed, preventing use-after-free in cycles.


264-308: Well-designed clear_for_gc_collect_callbacks for GC semantics.

This method correctly separates weakref clearing from callback invocation, allowing GC to:

  1. Clear ALL weakrefs in unreachable objects first
  2. Collect all callbacks
  3. Invoke callbacks later (after all weakrefs are cleared)

This matches CPython's GC behavior where weakref callbacks are invoked in a second pass.


1361-1378: Correct GC tracking at object creation.

Objects are tracked if they have IS_TRACE (can contain references) or have an instance dict (user-defined class instances). The tracking condition here matches the untracking condition in drop_slow, ensuring consistency. Calling maybe_collect() after tracking allows automatic GC to run when thresholds are exceeded.


1055-1072: Correct finalizer invocation for GC.

try_call_finalizer properly:

  1. Sets the FINALIZED flag before calling __del__ to prevent double-invocation
  2. Reports errors via run_unraisable rather than propagating them
  3. Does not handle resurrection directly (that's the caller's responsibility)

This design allows GC to call finalizers in a controlled manner during collection.


920-944: Resurrection detection logic is correct.

The inc() method does perform a double increment from 0→2 (verified: it calls fetch_add(COUNT) once, checks if the old value was 0, then calls fetch_add(COUNT) again). The resurrection detection properly handles all scenarios:

  1. No resurrection: after_del equals after_inc (=2) → both decrements execute → count reaches 0
  2. Persistent resurrection: after_del > after_inc → first decrement executes, check detects it, second decrement skipped → count stays elevated
  3. Temporary reference in __del__: __del__ creates then drops a reference (3→2) → after_del still equals 2 → correctly treated as non-resurrection

The comparison at line 937 is sound because it checks after_del > after_inc before the second decrement, correctly distinguishing between references that persist versus those temporarily created during __del__ execution.

crates/vm/src/object/traverse_object.rs (1)

14-45: Clean vtable extension for edge popping.

The pop_edges field addition follows the same pattern as trace:

  • Optional function pointer for types that support edge extraction
  • Conditional initialization via const block based on HAS_POP_EDGES
  • Clear documentation explaining its role in circular reference resolution

The implementation is consistent with the existing vtable design.

crates/vm/src/builtins/list.rs (3)

6-6: No comment.


27-33: Manual traversal + pop_edges wiring looks good.


60-72: Traverse + pop_edges implementation looks solid.

crates/vm/src/builtins/tuple.rs (3)

6-6: No comment.


28-28: Manual traversal + pop_edges annotation looks good.


40-51: Traverse + pop_edges implementation is clear and consistent.

crates/vm/src/builtins/function.rs (6)

4-7: No comment.


28-28: Manual traversal + pop_edges annotation looks good.


46-97: GC traversal + edge popping coverage looks comprehensive.


108-122: Builtins normalization to dict is a nice robustness improvement.


724-737: New args shape (argdefs/closure/kwdefaults) is clear and consistent.


744-781: Constructor updates for closure/argdefs/kwdefaults look correct.

crates/compiler-core/src/bytecode/instruction.rs (2)

248-376: Opcode expansion reads cleanly and keeps IDs stable.

Explicit discriminants and the end-of-custom marker make the extended table straightforward to reason about.


401-406: Validation ranges are updated cleanly.

Deriving specialized/instrumented ranges from enum values keeps TryFrom<u8> aligned with the opcode table.

Also applies to: 447-448

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +118 to 131
/// Increment strong count
#[inline]
pub fn inc(&self) {
let old_size = self.strong.fetch_add(1, Relaxed);

if old_size & Self::MASK == Self::MASK {
let val = State::from_raw(self.state.fetch_add(COUNT, Ordering::SeqCst));
if val.destructed() {
// Already marked for destruction, but we're incrementing
// This shouldn't happen in normal usage
std::process::abort();
}
if val.strong() == 0 {
// The previous fetch_add created a permission to run decrement again
self.state.fetch_add(COUNT, Ordering::SeqCst);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

wc -l crates/common/src/refcount.rs

Repository: youknowone/RustPython

Length of output: 98


🏁 Script executed:

cat -n crates/common/src/refcount.rs

Repository: youknowone/RustPython

Length of output: 11057


🏁 Script executed:

rg -n "\.inc\(\)" --type rust -B3 -A3

Repository: youknowone/RustPython

Length of output: 2298


🏁 Script executed:

rg -n "fn dec\|fn decrement" --type rust -B2 -A10

Repository: youknowone/RustPython

Length of output: 47


🏁 Script executed:

rg -n "COUNT\s*=" crates/common/src/refcount.rs

Repository: youknowone/RustPython

Length of output: 47


Improve the inline comment in inc() to clarify the 0→2 resurrection behavior.

The logic on lines 127-130 is intentional but underdocumented: when strong count is 0 before the first fetch_add, a second fetch_add increments the count to 2. While the calling code in crates/vm/src/object/core.rs (lines 920-921) explains this as "inc() from 0 does a double increment (0→2) for thread safety," the inline comment on line 128 ("created a permission to run decrement again") is too cryptic.

Replace the inline comment with an explicit explanation of the 0→2 resurrection pattern and its thread-safety rationale to match the clarity in the calling code.

🤖 Prompt for AI Agents
In `@crates/common/src/refcount.rs` around lines 118 - 131, Replace the cryptic
comment inside inc() (around the second self.state.fetch_add(COUNT,
Ordering::SeqCst)) with an explicit explanation of the 0→2 resurrection pattern:
explain that when the previous value's strong() == 0 the first fetch_add created
a transient permission for a concurrent decrement path, so inc() must perform a
second fetch_add to raise the strong count from 0 to 2 (not 1) to avoid races
with concurrent drop/decrement; reference the involved symbols (inc,
State::from_raw, self.state.fetch_add, COUNT, destructed, strong) and briefly
state this is done for thread-safety so a concurrent decrement cannot observe an
intermediate 1 and prematurely destroy the object.

Comment on lines +693 to +784
Self::BinaryOpAddFloat => 0,
Self::BinaryOpAddInt => 0,
Self::BinaryOpAddUnicode => 0,
Self::BinaryOpMultiplyFloat => 0,
Self::BinaryOpMultiplyInt => 0,
Self::BinaryOpSubtractFloat => 0,
Self::BinaryOpSubtractInt => 0,
Self::BinarySubscrDict => 0,
Self::BinarySubscrGetitem => 0,
Self::BinarySubscrListInt => 0,
Self::BinarySubscrStrInt => 0,
Self::BinarySubscrTupleInt => 0,
Self::CallAllocAndEnterInit => 0,
Self::CallBoundMethodExactArgs => 0,
Self::CallBoundMethodGeneral => 0,
Self::CallBuiltinClass => 0,
Self::CallBuiltinFast => 0,
Self::CallBuiltinFastWithKeywords => 0,
Self::CallBuiltinO => 0,
Self::CallIsinstance => 0,
Self::CallLen => 0,
Self::CallListAppend => 0,
Self::CallMethodDescriptorFast => 0,
Self::CallMethodDescriptorFastWithKeywords => 0,
Self::CallMethodDescriptorNoargs => 0,
Self::CallMethodDescriptorO => 0,
Self::CallNonPyGeneral => 0,
Self::CallPyExactArgs => 0,
Self::CallPyGeneral => 0,
Self::CallStr1 => 0,
Self::CallTuple1 => 0,
Self::CallType1 => 0,
Self::CompareOpFloat => 0,
Self::CompareOpInt => 0,
Self::CompareOpStr => 0,
Self::ContainsOpDict => 0,
Self::ContainsOpSet => 0,
Self::ForIterGen => 0,
Self::ForIterList => 0,
Self::ForIterRange => 0,
Self::ForIterTuple => 0,
Self::LoadAttrClass => 0,
Self::LoadAttrGetattributeOverridden => 0,
Self::LoadAttrInstanceValue => 0,
Self::LoadAttrMethodLazyDict => 0,
Self::LoadAttrMethodNoDict => 0,
Self::LoadAttrMethodWithValues => 0,
Self::LoadAttrModule => 0,
Self::LoadAttrNondescriptorNoDict => 0,
Self::LoadAttrNondescriptorWithValues => 0,
Self::LoadAttrProperty => 0,
Self::LoadAttrSlot => 0,
Self::LoadAttrWithHint => 0,
Self::LoadGlobalBuiltin => 0,
Self::LoadGlobalModule => 0,
Self::LoadSuperAttrAttr => 0,
Self::LoadSuperAttrMethod => 0,
Self::ResumeCheck => 0,
Self::SendGen => 0,
Self::StoreAttrInstanceValue => 0,
Self::StoreAttrSlot => 0,
Self::StoreAttrWithHint => 0,
Self::StoreSubscrDict => 0,
Self::StoreSubscrListInt => 0,
Self::ToBoolAlwaysTrue => 0,
Self::ToBoolBool => 0,
Self::ToBoolInt => 0,
Self::ToBoolList => 0,
Self::ToBoolNone => 0,
Self::ToBoolStr => 0,
Self::UnpackSequenceList => 0,
Self::UnpackSequenceTuple => 0,
Self::UnpackSequenceTwoTuple => 0,
Self::InstrumentedResume => 0,
Self::InstrumentedEndFor => 0,
Self::InstrumentedEndSend => 0,
Self::InstrumentedReturnValue => 0,
Self::InstrumentedReturnConst => 0,
Self::InstrumentedYieldValue => 0,
Self::InstrumentedLoadSuperAttr => 0,
Self::InstrumentedForIter => 0,
Self::InstrumentedCall => 0,
Self::InstrumentedCallKw => 0,
Self::InstrumentedCallFunctionEx => 0,
Self::InstrumentedInstruction => 0,
Self::InstrumentedJumpForward => 0,
Self::InstrumentedJumpBackward => 0,
Self::InstrumentedPopJumpIfTrue => 0,
Self::InstrumentedPopJumpIfFalse => 0,
Self::InstrumentedPopJumpIfNone => 0,
Self::InstrumentedPopJumpIfNotNone => 0,
Self::InstrumentedLine => 0,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the file and understand its structure
head -750 crates/compiler-core/src/bytecode/instruction.rs | tail -100

Repository: youknowone/RustPython

Length of output: 4483


🏁 Script executed:

# Get more context around the lines mentioned (693-784)
sed -n '680,800p' crates/compiler-core/src/bytecode/instruction.rs

Repository: youknowone/RustPython

Length of output: 4975


🏁 Script executed:

# Find where TryFrom<u8> is implemented for this type
rg "impl.*TryFrom.*u8" crates/compiler-core/src/bytecode/instruction.rs -A 10

Repository: youknowone/RustPython

Length of output: 762


🏁 Script executed:

# Find the stack_effect function mentioned in the concern
rg "fn stack_effect|stack_effect\(" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 5

Repository: youknowone/RustPython

Length of output: 1250


🏁 Script executed:

# Look for the generic BinaryOp (not specialized versions)
rg "Self::BinaryOp[^A-Z]" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2

Repository: youknowone/RustPython

Length of output: 570


🏁 Script executed:

# Check the enum definition to see if these are marked as specialized/pseudo
rg "enum Instruction" crates/compiler-core/src/bytecode/instruction.rs -A 200 | head -300

Repository: youknowone/RustPython

Length of output: 4909


🏁 Script executed:

# Look for documentation or comments about pseudo instructions
rg "Pseudo instruction|pseudo opcode|specialized opcode" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2

Repository: youknowone/RustPython

Length of output: 268


🏁 Script executed:

# Search for where TryFrom<u8> is actually used - does it convert these variants?
rg "Instruction::try_from|TryFrom.*Instruction" crates/compiler-core/src/ --type rust -B 3 -A 3

Repository: youknowone/RustPython

Length of output: 2456


🏁 Script executed:

# Check if there are any assertions about which variants can appear in converted instructions
rg "assert|panic|unreachable" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2 | head -80

Repository: youknowone/RustPython

Length of output: 1043


🏁 Script executed:

# Look for the TryFrom implementation for Instruction to see which opcodes it actually creates
sed -n '1,300p' crates/compiler-core/src/bytecode/instruction.rs | grep -A 100 "impl TryFrom<u8> for Instruction"

Repository: youknowone/RustPython

Length of output: 47


🏁 Script executed:

# Find the full TryFrom<u8> implementation for Instruction
rg "impl TryFrom<u8> for Instruction" crates/compiler-core/src/bytecode/instruction.rs -A 150

Repository: youknowone/RustPython

Length of output: 5610


🏁 Script executed:

# Check if these placeholder variants are ever constructed in the codebase
rg "BinaryOpAddFloat|CallAllocAndEnterInit|ForIterGen|UnpackSequenceList" crates/compiler-core/src/ --type rust

Repository: youknowone/RustPython

Length of output: 988


🏁 Script executed:

# Search for where these are created - are they only from bytecode parsing?
rg "InstrumentedResume|InstrumentedCall|InstrumentedForIter" crates/compiler-core/src/ --type rust -B 2 -A 2

Repository: youknowone/RustPython

Length of output: 3213


🏁 Script executed:

# Look for PseudoInstruction enum to see if these variants are actually there
rg "enum PseudoInstruction" crates/compiler-core/src/bytecode/instruction.rs -A 100

Repository: youknowone/RustPython

Length of output: 3042


🏁 Script executed:

# Check the generic BinaryOp, Call, ForIter stack effects
rg "Self::BinaryOp \{|Self::Call \{|Self::ForIter \{" crates/compiler-core/src/bytecode/instruction.rs | grep "=>" | head -5

Repository: youknowone/RustPython

Length of output: 355


🏁 Script executed:

# Get more details on Call and ForIter effects
sed -n '500,650p' crates/compiler-core/src/bytecode/instruction.rs | grep -A 1 "Self::Call\|Self::ForIter"

Repository: youknowone/RustPython

Length of output: 830


Specialized and instrumented opcodes have incorrect zero stack effects, breaking stack-size verification.

These variants (BinaryOp*, Call*, ForIter*, UnpackSequence*, Instrumented*) are now explicitly accepted by TryFrom<u8> via unsafe transmute in the ranges (specialized_start..=specialized_end) and (instrumented_start..=instrumented_end). When present in bytecode, they will be successfully converted to Instruction enum variants, but all return 0 for stack_effect, which mismatches their generic equivalents:

  • BinaryOp { .. } => -1 vs. BinaryOpAddFloat etc. => 0
  • Call { nargs } => -(nargs.get(arg) as i32) - 2 + 1 vs. all Call* variants => 0
  • ForIter { .. } (conditional logic) vs. ForIterGen etc. => 0
  • UnpackSequence* variants all => 0

This will cause stack-size calculations or verification to undercount pops/pushes. Either normalize these opcodes to their generic equivalents before stack_effect is queried, provide correct stack effect values, or document/assert that these opcodes are guaranteed never to appear in bytecode.

🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 693 - 784, The
specialized (BinaryOpAddFloat, CallBuiltinFast, ForIterGen, UnpackSequenceList,
InstrumentedCall, etc.) and instrumented variants are being created via unsafe
transmute in the TryFrom<u8> ranges (specialized_start..=specialized_end and
instrumented_start..=instrumented_end) but their matches in the
Instruction::stack_effect method all return 0, which miscounts stack usage; fix
by mapping those specialized/instrumented variants to the equivalent generic
variants (e.g., BinaryOpAddFloat -> BinaryOp,
CallBuiltinFast/CallPyExactArgs/etc. -> Call with correct nargs handling,
ForIterGen -> ForIter with same conditional logic, UnpackSequenceList ->
UnpackSequence, Instrumented* -> corresponding base instruction) before
computing stack effect or implement correct stack_effect arms that compute the
same values as their generic counterparts, ensuring code paths reference the
Instruction enum variants (BinaryOp, Call, ForIter, UnpackSequence,
Instrumented* mapping) and preserve existing semantics used by stack-size
verification.

Comment on lines +137 to 141
// Suppress unraisable exceptions from daemon threads and __del__
// methods during shutdown.
vm.state.finalizing.store(true, Ordering::Release);

atexit::_run_exitfuncs(vm);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Setting finalizing before atexit may suppress important errors.

The finalizing flag is set before _run_exitfuncs, which means errors in atexit handlers will be silently suppressed (since run_unraisable returns early when finalizing is true). In CPython, atexit functions run before Py_Finalizing() becomes true.

Consider moving vm.state.finalizing.store(true, ...) to after _run_exitfuncs:

🔧 Suggested fix
             // Wait for non-daemon threads (wait_for_thread_shutdown)
             wait_for_thread_shutdown(vm);

+            atexit::_run_exitfuncs(vm);
+
             // Suppress unraisable exceptions from daemon threads and __del__
             // methods during shutdown.
             vm.state.finalizing.store(true, Ordering::Release);
-
-            atexit::_run_exitfuncs(vm);
🤖 Prompt for AI Agents
In `@crates/vm/src/vm/interpreter.rs` around lines 137 - 141, The finalizing flag
is set before running atexit handlers which causes run_unraisable to
short-circuit and suppress errors from atexit; update the shutdown sequence in
interpreter.rs so that atexit::_run_exitfuncs(vm) is called first and only after
it completes set vm.state.finalizing.store(true, Ordering::Release) (ensure any
logic that relies on finalizing remains correct after the move); reference
vm.state.finalizing and atexit::_run_exitfuncs when making the change.

@youknowone youknowone marked this pull request as draft January 16, 2026 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants