-
Notifications
You must be signed in to change notification settings - Fork 1
gc #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
gc #7
Conversation
📝 Walkthrough🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
🧹 Nitpick comments (16)
crates/vm/src/dict_inner.rs (1)
712-721: pop_all_entries keeps Dict invariants; consider reuse of clear semanticsThe method correctly:
- Obtains a mutable handle via
get_mut()(no lock, requires&mut self).- Resets
used/filledand marks all indicesFREEbefore draining entries.- Returns an iterator of
(PyObjectRef, T)for GC to consume, leaving the dict logically empty.One behavioral difference vs
clear()is thatindicescapacity is preserved instead of being reset to length 8. That’s probably fine for GC-focused use, but if this gets used on very large dicts in hot paths you might want to either reuseclear()’s resizing or document this capacity retention explicitly.crates/common/src/ebr/sync/once_lock.rs (1)
10-17: Consider usingOnce::is_completed()instead of a separateAtomicBool.The comment on line 12 mentions
Once::is_completedrequires Rust 1.43, but the workspace MSRV is 1.89.0 (per root Cargo.toml line 134). UsingOnce::is_completed()would eliminate theis_initializedfield and the associated synchronization overhead.🔎 Proposed simplification
pub(crate) struct OnceLock<T> { once: Once, - // Once::is_completed requires Rust 1.43, so use this to track of whether they have been initialized. - is_initialized: AtomicBool, value: UnsafeCell<MaybeUninit<T>>, - // Unlike std::sync::OnceLock, we don't need PhantomData here because - // we don't use #[may_dangle]. }Then replace
self.is_initialized()calls withself.once.is_completed().crates/vm/src/object/core.rs (1)
298-305: Subtle double-decrement prevention logic - add clarifying comment.The interaction between
already_clearedanddecrement_ref_countprevents double decrements when both GC anddrop_slow_innerclear weakrefs. However, the logic is subtle and could benefit from a more explicit comment explaining the scenario.// Only decrement ref_count if requested AND not already cleared + // This prevents double-decrement when: + // 1. GC calls clear_for_gc (decrement_ref_count=false) + // 2. Later, drop_slow_inner calls clear (decrement_ref_count=true) + // but obj is already None, so we skip the decrement if decrement_ref_count && !already_cleared {crates/stdlib/src/gc.rs (1)
261-266: Silent error ignoring in callback invocation.Errors from
set_itemandcallback.callare silently discarded. While CPython also handles callback errors gracefully, completely ignoring them could make debugging difficult.Consider logging errors or using
run_unraisablefor callback failures, similar to how__del__errors are handled in core.rs.- let _ = info.set_item("generation", vm.ctx.new_int(generation).into(), vm); - let _ = info.set_item("collected", vm.ctx.new_int(collected).into(), vm); - let _ = info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm); + // These set_item calls should not fail for simple int values + info.set_item("generation", vm.ctx.new_int(generation).into(), vm).ok(); + info.set_item("collected", vm.ctx.new_int(collected).into(), vm).ok(); + info.set_item("uncollectable", vm.ctx.new_int(uncollectable).into(), vm).ok(); for callback in callbacks { - let _ = callback.call((phase_str.clone(), info.clone()), vm); + if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) { + // Handle callback errors gracefully, similar to __del__ errors + vm.run_unraisable(e, Some("gc callback".to_owned()), callback); + } }Cargo.toml (1)
208-209: Both dependencies are necessary for the EBR implementation.The
memoffsetcrate is actively used incrates/common/src/ebr/internal.rs(lines 45, 575, 582). Whilestd::mem::offset_of!is available (MSRV 1.89.0 exceeds its 1.77 stabilization), consolidating on the standard library version could be a future refactoring. Theatomiccrate is essential—it providesAtomic<T>for arbitrary types, whichstd::sync::atomicdoesn't offer (only concrete types likeAtomicUsize).crates/common/src/ebr/mod.rs (1)
23-23: Verify ifinternalshould be publicly exposed.The
internalmodule is declared aspub mod internal, which exposes implementation details. This might be intentional for testing or advanced use cases within the rustpython workspace, but it's worth confirming that this is the desired visibility rather thanpub(crate).🔎 Consider restricting visibility if not needed externally
If the
internalmodule is only used within this crate:-pub mod internal; +pub(crate) mod internal;If it needs to be accessible to other rustpython crates but not external users, consider using
pub(crate)or visibility modifiers likepub(in crate::ebr).crates/vm/src/vm/mod.rs (1)
981-1052: Consider making the stdlib module list more maintainable.The hardcoded list of stdlib module names (lines 1006-1029) and the path-based detection (
pylib,Lib) are fragile:
- The stdlib list may become stale as modules are added/removed
- Path detection may fail on non-standard installations or alternative stdlib locations
Consider one of these alternatives:
- Use a centralized constant or configuration for stdlib module names
- Mark stdlib modules with a metadata flag during import
- Check if the module originated from frozen modules or builtin inits
That said, this is a reasonable first implementation for breaking cycles during shutdown.
crates/vm/src/builtins/function.rs (2)
53-57: Consider traversingnameandqualnamefields.The
traversemethod visitstype_params,annotations,module, anddoc, but notnameandqualname. Since these can bePyStrsubclasses (as noted inpop_edges), they may also participate in reference cycles and should be traversed for consistency.🔎 Proposed addition to traverse
self.type_params.lock().traverse(tracer_fn); self.annotations.lock().traverse(tracer_fn); self.module.lock().traverse(tracer_fn); self.doc.lock().traverse(tracer_fn); + self.name.lock().traverse(tracer_fn); + self.qualname.lock().traverse(tracer_fn); }
67-74:try_lockfailure silently skips clearing defaults.If
try_lock()fails (returnsNone), the defaults and kwdefaults are not cleared. While this is likely intentional for GC safety (avoiding deadlock), it could lead to reference cycles not being broken if the lock is contended during collection. Consider adding a debug assertion or log to track this case.crates/common/src/ebr/epoch.rs (1)
33-38: Verifywrapping_subcorrectness for edge cases.The optimization in
wrapping_subrelies on the LSB difference being discarded by the right shift. The comment explains this, but consider adding a unit test to verify behavior at epoch boundaries (e.g., nearusize::MAX).crates/vm/src/stdlib/thread.rs (2)
445-469: Constructor uses indexing after length validation - safe but consider bounds-checked access.The constructor validates
seq.len() == 4before indexing, so this is safe. However, usingget()with pattern matching would be more idiomatic Rust and defensive against future refactoring.🔎 Optional improvement using pattern matching
- if seq.len() != 4 { - return Err(vm.new_type_error(format!( - "_ExceptHookArgs expected 4 arguments, got {}", - seq.len() - ))); - } - Ok(Self { - exc_type: seq[0].clone(), - exc_value: seq[1].clone(), - exc_traceback: seq[2].clone(), - thread: seq[3].clone(), - }) + match <[_; 4]>::try_from(seq) { + Ok([exc_type, exc_value, exc_traceback, thread]) => Ok(Self { + exc_type, + exc_value, + exc_traceback, + thread, + }), + Err(seq) => Err(vm.new_type_error(format!( + "_ExceptHookArgs expected 4 arguments, got {}", + seq.len() + ))), + }
499-522: Deep nesting in stderr fallback logic reduces readability.The nested
if let/if/elsechain for getting stderr is hard to follow. Consider extracting to a helper function or using early returns.🔎 Suggested refactor using a helper closure
+ // Helper to get stderr, with fallback to thread._stderr + let get_stderr = || -> Option<crate::PyObjectRef> { + if let Ok(stderr) = vm.sys_module.get_attr("stderr", vm) { + if !vm.is_none(&stderr) { + return Some(stderr); + } + } + let thread = thread.as_ref()?; + if vm.is_none(thread) { + return None; + } + let thread_stderr = thread.get_attr("_stderr", vm).ok()?; + if vm.is_none(&thread_stderr) { + return None; + } + Some(thread_stderr) + }; + - // Get stderr - fall back to thread._stderr if sys.stderr is None - let stderr = match vm.sys_module.get_attr("stderr", vm) { - Ok(stderr) if !vm.is_none(&stderr) => stderr, - _ => { - // ... deeply nested logic ... - } - }; + let stderr = match get_stderr() { + Some(s) => s, + None => return Ok(()), + };crates/common/src/ebr/internal.rs (1)
55-58: Static mutable variables are unsafe and difficult to use correctly.
MAX_OBJECTSandMANUAL_EVENTS_BETWEEN_COLLECTarestatic mut, requiringunsafeblocks for every access. Consider usingAtomicUsizeoronce_cell::sync::Lazyfor safe interior mutability, especially if these might be configurable at runtime.🔎 Suggested safer alternative
-/// Maximum number of objects a bag can contain. -static mut MAX_OBJECTS: usize = 64; - -#[allow(dead_code)] -static mut MANUAL_EVENTS_BETWEEN_COLLECT: usize = 64; +use std::sync::atomic::AtomicUsize; + +/// Maximum number of objects a bag can contain. +static MAX_OBJECTS: AtomicUsize = AtomicUsize::new(64); + +#[allow(dead_code)] +static MANUAL_EVENTS_BETWEEN_COLLECT: AtomicUsize = AtomicUsize::new(64);Then update usages:
-Bag(Vec::with_capacity(unsafe { MAX_OBJECTS })) +Bag(Vec::with_capacity(MAX_OBJECTS.load(Ordering::Relaxed)))crates/vm/src/gc_state.rs (2)
406-418: Holding read locks during entire collection may cause contention.Lines 406-408 acquire read locks on all generation sets and hold them until line 513/523. During this time,
track_objectanduntrack_objectfrom other threads will block on write locks.For a GC that may take significant time, this could cause allocation pauses. Consider copying the object set and releasing locks earlier.
🔎 Alternative: copy and release locks early
- let gen_locks: Vec<_> = (0..=generation) - .filter_map(|i| self.generation_objects[i].read().ok()) - .collect(); - let mut collecting: HashSet<GcObjectPtr> = HashSet::new(); - for gen_set in &gen_locks { - for &ptr in gen_set.iter() { + for i in 0..=generation { + if let Ok(gen_set) = self.generation_objects[i].read() { + for &ptr in gen_set.iter() { + let obj = unsafe { ptr.0.as_ref() }; + if obj.strong_count() > 0 { + collecting.insert(ptr); + } + } + } // Lock released here + }This trades off snapshot consistency for reduced lock contention.
690-712:promote_survivorshas O(n*m) complexity - can be optimized.For each survivor, the code iterates through all generations 0..=from_gen to find which one contains it (lines 700-710). Since objects should be in exactly one generation, this is inefficient.
A more efficient approach would track which generation each object is in, or iterate the generation sets instead of survivors.
🔎 Sketch of more efficient approach
fn promote_survivors(&self, from_gen: usize, survivors: &HashSet<GcObjectPtr>) { if from_gen >= 2 { return; } let next_gen = from_gen + 1; // Process each generation once instead of per-survivor for gen_idx in 0..=from_gen { if let Ok(mut gen_set) = self.generation_objects[gen_idx].write() { let to_promote: Vec<_> = gen_set.intersection(survivors).copied().collect(); for ptr in to_promote { gen_set.remove(&ptr); if let Ok(mut next_set) = self.generation_objects[next_gen].write() { next_set.insert(ptr); } } } } }crates/common/src/ebr/collector.rs (1)
70-96: LocalHandle safely stores a raw pointer viahandle_counttracking—add safety documentation to explain the invariants.The unsafe code is sound:
Localis heap-allocated viaRawShared::from_owned()and lives longer than anyLocalHandledue to thehandle_countmechanism (initialized to 1, incremented/decremented on handle creation/drop, and only finalized when bothhandle_countandguard_countreach 0). The design correctly follows the epoch-based reclamation pattern.Add a safety comment to
LocalHandledocumenting these invariants so maintainers understand why the unsafe code is safe.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (11)
Cargo.lockis excluded by!**/*.lockLib/test/support/__init__.pyis excluded by!Lib/**Lib/test/test_builtin.pyis excluded by!Lib/**Lib/test/test_dict.pyis excluded by!Lib/**Lib/test/test_gc.pyis excluded by!Lib/**Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_super.pyis excluded by!Lib/**Lib/test/test_threading.pyis excluded by!Lib/**Lib/test/test_weakref.pyis excluded by!Lib/**Lib/test/test_with.pyis excluded by!Lib/**
📒 Files selected for processing (40)
Cargo.tomlcrates/common/Cargo.tomlcrates/common/src/ebr/collector.rscrates/common/src/ebr/default.rscrates/common/src/ebr/deferred.rscrates/common/src/ebr/epoch.rscrates/common/src/ebr/guard.rscrates/common/src/ebr/internal.rscrates/common/src/ebr/mod.rscrates/common/src/ebr/pointers.rscrates/common/src/ebr/sync/list.rscrates/common/src/ebr/sync/mod.rscrates/common/src/ebr/sync/once_lock.rscrates/common/src/ebr/sync/queue.rscrates/common/src/lib.rscrates/common/src/refcount.rscrates/derive-impl/src/pyclass.rscrates/derive-impl/src/pystructseq.rscrates/derive-impl/src/util.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/gc.rscrates/stdlib/src/ssl.rscrates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/str.rscrates/vm/src/builtins/super.rscrates/vm/src/builtins/tuple.rscrates/vm/src/dict_inner.rscrates/vm/src/gc_state.rscrates/vm/src/lib.rscrates/vm/src/object/core.rscrates/vm/src/object/traverse.rscrates/vm/src/object/traverse_object.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/context.rscrates/vm/src/vm/interpreter.rscrates/vm/src/vm/mod.rscrates/vm/src/vm/thread.rscrates/vm/src/warn.rs
🧰 Additional context used
🧬 Code graph analysis (24)
crates/vm/src/vm/context.rs (2)
crates/vm/src/object/core.rs (2)
new_ref(1290-1307)default(342-344)crates/vm/src/builtins/list.rs (1)
new_ref(91-93)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
gc_state(742-744)
crates/vm/src/builtins/dict.rs (2)
crates/vm/src/builtins/list.rs (2)
traverse(62-64)pop_edges(66-73)crates/vm/src/object/traverse.rs (9)
traverse(34-34)traverse(43-45)traverse(49-51)traverse(55-55)traverse(60-64)traverse(72-76)traverse(84-88)traverse(96-100)pop_edges(39-39)
crates/common/src/ebr/default.rs (4)
crates/common/src/ebr/collector.rs (6)
new(30-32)global_epoch(41-43)drop(91-95)drop(248-250)drop(313-315)drop(393-395)crates/common/src/ebr/epoch.rs (1)
new(90-93)crates/common/src/ebr/sync/once_lock.rs (2)
new(25-31)drop(98-103)crates/common/src/ebr/guard.rs (1)
drop(150-154)
crates/vm/src/warn.rs (6)
crates/vm/src/vm/mod.rs (1)
modules(990-990)crates/vm/src/builtins/type.rs (1)
__dict__(963-965)crates/vm/src/stdlib/builtins.rs (1)
globals(421-423)crates/vm/src/object/core.rs (1)
dict(732-734)crates/vm/src/builtins/dict.rs (1)
dict(782-782)crates/vm/src/frame.rs (1)
dict(801-801)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/str.rs (2)
try_traverse(1932-1934)try_pop_edges(1936-1938)crates/vm/src/object/traverse.rs (2)
try_traverse(19-19)try_pop_edges(21-21)
crates/stdlib/src/fcntl.rs (1)
extra_tests/snippets/stdlib_ctypes.py (1)
c_ulong(94-95)
crates/common/src/ebr/guard.rs (2)
crates/common/src/ebr/internal.rs (4)
defer(349-359)new(68-70)new(160-166)pin(392-454)crates/common/src/ebr/deferred.rs (5)
mem(33-33)mem(34-34)mem(37-37)mem(37-37)new(32-70)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
debug_obj(87-93)drop_dealloc_obj(84-86)try_pop_edges_obj(103-106)try_trace_obj(96-100)
crates/vm/src/builtins/function.rs (2)
crates/vm/src/builtins/list.rs (2)
traverse(62-64)pop_edges(66-73)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)
crates/common/src/ebr/sync/queue.rs (1)
crates/common/src/ebr/guard.rs (1)
unprotected(173-177)
crates/common/src/ebr/collector.rs (2)
crates/common/src/ebr/internal.rs (9)
fmt(117-119)default(101-103)new(68-70)new(160-166)register(300-323)pin(392-454)drop(107-112)release_handle(517-526)collector(333-335)crates/common/src/ebr/guard.rs (2)
fmt(158-160)drop(150-154)
crates/vm/src/vm/thread.rs (1)
crates/common/src/ebr/default.rs (1)
cs(24-26)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
ensure_pinned(30-36)drop_guard(55-59)
crates/stdlib/src/ssl.rs (2)
crates/vm/src/builtins/function.rs (1)
traverse(47-58)crates/vm/src/builtins/dict.rs (1)
traverse(48-50)
crates/common/src/ebr/mod.rs (5)
crates/common/src/ebr/internal.rs (2)
collector(333-335)default(101-103)crates/common/src/ebr/default.rs (1)
collector(11-15)crates/common/src/ebr/sync/list.rs (1)
default(133-137)crates/common/src/ebr/collector.rs (1)
default(21-25)crates/common/src/ebr/pointers.rs (1)
default(29-31)
crates/vm/src/builtins/tuple.rs (3)
crates/vm/src/builtins/list.rs (2)
traverse(62-64)pop_edges(66-73)crates/vm/src/builtins/dict.rs (2)
traverse(48-50)pop_edges(52-58)crates/vm/src/object/traverse.rs (10)
traverse(34-34)traverse(43-45)traverse(49-51)traverse(55-55)traverse(60-64)traverse(72-76)traverse(84-88)traverse(96-100)traverse(105-112)pop_edges(39-39)
crates/vm/src/object/traverse.rs (4)
crates/vm/src/builtins/str.rs (2)
try_traverse(1932-1934)try_pop_edges(1936-1938)crates/vm/src/builtins/list.rs (1)
pop_edges(66-73)crates/vm/src/builtins/dict.rs (1)
pop_edges(52-58)crates/vm/src/builtins/tuple.rs (1)
pop_edges(47-51)
crates/vm/src/builtins/str.rs (1)
crates/vm/src/object/traverse.rs (2)
try_traverse(19-19)try_pop_edges(21-21)
crates/common/src/ebr/epoch.rs (1)
crates/common/src/ebr/internal.rs (2)
new(68-70)new(160-166)
crates/common/src/refcount.rs (2)
crates/common/src/ebr/default.rs (2)
cs(24-26)global_epoch(44-46)crates/common/src/ebr/collector.rs (1)
global_epoch(41-43)
crates/vm/src/gc_state.rs (3)
crates/common/src/refcount.rs (2)
default(244-246)with_deferred_drops(396-413)crates/common/src/ebr/default.rs (1)
cs(24-26)crates/common/src/ebr/guard.rs (1)
drop(150-154)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
reactivate_guard(44-50)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (18)
gc_state(742-744)enable(164-166)disable(169-171)collect(368-370)get_threshold(184-190)set_threshold(64-66)set_threshold(193-201)get_count(204-210)get_debug(174-176)set_debug(179-181)get_stats(213-219)stats(68-75)get_objects(297-338)new(44-54)new(134-156)freeze(720-726)unfreeze(729-732)get_freeze_count(715-717)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Collect benchmark data
- GitHub Check: Collect what is left data
- GitHub Check: Collect code coverage data
- GitHub Check: Collect regression test data
🔇 Additional comments (73)
crates/derive-impl/src/util.rs (1)
375-375: LGTM! Thepop_edgesattribute is properly integrated throughout the codebase.The addition of "pop_edges" to ALLOWED_NAMES follows the pattern established by other attributes like "traverse" and enables
#[pyclass(pop_edges)]to pass validation. The macro implementation already checks for this attribute using_has_key("pop_edges")and generates the appropriateHAS_POP_EDGESconstant andtry_pop_edgesmethod based on its presence. Multiple builtin types (dict, list, tuple, function) already use this attribute with correspondingpop_edgesmethod implementations, and the GC system successfully leverages it to extract child references for circular reference resolution.crates/stdlib/src/fcntl.rs (1)
95-95: Conversion logic is correct; i64 choice is reasonable for the Python API.The truncation-then-zero-extend pattern correctly handles both positive values and negative i32 values passed as i64 (preserving their bit patterns). The i64 parameter type is appropriate for a Python API—it accepts a wider range of Python integer inputs while still safely truncating to u32 before passing to the syscall, which aligns with CPython's approach of accepting platform-sized integers for ioctl request codes. No breaking changes to internal Rust callers were found.
crates/common/Cargo.toml (1)
34-38: EBR dependency block looks consistent with existing workspace patternsThe added
crossbeam-utils,memoffset,scopeguard, andatomicworkspace dependencies are grouped clearly under an EBR comment and match the style of existing entries; no issues from the manifest side.crates/derive-impl/src/pyclass.rs (1)
608-643: MaybeTraverse generation fortraverse/pop_edgeslooks consistent with GC designThe new
MaybeTraverseimpl:
- Sets
IS_TRACEbased onhas_traverseand only callsTraverse::traversewhen requested.- Exposes a
HAS_POP_EDGESflag and routestry_pop_edgestoTraverse::pop_edgesonly whenpop_edgesis present.- Leaves both hooks as no‑ops when the attributes are absent, so all
#[pyclass]types implementMaybeTraverseuniformly.This matches the documented cases (manual vs derived traverse, optional pop_edges) and keeps the generated code straightforward; aside from the
_optional_strerror handling above, the structure looks solid.Also applies to: 699-706
crates/stdlib/src/ssl.rs (1)
2296-2345: GC traversal configuration for_SSLSocketlooks appropriateMarking
_SSLSocketas#[pyclass(..., traverse)]and selectively skipping fields with#[pytraverse(skip)]aligns well with what actually holds Python references:
- Traced:
sock,context,owner,session,incoming_bio,outgoing_bio,pending_context— all containPyObjectRef/PyRefand should participate in GC.- Skipped:
server_side,server_hostname,connection,handshake_done,session_was_reused,verified_chain,sni_state,client_hello_buffer,shutdown_state,deferred_cert_error— these are pure Rust state or TLS/certificate structures with no embedded Python objects.This should give the collector visibility into real Python edges (including the
_SSLSocket↔ owner/session relationships) without forcingTraverseimpls on the rustls plumbing types.crates/common/src/ebr/sync/once_lock.rs (1)
97-103: LGTM: Drop implementation is correct.The Drop implementation properly checks initialization before dropping the inner value, and uses
drop_in_placecorrectly through a raw pointer cast.crates/vm/src/object/core.rs (3)
870-888: Good: Proper finalization guard prevents double__del__calls.The
is_finalized/mark_finalizedpattern correctly prevents__del__from being called twice on resurrected objects. This aligns with CPython's behavior.
1290-1307: LGTM: GC tracking during object creation.The tracking logic correctly tracks objects with
IS_TRACEor instance dict, matching the untracking condition indrop_slow. Themaybe_collectcall enables automatic GC triggering.
831-856: Resurrection detection logic is sound.The
RefCount::inc()behavior is correctly implemented as documented: when called on strong count 0, it performs a double increment (0→2) by executing two separate atomic operations when the first increment observes strong count = 0. This is intentional for thread safety, as confirmed in the RefCount implementation (crates/common/src/refcount.rs:273-276).The resurrection detection correctly leverages this: after
inc(), the expected strong count is 2. Ifafter_delexceeds this, the__del__method created new references and the object is resurrected; the seconddec()is skipped to keep the object alive. Otherwise, both decrements run to return the count to 0, matching the double increment. The logic is correct.crates/stdlib/src/gc.rs (1)
50-86: LGTM:collectfunction correctly implements generation-based collection.The implementation properly validates generation bounds, invokes start/stop callbacks, forces collection even when GC is disabled (matching CPython behavior), and transfers garbage to the VM context.
crates/common/src/lib.rs (1)
17-17: LGTM: EBR module export.The new
ebrmodule is correctly added to the public API surface, enabling epoch-based reclamation functionality across the crate.crates/vm/src/lib.rs (1)
80-80: LGTM: GC state module export.The
gc_statemodule is correctly exposed as a public module, enabling GC functionality to be accessed from the stdlib and other components.crates/vm/src/builtins/str.rs (1)
1928-1938: LGTM: Correct traversal implementation forPyUtf8Str.
PyUtf8Strcorrectly implementsMaybeTraverse:
IS_TRACE = trueenables GC participationHAS_POP_EDGES = falseis correct since strings don't hold references to other Python objectstry_pop_edgesis appropriately a no-opcrates/vm/src/object/traverse.rs (3)
16-17: LGTM!The
HAS_POP_EDGESflag follows the same pattern asIS_TRACEand provides a clean compile-time capability check for the pop_edges functionality.
21-21: LGTM!The
try_pop_edgesmethod signature and default implementation are appropriate. The&mut selfparameter correctly allows implementations to take ownership of child references during GC cleanup.
36-39: LGTM!The
pop_edgesmethod is well-documented with clear semantics for circular reference resolution. The default no-op implementation is safe, and the relevant code snippets show that concrete types likePyDict,PyList, andPyTupleproperly implement this method to extract their owned child references.crates/vm/src/vm/context.rs (3)
55-57: LGTM!The new GC state fields are appropriately typed as
PyListRefand clearly documented. The public visibility is correct for VM-level GC integration.
336-338: LGTM!The initialization of
gc_callbacksandgc_garbagefollows the correct pattern used for other context objects likeempty_bytes. Both lists are properly tracked by the GC system viaPyRef::new_ref.
360-361: LGTM!The struct initialization correctly includes the new GC fields.
crates/common/src/ebr/sync/mod.rs (1)
1-5: LGTM!The synchronization primitives module structure is clean and appropriately scoped with
pub(crate)visibility for internal EBR use.crates/common/src/ebr/mod.rs (2)
1-17: LGTM!The module documentation provides a clear and comprehensive explanation of Epoch-Based Reclamation, including key concepts like pinning and critical sections.
27-30: LGTM!The re-exports provide a clean public API surface, making core EBR types and functions easily accessible through the module root.
crates/vm/src/builtins/super.rs (2)
64-64: LGTM!The type signature change from
PyTypeReftoPyObjectRefprovides more flexibility and aligns with Python's dynamic type system, wheresuper()can accept any object that is then checked at runtime.
78-81: LGTM!The downcast logic correctly validates that the first argument to
super()is a type object. The error message is clear and matches Python conventions, and the error handling viamap_erris appropriate.crates/vm/src/builtins/list.rs (2)
60-74: LGTM on the Traverse implementation.The
unsafe impl Traversecorrectly visits all ownedPyObjectRefs. Thepop_edgesimplementation usingtry_write()is appropriate for GC scenarios where the object should be unreachable.One minor observation: if
try_write()returnsNone(lock contention), elements won't be popped. The comment explains this is safe because objects should be unreachable during GC, but you may want to consider adding a debug assertion or logging for the failure case to catch unexpected contention during development.
27-33: Pyclass attributes look correct.The
traverse = "manual"andpop_edgesattributes properly indicate that this type provides a manualTraverseimplementation with edge-popping capability for GC, consistent with the pattern used forPyDictandPyTuple.crates/vm/src/vm/thread.rs (4)
15-20: Thread-local EBR guard storage looks correct.The
RefCell<Option<Guard>>pattern is appropriate for thread-local state that needs interior mutability. The doc comment clearly explains this is for coarse-grained pinning.
24-36: LGTM onensure_pinned().The function correctly uses
ebr::cs()to enter a critical section and stores the guard thread-locally. The idempotent check-then-set pattern is appropriate.
38-50: LGTM onreactivate_guard().Correctly calls
reactivate()on the existing guard to allow epoch advancement at safe points. The pattern of borrowing mutably only when a guard exists is efficient.
52-59: LGTM ondrop_guard().Setting the guard to
Nonewill drop theGuard, which unpins the thread as shown in the relevant snippet fromguard.rs(theDropimpl callslocal.unpin()).crates/vm/src/vm/mod.rs (2)
475-480: Correct CPython-aligned behavior for unraisable exceptions during shutdown.Suppressing unraisable exceptions when
finalizingistruematches CPython's behavior where daemon thread exceptions and__del__errors are silently ignored during interpreter shutdown.
533-538: Good placement for EBR guard reactivation.Frame boundaries are appropriate safe points for reactivating the EBR guard. This allows the GC to advance epochs and free deferred objects while ensuring no object references are held across the reactivation.
The
#[cfg(feature = "threading")]guard is correct since EBR is only relevant in threaded contexts.crates/vm/src/vm/interpreter.rs (2)
134-145: Good shutdown sequence for thread cleanup.Setting the
finalizingflag before callingthreading._shutdown()ensures that any exceptions from daemon threads during shutdown are suppressed. The error handling withlet _ = shutdown.call((), vm)silently ignores failures, which is appropriate during finalization.
149-158: Code pattern verified — bothgc_state()andcollect_force()are properly exposed and functional.The two-phase GC collection around module finalization is correct:
- First pass collects existing cycles before breaking module references
- Module cleanup breaks references to enable collection
- Second pass collects the newly unreachable cyclic garbage
Both
crate::gc_state::gc_state()(public function returning&'static GcState) andcollect_force()(public method accepting generation parameter) are properly defined and accessible.crates/derive-impl/src/pystructseq.rs (1)
610-622: LGTM on MaybeTraverse implementation for struct sequences.The implementation correctly:
- Sets
IS_TRACE = trueto enable traversal- Sets
HAS_POP_EDGES = falsesince edge popping is delegated to the innerPyTuple- Provides a no-op
try_pop_edgeswith appropriate comment- Delegates
try_traverseto the inner tuple (line 616)This follows the same pattern as
PyUtf8Strand other wrapper types.crates/vm/src/builtins/tuple.rs (2)
40-52: Correct Traverse implementation for PyTuple.The implementation is sound:
traverseproperly visits all elements via delegationpop_edgesusesstd::mem::takewhich replacesself.elementswith an empty boxed slice and returns the original- Converting to
Vecviainto_vec()and extending is correctUnlike
PyList, no locking is needed here because tuples are conceptually immutable (the elements are fixed at construction), and during GC the object is unreachable anyway.
28-28: Pyclass attributes correctly configured.The
traverse = "manual"andpop_edgesattributes match the manualTraverseimplementation below.crates/common/src/ebr/default.rs (3)
33-41: Good fallback handling inwith_handle.The
try_withwithunwrap_or_elsefallback correctly handles the case where the thread-localHANDLEhas been destroyed (e.g., during thread exit). By registering a new temporary handle with the collector, the function remains functional even during thread teardown.This is validated by the
pin_while_exitingtest which demonstrates pinning afterHANDLEis dropped doesn't panic.
11-26: Clean EBR default module design.The module correctly implements lazy initialization:
OnceLockensures the globalCollectoris initialized exactly once- Thread-local
HANDLEis registered lazily on first use per threadcs()provides a simple API for entering critical sections
48-76: Good test coverage for edge case.The
pin_while_exitingtest validates that callingcs()during thread exit (afterHANDLEis dropped but before other thread-locals likeFOO) doesn't panic. This is an important safety property for the EBR system.crates/vm/src/object/traverse_object.rs (2)
17-19: LGTM! Documentation and signature are clear.The
pop_edgeshook is well-documented and correctly typed as an optional function pointer for extracting child references before deallocation.
34-40: LGTM! Conditional initialization follows the existingtracepattern.The const block pattern mirrors the
tracefield initialization, maintaining consistency in how optional vtable hooks are configured based on type traits.crates/common/src/ebr/deferred.rs (1)
74-77:callconsumesselfbut doesn't prevent double-drop.The
callmethod consumesself, which is correct. However, since there's noDropimpl, ensure thatDeferredinstances are always consumed viacall()and never dropped without invocation. The current test coverage suggests this is the intended usage pattern.crates/common/src/ebr/guard.rs (2)
116-136: Good use ofscopeguard::defer!for panic safety inreactivate_after.The implementation correctly uses
scopeguard::defer!to ensure the guard is re-pinned even if the provided function panics. Themem::forget(local.pin())pattern properly transfers ownership of the new guard's lifetime to the existingGuardstruct.
148-155: LGTM!Dropimplementation correctly handles null local.The
as_ref()pattern safely handles both regular guards and unprotected guards (with null local pointer).crates/common/src/ebr/epoch.rs (2)
16-20: LGTM! Clear documentation of the bit layout.The LSB-as-pinned-flag design is well-documented and the
data: usizerepresentation is appropriate for atomic operations.
62-70: LGTM!successorcorrectly preserves pin state.Adding 2 increments the epoch value (stored in bits 1+) while preserving the LSB (pin state). This is correct given the bit layout.
crates/common/src/ebr/sync/list.rs (4)
219-233:Dropusesunprotected()guard - verify thread safety assumptions.The
Dropimplementation usesunprotected()which bypasses EBR protection. This is only safe if:
- The list is not being concurrently accessed by other threads
- All elements have already been marked as deleted (checked by assertion)
The assertion on line 227 (
assert_eq!(succ.tag(), 1)) will panic if any element wasn't properly deleted before the list is dropped. Consider usingdebug_assert!or returning an error if this is a recoverable condition.
162-194: LGTM! Insert uses correct lock-free CAS pattern.The insert operation correctly:
- Reads current head
- Sets new entry's next to current head
- Uses
compare_exchange_weakin a retry loop- Uses appropriate memory orderings (Release for success, Relaxed for retry)
239-296: Iterator correctly handles concurrent modifications.The iterator properly:
- Detects deleted nodes via tag bit
- Attempts to unlink deleted nodes via CAS
- Restarts from head when predecessor is deleted
- Returns
IterError::Stalledto signal restartThe logic is correct for a lock-free list traversal with concurrent modifications.
306-321: TestIsElementimplementation uses raw pointer cast.The
finalizeimplementation correctly defers destruction via the guard. Theunsafeblock is properly scoped with a safety comment.crates/vm/src/builtins/dict.rs (1)
46-59: No action required. Thepop_all_entriesmethod exists and is correctly implemented indict_inner.rsat line 715. It returns an iterator of(PyObjectRef, T)pairs, which matches the expected usage inpop_edges. TheTraverseimplementation properly delegates toself.entries.traverseand correctly collects both keys and values inpop_edges.crates/vm/src/builtins/function.rs (1)
85-92: No safety issue withContext::genesis()during GC.The code is safe.
Context::genesis()uses Rust'sstatic_cellwithget_or_init()pattern, which is thread-safe and idempotent—it initializes exactly once and returns a&'staticreference. The method cannot panic, even on first access. Additionally, garbage collection only runs after theVirtualMachineis fully initialized, and VM initialization explicitly callsContext::genesis()before any object creation. Thepop_edgesmethod will always have a valid, initialized genesis context available.crates/vm/src/stdlib/thread.rs (1)
351-376: EBR pinning correctly brackets thread execution.The
ensure_pinned()at thread start anddrop_guard()at thread exit properly implements coarse-grained EBR pinning. This ensures the GC won't reclaim objects while the thread is active.One consideration: if the thread panics between
ensure_pinned()anddrop_guard(), the guard may not be dropped. However, sinceEBR_GUARDis thread-local, it will be cleaned up when the thread exits anyway.crates/common/src/ebr/collector.rs (2)
104-423: Comprehensive test coverage for EBR pinning and reclamation.The tests cover critical scenarios: reentrant pinning, local bag flushing, garbage buffering, epoch advancement under pinning, stress tests with multiple threads, and array destruction. Good use of
#[cfg(miri)]for reduced iteration counts during Miri testing.
15-16: ManualSendandSyncimplementations are necessary and appropriately justified.The
CollectorwrapsArc<Global>, which is not automaticallySend + Sync. This is evidenced by the#[allow(clippy::arc_with_non_send_sync)]lint suppression inCollector::default(), which the Rust clippy linter uses to warn whenArcwraps types that don't implementSendorSync. SinceGlobalcontainsList<Local>with intrusive pointers toLocalentries—which themselves containUnsafeCell—the manual implementations are correctly required to affirm thread-safety. These impls should remain in place.crates/common/src/ebr/sync/queue.rs (4)
40-42: ManualSendandSyncimpls are appropriate for the queue.The safety comment on line 40 is accurate: individual
Tvalues are not accessed concurrently since they're only read/written during push/pop which are atomic operations. TheSendbound onTis correctly required.
141-169:pop_if_internalreads data before CAS - potential for reading invalid data if condition races.The condition reads
n.data.as_ptr()at line 151 before the CAS at line 153. If another thread pops this node between the read and the CAS:
- The CAS will fail (returning
Err)- But the condition was evaluated on potentially stale/freed data
However, since the guard is held during this operation, EBR guarantees the memory won't be reclaimed until the guard is dropped. The
T: Syncbound ensures the read is safe.
201-213: Drop implementation correctly drains queue before destroying sentinel.The
unprotected()guard usage inDropis safe becausedrophas exclusive access (&mut self). The sentinel node is properly destroyed after draining all data nodes.
249-257: Test wrapper'spop()spins indefinitely - acceptable for tests but would deadlock on empty queue.The
popmethod in the test wrapper loops forever until an element is available. This is fine for test scenarios where producers are guaranteed to push, but the comment or implementation could note this is intentional for testing.crates/common/src/ebr/internal.rs (4)
134-141: SealedBag expiration uses 3-epoch distance - matches EBR guarantees.The
is_expiredcheck at line 139 ensures bags are only reclaimed whenglobal_epoch.wrapping_sub(self.epoch) >= 3. This is correct for EBR: a pinned thread can observe at most one epoch advancement, so 3 epochs guarantees all pinned threads have unpinned since the bag was sealed.
390-454: Local::pin implements correct EBR pinning with x86 optimization.The pinning logic:
- Increments
guard_count(lines 395-396)- On first pin (
guard_count == 0), stores pinned epoch with proper synchronization- Uses x86-specific
compare_exchangeoptimization (lines 406-434) for performance- Retries if global epoch advanced during pinning (lines 440-444)
- Resets advance counter on epoch change (lines 447-450)
The x86 hack using
compare_exchangeinstead ofmfenceis a known optimization from crossbeam-epoch.
456-480: Local::unpin triggers collection when last guard is dropped.The unpin logic at lines 460-470 triggers collection while still pinned (
guard_count == 1). This is intentional - collection requires being pinned to safely traverse the object graph. Thecollectingflag prevents reentrancy.Note: Line 472 decrements
guard_countafter the collection loop, which is correct - the ManuallyDrop guard at line 465 prevents double-decrement.
573-593: IsElement trait implementation uses offset_of! for intrusive list.The
entry_ofandelement_ofimplementations correctly convert betweenLocalandEntryusingoffset_of!. Thefinalizemethod defers destruction of theLocalnode.Line 590: The
RawShared::fromconversion from*const Localis safe because the caller guarantees the entry belongs to a validLocal.crates/common/src/ebr/pointers.rs (3)
62-72: High tag constants and bit manipulation are correct.
HIGH_TAG_WIDTH = 4uses the top 4 bits of the pointer. Thehigh_bits()andhigh_bits_pos()calculations are correct:
high_bits_pos() = 64 - 4 = 60(on 64-bit)high_bits() = 0xF << 60 = 0xF000_0000_0000_0000This reserves the top 4 bits for epoch tagging while preserving pointer validity in the middle bits.
148-156:low_bits<T>()correctly computes alignment-based tag mask.The function
(1 << align_of::<T>().trailing_zeros()) - 1computes the maximum value that fits in the alignment bits. For example, ifalign_of::<T>() == 8, this returns0b111 = 7.
258-306: RawShared provides safe wrappers around unsafe pointer operations.The
from_owned,drop,deref, andas_refmethods correctly document their safety requirements. The#[cfg(test)]onis_nullkeeps it out of production builds while available for testing.crates/common/src/refcount.rs (4)
301-312:dec()correctly handles LEAKED objects.The decrement returns
falsefor leaked objects (line 307-309), preventing deallocation of interned objects. The checkold.strong() == 1(line 311) correctly returns true only when this was the last reference.
382-448: Deferred drop infrastructure is well-designed for deadlock prevention.The thread-local
IN_DEFERRED_CONTEXTandDEFERRED_QUEUEpattern allows deferring untrack calls during GC collection. Key points:
with_deferred_dropssets context and flushes on exit (lines 400-412)try_defer_dropqueues if in context, else executes immediately (lines 425-433)flush_deferred_dropsdrains queue outside borrow (lines 440-447)The comment at lines 378-380 explains the deadlock scenario this prevents.
461-468:defer_destructionis marked unsafe but doesn't validate guard.The safety comment requires the guard to be valid, but there's no runtime check. This is acceptable since it's
unsafe, but consider adding a debug assertion.
264-277: Clarify the strong==0 case and document abort rationale.Line 273-276: The comment "The previous fetch_add created a permission to run decrement again" is cryptic and doesn't explain why a second
fetch_addis needed whenstrong == 0. This appears to handle a race condition with concurrent decrements, but the logic should be documented explicitly.Lines 268-272: The
abort()call on a destructed object is a severe failure (kills the process). Document why this is preferred overpanic!()orunreachable!(). Note thatsafe_inc()(lines 282-299) handles the destructed case more gracefully with compare-exchange; clarify which path is recommended and whyinc()takes the fatal approach.crates/vm/src/gc_state.rs (2)
85-92:GcObjectPtrmanual Send/Sync requires careful justification.The wrapper makes
NonNull<PyObject>Send+Sync. The safety comment "proper synchronization is used" is vague. The actual safety relies on:
- Objects being accessed through PyObjectRef which manages synchronization
- GcObjectPtr only being used as a key (for hashing/equality), not for dereferencing without synchronization
Consider strengthening the safety comment to document these invariants.
735-744: Global singleton pattern usingOnceLockis appropriate.The
GC_STATEsingleton withOnceLockprovides thread-safe lazy initialization. Thegc_state()accessor is clean and idiomatic.
crates/common/src/ebr/collector.rs
Outdated
| /// Reads the global epoch, without issueing a fence. | ||
| #[inline] | ||
| pub fn global_epoch(&self) -> Epoch { | ||
| self.global.epoch.load(Ordering::Relaxed) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in doc comment: "issueing" should be "issuing".
🔎 Proposed fix
- /// Reads the global epoch, without issueing a fence.
+ /// Reads the global epoch, without issuing a fence.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Reads the global epoch, without issueing a fence. | |
| #[inline] | |
| pub fn global_epoch(&self) -> Epoch { | |
| self.global.epoch.load(Ordering::Relaxed) | |
| } | |
| /// Reads the global epoch, without issuing a fence. | |
| #[inline] | |
| pub fn global_epoch(&self) -> Epoch { | |
| self.global.epoch.load(Ordering::Relaxed) | |
| } |
🤖 Prompt for AI Agents
In crates/common/src/ebr/collector.rs around lines 39 to 43, the doc comment for
global_epoch contains a typo: "issueing" should be corrected to "issuing";
update the doc comment to read "Reads the global epoch, without issuing a
fence." so the documentation is spelled correctly.
crates/common/src/ebr/deferred.rs
Outdated
| pub(crate) struct Deferred { | ||
| call: unsafe fn(*mut u8), | ||
| data: MaybeUninit<Data>, | ||
| _marker: PhantomData<*mut ()>, // !Send + !Sync | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing Drop implementation may cause memory leaks.
Deferred stores a FnOnce() closure inline or boxed, but has no Drop implementation. If a Deferred is dropped without call() being invoked, the stored closure (and any captured data) will be leaked. This is particularly concerning for the heap-allocated path where a Box<F> is stored.
Consider adding a Drop impl that invokes the closure or at least drops the stored data, unless leaking is intentional for the EBR design (in which case, document this behavior).
🔎 Proposed Drop implementation
+impl Drop for Deferred {
+ fn drop(&mut self) {
+ // Call the deferred function to ensure it's not leaked
+ // This is safe because we own self
+ let call = self.call;
+ unsafe { call(self.data.as_mut_ptr().cast::<u8>()) };
+ }
+}Alternatively, if leaking is intentional, add documentation explaining why.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/common/src/ebr/deferred.rs around lines 18-22, Deferred currently
holds a closure (inline or boxed) but has no Drop implementation, so if a
Deferred is dropped without call() being invoked the stored closure/data will be
leaked; add a Drop impl that safely releases stored resources: on drop, check
the active variant/state and either invoke the stored destructor function (the
unsafe fn(*mut u8)) against the data buffer to run/drop the closure or, if
invoking is unsafe for EBR semantics, explicitly run the correct destructor for
the boxed path and drop the inline data to avoid leaks; ensure the Drop uses the
same unsafe logic as call(), avoids double-running if call() was already
executed (e.g. by clearing a flag or setting data to a neutral state), and
documents the chosen behavior if intentional leaking is preserved.
crates/common/src/ebr/guard.rs
Outdated
| /// # Safety | ||
| /// | ||
| /// The given function must not hold reference onto the stack. It is highly recommended that | ||
| /// the passed function is **always** marked with `move` in order to prevent accidental | ||
| /// borrows. | ||
| /// | ||
| /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by | ||
| /// the closure must be `Send`. | ||
| /// # Safety | ||
| /// | ||
| /// The given function must not hold reference onto the stack. It is highly recommended that | ||
| /// the passed function is **always** marked with `move` in order to prevent accidental | ||
| /// borrows. The closure must be `Send` as another thread may execute it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate safety documentation.
The safety documentation for defer_unchecked is duplicated (lines 32-39 and 40-44). Remove the duplicate section.
🔎 Proposed fix
/// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by
/// the closure must be `Send`.
- /// # Safety
- ///
- /// The given function must not hold reference onto the stack. It is highly recommended that
- /// the passed function is **always** marked with `move` in order to prevent accidental
- /// borrows. The closure must be `Send` as another thread may execute it.
pub unsafe fn defer_unchecked<F, R>(&self, f: F)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// # Safety | |
| /// | |
| /// The given function must not hold reference onto the stack. It is highly recommended that | |
| /// the passed function is **always** marked with `move` in order to prevent accidental | |
| /// borrows. | |
| /// | |
| /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by | |
| /// the closure must be `Send`. | |
| /// # Safety | |
| /// | |
| /// The given function must not hold reference onto the stack. It is highly recommended that | |
| /// the passed function is **always** marked with `move` in order to prevent accidental | |
| /// borrows. The closure must be `Send` as another thread may execute it. | |
| /// # Safety | |
| /// | |
| /// The given function must not hold reference onto the stack. It is highly recommended that | |
| /// the passed function is **always** marked with `move` in order to prevent accidental | |
| /// borrows. | |
| /// | |
| /// Apart from that, keep in mind that another thread may execute `f`, so anything accessed by | |
| /// the closure must be `Send`. | |
| pub unsafe fn defer_unchecked<F, R>(&self, f: F) |
🤖 Prompt for AI Agents
crates/common/src/ebr/guard.rs lines 32-44: the safety documentation for
defer_unchecked is duplicated across two blocks; remove the redundant copy so
only a single Safety section remains, keeping the more concise/accurate wording
(ensure it states that the closure must not hold stack references, should be
marked move to avoid borrows, and must be Send), and adjust surrounding comment
spacing to preserve formatting.
crates/common/src/ebr/pointers.rs
Outdated
| pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> { | ||
| // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`. | ||
| // The equality of the sizes is checked by `const_assert!`. | ||
| let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) }; | ||
| let prev = inner.fetch_or(low_bits::<T>() & tag, order); | ||
| RawShared::from(prev as *const _) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fetch_or uses unsafe transmute to AtomicUsize - missing size assertion.
The comment mentions "The equality of the sizes is checked by const_assert!" but no const_assert! is present in this file. This is a latent bug if Atomic<Tagged<T>> ever differs in size from AtomicUsize.
🔎 Add the missing size assertion
+use static_assertions::const_assert_eq;
+const_assert_eq!(core::mem::size_of::<Atomic<Tagged<()>>>(), core::mem::size_of::<AtomicUsize>());
+
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
// HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`.
// The equality of the sizes is checked by `const_assert!`.Alternatively, add a debug_assert at runtime:
pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> {
+ debug_assert_eq!(
+ core::mem::size_of::<Atomic<Tagged<T>>>(),
+ core::mem::size_of::<AtomicUsize>()
+ );
let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> { | |
| // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`. | |
| // The equality of the sizes is checked by `const_assert!`. | |
| let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) }; | |
| let prev = inner.fetch_or(low_bits::<T>() & tag, order); | |
| RawShared::from(prev as *const _) | |
| } | |
| } | |
| pub fn fetch_or<'g>(&self, tag: usize, order: Ordering, _: &'g Guard) -> RawShared<'g, T> { | |
| debug_assert_eq!( | |
| core::mem::size_of::<Atomic<Tagged<T>>>(), | |
| core::mem::size_of::<AtomicUsize>() | |
| ); | |
| // HACK: The size and alignment of `Atomic<TaggedCnt<T>>` will be same with `AtomicUsize`. | |
| // The equality of the sizes is checked by `const_assert!`. | |
| let inner = unsafe { &*(&self.inner as *const _ as *const AtomicUsize) }; | |
| let prev = inner.fetch_or(low_bits::<T>() & tag, order); | |
| RawShared::from(prev as *const _) | |
| } | |
| } |
| /// State wraps reference count + flags in a single 64-bit word | ||
| #[derive(Clone, Copy)] | ||
| pub struct State { | ||
| inner: u64, | ||
| } | ||
|
|
||
| impl State { | ||
| #[inline] | ||
| pub fn from_raw(inner: u64) -> Self { | ||
| Self { inner } | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn as_raw(self) -> u64 { | ||
| self.inner | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn strong(self) -> u32 { | ||
| ((self.inner & STRONG) / COUNT) as u32 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn weak(self) -> u32 { | ||
| ((self.inner & WEAK) / WEAK_COUNT) as u32 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn destructed(self) -> bool { | ||
| (self.inner & DESTRUCTED) != 0 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn weaked(self) -> bool { | ||
| (self.inner & WEAKED) != 0 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn leaked(self) -> bool { | ||
| (self.inner & LEAKED) != 0 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn epoch(self) -> u32 { | ||
| ((self.inner & EPOCH) >> EPOCH_MASK_HEIGHT) as u32 | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn with_epoch(self, epoch: usize) -> Self { | ||
| Self::from_raw((self.inner & !EPOCH) | (((epoch as u64) << EPOCH_MASK_HEIGHT) & EPOCH)) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn add_strong(self, val: u32) -> Self { | ||
| Self::from_raw(self.inner + (val as u64) * COUNT) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn sub_strong(self, val: u32) -> Self { | ||
| debug_assert!(self.strong() >= val); | ||
| Self::from_raw(self.inner - (val as u64) * COUNT) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn add_weak(self, val: u32) -> Self { | ||
| Self::from_raw(self.inner + (val as u64) * WEAK_COUNT) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn with_destructed(self, dest: bool) -> Self { | ||
| Self::from_raw((self.inner & !DESTRUCTED) | if dest { DESTRUCTED } else { 0 }) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn with_weaked(self, weaked: bool) -> Self { | ||
| Self::from_raw((self.inner & !WEAKED) | if weaked { WEAKED } else { 0 }) | ||
| } | ||
|
|
||
| #[inline] | ||
| pub fn with_leaked(self, leaked: bool) -> Self { | ||
| Self::from_raw((self.inner & !LEAKED) | if leaked { LEAKED } else { 0 }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
State and PyState are identical - consider removing duplication.
PyState (lines 151-233) has the exact same fields and methods as State (lines 33-115). This violates DRY and creates maintenance burden. If they need to diverge in the future, document why. Otherwise, use a single type or a type alias.
🔎 Proposed fix - use type alias or single type
-/// PyState extends State with LEAKED support for RustPython
-#[derive(Clone, Copy)]
-pub struct PyState {
- inner: u64,
-}
-
-impl PyState {
- // ... 80+ lines of duplicated methods ...
-}
+/// PyState is an alias for State, used in RustPython integration
+pub type PyState = State;If distinct types are truly needed for type safety, consider a generic or trait-based approach.
Also applies to: 151-233
🤖 Prompt for AI Agents
crates/common/src/refcount.rs lines 33-115 (and PyState at 151-233): the State
and PyState implementations are identical, so remove duplication by
consolidating into a single definition; either replace PyState with a type alias
(pub type PyState = State) and remove the duplicate struct/impl, or keep one
struct (State) and extract shared methods into a trait/generic used by both
names, updating all call sites to use the unified type; ensure visibility (pub)
and any module exports are preserved and add a short comment if you
intentionally keep a separate alias for future divergence.
| const ALLOWED_TRAVERSE_OPTS: &[&str] = &["manual"]; | ||
| // try to know if it have a `#[pyclass(trace)]` exist on this struct | ||
| // TODO(discord9): rethink on auto detect `#[Derive(PyTrace)]` | ||
|
|
||
| // 1. no `traverse` at all: generate a dummy try_traverse | ||
| // 2. `traverse = "manual"`: generate a try_traverse, but not #[derive(Traverse)] | ||
| // 3. `traverse`: generate a try_traverse, and #[derive(Traverse)] | ||
| let (maybe_trace_code, derive_trace) = { | ||
| if class_meta.inner()._has_key("traverse")? { | ||
| let maybe_trace_code = quote! { | ||
| impl ::rustpython_vm::object::MaybeTraverse for #ident { | ||
| const IS_TRACE: bool = true; | ||
| fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) { | ||
| ::rustpython_vm::object::Traverse::traverse(self, tracer_fn); | ||
| } | ||
| // Generate MaybeTraverse impl with both traverse and pop_edges support | ||
| // | ||
| // For traverse: | ||
| // 1. no `traverse` at all: IS_TRACE = false, try_traverse does nothing | ||
| // 2. `traverse = "manual"`: IS_TRACE = true, but no #[derive(Traverse)] | ||
| // 3. `traverse`: IS_TRACE = true, and #[derive(Traverse)] | ||
| // | ||
| // For pop_edges: | ||
| // 1. no `pop_edges`: HAS_POP_EDGES = false, try_pop_edges does nothing | ||
| // 2. `pop_edges`: HAS_POP_EDGES = true, try_pop_edges calls Traverse::pop_edges | ||
| let has_traverse = class_meta.inner()._has_key("traverse")?; | ||
| let has_pop_edges = class_meta.inner()._has_key("pop_edges")?; | ||
|
|
||
| let derive_trace = if has_traverse { | ||
| let value = class_meta.inner()._optional_str("traverse"); | ||
| if let Ok(Some(s)) = value { | ||
| if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) { | ||
| bail_span!( | ||
| item, | ||
| "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all", | ||
| ); | ||
| } | ||
| assert_eq!(s, "manual"); | ||
| quote! {} | ||
| } else { | ||
| quote! {#[derive(Traverse)]} | ||
| } | ||
| } else { | ||
| quote! {} | ||
| }; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don’t ignore errors from _optional_str("traverse") when deriving Traverse
The new derive_trace logic is almost there, but this pattern:
let value = class_meta.inner()._optional_str("traverse");
if let Ok(Some(s)) = value {
// ...
} else {
quote! {#[derive(Traverse)]}
}silently treats any Err(_) from _optional_str as “no value, derive Traverse”, rather than surfacing a compile‑time error. That means malformed uses like #[pyclass(traverse(123))] could compile and unexpectedly auto‑derive Traverse instead of failing in the macro.
You can preserve the intended behavior while propagating parse errors by binding the Result with ? first:
Proposed fix for error propagation in traverse handling
- let derive_trace = if has_traverse {
- let value = class_meta.inner()._optional_str("traverse");
- if let Ok(Some(s)) = value {
+ let derive_trace = if has_traverse {
+ let value = class_meta.inner()._optional_str("traverse")?;
+ if let Some(s) = value {
if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) {
bail_span!(
item,
"traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all",
);
}
assert_eq!(s, "manual");
quote! {}
} else {
quote! {#[derive(Traverse)]}
}
} else {
quote! {}
};🤖 Prompt for AI Agents
crates/derive-impl/src/pyclass.rs lines 576-607: the code currently ignores Err
from _optional_str("traverse") which can hide parse errors; change to propagate
the Result by using let value = class_meta.inner()._optional_str("traverse")?;
then match on value (Some(s) => validate s is "manual" and return quote!{} for
manual, None => return quote!{#[derive(Traverse)]}) so any Err is returned to
the caller instead of silently treating it as None.
| /// Return the list of objects that directly refer to any of the arguments. | ||
| #[pyfunction] | ||
| fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult { | ||
| Err(vm.new_not_implemented_error("")) | ||
| fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef { | ||
| // This is expensive: we need to scan all tracked objects | ||
| // For now, return an empty list (would need full object tracking to implement) | ||
| let _ = args; | ||
| vm.ctx.new_list(vec![]) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete implementation: get_referrers always returns an empty list.
This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This could cause subtle bugs in code relying on this CPython feature.
Consider either:
- Implementing it properly by scanning all tracked objects
- Raising
NotImplementedErrorto make the limitation explicit - Adding a clear warning in the docstring
+ /// Return the list of objects that directly refer to any of the arguments.
+ ///
+ /// NOTE: This is currently not implemented and always returns an empty list.
+ /// Implementing this requires scanning all tracked objects which is expensive.
#[pyfunction]
fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef {
- // This is expensive: we need to scan all tracked objects
- // For now, return an empty list (would need full object tracking to implement)
let _ = args;
+ // TODO: Implement by iterating gc_state().get_objects(None) and checking referents
vm.ctx.new_list(vec![])
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Return the list of objects that directly refer to any of the arguments. | |
| #[pyfunction] | |
| fn get_referrers(_args: FuncArgs, vm: &VirtualMachine) -> PyResult { | |
| Err(vm.new_not_implemented_error("")) | |
| fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef { | |
| // This is expensive: we need to scan all tracked objects | |
| // For now, return an empty list (would need full object tracking to implement) | |
| let _ = args; | |
| vm.ctx.new_list(vec![]) | |
| } | |
| /// Return the list of objects that directly refer to any of the arguments. | |
| /// | |
| /// NOTE: This is currently not implemented and always returns an empty list. | |
| /// Implementing this requires scanning all tracked objects which is expensive. | |
| #[pyfunction] | |
| fn get_referrers(args: FuncArgs, vm: &VirtualMachine) -> PyListRef { | |
| let _ = args; | |
| // TODO: Implement by iterating gc_state().get_objects(None) and checking referents | |
| vm.ctx.new_list(vec![]) | |
| } |
🤖 Prompt for AI Agents
In crates/stdlib/src/gc.rs around lines 191 to 198, get_referrers currently
returns an empty list which misrepresents its documented behavior; change it to
explicitly signal the limitation by returning/raising a NotImplementedError (or
the project-equivalent VM exception) instead of an empty list and update the
docstring to mention that get_referrers is not implemented on this platform;
ensure the function uses the VM to construct and return the proper exception so
callers receive an explicit error rather than an empty result.
crates/vm/src/warn.rs
Outdated
| let (globals, filename, lineno) = if let Some(f) = f { | ||
| (f.globals.clone(), f.code.source_path, f.f_lineno()) | ||
| } else if let Some(frame) = vm.current_frame() { | ||
| // We have a frame but it wasn't found during stack walking | ||
| (frame.globals.clone(), vm.ctx.intern_str("sys"), 1) | ||
| } else { | ||
| (vm.current_globals().clone(), vm.ctx.intern_str("sys"), 1) | ||
| // No frames on the stack - we're in interpreter shutdown or similar state | ||
| // Use __main__ globals if available, otherwise skip the warning | ||
| match vm | ||
| .sys_module | ||
| .get_attr(identifier!(vm, modules), vm) | ||
| .and_then(|modules| modules.get_item(vm.ctx.intern_str("__main__"), vm)) | ||
| .and_then(|main_mod| main_mod.get_attr(identifier!(vm, __dict__), vm)) | ||
| { | ||
| Ok(globals) => { | ||
| if let Ok(dict) = globals.downcast::<crate::builtins::PyDict>() { | ||
| (dict, vm.ctx.intern_str("sys"), 1) | ||
| } else { | ||
| // Cannot get globals, skip warning | ||
| return Ok(( | ||
| vm.ctx.intern_str("sys").to_owned(), | ||
| 1, | ||
| None, | ||
| vm.ctx.new_dict().into(), | ||
| )); | ||
| } | ||
| } | ||
| Err(_) => { | ||
| // Cannot get __main__ globals, skip warning | ||
| return Ok(( | ||
| vm.ctx.intern_str("sys").to_owned(), | ||
| 1, | ||
| None, | ||
| vm.ctx.new_dict().into(), | ||
| )); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shutdown fallback currently still emits warnings despite “skip warning” comment
In the final else branch of setup_context, both “Cannot get globals, skip warning” comments suggest the intent is to suppress the warning when there are no frames / __main__ globals. However, the function returns a normal (filename, lineno, module = None, registry = new_dict()) tuple; warn then still calls warn_explicit, which normalizes module from filename and proceeds to emit the warning.
If the goal is to truly skip warnings in these shutdown-like states, you’ll need a distinct signal (e.g., have setup_context return an Option and early‑return in warn, or return a sentinel registry/module that warn checks before calling warn_explicit). Otherwise, it’d be good to adjust or remove the “skip warning” comments to match the actual behavior and avoid confusion.
🤖 Prompt for AI Agents
In crates/vm/src/warn.rs around lines 405-441, the shutdown fallback branches
claim “skip warning” but still return a normal tuple so warn continues and emits
warnings; change the API so setup_context can signal “skip” (e.g., return
Option<(filename, lineno, module, registry)> or a Result with a distinct
variant) and update warn to early-return when setup_context returns None (or the
skip variant) so no warn_explicit is called during
interpreter-shutdown/no-globals cases; alternatively, if you prefer not to
change the signature, return a clear sentinel (like module
Some(vm.ctx.intern_str("__shutdown__")) or an empty registry flagged with a
special key) and add a check in warn to detect that sentinel and return
early—implement one of these approaches and remove or adjust the misleading
“skip warning” comments to match the new behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/stdlib/src/json/machinery.rs (1)
199-246: Fix error position for malformed second\uXXXXescape.
The surrogate-pair branch usesnext_char_ifrom the first escape, soDecodeError::posis off when the second escape is invalid. Capture the actual char index fromchars2.Proposed fix
- if let Some(((_, (byte_pos2, _)), (_, _))) = chars2 + if let Some(((char_pos2, (byte_pos2, _)), (_, _))) = chars2 .next_tuple() .filter(|((_, (_, c1)), (_, (_, c2)))| *c1 == '\\' && *c2 == 'u') { - let uni2 = - decode_unicode(&mut chars2, char_offset + next_char_i + 1)?; + let uni2 = decode_unicode(&mut chars2, char_offset + char_pos2)?;
🤖 Fix all issues with AI agents
In @.claude/commands/upgrade-pylib.md:
- Around line 23-27: The fenced code block containing the three commands
(/upgrade-pylib inspect, /upgrade-pylib json, /upgrade-pylib asyncio) is missing
a language specifier; update that fenced block to include a language tag (e.g.,
"text") after the opening backticks so the block reads as a language-specified
code fence to enable proper syntax highlighting and rendering.
In `@crates/codegen/src/ir.rs`:
- Around line 259-266: Replace the unimplemented!() call for the unreachable
placeholder variants PseudoInstruction::JumpNoInterrupt, Reserved258,
SetupCleanup, SetupFinally, SetupWith, and StoreFastMaybeNull with a panic (or
other hard failure) that clearly states the instruction is unreachable and
includes the instruction debug info (e.g., panic!("Unreachable: {instr:?} should
never be emitted")). Update the match arm that currently contains
unimplemented!("Got a placeholder pseudo instruction ({instr:?})") so it uses
the new panic message to improve diagnostics when bytecode is corrupted or
manually constructed.
In `@crates/common/src/refcount.rs`:
- Around line 262-268: The header comment above the RefCount struct incorrectly
documents a 4-bit epoch while EPOCH_WIDTH is 0; update the comment (or make it
conditional on EPOCH_WIDTH) to reflect the real bit layout used by RefCount:
when EPOCH_WIDTH == 0 the layout should be "[1 bit: destructed] [1 bit: weaked]
[1 bit: leaked] [30 bits: weak_count] [31 bits: strong_count]" (or document both
variants and reference the EPOCH_WIDTH constant), so change the comment text
near the RefCount declaration and mention EPOCH_WIDTH so readers see the actual
mapping.
- Around line 30-42: The LEAKED flag overlaps the WEAK count because
TOTAL_COUNT_WIDTH subtracts only 2 reserved bits but there are 3 flag bits
(DESTRUCTED, WEAKED, LEAKED); update TOTAL_COUNT_WIDTH to subtract 3 (i.e.
TOTAL_COUNT_WIDTH = u64::BITS - EPOCH_WIDTH - 3) so
WEAK_WIDTH/STRONG_WIDTH/STRONG/WEAK/WEAK_COUNT/COUNT calculations shift
accordingly and the LEAKED bit (1 << (EPOCH_MASK_HEIGHT - 3)) no longer overlaps
the weak count; verify constants DESTRUCTED, WEAKED, LEAKED, WEAK, STRONG,
WEAK_COUNT remain consistent after this change.
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 853-858: The match in function label_arg does not account for the
JumpNoInterrupt variant, so its target Arg<Label> is never returned; update
label_arg (in crate's instruction.rs) to match both Self::Jump { target: l } and
Self::JumpNoInterrupt { target: l } and return Some(*l) for both cases (keeping
the existing _ => None fallback) so label resolution handles JumpNoInterrupt
correctly.
In `@crates/derive-impl/src/pyclass.rs`:
- Around line 631-639: The impl of ::rustpython_vm::object::MaybeTraverse for
`#ident` uses the wrong type for the tracer: update the try_traverse signature to
take a lifetime on TraverseFn (match the trait) by changing &mut
::rustpython_vm::object::TraverseFn to &mut
::rustpython_vm::object::TraverseFn<'_>; modify the function header for fn
try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {
... } in the impl for `#ident` so it matches the trait definition and compiles.
In `@crates/stdlib/src/json.rs`:
- Around line 229-236: The JSON parser currently calls Wtf8Buf::to_string() and
uses a HashMap<String, PyStrRef> for memoization which loses unpaired
surrogates; change parse_object (and the other similar sites) to preserve WTF-8
by constructing keys/values from the Wtf8Buf directly (or use PyStrRef that
preserves WTF-8) instead of calling Wtf8Buf::to_string(), and change the memo
map type to use the WTF-8 preserving key (e.g., HashMap<Wtf8Buf, PyStrRef> or
HashMap<PyStrRefKey, PyStrRef>) so keys like "\ud800" remain distinct and are
not corrupted.
In `@crates/stdlib/src/opcode.rs`:
- Around line 83-99: The has_name() matcher in opcode.rs currently treats some
PseudoInstruction variants as name-bearing but misses the super pseudo-opcodes;
update the pattern in has_name (the matches! over
Self::try_from(opcode).map(|op| op.inner())) to also consider
AnyInstruction::Pseudo(PseudoInstruction::LoadSuperMethod { .. }),
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperMethod { .. }), and
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperAttr { .. }) so those
pseudo-opcodes return true for has_name().
In `@crates/vm/src/gc_state.rs`:
- Around line 690-711: promote_survivors currently moves pointers between
generation_objects but never updates the corresponding GcGeneration.count,
causing get_count()/threshold logic to be wrong; modify promote_survivors so
that when you successfully remove ptr from the source generation (inside the
block that has mut gen_set) you also decrement
self.generation_objects[from_gen]’s GcGeneration.count and when you insert into
next_set (inside the block that has mut next_set) you increment
self.generation_objects[next_gen]’s GcGeneration.count (use the same write-locks
already acquired or the atomic/field update mechanism used by GcGeneration.count
to keep updates thread-safe), ensuring counts stay in sync with the sets
referenced by get_count() and threshold logic.
- Around line 719-732: freeze and unfreeze currently only adjust the numeric
counters (permanent.count and generations[*].count) leaving the actual object
pointers in generation_objects so frozen objects can still be collected and
get_objects(g) remains inconsistent; modify freeze and unfreeze to move object
handles between storage sets: maintain a permanent object container (e.g.
permanent.objects or permanent_set), on freeze iterate generations (or use
generation_objects) to remove each object's pointer from its generation_objects
entry and insert it into permanent.objects while adjusting counts atomically,
and on unfreeze move pointers from permanent.objects back into
generations[2].generation_objects and update counts; also update get_objects to
consult permanent.objects when g == Permanent (or to exclude permanent objects
from normal collection paths) and ensure any collector loops skip
permanent.objects.
♻️ Duplicate comments (5)
crates/common/src/refcount.rs (1)
170-260:StateandPyStateare identical - duplication concern.This was flagged in a previous review. These two types have identical implementations. Consider consolidating with
pub type PyState = State;or documenting why separate types are necessary.crates/derive-impl/src/pyclass.rs (1)
590-603: Propagate errors fromtraverseattribute parsing.
The currentResultis silently ignored, so malformed#[pyclass(traverse(...))]can compile unexpectedly. This is the same issue noted previously.🛠️ Suggested fix
- let value = class_meta.inner()._optional_str("traverse"); - if let Ok(Some(s)) = value { + let value = class_meta.inner()._optional_str("traverse")?; + if let Some(s) = value { if !ALLOWED_TRAVERSE_OPTS.contains(&s.as_str()) { bail_span!( item, "traverse attribute only accept {ALLOWED_TRAVERSE_OPTS:?} as value or no value at all", ); } assert_eq!(s, "manual"); quote! {} } else { quote! {#[derive(Traverse)]} }crates/vm/src/object/core.rs (1)
1094-1106: Aliasing concern with&selfto*mutcast remains.This method casts
&selfto*mut PyObjectwhich can violate Rust's aliasing rules even with the safety comment. The previous review flagged this issue.crates/stdlib/src/gc.rs (1)
191-198:get_referrersstub not implemented - previously flagged.This function always returns an empty list despite being documented to "return the list of objects that directly refer to any of the arguments."
crates/vm/src/gc_state.rs (1)
247-263: Track the correct generation when untracking.
untrack_objectalways decrements gen0, which will corrupt counts for objects in gen1/gen2. Please decrement the actual generation where the object was found.🔧 Proposed fix (per-generation decrement)
pub unsafe fn untrack_object(&self, obj: NonNull<PyObject>) { let gc_ptr = GcObjectPtr(obj); + let mut found_gen: Option<usize> = None; // Remove from all generation tracking lists - for generation in &self.generation_objects { + for (gen_idx, generation) in self.generation_objects.iter().enumerate() { if let Ok(mut gen_set) = generation.write() { if gen_set.remove(&gc_ptr) { + found_gen = Some(gen_idx); break; // Object can only be in one generation } } } - // Update counts (simplified - just decrement gen0 for now) - let count = self.generations[0].count.load(Ordering::SeqCst); - if count > 0 { - self.generations[0].count.fetch_sub(1, Ordering::SeqCst); + // Decrement the correct generation's count + if let Some(gen_idx) = found_gen { + self.generations[gen_idx].count.fetch_sub(1, Ordering::SeqCst); }
🧹 Nitpick comments (8)
.claude/commands/upgrade-pylib.md (2)
8-20: Consider adding notes about validation and error handling.The workflow is clear, but consider documenting:
- How to validate that
$ARGUMENTSis a valid library name before deletion- What to do if the library doesn't exist in
cpython/Lib/- Recommended practice of working in a separate branch before making destructive changes
- Brief explanation of what
--quick-upgradedoesThese additions would make the documentation more robust and helpful for users unfamiliar with the process.
29-32: Helpful contextual notes.The notes provide useful context about the upgrade process. The suggested test command is appropriate.
Optionally, you could mention that during iterative development, omitting
--releasemay provide faster compile times at the cost of slower test execution.crates/common/src/refcount.rs (2)
419-440: Consider panic safety for deferred context.If
f()panics,in_ctx.set(was_in_context)andflush_deferred_drops()are never called. While thread unwinding often terminates the thread, if the panic is caught upstream, the deferred context flag remains set and queued operations are lost.♻️ Suggested RAII guard approach
struct DeferredContextGuard<'a> { cell: &'a Cell<bool>, was_in_context: bool, } impl Drop for DeferredContextGuard<'_> { fn drop(&mut self) { self.cell.set(self.was_in_context); if !self.was_in_context { flush_deferred_drops(); } } } pub fn with_deferred_drops<F, R>(f: F) -> R where F: FnOnce() -> R, { IN_DEFERRED_CONTEXT.with(|in_ctx| { let _guard = DeferredContextGuard { cell: in_ctx, was_in_context: in_ctx.replace(true), }; f() }) }
463-475: Potential issue: panic in deferred op leaves remaining ops unexecuted.If any
op()panics, the remaining operations in the queue will not execute. Consider whether this is acceptable for your use case, or ifstd::panic::catch_unwindshould be used to ensure all operations attempt execution.crates/stdlib/src/gc.rs (1)
259-267: Silent error handling ininvoke_callbacksmay hide issues.The
set_itemcalls (lines 261-263) and callback invocation (line 266) uselet _ =to ignore errors. While this prevents GC callbacks from crashing the collection process, errors in user-provided callbacks will be silently swallowed.Consider logging or reporting these errors via
vm.run_unraisable()similar to how__del__errors are handled, so users are aware their callbacks are failing.♻️ Suggested improvement
for callback in callbacks { - let _ = callback.call((phase_str.clone(), info.clone()), vm); + if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) { + vm.run_unraisable(e, None, callback.clone()); + } }crates/vm/src/object/core.rs (1)
873-898: Resurrection detection logic is correct - consider adding tests.The logic correctly handles both normal destruction and resurrection:
- Increment ref_count to protect del from concurrent drops (1→2)
- Run del and capture ref_count before and after
- Decrement once to undo the increment
- If ref_count increased during del (resurrection detected), skip the second decrement and leave the object alive
- If no resurrection, decrement again to reach 0
No dedicated unit tests for resurrection scenarios exist in the Rust codebase. Consider adding tests covering:
- Normal finalization without del side effects
- Resurrection (where del creates new references)
- Error handling during del
Minor: The comment on lines 15–16 stating "inc() from 0 does a double increment (0→2)" is misleading in this context, since inc() is called when ref_count=1. Clarify the comment to explain the thread-safety rationale for the increment/decrement pattern.
crates/compiler-core/src/bytecode/instruction.rs (2)
282-284: Track the pending migration ofLoadClosuretoPseudoInstruction.The TODO comment indicates this instruction should be moved to the pseudo-instruction enum. Consider opening an issue to track this migration to avoid it being forgotten.
Would you like me to open a GitHub issue to track this migration task?
881-892:fmt_dispanics at runtime for pseudo-instructions.Using
unimplemented!()will cause a panic if pseudo-instructions are ever formatted for display, which could happen during debugging or error reporting. Consider providing a basic implementation.Proposed implementation
fn fmt_dis( &self, - _arg: OpArg, - _f: &mut fmt::Formatter<'_>, - _ctx: &impl InstrDisplayContext, + arg: OpArg, + f: &mut fmt::Formatter<'_>, + ctx: &impl InstrDisplayContext, _expand_code_objects: bool, - _pad: usize, + pad: usize, _level: usize, ) -> fmt::Result { - unimplemented!() + match self { + Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)), + Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)), + Self::LoadAttrMethod { idx } => { + let name = ctx.get_name(idx.get(arg) as usize); + write!(f, "{:pad$}({}, {})", "LOAD_ATTR_METHOD", idx.get(arg), name) + } + Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)), + Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)), + Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)), + Self::PopBlock => write!(f, "POP_BLOCK"), + Self::SetupCleanup => write!(f, "SETUP_CLEANUP"), + Self::SetupFinally => write!(f, "SETUP_FINALLY"), + Self::SetupWith => write!(f, "SETUP_WITH"), + Self::StoreFastMaybeNull => write!(f, "STORE_FAST_MAYBE_NULL"), + Self::Reserved258 => write!(f, "RESERVED_258"), + } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (61)
Cargo.lockis excluded by!**/*.lockLib/_opcode_metadata.pyis excluded by!Lib/**Lib/asyncio/__main__.pyis excluded by!Lib/**Lib/asyncio/base_events.pyis excluded by!Lib/**Lib/asyncio/base_subprocess.pyis excluded by!Lib/**Lib/asyncio/events.pyis excluded by!Lib/**Lib/asyncio/format_helpers.pyis excluded by!Lib/**Lib/asyncio/futures.pyis excluded by!Lib/**Lib/asyncio/locks.pyis excluded by!Lib/**Lib/asyncio/proactor_events.pyis excluded by!Lib/**Lib/asyncio/queues.pyis excluded by!Lib/**Lib/asyncio/runners.pyis excluded by!Lib/**Lib/asyncio/selector_events.pyis excluded by!Lib/**Lib/asyncio/sslproto.pyis excluded by!Lib/**Lib/asyncio/staggered.pyis excluded by!Lib/**Lib/asyncio/streams.pyis excluded by!Lib/**Lib/asyncio/taskgroups.pyis excluded by!Lib/**Lib/asyncio/tasks.pyis excluded by!Lib/**Lib/asyncio/timeouts.pyis excluded by!Lib/**Lib/asyncio/transports.pyis excluded by!Lib/**Lib/asyncio/unix_events.pyis excluded by!Lib/**Lib/asyncio/windows_events.pyis excluded by!Lib/**Lib/test/support/__init__.pyis excluded by!Lib/**Lib/test/test__opcode.pyis excluded by!Lib/**Lib/test/test_ast/test_ast.pyis excluded by!Lib/**Lib/test/test_bdb.pyis excluded by!Lib/**Lib/test/test_builtin.pyis excluded by!Lib/**Lib/test/test_bytes.pyis excluded by!Lib/**Lib/test/test_class.pyis excluded by!Lib/**Lib/test/test_compile.pyis excluded by!Lib/**Lib/test/test_configparser.pyis excluded by!Lib/**Lib/test/test_context.pyis excluded by!Lib/**Lib/test/test_descr.pyis excluded by!Lib/**Lib/test/test_dict.pyis excluded by!Lib/**Lib/test/test_docxmlrpc.pyis excluded by!Lib/**Lib/test/test_enum.pyis excluded by!Lib/**Lib/test/test_future_stmt/test_future.pyis excluded by!Lib/**Lib/test/test_gc.pyis excluded by!Lib/**Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_http_cookiejar.pyis excluded by!Lib/**Lib/test/test_importlib/test_threaded_import.pyis excluded by!Lib/**Lib/test/test_io.pyis excluded by!Lib/**Lib/test/test_itertools.pyis excluded by!Lib/**Lib/test/test_json/test_decode.pyis excluded by!Lib/**Lib/test/test_logging.pyis excluded by!Lib/**Lib/test/test_marshal.pyis excluded by!Lib/**Lib/test/test_multiprocessing_fork/test_threads.pyis excluded by!Lib/**Lib/test/test_multiprocessing_spawn/test_threads.pyis excluded by!Lib/**Lib/test/test_signal.pyis excluded by!Lib/**Lib/test/test_socket.pyis excluded by!Lib/**Lib/test/test_sort.pyis excluded by!Lib/**Lib/test/test_sqlite3/test_dbapi.pyis excluded by!Lib/**Lib/test/test_str.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_tarfile.pyis excluded by!Lib/**Lib/test/test_thread.pyis excluded by!Lib/**Lib/test/test_threading.pyis excluded by!Lib/**Lib/test/test_time.pyis excluded by!Lib/**Lib/test/test_timeout.pyis excluded by!Lib/**Lib/test/test_unittest/test_async_case.pyis excluded by!Lib/**Lib/test/test_weakref.pyis excluded by!Lib/**
📒 Files selected for processing (30)
.claude/commands/upgrade-pylib.md.github/copilot-instructions.md.github/workflows/ci.yamlCargo.tomlcrates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/common/Cargo.tomlcrates/common/src/lib.rscrates/common/src/refcount.rscrates/compiler-core/src/bytecode.rscrates/compiler-core/src/bytecode/instruction.rscrates/compiler-core/src/bytecode/oparg.rscrates/derive-impl/src/pyclass.rscrates/derive-impl/src/pystructseq.rscrates/derive-impl/src/util.rscrates/jit/src/instructions.rscrates/stdlib/src/gc.rscrates/stdlib/src/json.rscrates/stdlib/src/json/machinery.rscrates/stdlib/src/opcode.rscrates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/str.rscrates/vm/src/builtins/tuple.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rscrates/vm/src/gc_state.rscrates/vm/src/lib.rscrates/vm/src/object/core.rs
💤 Files with no reviewable changes (1)
- .github/workflows/ci.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- crates/common/src/lib.rs
- Cargo.toml
- crates/common/Cargo.toml
- crates/vm/src/builtins/str.rs
- crates/vm/src/builtins/tuple.rs
🧰 Additional context used
🧬 Code graph analysis (12)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/object/core.rs (2)
traverse(161-163)traverse(169-171)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/object/traverse.rs (2)
pop_edges(39-39)try_traverse(19-19)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
gc_state(742-744)
crates/stdlib/src/json/machinery.rs (1)
crates/stdlib/src/json.rs (1)
scanstring(702-732)
crates/vm/src/builtins/list.rs (5)
crates/vm/src/builtins/dict.rs (2)
traverse(48-50)pop_edges(52-58)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/builtins/set.rs (1)
traverse(176-179)crates/vm/src/builtins/mappingproxy.rs (1)
traverse(30-35)crates/vm/src/object/traverse.rs (1)
pop_edges(39-39)
crates/codegen/src/ir.rs (2)
crates/compiler-core/src/bytecode.rs (1)
encode_exception_table(63-75)crates/compiler-core/src/bytecode/instruction.rs (3)
encode_load_attr_arg(1108-1110)encode_load_super_attr_arg(1122-1124)marker(1056-1058)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/builtins/str.rs (2)
try_traverse(1930-1932)try_pop_edges(1934-1936)crates/vm/src/object/traverse.rs (2)
try_traverse(19-19)try_pop_edges(21-21)
crates/vm/src/gc_state.rs (2)
crates/common/src/refcount.rs (4)
new(143-145)new(278-283)with_deferred_drops(423-440)state(366-368)crates/stdlib/src/gc.rs (9)
enable(26-28)disable(32-34)get_debug(126-128)set_debug(132-134)get_threshold(90-99)get_count(113-122)get_stats(138-155)get_objects(165-176)collect(50-86)
crates/vm/src/frame.rs (2)
crates/compiler-core/src/bytecode/instruction.rs (1)
get(1074-1076)crates/compiler-core/src/bytecode/oparg.rs (1)
get(81-87)
crates/stdlib/src/json.rs (1)
crates/stdlib/src/json/machinery.rs (3)
scanstring(140-252)new(112-115)len(123-128)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (17)
gc_state(742-744)enable(164-166)disable(169-171)collect(368-370)get_threshold(184-190)set_threshold(64-66)set_threshold(193-201)get_count(204-210)get_debug(174-176)set_debug(179-181)get_stats(213-219)stats(68-75)get_objects(297-338)new(44-54)new(134-156)freeze(720-726)get_freeze_count(715-717)
crates/vm/src/object/core.rs (3)
crates/vm/src/object/traverse_object.rs (1)
try_pop_edges_obj(39-39)crates/common/src/refcount.rs (3)
try_defer_drop(448-461)new(143-145)new(278-283)crates/vm/src/gc_state.rs (3)
gc_state(742-744)new(44-54)new(134-156)
crates/compiler-core/src/bytecode.rs (1)
crates/compiler-core/src/bytecode/instruction.rs (4)
decode_load_attr_arg(1114-1118)decode_load_super_attr_arg(1128-1133)encode_load_attr_arg(1108-1110)encode_load_super_attr_arg(1122-1124)
🪛 markdownlint-cli2 (0.18.1)
.claude/commands/upgrade-pylib.md
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check Rust code with clippy
- GitHub Check: Collect benchmark data
- GitHub Check: Collect what is left data
- GitHub Check: Collect regression test data
- GitHub Check: Collect code coverage data
🔇 Additional comments (55)
.claude/commands/upgrade-pylib.md (1)
1-7: LGTM! Clear documentation structure.The header and arguments section clearly defines the command's purpose and expected input format.
.github/copilot-instructions.md (2)
90-106: Clear testing guidance improvements.The distinction between using debug mode for snippet tests (faster compilation) and release mode for unittest suites (better runtime performance) is helpful and aligns with typical Rust development workflows. The consolidated note at lines 105-106 reinforces this guidance effectively.
192-197: Unable to complete verification due to repository access limitations. Manual verification is required to:
- Confirm DEVELOPMENT.md contains the referenced "CPython Version Upgrade Checklist" section with venvlauncher build instructions
- Identify the 4 venvlauncher binary names and their current build/copy approach in Cargo.toml
- Assess whether more specific build instructions and examples should be added to
.github/copilot-instructions.mdcrates/jit/src/instructions.rs (1)
563-569: LGTM - The JIT correctly handles only resolved directional jump instructions.The JIT only handles
JumpBackward,JumpBackwardNoInterrupt, andJumpForwardbecause these are the real instructions that reach JIT compilation. The pseudo-instructionPseudoInstruction::Jump(which is compiler-only, distinct from theInstructionenum) is resolved to eitherJumpForwardorJumpBackwardduring IR lowering based on whether the target offset is forward or backward relative to the current position. Any pseudo-instructions would never reach the JIT—they are lowered to real instructions before bytecode generation.crates/common/src/refcount.rs (8)
1-16: LGTM!Module documentation and re-exports are clear. The aliasing of
pinascsmaintains API compatibility while leveraging crossbeam-epoch.
44-134: State implementation is correct, pending constant fix.The accessor and mutator methods correctly implement bit manipulation for the packed state. Once the
TOTAL_COUNT_WIDTHconstant is fixed, this will work correctly.
136-168: LGTM!The modular arithmetic helper correctly handles epoch wraparound comparisons. The transformation to/from modular space ensures proper ordering even when values wrap around.
291-304: Confusing increment logic when strong count is zero.The logic at lines 300-303 is unclear: if
old.strong() == 0, an additionalCOUNTis added, resulting in a final count of 2 instead of 1. The comment mentions "created a permission to run decrement again" but doesn't explain the rationale.Additionally, the
fetch_addmodifies state before checkingdestructed(), thoughabort()prevents further damage.Consider using a CAS loop (like
safe_inc) for consistency, or adding documentation explaining this specific recovery pattern.
306-326: LGTM!
safe_inccorrectly uses a CAS loop to atomically check the destructed flag before incrementing. This is the preferred pattern for thread-safe reference counting.
328-362: LGTM!The
dec,leak, andis_leakedmethods are correctly implemented:
decproperly returns true when strong count reaches zero, with LEAKED objects always returning false.leakuses a CAS loop for atomic flag setting.is_leakeduses appropriateAcquireordering for reads.
364-399: LGTM!The accessors and
mark_destructedare well-implemented:
mark_destructedcorrectly uses a CAS loop and validates that strong count is 0 and object is not leaked before marking.- Exposing the raw
AtomicU64enables advanced operations when needed.
477-495: LGTM!The
defer_destructionwrapper correctly delegates to crossbeam-epoch'sdefer_uncheckedwith appropriate safety documentation. The 3-epoch guarantee explanation is helpful for understanding the EBR semantics.crates/derive-impl/src/util.rs (1)
366-376: Allowingpop_edgesin#[pyclass]meta looks good.
This keeps the macro validation aligned with the new traversal/edge‑popping hooks.crates/derive-impl/src/pystructseq.rs (1)
608-618: ConfirmHAS_POP_EDGES = falseis intentional for struct sequences.
Since these wrapPyTuple(which now supports edge‑popping), please verify that skippingpop_edgeswon’t leave cycles unbroken in GC paths. If you intended delegation, consider:🔧 Optional delegation to inner PyTuple
impl ::rustpython_vm::object::MaybeTraverse for `#pytype_ident` { const IS_TRACE: bool = true; - const HAS_POP_EDGES: bool = false; + const HAS_POP_EDGES: bool = true; fn try_traverse(&self, traverse_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) { self.0.try_traverse(traverse_fn) } - fn try_pop_edges(&mut self, _out: &mut ::std::vec::Vec<::rustpython_vm::PyObjectRef>) { - // Struct sequences don't need pop_edges + fn try_pop_edges(&mut self, out: &mut ::std::vec::Vec<::rustpython_vm::PyObjectRef>) { + ::rustpython_vm::object::Traverse::pop_edges(&mut self.0, out); } }crates/derive-impl/src/pyclass.rs (1)
696-701: Emitting#maybe_traverse_codealongside the class item looks good.
Ensures the new traversal behavior is consistently generated with the class definition.crates/vm/src/lib.rs (1)
77-81: Publicly exportinggc_statelooks good.
Clear API surface for the new GC state module.crates/vm/src/dict_inner.rs (1)
728-737:pop_all_entriesefficiently clears state and drains entries.
Good fit for GC edge‑popping, and the&mut selfrequirement keeps locking simple.crates/vm/src/builtins/list.rs (1)
60-74: LGTM - Traverse implementation is consistent with the GC framework.The implementation correctly delegates
traverseto the inner elements and usestry_write()forpop_edgesto handle thePyRwLock. Thetry_write()approach is appropriate here since during GC the object should be unreachable, and if the lock is somehow held, silently skipping is safer than blocking.One minor observation: Unlike
PyTuplewhich usesstd::mem::take(&mut self.elements),PyListmust usetry_write()due to interior mutability. This is the correct approach given the struct design.crates/vm/src/builtins/function.rs (2)
46-57: LGTM - Extended traversal coverage for PyFunction.The traverse implementation now properly visits additional fields (
type_params,annotations,module,doc) that may contain references. This ensures the GC can properly track all reachable objects from a function.
82-92: No changes needed—Context::genesis() is already optimized.The implementation correctly uses a static cached context via the
get_or_initpattern, soContext::genesis()returns a reference to a static value after initialization. This is a cheap operation and the same pattern is used throughout the codebase (interpreter initialization, exception_group, etc.). Additionally,empty_stris already a&'static PyStrInterned, so there's no allocation at runtime. The.to_owned()call simply increments the refcount, which is appropriate during garbage collection. The code matches CPython'sfunc_clearbehavior and performs well.crates/vm/src/builtins/dict.rs (2)
46-59: LGTM - Dict traversal follows established patterns.The
Traverseimplementation correctly delegates to the inner entries and collects both keys and values during edge popping. This ensures all references held by the dictionary are properly handled during GC.
52-58: Thepop_all_entriesmethod is correctly implemented and accessible.The method exists on
dict_inner::Dict(aliasDictContentType) at line 731 ofcrates/vm/src/dict_inner.rswith the signaturepub fn pop_all_entries(&mut self) -> impl Iterator<Item = (PyObjectRef, T)> + '_. Thepop_edgesimplementation correctly calls this method and consumes the iterator, pushing both keys and values to the output vector.crates/vm/src/object/core.rs (3)
271-330: Good fix:already_clearedflag prevents double ref_count decrement.The separation of
clear()andclear_for_gc()with thealready_clearedcheck at line 281 prevents a subtle bug where both GC clearing and normal drop could decrementref_count. This is a solid defensive fix.
944-976: Correct ordering: untrack → pop_edges → dealloc → drop children.The destruction sequence properly:
- Untracking via
try_defer_drop(handles GC lock contention)- Extracts child references before deallocation
- Deallocates the object
- Drops children (may trigger recursive destruction)
This breaks cycles correctly and avoids use-after-free.
1325-1344: Object tracking triggers automatic GC collection.Objects with
IS_TRACEor instance dict are tracked, andmaybe_collect()is called after tracking. This enables automatic generational collection. The implementation is lightweight when thresholds aren't met—it performs early returns for disabled GC and only executes atomic loads and comparisons in the common path where collection isn't triggered.crates/stdlib/src/gc.rs (2)
49-86: LGTM - Well-structured collect implementation with proper validation and callbacks.The
collectfunction correctly:
- Validates generation bounds (0-2)
- Invokes callbacks with "start" and "stop" phases
- Uses
collect_forceto run collection even when GC is disabled- Migrates garbage objects to the Python-visible list
136-155: LGTM - Stats exposure follows CPython convention.The
get_statsfunction returns a list of dicts withcollections,collected, anduncollectablekeys per generation, matching CPython's API.crates/stdlib/src/json/machinery.rs (4)
112-115: PublicDecodeError::newlooks fine.
No concerns.
130-196:scanstringAPI/doc and fast/slow path split look solid.
Thechar_offsetcontract andbytes_consumedreturn are clear, and the early-return path keeps counts consistent.
259-266:decode_unicodeaccumulator change is OK.
Au16accumulator fits\uXXXXand keeps the bit assembly straightforward.
33-33:memchris properly declared in the stdlib crate.The dependency is present in
crates/stdlib/Cargo.toml(line 51) asmemchr = { workspace = true }, pinned to version 2.7.4 in the root workspace manifest, and resolved to 2.7.6 in the lockfile. The import at line 33 is valid.crates/compiler-core/src/bytecode/oparg.rs (1)
81-85: LGTM — clearer ExtendedArg guard.The
matches!form preserves semantics while making the intent explicit.crates/stdlib/src/json.rs (7)
20-40: Helpers are clear and JSON‑spec compliant.ASCII whitespace and prefix checks are straightforward and efficient.
92-188: Byte/char index tracking looks consistent.The dual-index approach and early StopIteration handling are coherent.
388-503: Array/object assembly and hook finalization look solid.Whitespace handling, delimiter checks, and hook priority match expected semantics.
505-526: Token dispatch incall_scan_onceis comprehensive.The explicit branches for literals/numbers/objects/arrays are clear.
Also applies to: 538-618
619-629: No issues in centralized decode error construction.
632-663: Entry-point byte index derivation is clean and contained.The O(n) char→byte conversion is limited to the public entry.
701-732:scanstringbyte-index conversion is correct and localized.The conversion keeps error positioning consistent with the new byte/char split.
crates/codegen/src/ir.rs (3)
88-96: Good expansion of InstructionInfo metadata.
end_locationandexcept_handleralign with the later linetable/exception-table generation.
199-255: Pseudo‑to‑real lowering for attribute loads is clean.Encoding once here keeps the emission path uniform.
305-322: Jump direction selection reads correctly.Using target offset vs. current offset is the right split for forward/backward encoding.
crates/compiler-core/src/bytecode.rs (1)
17-22: Re‑exports align with the new instruction model.Public surface now matches the refactor in instruction.rs.
crates/vm/src/frame.rs (1)
10-11: Instruction alias adoption looks consistent throughout execution paths.The updated dispatch and reraise detection are coherent with the new Instruction enum usage.
Also applies to: 390-401, 442-450, 552-1756
crates/codegen/src/compile.rs (3)
277-297:emit!macro expansion looks good.The struct/tuple/no‑arg variants are now covered cleanly and route through the generic emission helpers as intended.
6955-6983: Generic_emit/emit_argplumbing is solid.The
Into<AnyInstruction>flow is consistent and keeps instruction/arg/target wiring centralized.
7455-7477: EmitArg trait update is clean.The
AnyInstructionreturn type aligns well with the new emission path.crates/compiler-core/src/bytecode/instruction.rs (8)
286-354: LGTM!The conversion implementations correctly handle the non-contiguous opcode space with explicit validation. The use of
transmuteis safe given therepr(u8)guarantee and the size assertion.
356-389: LGTM!The
label_argandunconditional_branchimplementations correctly identify the branching instructions. The pattern matching is comprehensive for all jump-related variants.
860-862: Consider includingJumpNoInterruptinunconditional_branch.If
JumpNoInterruptrepresents an unconditional jump (without interrupt checks), it should likely returntruefromunconditional_branch. Otherwise, code flow analysis may be incorrect.Please verify the intended semantics of
JumpNoInterrupt- if it's an unconditional jump, update accordingly:fn unconditional_branch(&self) -> bool { - matches!(self, Self::Jump { .. },) + matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. }) }
834-850: LGTM!The
TryFrom<u16>implementation correctly validates the contiguous discriminant range and usestransmutesafely given therepr(u16)guarantee.
920-929: LGTM!The
TryFrom<u16>logic correctly routes values 0-255 toInstructionand values 256+ toPseudoInstruction, matching the discriminant design.
931-958: LGTM!The
inst_either!macro cleanly delegates trait methods to the inner variant, reducing boilerplate while maintaining type safety.
998-1049: LGTM!The
InstructionMetadatatrait is well-designed with clear documentation and useful examples. Thedisplayhelper using a closure wrapper is idiomatic.
1106-1133: LGTM!The encode/decode helpers are symmetric, well-documented, and correctly implement the bit-packing scheme for opargs.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| ``` | ||
| /upgrade-pylib inspect | ||
| /upgrade-pylib json | ||
| /upgrade-pylib asyncio | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add language specifier to the code block.
The code block is missing a language specifier, which is a markdown best practice for proper syntax highlighting and rendering.
📝 Proposed fix
-```
+```text
/upgrade-pylib inspect
/upgrade-pylib json
/upgrade-pylib asyncioBased on static analysis hints.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ``` | |
| /upgrade-pylib inspect | |
| /upgrade-pylib json | |
| /upgrade-pylib asyncio | |
| ``` |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In @.claude/commands/upgrade-pylib.md around lines 23 - 27, The fenced code
block containing the three commands (/upgrade-pylib inspect, /upgrade-pylib
json, /upgrade-pylib asyncio) is missing a language specifier; update that
fenced block to include a language tag (e.g., "text") after the opening
backticks so the block reads as a language-specified code fence to enable proper
syntax highlighting and rendering.
| PseudoInstruction::JumpNoInterrupt { .. } | ||
| | PseudoInstruction::Reserved258 | ||
| | PseudoInstruction::SetupCleanup | ||
| | PseudoInstruction::SetupFinally | ||
| | PseudoInstruction::SetupWith | ||
| | PseudoInstruction::StoreFastMaybeNull => { | ||
| unimplemented!("Got a placeholder pseudo instruction ({instr:?})") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's look at the file structure and context
cd crates/codegen/src && wc -l ir.rsRepository: youknowone/RustPython
Length of output: 74
🏁 Script executed:
#!/bin/bash
# Get the context around the mentioned lines
cd crates/codegen/src && sed -n '250,280p' ir.rsRepository: youknowone/RustPython
Length of output: 1810
🏁 Script executed:
#!/bin/bash
# Search for all references to these pseudo-instructions
rg -n "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type rsRepository: youknowone/RustPython
Length of output: 91
🏁 Script executed:
#!/bin/bash
# Look for lowering logic or where these pseudo-instructions are created/emitted
rg -n "PseudoInstruction::(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type rs -A 5 -B 5Repository: youknowone/RustPython
Length of output: 91
🏁 Script executed:
#!/bin/bash
# Search for all references to these pseudo-instructions (corrected command)
rg -n "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)" --type-list | head -5Repository: youknowone/RustPython
Length of output: 454
🏁 Script executed:
#!/bin/bash
# Try simpler rg command without type filter
rg -n "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" crates/Repository: youknowone/RustPython
Length of output: 1984
🏁 Script executed:
#!/bin/bash
# Look for PseudoInstruction enum definition to understand where these are defined
rg -n "enum PseudoInstruction" crates/ -A 50Repository: youknowone/RustPython
Length of output: 4253
🏁 Script executed:
#!/bin/bash
# Search for where these instructions might be emitted/created
rg -n "PseudoInstruction::" crates/codegen/ | head -30Repository: youknowone/RustPython
Length of output: 2856
🏁 Script executed:
#!/bin/bash
# Check if there's any lowering or validation that prevents these from being reached
rg -n "unimplemented" crates/codegen/src/ir.rsRepository: youknowone/RustPython
Length of output: 159
🏁 Script executed:
#!/bin/bash
# Check if any code tries to emit or create these placeholder instructions
rg -n "SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull|JumpNoInterrupt|Reserved258" crates/codegen/src/compile.rsRepository: youknowone/RustPython
Length of output: 208
🏁 Script executed:
#!/bin/bash
# Look at the full context of the match statement to understand the flow
cd crates/codegen/src && sed -n '200,280p' ir.rsRepository: youknowone/RustPython
Length of output: 4532
🏁 Script executed:
#!/bin/bash
# Search for where PseudoInstructions are created to see all emission sites
rg -n "PseudoInstruction::" crates/codegen/src/compile.rs | grep -E "(JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull)"Repository: youknowone/RustPython
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Check if there's validation or error handling elsewhere that prevents these from being emitted
rg -n "placeholder" crates/codegen/ -iRepository: youknowone/RustPython
Length of output: 184
🏁 Script executed:
#!/bin/bash
# Search across the entire codebase for any attempt to create these placeholder instructions
rg -n "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" --type-list 2>/dev/null || echo "Getting all file types"
rg "JumpNoInterrupt|Reserved258|SetupCleanup|SetupFinally|SetupWith|StoreFastMaybeNull" crates/ | grep -v "^crates/codegen/src/ir.rs:" | grep -v "^crates/codegen/src/compile.rs:" | grep -v "^crates/compiler-core/src/bytecode/instruction.rs:"Repository: youknowone/RustPython
Length of output: 87
🏁 Script executed:
#!/bin/bash
# Verify the comments about not emitting SetupFinally
rg -n "SetupFinally" crates/codegen/src/compile.rs -B 5 -A 5Repository: youknowone/RustPython
Length of output: 1286
🏁 Script executed:
#!/bin/bash
# Search for the emit! macro to see all pseudo instruction emissions in compile.rs
rg -n 'emit!\(.*PseudoInstruction::' crates/codegen/src/compile.rsRepository: youknowone/RustPython
Length of output: 1665
🏁 Script executed:
#!/bin/bash
# Check if there's any documentation or comments about these being unreachable
rg -n "unreachable|placeholder|Placeholder" crates/compiler-core/src/bytecode/instruction.rs -iRepository: youknowone/RustPython
Length of output: 1407
Improve error handling for unreachable placeholder pseudo-instructions.
These placeholder instructions (JumpNoInterrupt, Reserved258, SetupCleanup, SetupFinally, SetupWith, StoreFastMaybeNull) are never emitted by the compiler pipeline and are dead code. Rather than unimplemented!(), replace with a panic or error that clearly identifies what went wrong—e.g., panic!("Unreachable: {instr:?} should never be emitted"). This provides better diagnostics if bytecode is corrupted or manually constructed.
🤖 Prompt for AI Agents
In `@crates/codegen/src/ir.rs` around lines 259 - 266, Replace the
unimplemented!() call for the unreachable placeholder variants
PseudoInstruction::JumpNoInterrupt, Reserved258, SetupCleanup, SetupFinally,
SetupWith, and StoreFastMaybeNull with a panic (or other hard failure) that
clearly states the instruction is unreachable and includes the instruction debug
info (e.g., panic!("Unreachable: {instr:?} should never be emitted")). Update
the match arm that currently contains unimplemented!("Got a placeholder pseudo
instruction ({instr:?})") so it uses the new panic message to improve
diagnostics when bytecode is corrupted or manually constructed.
| fn label_arg(&self) -> Option<Arg<Label>> { | ||
| match self { | ||
| Self::Jump { target: l } => Some(*l), | ||
| _ => None, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JumpNoInterrupt label is not returned by label_arg.
JumpNoInterrupt has a target: Arg<Label> field but is not included in label_arg. This will cause label resolution issues if this instruction is used.
Proposed fix
fn label_arg(&self) -> Option<Arg<Label>> {
match self {
- Self::Jump { target: l } => Some(*l),
+ Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l),
_ => None,
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn label_arg(&self) -> Option<Arg<Label>> { | |
| match self { | |
| Self::Jump { target: l } => Some(*l), | |
| _ => None, | |
| } | |
| } | |
| fn label_arg(&self) -> Option<Arg<Label>> { | |
| match self { | |
| Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l), | |
| _ => None, | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 853 - 858, The
match in function label_arg does not account for the JumpNoInterrupt variant, so
its target Arg<Label> is never returned; update label_arg (in crate's
instruction.rs) to match both Self::Jump { target: l } and Self::JumpNoInterrupt
{ target: l } and return Some(*l) for both cases (keeping the existing _ => None
fallback) so label resolution handles JumpNoInterrupt correctly.
| fn parse_object( | ||
| &self, | ||
| pystr: PyStrRef, | ||
| start_char_idx: usize, | ||
| start_byte_idx: usize, | ||
| scan_once: &PyObjectRef, | ||
| memo: &mut HashMap<String, PyStrRef>, | ||
| vm: &VirtualMachine, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve WTF‑8 for JSON strings (lossy conversion breaks CPython compatibility).
Wtf8Buf::to_string() drops unpaired surrogates, so keys/values like "\ud800" are corrupted and key memoization can collapse distinct strings. Use a WTF‑8‑preserving constructor and memoize by Wtf8Buf (or PyStrRef) instead of String.
🔧 Suggested fix (preserve WTF‑8)
- memo: &mut HashMap<String, PyStrRef>,
+ memo: &mut HashMap<Wtf8Buf, PyStrRef>,
- let key_str = key_wtf8.to_string();
- let key: PyObjectRef = match memo.get(&key_str) {
+ let key: PyObjectRef = match memo.get(&key_wtf8) {
Some(cached) => cached.clone().into(),
None => {
- let py_key = vm.ctx.new_str(key_str.clone());
- memo.insert(key_str, py_key.clone());
+ let py_key = vm.ctx.new_str(key_wtf8.clone());
+ memo.insert(key_wtf8.clone(), py_key.clone());
py_key.into()
}
};
- let py_str = vm.ctx.new_str(wtf8_result.to_string());
+ let py_str = vm.ctx.new_str(wtf8_result);Also applies to: 283-291, 527-535
🤖 Prompt for AI Agents
In `@crates/stdlib/src/json.rs` around lines 229 - 236, The JSON parser currently
calls Wtf8Buf::to_string() and uses a HashMap<String, PyStrRef> for memoization
which loses unpaired surrogates; change parse_object (and the other similar
sites) to preserve WTF-8 by constructing keys/values from the Wtf8Buf directly
(or use PyStrRef that preserves WTF-8) instead of calling Wtf8Buf::to_string(),
and change the memo map type to use the WTF-8 preserving key (e.g.,
HashMap<Wtf8Buf, PyStrRef> or HashMap<PyStrRefKey, PyStrRef>) so keys like
"\ud800" remain distinct and are not corrupted.
| matches!( | ||
| Self::try_from(opcode).map(|op| op.inner()), | ||
| Ok(AnyInstruction::Real( | ||
| Instruction::DeleteAttr { .. } | ||
| | Instruction::DeleteGlobal(_) | ||
| | Instruction::DeleteName(_) | ||
| | Instruction::ImportFrom { .. } | ||
| | Instruction::ImportName { .. } | ||
| | Instruction::LoadAttr { .. } | ||
| | Instruction::LoadGlobal(_) | ||
| | Instruction::LoadAttrMethod { .. } | ||
| | Instruction::LoadName(_) | ||
| | Instruction::StoreAttr { .. } | ||
| | Instruction::StoreGlobal(_) | ||
| | Instruction::StoreName(_)) | ||
| ) | ||
| | Instruction::StoreName(_) | ||
| ) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. })) | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include super pseudo-opcodes in has_name.
compile.rs emits PseudoInstruction::LoadSuperMethod, PseudoInstruction::LoadZeroSuperMethod, and PseudoInstruction::LoadZeroSuperAttr, all of which carry name indices. opcode.has_name should return true for those to keep co_names handling/disassembly consistent.
🛠️ Suggested fix
- ) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. }))
+ )
+ | AnyInstruction::Pseudo(
+ PseudoInstruction::LoadAttrMethod { .. }
+ | PseudoInstruction::LoadSuperMethod { .. }
+ | PseudoInstruction::LoadZeroSuperMethod { .. }
+ | PseudoInstruction::LoadZeroSuperAttr { .. }
+ ))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| matches!( | |
| Self::try_from(opcode).map(|op| op.inner()), | |
| Ok(AnyInstruction::Real( | |
| Instruction::DeleteAttr { .. } | |
| | Instruction::DeleteGlobal(_) | |
| | Instruction::DeleteName(_) | |
| | Instruction::ImportFrom { .. } | |
| | Instruction::ImportName { .. } | |
| | Instruction::LoadAttr { .. } | |
| | Instruction::LoadGlobal(_) | |
| | Instruction::LoadAttrMethod { .. } | |
| | Instruction::LoadName(_) | |
| | Instruction::StoreAttr { .. } | |
| | Instruction::StoreGlobal(_) | |
| | Instruction::StoreName(_)) | |
| ) | |
| | Instruction::StoreName(_) | |
| ) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. })) | |
| ) | |
| } | |
| matches!( | |
| Self::try_from(opcode).map(|op| op.inner()), | |
| Ok(AnyInstruction::Real( | |
| Instruction::DeleteAttr { .. } | |
| | Instruction::DeleteGlobal(_) | |
| | Instruction::DeleteName(_) | |
| | Instruction::ImportFrom { .. } | |
| | Instruction::ImportName { .. } | |
| | Instruction::LoadAttr { .. } | |
| | Instruction::LoadGlobal(_) | |
| | Instruction::LoadName(_) | |
| | Instruction::StoreAttr { .. } | |
| | Instruction::StoreGlobal(_) | |
| | Instruction::StoreName(_) | |
| ) | |
| | AnyInstruction::Pseudo( | |
| PseudoInstruction::LoadAttrMethod { .. } | |
| | PseudoInstruction::LoadSuperMethod { .. } | |
| | PseudoInstruction::LoadZeroSuperMethod { .. } | |
| | PseudoInstruction::LoadZeroSuperAttr { .. } | |
| )) | |
| ) | |
| } |
🤖 Prompt for AI Agents
In `@crates/stdlib/src/opcode.rs` around lines 83 - 99, The has_name() matcher in
opcode.rs currently treats some PseudoInstruction variants as name-bearing but
misses the super pseudo-opcodes; update the pattern in has_name (the matches!
over Self::try_from(opcode).map(|op| op.inner())) to also consider
AnyInstruction::Pseudo(PseudoInstruction::LoadSuperMethod { .. }),
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperMethod { .. }), and
AnyInstruction::Pseudo(PseudoInstruction::LoadZeroSuperAttr { .. }) so those
pseudo-opcodes return true for has_name().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@Cargo.toml`:
- Around line 199-200: The Cargo.toml currently pins the atomic crate at "0.5",
which is outdated; update the dependency entry for atomic (the "atomic" key in
Cargo.toml alongside memoffset) to "0.6" (e.g., change atomic = "0.5" to atomic
= "0.6"), then run cargo update -p atomic to refresh Cargo.lock and verify the
build and tests; leave memoffset unchanged unless you also want to bump it
separately.
In `@crates/common/src/refcount.rs`:
- Around line 297-313: with_deferred_drops currently sets the thread-local
IN_DEFERRED_CONTEXT, calls f, and resets the flag only after f returns, so a
panic in f leaves the flag set and prevents flush_deferred_drops; fix by
introducing an RAII guard type (e.g., DeferredDropsGuard) inside
with_deferred_drops that captures the previous IN_DEFERRED_CONTEXT value on
construction and, in its Drop impl, restores IN_DEFERRED_CONTEXT and calls
flush_deferred_drops if this guard is the outermost context; create the guard
before invoking f so the Drop runs on unwind and ensures proper reset/flush
regardless of panic.
In `@crates/vm/src/gc_state.rs`:
- Around line 226-234: track_object currently increments generations[0].count
and alloc_count before acquiring the generation_objects[0] write lock, which
races with collect_inner calling generations[0].count.store(0, ...) and can lose
concurrent allocations; to fix, ensure gen0 count is updated/cleared under the
same lock by replacing every unconditional generations[0].count.store(0, ...) in
collect_inner with a call to self.reset_gen0_count_from_set() so the count is
derived from the protected generation_objects[0] set, and keep track_object
as-is but rely on reset_gen0_count_from_set to reconcile counts atomically with
the gen0 write lock (apply the same replacement at the other occurrences
flagged: the collect_inner spots around lines noted and any other places that
zero gen0 directly).
♻️ Duplicate comments (9)
crates/derive-impl/src/pyclass.rs (2)
590-606: Don’t swallow malformedtraverseattributes.
.ok().flatten()turns any parse error into “no value,” so invalid usages (e.g., non-string values) silently deriveTraverse. Please handle the “flag-only” case explicitly while still propagating real parse errors.
632-639:TraverseFnlifetime missing in generated impl.
MaybeTraverse::try_traverserequires&mut TraverseFn<'_>. The current signature won’t satisfy the trait and should fail to compile.🐛 Proposed fix
- fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) { + fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) { `#try_traverse_body` }crates/common/src/refcount.rs (2)
30-37: LEAKED bit overlaps weak count.
TOTAL_COUNT_WIDTHsubtracts only 2 bits, but there are 3 flag bits (DESTRUCTED, WEAKED, LEAKED). WithEPOCH_WIDTH = 0,LEAKED(bit 61) overlaps the weak-count range (bits 31–61). This corrupts counts.🐛 Proposed fix
-pub const TOTAL_COUNT_WIDTH: u32 = u64::BITS - EPOCH_WIDTH - 2; +pub const TOTAL_COUNT_WIDTH: u32 = u64::BITS - EPOCH_WIDTH - 3; // DESTRUCTED, WEAKED, LEAKED
136-139: State layout comment is stale.The comment assumes 4 epoch bits and 28/29 counts, but
EPOCH_WIDTHis 0 and the actual bit allocation differs. Please update the doc (or make it conditional onEPOCH_WIDTH).crates/stdlib/src/gc.rs (1)
191-198: Incomplete implementation:get_referrersalways returns an empty list.This function is documented to "Return the list of objects that directly refer to any of the arguments" but always returns an empty list. This was flagged in a past review.
Consider either implementing it by scanning tracked objects, or raising
NotImplementedErrorto make the limitation explicit rather than silently returning incorrect results.crates/vm/src/object/core.rs (1)
1172-1177: Potential undefined behavior: casting&selfto*mutviolates aliasing rules.
gc_pop_edgestakes&selfbut casts to*mut PyObject. Even with the safety comment about exclusive GC access, creating a mutable pointer from a shared reference can violate Rust's aliasing rules.This was flagged in a previous review. Consider using only the
gc_pop_edges_rawvariant with pointers obtained through safe means, or usingUnsafeCellfor fields that need mutation during GC.crates/vm/src/gc_state.rs (3)
247-265: Gen count decrement is incorrect for non‑gen0 objects.
untrack_objectalways decrements gen0, which desynchronizes counts for objects tracked in gen1/gen2.
704-723: Promotion doesn’t update generation counts.
Objects move between generation sets without adjustingGcGeneration.count, so thresholds and stats drift.
735-747: Freeze/unfreeze only changes counters; objects remain collectible.
Without moving pointers out ofgeneration_objects, frozen objects can still be collected andget_objectsstays inconsistent.
🧹 Nitpick comments (2)
crates/vm/src/vm/mod.rs (1)
1152-1219: Heuristic-based module classification may be fragile.The module finalization logic relies on:
- Hardcoded stdlib module names (may become stale as stdlib evolves)
- Path substring checks (
"pylib","Lib") that may not work across all deployment scenarios (e.g., custom installations, virtual environments, frozen executables)Consider:
- Using a more robust marker (e.g., a flag set during import from frozen/builtin sources)
- Documenting that the list needs maintenance when stdlib changes
- Adding
"itertools","operator","re"and other commonly-used stdlib modules to the skip list if they're needed during finalizationThat said, this matches CPython's approach of best-effort module cleanup during shutdown.
crates/stdlib/src/gc.rs (1)
244-267: Consider reporting callback errors as unraisable exceptions.Callback errors are silently ignored (
let _ = callback.call(...)). In CPython, errors in GC callbacks are typically reported as unraisable exceptions rather than being completely silenced.Consider using
vm.run_unraisable()to report callback errors:♻️ Suggested improvement
for callback in callbacks { - let _ = callback.call((phase_str.clone(), info.clone()), vm); + if let Err(e) = callback.call((phase_str.clone(), info.clone()), vm) { + vm.run_unraisable(e, Some("gc callback".to_owned()), callback); + } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (10)
Cargo.lockis excluded by!**/*.lockLib/test/support/__init__.pyis excluded by!Lib/**Lib/test/test_builtin.pyis excluded by!Lib/**Lib/test/test_concurrent_futures/test_process_pool.pyis excluded by!Lib/**Lib/test/test_dict.pyis excluded by!Lib/**Lib/test/test_gc.pyis excluded by!Lib/**Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_threading.pyis excluded by!Lib/**Lib/test/test_weakref.pyis excluded by!Lib/**
📒 Files selected for processing (27)
Cargo.tomlcrates/codegen/src/compile.rscrates/common/Cargo.tomlcrates/common/src/lib.rscrates/common/src/refcount.rscrates/derive-impl/src/pyclass.rscrates/derive-impl/src/pystructseq.rscrates/derive-impl/src/util.rscrates/stdlib/src/gc.rscrates/stdlib/src/multiprocessing.rscrates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/str.rscrates/vm/src/builtins/tuple.rscrates/vm/src/dict_inner.rscrates/vm/src/gc_state.rscrates/vm/src/lib.rscrates/vm/src/object/core.rscrates/vm/src/object/traverse.rscrates/vm/src/object/traverse_object.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/context.rscrates/vm/src/vm/interpreter.rscrates/vm/src/vm/mod.rscrates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (8)
- crates/vm/src/lib.rs
- crates/derive-impl/src/util.rs
- crates/derive-impl/src/pystructseq.rs
- crates/vm/src/object/traverse.rs
- crates/vm/src/builtins/str.rs
- crates/common/src/lib.rs
- crates/common/Cargo.toml
- crates/vm/src/vm/thread.rs
🧰 Additional context used
🧬 Code graph analysis (12)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
reactivate_guard(56-62)
crates/stdlib/src/multiprocessing.rs (1)
crates/vm/src/stdlib/os.rs (2)
__reduce__(715-717)__reduce__(775-777)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
debug_obj(87-93)drop_dealloc_obj(84-86)try_pop_edges_obj(128-131)try_trace_obj(96-100)
crates/vm/src/builtins/tuple.rs (2)
crates/vm/src/builtins/dict.rs (2)
traverse(48-50)pop_edges(52-58)crates/vm/src/object/traverse.rs (2)
traverse(34-34)pop_edges(39-39)
crates/vm/src/vm/context.rs (3)
crates/vm/src/builtins/list.rs (1)
new_ref(91-93)crates/vm/src/object/core.rs (2)
new_ref(1398-1415)default(420-422)crates/vm/src/gc_state.rs (1)
default(128-130)
crates/vm/src/vm/interpreter.rs (2)
crates/vm/src/stdlib/atexit.rs (1)
_run_exitfuncs(36-47)crates/vm/src/gc_state.rs (1)
gc_state(758-760)
crates/vm/src/builtins/dict.rs (2)
crates/vm/src/dict_inner.rs (2)
traverse(40-42)traverse(84-94)crates/vm/src/object/traverse.rs (2)
traverse(34-34)pop_edges(39-39)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/dict.rs (2)
traverse(48-50)pop_edges(52-58)crates/vm/src/builtins/list.rs (2)
traverse(62-64)pop_edges(66-73)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/object/traverse.rs (5)
traverse(34-34)traverse(43-45)traverse(49-51)traverse(55-55)pop_edges(39-39)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
ensure_pinned(42-48)drop_guard(67-71)
crates/vm/src/builtins/list.rs (5)
crates/vm/src/builtins/function.rs (2)
traverse(47-58)pop_edges(60-97)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/dict_inner.rs (1)
traverse(40-42)crates/vm/src/builtins/mappingproxy.rs (1)
traverse(30-35)crates/vm/src/object/traverse.rs (1)
pop_edges(39-39)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (16)
gc_state(758-760)enable(164-166)disable(169-171)collect(370-372)get_threshold(184-190)set_threshold(64-66)set_threshold(193-201)get_count(204-210)get_debug(174-176)set_debug(179-181)get_stats(213-219)stats(68-75)get_objects(299-340)freeze(736-742)unfreeze(745-748)get_freeze_count(731-733)
crates/codegen/src/compile.rs (2)
crates/compiler-core/src/bytecode/instruction.rs (2)
marker(1056-1058)new(1061-1063)crates/compiler-core/src/bytecode/oparg.rs (2)
null(19-21)null(42-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (78)
crates/vm/src/stdlib/thread.rs (3)
3-6: Re-export list looks fine.
661-760: Exception hook flow looks solid and robust.Finalization suppression and stderr routing are well handled, and the traceback output path is clear.
429-458: Make EBR guard drop panic-safe via scopeguard.The
drop_guard()call runs only on normal exit. Iffunc.invoke()or cleanup code panics, the guard won't be released until TLS teardown, which delays epoch advancement. Usingscopeguard::defer!ensures cleanup happens regardless of panic.♻️ Suggested change
fn run_thread(func: ArgCallable, args: FuncArgs, vm: &VirtualMachine) { // Enter EBR critical section for this thread (Coarse-grained pinning) // This ensures GC won't free objects while this thread might access them crate::vm::thread::ensure_pinned(); + scopeguard::defer! { + // Drop EBR guard even if we panic during thread execution/cleanup + crate::vm::thread::drop_guard(); + } match func.invoke(args, vm) { Ok(_obj) => {} Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {} Err(exc) => { @@ crate::vm::thread::cleanup_current_thread_frames(vm); vm.state.thread_count.fetch_sub(1); - // Drop EBR guard when thread exits, allowing epoch advancement - crate::vm::thread::drop_guard(); + // drop_guard handled by scopeguard }crates/vm/src/stdlib/sys.rs (1)
829-832: Frame cloning here looks safe and intentional.Using
frame.as_object().to_owned()preserves theFrameRefwhile storing a proper ownedPyObjectRefin the dict; semantics remain correct.crates/vm/src/object/traverse_object.rs (1)
14-43: pop_edges vtable wiring looks consistent.The conditional hook (
HAS_POP_EDGES→try_pop_edges_obj::<T>) mirrors the trace path and keeps the vtable cohesive.crates/vm/src/vm/context.rs (1)
55-58: GC state fields wired into Context cleanly.The new
gc_callbacks/gc_garbagelists are initialized alongside other core objects, which matches the GC module’s needs.crates/stdlib/src/multiprocessing.rs (1)
550-555: LGTM - Consistent with existing unpicklable type patterns.The
__reduce__implementation correctly prevents direct pickling ofSemLockobjects. The error message and pattern match similar implementations inos.rsforDirEntryandScandirIterator.crates/vm/src/dict_inner.rs (1)
727-737: LGTM - Clean implementation for GC edge collection.The method correctly:
- Requires
&mut selfto ensure exclusive access during GC- Resets internal state (
used,filled, indices)- Uses
drain(..).flatten()to handleOption<DictEntry<T>>entries- Returns an iterator yielding owned
(key, value)pairsThis aligns well with the broader GC traversal framework introduced in this PR.
crates/vm/src/vm/mod.rs (2)
553-562: LGTM - Correct suppression of unraisable exceptions during finalization.The early return when
finalizingis set matches CPython's behavior of silently ignoring__del__errors and daemon thread exceptions during interpreter shutdown. TheAcquirememory ordering is correct for synchronizing with theReleasestore that sets the flag.
673-678: LGTM - Appropriate EBR guard reactivation at frame boundary.Reactivating the EBR guard after a frame completes allows the garbage collector to advance epochs and free deferred objects. The frame boundary is a safe point where no Python objects should be in an inconsistent state.
crates/vm/src/builtins/list.rs (2)
27-33: LGTM - Correct pyclass configuration for manual traversal.The
traverse = "manual"andpop_edgesattributes correctly signal that this type implementsTraversemanually rather than via derive macro.
60-74: Silent failure on lock contention may leave edges uncollected.The
pop_edgesimplementation usestry_write()which silently returns if the lock cannot be acquired. While the comment explains that during GC the object should be unreachable, there's a potential edge case where edges remain uncollected if contention occurs.For comparison,
PyTuple::pop_edges(from relevant snippets) usesstd::mem::takedirectly on&mut self.elementswithout lock concerns.Consider either:
- Adding a debug assertion or log when
try_write()fails during GC- Documenting that this is expected behavior under certain conditions
Otherwise, the implementation is functionally correct for the typical GC scenario.
crates/vm/src/builtins/dict.rs (2)
33-39: LGTM - Correct pyclass configuration for manual traversal.Consistent with the
PyListchanges - properly configured for manualTraverseimplementation.
46-59: LGTM - Clean integration withDict::pop_all_entries.The
Traverseimplementation correctly:
- Delegates
traverseto the underlyingDictContentType- Uses
pop_all_entries()to drain all entries during GC- Pushes both keys and values to the output vector
This properly integrates with the new
pop_all_entriesmethod added todict_inner.rs.crates/vm/src/builtins/tuple.rs (3)
6-6: LGTM!Import of
TraverseandTraverseFnis correctly added to support the new traversal implementation.
28-28: LGTM!The pyclass attribute correctly declares manual traversal and pop_edges capability, consistent with the pattern used in PyDict and PyList.
40-52: LGTM!The
Traverseimplementation is correct:
traverseproperly delegates toelements.traverse()to visit contained objectspop_edgescorrectly takes ownership viastd::mem::takeand extends the output vector- The SAFETY comment appropriately documents the safety invariants
This matches the established pattern for other container types (PyDict, PyList) in this PR.
crates/vm/src/vm/interpreter.rs (2)
113-120: LGTM!The finalization step documentation accurately reflects the new shutdown sequence including the finalizing flag, threading._shutdown call, and two-pass GC collection around module cleanup.
152-161: LGTM!The two-pass GC collection around module finalization follows CPython's shutdown semantics:
- First pass collects cycles before module cleanup
- Module finalization breaks references in module namespaces
- Second pass collects newly-unreachable cyclic garbage
Using
collect_force(2)ensures full generation-2 collection regardless of GC enabled state, which is correct for shutdown.crates/vm/src/builtins/function.rs (4)
12-12: LGTM!Import and pyclass attribute changes are correct for enabling manual traversal and pop_edges capability.
Also applies to: 28-28
46-58: LGTM!The
traverseimplementation correctly visits all fields that may contain object references:
globals,closure,defaults_and_kwdefaults(existing)type_params,annotations,module,doc(newly added)Locking before traversal ensures thread-safe access to the guarded values.
76-96: Well-documented edge clearing semantics.The comments clearly explain the rationale for:
- Not clearing annotations (shared dict concern)
- Clearing name/qualname (str subclass cycle potential)
- Not clearing globals/builtins/code (CPython func_clear requirement)
This follows CPython's
func_clearsemantics appropriately.
60-97: Verify Context::genesis() availability and try_lock() behavior during GC shutdown.The code uses
Context::genesis().empty_str(lines 86, 91) to replace name and qualname strings. Confirm that the genesis context is guaranteed to be available and valid during late-stage garbage collection and program shutdown. Additionally, verify whether silent failures fromtry_lock()(lines 67, 85, 89) are acceptable—while safe, unpopped edges from lock contention during GC may require fallback handling or documentation. The code comments justify clearing name/qualname due to str subclass cycles and explain why globals/builtins/code are not cleared; consider extending these comments to address the genesis context and try_lock() patterns.crates/stdlib/src/gc.rs (3)
12-22: LGTM!Debug flag constants correctly mirror the gc_state module definitions, providing Python-level access to GC debugging options.
49-86: LGTM!The
collectfunction correctly:
- Validates generation bounds (0-2)
- Invokes callbacks with "start" phase before collection
- Uses
collect_forcewhich runs regardless of GC enabled state- Transfers garbage objects to the Python-visible
gc.garbagelist- Invokes callbacks with "stop" phase including collection stats
This matches CPython's
gc.collect()behavior.
157-176: LGTM!The
get_objectsimplementation correctly:
- Uses
OptionalArg<Option<i32>>to handle both missing argument and explicitNone- Validates generation bounds (0-2) with a clear error message
- Delegates to
gc_state().get_objects()for the actual workcrates/vm/src/object/core.rs (6)
127-131: LGTM!The
try_pop_edges_objfunction correctly forwards to the payload'stry_pop_edgesmethod through proper pointer casting.
229-233: LGTM!Good handling of the resurrection edge case where
inner.objwas cleared by GC but the object was subsequently resurrected (e.g., during__del__). Restoring the pointer allows new weakrefs to be created correctly.
264-315: LGTM!The two new GC-specific weakref clearing methods implement CPython's weakref semantics correctly:
clear_for_gc: Clears and calls callbacks but doesn't decrement ref_count (owner not being dropped)clear_for_gc_collect_callbacks: Clears without callbacks, enabling the two-phase approach where ALL weakrefs are cleared BEFORE any callbacks runThis prevents callbacks from seeing inconsistent state where some objects have cleared weakrefs and others don't.
1396-1415: LGTM!The tracking logic correctly handles both cases:
- Objects with
IS_TRACE = true(has trace function for traversal)- Objects with instance dicts (user-defined class instances)
Calling
maybe_collect()after tracking ensures GC can run on allocation, matching CPython's behavior. The conditionT::IS_TRACE || has_dictaligns with the untrack condition indrop_slow.
903-906: LGTM!Making
gc_finalizedpublic enables the Python-facinggc.is_finalized()function to check if an object's__del__has been called.
991-1029: Verify the SAFETY claim aboutuntrack_objectimplementation.The drop_slow implementation's overall structure is sound:
- Untrack from GC before deallocation (deferred if lock contention via
try_defer_drop)- Extract child edges via
pop_edges- Deallocate the object
- Drop child edges (triggers recursive destruction with cycles broken)
However, the SAFETY comment at lines 1009-1011 claims that
untrack_object"only removes the pointer address from a HashSet" and "does NOT dereference the pointer." This is critical: ifuntrack_objectactually performs any pointer dereference after the object is deallocated, this would be undefined behavior. Confirm thatuntrack_objecttruly only performs address-based operations (e.g., hashset removal) without dereferencing the pointer.crates/vm/src/gc_state.rs (2)
33-83: GcGeneration API looks solid and thread-safe.
The atomic counters + mutexed stats give a clean, low-risk surface for callers.
158-219: Public GC state accessors are clear and consistent.
The enable/disable, debug, thresholds, counts, and stats APIs line up with expected GC control surfaces.crates/codegen/src/compile.rs (44)
37-38: Import addition looks correct.
42-45: Bytecode import expansion aligns with the new emission path.
277-297: emit! macro coverage looks solid for struct/tuple/no-arg variants.
1188-1193: Confirm PopBlock pseudo-instruction lowering in unwind path.
This unwind branch now emitsPseudoInstruction::PopBlock; please ensure the lowering phase handles it correctly for with/async-with cleanup.
1913-1936: Control-flow lowering to pseudo jump looks fine.
2520-2522: Pseudo jump to end block is appropriate here.
2551-2564: Please double-check stack shape before RERAISE.
This path restoresprev_excviaCopy+PopExceptwhile leavingprev_excon stack; confirm that the VM/exception table expects this layout.
2604-2605: Pseudo jump to else block looks correct.
2686-2690: Jump to handler_normal_exit via pseudo op is fine.
2734-2739: Pseudo jump to finally block looks correct.
2810-2812: Pseudo jump to end block looks fine.
2850-2863: Please re-verify PopExcept placement with lasti on stack.
Stack comments indicate[lasti, prev_exc, exc]remains afterPopExcept; please confirm this matches exception table expectations.
2944-2945: Pseudo jump in try flow looks fine.*
3056-3058: Pseudo jump to next handler block looks fine.
3084-3086: Pseudo jump in handler-error path looks fine.
3108-3112: Pseudo jump to reraise_star block looks correct.
3162-3163: Pseudo jump to end block looks correct.
3202-3204: Pseudo jump to end block looks correct.
4043-4048: While-loop backedge via pseudo jump is fine.
4181-4185: Normal-exit jump in with-statement looks correct.
4253-4257: Suppress-path jump to after-block looks correct.
4341-4342: For-loop backedge via pseudo jump is fine.
4398-4403: Pattern fail-pop jump via pseudo op looks fine.
5414-5416: Cleanup jump in chained comparison looks correct.
5856-5858: Yield-from backedge via pseudo jump looks fine.
5931-5935: Confirm lowering for LoadZeroSuperAttr pseudo-op.
Please verify the lowering phase converts this to the correct real instruction with proper flags.
6121-6127: List-comp init via AnyInstruction looks good.
6148-6154: Set-comp init via AnyInstruction looks good.
6177-6182: Dict-comp init via AnyInstruction looks good.
6257-6262: Ternary expression jump via pseudo op looks fine.
6383-6388: Confirm lowering for LoadSuperMethod/LoadZeroSuperMethod pseudo-ops.
Ensure the lowered instruction preserves the method-call fast path semantics.
6393-6397: Confirm lowering for LoadAttrMethod pseudo-op.
Make sure the lowered instruction still produces the expected[method, self_or_null]stack layout.
6541-6543: compile_comprehension signature update looks consistent.
6687-6689: Comprehension loop backedge via pseudo jump looks fine.
6764-6766: compile_inlined_comprehension signature update looks consistent.
6885-6887: Inlined comprehension loop backedge via pseudo jump looks fine.
6899-6901: Inlined comprehension normal-path jump looks fine.
6973-6981: Generic _emit with AnyInstruction looks correct.
6989-6991: emit_no_arg generic helper looks good.
6993-6999: emit_arg generic helper looks good.
7021-7024: ReturnConst folding with AnyInstruction looks correct.
7472-7476: EmitArg trait signature update looks correct.
7478-7485: EmitArg impl for OpArgType looks correct.
7488-7494: EmitArg impl for BlockIdx looks correct.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Cargo.toml
Outdated
| memoffset = "0.9" | ||
| atomic = "0.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
echo "=== Checking memoffset ==="
curl -s https://crates.io/api/v1/crates/memoffset | jq '{name: .crate.name, newest_version: .crate.newest_version, max_stable_version: .crate.max_stable_version}'
echo -e "\n=== Checking atomic ==="
curl -s https://crates.io/api/v1/crates/atomic | jq '{name: .crate.name, newest_version: .crate.newest_version, max_stable_version: .crate.max_stable_version}'
echo -e "\n=== Checking rustsec database for advisories ==="
# Check if rustsec cargo is available, if not use web API
if command -v cargo-tree &> /dev/null; then
echo "Checking with cargo-audit..."
# We'll use a simple approach to check published advisories
fi
# Check memoffset and atomic versions on crates.io for any yanked versions
echo -e "\n=== Checking memoffset versions ==="
curl -s https://crates.io/api/v1/crates/memoffset/versions | jq '.versions[] | select(.num == "0.9.0" or .num == "0.9") | {num: .num, yanked: .yanked, created_at: .created_at}' | head -5
echo -e "\n=== Checking atomic versions ==="
curl -s https://crates.io/api/v1/crates/atomic/versions | jq '.versions[] | select(.num == "0.5" or .num == "0.5.0") | {num: .num, yanked: .yanked, created_at: .created_at}' | head -5Repository: youknowone/RustPython
Length of output: 584
Consider updating the atomic dependency to a newer version.
memoffset 0.9 is acceptable (latest is 0.9.1, a minor patch ahead). However, atomic 0.5 is significantly outdated with version 0.6.1 currently available. Update to atomic = "0.6" to benefit from bug fixes and improvements released over the past ~4 years.
🤖 Prompt for AI Agents
In `@Cargo.toml` around lines 199 - 200, The Cargo.toml currently pins the atomic
crate at "0.5", which is outdated; update the dependency entry for atomic (the
"atomic" key in Cargo.toml alongside memoffset) to "0.6" (e.g., change atomic =
"0.5" to atomic = "0.6"), then run cargo update -p atomic to refresh Cargo.lock
and verify the build and tests; leave memoffset unchanged unless you also want
to bump it separately.
| pub unsafe fn track_object(&self, obj: NonNull<PyObject>) { | ||
| let gc_ptr = GcObjectPtr(obj); | ||
| self.generations[0].count.fetch_add(1, Ordering::SeqCst); | ||
| self.alloc_count.fetch_add(1, Ordering::SeqCst); | ||
|
|
||
| // Add to generation 0 tracking (for correct gc_refs algorithm) | ||
| if let Ok(mut gen0) = self.generation_objects[0].write() { | ||
| gen0.insert(gc_ptr); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gen0 count resets can lose concurrent allocations.
track_object increments before acquiring the gen0 write lock, while collection uses unconditional store(0). A concurrent allocation during collection can be dropped from the count, delaying future collections.
🛠️ Suggested fix: update counts under the same lock and reset from the set size
impl GcState {
+ fn reset_gen0_count_from_set(&self) {
+ if let Ok(gen0) = self.generation_objects[0].read() {
+ self.generations[0].count.store(gen0.len(), Ordering::SeqCst);
+ }
+ }
+
pub unsafe fn track_object(&self, obj: NonNull<PyObject>) {
let gc_ptr = GcObjectPtr(obj);
- self.generations[0].count.fetch_add(1, Ordering::SeqCst);
- self.alloc_count.fetch_add(1, Ordering::SeqCst);
-
// Add to generation 0 tracking (for correct gc_refs algorithm)
if let Ok(mut gen0) = self.generation_objects[0].write() {
gen0.insert(gc_ptr);
+ self.generations[0].count.fetch_add(1, Ordering::SeqCst);
+ self.alloc_count.fetch_add(1, Ordering::SeqCst);
}
...
}
}Apply self.reset_gen0_count_from_set() in place of each self.generations[0].count.store(0, ...) occurrence in collect_inner.
Also applies to: 422-426, 515-517, 545-547, 692-694
🤖 Prompt for AI Agents
In `@crates/vm/src/gc_state.rs` around lines 226 - 234, track_object currently
increments generations[0].count and alloc_count before acquiring the
generation_objects[0] write lock, which races with collect_inner calling
generations[0].count.store(0, ...) and can lose concurrent allocations; to fix,
ensure gen0 count is updated/cleared under the same lock by replacing every
unconditional generations[0].count.store(0, ...) in collect_inner with a call to
self.reset_gen0_count_from_set() so the count is derived from the protected
generation_objects[0] set, and keep track_object as-is but rely on
reset_gen0_count_from_set to reconcile counts atomically with the gen0 write
lock (apply the same replacement at the other occurrences flagged: the
collect_inner spots around lines noted and any other places that zero gen0
directly).
* Updated urllib + urllib tests * Updated http + test * Annotated failing tests * Fixed issues in httpservers, robotparser and urllib2net * Fixed windows only success * Annotated flaky test in test_logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/vm/src/builtins/function.rs (1)
46-58: Missing traversal ofcodeandbuiltinsfields.The
traverseimplementation doesn't visitself.codeandself.builtins, both of which holdPyObjectRef/PyRefthat may participate in reference cycles. Whilepop_edgesintentionally skips these (matching CPython'sfunc_clear), the traversal phase should still visit them so the GC can discover all reachable objects.Proposed fix
unsafe impl Traverse for PyFunction { fn traverse(&self, tracer_fn: &mut TraverseFn<'_>) { + self.code.lock().traverse(tracer_fn); self.globals.traverse(tracer_fn); + self.builtins.traverse(tracer_fn); if let Some(closure) = self.closure.as_ref() { closure.as_untyped().traverse(tracer_fn); } self.defaults_and_kwdefaults.traverse(tracer_fn); // Traverse additional fields that may contain references self.type_params.lock().traverse(tracer_fn); self.annotations.lock().traverse(tracer_fn); self.module.lock().traverse(tracer_fn); self.doc.lock().traverse(tracer_fn); }crates/vm/src/vm/interpreter.rs (1)
113-152: Documentation and code order mismatch for finalization steps.The doc comment (lines 116-117) states:
- Set finalizing flag (suppresses unraisable exceptions)
- Call
threading._shutdown()to join non-daemon threadsBut the actual code does the opposite:
- Line 135:
wait_for_thread_shutdown(vm)(callsthreading._shutdown())- Line 139:
vm.state.finalizing.store(true, Ordering::Release)This means unraisable exceptions from
threading._shutdown()are not suppressed. If this is intentional, update the doc comment to match. Otherwise, swap the order so exceptions are suppressed during shutdown.Proposed fix (if suppression is intended)
- // Wait for non-daemon threads (wait_for_thread_shutdown) - wait_for_thread_shutdown(vm); - // Suppress unraisable exceptions from daemon threads and __del__ // methods during shutdown. vm.state.finalizing.store(true, Ordering::Release); + // Wait for non-daemon threads (wait_for_thread_shutdown) + wait_for_thread_shutdown(vm); + atexit::_run_exitfuncs(vm);
🤖 Fix all issues with AI agents
In `@crates/vm/src/gc_state.rs`:
- Around line 566-586: The loop calling weakref callbacks currently ignores the
Option returned by crate::vm::thread::with_vm, causing silent skips when no VM
is available; change the for loop that iterates over all_callbacks (tuples of
crate::PyRef<crate::object::PyWeak> and crate::PyObjectRef> produced by
gc_clear_weakrefs_collect_callbacks) to explicitly handle with_vm(&cb, |vm| {
... })’s return: if with_vm returns Some(vm) call cb.call((wr.clone(),), vm) and
handle its Result (log or propagate errors), and if with_vm returns None log a
warning or otherwise surface the missing-VM error instead of using let _ = and
silently ignoring it. Ensure you reference the with_vm invocation and the
cb.call invocation when implementing the handling.
♻️ Duplicate comments (4)
crates/vm/src/object/core.rs (1)
1135-1140: Potential aliasing issue remains ingc_pop_edges.This method takes
&selfbut immediately casts to*mut PyObject, which can violate Rust's aliasing rules. The past review flagged this issue - consider using only the raw pointer variantgc_pop_edges_rawand removing this&selfwrapper, or document why this is safe in this specific context.The safety comment mentions "exclusive access during GC", but the function signature doesn't enforce this guarantee.
crates/derive-impl/src/pyclass.rs (1)
632-645: Add missing lifetime parameter toTraverseFnin generatedtry_traversesignature.The generated
try_traversesignature at line 637 uses&mut ::rustpython_vm::object::TraverseFnwithout a lifetime parameter. However,TraverseFnis defined aspub type TraverseFn<'a> = dyn FnMut(&PyObject) + 'a;and the trait implementations consistently useTraverseFn<'_>. The generated code should match:Suggested fix
- fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) { + fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) {crates/stdlib/src/gc.rs (1)
189-196: Incomplete implementation:get_referrersalways returns an empty list.This was flagged in a previous review and remains unaddressed. The function documents returning "the list of objects that directly refer to any of the arguments" but always returns empty.
crates/vm/src/gc_state.rs (1)
426-431: Gen0 count reset can still lose concurrent allocations.The previous review flagged that
generations[0].count.store(0, ...)can race withtrack_object. Whiletrack_objectnow increments under the write lock (lines 234-238),collect_innerstill uses unconditionalstore(0)without holding the same lock.Race scenario:
collect_innerreads gen0 objects and determines they're all reachabletrack_objectacquires write lock, inserts new object, increments count to 1collect_innerexecutesstore(0), losing the new allocation's countThis delays future automatic collections since gen0 count won't accurately reflect tracked objects.
Also applies to: 517-525, 548-554, 698-701
🧹 Nitpick comments (4)
crates/vm/src/object/core.rs (1)
990-1022: Clarify comment about untrack_object safety.The safety comment at lines 1002-1003 states:
// SAFETY: untrack_object only removes the pointer address from a HashSet. // It does NOT dereference the pointer, so it's safe even after deallocation.However,
untrack_objectis called before deallocation (line 1004), so the "after deallocation" part of the comment is misleading. The comment appears to be explaining why the deferred closure is safe (since it may run after deallocation), but this isn't clear.Suggested clarification
- // SAFETY: untrack_object only removes the pointer address from a HashSet. - // It does NOT dereference the pointer, so it's safe even after deallocation. + // SAFETY: The deferred closure may run after the object is deallocated. + // This is safe because untrack_object only removes the pointer address + // from a HashSet without dereferencing it.crates/vm/src/vm/mod.rs (1)
1152-1219: Consider refining the stdlib detection heuristics.The hardcoded stdlib module list (lines 1181-1201) and path-based detection (
pylib,Libat line 1213) are fragile:
- The module list may become stale as stdlib evolves
- Path substrings may match unintended directories (e.g., a user module in
myLib/)Consider using a more robust approach such as checking if the module is a builtin (
module.__loader__type) or usingsys.stdlib_module_names(Python 3.10+).That said, since this is defensive cleanup during shutdown where errors are acceptable, the current approach is functional.
💡 Alternative approach using loader type
- // Skip stdlib modules (starting with _ or known stdlib names) - if name.starts_with('_') - || matches!( - name, - "sys" - | "builtins" - ... - ) - { - continue; - } + // Skip stdlib modules by checking if they're builtin/frozen + if name.starts_with('_') { + continue; + } + if let Some(module) = value.downcast_ref::<PyModule>() + && let Ok(loader) = module.dict().get_item("__loader__", self) + { + let loader_name = loader.class().name().to_string(); + if loader_name.contains("BuiltinImporter") + || loader_name.contains("FrozenImporter") + { + continue; + } + }crates/stdlib/src/gc.rs (1)
242-265: Consider logging errors from GC callbacks instead of silently ignoring them.CPython uses
PyErr_WriteUnraisablefor callback errors, which logs them viasys.unraisablehook. Currently, all errors are silently swallowed withlet _ =. This could make debugging difficult when callbacks fail.💡 Suggestion to log callback errors
for callback in callbacks { - let _ = callback.call((phase_str.clone(), info.clone()), vm); + if let Err(exc) = callback.call((phase_str.clone(), info.clone()), vm) { + // Log via unraisable hook like CPython does + vm.write_unraisable(exc, Some("gc callback".to_owned()), None); + } }crates/vm/src/gc_state.rs (1)
806-842: Test coverage is limited to basic configuration.The tests verify configuration methods but don't test the core GC functionality:
- Object tracking/untracking
- Collection algorithm (cycle detection, resurrection handling)
- Generation promotion
- freeze/unfreeze with actual objects
Consider adding integration tests with mock or real PyObject instances to verify the collection algorithm works correctly.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (14)
Cargo.lockis excluded by!**/*.lockLib/json/__init__.pyis excluded by!Lib/**Lib/json/decoder.pyis excluded by!Lib/**Lib/json/encoder.pyis excluded by!Lib/**Lib/test/support/__init__.pyis excluded by!Lib/**Lib/test/test_builtin.pyis excluded by!Lib/**Lib/test/test_dict.pyis excluded by!Lib/**Lib/test/test_gc.pyis excluded by!Lib/**Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_json/test_encode_basestring_ascii.pyis excluded by!Lib/**Lib/test/test_json/test_scanstring.pyis excluded by!Lib/**Lib/test/test_json/test_unicode.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_weakref.pyis excluded by!Lib/**
📒 Files selected for processing (30)
.cspell.jsoncrates/common/Cargo.tomlcrates/common/src/lib.rscrates/common/src/refcount.rscrates/derive-impl/src/pyclass.rscrates/derive-impl/src/pystructseq.rscrates/derive-impl/src/util.rscrates/stdlib/src/gc.rscrates/vm/src/builtins/asyncgenerator.rscrates/vm/src/builtins/coroutine.rscrates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/generator.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/str.rscrates/vm/src/builtins/tuple.rscrates/vm/src/coroutine.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rscrates/vm/src/gc_state.rscrates/vm/src/lib.rscrates/vm/src/object/core.rscrates/vm/src/object/traverse.rscrates/vm/src/object/traverse_object.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/context.rscrates/vm/src/vm/interpreter.rscrates/vm/src/vm/mod.rscrates/vm/src/vm/thread.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- crates/derive-impl/src/util.rs
- crates/vm/src/stdlib/sys.rs
- crates/vm/src/dict_inner.rs
- crates/common/Cargo.toml
- crates/vm/src/builtins/str.rs
- crates/vm/src/builtins/list.rs
🧰 Additional context used
🧬 Code graph analysis (16)
crates/vm/src/vm/context.rs (3)
crates/vm/src/object/core.rs (2)
new_ref(1361-1378)default(413-415)crates/vm/src/builtins/list.rs (1)
new_ref(91-93)crates/vm/src/gc_state.rs (1)
default(130-132)
crates/vm/src/builtins/coroutine.rs (3)
crates/vm/src/builtins/asyncgenerator.rs (5)
traverse(29-32)traverse(215-217)traverse(271-274)traverse(373-376)traverse(532-535)crates/vm/src/builtins/generator.rs (1)
traverse(24-26)crates/vm/src/object/traverse.rs (7)
traverse(35-35)traverse(44-46)traverse(50-52)traverse(56-56)traverse(61-65)traverse(73-77)traverse(85-89)
crates/vm/src/coroutine.rs (2)
crates/vm/src/frame.rs (2)
traverse(101-103)traverse(107-118)crates/vm/src/object/traverse.rs (6)
traverse(35-35)traverse(44-46)traverse(50-52)traverse(56-56)traverse(61-65)traverse(73-77)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/generator.rs (1)
traverse(24-26)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/object/traverse.rs (1)
pop_edges(40-40)crates/vm/src/builtins/dict.rs (1)
pop_edges(52-58)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
gc_state(802-804)
crates/vm/src/vm/mod.rs (1)
crates/vm/src/vm/thread.rs (1)
reactivate_guard(56-62)
crates/vm/src/builtins/tuple.rs (3)
crates/vm/src/builtins/dict.rs (2)
traverse(48-50)pop_edges(52-58)crates/vm/src/builtins/list.rs (2)
traverse(62-64)pop_edges(66-73)crates/vm/src/object/traverse.rs (1)
pop_edges(40-40)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
debug_obj(87-93)drop_dealloc_obj(84-86)try_pop_edges_obj(128-131)try_trace_obj(96-100)
crates/vm/src/frame.rs (4)
crates/vm/src/builtins/asyncgenerator.rs (5)
traverse(29-32)traverse(215-217)traverse(271-274)traverse(373-376)traverse(532-535)crates/vm/src/builtins/coroutine.rs (2)
traverse(22-24)traverse(157-159)crates/vm/src/object/traverse.rs (6)
traverse(35-35)traverse(44-46)traverse(50-52)traverse(56-56)traverse(61-65)traverse(73-77)crates/vm/src/exceptions.rs (2)
traverse(28-33)traverse(1634-1652)
crates/derive-impl/src/pystructseq.rs (2)
crates/vm/src/object/traverse.rs (2)
try_traverse(20-20)try_pop_edges(22-22)crates/vm/src/builtins/str.rs (2)
try_traverse(1930-1932)try_pop_edges(1934-1936)
crates/vm/src/vm/thread.rs (1)
crates/vm/src/vm/mod.rs (1)
new(145-260)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
ensure_pinned(42-48)drop_guard(67-71)
crates/vm/src/object/traverse.rs (5)
crates/vm/src/builtins/str.rs (1)
try_pop_edges(1934-1936)crates/vm/src/builtins/dict.rs (1)
pop_edges(52-58)crates/vm/src/builtins/function.rs (1)
pop_edges(60-97)crates/vm/src/builtins/list.rs (1)
pop_edges(66-73)crates/vm/src/builtins/tuple.rs (1)
pop_edges(47-51)
crates/derive-impl/src/pyclass.rs (2)
crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/object/traverse_object.rs (3)
traverse(49-51)traverse(56-70)traverse(75-84)
crates/vm/src/builtins/generator.rs (3)
crates/vm/src/builtins/asyncgenerator.rs (5)
traverse(29-32)traverse(215-217)traverse(271-274)traverse(373-376)traverse(532-535)crates/vm/src/builtins/coroutine.rs (2)
traverse(22-24)traverse(157-159)crates/vm/src/object/traverse.rs (6)
traverse(35-35)traverse(44-46)traverse(50-52)traverse(56-56)traverse(61-65)traverse(73-77)
crates/vm/src/gc_state.rs (1)
crates/stdlib/src/gc.rs (6)
disable(32-34)get_count(113-122)get_stats(138-155)is_finalized(207-210)collect(50-86)callbacks(238-240)
🪛 Biome (2.1.2)
.cspell.json
[error] 69-69: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 69-69: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 121-121: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 121-121: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 122-122: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 122-122: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 123-123: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 123-123: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 124-124: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 124-124: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 125-125: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 125-125: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 126-126: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 126-126: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 127-127: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 127-127: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check Rust code with clippy
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests on wasm-wasi
🔇 Additional comments (47)
.cspell.json (1)
69-127: LGTM!The spell-check additions are appropriate for the GC terminology introduced in this PR. The static analysis hints from Biome are false positives - it's incorrectly treating JSONC (JSON with comments) as strict JSON.
crates/derive-impl/src/pyclass.rs (2)
590-608: Verify intentional error swallowing for traverse attribute parsing.The code intentionally uses
.ok().flatten()to treatErrasNone, which the comment explains is for handling baretraverseattributes (without a value). However, this also silently ignores genuine parse errors like#[pyclass(traverse(123))].Consider whether a more explicit match would be safer:
let value = match class_meta.inner()._optional_str("traverse") { Ok(v) => v, Err(_) if class_meta.inner()._has_key("traverse").unwrap_or(false) => None, // bare traverse Err(e) => return Err(e), // genuine parse error };This preserves the intended behavior for bare
traversewhile surfacing actual parse failures.
698-706: LGTM!The
MaybeTraverseimplementation is properly generated and integrated into the final token stream. The structure correctly handles bothtraverseandpop_edgesattributes with appropriate conditional logic.crates/vm/src/vm/thread.rs (2)
25-31: LGTM!The thread-local storage for async generator hooks and EBR guard is well-structured. Using
RefCellis appropriate since these are only accessed from the owning thread.
36-71: LGTM!The EBR guard management functions correctly implement the coarse-grained pinning strategy:
ensure_pinned()is idempotent and only pins if not already pinnedreactivate_guard()properly callsrepin()to allow epoch advancement at safe pointsdrop_guard()cleanly unpins when the thread exitsThe documentation clearly explains the purpose and usage context.
crates/vm/src/stdlib/thread.rs (4)
3-6: LGTM!The
self::prefix makes the re-export path more explicit.
429-459: LGTM!The EBR guard lifecycle is properly integrated:
ensure_pinned()at thread start protects object access during thread executiondrop_guard()is correctly placed last, after all cleanup that might access objects
525-530: LGTM!Using
is_some_and()is more idiomatic thanmap_or(false, ...)for this pattern.
676-682: LGTM!The SystemExit handling now properly checks for subclasses using
fast_issubclass(), which matches CPython's behavior of treating allSystemExitsubclasses as exit signals. Theis_some_andpattern correctly handles cases whereexc_typemight not be aPyType.crates/vm/src/object/core.rs (7)
127-131: LGTM!The
try_pop_edges_objfunction correctly handles payload-level edge extraction through raw pointers, matching the vtable-based dispatch pattern used elsewhere.
229-233: LGTM!This correctly handles the resurrection edge case where a new weakref is created during
__del__after GC has cleared the object pointer. Restoring the pointer ensures the weakref functions correctly.
257-381: LGTM!The weakref clearing refactoring correctly implements CPython-compatible GC semantics:
clear_for_gc_collect_callbacks()collects callbacks for batch invocation after all weakrefs are cleared- The
already_clearedcheck at line 320 prevents double ref_count decrement when both GC anddrop_slow_innertrigger clearing- The flags provide proper separation of concerns for different clearing scenarios
1359-1378: LGTM!The GC tracking in
new_refcorrectly mirrors the untracking logic indrop_slow:
- Same condition:
IS_TRACE || has_dict- Immediate GC check via
maybe_collect()after trackingThis ensures consistent object lifecycle management.
1042-1094: LGTM!The new GC utility methods are well-implemented:
is_gc_tracked()correctly checks both trace capability and dict presencetry_call_finalizer()properly sets the finalized flag before calling__del__to prevent double invocationgc_get_referents()andgc_clear_weakrefs_collect_callbacks()are clean delegations
1096-1126: LGTM!The raw pointer GC methods are properly marked
unsafewith clear safety documentation.gc_pop_edges_rawis the preferred API as it makes the caller responsible for ensuring exclusive access.
910-972: Resurrection detection logic is sound and handles the edge cases correctly.The logic works as intended:
inc()from 0 performs a double increment (0→2) for thread safety- After
__del__completes,after_delis compared toafter_inc(which is 2)- If
after_del > after_inc, resurrection detected; otherwise, normal destructionThe concern about
__del__creating and dropping temporary references is already handled: the code measuresafter_delonly after__del__completes, so any temporary refs created and dropped during execution are no longer visible in the final count. Only external references held beyond__del__would increase the count beyond the expected 2, correctly triggering resurrection detection.crates/vm/src/lib.rs (1)
80-80: LGTM! New GC state module export.The module is correctly placed alphabetically among other public modules. The unconditional export (no feature gate) is appropriate since GC state should be available regardless of configuration.
crates/common/src/lib.rs (1)
17-17: LGTM! Epoch-based reclamation re-export.Re-exporting
crossbeam_epochasepochprovides a clean, consistent API for EBR primitives across the crate. This aligns with the broader GC/EBR integration in this PR.crates/vm/src/vm/context.rs (2)
54-58: LGTM! GC-related fields for CPython compatibility.The
gc_callbacksandgc_garbagefields align with CPython'sgc.callbacksandgc.garbageAPI. UsingPyListRefprovides thread-safe access through internal locking.
335-362: LGTM! GC fields initialization follows established patterns.The initialization of
gc_callbacksandgc_garbageis consistent with other Context fields (e.g.,empty_tuple,int_cache_pool). Both are correctly initialized as empty lists.crates/vm/src/vm/mod.rs (2)
552-562: LGTM! Suppress unraisable exceptions during finalization.The early return when
state.finalizingis true correctly mirrors CPython's behavior of silently ignoring__del__errors and daemon thread exceptions during interpreter shutdown. TheAcquireordering is appropriate for reading a flag potentially set by another thread.
673-678: LGTM! EBR guard reactivation at frame boundary.Frame boundaries are appropriate safe points for epoch advancement since borrowed references from the frame are no longer in use. The
#[cfg(feature = "threading")]guard correctly scopes this to threaded builds where EBR is relevant.crates/vm/src/coroutine.rs (1)
42-49: LGTM! Traverse implementation for coroutine state.The implementation correctly traverses all fields containing object references (
frame,name,qualname,exception) while skipping primitive fields (closed,running). All traversed fields have proper Traverse implementations:PyRef<T>delegates to the object,PyMutex<T>safely locks and traverses viatry_lock()to avoid deadlocks, andOption<T>delegates to the inner type. This aligns with other Traverse implementations in the codebase.crates/vm/src/builtins/generator.rs (1)
17-27: LGTM! Correct manual traversal implementation.The
traverse = "manual"annotation and correspondingunsafe impl Traversecorrectly delegate to the innerCoro, consistent with the patterns incoroutine.rsandasyncgenerator.rs.crates/derive-impl/src/pystructseq.rs (1)
608-619: LGTM! Proper MaybeTraverse scaffolding for struct sequences.The constants
IS_TRACE = trueandHAS_POP_EDGES = falsecorrectly indicate traversal is needed but edge popping is not required. The no-optry_pop_edgesand delegation to the innerPyTupleviatry_traversealigns with other similar implementations likePyUtf8Str.crates/vm/src/builtins/function.rs (1)
60-97: Good defensive use oftry_lock()inpop_edges.Using
try_lock()instead oflock()prevents potential deadlocks during GC if another thread holds the mutex. The comments clearly document CPython'sfunc_clearsemantics and the rationale for not clearing annotations.crates/vm/src/frame.rs (1)
100-119: LGTM! Comprehensive traversal of Frame and FrameState.All fields containing
PyObjectRefor related types are properly traversed. The implementation correctly coverscode,func_obj,fastlocals,cells_frees,locals,globals,builtins,trace,state, andtemporary_refs. Primitive fields likelastiandtrace_linesare correctly excluded.crates/vm/src/builtins/coroutine.rs (2)
14-25: LGTM! Consistent traversal implementation for PyCoroutine.The manual traversal correctly delegates to
self.inner.traverse(), matching the pattern established ingenerator.rsandasyncgenerator.rs.
148-160: LGTM! Proper traversal for PyCoroutineWrapper.The traversal correctly visits the contained
PyRef<PyCoroutine>. TheAtomicCell<bool>fieldclosedis a primitive and correctly excluded from traversal.crates/vm/src/vm/interpreter.rs (1)
143-152: GC-driven cleanup sequence looks correct.The two-phase GC approach (collect before module cleanup, then after) is a sound strategy for breaking cycles and collecting garbage that becomes eligible after module finalization.
crates/vm/src/object/traverse_object.rs (1)
14-45: VTable extension for edge popping looks correct.The
pop_edgesfield follows the same pattern as the existingtracefield, using a const block to conditionally set the function pointer based onT::HAS_POP_EDGES. The approach is consistent with the established vtable design.crates/vm/src/builtins/tuple.rs (1)
40-52: Traverse implementation for PyTuple is correct.The implementation properly handles traversal and edge popping.
into_vec()onBox<[T]>is stable and correctly converts the boxed slice to a Vec for extending the output.Note: The implementation only applies to
PyTuple<PyObjectRef>(the default generic parameter), which is correct since only the default tuple type needs GC traversal.crates/vm/src/object/traverse.rs (2)
14-41: MaybeTraverse and Traverse trait extensions are well-designed.The additions follow the established pattern:
HAS_POP_EDGESmirrorsIS_TRACEfor compile-time capability detectiontry_pop_edgesandpop_edgeshave safe default no-op implementations- Documentation clearly explains the purpose (edge extraction before deallocation)
104-114: BoxVec Traverse implementation is consistent.The implementation follows the same pattern as
Vec<T>andBox<[T]>, iterating through elements and calling traverse on each.crates/vm/src/builtins/dict.rs (1)
46-59: Traverse implementation for PyDict is correct.The implementation properly:
- Traverses entries during GC marking
- Pops all keys and values during edge extraction
The
pop_all_entries()method exists onDictContentTypeand returns an iterator of(PyObjectRef, T)tuples, making the iteration and collection of keys and values correct.crates/vm/src/builtins/asyncgenerator.rs (3)
17-33: LGTM! Traverse implementation for PyAsyncGen is correct.The implementation properly traverses both the
innerCoro (which contains the frame and other GC-relevant fields) andag_finalizer(which holds a callback PyObjectRef). Therunning_asyncandag_hooks_initedAtomicCell fields correctly don't need traversal as they contain primitive values.
493-511: LGTM! Proper state management in error paths.Setting
inner.closed.store(true)in bothyield_closeandcheck_errorensures the async generator is properly marked as closed when it terminates due to errors or ignored GeneratorExit. This is consistent with CPython's behavior.
210-218: Traverse implementations for wrapper types are correct.All async generator helper types properly traverse their fields. The tuple type
(PyObjectRef, PyObjectRef, PyObjectRef)forPyAsyncGenAThrow::valueworks correctly due to the blanketTraverseimplementation for tuples (defined incrates/vm/src/object/traverse.rsline 153), which handles tuples of any number ofTraversetypes.Applies to lines: 210-218, 262-275, 363-377, 523-536
crates/stdlib/src/gc.rs (1)
49-86: LGTM! The collect implementation correctly handles manual GC calls.The implementation properly:
- Validates generation bounds (0-2)
- Invokes callbacks with "start" phase before collection
- Uses
collect_forceto run even when GC is disabled (matching CPython'sgc.collect()behavior)- Transfers garbage to
vm.ctx.gc_garbageforDEBUG_SAVEALLsupport- Invokes callbacks with "stop" phase and actual collection stats
crates/common/src/refcount.rs (4)
28-39: LGTM! Bit layout constants are correctly defined.The previous bit overlap issue has been fixed. With
TOTAL_COUNT_WIDTH = u64::BITS - EPOCH_WIDTH - 3, the layout is:
- Bits 63: DESTRUCTED
- Bits 62: WEAKED
- Bits 61: LEAKED
- Bits 31-60: weak_count (30 bits)
- Bits 0-30: strong_count (31 bits)
No overlap between flags and count fields.
208-240: LGTM! Panic-safe deferred drop context implementation.The
DeferredDropGuardRAII pattern correctly addresses the previous review concern about panic safety. The guard restores the context flag and flushes deferred drops even on panic/unwind.
277-295: LGTM! The defer_destruction function is correctly documented and implemented.The safety requirements are clearly documented, and the function correctly delegates to crossbeam-epoch's
defer_unchecked. TheSend + 'staticbounds onFare appropriate since the closure may execute on a different thread after epoch advancement.
118-131: Clarify the comment for inc() double-increment behavior when strong count is zero.The double-increment at lines 127-130 is intentional and correct—when strong count reaches 0 before
inc(), it increments to 2. This provides a "permission" mechanism for thread safety during object resurrection, as documented in the calling code (crates/vm/src/object/core.rs:920-921).However, the comment "created a permission to run decrement again" at line 128 is vague. Align it with the clearer explanation in the usage documentation: "double increment for thread safety" to help future maintainers understand why
inc()from 0 produces strong=2 instead of strong=1.crates/vm/src/gc_state.rs (4)
256-269: LGTM! untrack_object now correctly decrements the appropriate generation's count.The fix addresses the previous review concern by iterating through all generations and decrementing the count only for the generation where the object was actually found.
711-746: LGTM! promote_survivors now correctly updates generation counts.The fix addresses the previous review concern by decrementing the source generation's count (lines 726-731) and incrementing the target generation's count (lines 737-740) when moving objects between generations.
753-792: LGTM! freeze/unfreeze now properly move objects between sets.The fix addresses the previous review concern. Both functions now:
- Actually move object pointers between the generation sets and
permanent_objects- Update the counts to reflect the actual set sizes
Frozen objects are now properly excluded from collection since they're removed from
generation_objects.
394-402: LGTM! EBR integration for collection safety.The implementation correctly:
- Enters an EBR critical section before analyzing the object graph (line 398)
- Uses a memory fence to ensure visibility of reference count updates from other threads (line 402)
- Flushes EBR deferred operations before exiting (line 706)
This ensures that objects being freed by other threads won't have their memory deallocated until the collection phase completes.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| // 6c: Clear existing weakrefs BEFORE calling __del__ | ||
| // This invalidates existing weakrefs, but new weakrefs created during __del__ | ||
| // will still work (WeakRefList::add restores inner.obj if cleared) | ||
| // | ||
| // CRITICAL: We use a two-phase approach to match CPython behavior: | ||
| // Phase 1: Clear ALL weakrefs (set inner.obj = None) and collect callbacks | ||
| // Phase 2: Invoke ALL callbacks | ||
| // This ensures that when a callback runs, ALL weakrefs to unreachable objects | ||
| // are already dead (return None when called). | ||
| let mut all_callbacks: Vec<(crate::PyRef<crate::object::PyWeak>, crate::PyObjectRef)> = | ||
| Vec::new(); | ||
| for obj_ref in &unreachable_refs { | ||
| let callbacks = obj_ref.gc_clear_weakrefs_collect_callbacks(); | ||
| all_callbacks.extend(callbacks); | ||
| } | ||
| // Phase 2: Now call all callbacks - at this point ALL weakrefs are cleared | ||
| for (wr, cb) in all_callbacks { | ||
| crate::vm::thread::with_vm(&cb, |vm| { | ||
| let _ = cb.call((wr.clone(),), vm); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
rg -n "pub fn with_vm" --type rust -B 2 -A 15Repository: youknowone/RustPython
Length of output: 1214
🏁 Script executed:
rg -n "pub fn with_vm" --type rust -A 25Repository: youknowone/RustPython
Length of output: 1893
🏁 Script executed:
sed -n '576,590p' crates/vm/src/gc_state.rs | cat -nRepository: youknowone/RustPython
Length of output: 820
Handle the fallible return from with_vm when invoking weakref callbacks.
At lines 583-586, the code ignores the Option returned by with_vm(&cb, |vm| ...). If no VM is available for the callback object, the callback silently fails to execute and the weakref cleanup is incomplete. Replace let _ = with explicit error handling:
Current code (lines 583-586)
// Phase 2: Now call all callbacks - at this point ALL weakrefs are cleared
for (wr, cb) in all_callbacks {
crate::vm::thread::with_vm(&cb, |vm| {
let _ = cb.call((wr.clone(),), vm);
});
}
Either propagate the error, log a warning, or panic if no VM is found—do not silently skip callback execution.
🤖 Prompt for AI Agents
In `@crates/vm/src/gc_state.rs` around lines 566 - 586, The loop calling weakref
callbacks currently ignores the Option returned by crate::vm::thread::with_vm,
causing silent skips when no VM is available; change the for loop that iterates
over all_callbacks (tuples of crate::PyRef<crate::object::PyWeak> and
crate::PyObjectRef> produced by gc_clear_weakrefs_collect_callbacks) to
explicitly handle with_vm(&cb, |vm| { ... })’s return: if with_vm returns
Some(vm) call cb.call((wr.clone(),), vm) and handle its Result (log or propagate
errors), and if with_vm returns None log a warning or otherwise surface the
missing-VM error instead of using let _ = and silently ignoring it. Ensure you
reference the with_vm invocation and the cb.call invocation when implementing
the handling.
* Add all other bytecodes * Mark passing/failing tests
* correct buitins type * Auto-format: ruff format --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/common/src/refcount.rs`:
- Around line 118-131: Replace the cryptic comment inside inc() (around the
second self.state.fetch_add(COUNT, Ordering::SeqCst)) with an explicit
explanation of the 0→2 resurrection pattern: explain that when the previous
value's strong() == 0 the first fetch_add created a transient permission for a
concurrent decrement path, so inc() must perform a second fetch_add to raise the
strong count from 0 to 2 (not 1) to avoid races with concurrent drop/decrement;
reference the involved symbols (inc, State::from_raw, self.state.fetch_add,
COUNT, destructed, strong) and briefly state this is done for thread-safety so a
concurrent decrement cannot observe an intermediate 1 and prematurely destroy
the object.
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 693-784: The specialized (BinaryOpAddFloat, CallBuiltinFast,
ForIterGen, UnpackSequenceList, InstrumentedCall, etc.) and instrumented
variants are being created via unsafe transmute in the TryFrom<u8> ranges
(specialized_start..=specialized_end and instrumented_start..=instrumented_end)
but their matches in the Instruction::stack_effect method all return 0, which
miscounts stack usage; fix by mapping those specialized/instrumented variants to
the equivalent generic variants (e.g., BinaryOpAddFloat -> BinaryOp,
CallBuiltinFast/CallPyExactArgs/etc. -> Call with correct nargs handling,
ForIterGen -> ForIter with same conditional logic, UnpackSequenceList ->
UnpackSequence, Instrumented* -> corresponding base instruction) before
computing stack effect or implement correct stack_effect arms that compute the
same values as their generic counterparts, ensuring code paths reference the
Instruction enum variants (BinaryOp, Call, ForIter, UnpackSequence,
Instrumented* mapping) and preserve existing semantics used by stack-size
verification.
In `@crates/vm/src/vm/interpreter.rs`:
- Around line 137-141: The finalizing flag is set before running atexit handlers
which causes run_unraisable to short-circuit and suppress errors from atexit;
update the shutdown sequence in interpreter.rs so that
atexit::_run_exitfuncs(vm) is called first and only after it completes set
vm.state.finalizing.store(true, Ordering::Release) (ensure any logic that relies
on finalizing remains correct after the move); reference vm.state.finalizing and
atexit::_run_exitfuncs when making the change.
♻️ Duplicate comments (2)
crates/derive-impl/src/pyclass.rs (1)
637-639: Add lifetime parameter toTraverseFntype.The generated
try_traversemethod signature is missing the lifetime parameter onTraverseFn. According to the trait definition intraverse.rs, the signature should befn try_traverse(&self, traverse_fn: &mut TraverseFn<'_>).Proposed fix
- fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn) { + fn try_traverse(&self, tracer_fn: &mut ::rustpython_vm::object::TraverseFn<'_>) { `#try_traverse_body` }crates/stdlib/src/gc.rs (1)
189-196: Incompleteget_referrersimplementation.This returns an empty list but the issue was already flagged in a previous review.
🧹 Nitpick comments (10)
scripts/auto_mark_test.py (1)
67-74: Class-level mutable defaults are a Python footgun.
tests = []andunexpected_successes = []at class level means all instances share the same list objects. Whileparse_resultsexplicitly reinitializes them (lines 80-81), this pattern is error-prone ifTestResultis instantiated elsewhere without reassignment.♻️ Suggested fix using __init__
class TestResult: tests_result: str = "" - tests = [] - unexpected_successes = [] # Tests that passed but were marked as expectedFailure stdout = "" + + def __init__(self): + self.tests = [] + self.unexpected_successes = [] # Tests that passed but were marked as expectedFailureThen in
parse_results, remove the redundant reassignments:def parse_results(result): lines = result.stdout.splitlines() test_results = TestResult() - test_results.tests = [] - test_results.unexpected_successes = [] test_results.stdout = result.stdoutcrates/derive-impl/src/pyclass.rs (1)
622-630: Consider validating thatpop_edgesrequirestraverse.If a user specifies
#[pyclass(pop_edges)]withouttraverse, the generated code will callTraverse::pop_edgeson a type that doesn't implementTraverse, resulting in a compile error. While the error is caught at compile time, adding an early validation would provide a clearer error message at macro expansion time.Optional: Add validation
// Add after line 588 if has_pop_edges && !has_traverse { bail_span!( item, "#[pyclass(pop_edges)] requires #[pyclass(traverse)] to be specified", ); }crates/common/Cargo.toml (1)
34-35: Consider using workspace dependency management for consistency.Most dependencies in this file use
{ workspace = true }for centralized version management. While some inline versions exist (e.g.,lock_api,siphasher), addingcrossbeam-epochto the workspace's[workspace.dependencies]would improve consistency and simplify future version updates across the crate.crates/common/src/lib.rs (1)
17-17: Full crate re-export exposes broad API surface.Re-exporting the entire
crossbeam_epochcrate publicly couples downstream consumers to its full API. If only specific types are needed externally (e.g.,Guard,pin), consider re-exporting only those to minimize the public API surface and insulate consumers from upstream breaking changes.// Example: selective re-export pub mod epoch { pub use crossbeam_epoch::{pin, Guard}; }crates/common/src/refcount.rs (1)
263-275: Consider panic resilience inflush_deferred_drops.If one deferred operation panics, subsequent operations in the queue won't execute. If this is intentional (e.g., panics are unrecoverable anyway), consider documenting it. Otherwise, wrapping each
op()instd::panic::catch_unwindwould ensure all operations attempt execution.💡 Optional: panic-resilient flush
pub fn flush_deferred_drops() { DEFERRED_QUEUE.with(|q| { let ops: Vec<_> = q.borrow_mut().drain(..).collect(); for op in ops { - op(); + let _ = std::panic::catch_unwind(std::panic::AssertUnwindSafe(op)); } }); }crates/vm/src/vm/mod.rs (2)
1177-1204: The hardcoded stdlib module list is brittle and may cause issues.The allowlist approach has maintenance concerns:
- Missing common stdlib modules (e.g.,
re,json,copy,operator,itertools)- New stdlib modules added in future Python versions won't be automatically covered
- The
contains("pylib")andcontains("Lib")path checks may miss stdlib modules in non-standard locationsConsider a more robust approach, such as checking if
__file__starts with the known stdlib paths fromsys.prefixorsys.base_prefix, or marking modules as stdlib during import.♻️ Alternative approach using sys paths
// Get stdlib path from sys module once let stdlib_path = self.sys_module .get_attr("base_prefix", self) .ok() .and_then(|p| p.downcast_ref::<PyStr>().map(|s| s.as_str().to_owned())); // Then in the loop, check if file_path starts with stdlib_path if let Some(ref stdlib) = stdlib_path { if file_path.starts_with(stdlib) { continue; // Skip stdlib module } }
1164-1168: Consider handling the case where__main__has no dict.While
PyModuleshould always have a dict, the silent skip vialet Some(module)could mask issues during shutdown debugging. A debug assertion or trace log might help diagnose shutdown problems.crates/vm/src/vm/thread.rs (1)
42-48: Potential panic from double borrow inensure_pinned.The function first calls
guard.borrow()to check ifNone, then immediately callsguard.borrow_mut(). While this works becauseborrow()returns a temporary that's dropped beforeborrow_mut(), a more idiomatic approach would avoid this pattern.♻️ Suggested refactor to avoid double borrow
pub fn ensure_pinned() { EBR_GUARD.with(|guard| { - if guard.borrow().is_none() { - *guard.borrow_mut() = Some(rustpython_common::epoch::pin()); - } + let mut g = guard.borrow_mut(); + if g.is_none() { + *g = Some(rustpython_common::epoch::pin()); + } }); }crates/stdlib/src/gc.rs (1)
262-264: Consider logging unraisable exceptions from GC callbacks.Callback errors are silently ignored. While CPython also doesn't propagate these errors, it does report them via
sys.unraisablehook. Consider usingvm.run_unraisablefor consistency.♻️ Suggested improvement
for callback in callbacks { - let _ = callback.call((phase_str.clone(), info.clone()), vm); + if let Err(exc) = callback.call((phase_str.clone(), info.clone()), vm) { + vm.run_unraisable(exc, Some("Exception ignored in gc callback".to_owned()), callback.clone()); + } }crates/vm/src/object/core.rs (1)
1135-1140: Aliasing concern documented but not eliminated.The past review flagged the
&selfto*mutcast as potentially violating Rust's aliasing rules. While the safety comment explains that exclusive access is guaranteed during GC, the technical UB concern remains. The current approach works in practice but isn't strictly sound under Rust's memory model.The safer alternative would be to only expose
gc_pop_edges_rawand require callers to obtain the raw pointer through safe means.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (30)
Cargo.lockis excluded by!**/*.lockLib/_opcode_metadata.pyis excluded by!Lib/**Lib/http/__init__.pyis excluded by!Lib/**Lib/http/client.pyis excluded by!Lib/**Lib/http/cookiejar.pyis excluded by!Lib/**Lib/http/server.pyis excluded by!Lib/**Lib/test/support/__init__.pyis excluded by!Lib/**Lib/test/test__opcode.pyis excluded by!Lib/**Lib/test/test_builtin.pyis excluded by!Lib/**Lib/test/test_dict.pyis excluded by!Lib/**Lib/test/test_gc.pyis excluded by!Lib/**Lib/test/test_generators.pyis excluded by!Lib/**Lib/test/test_http_cookiejar.pyis excluded by!Lib/**Lib/test/test_httplib.pyis excluded by!Lib/**Lib/test/test_httpservers.pyis excluded by!Lib/**Lib/test/test_logging.pyis excluded by!Lib/**Lib/test/test_robotparser.pyis excluded by!Lib/**Lib/test/test_ssl.pyis excluded by!Lib/**Lib/test/test_subprocess.pyis excluded by!Lib/**Lib/test/test_urllib.pyis excluded by!Lib/**Lib/test/test_urllib2.pyis excluded by!Lib/**Lib/test/test_urllib2_localnet.pyis excluded by!Lib/**Lib/test/test_urllib2net.pyis excluded by!Lib/**Lib/test/test_urllib_response.pyis excluded by!Lib/**Lib/test/test_weakref.pyis excluded by!Lib/**Lib/test/test_wsgiref.pyis excluded by!Lib/**Lib/urllib/error.pyis excluded by!Lib/**Lib/urllib/parse.pyis excluded by!Lib/**Lib/urllib/request.pyis excluded by!Lib/**Lib/urllib/robotparser.pyis excluded by!Lib/**
📒 Files selected for processing (33)
.cspell.jsoncrates/common/Cargo.tomlcrates/common/src/lib.rscrates/common/src/refcount.rscrates/compiler-core/src/bytecode/instruction.rscrates/derive-impl/src/pyclass.rscrates/derive-impl/src/pystructseq.rscrates/derive-impl/src/util.rscrates/stdlib/src/gc.rscrates/vm/src/builtins/asyncgenerator.rscrates/vm/src/builtins/coroutine.rscrates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/generator.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/str.rscrates/vm/src/builtins/tuple.rscrates/vm/src/coroutine.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rscrates/vm/src/gc_state.rscrates/vm/src/lib.rscrates/vm/src/object/core.rscrates/vm/src/object/traverse.rscrates/vm/src/object/traverse_object.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/context.rscrates/vm/src/vm/interpreter.rscrates/vm/src/vm/mod.rscrates/vm/src/vm/thread.rsextra_tests/snippets/builtins_module.pyscripts/auto_mark_test.py
🚧 Files skipped from review as they are similar to previous changes (11)
- crates/vm/src/builtins/coroutine.rs
- crates/vm/src/builtins/str.rs
- crates/vm/src/object/traverse.rs
- crates/vm/src/builtins/generator.rs
- crates/vm/src/frame.rs
- crates/vm/src/dict_inner.rs
- crates/vm/src/builtins/dict.rs
- crates/vm/src/builtins/asyncgenerator.rs
- crates/vm/src/coroutine.rs
- crates/vm/src/gc_state.rs
- crates/derive-impl/src/pystructseq.rs
🧰 Additional context used
🧬 Code graph analysis (14)
crates/vm/src/lib.rs (1)
crates/vm/src/gc_state.rs (1)
gc_state(802-804)
crates/vm/src/builtins/list.rs (6)
crates/vm/src/builtins/function.rs (2)
traverse(47-58)pop_edges(60-97)crates/vm/src/builtins/str.rs (1)
traverse(274-277)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/builtins/set.rs (1)
traverse(176-179)crates/vm/src/builtins/mappingproxy.rs (1)
traverse(30-35)crates/vm/src/object/traverse.rs (1)
pop_edges(40-40)
crates/derive-impl/src/pyclass.rs (3)
crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)crates/vm/src/object/traverse.rs (3)
pop_edges(40-40)try_traverse(20-20)try_pop_edges(22-22)crates/vm/src/builtins/function.rs (1)
pop_edges(60-97)
crates/vm/src/vm/interpreter.rs (2)
crates/vm/src/stdlib/atexit.rs (1)
_run_exitfuncs(36-47)crates/vm/src/gc_state.rs (1)
gc_state(802-804)
crates/vm/src/stdlib/sys.rs (6)
crates/vm/src/object/core.rs (1)
dict(803-805)crates/vm/src/builtins/dict.rs (1)
dict(796-796)crates/vm/src/builtins/str.rs (2)
key(1412-1412)key(1418-1418)crates/vm/src/vm/mod.rs (1)
key(1175-1175)crates/vm/src/coroutine.rs (1)
frame(215-217)crates/vm/src/stdlib/io.rs (1)
vm(3791-3793)
crates/vm/src/object/traverse_object.rs (1)
crates/vm/src/object/core.rs (4)
debug_obj(87-93)drop_dealloc_obj(84-86)try_pop_edges_obj(128-131)try_trace_obj(96-100)
crates/vm/src/vm/context.rs (3)
crates/vm/src/object/core.rs (2)
new_ref(1361-1378)default(413-415)crates/vm/src/builtins/dict.rs (1)
new_ref(77-79)crates/vm/src/builtins/list.rs (1)
new_ref(91-93)
scripts/auto_mark_test.py (1)
scripts/lib_updater.py (3)
PatchSpec(81-122)UtMethod(59-78)apply_patches(273-283)
crates/vm/src/stdlib/thread.rs (1)
crates/vm/src/vm/thread.rs (2)
ensure_pinned(42-48)drop_guard(67-71)
extra_tests/snippets/builtins_module.py (1)
crates/vm/src/stdlib/builtins.rs (1)
isinstance(527-529)
crates/stdlib/src/gc.rs (1)
crates/vm/src/gc_state.rs (18)
gc_state(802-804)enable(167-169)disable(172-174)collect(374-376)get_threshold(187-193)set_threshold(64-66)set_threshold(196-204)get_count(207-213)get_debug(177-179)set_debug(182-184)get_stats(216-222)stats(68-75)get_objects(303-344)new(44-54)new(136-159)freeze(754-773)unfreeze(776-792)get_freeze_count(749-751)
crates/vm/src/object/core.rs (2)
crates/vm/src/object/traverse_object.rs (1)
try_pop_edges_obj(39-39)crates/vm/src/gc_state.rs (1)
gc_state(802-804)
crates/vm/src/builtins/function.rs (4)
crates/vm/src/builtins/type.rs (2)
module(1780-1780)name(592-597)crates/vm/src/frame.rs (4)
name(1320-1320)name(2695-2695)traverse(101-103)traverse(107-118)crates/vm/src/builtins/generator.rs (1)
traverse(24-26)crates/vm/src/builtins/tuple.rs (2)
traverse(43-45)pop_edges(47-51)
crates/compiler-core/src/bytecode/instruction.rs (3)
crates/codegen/src/ir.rs (1)
from(55-57)crates/compiler-core/src/bytecode/oparg.rs (2)
from(25-27)from(68-70)crates/compiler-core/src/bytecode.rs (2)
from(345-347)from(351-353)
🪛 Biome (2.1.2)
.cspell.json
[error] 69-69: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 69-69: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 121-121: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 121-121: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 122-122: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 122-122: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 123-123: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 123-123: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 124-124: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 124-124: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 125-125: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 125-125: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 126-126: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 126-126: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 127-127: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 127-127: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Ensure compilation on various targets
🔇 Additional comments (52)
crates/vm/src/stdlib/sys.rs (1)
829-834: LGTM!The change from consuming the frame (
.into()) to cloning its object reference (.as_object().to_owned()) is appropriate. This preserves the originalFrameReffor GC traversal while still providing an owned value to the dictionary, aligning with the broader GC/ownership improvements in this PR.extra_tests/snippets/builtins_module.py (1)
25-35: LGTM!The test correctly validates that
function.__builtins__is a dict as per Python's datamodel. The assertion includes a helpful diagnostic message showing the actual type on failure.scripts/auto_mark_test.py (5)
26-36: LGTM!Imports are appropriate for the new AST-based functionality.
119-132: LGTM!The parsing logic for unexpected successes follows the same pattern as the existing fail/error parsing and correctly extracts the test name and path.
142-160: LGTM!The AST inspection correctly identifies methods that are pure super-delegation wrappers.
163-228: Solid implementation for removing expected failures.The hybrid AST + line-based approach works well for this use case. One minor consideration: the string check
"expectedFailure" not in line_contenton line 208 could theoretically match false positives in comments, but since we're only examining decorator lines, this is acceptable.
304-338: LGTM!The logic correctly handles both adding
expectedFailuredecorators for newly failing tests and removing them from tests that now pass. The order of operations (remove first, then apply patches) is correct.crates/derive-impl/src/util.rs (1)
366-376: LGTM!The addition of
"pop_edges"toClassItemMeta::ALLOWED_NAMEScorrectly enables the newpop_edgesattribute for#[pyclass]macro, aligning with the traversal/edge-popping capabilities introduced in the VM.crates/derive-impl/src/pyclass.rs (3)
590-608: LGTM with note on intentional error handling.The
.ok().flatten()pattern on line 593 intentionally converts errors toNoneto handle the case wheretraverseis specified without a value (e.g.,#[pyclass(traverse)]vs#[pyclass(traverse = "manual")]). The inline comment documents this behavior well.Note: The
assert_eq!(s, "manual")on line 601 is redundant sinceALLOWED_TRAVERSE_OPTScurrently only contains"manual", but it serves as a safety check if more options are added later.
632-646: Well-structuredMaybeTraverseimplementation generation.The generated impl correctly:
- Sets
IS_TRACEandHAS_POP_EDGESconstants based on attribute presence- Conditionally generates method bodies that either delegate to
Traverseor are no-ops- Uses fully qualified paths for type references
This aligns well with the trait definition in
traverse.rs.
577-588: Clear documentation of the traverse/pop_edges behavior.The comments clearly explain the three states for
traverse(none, manual, auto-derive) and two states forpop_edges(none, present). This makes the macro behavior easy to understand for future maintainers.crates/common/src/refcount.rs (8)
1-17: LGTM - Re-exports and module documentation are clear.The EBR type re-exports and
csalias forpin()provide good API compatibility while integrating with crossbeam-epoch.
18-39: LGTM - Bit layout constants are correctly defined.The 64-bit state layout is sound:
- Bits 0-30: strong count (31 bits)
- Bits 31-60: weak count (30 bits)
- Bit 61: LEAKED
- Bit 62: WEAKED
- Bit 63: DESTRUCTED
The previous bit overlap issue has been properly addressed.
41-87: LGTM - State struct implementation is correct.The bit manipulation methods correctly handle the state layout. The division by
COUNTinstrong()is a no-op (sinceCOUNT = 1) but maintains consistency ifCOUNTchanges in the future.
133-166: LGTM -safe_inc()anddec()implementations are correct.The CAS loop in
safe_inc()properly handles concurrent access, anddec()correctly returnstruewhen the count drops to zero while respecting leaked objects.
168-190: LGTM -leak()andis_leaked()are correctly implemented.The CAS loop properly handles concurrent flag setting. The
Acquireordering inis_leaked()is sufficient for reading the flag.
192-224: LGTM - Panic-safe deferred drop guard.The
DeferredDropGuardRAII pattern correctly restores state and flushes deferred drops on both normal exit and panic unwinding, addressing the previous panic-safety concern.
226-261: LGTM - Deferred drop context functions are well-designed.The
with_deferred_dropsproperly uses RAII for panic safety, andtry_defer_dropcorrectly handles the thread-local context check. The lack ofSendbound is intentional and documented.
277-295: LGTM -defer_destructioncorrectly wraps EBR deferral.The safety requirements are well-documented, and the function properly delegates to
guard.defer_unchecked()with appropriate bounds (Send + 'static)..cspell.json (1)
69-69: LGTM - Spell-check additions align with GC terminology.The added terms ("finalizers", "uncollectable", "weaked") correspond to GC concepts introduced in this PR. Note that "weaked" is non-standard terminology - consider documenting its meaning in the code where it's used if not already done.
The Biome static analysis errors are false positives - this file uses JSON5 format which supports trailing commas and comments.
Also applies to: 121-121, 127-127
crates/vm/src/vm/context.rs (2)
54-57: LGTM - GC module state fields properly declared.The
gc_callbacksandgc_garbagelists mirror CPython'sgcmodule design, providing storage for callback functions and uncollectable garbage respectively.
335-338: LGTM - GC lists properly initialized during genesis.The initialization using
PyRef::new_refwithPyList::default()is consistent with other singleton objects created ininit_genesis.crates/vm/src/lib.rs (1)
80-80: LGTM - Public GC state module export.This properly exposes the GC state API for use by the stdlib
gcmodule and other VM components.crates/vm/src/vm/mod.rs (2)
553-562: LGTM - Proper exception suppression during finalization.The early return when
finalizingis true correctly mirrors CPython's behavior of silently ignoring__del__errors and daemon thread exceptions during shutdown. UsingAcquireordering ensures visibility of the flag set withReleaseininterpreter.rs.
673-677: LGTM - EBR guard reactivation at safe points.Reactivating the guard at frame boundaries allows epoch-based reclamation to advance, preventing memory from being held indefinitely during long-running operations.
crates/vm/src/vm/interpreter.rs (1)
143-152: LGTM - GC passes around module cleanup follow a sound pattern.The two-pass approach is appropriate:
- First pass collects cycles before module cleanup (generation 0-2, full collection)
finalize_modulesbreaks references in module namespaces- Second pass collects the newly-breakable cyclic garbage (generation 0-2, full collection)
This mirrors CPython's finalization strategy of clearing modules to break cycles. The
collect_force(2)parameter correctly targets generation 2 (the oldest), collecting across all three generations (0-2) for a complete cycle collection each time.crates/vm/src/vm/thread.rs (1)
25-32: Well-structured EBR pinning lifecycle.The three-function API (
ensure_pinned,reactivate_guard,drop_guard) provides a clean interface for managing EBR guards. The documentation clearly explains the coarse-grained pinning strategy and when each function should be called.Also applies to: 36-48, 50-71
crates/vm/src/stdlib/thread.rs (3)
429-459: Correct EBR pinning lifecycle in thread execution.The thread properly enters the EBR critical section at the start with
ensure_pinned()and exits it withdrop_guard()at the end. This ensures GC won't free objects while the thread might access them. The cleanup order (sentinels → thread-local data → frame tracking → thread count → EBR guard) is appropriate.
676-682: Correct SystemExit subclass handling.The updated check properly handles SystemExit subclasses, matching CPython's behavior where threading's
_excepthooksilently ignoresSystemExitand all its subclasses. Theis_some_andwithfast_issubclassis the right approach.
525-530: Good use ofis_some_andfor cleaner code.The change from
map_or(false, ...)tois_some_and(...)is more idiomatic and clearer in intent.crates/stdlib/src/gc.rs (2)
50-86: Solidcollectimplementation with proper callback lifecycle.The function correctly:
- Validates generation parameter
- Invokes callbacks with "start" phase before collection
- Uses
collect_forceto run even when GC is disabled (matching CPython's manual gc.collect behavior)- Transfers garbage objects to vm.ctx.gc_garbage for DEBUG_SAVEALL
- Invokes callbacks with "stop" phase after collection
164-174: Clean generation validation inget_objects.The validation correctly handles the optional generation parameter and provides a clear error message matching CPython's format.
crates/vm/src/object/core.rs (5)
990-1023: Correct cycle-breaking order indrop_slow.The implementation correctly:
- Untracks the object from GC before deallocation
- Extracts child references via
pop_edgesbefore deallocation- Deallocates the object
- Drops child references after deallocation
This order ensures circular references are broken because children are dropped after the parent is freed, preventing use-after-free in cycles.
264-308: Well-designedclear_for_gc_collect_callbacksfor GC semantics.This method correctly separates weakref clearing from callback invocation, allowing GC to:
- Clear ALL weakrefs in unreachable objects first
- Collect all callbacks
- Invoke callbacks later (after all weakrefs are cleared)
This matches CPython's GC behavior where weakref callbacks are invoked in a second pass.
1361-1378: Correct GC tracking at object creation.Objects are tracked if they have
IS_TRACE(can contain references) or have an instance dict (user-defined class instances). The tracking condition here matches the untracking condition indrop_slow, ensuring consistency. Callingmaybe_collect()after tracking allows automatic GC to run when thresholds are exceeded.
1055-1072: Correct finalizer invocation for GC.
try_call_finalizerproperly:
- Sets the FINALIZED flag before calling
__del__to prevent double-invocation- Reports errors via
run_unraisablerather than propagating them- Does not handle resurrection directly (that's the caller's responsibility)
This design allows GC to call finalizers in a controlled manner during collection.
920-944: Resurrection detection logic is correct.The
inc()method does perform a double increment from 0→2 (verified: it callsfetch_add(COUNT)once, checks if the old value was 0, then callsfetch_add(COUNT)again). The resurrection detection properly handles all scenarios:
- No resurrection:
after_delequalsafter_inc(=2) → both decrements execute → count reaches 0- Persistent resurrection:
after_del>after_inc→ first decrement executes, check detects it, second decrement skipped → count stays elevated- Temporary reference in
__del__:__del__creates then drops a reference (3→2) →after_delstill equals 2 → correctly treated as non-resurrectionThe comparison at line 937 is sound because it checks
after_del > after_incbefore the second decrement, correctly distinguishing between references that persist versus those temporarily created during__del__execution.crates/vm/src/object/traverse_object.rs (1)
14-45: Clean vtable extension for edge popping.The
pop_edgesfield addition follows the same pattern astrace:
- Optional function pointer for types that support edge extraction
- Conditional initialization via const block based on
HAS_POP_EDGES- Clear documentation explaining its role in circular reference resolution
The implementation is consistent with the existing vtable design.
crates/vm/src/builtins/list.rs (3)
6-6: No comment.
27-33: Manual traversal + pop_edges wiring looks good.
60-72: Traverse + pop_edges implementation looks solid.crates/vm/src/builtins/tuple.rs (3)
6-6: No comment.
28-28: Manual traversal + pop_edges annotation looks good.
40-51: Traverse + pop_edges implementation is clear and consistent.crates/vm/src/builtins/function.rs (6)
4-7: No comment.
28-28: Manual traversal + pop_edges annotation looks good.
46-97: GC traversal + edge popping coverage looks comprehensive.
108-122: Builtins normalization to dict is a nice robustness improvement.
724-737: New args shape (argdefs/closure/kwdefaults) is clear and consistent.
744-781: Constructor updates for closure/argdefs/kwdefaults look correct.crates/compiler-core/src/bytecode/instruction.rs (2)
248-376: Opcode expansion reads cleanly and keeps IDs stable.Explicit discriminants and the end-of-custom marker make the extended table straightforward to reason about.
401-406: Validation ranges are updated cleanly.Deriving specialized/instrumented ranges from enum values keeps
TryFrom<u8>aligned with the opcode table.Also applies to: 447-448
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| /// Increment strong count | ||
| #[inline] | ||
| pub fn inc(&self) { | ||
| let old_size = self.strong.fetch_add(1, Relaxed); | ||
|
|
||
| if old_size & Self::MASK == Self::MASK { | ||
| let val = State::from_raw(self.state.fetch_add(COUNT, Ordering::SeqCst)); | ||
| if val.destructed() { | ||
| // Already marked for destruction, but we're incrementing | ||
| // This shouldn't happen in normal usage | ||
| std::process::abort(); | ||
| } | ||
| if val.strong() == 0 { | ||
| // The previous fetch_add created a permission to run decrement again | ||
| self.state.fetch_add(COUNT, Ordering::SeqCst); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
wc -l crates/common/src/refcount.rsRepository: youknowone/RustPython
Length of output: 98
🏁 Script executed:
cat -n crates/common/src/refcount.rsRepository: youknowone/RustPython
Length of output: 11057
🏁 Script executed:
rg -n "\.inc\(\)" --type rust -B3 -A3Repository: youknowone/RustPython
Length of output: 2298
🏁 Script executed:
rg -n "fn dec\|fn decrement" --type rust -B2 -A10Repository: youknowone/RustPython
Length of output: 47
🏁 Script executed:
rg -n "COUNT\s*=" crates/common/src/refcount.rsRepository: youknowone/RustPython
Length of output: 47
Improve the inline comment in inc() to clarify the 0→2 resurrection behavior.
The logic on lines 127-130 is intentional but underdocumented: when strong count is 0 before the first fetch_add, a second fetch_add increments the count to 2. While the calling code in crates/vm/src/object/core.rs (lines 920-921) explains this as "inc() from 0 does a double increment (0→2) for thread safety," the inline comment on line 128 ("created a permission to run decrement again") is too cryptic.
Replace the inline comment with an explicit explanation of the 0→2 resurrection pattern and its thread-safety rationale to match the clarity in the calling code.
🤖 Prompt for AI Agents
In `@crates/common/src/refcount.rs` around lines 118 - 131, Replace the cryptic
comment inside inc() (around the second self.state.fetch_add(COUNT,
Ordering::SeqCst)) with an explicit explanation of the 0→2 resurrection pattern:
explain that when the previous value's strong() == 0 the first fetch_add created
a transient permission for a concurrent decrement path, so inc() must perform a
second fetch_add to raise the strong count from 0 to 2 (not 1) to avoid races
with concurrent drop/decrement; reference the involved symbols (inc,
State::from_raw, self.state.fetch_add, COUNT, destructed, strong) and briefly
state this is done for thread-safety so a concurrent decrement cannot observe an
intermediate 1 and prematurely destroy the object.
| Self::BinaryOpAddFloat => 0, | ||
| Self::BinaryOpAddInt => 0, | ||
| Self::BinaryOpAddUnicode => 0, | ||
| Self::BinaryOpMultiplyFloat => 0, | ||
| Self::BinaryOpMultiplyInt => 0, | ||
| Self::BinaryOpSubtractFloat => 0, | ||
| Self::BinaryOpSubtractInt => 0, | ||
| Self::BinarySubscrDict => 0, | ||
| Self::BinarySubscrGetitem => 0, | ||
| Self::BinarySubscrListInt => 0, | ||
| Self::BinarySubscrStrInt => 0, | ||
| Self::BinarySubscrTupleInt => 0, | ||
| Self::CallAllocAndEnterInit => 0, | ||
| Self::CallBoundMethodExactArgs => 0, | ||
| Self::CallBoundMethodGeneral => 0, | ||
| Self::CallBuiltinClass => 0, | ||
| Self::CallBuiltinFast => 0, | ||
| Self::CallBuiltinFastWithKeywords => 0, | ||
| Self::CallBuiltinO => 0, | ||
| Self::CallIsinstance => 0, | ||
| Self::CallLen => 0, | ||
| Self::CallListAppend => 0, | ||
| Self::CallMethodDescriptorFast => 0, | ||
| Self::CallMethodDescriptorFastWithKeywords => 0, | ||
| Self::CallMethodDescriptorNoargs => 0, | ||
| Self::CallMethodDescriptorO => 0, | ||
| Self::CallNonPyGeneral => 0, | ||
| Self::CallPyExactArgs => 0, | ||
| Self::CallPyGeneral => 0, | ||
| Self::CallStr1 => 0, | ||
| Self::CallTuple1 => 0, | ||
| Self::CallType1 => 0, | ||
| Self::CompareOpFloat => 0, | ||
| Self::CompareOpInt => 0, | ||
| Self::CompareOpStr => 0, | ||
| Self::ContainsOpDict => 0, | ||
| Self::ContainsOpSet => 0, | ||
| Self::ForIterGen => 0, | ||
| Self::ForIterList => 0, | ||
| Self::ForIterRange => 0, | ||
| Self::ForIterTuple => 0, | ||
| Self::LoadAttrClass => 0, | ||
| Self::LoadAttrGetattributeOverridden => 0, | ||
| Self::LoadAttrInstanceValue => 0, | ||
| Self::LoadAttrMethodLazyDict => 0, | ||
| Self::LoadAttrMethodNoDict => 0, | ||
| Self::LoadAttrMethodWithValues => 0, | ||
| Self::LoadAttrModule => 0, | ||
| Self::LoadAttrNondescriptorNoDict => 0, | ||
| Self::LoadAttrNondescriptorWithValues => 0, | ||
| Self::LoadAttrProperty => 0, | ||
| Self::LoadAttrSlot => 0, | ||
| Self::LoadAttrWithHint => 0, | ||
| Self::LoadGlobalBuiltin => 0, | ||
| Self::LoadGlobalModule => 0, | ||
| Self::LoadSuperAttrAttr => 0, | ||
| Self::LoadSuperAttrMethod => 0, | ||
| Self::ResumeCheck => 0, | ||
| Self::SendGen => 0, | ||
| Self::StoreAttrInstanceValue => 0, | ||
| Self::StoreAttrSlot => 0, | ||
| Self::StoreAttrWithHint => 0, | ||
| Self::StoreSubscrDict => 0, | ||
| Self::StoreSubscrListInt => 0, | ||
| Self::ToBoolAlwaysTrue => 0, | ||
| Self::ToBoolBool => 0, | ||
| Self::ToBoolInt => 0, | ||
| Self::ToBoolList => 0, | ||
| Self::ToBoolNone => 0, | ||
| Self::ToBoolStr => 0, | ||
| Self::UnpackSequenceList => 0, | ||
| Self::UnpackSequenceTuple => 0, | ||
| Self::UnpackSequenceTwoTuple => 0, | ||
| Self::InstrumentedResume => 0, | ||
| Self::InstrumentedEndFor => 0, | ||
| Self::InstrumentedEndSend => 0, | ||
| Self::InstrumentedReturnValue => 0, | ||
| Self::InstrumentedReturnConst => 0, | ||
| Self::InstrumentedYieldValue => 0, | ||
| Self::InstrumentedLoadSuperAttr => 0, | ||
| Self::InstrumentedForIter => 0, | ||
| Self::InstrumentedCall => 0, | ||
| Self::InstrumentedCallKw => 0, | ||
| Self::InstrumentedCallFunctionEx => 0, | ||
| Self::InstrumentedInstruction => 0, | ||
| Self::InstrumentedJumpForward => 0, | ||
| Self::InstrumentedJumpBackward => 0, | ||
| Self::InstrumentedPopJumpIfTrue => 0, | ||
| Self::InstrumentedPopJumpIfFalse => 0, | ||
| Self::InstrumentedPopJumpIfNone => 0, | ||
| Self::InstrumentedPopJumpIfNotNone => 0, | ||
| Self::InstrumentedLine => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file and understand its structure
head -750 crates/compiler-core/src/bytecode/instruction.rs | tail -100Repository: youknowone/RustPython
Length of output: 4483
🏁 Script executed:
# Get more context around the lines mentioned (693-784)
sed -n '680,800p' crates/compiler-core/src/bytecode/instruction.rsRepository: youknowone/RustPython
Length of output: 4975
🏁 Script executed:
# Find where TryFrom<u8> is implemented for this type
rg "impl.*TryFrom.*u8" crates/compiler-core/src/bytecode/instruction.rs -A 10Repository: youknowone/RustPython
Length of output: 762
🏁 Script executed:
# Find the stack_effect function mentioned in the concern
rg "fn stack_effect|stack_effect\(" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 5Repository: youknowone/RustPython
Length of output: 1250
🏁 Script executed:
# Look for the generic BinaryOp (not specialized versions)
rg "Self::BinaryOp[^A-Z]" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2Repository: youknowone/RustPython
Length of output: 570
🏁 Script executed:
# Check the enum definition to see if these are marked as specialized/pseudo
rg "enum Instruction" crates/compiler-core/src/bytecode/instruction.rs -A 200 | head -300Repository: youknowone/RustPython
Length of output: 4909
🏁 Script executed:
# Look for documentation or comments about pseudo instructions
rg "Pseudo instruction|pseudo opcode|specialized opcode" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2Repository: youknowone/RustPython
Length of output: 268
🏁 Script executed:
# Search for where TryFrom<u8> is actually used - does it convert these variants?
rg "Instruction::try_from|TryFrom.*Instruction" crates/compiler-core/src/ --type rust -B 3 -A 3Repository: youknowone/RustPython
Length of output: 2456
🏁 Script executed:
# Check if there are any assertions about which variants can appear in converted instructions
rg "assert|panic|unreachable" crates/compiler-core/src/bytecode/instruction.rs -B 2 -A 2 | head -80Repository: youknowone/RustPython
Length of output: 1043
🏁 Script executed:
# Look for the TryFrom implementation for Instruction to see which opcodes it actually creates
sed -n '1,300p' crates/compiler-core/src/bytecode/instruction.rs | grep -A 100 "impl TryFrom<u8> for Instruction"Repository: youknowone/RustPython
Length of output: 47
🏁 Script executed:
# Find the full TryFrom<u8> implementation for Instruction
rg "impl TryFrom<u8> for Instruction" crates/compiler-core/src/bytecode/instruction.rs -A 150Repository: youknowone/RustPython
Length of output: 5610
🏁 Script executed:
# Check if these placeholder variants are ever constructed in the codebase
rg "BinaryOpAddFloat|CallAllocAndEnterInit|ForIterGen|UnpackSequenceList" crates/compiler-core/src/ --type rustRepository: youknowone/RustPython
Length of output: 988
🏁 Script executed:
# Search for where these are created - are they only from bytecode parsing?
rg "InstrumentedResume|InstrumentedCall|InstrumentedForIter" crates/compiler-core/src/ --type rust -B 2 -A 2Repository: youknowone/RustPython
Length of output: 3213
🏁 Script executed:
# Look for PseudoInstruction enum to see if these variants are actually there
rg "enum PseudoInstruction" crates/compiler-core/src/bytecode/instruction.rs -A 100Repository: youknowone/RustPython
Length of output: 3042
🏁 Script executed:
# Check the generic BinaryOp, Call, ForIter stack effects
rg "Self::BinaryOp \{|Self::Call \{|Self::ForIter \{" crates/compiler-core/src/bytecode/instruction.rs | grep "=>" | head -5Repository: youknowone/RustPython
Length of output: 355
🏁 Script executed:
# Get more details on Call and ForIter effects
sed -n '500,650p' crates/compiler-core/src/bytecode/instruction.rs | grep -A 1 "Self::Call\|Self::ForIter"Repository: youknowone/RustPython
Length of output: 830
Specialized and instrumented opcodes have incorrect zero stack effects, breaking stack-size verification.
These variants (BinaryOp*, Call*, ForIter*, UnpackSequence*, Instrumented*) are now explicitly accepted by TryFrom<u8> via unsafe transmute in the ranges (specialized_start..=specialized_end) and (instrumented_start..=instrumented_end). When present in bytecode, they will be successfully converted to Instruction enum variants, but all return 0 for stack_effect, which mismatches their generic equivalents:
BinaryOp { .. } => -1vs.BinaryOpAddFloatetc.=> 0Call { nargs } => -(nargs.get(arg) as i32) - 2 + 1vs. allCall*variants=> 0ForIter { .. }(conditional logic) vs.ForIterGenetc.=> 0UnpackSequence*variants all=> 0
This will cause stack-size calculations or verification to undercount pops/pushes. Either normalize these opcodes to their generic equivalents before stack_effect is queried, provide correct stack effect values, or document/assert that these opcodes are guaranteed never to appear in bytecode.
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 693 - 784, The
specialized (BinaryOpAddFloat, CallBuiltinFast, ForIterGen, UnpackSequenceList,
InstrumentedCall, etc.) and instrumented variants are being created via unsafe
transmute in the TryFrom<u8> ranges (specialized_start..=specialized_end and
instrumented_start..=instrumented_end) but their matches in the
Instruction::stack_effect method all return 0, which miscounts stack usage; fix
by mapping those specialized/instrumented variants to the equivalent generic
variants (e.g., BinaryOpAddFloat -> BinaryOp,
CallBuiltinFast/CallPyExactArgs/etc. -> Call with correct nargs handling,
ForIterGen -> ForIter with same conditional logic, UnpackSequenceList ->
UnpackSequence, Instrumented* -> corresponding base instruction) before
computing stack effect or implement correct stack_effect arms that compute the
same values as their generic counterparts, ensuring code paths reference the
Instruction enum variants (BinaryOp, Call, ForIter, UnpackSequence,
Instrumented* mapping) and preserve existing semantics used by stack-size
verification.
| // Suppress unraisable exceptions from daemon threads and __del__ | ||
| // methods during shutdown. | ||
| vm.state.finalizing.store(true, Ordering::Release); | ||
|
|
||
| atexit::_run_exitfuncs(vm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting finalizing before atexit may suppress important errors.
The finalizing flag is set before _run_exitfuncs, which means errors in atexit handlers will be silently suppressed (since run_unraisable returns early when finalizing is true). In CPython, atexit functions run before Py_Finalizing() becomes true.
Consider moving vm.state.finalizing.store(true, ...) to after _run_exitfuncs:
🔧 Suggested fix
// Wait for non-daemon threads (wait_for_thread_shutdown)
wait_for_thread_shutdown(vm);
+ atexit::_run_exitfuncs(vm);
+
// Suppress unraisable exceptions from daemon threads and __del__
// methods during shutdown.
vm.state.finalizing.store(true, Ordering::Release);
-
- atexit::_run_exitfuncs(vm);🤖 Prompt for AI Agents
In `@crates/vm/src/vm/interpreter.rs` around lines 137 - 141, The finalizing flag
is set before running atexit handlers which causes run_unraisable to
short-circuit and suppress errors from atexit; update the shutdown sequence in
interpreter.rs so that atexit::_run_exitfuncs(vm) is called first and only after
it completes set vm.state.finalizing.store(true, Ordering::Release) (ensure any
logic that relies on finalizing remains correct after the move); reference
vm.state.finalizing and atexit::_run_exitfuncs when making the change.
Summary by CodeRabbit
New Features
Bug Fixes
Performance
✏️ Tip: You can customize this high-level summary in your review settings.