feat(rustdoc-json): Add optional support for rkyv (de)serialization #153283
LukeMathWalker wants to merge 6 commits into rust-lang:main
rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing
6cc4c8b to e357584

    #[cfg(feature = "rkyv_0_8")]
    mod rkyv {
        use std::fmt::Debug;

These tests don't run. When I applied

    diff --git a/src/rustdoc-json-types/tests.rs b/src/rustdoc-json-types/tests.rs
    index e878350e43b..258c22304c3 100644
    --- a/src/rustdoc-json-types/tests.rs
    +++ b/src/rustdoc-json-types/tests.rs
    @@ -41,6 +41,11 @@ fn test_union_info_roundtrip() {
     #[cfg(feature = "rkyv_0_8")]
     mod rkyv {
    +    #[test]
    +    fn definenly_fails() {
    +        panic!("at least the rkyv tests were ran");
    +    }
    +
         use std::fmt::Debug;
         use rkyv::Archive;

running `./x test ./src/rustdoc-json-types/` still passed.
The fix (I think) is to enable this feature in bootstrap:

    diff --git a/src/bootstrap/src/core/build_steps/test.rs b/src/bootstrap/src/core/build_steps/test.rs
    index 88f10775333..ab1d2b8a24b 100644
    --- a/src/bootstrap/src/core/build_steps/test.rs
    +++ b/src/bootstrap/src/core/build_steps/test.rs
    @@ -3302,7 +3302,7 @@ fn run(self, builder: &Builder<'_>) {
                 builder.kind,
                 "src/rustdoc-json-types",
                 SourceType::InTree,
    -            &[],
    +            &["rkyv_0_8".to_owned()],
             );
             // FIXME: this looks very wrong, libtest doesn't accept `-C` arguments and the quotes are fishy.

(CC @jieyouxu, is this ok to do?)
Apologies, I had only tested the crate directly via local cargo test, under the implicit assumption that the testing infrastructure would automatically pick up feature flags for matrix testing.
I've added the feature flag to the bootstrap script, let me know if other changes are needed.
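To illustrate why the gated tests went unnoticed, here is a minimal, self-contained sketch (not code from this PR). The function and module names are hypothetical; only the `rkyv_0_8` feature name comes from the thread. When the feature is off, the whole `#[cfg(...)]` module is dropped before compilation, so a test runner never sees the test inside it and the run passes.

```rust
/// Reports whether this crate was compiled with the `rkyv_0_8` feature.
/// (Hypothetical helper, used only for this illustration.)
fn rkyv_feature_enabled() -> bool {
    cfg!(feature = "rkyv_0_8")
}

// This module is removed entirely unless the feature is enabled,
// so the deliberately failing test below is not even compiled by default.
#[cfg(feature = "rkyv_0_8")]
mod rkyv_gated {
    #[test]
    fn only_runs_with_the_feature() {
        panic!("at least the rkyv tests were run");
    }
}

fn main() {
    println!("rkyv_0_8 enabled: {}", rkyv_feature_enabled());
}
```

In a Cargo crate that declares such a feature, `cargo test` would skip the module, while `cargo test --features rkyv_0_8` would compile it and hit the panic. That is the same effect the bootstrap change above aims for.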

    #[cfg_attr(feature = "rkyv_0_8", derive(rkyv::Archive, rkyv::Serialize, rkyv::Deserialize))]
    #[cfg_attr(feature = "rkyv_0_8", rkyv(derive(Debug)))]
    #[cfg_attr(feature = "rkyv_0_8", rkyv(serialize_bounds(
        __S: rkyv::ser::Writer + rkyv::ser::Allocator,

When are these needed? I'm a bit hesitant about the maintenance requirement, given that it seems a bit more onerous than just adding the same derives everywhere. Could you maybe add a comment (at the top of the file) explaining which rkyv bounds are needed where (or link to their docs on this)?
I've added a CONTRIBUTING.md file which explains step-by-step why we need annotations and why those specific bounds are needed (c322fa5).
In the process of spelling it out, I realized we could simplify the deserialization bounds a bit (no Pooling, see 528307a).
Does it work in a separate file or do you prefer it as a comment in lib.rs?
Comment in lib.rs makes more sense. In my mind CONTRIBUTING.md is about how to get started and social norms, not details about how things work.
Also: could you link to the rkyv docs/json example?

    /// to parse them, or otherwise depend on any implementation details.
    #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash, Serialize, Deserialize)]
    #[cfg_attr(feature = "rkyv_0_8", derive(rkyv::Archive, rkyv::Serialize, rkyv::Deserialize))]
    #[cfg_attr(feature = "rkyv_0_8", rkyv(derive(Debug, PartialEq, Eq, PartialOrd, Ord, Hash)))]

Why does this have more `rkyv(derive(...))` traits than the other types?
It allows the generated ArchivedId type to be used as a key in a HashMap, which is rather useful when working with the archived version of the JSON document.
Generally speaking, I've been rather minimal in the set of traits derived for the generated Archived* types, but if you prefer I can match the existing set of traits for the un-archived type.
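As a rough illustration of the `HashMap` point, here is a minimal sketch using a hand-written stand-in type. `ArchivedIdStandIn` and `build_index` are hypothetical names, not the type rkyv generates. The standard library only accepts a key type that implements `Eq` and `Hash`, which is why those derives are requested for the archived type.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the generated `ArchivedId`; NOT the real rkyv output.
// `HashMap` keys must implement `Eq` + `Hash`, hence the derives.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ArchivedIdStandIn(u32);

/// Builds a small index from ids to item descriptions (made-up sample data).
fn build_index() -> HashMap<ArchivedIdStandIn, &'static str> {
    let mut index = HashMap::new();
    index.insert(ArchivedIdStandIn(1), "struct Foo");
    index.insert(ArchivedIdStandIn(2), "fn bar");
    index
}

fn main() {
    let index = build_index();
    // Lookups hash the key and compare it for equality.
    println!("{:?}", index.get(&ArchivedIdStandIn(2)));
}
```

Removing `Hash` or `Eq` from the derive list makes the `insert` and `get` calls fail to compile, which mirrors what would happen with an archived id type that lacks those traits.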

    //! We expose a `rustc-hash` feature that is disabled by default. This feature switches the
    //! [`std::collections::HashMap`] for [`rustc_hash::FxHashMap`] to improve the performance of said
    //! `HashMap` in specific situations.
    //!

NIT: Could you add a doc-comment about the rkyv_0_8 feature?
@rustbot author
Reminder, once the PR becomes ready for a review, use
@rustbot ready
Motivation
The JSON documents produced by `rustdoc-json` are big. More often than not, tools need to access a small fraction of that output, e.g. a couple of types from a transitive dependency, or a subset of the fields on a given `rustdoc-json-types` type.
Using a binary (de)serialization format and a cache helps to drive down the performance cost of deserialization: you invoke `rustdoc-json` to get the JSON output you need, re-serialize it using a more performant format as target (e.g. `bincode` or `postcard`) and thus amortize the cost of future queries that hit the persistent cache rather than `rustdoc-json`.
This is better, but still not great: the deserialization cost for crates like `std` still shows up prominently in flamegraphs.
An Alternative Approach: rkyv
`rkyv` provides a different opportunity: you avoid paying the deserialization cost upfront thanks to zero-copy deserialization.
You're often able to determine if you need a certain entry from the JSON document using the archived version of that type, thus incurring the full deserialization cost only for the subset of items you actually need (example).
The Change
This PR adds support for `rkyv` behind a feature flag (`rkyv_0_8`).
For most types, it's a straightforward `derive(rkyv::Archive, rkyv::Serialize, rkyv::Deserialize)` annotation. For co-recursive types, we need to adjust the generated bounds, using the techniques from `rkyv`'s JSON example.
I have added new round-trip tests to ensure `rkyv` works as expected.
r? @aDotInTheVoid