
Some more tweaks for rehandshake queue #2576

Merged
camshaft merged 2 commits into aws:main from Mark-Simulacrum:fix-rehandshake-nits
Mar 28, 2025

Conversation

Collaborator

@Mark-Simulacrum Mark-Simulacrum commented Mar 28, 2025

Release Summary:

  • refactor(s2n-quic-dc): shrink rehandshake queue memory footprint
  • fix(s2n-quic-dc): avoid new handshake queue order on every refill

Resolved issues:

n/a

Description of changes:

  • This cuts ~6.67MB from our reserved capacity. It's still larger than it could be if we stored IPv4 addresses packed together, but I think the new overhead is small enough to remove the FIXME; we're only ~3x worse than optimal for IPv4-only workloads now. On Linux, most of the backing memory doesn't get allocated anyway.
  • Handshake queue ordering is now based on a hash, not full randomness. This ensures the same peer hashes into the same slot over time, while nicely absorbing changes in the peer set. The hasher is randomly seeded.
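As a rough illustration of the first bullet (a hypothetical sketch, not the actual s2n-quic-dc code; `to_v6` is a made-up helper name), storing every peer as a `SocketAddrV6` and mapping IPv4 peers to IPv4-mapped IPv6 addresses avoids paying for the `SocketAddr` enum in every queue slot:

```rust
use std::mem::size_of;
use std::net::{SocketAddr, SocketAddrV6};

// Hypothetical helper: normalize any peer address to a SocketAddrV6,
// representing IPv4 peers as IPv4-mapped IPv6 addresses (::ffff:a.b.c.d).
fn to_v6(addr: SocketAddr) -> SocketAddrV6 {
    match addr {
        SocketAddr::V6(v6) => v6,
        SocketAddr::V4(v4) => {
            SocketAddrV6::new(v4.ip().to_ipv6_mapped(), v4.port(), 0, 0)
        }
    }
}

fn main() {
    // A SocketAddrV6 is never larger than the SocketAddr enum wrapping it.
    assert!(size_of::<SocketAddrV6>() <= size_of::<SocketAddr>());
}
```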

Call-outs:

n/a

Testing:

Just CI for the SocketAddr changes; the handshake queue reordering change is tested in our internal test bed and produces stable results (see the commit message for details).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This cuts ~6.67MB from our reserved capacity. It's still larger than it
could be if we stored IPv4 addresses packed together, but I think the
new overhead is small enough to remove the FIXME; we're only ~3x worse
than optimal for IPv4-only workloads now. On Linux, most of the backing
memory doesn't get allocated anyway.
@Mark-Simulacrum Mark-Simulacrum marked this pull request as ready for review March 28, 2025 14:44
This shrinks the p100 handshake time in practice (mostly fixing the
comment) while still preserving random peer ordering between different
hosts. In testing, this doesn't *quite* meet our expectation of the
p100 time being exactly the handshake period (I see 47 minutes at p100
with a 30-minute period), but that's still better than the ~62 minutes
we see otherwise, and the average fluctuates much less (+/- roughly 1
minute).
@Mark-Simulacrum Mark-Simulacrum changed the title Store SocketAddressV6 in rehandshake queue Some more tweaks for rehandshake queue Mar 28, 2025
// The peer is part of the comparison key to ensure that we find duplicate
// entries correctly (sorting by hash alone might interleave different addresses).
self.queue
.sort_unstable_by_key(|peer| (self.hasher.hash_one(peer), *peer));
Contributor


Won't this hash the address on every comparison? I wonder if it would be better to use sort_unstable_by and check whether the peers match before hashing? I guess that would only reduce the hashing operations for duplicates, and I'm not sure how often that happens.

Collaborator Author


It will, but I think matching addresses are very unlikely in practice -- we should only see that transiently after a rehandshake (for ~10 minutes) -- so with 500k peers and a 24-hour period, (500_000 / (60 * 24)) * 10 / 500_000 * 100 ≈ 0.69%, i.e. ~0.7% of peers (~3,472) should have duplicate entries at any time.
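The back-of-envelope estimate above can be checked mechanically (a standalone sketch; the 500k-peer, 24-hour, 10-minute figures are just the ones assumed in this comment):

```rust
// Fraction of peers with a duplicate queue entry at any given time:
// (rehandshakes per minute) * (minutes a duplicate lives) / (total peers).
fn duplicate_fraction(peers: f64, period_minutes: f64, overlap_minutes: f64) -> f64 {
    (peers / period_minutes) * overlap_minutes / peers
}

fn main() {
    let pct = duplicate_fraction(500_000.0, 60.0 * 24.0, 10.0) * 100.0;
    println!("{pct:.2}% of peers duplicated at any time"); // ~0.69%
}
```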

I think in practice this also runs about once per handshake period (so once every 24 hours), so a bit of extra CPU for the hashing is not a big deal. We could cache the hashes (sort_by_cached_key), but I don't think the extra memory spend is worth it.
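For concreteness, the ordering scheme under discussion could look roughly like this (a hypothetical sketch, not the actual s2n-quic-dc type; `RehandshakeQueue` and `refill` are made-up names). The hasher is seeded once, so re-sorting after a refill keeps each peer in a stable, per-host-random position, and the peer itself is the tie-breaker so duplicate entries land adjacent to each other:

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;
use std::net::SocketAddrV6;

// Hypothetical sketch of the queue discussed in this PR.
struct RehandshakeQueue {
    // Seeded once per endpoint; never rotated, so the order is stable.
    hasher: RandomState,
    queue: Vec<SocketAddrV6>,
}

impl RehandshakeQueue {
    fn new() -> Self {
        Self { hasher: RandomState::new(), queue: Vec::new() }
    }

    fn refill(&mut self, peers: impl IntoIterator<Item = SocketAddrV6>) {
        self.queue.extend(peers);
        // Sort by (hash, peer): the hash gives a per-host random but stable
        // order; the peer breaks ties so duplicates become adjacent.
        self.queue
            .sort_unstable_by_key(|peer| (self.hasher.hash_one(peer), *peer));
        // Adjacent duplicates collapse here.
        self.queue.dedup();
    }
}

fn main() {
    let mut q = RehandshakeQueue::new();
    let a: SocketAddrV6 = "[2001:db8::1]:443".parse().unwrap();
    let b: SocketAddrV6 = "[2001:db8::2]:443".parse().unwrap();
    q.refill([a, b, a]); // the duplicate `a` collapses
    assert_eq!(q.queue.len(), 2);
    let order = q.queue.clone();
    q.refill([b, a]); // refilling preserves the established order
    assert_eq!(q.queue, order);
}
```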


// Initializes the hasher with random keys, ensuring we handshake with peers in a
// different, random order from different hosts.
hasher: RandomState::new(),
Contributor


Do we need to rotate this state over time? With the current behavior, the hosts that get the lowest hash values will always be prioritized over the higher ones.

Collaborator Author


I'm not sure what you mean by prioritized. We run through the full list before resuming, so while low-hash peers go first (or last -- I never remember which direction sorting goes), I'm not sure that means much. The hash of a given peer address will be different on ~every endpoint (it should be randomly mapped given a reasonably good hash, which SipHash seems to be).

I think rotating the state gets us back to the random shuffling we were doing before, which is bad for p100 ages, and it doesn't actually add much value -- I can't think of why we'd expect rotating to improve on not rotating, since the seeds are randomly picked to start with. Do you have a model for why shuffling each time would be better?

@camshaft camshaft merged commit 6a81117 into aws:main Mar 28, 2025
116 checks passed
@Mark-Simulacrum Mark-Simulacrum deleted the fix-rehandshake-nits branch March 28, 2025 19:43
dougch pushed a commit that referenced this pull request May 19, 2025

* Store SocketAddressV6 in rehandshake queue


* Avoid shuffling handshake queue on each run
