Tags: avar/git
Tags
hooks: fix a TOCTOU in "did we run a hook?" heuristic Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee55 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). We can fix the race passing around information about whether or not we ran the hook in question, instead of running hook_exists() after the fact to check if the hook in question exists. This problem has been noted on-list when 680ee55 was discussed[1], but had not been fixed. In addition to fixing this for the pre-commit hook as suggested there I'm also fixing this for the pre-merge-commit hook. See 6098817 (git-merge: honor pre-merge-commit hook, 2019-08-07) for the introduction of its previous behavior. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331 (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 6754159 (refs: implement reference transaction hook, 2020-06-19), and the prepare-commit-msg hook, see 66618a5 (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. More importantly, in those cases the worst we'll do is miss that we "should" run the hook because a new hook appeared, whereas in the pre-commit and pre-merge-commit cases we'll skip an important discard_cache() on the bases of our faulty guess. I do think none of these races really matter in practice. It would be some one-off issue as a hook was added or removed. I did think it was stupid that we didn't pass a "did this run?" flag instead of doing this guessing at a distance though, so now we're not guessing anymore. 1. https://lore.kernel.org/git/[email protected]/ Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
bundle-uri docs: add design notes Add a design doc for the bundle-uri protocol extension to go along with the packfile-uri extension added in cd8402e (Documentation: add Packfile URIs design doc, 2020-06-10). Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
hooks: fix a TOCTOU in "did we run a hook?" heuristic Fix a Time-of-check to time-of-use (TOCTOU) race in code added in 680ee55 (commit: skip discarding the index if there is no pre-commit hook, 2017-08-14). We can fix the race passing around information about whether or not we ran the hook in question, instead of running hook_exists() after the fact to check if the hook in question exists. This problem has been noted on-list when 680ee55 was discussed[1], but had not been fixed. In addition to fixing this for the pre-commit hook as suggested there I'm also fixing this for the pre-merge-commit hook. See 6098817 (git-merge: honor pre-merge-commit hook, 2019-08-07) for the introduction of its previous behavior. Let's also change this for the push-to-checkout hook. Now instead of checking if the hook exists and either doing a push to checkout or a push to deploy we'll always attempt a push to checkout. If the hook doesn't exist we'll fall back on push to deploy. The same behavior as before, without the TOCTOU race. See 0855331 (receive-pack: support push-to-checkout hook, 2014-12-01) for the introduction of the previous behavior. This leaves uses of hook_exists() in two places that matter. The "reference-transaction" check in refs.c, see 6754159 (refs: implement reference transaction hook, 2020-06-19), and the prepare-commit-msg hook, see 66618a5 (sequencer: run 'prepare-commit-msg' hook, 2018-01-24). In both of those cases we're saving ourselves CPU time by not preparing data for the hook that we'll then do nothing with if we don't have the hook. So using this "invoked_hook" pattern doesn't make sense in those cases. More importantly, in those cases the worst we'll do is miss that we "should" run the hook because a new hook appeared, whereas in the pre-commit and pre-merge-commit cases we'll skip an important discard_cache() on the bases of our faulty guess. I do think none of these races really matter in practice. It would be some one-off issue as a hook was added or removed. I did think it was stupid that we didn't pass a "did this run?" flag instead of doing this guessing at a distance though, so now we're not guessing anymore. 1. https://lore.kernel.org/git/[email protected]/ Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Revert "marks" This reverts commit eb48b1f.
serve: add command to advertise bundle URIs
When the uploadpack.bundleURI config is set to a URI (or URIs, if set
>1 times), advertise a "bundle-uri" command, then when the client
requests "bundle-uri" emit those URIs back at them.
The client CAN then request those URIs out of bounds, and after
they've done (after either disconnecting & coming back, or leaving us
hanging), proceed with the rest of request flow. I.e. issuing a
"ls-refs" followed by a "fetch". The client MAY then send us "have"
lines with the tips they've unpacked from their newly acquired
bundle(s).
This commit doesn't implement any of that required client behavior,
only the trivial server behavior of spewing a list of URLs at the
client on request.
There is already a uploadpack.blobPackfileUri setting for the server,
so why is this needed? The Documentation/technical/bundle-uri.txt
added in a preceding commit discusses that in more detail, but in
summary:
1. There is no "real" support for in git.git. The
uploadpack.blobPackfileUri setting allows carving out a list of
blobs (actually any OIDs), but as alluded to in bfc2a36 (Doc:
clarify contents of packfile sent as URI, 2021-01-20) the only
"real" implementation is JGit based.
2. The uploadpack.blobPackfileUri is a MUST where this is a
"CAN". I.e. once a client says they support packfile-uri of given
list of protocols the server will send them a PACK response
assuming they've downloaded the URI they client was sent, if the
client doesn't do that they don't have a valid repository.
Pointing at a bundle and having the client send us "have"
lines (or not, maybe they couldn't fetch it, or decided they
didn't want to) is more flexible, and can gracefully recover
e.g. if the CDN isn't reachable (maybe you do support "https", but
the CDN provider is down, or blocked your whole country).
3. Because of the disconnect in #2 "dumb" servers can seed
pre-clients, e.g. we might point to a repo.bundle whose exact
state we aren't sure of (a cronjob updates it, sometimes). The
client will discover its contents, and give us the "have" lines,
the "packfile-uri" method effectively requires the server to have
those exact "have" lines (or rather, it will produce a similar
PACK using give-or-take the same exclusions).
4. This provides an easy way to the long sought after "resumable
clones". I.e. since we can assume that it's in the server's
interest to keep their bundle(s) as up-to-date as possible, most
or all of the history we need to fetch will be in the bundle. If
we fail midway through the "clone" we can offload the problem of
resuming to wget/curl/rsync/whatever, instead of (as has been
suggested, but not implemented for the "normal" dialog)
"repairing" a partial PACK response or something.
There was a suggestion of implementing a similar feature long ago[1]
by Jeff King. The main difference between it and this approach is that
we've since gained protocol v2, so we can add this as an optional path
in the dialog between client and server. The 2011 implementation
hooked into the transport mechanism to try to clone from a bundle
directly. See also [2] and [3] for some later mentions of that
approach.
See also [4] for the series that implemented
uploadpack.blobPackfileUri, and [5] for a series on top that did the
.gitmodules check in that context. See [6] for the "ls-refs unborn"
feature which modified code in similar areas of the request flow.
1. https://lore.kernel.org/git/[email protected]/
2. https://lore.kernel.org/git/[email protected]/
3. https://lore.kernel.org/git/[email protected]/
4. https://lore.kernel.org/git/[email protected]/
Merged as 34e849b (Merge branch 'jt/cdn-offload', 2020-06-25)
5. https://lore.kernel.org/git/[email protected]/
Merged as 6ee353d (Merge branch 'jt/transfer-fsck-across-packs',
2021-03-01)
6. 69571df (Merge branch 'jt/clone-unborn-head', 2021-02-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
bundle-uri client: add boolean transfer.bundleURI setting The yet-to-be introduced client support for bundle-uri will always fall back on a full clone, but we'd still like to be able to ignore a server's bundle-uri advertisement entirely. This is useful for testing, and if a server is pointing to bad bundles, they take a while to time out etc. Since we might see the config in any order we need to clear out any accumulated bundle_uri list when we see transfer.bundleURI=false setting, and not add any more things to the list. Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
bundle-uri docs: add design notes Add a design doc for the bundle-uri protocol extension to go along with the packfile-uri extension added in cd8402e (Documentation: add Packfile URIs design doc, 2020-06-10). Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
this offset optimization is never going to work consider content from/to 123xxxx 12xxxxx and now if the user's needle is "12x", I'll have skipped up to: 3xxxx xxxxx So even in the fixed-string case it won't work.
don't have a test for this, but harmless?
Revert "assorted changes" This reverts commit 5142fcc.
PreviousNext