Processes that access packdata while the .idx file gets removed
(e.g. while repacking) did not fail or fall back gracefully as they
could.
* tb/idx-midx-race-fix:
midx.c: protect against disappearing packs
packfile.c: protect against disappearing indexes
"git update-ref --stdin" learns to take multiple transactions in a
single session.
* ps/update-ref-multi-transaction:
update-ref: disallow "start" for ongoing transactions
p1400: use `git-update-ref --stdin` to test multiple transactions
update-ref: allow creation of multiple transactions
t1400: avoid touching refs on filesystem
"git add -i" failed to honor custom colors configured to show
patches, which has been corrected.
* js/add-i-color-fix:
add -i: verify in the tests that colors can be overridden
add -p: prefer color.diff.context over color.diff.plain
add -i (Perl version): color header to match the C version
add -i (built-in): use the same indentation as the Perl version
add -p (built-in): do not color the progress indicator separately
add -i (built-in): use correct names to load color.diff.* config
add -i (built-in): prevent the `reset` "color" from being configured
add -i: use `reset_color` consistently
add -p (built-in): imitate `xdl_format_hunk_hdr()` generating hunk headers
add -i (built-in): send error messages to stderr
add -i (built-in): do show an error message for incorrect inputs
An earlier attempt to fix "git fetch --recurse-submodules" broke
another use case; revert it until a better fix is found.
* pk/subsub-fetch-fix:
Revert "submodules: fix of regression on fetching of non-init subsub-repo"
"git fetch" that is killed may leave a pack-objects process behind,
still computing to find a good compression, wasting cycles. This
has been corrected.
* jk/stop-pack-objects-when-fetch-is-killed:
upload-pack: kill pack-objects helper on signal or exit
"git push" that is killed may leave a pack-objects process behind,
still computing to find a good compression, wasting cycles. This
has been corrected.
* jk/stop-pack-objects-when-push-is-killed:
send-pack: kill pack-objects helper on signal or exit
Simplify the logic to deal with a repack operation that ended up
creating the same packfile.
* tb/repack-simplify:
builtin/repack.c: don't move existing packs out of the way
builtin/repack.c: keep track of what pack-objects wrote
repack: make "exts" array available outside cmd_repack()
"git pull --rebase --recurse-submodules" checked for local changes
in a wrong range and failed to run correctly when it should.
* pb/pull-rebase-recurse-submodules:
pull: check for local submodule modifications with the right range
t5572: describe '--rebase' tests a little more
t5572: add notes on a peculiar test
pull --rebase: compute rebase arguments in separate function
"git-parse-remote" shell script library outlived its usefulness.
* ab/retire-parse-remote:
submodule: fix fetch_in_submodule logic
parse-remote: remove this now-unused library
submodule: remove sh function in favor of helper
submodule: use "fetch" logic instead of custom remote discovery
This reverts commit 1b7ac4e6d4d490b224f5206af7418ed74e490608; in
<CAN0XMOLiS_8JZKF_wW70BvRRxkDHyUoa=Z3ODtB_Bd6f5Y=7JQ@mail.gmail.com>,
Ralf Thielow reports that "git fetch" with submodule.recurse set can
result in a bogus and infinitely recursive fetching of the same
submodule.
We spawn an external pack-objects process to actually send objects to
the remote side. If we are killed by a signal during this process, then
pack-objects may continue to run. As soon as it starts producing output
for the pack, it will see a failure writing to upload-pack and exit
itself. But before then, it may do significant work traversing the
object graph, compressing deltas, etc, which will all be pointless. So
let's make sure to kill as soon as we know that the caller will not read
the result.
There's no test here, since it's inherently racy, but here's an easy
reproduction is on a large-ish repo like linux.git:
- make sure you don't have pack bitmaps (since they make the enumerating
phase go quickly). For linux.git it takes ~30s or so to walk the
whole graph on my machine.
- run "git clone --no-local -q . dst"; the "-q" is important because
if pack-objects is writing progress to upload-pack (to get
multiplexed over the sideband to the client), then it will notice
pretty quickly the failure to write to stderr
- kill the client-side clone process in another terminal (don't use
^C, as that will send SIGINT to all of the processes)
- run "ps au | grep git" or similar to observe upload-pack dying
within 5 seconds (it will send a keepalive that will notice the
client has gone away)
- but you'll still see pack-objects consuming 100% CPU (and 1GB+ of
RAM) during the traversal and delta compression phases. It will exit
as soon as it starts to write the pack (when it will notice that
upload-pack went away).
With this patch, pack-objects exits as soon as upload-pack does.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multiple "credential-store" backends can race to lock the same
file, causing everybody else but one to fail---reattempt locking
with some timeout to reduce the rate of the failure.
* sa/credential-store-timeout:
crendential-store: use timeout when locking file
A test script got cleaned up and then made not to depend on the
value of init.defaultBranch.
* js/t3404-master-to-primary:
t3404: do not depend on any specific default branch name
Config parser fix for "git notes".
* na/notes-displayref-is-not-boolean:
t3301: test proper exit response to no-value notes.displayRef.
notes.c: fix a segfault in notes_display_config()
Expectation for the original contributor after responding to a
review comment to use the explanation in a patch update has been
described.
* jc/do-not-just-explain-but-update-your-patch:
MyFirstContribition: answering questions is not the end of the story
Fix formulation of an error message with two placeholders in "git
worktree add" subcommand.
* mt/worktree-error-message-fix:
worktree: fix order of arguments in error message
Fix an option name in "gc" documentation.
* ab/gc-keep-base-option:
gc: rename keep_base_pack variable for --keep-largest-pack
gc docs: change --keep-base-pack to --keep-largest-pack
A test script got cleaned up not to depend on the value of
init.defaultBranch.
* js/t4015-wo-master:
t4015: let the test pass with any default branch name
A test script got cleaned up and then made not to depend on the
value of init.defaultBranch.
* js/t2106-cleanup:
t2106: ensure that the checkout fails for the expected reason
t2106: make test independent of the current main branch name
t2106: adjust style to the current conventions
A lazily defined test prerequisite can now be defined in terms of
another lazily defined test prerequisite.
* sg/tests-prereq:
tests: fix description of 'test_set_prereq'
tests: make sure nested lazy prereqs work reliably
Since jgit does not yet work with SHA-256 repositories, mark the
tests that uses it not to run unless we are testing with ShA-1
repositories.
* sg/t5310-jgit-wants-sha1:
t5310-pack-bitmaps: skip JGit tests with SHA256
"git fetch" did not work correctly with nested submodules where the
innermost submodule that is not of interest got updated in the
upstream, which has been corrected.
* pk/subsub-fetch-fix:
submodules: fix of regression on fetching of non-init subsub-repo
The code was not prepared to deal with pack .idx file that is
larger than 4GB.
* jk/4gb-idx:
packfile: detect overflow in .idx file size checks
block-sha1: take a size_t length parameter
fsck: correctly compute checksums on idx files larger than 4GB
use size_t to store pack .idx byte offsets
compute pack .idx byte offsets using size_t
The exchange between receive-pack and proc-receive hook did not
carefully check for errors.
* jx/t5411-flake-fix:
receive-pack: use default version 0 for proc-receive
receive-pack: gently write messages to proc-receive
t5411: new helper filter_out_user_friendly_and_stable_output
"git bisect start/next" in a large span of history spends a lot of
time trying to come up with exactly the half-way point; this can be
optimized by stopping when we see a commit that is close enough to
the half-way point.
* sg/bisect-approximately-halfway:
bisect: loosen halfway() check for a large number of commits
The command line completion script (in contrib/) learned to expand
commands that are alias of alias.
* fc/bash-completion-alias-of-alias:
completion: bash: improve alias loop detection
completion: bash: check for alias loop
completion: bash: support recursive aliases
When a packed object is stored in a multi-pack index, but that pack has
racily gone away, the MIDX code simply calls die(), when it could be
returning an error to the caller, which would in turn lead to
re-scanning the pack directory.
A pack can racily disappear, for example, due to a simultaneous 'git
repack -ad',
You can also reproduce this with two terminals, where one is running:
git init
while true; do
git commit -q --allow-empty -m foo
git repack -ad
git multi-pack-index write
done
(in effect, constantly writing new MIDXs), and the other is running:
obj=$(git rev-parse HEAD)
while true; do
echo $obj | git cat-file --batch-check='%(objectsize:disk)' || break
done
That will sometimes hit the error preparing packfile from
multi-pack-index message, which this patch fixes.
Right now, that path to discovering a missing pack looks something like
'find_pack_entry()' calling 'fill_midx_entry()' and eventually making
its way to call 'nth_midxed_pack_entry()'.
'nth_midxed_pack_entry()' already checks 'is_pack_valid()' and
propagates an error if the pack is invalid. So, this works if the pack
has gone away between calling 'prepare_midx_pack()' and before calling
'is_pack_valid()', but not if it disappears before then.
Catch the case where the pack has already disappeared before
'prepare_midx_pack()' by returning an error in that case, too.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 17c35c8969 (packfile: skip loading index if in multi-pack-index,
2018-07-12) we stopped loading the .idx file for packs that are
contained within a multi-pack index.
This saves us the effort of loading an .idx and doing some lightweight
validity checks by way of 'packfile.c:load_idx()', but introduces a race
between processes that need to load the index (e.g., to generate a
reverse index) and processes that can delete the index.
For example, running the following in your shell:
$ git init repo && cd repo
$ git commit --allow-empty -m 'base'
$ git repack -ad && git multi-pack-index write
followed by:
$ rm -f .git/objects/pack/pack-*.idx
$ git rev-parse HEAD | git cat-file --batch-check='%(objectsize:disk)'
will result in a segfault prior to this patch. What's happening here is
that we notice that the pack is in the multi-pack index, and so don't
check that it still has a .idx. When we then try and load that index to
generate a reverse index, we don't have it, so the call to
'find_pack_revindex()' in 'packfile.c:packed_object_info()' returns
NULL, and then dereferencing it causes a segfault.
Of course, we don't ever expect someone to remove the index file by
hand, or to be in a state where we never wrote it to begin with (yet
find that pack in the multi-pack-index). But, this can happen in a
timing race with 'git repack -ad', which removes all existing packs
after writing a new pack containing all of their objects.
Avoid this by reverting the hunk of 17c35c8969 which stops loading the
index when the pack is contained in a MIDX. This makes the latter half
of 17c35c8969 useless, since we'll always have a non-NULL
'p->index_data', in which case that if statement isn't guarding
anything.
These two together effectively revert 17c35c8969, and avoid the race
explained above.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When holding the lock for rewriting the credential file, use a timeout
to avoid race conditions when the credentials file needs to be updated
in parallel.
An example would be doing `fetch --all` on a repository with several
remotes that need credentials, using parallel fetching.
The timeout can be configured using "credentialStore.lockTimeoutMS",
defaulting to 1 second.
Signed-off-by: Simão Afonso <simao.afonso@powertools-tech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sleep function is defined in wrapper.c, so it makes more sense to be a in
system compatibility header.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit a trace2 error event whenever warning() is called, just like when
die(), error(), or usage() is called.
This helps debugging issues that would trigger warnings but not errors.
In particular, this might have helped debugging an issue I encountered
with commit graphs at $DAYJOB [1].
There is a tradeoff between including potentially relevant messages and
cluttering up the trace output produced. I think that warning() messages
should be included in traces, because by its nature, Git is used over
multiple invocations of the Git tool, and a failure (currently traced)
in a Git invocation might be caused by an unexpected interaction in a
previous Git invocation that only has a warning (currently untraced) as
a symptom - as is the case in [1].
[1] https://lore.kernel.org/git/20200629220744.1054093-1-jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A review exchange may begin with a reviewer asking "what did you
mean by this phrase in your log message (or here in the doc)?", the
author answering what was meant, and then the reviewer saying "ah,
that is what you meant---then the flow of the logic makes sense".
But that is not the happy end of the story. New contributors often
forget that the material that has been reviewed in the above exchange
is still unclear in the same way to the next person who reads it,
until it gets updated.
While we are in the vicinity, rephrase the verb "request" used to
refer to comments by reviewers to "suggest"---this matches the
contrast between "original" and "suggested" that appears later in
the same paragraph, and more importantly makes it clearer that it is
not like authors are to please reviewers' wishes but rather
reviewers are merely helping authors to polish their commits.
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we can override the default branch name in the tests via
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME`, we should avoid expecting a
particular hard-coded name.
So let's rename the initial branch immediately to `primary` and work
with that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 1c1518071c (submodule: use "fetch" logic instead of custom remote
discovery, 2020-11-14) rewrote the logic in fetch_in_submodule to do:
elif test "$2" -ne ""
But this is nonsense in shell: -ne is for numeric comparisons. This
should be "=" or more idiomatically:
elif test -n "$2"
But once we fix that, many tests start failing. Because that commit
introduced another problem. The caller that passes 3 arguments looks
like this:
fetch_in_submodule "$sm_path" $depth "$sha1"
Note the unquoted $depth parameter. When it isn't set, the function will
see only 2 arguments, and the function has no idea if what it sees in $2
is an option to go on the command line, or a refspec to pass on stdin.
In the old code before that commit:
fetch_in_submodule () (
sanitize_submodule_env &&
cd "$1" &&
- case "$2" in
- '')
- git fetch ;;
- *)
- shift
- git fetch $(get_default_remote) "$@" ;;
- esac
we treated those the same, so it didn't matter. But in the new logic
(with my fix above):
+ if test $# -eq 3
+ then
+ echo "$3" | git fetch --stdin "$2"
+ elif test -n "$n"
+ then
+ git fetch "$2"
+ else
+ git fetch
+ fi
we use the number of parameters to distinguish the two. Let's insist
that the caller pass an empty string for positional parameter two if
they want to have a third parameter after it.
But that still leaves one problem. In the --stdin block, we
unconditionally pass "$2" to git-fetch, even if it's the empty string.
Rather than add another conditional, we can use :+ parameter expansion
to include it only if it's non-empty. In fact, we can do the same for
the elif, too, simplifying it further. Technically this is overkill,
since we know the --depth parameter will not have whitespace (and
indeed, most callers do not bother quoting it), but it doesn't hurt for
the function to be careful.
It's somewhat amazing that no tests were failing. I think what happened
is that:
- the 3-arg form rarely triggered; any call with a non-empty $depth
and a $sha1 would work, but one with an empty $depth would only have
2 arguments
- because of the wrong arguments to "test", the shell would complain
and exit non-zero. So we never ran the middle conditional at all
- that left every call running "git fetch" with no arguments. A
well-written test could have detected the distinction here, but in
practice omitting --depth just means fetching more commits, and
fetching everything (rather than a single sha1) works as long as the
commit in question is reachable
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Restore a space that was lost in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25).
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>