Before, can_fast_forward was written with an if-else statement. However,
in the future, we may be adding more termination cases which would lead
to deeply nested if statements.
Refactor to use a goto tower so that future cases can be easily
inserted.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Append the strbuf buffer only after detaching it. There is no practical
difference here, as the strbuf is not empty and no strbuf_ function is
called between storing the pointer to the still attached buffer and
calling strbuf_detach(), so that pointer is valid, but make sure to
follow the standard sequence anyway for consistency.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to write commit-graph over given commit object names has
been made a bit more robust.
* sg/commit-graph-validate:
commit-graph: error out on invalid commit oids in 'write --stdin-commits'
commit-graph: turn a group of write-related macro flags into an enum
t5318-commit-graph: use 'test_expect_code'
"git checkout" and "git restore" to re-populate the index from a
tree-ish (typically HEAD) did not work correctly for a path that
was removed and then added again with the intent-to-add bit, when
the corresponding working tree file was empty. This has been
corrected.
* vn/restore-empty-ita-corner-case-fix:
restore: add test for deleted ita files
checkout.c: unstage empty deleted ita files
Codepaths to walk tree objects have been audited for integer
overflows and hardened.
* jk/tree-walk-overflow:
tree-walk: harden make_traverse_path() length computations
tree-walk: add a strbuf wrapper for make_traverse_path()
tree-walk: accept a raw length for traverse_path_len()
tree-walk: use size_t consistently
tree-walk: drop oid from traverse_info
setup_traverse_info(): stop copying oid
"git grep --recurse-submodules" that looks at the working tree
files looked at the contents in the index in submodules, instead of
files in the working tree.
* mt/grep-submodules-working-tree:
grep: fix worktree case in submodules
In this code path, we use sha1_to_hex to display the contents of a v1
pack index. While we plan to switch to v3 indices for SHA-256, the v1
pack indices still function, so to support both algorithms, switch
sha1_to_hex to hash_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex.
Rename several variables to indicate that they can contain any hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex so
this code works with other algorithms.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change struct wt_status to use struct object_id instead of an array of
unsigned char.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the existing uses of null_sha1 can be converted into uses of
null_oid, so do so. Remove null_sha1 and is_null_sha1, and define
is_null_oid in terms of null_oid. This also has the additional benefit
of removing several uses of sha1_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the_hash_algo when calling xwrite with a hex object ID so that the
proper amount of data is written.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index uses a variety of hard-coded constants to enumerate the
values in pack indices. Switch these hard-coded constants to use
the_hash_algo or GIT_MAX_RAWSZ, as appropriate, and convert one instance
of an unsigned char array to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch several hard-coded uses of the constant 40 to references to
the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The push cert code uses HMAC-SHA-1 to create a nonce. This is a secure
use of SHA-1 which is not affected by its collision resistance (or lack
thereof). However, it makes sense for us to use a better algorithm if
one is available, one which may even be more performant. Futhermore,
until we have specialized functions for computing the hex value of an
arbitrary function, it simplifies the code greatly to use the same hash
algorithm everywhere.
Switch this code to use GIT_MAX_BLKSZ and the_hash_algo for computing
the push cert nonce, and rename the hmac_sha1 function to simply "hmac".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the two separate patch-id implementations to use the_hash_algo
in their implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using GIT_SHA1_HEXSZ and hard-coded constants, switch to
using the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the --set-upstream option to git pull/fetch
which lets the user set the upstream configuration
(branch.<current-branch-name>.merge and
branch.<current-branch-name>.remote) for the current branch.
A typical use-case is:
git clone http://example.com/my-public-fork
git remote add main http://example.com/project-main-repo
git pull --set-upstream main master
or, instead of the last line:
git fetch --set-upstream main master
git merge # or git rebase
This is mostly equivalent to cloning project-main-repo (which sets
upsteam) and then "git remote add" my-public-fork, but may feel more
natural for people using a hosting system which allows forking from
the web UI.
This functionality is analog to "git push --set-upstream".
Signed-off-by: Corentin BOMPARD <corentin.bompard@etu.univ-lyon1.fr>
Signed-off-by: Nathan BERBEZIER <nathan.berbezier@etu.univ-lyon1.fr>
Signed-off-by: Pablo CHABANNE <pablo.chabanne@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
Patch-edited-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_tree_from_memory() appeared to be a merge-recursive special that
basically duplicated write_index_as_tree(). The two have a different
signature, but the bigger difference was just that write_index_as_tree()
would always unconditionally read the index off of disk instead of
working on the current in-memory index. So:
* split out common code into write_index_as_tree_internal()
* rename write_tree_from_memory() to write_inmemory_index_as_tree(),
make it call write_index_as_tree_internal(), and move it to
cache-tree.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_trees() took a results parameter that would only be written when
opt->call_depth was positive, which is never the case now that
merge_trees_internal() has been split from merge_trees(). Remove the
misleading and unused parameter from merge_trees().
While at it, add some comments explaining how the output of
merge_trees() and merge_recursive() differ.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the bug that just won't die; there always seems to be another
form of it somewhere. See the commit message of 55f39cf755 ("merge:
fix misleading pre-merge check documentation", 2018-06-30) for a more
detailed explanation), but in short:
<quick summary>
builtin/merge.c contains this important requirement for merge
strategies:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
This condition is important to enforce because there are two likely
failure cases when the index isn't in sync with the head commit:
* we silently throw away changes the user had staged before the merge
* we accidentally (and silently) include changes in the merge that
were not part of either of the branches/trees being merged
Discarding users' work and mis-merging are both bad outcomes, especially
when done silently, so naturally this rule was stated sternly -- but,
unfortunately totally ignored in practice unless and until actual bugs
were found. But, fear not: the bugs from this were fixed in commit
ee6566e8d7 ("[PATCH] Rewrite read-tree", 2005-09-05)
through a rewrite of read-tree (again, commit 55f39cf755 has a more
detailed explanation of how this affected merge). And it was fixed
again in commit
160252f816 ("git-merge-ours: make sure our index matches HEAD", 2005-11-03)
...and it was fixed again in commit
3ec62ad9ff ("merge-octopus: abort if index does not match HEAD", 2016-04-09)
...and again in commit
65170c07d4 ("merge-recursive: avoid incorporating uncommitted changes in a merge", 2017-12-21)
...and again in commit
eddd1a411d ("merge-recursive: enforce rule that index matches head before merging", 2018-06-30)
...with multiple testcases added to the testsuite that could be
enumerated in even more commits.
Then, finally, in a patch in the same series as the last fix above, the
documentation about this requirement was fixed in commit 55f39cf755
("merge: fix misleading pre-merge check documentation", 2018-06-30), and
we all lived happily ever after...
</quick summary>
Unfortunately, "ever after" apparently denotes a limited time and it
expired today. The merge-recursive rule to enforce that index matches
head was at the beginning of merge_trees() and would only trigger when
opt->call_depth was 0. Since merge_recursive() doesn't call
merge_trees() until after returning from recursing, this meant that the
check wasn't triggered by merge_recursive() until it had first finished
all the intermediate merges to create virtual merge bases. That is a
potentially HUGE amount of computation (and writing of intermediate
merge results into the .git/objects directory) before it errors out and
says, in effect, "Sorry, I can't do any merging because you have some
local changes that would be overwritten."
Trying to enforce that all of merge_trees(), merge_recursive(), and
merge_recursive_generic() checked the index == head condition earlier
resulted in a bunch of broken tests. It turns out that
merge_recursive() has code to drop and reload the cache while recursing
to create intermediate virtual merge bases, but unfortunately that code
runs even when no recursion is necessary. This unconditional dropping
and reloading of the cache masked a few bugs:
* builtin/merge-recursive.c: didn't even bother loading the index.
* builtin/stash.c: feels like a fake 'builtin' because it repeatedly
invokes git subprocesses all over the place, mixed with other
operations. In particular, invoking "git reset" will reset the
index on disk, but the parent process that invoked it won't
automatically have its in-memory index updated.
* t3030-merge-recursive.h: this test has always been broken in that it
didn't make sure to make index match head before running. But, it
didn't care about the index or even the merge result, just the
verbose output while running. While commit eddd1a411d
("merge-recursive: enforce rule that index matches head before
merging", 2018-06-30) should have uncovered this broken test, it
used a test_must_fail wrapper around the merge-recursive call
because it was known that the merge resulted in a rename/rename
conflict. Thus, that fix only made this test fail for a different
reason, and since the index == head check didn't happen until after
coming all the way back out of the recursion, the testcase had
enough information to pass the one check that it did perform.
So, load the index in builtin/merge-recursive.c, reload the in-memory
index in builtin/stash.c, and modify the t3030 testcase to correctly
setup the index and make sure that the test fails in the expected way
(meaning it reports a rename/rename conflict). This makes sure that
all callers actually make the index match head. The next commit will
then enforce the condition that index matches head earlier so this
problem doesn't return in the future.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve code readability by introducing an enum to replace the
not-quite-boolean values taken on by detect_directory_renames.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git checkout -m' and using diff3 style conflict markers,
we want all the conflict hunks (left-side, "common" or "merge base", and
right-side) to have label markers letting the user know where each came
from. The "common" hunk label (o.ancestor) came from
old_branch_info->name, but that is NULL when HEAD is detached, which
resulted in a blank label. Check for that case and provide an
abbreviated commit hash instead.
(Incidentally, this was the only case in the git codebase where
merge_trees() was called with opt->ancestor being NULL. A subsequent
commit will prevent similar problems by enforcing that merge_trees()
always be called with opt->ancestor != NULL.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both commit a7256debd4 ("checkout.txt: note about losing staged
changes with --merge", 2019-03-19) from nd/checkout-m-doc-update and
commit 6eff409e8a ("checkout: prevent losing staged changes with
--merge", 2019-03-22) from nd/checkout-m were included in git.git
despite the fact that the latter was meant to be v2 of the former.
The merge of these two topics resulted in a redundant chunk of code;
remove it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.untrackedCache config setting is slightly complicated,
so clarify its use and centralize its parsing into the repo
settings.
The default value is "keep" (returned as -1), which persists the
untracked cache if it exists.
If the value is set as "false" (returned as 0), then remove the
untracked cache if it exists.
If the value is set as "true" (returned as 1), then write the
untracked cache and persist it.
Instead of relying on magic values of -1, 0, and 1, split these
options into an enum. This allows the use of "-1" as a
default value. After parsing the config options, if the value is
unset we can initialize it to UNTRACKED_CACHE_KEEP.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few important config settings that are not loaded
during git_default_config. These are instead loaded on-demand.
Centralize these config options to a single scan, and store
all of the values in a repo_settings struct. The values for
each setting are initialized as negative to indicate "unset".
This centralization will be particularly important in a later
change to introduce "meta" config settings that change the
defaults for these config settings.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid data loss, 'git worktree remove' refuses to delete a worktree
if it's dirty or contains untracked files. However, the error message
only mentions that the worktree "is dirty", even if the worktree in
question is in fact clean, but contains untracked files:
$ git worktree add test-worktree
Preparing worktree (new branch 'test-worktree')
HEAD is now at aa53e60 Initial
$ >test-worktree/untracked-file
$ git worktree remove test-worktree/
fatal: 'test-worktree/' is dirty, use --force to delete it
$ git -C test-worktree/ diff
$ git -C test-worktree/ diff --cached
$ # Huh? Where are those dirty files?!
Clarify this error message to say that the worktree "contains modified
or untracked files".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Long ago, in 97bfeb34df (Release pack windows before reporting out of
memory., 2006-12-24), we taught xmalloc() and friends to try unmapping
pack windows when malloc() failed. It's unlikely that his helps a lot in
practice, and it has some downsides. First, the downsides:
1. It makes xmalloc() not thread-safe. We've worked around this in
pack-objects.c, which installs its own locking version of the
try_to_free_routine(). But other threaded code doesn't.
2. It makes the system as a whole harder to reason about. Functions
which allocate heap memory under the hood may have farther-reaching
effects than expected.
That might be worth the tradeoff if there's a benefit. But in practice,
it seems unlikely. We're generally dealing with mmap'd files, so the OS
is going to do a much better job at responding to memory pressure by
dropping individual pages (the exception is systems with NO_MMAP, but
even there the OS can probably respond just as well with swapping).
So the only thing we're really freeing is address space. On 64-bit
systems, we have plenty of that to go around. On 32-bit systems, it
could possibly help. But around the same time we made two other changes:
77ccc5bbd1 (Introduce new config option for mmap limit., 2006-12-23) and
60bb8b1453 (Fully activate the sliding window pack access., 2006-12-23).
Together that means that a 32-bit system should have no more than 256MB
total of packed-git mmaps at one time, split between a few 32MB windows.
It's unlikely we have any address space problems since then, but we
don't have any data since the features were all added at the same time.
Likewise, xmmap() will try to free memory. At first glance, it seems
like we'd need this (when we try to mmap a new window, we might need to
close an old one to save address space on a 32-bit system). But we're
saved again by core.packedGitLimit: if we're going to exceed our 256MB
limit, we'll close an existing window before we even call mmap().
So it seems unlikely that this feature is actually doing anything
useful. And while we don't have reports of it harming anything (probably
because it rarely if ever kicks in), it would be nice to simplify the
system overall. This patch drops the whole try_to_free system from
xmalloc(), as well as the manual pack memory release in xmmap().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Analogous to commit, introduce a '--no-verify' option which bypasses the
pre-merge-commit hook. The shorthand '-n' is taken by '--no-stat'
already.
[js: * reworded commit message to reflect current state of --no-stat flag
and new hook name
* fixed flag documentation to reflect new hook name
* cleaned up trailing whitespace
* squashed test changes from the original series' patch 4/4
* modified tests to follow pattern from this series' patch 1/4
* added a test case for --no-verify with non-executable hook
* when testing that the merge hook did not run, make sure we
actually have a merge to perform (by resetting the "side" branch
to its original state).
]
Improved-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-merge does not honor the pre-commit hook when doing automatic merge
commits, and for compatibility reasons this is going to stay.
Introduce a pre-merge-commit hook which is called for an automatic merge
commit just like pre-commit is called for a non-automatic merge commit
(or any other commit).
[js: * renamed hook from "pre-merge" to "pre-merge-commit"
* only discard the index if the hook is actually present
* expanded githooks documentation entry
* clarified that hook should write messages to stderr
* squashed test changes from the original series' patch 4/4
* modified tests to follow new pattern from this series' patch 1/4
* added a test case for non-executable merge hooks
* added a test case for failed merges
* when testing that the merge hook did not run, make sure we
actually have a merge to perform (by resetting the "side" branch
to its original state).
* reworded commit message
]
Improved-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f8b863598c ("builtin/merge: honor commit-msg hook for merges", 2017-09-07)
introduced the no-verify flag to merge for bypassing the commit-msg
hook, though in a different way from the implementation in commit.c.
Change the implementation in merge.c to be the same as in commit.c so
that both do the same in the same way. This also changes the output of
"git merge --help" to be more clear that the hook return code is
respected by default.
[js: * reworded commit message
* squashed documentation changes from original series' patch 3/4
]
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:
# nonsense
$ echo not-a-commit-oid | git commit-graph write --stdin-commits
$ echo $?
0
# sometimes I forgot that refs are not good...
$ echo HEAD | git commit-graph write --stdin-commits
$ echo $?
0
# valid tree OID, but not a commit OID
$ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
$ echo $?
0
$ ls -l .git/objects/info/commit-graph
ls: cannot access '.git/objects/info/commit-graph': No such file or directory
Check that all input records are indeed valid commit object ids and
return with error otherwise, the same way '--stdin-packs' handles
invalid input; see e103f7276f (commit-graph: return with errors during
write, 2019-06-12).
Note that it should only return with error when encountering an
invalid commit object id coming from standard input. However,
'--reachable' uses the same code path to process object ids pointed to
by all refs, and that includes tag object ids as well, which should
still be skipped over. Therefore add a new flag to 'enum
commit_graph_write_flags' and a corresponding field to 'struct
write_commit_graph_context', so we can differentiate between those two
cases.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hotfix for making "git log" use the mailmap by default.
* jc/log-mailmap-flip-defaults:
log: really flip the --mailmap default
log: flip the --mailmap default unconditionally
It is possible to delete a committed file from the index and then add it
as intent-to-add. After `git checkout HEAD <pathspec>`, the file should
be identical in the index and HEAD. The command already works correctly
if the file has contents in HEAD. This patch provides the desired
behavior even when the file is empty in HEAD.
`git checkout HEAD <pathspec>` calls tree.c:read_tree_1(), with fn
pointing to checkout.c:update_some(). update_some() creates a new cache
entry but discards it when its mode and oid match those of the old
entry. A cache entry for an ita file and a cache entry for an empty file
have the same oid. Therefore, an empty deleted ita file previously
passed both of these checks, and the new entry was discarded, so the
file remained unchanged in the index. After this fix, if the file is
marked as ita in the cache, then we avoid discarding the new entry and
add the new entry to the cache instead.
This change should not affect newly added ita files. For those, inside
tree.c:read_tree_1(), tree_entry_interesting() returns
entry_not_interesting, so fn is never called.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Varun Naik <vcnaik94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the docs, test the interaction between the new default,
configuration and command line option, in addition to actually
flipping the default.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All but one of the callers of make_traverse_path() allocate a new heap
buffer to store the path. Let's give them an easy way to write to a
strbuf, which saves them from computing the length themselves (which is
especially tricky when they want to add to the path). It will also make
it easier for us to change the make_traverse_path() interface in a
future patch to improve its bounds-checking.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a "struct name_entry", but only care about the length of the
path name. Let's just take that length directly, making it easier to use
the function from callers that sometimes do not have a name_entry at
all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Squelch unneeded and misleading warnings from "repack" when the
command attempts to generate pack bitmaps without explicitly asked
for by the user.
* jk/repack-silence-auto-bitmap-warning:
repack: simplify handling of auto-bitmaps and .keep files
repack: silence warnings when auto-enabled bitmaps cannot be built
t7700: clean up .keep file in bitmap-writing test
It turns out that being cautious to warn against upcoming default
change was an unpopular behaviour, and such a care can easily be
defeated by distro packagers to render it ineffective anyway.
Just flip the default, with only a mention in the release notes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the previous commit shows, the presence of an oid in each level of
the traverse_info is confusing and ultimately not necessary. Let's drop
it to make it clear that it will not always be set (as well as convince
us that it's unused, and let the compiler catch any merges with other
branches that do add new uses).
Since the oid is part of name_entry, we'll actually stop embedding a
name_entry entirely, and instead just separately hold the pathname, its
length, and the mode.
This makes the resulting code slightly more verbose as we have to pass
those elements around individually. But it also makes it more clear what
each code path is going to use (and in most of the paths, we really only
care about the pathname itself).
A few of these conversions are noisier than they need to be, as they
also take the opportunity to rename "len" to "namelen" for clarity
(especially where we also have "pathlen" or "ce_len" alongside).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We assume that if setup_traverse_info() is passed a non-empty "base"
string, that string is pointing into a tree object and we can read the
object oid by skipping past the trailing NUL.
As it turns out, this is not true for either of the two calls, and we
may end up reading garbage bytes:
1. In git-merge-tree, our base string is either empty (in which case
we'd never run this code), or it comes from our traverse_path()
helper. The latter overallocates a buffer by the_hash_algo->rawsz
bytes, but then fills it with only make_traverse_path(), leaving
those extra bytes uninitialized (but part of a legitimate heap
buffer).
2. In unpack_trees(), we pass o->prefix, which is some arbitrary
string from the caller. In "git read-tree --prefix=foo", for
instance, it will point to the command-line parameter, and we'll
read 20 bytes past the end of the string.
Interestingly, tools like ASan do not detect (2) because the process
argv is part of a big pre-allocated buffer. So we're reading trash, but
it's trash that's probably part of the next argument, or the
environment.
You can convince it to fail by putting something like this at the
beginning of common-main.c's main() function:
{
int i;
for (i = 0; i < argc; i++)
argv[i] = xstrdup_or_null(argv[i]);
}
That puts the arguments into their own heap buffers, so running:
make SANITIZE=address test
will find problems when "read-tree --prefix" is used (e.g., in t3030).
Doubly interesting, even with the hackery above, this does not fail
prior to ea82b2a085 (tree-walk: store object_id in a separate member,
2019-01-15). That commit switched setup_traverse_info() to actually
copying the hash, rather than simply pointing to it. That pointer was
always pointing to garbage memory, but that commit started actually
dereferencing the bytes, which is what triggers ASan.
That also implies that nobody actually cares about reading these oid
bytes anyway (or at least no path covered by our tests). And manual
inspection of the code backs that up (I'll follow this patch with some
cleanups that show definitively this is the case, but they're quite
invasive, so it's worth doing this fix on its own).
So let's drop the bogus hashcpy(), along with the confusing oversizing
in merge-tree.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7328482253 (repack: disable bitmaps-by-default if .keep files
exist, 2019-06-29) taught repack to prefer disabling bitmaps to
duplicating objects (unless bitmaps were asked for explicitly).
But there's an easier way to do this: if we keep passing the
--honor-pack-keep flag to pack-objects when auto-enabling bitmaps, then
pack-objects already makes the same decision (it will disable bitmaps
rather than duplicate). Better still, pack-objects can actually decide
to do so based not just on the presence of a .keep file, but on whether
that .keep file actually impacts the new pack we're making (so if we're
racing with a push or fetch, for example, their temporary .keep file
will not block us from generating bitmaps if they haven't yet updated
their refs).
And because repack uses the --write-bitmap-index-quiet flag, we don't
have to worry about pack-objects generating confusing warnings when it
does see a .keep file. We can confirm this by tweaking the .keep test to
check repack's stderr.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on various config options, a full repack may not be able to
build a reachability bitmap index (e.g., if pack.packSizeLimit forces us
to write multiple packs). In these cases pack-objects may write a
warning to stderr.
Since 36eba0323d (repack: enable bitmaps by default on bare repos,
2019-03-14), we may generate these warnings even when the user did not
explicitly ask for bitmaps. This has two downsides:
- it can be confusing, if they don't know what bitmaps are
- a daemonized auto-gc will write this to its log file, and the
presence of the warning may suppress further auto-gc (until
gc.logExpiry has elapsed)
Let's have repack communicate to pack-objects that the choice to turn on
bitmaps was not made explicitly by the user, which in turn allows
pack-objects to suppress these warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>