During abbreviation checks, we navigate to the position within a
pack-index that an OID would be inserted and check surrounding OIDs
for the maximum matching prefix. This position may be beyond the
last position, because the given OID is lexicographically larger
than every OID in the pack. Then nth_packed_object_oid() does not
initialize "oid".
Use the return value of nth_packed_object_oid() to prevent these
errors.
Also the comment about checking near-by objects miscounts the
neighbours. If we have a hit at "first", we check "first-1" and
"first+1" to make sure we have sufficiently long abbreviation not to
match either. If we do not have a hit, "first" is the smallest
among the objects that are larger than what we want to name, so we
check that and "first-1" to make sure we have sufficiently long
abbreviation not to match either. In either case, we only check up
to two near-by objects.
Reported-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Minimize OID comparisons during disambiguation of packfile OIDs.
Teach git to use binary search with the full OID to find the object's
position (or insertion position, if not present) in the pack-index.
The object before and immediately after (or the one at the insertion
position) give the maximum common prefix. No subsequent linear search
is required.
Take care of which two to inspect, in case the object id exists in the
packfile.
If the input to find_unique_abbrev_r() is a partial prefix, then the
OID used for the binary search is padded with zeroes so the object will
not exist in the repo (with high probability) and the same logic
applies.
This commit completes a series of three changes to OID abbreviation
code, and the overall change can be seen using standard commands for
large repos. Below we report performance statistics for perf test 4211.6
from p4211-line-log.sh using three copies of the Linux repo:
| Packs | Loose | HEAD~3 | HEAD | Rel% |
|-------|--------|----------|----------|-------|
| 1 | 0 | 41.27 s | 38.93 s | -4.8% |
| 24 | 0 | 98.04 s | 91.35 s | -5.7% |
| 23 | 323952 | 117.78 s | 112.18 s | -4.8% |
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create get_hex_char_from_oid() to parse oids one hex character at a
time. This prevents unnecessary copying of hex characters in
extend_abbrev_len() when finding the length of a common prefix.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unroll the while loop inside find_unique_abbrev_r to avoid iterating
through all loose objects and packfiles multiple times when the short
name is longer than the predicted length.
Instead, inspect each object that collides with the estimated
abbreviation to find the longest common prefix.
The focus of this change is to refactor the existing method in a way
that clearly does not change the current behavior. In some cases, the
new method is slower than the previous method. Later changes will
correct all performance loss.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time computing
diffs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch -M a b" while on a branch that is completely unrelated
to either branch a or branch b misbehaved when multiple worktree
was in use. This has been fixed.
* nd/worktree-kill-parse-ref:
branch: fix branch renaming not updating HEADs correctly
In addition to "cc: <a@dd.re.ss> # cruft", "cc: a@dd.re.ss # cruft"
was taught to "git send-email" as a valid way to tell it that it
needs to also send a carbon copy to <a@dd.re.ss> in the trailer
section.
* mm/send-email-cc-cruft:
send-email: don't use Mail::Address, even if available
send-email: fix garbage removal after address
The codepath to call external process filter for smudge/clean
operation learned to show the progress meter.
* ls/convert-filter-progress:
convert: display progress for filtered objects that have been delayed
Message and doc updates.
* ma/up-to-date:
treewide: correct several "up-to-date" to "up to date"
Documentation/user-manual: update outdated example output
Doc fix.
* mg/format-ref-doc-fix:
Documentation/git-for-each-ref: clarify peeling of tags for --format
Documentation: use proper wording for ref format strings
"git archive" did not work well with pathspecs and the
export-ignore attribute.
We may want to resurrect the "we don't archive an empty directory"
bonus patch, but I do not mind merging the above early to 'next'
and leave it as a separate follow-up enhancement.
cf. <20170820090629.tumvqwzkromcykjf@sigill.intra.peff.net>
* rs/archive-excluded-directory:
archive: don't queue excluded directories
archive: factor out helper functions for handling attributes
t5001: add tests for export-ignore attributes and exclude pathspecs
Killing "git merge --edit" before the editor returns control left
the repository in a state with MERGE_MSG but without MERGE_HEAD,
which incorrectly tells the subsequent "git commit" that there was
a squash merge in progress. This has been fixed.
* mg/killed-merge:
merge: save merge state earlier
merge: split write_merge_state in two
merge: clarify call chain
Documentation/git-merge: explain --continue
"git apply" that is used as a better "patch -p1" failed to apply a
taken from a file with CRLF line endings to a file with CRLF line
endings. The root cause was because it misused convert_to_git()
that tried to do "safe-crlf" processing by looking at the index
entry at the same path, which is a nonsense---in that mode, "apply"
is not working on the data in (or derived from) the index at all.
This has been fixed.
* tb/apply-with-crlf:
apply: file commited with CRLF should roundtrip diff and apply
convert: add SAFE_CRLF_KEEP_CRLF
When handshake with a subprocess filter notices that the process
asked for an unknown capability, Git did not report what program
the offending subprocess was running. This has been corrected.
We may want a follow-up fix to tighten the error checking, though.
* cc/subprocess-handshake-missing-capabilities:
sub-process: print the cmd when a capability is unsupported
"git grep -L" and "git grep --quiet -L" reported different exit
codes; this has been corrected.
* as/grep-quiet-no-match-exit-code-fix:
git-grep: correct exit code with --quiet and -L
bash 4.4 or newer gave a warning on NUL byte in command
substitution done in "git stash"; this has been squelched.
* kd/stash-with-bash-4.4:
stash: prevent warning about null bytes in input
"git svn" used with "--localtime" option did not compute the tz
offset for the timestamp in question and instead always used the
current time, which has been corrected.
* ur/svn-local-zone:
git svn fetch: Create correct commit timestamp when using --localtime
"git am -s" has been taught that some input may end with a trailer
block that is not Signed-off-by: and it should refrain from adding
an extra blank line before adding a new sign-off in such a case.
* pw/am-signoff:
am: fix signoff when other trailers are present
"git clone --recurse-submodules --quiet" did not pass the quiet
option down to submodules.
* bw/clone-recursive-quiet:
clone: teach recursive clones to respect -q