Change the handling of fetch.negotiationAlgorithm=<str> to error out
on unknown strings, i.e. everything except "default" or "skipping".
This changes the behavior added in 42cc7485a2 ("negotiator/skipping:
skip commits during fetch", 2018-07-16) which would ignore all unknown
values and silently fall back to the "default" value.
For a feature like this it's much better to produce an error than
proceed. We don't want users to debug some amazingly slow fetch that
should benefit from "skipping", only to find that they'd forgotten to
deploy the new git version on that particular machine.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike ref advertisement, client capabilities and the first want are
separated by SP, not NUL, in the implementation. Fix the documentation
to align with the implementation. pack-protocol.txt is already fixed.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fsck.<msg-id> is set to an unknown value it'll cause "fsck" to
die, but the same is not true of the "fetch" and "receive"
variants. Document this and test for it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test and document that the {fetch,receive}.fsck.* family of variables
doesn't fall back on the corresponding .fsck.* variables.
This was alluded to in the existing documentation by saying that
"receive" looks at receive.fsck.* and "fsck" looks at fsck.* etc., but
it wasn't explicitly stated that there was no fallback, and if you'd
e.g. like to configure the skipList you need to do that for all three.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement support for fetch.fsck.* corresponding with the existing
receive.fsck.*. This allows for pedantically cloning repositories with
specific issues without turning off fetch.fsckObjects.
One such repository is https://github.com/robbyrussell/oh-my-zsh.git
which before this change will emit this error when cloned with
fetch.fsckObjects:
error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes
fatal: Error in object
fatal: index-pack failed
Now with fetch.fsck.zeroPaddedFilemode=warn we'll warn about that
issue, but the clone will succeed:
warning: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes
warning: object a18c4d13c2a5fa2d4ecd5346c50e119b999b807d: zeroPaddedFilemode: contains zero-padded file modes
warning: object 84df066176c8da3fd59b13731a86d90f4f1e5c9d: zeroPaddedFilemode: contains zero-padded file modes
The motivation for this is to be able to turn on fetch.fsckObjects
globally across a fleet of computers but still be able to manually
clone various legacy repositories by either white-listing specific
issues, or better yet whitelist specific objects.
The use of --git-dir=* instead of -C in the tests could be considered
somewhat archaic, but the tests I'm adding here are duplicating the
corresponding receive.* tests with as few changes as possible.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the transfer.fsckObjects documentation to explicitly note the
unique security and/or corruption issues fetch.fsckObjects suffers
from, since it doesn't have a quarantine environment.
This was already alluded to in the existing documentation, but let's
spell it out so there's no confusion here, and give a concrete example
of how to work around this limitation.
Let's also prominently note that this is considered to be a limitation
of the current implementation, rather than something that's intended
and by design, since we might change this in the future.
See
https://public-inbox.org/git/20180531060259.GE17344@sigill.intra.peff.net/
for further details.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing documentation led the user to believe that all we were
doing were basic reachability sanity checks, but that hasn't been true
for a very long time. Update the description to match reality, and
note the caveat that there's a quarantine for accepting pushes, but
not for fetching.
Also mention that the fsck checks for security issues, which was my
initial motivation for writing this fetch.fsck.* series.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the fsck.<msg-id> and receive.fsck.<msg-id>
variables was mostly duplicated in two places, with fsck.<msg-id>
making no mention of the corresponding receive.fsck.<msg-id>, and the
same for fsck.skipList.
I spent quite a lot of time today wondering why setting the
fsck.<msg-id> variant wasn't working to clone a legacy repository (not
that that would have worked anyway, but a subsequent patch implements
fetch.fsck.<msg-id>).
Rectify this situation by describing the feature in general terms
under the fsck.* documentation, and make the receive.fsck.*
documentation refer to those variables instead.
This documentation was initially added in 2becf00ff7 ("fsck: support
demoting errors to warnings", 2015-06-22) and 4b55b9b479 ("fsck:
document the new receive.fsck.<msg-id> options", 2015-06-22).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refer readers of fetch.fsckObjects and receive.fsckObjects to
transfer.fsckObjects instead of repeating the description at each
location.
I don't think this description of them makes much sense, but for now
I'm just moving the existing documentation around. Making it better
will be done in a later patch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changelog embedded in the document pre-dates the addition of the
document to git.git (it used to be a Google Doc), so it only goes up
to 752414ae43 ("technical doc: add a design doc for hash function
transition", 2017-09-27).
Since then I made some small edits to it, which would have been worthy
of including in this changelog (but weren't). Instead of amending it
to include these, just note that future changes will be noted in the
log.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-parse ':/substring'" did not consider the history leading
only to HEAD when looking for a commit with the given substring,
when the HEAD is detached. This has been fixed.
* wc/find-commit-with-pattern-on-detached-head:
sha1-name.c: for ":/", find detached HEAD commits
Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side. The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).
* jt/partial-clone-fsck-connectivity:
clone: check connectivity even if clone is partial
upload-pack: send refs' objects despite "filter"
The content-transfer-encoding of the message "git send-email" sends
out by default was 8bit, which can cause trouble when there is an
overlong line to bust RFC 5322/2822 limit. A new option 'auto' to
automatically switch to quoted-printable when there is such a line
in the payload has been introduced and is made the default.
* bc/send-email-auto-cte:
docs: correct RFC specifying email line length
send-email: automatically determine transfer-encoding
send-email: accept long lines with suitable transfer encoding
send-email: add an auto option for transfer encoding
"git fetch" failed to correctly validate the set of objects it
received when making a shallow history deeper, which has been
corrected.
* jt/connectivity-check-after-unshallow:
fetch-pack: write shallow, then check connectivity
fetch-pack: implement ref-in-want
fetch-pack: put shallow info in output parameter
fetch: refactor to make function args narrower
fetch: refactor fetch_refs into two functions
fetch: refactor the population of peer ref OIDs
upload-pack: test negotiation with changing repository
upload-pack: implement ref-in-want
test-pkt-line: add unpack-sideband subcommand
"git rebase" behaved slightly differently depending on which one of
the three backends gets used; this has been documented and an
effort to make them more uniform has begun.
* en/rebase-consistency:
git-rebase: make --allow-empty-message the default
t3401: add directory rename testcases for rebase and am
git-rebase.txt: document behavioral differences between modes
directory-rename-detection.txt: technical docs on abilities and limitations
git-rebase.txt: address confusion between --no-ff vs --force-rebase
git-rebase: error out when incompatible options passed
t3422: new testcases for checking when incompatible options passed
git-rebase.sh: update help messages a bit
git-rebase.txt: document incompatible options
When submitting a revised version of a patch or series, it can be
helpful (to reviewers) to include a summary of changes since the
previous attempt in the form of an interdiff, typically in the cover
letter. However, it is occasionally useful, despite making for a noisy
read, to insert an interdiff into the commentary section of the lone
patch of a 1-patch series.
Therefore, extend "git format-patch --interdiff=<prev>" to insert an
interdiff into the commentary section of a lone patch rather than
requiring a cover letter. The interdiff is indented to avoid confusing
git-am and human readers into considering it part of the patch proper.
Implementation note: Generating an interdiff for insertion into the
commentary section of a patch which itself is currently being generated
requires invoking the diffing machinery recursively. However, the
machinery does not (presently) support this since it uses global state.
Consequently, we need to take care to stash away the state of the
in-progress operation while generating the interdiff, and restore it
after.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When submitting a revised version of a patch series, it can be helpful
(to reviewers) to include a summary of changes since the previous
attempt in the form of an interdiff, however, doing so involves manually
copy/pasting the diff into the cover letter.
Add an --interdiff option to automate this process. The argument to
--interdiff specifies the tip of the previous attempt against which to
generate the interdiff. For example:
git format-patch --cover-letter --interdiff=v1 -3 v2
The previous attempt and the patch series being formatted must share a
common base.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.multiPackIndex config setting controls the multi-pack-
index (MIDX) feature. If false, the setting will disable all reads
from the multi-pack-index file.
Read this config setting in the new prepare_multi_pack_index_one()
which is called during prepare_packed_git(). This check is run once
per repository.
Add comparison commands in t5319-multi-pack-index.sh to check
typical Git behavior remains the same as the config setting is turned
on and off. This currently includes 'git rev-list' and 'git log'
commands to trigger several object database reads. Currently, these
would only catch an error in the prepare_multi_pack_index_one(), but
with later commits will catch errors in object lookups, abbreviations,
and approximate object counts.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final pair of chunks for the multi-pack-index file stores the object
offsets. We default to using 32-bit offsets as in the pack-index version
1 format, but if there exists an offset larger than 32-bits, we use a
trick similar to the pack-index version 2 format by storing all offsets
at least 2^31 in a 64-bit table; we use the 32-bit table to point into
that 64-bit table as necessary.
We only store these 64-bit offsets if necessary, so create a test that
manipulates a version 2 pack-index to fake a large offset. This allows
us to test that the large offset table is created, but the data does not
match the actual packfile offsets. The multi-pack-index offset does match
the (corrupted) pack-index offset, so a future feature will compare these
offsets during a 'verify' step.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index needs to track which packfiles it indexes. Store
these in our first required chunk. Since filenames are not well
structured, add padding to keep good alignment in later chunks.
Modify the 'git multi-pack-index read' subcommand to output the
existence of the pack-file name chunk. Modify t5319-multi-pack-index.sh
to reflect this new output and the new expected number of chunks.
Defense in depth: A pattern we are using in the multi-pack-index feature
is to verify the data as we write it. We want to ensure we never write
invalid data to the multi-pack-index. There are many checks that verify
that the values we are writing fit the format definitions. This mainly
helps developers while working on the feature, but it can also identify
issues that only appear when dealing with very large data sets. These
large sets are hard to encode into test cases.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of writing multi-pack-indexes, add a skeleton
'git multi-pack-index write' subcommand and send the options to a
write_midx_file() method. Also create a skeleton test script that
tests the 'write' subcommand.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new 'git multi-pack-index' builtin will be the plumbing access
for writing, reading, and checking multi-pack-index files. The
initial implementation is a no-op.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can already disable replace refs using a command line
option or environment variable, but those are awkward to
apply universally. Let's add a config option to do the same
thing.
That raises the question of why one might want to do so
universally. The answer is that replace refs violate the
immutability of objects. For instance, if you wanted to
cache the diff between commit XYZ and its parent, then in
theory that never changes; the hash XYZ represents the total
state. But replace refs violate that; pushing up a new ref
may create a completely new diff.
The obvious "if it hurts, don't do it" answer is not to
create replace refs if you're doing this kind of caching.
But for a site hosting arbitrary repositories, they may want
to allow users to share replace refs with each other, but
not actually respect them on the site (because the caching
is more important than the replace feature).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify that setting core.ignoreCase to deviate from reality would
not turn a case-incapable filesystem into a case-capable one.
* ms/core-icase-doc:
Documentation: declare "core.ignoreCase" as internal variable
The "-l" option in "git branch -l" is an unfortunate short-hand for
"--create-reflog", but many users, both old and new, somehow expect
it to be something else, perhaps "--list". This step warns when "-l"
is used as a short-hand for "--create-reflog" and warns about the
future repurposing of the it when it is used.
* jk/branch-l-0-deprecation:
branch: deprecate "-l" option
t: switch "branch -l" to "branch --create-reflog"
t3200: unset core.logallrefupdates when testing reflog creation
"git grep" learned the "--column" option that gives not just the
line number but the column number of the hit.
* tb/grep-column:
contrib/git-jump/git-jump: jump to exact location
grep.c: add configuration variables to show matched option
builtin/grep.c: add '--column' option to 'git-grep(1)'
grep.c: display column number of first match
grep.[ch]: extend grep_opt to allow showing matched column
grep.c: expose {,inverted} match column in match_line()
Documentation/config.txt: camel-case lineNumber for consistency
Add missing colon in two places to fix formatting of options.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit allows git to create and check x509 type signatures using
gpgsm.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Supporting multiple signing formats we will have the need to configure a
custom program each. Add a new config value to cater for that.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add "gpg.format" where the user can specify which type of signature to
use for commits. At the moment only "openpgp" is supported and the value is
not even used. This commit prepares for a new types of signatures.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff. Some cases came up where different treatment would
have been nice.
Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=<modes>
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.
As we change the default, we'll adjust the tests.
For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.
As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.
By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some build machines started consistently failing to fetch updated
source using "repo sync", with error
error: The last gc run reported the following. Please correct the root cause
and remove /build/.repo/projects/tools/git.git/gc.log.
Automatic cleanup will not be performed until the file is removed.
warning: There are too many unreachable loose objects; run 'git prune' to remove them.
The cause takes some time to describe.
In v2.0.0-rc0~145^2 (gc: config option for running --auto in
background, 2014-02-08), "git gc --auto" learned to run in the
background instead of blocking the invoking command. In this mode, it
closed stderr to avoid interleaving output with any subsequent
commands, causing warnings like the above to be swallowed; v2.6.3~24^2
(gc: save log from daemonized gc --auto and print it next time,
2015-09-19) addressed that by storing any diagnostic output in
.git/gc.log and allowing the next "git gc --auto" run to print it.
To avoid wasteful repeated fruitless gcs, when gc.log is present, the
subsequent "gc --auto" would die after printing its contents. Most
git commands, such as "git fetch", ignore the exit status from "git gc
--auto" so all is well at this point: the user gets to see the error
message, and the fetch succeeds, without a wasteful additional attempt
at an automatic gc.
External tools like repo[1], though, do care about the exit status
from "git gc --auto". In non-daemonized mode, the exit status is
straightforward: if there is an error, it is nonzero, but after a
warning like the above, the status is zero. The daemonized mode, as a
side effect of the other properties provided, offers a very strange
exit code convention:
- if no housekeeping was required, the exit status is 0
- the first real run, after forking into the background, returns exit
status 0 unconditionally. The parent process has no way to know
whether gc will succeed.
- if there is any diagnostic output in gc.log, subsequent runs return
a nonzero exit status to indicate that gc was not triggered.
There's nothing for the calling program to act on on the basis of that
error. Use status 0 consistently instead, to indicate that we decided
not to run a gc (just like if no housekeeping was required). This
way, repo and similar tools can get the benefit of the same behavior
as tools like "git fetch" that ignore the exit status from gc --auto.
Once the period of time described by gc.pruneExpire elapses, the
unreachable loose objects will be removed by "git gc --auto"
automatically.
[1] https://gerrit-review.googlesource.com/c/git-repo/+/10598/
Reported-by: Andrii Dehtiarov <adehtiarov@google.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new negotiation algorithm used during fetch that skips
commits in an effort to find common ancestors faster. The skips grow
similarly to the Fibonacci sequence as the commit walk proceeds further
away from the tips. The skips may cause unnecessary commits to be
included in the packfile, but the negotiation step typically ends more
quickly.
Usage of this algorithm is guarded behind the configuration flag
fetch.negotiationAlgorithm.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>