In asciidoc's HTML output of the "gitrevisions" and "git-rev-parse"
documentation, the header:
The ... (three-dot) Symmetric Difference Notation
is rendered using "&8230;", a horizontal ellipsis. This is visually
ugly, but also hard to search for or cut-and-paste. We really mean three
ascii dots (0x2e) here, so let's make sure it renders as such.
The simplest way to do that is just escaping the leading dot, as the
instances in the rest of the section do. Arguably this should all be
converted to use backticks, which would let us drop the quoting here and
elsewhere (e.g., {carat}). But that does change the rendering slightly.
So let's fix the bug first, and we can decide on migrating the whole
section separately.
Note that this produces an empty doc-diff of the manpages. Curiously,
asciidoc produces the same ellipsis entity in the XML file, but docbook
then converts it back into three literal dots for the roff output! So
the roff manpages have been correct all along (which may be a reason
nobody noticed this until now).
Reported-by: Arthur Milchior
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Folks may want to merge histories that have no common ancestry; provide
a flag with the same name as used by `git merge` to allow this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much as `git ls-files` has a -z option, let's add one to merge-tree so
that the conflict-info section can be NUL terminated (and avoid quoting
of unusual filenames).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like `git merge` updates the index with information of the form
(mode, oid, stage, name)
provide this output for conflicted files for merge-tree as well.
Provide a --name-only option for users to exclude the mode, oid, and
stage and only get the list of conflicted filenames.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of `git merge-tree --write-tree` will often want to know which
files had conflicts. While they could potentially attempt to parse the
CONFLICT notices printed, those messages are not meant to be machine
readable. Provide a simpler mechanism of just printing the files (in
the same format as `git ls-files` with quoting, but restricted to
unmerged files) in the output before the free-form messages.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `git merge-tree --write-tree`, we previously would only
return an exit status reflecting the cleanness of a merge, and print out
the toplevel tree of the resulting merge. Merges also have
informational messages, such as:
* "Auto-merging <PATH>"
* "CONFLICT (content): ..."
* "CONFLICT (file/directory)"
* etc.
In fact, when non-content conflicts occur (such as file/directory,
modify/delete, add/add with differing modes, rename/rename (1to2),
etc.), these informational messages may be the only notification the
user gets since these conflicts are not representable in the contents
of the file.
Add a --[no-]messages option so that callers can request these messages
be included at the end of the output. Include such messages by default
when there are conflicts, and omit them by default when the merge is
clean.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the ability to perform real merges rather than just trivial
merges (meaning handling three way content merges, recursive ancestor
consolidation, renames, proper directory/file conflict handling, and so
forth). However, unlike `git merge`, the working tree and index are
left alone and no branch is updated.
The only output is:
- the toplevel resulting tree printed on stdout
- exit status of 0 (clean), 1 (conflicts present), anything else
(merge could not be performed; unknown if clean or conflicted)
This output is meant to be used by some higher level script, perhaps in
a sequence of steps like this:
NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2)
test $? -eq 0 || die "There were conflicts..."
NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
git update-ref $BRANCH1 $NEWCOMMIT
Note that higher level scripts may also want to access the
conflict/warning messages normally output during a merge, or have quick
access to a list of files with conflicts. That is not available in this
preliminary implementation, but subsequent commits will add that
ability (meaning that NEWTREE would be a lot more than a tree in the
case of conflicts).
This also marks the traditional trivial merge of merge-tree as
deprecated. The trivial merge not only had limited applicability, the
output format was also difficult to work with (and its format
undocumented), and will generally be less performant than real merges.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds a command line option analogous to that of GNU
grep(1)'s -m / --max-count, which users might already be used to.
This makes it possible to limit the amount of matches shown in the
output while keeping the functionality of other options such as -C
(show code context) or -p (show containing function), which would be
difficult to do with a shell pipeline (e.g. head(1)).
Signed-off-by: Carlos López 00xc@protonmail.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"sudo git foo" used to consider a repository owned by the original
user a safe one to access; it now also considers a repository owned
by root a safe one, too (after all, if an attacker can craft a
malicious repository owned by root, the box is 0wned already).
* cb/path-owner-check-with-sudo-plus:
git-compat-util: allow root to access both SUDO_UID and root owned
Reachability bitmaps are designed to speed up the "counting objects"
phase of generating a pack during a clone or fetch. They are not
optimized for Git clients sending a small topic branch via "git push".
In some cases (see [1]), using reachability bitmaps during "git push"
can cause significant performance regressions.
Add a new "push.useBitmaps" configuration variable to allow users to
tell "git push" not to use bitmaps. We already have "pack.bitmaps"
that controls the use of bitmaps, but a separate configuration variable
allows the reachability bitmaps to still be used in other areas,
such as "git upload-pack", while disabling it only for "git push".
[1]: https://lore.kernel.org/git/87zhoz8b9o.fsf@evledraar.gmail.com/
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previous changes introduced a regression which will prevent root for
accessing repositories owned by thyself if using sudo because SUDO_UID
takes precedence.
Loosen that restriction by allowing root to access repositories owned
by both uid by default and without having to add a safe.directory
exception.
A previous workaround that was documented in the tests is no longer
needed so it has been removed together with its specially crafted
prerequisite.
Helped-by: Johanness Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some config variables are combinations of multiple words, and we
typically write them in camelCase forms in manpage and translatable
strings. It's not easy to find mismatches for these camelCase config
variables during code reviews, but occasionally they are identified
during localization translations.
To check for mismatched config variables, I introduced a new feature
in the helper program for localization[^1]. The following mismatched
config variables have been identified by running the helper program,
such as "git-po-helper check-pot".
Lowercase in manpage should use camelCase:
* Documentation/config/http.txt: http.pinnedpubkey
Lowercase in translable strings should use camelCase:
* builtin/fast-import.c: pack.indexversion
* builtin/gc.c: gc.logexpiry
* builtin/index-pack.c: pack.indexversion
* builtin/pack-objects.c: pack.indexversion
* builtin/repack.c: pack.writebitmaps
* commit.c: i18n.commitencoding
* gpg-interface.c: user.signingkey
* http.c: http.postbuffer
* submodule-config.c: submodule.fetchjobs
Mismatched camelCases, choose the former:
* Documentation/config/transfer.txt: transfer.credentialsInUrl
remote.c: transfer.credentialsInURL
[^1]: https://github.com/git-l10n/git-po-helper
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename fetch.credentialsInUrl to transfer.credentialsInUrl as the
single configuration variable should work both in pushing and
fetching.
* ab/credentials-in-url-more:
transfer doc: move fetch.credentialsInUrl to "transfer" config namespace
fetch doc: note "pushurl" caveat about "credentialsInUrl", elaborate
Bitmap file has a trailing checksum at the end of the file. However
there is no information in the bitmap-format documentation about it.
Add a trailer section to include the trailing checksum info in the
`Documentation/technical/bitmap-format.txt` file.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The asciidoc generated html for `Documentation/technical/bitmap-
format.txt` is broken. This is mainly because `-` is used for nested
lists (which is not allowed in asciidoc) instead of `*`.
Fix these and also reformat it for better readability of the html page.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/Makefile does not include bitmap-format.txt to generate
a html page using asciidoc.
Teach Documentation/Makefile to also generate a html page for
Documentation/technical/bitmap-format.txt file.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git revert" learns "--reference" option to use more human-readable
reference to the commit it reverts in the message template it
prepares for the user.
* jc/revert-show-parent-info:
revert: --reference should apply only to 'revert', not 'cherry-pick'
revert: optionally refer to commit in the "reference" format
Drop the dependency on gzip(1) and use our internal implementation to
create tar.gz and tgz files.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git uses zlib for its own object store, but calls gzip when creating tgz
archives. Add an option to perform the gzip compression for the latter
using zlib, without depending on the external gzip binary.
Plug it in by making write_block a function pointer and switching to a
compressing variant if the filter command has the magic value "git
archive gzip". Does that indirection slow down tar creation? Not
really, at least not in this test:
$ hyperfine -w3 -L rev HEAD,origin/main -p 'git checkout {rev} && make' \
'./git -C ../linux archive --format=tar HEAD # {rev}'
Benchmark #1: ./git -C ../linux archive --format=tar HEAD # HEAD
Time (mean ± σ): 4.044 s ± 0.007 s [User: 3.901 s, System: 0.137 s]
Range (min … max): 4.038 s … 4.059 s 10 runs
Benchmark #2: ./git -C ../linux archive --format=tar HEAD # origin/main
Time (mean ± σ): 4.047 s ± 0.009 s [User: 3.903 s, System: 0.138 s]
Range (min … max): 4.038 s … 4.066 s 10 runs
How does tgz creation perform?
$ hyperfine -w3 -L command 'gzip -cn','git archive gzip' \
'./git -c tar.tgz.command="{command}" -C ../linux archive --format=tgz HEAD'
Benchmark #1: ./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD
Time (mean ± σ): 20.404 s ± 0.006 s [User: 23.943 s, System: 0.401 s]
Range (min … max): 20.395 s … 20.414 s 10 runs
Benchmark #2: ./git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD
Time (mean ± σ): 23.807 s ± 0.023 s [User: 23.655 s, System: 0.145 s]
Range (min … max): 23.782 s … 23.857 s 10 runs
Summary
'./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD' ran
1.17 ± 0.00 times faster than './git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD'
So the internal implementation takes 17% longer on the Linux repo, but
uses 2% less CPU time. That's because the external gzip can run in
parallel on its own processor, while the internal one works sequentially
and avoids the inter-process communication overhead.
What are the benefits? Only an internal sequential implementation can
offer this eco mode, and it allows avoiding the gzip(1) requirement.
This implementation uses the helper functions from our zlib.c instead of
the convenient gz* functions from zlib, because the latter doesn't give
the control over the generated gzip header that the next patch requires.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mention all formats in the --format section, use backtick quoting for
literal values throughout, clarify the description of the configuration
option.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "fetch.credentialsInUrl" configuration variable introduced
in 6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06) to "transfer".
There are existing exceptions, but generally speaking the
"<namespace>.<var>" configuration should only apply to command
described in the "namespace" (and its sub-commands, so e.g. "clone.*"
or "fetch.*" might also configure "git-remote-https").
But in the case of "fetch.credentialsInUrl" we've got a configuration
variable that configures the behavior of all of "clone", "push" and
"fetch", someone adjusting "fetch.*" configuration won't expect to
have the behavior of "git push" altered, especially as we have the
pre-existing "{transfer,fetch,receive}.fsckObjects", which configures
different parts of the transfer dialog.
So let's move this configuration variable to the "transfer" namespace
before it's exposed in a release. We could add all of
"{transfer,fetch,pull}.credentialsInUrl" at some other time, but once
we have "fetch" configure "pull" such an arrangement would would be a
confusing mess, as we'd at least need to have "fetch" configure
"push" (but not the other way around), or change existing behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the documentation and release notes entry for the
"fetch.credentialsInUrl" feature added in 6dcbdc0d66 (remote: create
fetch.credentialsInUrl config, 2022-06-06), it currently doesn't
detect passwords in `remote.<name>.pushurl` configuration. We
shouldn't lull users into a false sense of security, so we need to
mention that prominently.
This also elaborates and clarifies the "exposes the password in
multiple ways" part of the documentation. As noted in [1] a user
unfamiliar with git's implementation won't know what to make of that
scary claim, e.g. git hypothetically have novel git-specific ways of
exposing configured credentials.
The reality is that this configuration is intended as an aid for users
who can't fully trust their OS's or system's security model, so lets
say that's what this is intended for, and mention the most common ways
passwords stored in configuration might inadvertently get exposed.
1. https://lore.kernel.org/git/220524.86ilpuvcqh.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch.credentialsInUrl" configuration variable controls what
happens when a URL with embedded login credential is used.
* ds/credentials-in-url:
remote: create fetch.credentialsInUrl config
The two examples in the doc for 'git diff-index' were not updated when
the raw output format was changed in 81e50eabf0 ([PATCH] The diff-raw
format updates., 2005-05-21) (first example) and in b6d8f309d9 ([PATCH]
diff-raw format update take #2., 2005-05-23) and 7cb6ac1e4b (diff:
diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value,
2017-12-03) (second example).
Update the output, inventing some characters to complete the source
hash in the second example. Also correct the destination mode in the
second example, which was wrongly '100664' since the addition of the
example in c64b9b8860 (Reference documentation for the core git
commands., 2005-05-05).
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Near the end of the "Raw output format" section, an example shows the
output of 'git diff-files' for a tracked file modified on disk but not
yet added to the index. However the wording is:
<sha1> is shown as all 0's if a file is new on the filesystem
and it is out of sync with the index.
which is confusing since it can be understood to mean that 'file' is a
new, yet untracked file, in which case 'git diff-files' does not care
about it at all.
When this example was introduced all the way back in c64b9b8860
(Reference documentation for the core git commands., 2005-05-05), 'old'
and 'new' referred to the two entities being compared, depending on the
command being used (diff-index, diff-tree or diff-files - which at the
time were diff-cache, diff-tree and show-diff). The wording used at the
time was:
<new-sha1> is shown as all 0's if new is a file on the
filesystem and it is out of sync with the cache.
This section was reworked in 81e50eabf0 ([PATCH] The diff-raw
format updates., 2005-05-21) and the mention of the meaning of 'new' and
'old' was removed. Then in f73ae1fc5d (Some typos and light editing of
various manpages, 2005-10-05), the wording was changed to what it is
now.
In addition, in b6d8f309d9 ([PATCH] diff-raw format update take #2.,
2005-05-23), the section was further reworked and did not use '<sha1>'
anymore, making the example the sole user of this token.
Rework the introductory sentence of the example to instead refer to
'sha1 for "dst"', which is what the text description above it uses, and
fix the wording so that we do not mention a "new file".
While at it, also tweak the wording used in the description of the raw
format to explicitely state that all 0's are used for the destination
hash if the working tree is out of sync with the index, instead of the
more vague "look at worktree".
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"dst" can legitimately be "0\{40\}" for a creation patch, e.g. when
the stat information is stale, but it falls into "look at work tree"
case. The original description in b6d8f309 ([PATCH] diff-raw format
update take #2., 2005-05-23) forgot that deletion also makes the
"dst" 0* SHA-1.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of the stream_loose_object() function introduced in the
preceding commit to unpack large objects. Before this we'd need to
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.
We could use the new streaming interface to unpack all blobs, but
doing so would be much slower, as demonstrated e.g. with this
benchmark using git-hyperfine[0]:
rm -rf /tmp/scalar.git &&
git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
git hyperfine \
-r 2 --warmup 1 \
-L rev origin/master,HEAD -L v "10,512,1k,1m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
Here we'll perform worse with lower core.bigFileThreshold settings
with this change in terms of speed, but we're getting lower memory use
in return:
Summary
'./git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
A better benchmark to demonstrate the benefits of that this one, which
creates an artificial repo with a 1, 25, 50, 75 and 100MB blob:
rm -rf /tmp/repo &&
git init /tmp/repo &&
(
cd /tmp/repo &&
for i in 1 25 50 75 100
do
dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
done &&
git add blob.* &&
git commit -mblobs &&
git gc &&
PACK=$(echo .git/objects/pack/pack-*.pack) &&
cp "$PACK" my.pack
) &&
git hyperfine \
--show-output \
-L rev origin/master,HEAD -L v "512,50m,100m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
Using this test we'll always use >100MB of memory on
origin/master (around ~105MB), but max out at e.g. ~55MB if we set
core.bigFileThreshold=50m.
The relevant "Maximum resident set size" lines were manually added
below the relevant benchmark:
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
Maximum resident set size (kbytes): 107080
1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 106968
1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 107032
1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 107072
1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 55704
2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 4564
This shows that if you have enough memory this new streaming method is
slower the lower you set the streaming threshold, but the benefit is
more bounded memory use.
An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
"core.bigFileThreshold" variable[1]. As noted in a detailed overview
of its users in [2] using it has several different meanings.
Still, we consider it good enough to simply re-use it. While it's
possible that someone might want to e.g. consider objects "small" for
the purposes of diffing but "big" for the purposes of writing them
such use-cases are probably too obscure to worry about. We can always
split up "core.bigFileThreshold" in the future if there's a need for
that.
0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.bigFileThreshold documentation has been largely unchanged
since 5eef828bc0 (fast-import: Stream very large blobs directly to
pack, 2010-02-01).
But since then this setting has been expanded to affect a lot more
than that description indicated. Most notably in how "git diff" treats
them, see 6bf3b81348 (diff --stat: mark any file larger than
core.bigfilethreshold binary, 2014-08-16).
In addition to that, numerous commands and APIs make use of a
streaming mode for files above this threshold.
So let's attempt to summarize 12 years of changes in behavior, which
can be seen with:
git log --oneline -Gbig_file_thre 5eef828bc03.. -- '*.c'
To do that turn this into a bullet-point list. The summary Han Xin
produced in [1] helped a lot, but is a bit too detailed for
documentation aimed at users. Let's instead summarize how
user-observable behavior differs, and generally describe how we tend
to stream these files in various commands.
1. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
Helped-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
Using `ssh-add -L` for gpg.ssh.defaultKeyCommand is not a good
recommendation. It might switch keys depending on the order of known
keys and it only supports ssh-* and no ecdsa or other keys.
Clarify that we expect a literal key prefixed by `key::`, give valid
example use cases and refer to `user.signingKey` as the preferred
option.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
Test that "git config --show-scope" shows the "worktree" scope, and add
it to the list of scopes in Documentation/git-config.txt.
"git config --help" does not need to be updated because it already
mentions "worktree".
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implementation of "scalar diagnose" subcommand.
* js/scalar-diagnose:
scalar: teach `diagnose` to gather loose objects information
scalar: teach `diagnose` to gather packfile info
scalar diagnose: include disk space information
scalar: implement `scalar diagnose`
scalar: validate the optional enlistment argument
archive --add-virtual-file: allow paths containing colons
archive: optionally add "virtual" files
The documentation on the interaction between "--add-file" and
"--prefix" options of "git archive" has been improved.
* rs/document-archive-prefix:
archive: improve documentation of --prefix
Users sometimes provide a "username:password" combination in their
plaintext URLs. Since Git stores these URLs in plaintext in the
.git/config file, this is a very insecure way of storing these
credentials. Credential managers are a more secure way of storing this
information.
System administrators might want to prevent this kind of use by users on
their machines.
Create a new "fetch.credentialsInUrl" config option and teach Git to
warn or die when seeing a URL with this kind of information. The warning
anonymizes the sensitive information of the URL to be clear about the
issue.
This change currently defaults the behavior to "allow" which does
nothing with these URLs. We can consider changing this behavior to
"warn" by default if we wish. At that time, we may want to add some
advice about setting fetch.credentialsInUrl=ignore for users who still
want to follow this pattern (and not receive the warning).
An earlier version of this change injected the logic into
url_normalize() in urlmatch.c. While most code paths that parse URLs
eventually normalize the URL, that normalization does not happen early
enough in the stack to avoid attempting connections to the URL first. By
inserting a check into the remote validation, we identify the issue
before making a connection. In the old code path, this was revealed by
testing the new t5601-clone.sh test under --stress, resulting in an
instance where the return code was 13 (SIGPIPE) instead of 128 from the
die().
However, we can reuse the parsing information from url_normalize() in
order to benefit from its well-worn parsing logic. We can use the struct
url_info that is created in that method to replace the password with
"<redacted>" in our error messages. This comes with a slight downside
that the normalized URL might look slightly different from the input URL
(for instance, the normalized version adds a closing slash). This should
not hinder users figuring out what the problem is and being able to fix
the issue.
As an attempt to ensure the parsing logic did not catch any
unintentional cases, I modified this change locally to to use the "die"
option by default. Running the test suite succeeds except for the
explicit username:password URLs used in t5550-http-fetch-dumb.sh and
t5541-http-push-smart.sh. This means that all other tested URLs did not
trigger this logic.
The tests show that the proper error messages appear (or do not
appear), but also count the number of error messages. When only warning,
each process validates the remote URL and outputs a warning. This
happens twice for clone, three times for fetch, and once for push.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>