The environment GIT_ADVICE has been intentionally kept undocumented
to discourage its use by interactive users. Add documentation to
help tool writers.
* ds/doc-wholesale-disabling-advice-messages:
advice: recommend GIT_ADVICE=0 for tools
A file descriptor left open is now properly closed when "git
sparse-checkout" updates the sparse patterns.
* jk/sparse-fdleak-fix:
sparse-checkout: use fdopen_lock_file() instead of xfdopen()
sparse-checkout: check commit_lock_file when writing patterns
sparse-checkout: consolidate cleanup when writing patterns
"git cat-file" works well with the sparse-index, and gets marked as
such.
* kl/cat-file-on-sparse-index:
builtin/cat-file: mark 'git cat-file' sparse-index compatible
t1092: allow run_on_* functions to use standard input
One-line messages to "die" and other helper functions will get LF
added by these helper functions, but many existing messages had an
unnecessary LF at the end, which have been corrected.
* jk/messages-with-excess-lf-fix:
drop trailing newline from warning/error/die messages
"git pack-refs --auto" for the files backend was too aggressive,
which has been a bit tamed.
* ps/pack-refs-auto-heuristics:
refs/files: use heuristic to decide whether to repack with `--auto`
t0601: merge tests for auto-packing of refs
wrapper: introduce `log2u()`
A data corruption bug when multi-pack-index is used and the same
objects are stored in multiple packfiles has been corrected.
* tb/multi-pack-reuse-fix:
builtin/pack-objects.c: do not open-code `MAX_PACK_OBJECT_HEADER`
pack-bitmap.c: avoid repeated `pack_pos_to_offset()` during reuse
builtin/pack-objects.c: translate bit positions during pack-reuse
pack-bitmap: tag bitmapped packs with their corresponding MIDX
t/t5332-multi-pack-reuse.sh: verify pack generation with --strict
"git verify-pack" and "git index-pack" started dying outside a
repository, which has been corrected.
* ps/index-pack-outside-repo-fix:
builtin/index-pack: fix segfaults when running outside of a repo
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.
* ps/bundle-outside-repo-fix:
bundle: default to SHA1 when reading bundle headers
builtin/bundle: have unbundle check for repo before opening its bundle
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.
cf. <Zqh2T_2RLt0SeKF7@tanuki>
* jc/patch-id:
patch-id: tighten code to detect the patch header
patch-id: rewrite code that detects the beginning of a patch
patch-id: make get_one_patchid() more extensible
patch-id: call flush_current_id() only when needed
t4204: patch-id supports various input format
"git pack-redundant" has been marked for removal in Git 3.0.
* ps/declare-pack-redundamt-dead:
Documentation/BreakingChanges: announce removal of git-pack-redundant(1)
The code forgot to discard unnecessary in-core commit buffer data
for commits that "git log --skip=<number>" traversed but omitted
from the output, which has been corrected.
* jk/free-commit-buffer-of-skipped-commits:
revision: free commit buffers for skipped commits
Back when 7cc91a2f (Add the configuration option skipFetchAll,
2009-11-09) added for the sole purpose of adding skipFetchAll as a
synonym to skipDefaultUpdate, there was no explanation about the
reason why it was needed., but these two configuration variables
mean exactly the same thing.
Also, when we taught the "prefetch" task to "git maintenance" later,
we did make it pay attention to the setting, but we forgot to
document it.
Document these variables as synonyms that collectively implements
the last-one-wins semantics, and also clarify that the prefetch task
is also controlled by this variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_ADVICE environment variable was added implicitly in b79deeb554
(advice: add --no-advice global option, 2024-05-03) but was not
documented. Add documentation to show that it is an option for tools
that want to disable these messages. Make note that while the
--no-advice option exists, older Git versions will fail to parse that
option. The environment variable presents a way to change the behavior
of Git versions that understand it without disrupting older versions.
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some large repositories use tags to track a huge list of release
versions. While this choice is costly on the ref advertisement, it is
further wasteful for clients who do not need those tags. Allow clients
to optionally skip the tag advertisement.
This behavior is similar to that of 'git clone --no-tags' implemented in
0dab2468ee (clone: add a --no-tags option to clone without tags,
2017-04-26), including the modification of the remote.origin.tagOpt
config value to include "--no-tags".
One thing that is opposite of the 'git clone' implementation is that
this allows '--tags' as an assumed option, which can be naturally negated
with '--no-tags'. The clone command does not accept '--tags' but allows
"--no-no-tags" as the negation of its '--no-tags' option.
While testing this option, combine the test with the previously untested
'--no-src' option introduced in 4527db8ff8 (scalar: add --[no-]src
option, 2023-08-28).
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make our codebase compilable with the -Werror=unused-parameter
option.
* jk/unused-parameters:
CodingGuidelines: mention -Wunused-parameter and UNUSED
config.mak.dev: enable -Wunused-parameter by default
compat: mark unused parameters in win32/mingw functions
compat: disable -Wunused-parameter in win32/headless.c
compat: disable -Wunused-parameter in 3rd-party code
t-reftable-readwrite: mark unused parameter in callback function
gc: mark unused config parameter in virtual functions
Cox Communications no longer supports email and transfered accounts to
yahoo. I closed the account at yahoo since I use gmail.com.
Signed-off-by: Stephen P. Smith <ishchis2@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating sparse patterns, we open a lock_file to write out the new
data. The lock_file struct holds the file descriptor, but we call
fdopen() to get a stdio handle to do the actual write.
After we finish writing, we fflush() so that all of the data is on disk,
and then call commit_lock_file() which closes the descriptor. But we
never fclose() the stdio handle, leaking it.
The obvious solution seems like it would be to just call fclose(). But
when? If we do it before commit_lock_file(), then the lock_file code is
left thinking it owns the now-closed file descriptor, and will do an
extra close() on the descriptor. But if we do it before, we have the
opposite problem: the lock_file code will close the descriptor, and
fclose() will do the extra close().
We can handle this correctly by using fdopen_lock_file(). That leaves
ownership of the stdio handle with the lock_file, which knows not to
double-close it.
We do have to adjust the code a bit:
- we have to handle errors ourselves; we can just die(), since that's
what xfdopen() would have done (and we can even provide a more
specific error message).
- we no longer need to call fflush(); committing the lock-file
auto-closes it, which will now do the flush for us. As a bonus, this
will actually check that the flush was successful before renaming
the file into place.
- we can get rid of the local "fd" variable, since we never look at it
ourselves now
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new "sparse-checkout" file, we do the usual strategy of
writing to a lockfile and committing it into place. But we don't check
the outcome of commit_lock_file(). Failing there would prevent us from
writing a bogus file (good), but we would ignore the error and return a
successful exit code (bad).
Fix this by calling die(). Note that we need to keep the sparse_filename
variable valid for longer, since the filename stored in the lock_file
struct will be dropped when we run commit_lock_file().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_patterns_and_update(), we always need to free the pattern list
before exiting the function. Rather than handling it manually when we
return early, we can jump to an "out" label where cleanup happens. This
let us drop one line, but also establishes a pattern we can use for
other cleanup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our error reporting routines append a trailing newline, and the strings
we pass to them should not include them (otherwise we get an extra blank
line after the message).
These cases were all found by looking at the results of:
git grep -P '[^_](error|error_errno|warning|die|die_errno)\(.*\\n"[,)]' '*.c'
Note that we _do_ sometimes include a newline in the middle of such
messages, to create multiline output (hence our grep matching "," or ")"
after we see the newline, so we know we're at the end of the string).
It's possible that one or more of these cases could intentionally be
including a blank line at the end, but having looked at them all
manually, I think these are all just mistakes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).
'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.
Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.
Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
file is in the sparse index
- 'git cat-file' expands to the full index any time you request
something outside of the sparse index
Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'run_on_sparse' and 'run_on_all' functions do not work correctly for
commands accepting standard input, because they run the same command
multiple times and the first instance consumes it. This also indirectly
affects 'test_all_match' and 'test_sparse_match'.
To allow these functions to work with commands accepting standard input,
first slurp standard input to a temporary file, and then run the command
with its standard input redirected from the temporary file. This ensures
that each command sees the same contents from its standard input.
Note that this does not impact commands that do not read from standard
input; they continue to ignore it. Additionally, existing uses of the
run_on_* functions do not need to do anything differently, as the
standard input of the test environment is already connected to
/dev/null.
We do not explicitly clean up the input files because they are cleaned
up with the rest of the test repositories and their contents may be
useful for figuring out which command failed when a test case fails.
Signed-off-by: Kevin Lyles <klyles@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--auto` flag for git-pack-refs(1) allows the ref backend to decide
whether or not a repack is in order. This switch has been introduced
mostly with the "reftable" backend in mind, which already knows to
auto-compact its tables during normal operations. When the flag is set,
then it will use the same auto-compaction mechanism and thus end up
doing nothing in most cases.
The "files" backend does not have any such heuristic yet and instead
packs any loose references unconditionally. So we rewrite the complete
"packed-refs" file even if there's only a single loose reference to be
packed.
Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using
`git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is
unconditionally executed via our auto maintenance, so we end up repacking
references every single time auto maintenance kicks in. And while that
commit already mentioned that the "files" backend unconditionally packs
refs now, the author obviously didn't quite think about the consequences
thereof. So while the idea was sound, we really should have added a
heuristic to the "files" backend before implementing it.
Introduce a heuristic that decides whether or not it is worth to pack
loose references. The important factors to decide here are the number of
loose references in comparison to the overall size of the "packed-refs"
file. The bigger the "packed-refs" file, the longer it takes to rewrite
it and thus we scale up the limit of allowed loose references before we
repack.
As is the nature of heuristics, this mechansim isn't obviously
"correct", but should rather be seen as a tradeoff between how much
resources we spend packing refs and how inefficient the ref store
becomes. For all I can say, we have successfully been using the exact
same heuristic in Gitaly for several years by now.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two tests in t0601 which exercise the same underlying logic,
once via `git pack-refs --auto` and once via `git maintenance run
--auto`. Merge these two tests into one such that it becomes easier to
extend test coverage for both commands at the same time.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have an implementation of a function that computes the log2 for an
integer. While we could instead use log2(3P), that involves floating
point numbers and is thus needlessly complex and inefficient.
We're about to add a second caller that wants to compute log2 for a
`size_t`. Let's thus move the function into "wrapper.h" such that it
becomes generally available.
While at it, tweak the implementation a bit:
- The parameter is converted from `int` to `uintmax_t`. This
conversion is safe to do in "bisect.c" because we already check that
the argument is positive.
- The return value is an `unsigned`. It cannot ever be negative, so it
is pointless for it to be a signed integer.
- Loop until `!n` instead of requiring that `n > 1` and then subtract
1 from the result and add a special case for `!sz`. This helps
compilers to generate more efficient code.
Compilers recognize the pattern of this function and optimize
accordingly. On GCC 14.2 x86_64:
log2u(unsigned long):
test rdi, rdi
je .L3
bsr rax, rdi
ret
.L3:
mov eax, -1
ret
Clang 18.1 does not yet recognize the pattern, but starts to do so on
Clang trunk x86_64. The code isn't quite as efficient as the one
generated by GCC, but still manages to optimize away the loop:
log2u(unsigned long):
test rdi, rdi
je .LBB0_1
shr rdi
bsr rcx, rdi
mov eax, 127
cmovne rax, rcx
xor eax, -64
add eax, 65
ret
.LBB0_1:
mov eax, -1
ret
The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
where we end up using `clz`:
log2u(unsigned long):
clz x2, x0
cmp x0, 0
mov w1, 63
sub w0, w1, w2
csinv w0, w0, wzr, ne
ret
Note that we have a similar function `fastlog2()` in the reftable code.
As that codebase is separate from the Git codebase we do not adapt it to
use the new function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.
The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.
For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.
Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-pack-redundant(1) command is already in the process of being
phased out and dies unless the user passes the `--i-still-use-this` flag
since 4406522b76 (pack-redundant: escalate deprecation warning to an
error, 2023-03-23). We haven't heard any complaints, so let's announce
the removal of this command in Git 3.0 in our breaking changes document.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>