Commit Graph

23503 Commits

Author SHA1 Message Date
Junio C Hamano
03633a288c Merge branch 'kn/reflog-drop'
"git reflog" learns "drop" subcommand, that discards the entire
reflog data for a ref.

* kn/reflog-drop:
  reflog: implement subcommand to drop reflogs
  reflog: improve error for when reflog is not found
2025-04-15 13:50:15 -07:00
Junio C Hamano
ee847e0034 Merge branch 'ps/object-wo-the-repository'
The object layer has been updated to take an explicit repository
instance as a parameter in more code paths.

* ps/object-wo-the-repository:
  hash: stop depending on `the_repository` in `null_oid()`
  hash: fix "-Wsign-compare" warnings
  object-file: split out logic regarding hash algorithms
  delta-islands: stop depending on `the_repository`
  object-file-convert: stop depending on `the_repository`
  pack-bitmap-write: stop depending on `the_repository`
  pack-revindex: stop depending on `the_repository`
  pack-check: stop depending on `the_repository`
  environment: move access to "core.bigFileThreshold" into repo settings
  pack-write: stop depending on `the_repository` and `the_hash_algo`
  object: stop depending on `the_repository`
  csum-file: stop depending on `the_repository`
2025-04-15 13:50:15 -07:00
Junio C Hamano
f3f00d93a1 Merge branch 'md/t1403-path-is-file'
Test tweak.

* md/t1403-path-is-file:
  t1403: verify that path exists and is a file
2025-04-15 13:50:14 -07:00
Junio C Hamano
c39e5cbaa5 Merge branch 'jk/zlib-inflate-fixes'
Fix our use of zlib corner cases.

* jk/zlib-inflate-fixes:
  unpack_loose_rest(): rewrite return handling for clarity
  unpack_loose_rest(): simplify error handling
  unpack_loose_rest(): never clean up zstream
  unpack_loose_rest(): avoid numeric comparison of zlib status
  unpack_loose_header(): avoid numeric comparison of zlib status
  git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT
  unpack_loose_header(): fix infinite loop on broken zlib input
  unpack_loose_header(): report headers without NUL as "bad"
  unpack_loose_header(): simplify next_out assignment
  loose_object_info(): BUG() on inflating content with unknown type
2025-04-15 13:50:14 -07:00
Junio C Hamano
23ee5065c2 Merge branch 'tb/incremental-midx-part-2'
Incrementally updating multi-pack index files.

* tb/incremental-midx-part-2:
  midx: implement writing incremental MIDX bitmaps
  pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators
  pack-bitmap.c: keep track of each layer's type bitmaps
  ewah: implement `struct ewah_or_iterator`
  pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
  pack-bitmap.c: compute disk-usage with incremental MIDXs
  pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
  pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
  pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
  pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
  pack-bitmap.c: open and store incremental bitmap layers
  pack-revindex: prepare for incremental MIDX bitmaps
  Documentation: describe incremental MIDX bitmaps
  Documentation: remove a "future work" item from the MIDX docs
2025-04-08 11:43:14 -07:00
Junio C Hamano
6e2a3b8ae0 Merge branch 'ps/reftable-sans-compat-util'
Make the code in reftable library less reliant on the service
routines it used to borrow from Git proper, to make it easier to
use by external users of the library.

* ps/reftable-sans-compat-util:
  Makefile: skip reftable library for Coccinelle
  reftable: decouple from Git codebase by pulling in "compat/posix.h"
  git-compat-util.h: split out POSIX-emulating bits
  compat/mingw: split out POSIX-related bits
  reftable/basics: introduce `REFTABLE_UNUSED` annotation
  reftable/basics: stop using `SWAP()` macro
  reftable/stack: stop using `sleep_millisec()`
  reftable/system: introduce `reftable_rand()`
  reftable/reader: stop using `ARRAY_SIZE()` macro
  reftable/basics: provide wrappers for big endian conversion
  reftable/basics: stop using `st_mult()` in array allocators
  reftable: stop using `BUG()` in trivial cases
  reftable/record: don't `BUG()` in `reftable_record_cmp()`
  reftable/record: stop using `BUG()` in `reftable_record_init()`
  reftable/record: stop using `COPY_ARRAY()`
  reftable/blocksource: stop using `xmmap()`
  reftable/stack: stop using `write_in_full()`
  reftable/stack: stop using `read_in_full()`
2025-04-08 11:43:14 -07:00
Junio C Hamano
45e31f0bac Merge branch 'js/mingw-admins-are-special'
"Dubious ownership" checks on Windows have been tightened up.

* js/mingw-admins-are-special:
  test-tool path-utils: support debugging "dubious ownership" issues
  mingw: special-case administrators even more
2025-04-07 14:23:20 -07:00
Junio C Hamano
3bc7f869f0 Merge branch 'dm/completion-remote-names-fix'
The bash command line completion script (in contrib/) has been
updated to cope with remote repository nicknames with slashes in
them.

* dm/completion-remote-names-fix:
  completion: fix bugs with slashes in remote names
  completion: add helper to count path components
2025-04-07 14:23:19 -07:00
Junio C Hamano
58a8c38226 Merge branch 'tb/combine-cruft-below-size'
"git repack" learned "--combine-cruft-below-size" option that
controls how cruft-packs are combined.

* tb/combine-cruft-below-size:
  repack: begin combining cruft packs with `--combine-cruft-below-size`
  repack: avoid combining cruft packs with `--max-cruft-size`
  t/t7704-repack-cruft.sh: consolidate `write_blob()`
  t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
  t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
2025-04-07 14:23:18 -07:00
Junio C Hamano
68c048c84c Merge branch 'cc/lop-remote'
Bugfix in newly introduced large-object-promisor remote support.

* cc/lop-remote:
  promisor-remote: compare remote names case sensitively
  promisor-remote: fix possible issue when no URL is advertised
  promisor-remote: fix segfault when remote URL is missing
  t5710: arrange to delete the client before cloning
2025-04-07 14:23:17 -07:00
Junio C Hamano
477cc3b6c7 Merge branch 'jc/name-rev-stdin'
Using "git name-rev --stdin" as an example, improve the framework to
prepare tests to pretend to be in the future where the breaking
changes have already happened.

* jc/name-rev-stdin:
  name-rev: remove "--stdin" support
  t6120: further modernize
  t6120: avoid hiding "git" exit status
  t: introduce WITH_BREAKING_CHANGES prerequisite
  t: extend test_lazy_prereq
  t: document test_lazy_prereq
2025-04-07 14:23:17 -07:00
Junio C Hamano
ff926a6d1b Merge branch 'en/random-cleanups'
Miscellaneous code clean-ups.

* en/random-cleanups:
  merge-ort: remove extraneous word in comment
  merge-ort: fix accidental strset<->strintmap
  t7615: be more explicit about diff algorithm used
  t6423: fix a comment that accidentally reversed two commits
  stash: remove merge-recursive.h include
2025-03-29 16:39:10 +09:00
Junio C Hamano
6767149eca Merge branch 'rs/xdiff-context-length-fix'
The xdiff code on 32-bit platforms misbehaved when an insanely large
context size is given, which has been corrected.

* rs/xdiff-context-length-fix:
  xdiff: avoid arithmetic overflow in xdl_get_hunk()
2025-03-29 16:39:10 +09:00
Junio C Hamano
b9b404fa1c Merge branch 'en/diff-rename-follow-fix'
A corner-case bug in "git log --follow -B" has been fixed.

* en/diff-rename-follow-fix:
  diffcore-rename: fix BUG when break detection and --follow used together
2025-03-29 16:39:09 +09:00
Junio C Hamano
27fe152e88 Merge branch 'tb/multi-cruft-pack-refresh-fix'
Certain "cruft" objects would have never been refreshed when there
are multiple cruft packs in the repository, which has been
corrected.

* tb/multi-cruft-pack-refresh-fix:
  builtin/pack-objects.c: freshen objects from existing cruft packs
2025-03-29 16:39:09 +09:00
Junio C Hamano
650b2e2fdb Merge branch 'jk/fetch-ref-prefix-cleanup'
In protocol v2 where the refs advertisement is constrained, we try
to tell the server side not to limit the advertisement when there
is no specific need to, which has been the source of confusion and
recent bugs.  Revamp the logic to simplify.

* jk/fetch-ref-prefix-cleanup:
  fetch: use ref prefix list to skip ls-refs
  fetch: avoid ls-refs only to ask for HEAD symref update
  fetch: stop protecting additions to ref-prefix list
  fetch: ask server to advertise HEAD for config-less fetch
  refspec_ref_prefixes(): clean up refspec_item logic
  t5516: beef up exact-oid ref prefixes test
  t5516: drop NEEDSWORK about v2 reachability behavior
  t5516: prefer "oid" to "sha1" in some test titles
  t5702: fix typo in test name
2025-03-29 16:39:08 +09:00
Junio C Hamano
eb7923be1f Merge branch 'en/merge-ort-prepare-to-remove-recursive'
First step of deprecating and removing merge-recursive.

* en/merge-ort-prepare-to-remove-recursive:
  am: switch from merge_recursive_generic() to merge_ort_generic()
  merge-ort: fix merge.directoryRenames=false
  t3650: document bug when directory renames are turned off
  merge-ort: support having merge verbosity be set to 0
  merge-ort: allow rename detection to be disabled
  merge-ort: add new merge_ort_generic() function
2025-03-29 16:39:07 +09:00
Junio C Hamano
8d6413a1be Merge branch 'ps/refname-avail-check-optim'
The code paths to check whether a refname X is available (by seeing
if another ref X/Y exists, etc.) have been optimized.

* ps/refname-avail-check-optim:
  refs: reuse iterators when determining refname availability
  refs/iterator: implement seeking for files iterators
  refs/iterator: implement seeking for packed-ref iterators
  refs/iterator: implement seeking for ref-cache iterators
  refs/iterator: implement seeking for reftable iterators
  refs/iterator: implement seeking for merged iterators
  refs/iterator: provide infrastructure to re-seek iterators
  refs/iterator: separate lifecycle from iteration
  refs: stop re-verifying common prefixes for availability
  refs/files: batch refname availability checks for initial transactions
  refs/files: batch refname availability checks for normal transactions
  refs/reftable: batch refname availability checks
  refs: introduce function to batch refname availability checks
  builtin/update-ref: skip ambiguity checks when parsing object IDs
  object-name: allow skipping ambiguity checks in `get_oid()` family
  object-name: introduce `repo_get_oid_with_flags()`
2025-03-29 16:39:07 +09:00
Junio C Hamano
01d17c0530 Merge branch 'cc/signed-fast-export-import'
"git fast-export | git fast-import" learns to deal with commit and
tag objects with embedded signatures a bit better.

* cc/signed-fast-export-import:
  fast-export, fast-import: add support for signed-commits
  fast-export: do not modify memory from get_commit_buffer
  git-fast-export.adoc: clarify why 'verbatim' may not be a good idea
  fast-export: rename --signed-tags='warn' to 'warn-verbatim'
  fast-export: fix missing whitespace after switch
  git-fast-import.adoc: add missing LF in the BNF
2025-03-29 16:39:07 +09:00
Junio C Hamano
52241c96c7 Merge branch 'en/merge-process-renames-crash-fix'
The merge-recursive and merge-ort machinery crashed in corner cases
when certain renames are involved.

* en/merge-process-renames-crash-fix:
  merge-ort: fix slightly overzealous assertion for rename-to-self
  t6423: add a testcase causing a failed assertion in process_renames
2025-03-26 16:26:11 +09:00
Junio C Hamano
1a764cdbdc Merge branch 'ua/some-builtins-wo-the-repository'
A handful of built-in command implementations have been rewritten
to use the repository instance supplied by git.c:run_builtin(), its
caller.

* ua/some-builtins-wo-the-repository:
  builtin/checkout-index: stop using `the_repository`
  builtin/for-each-ref: stop using `the_repository`
  builtin/ls-files: stop using `the_repository`
  builtin/pack-refs: stop using `the_repository`
  builtin/send-pack: stop using `the_repository`
  builtin/verify-commit: stop using `the_repository`
  builtin/verify-tag: stop using `the_repository`
  config: teach repo_config to allow `repo` to be NULL
2025-03-26 16:26:10 +09:00
Junio C Hamano
def5e32bc5 Merge branch 'tb/refs-exclude-fixes'
The refname exclusion logic in the packed-ref backend has been
broken for some time, which confused upload-pack into advertising a
different set of refs.  This has been corrected.

* tb/refs-exclude-fixes:
  refs.c: stop matching non-directory prefixes in exclude patterns
  refs.c: remove empty '--exclude' patterns
2025-03-26 16:26:10 +09:00
Junio C Hamano
de35b7b3ff Merge branch 'sj/ref-consistency-checks-more'
"git fsck" becomes more careful when checking the refs.

* sj/ref-consistency-checks-more:
  builtin/fsck: add `git refs verify` child process
  packed-backend: check whether the "packed-refs" is sorted
  packed-backend: add "packed-refs" entry consistency check
  packed-backend: check whether the refname contains NUL characters
  packed-backend: add "packed-refs" header consistency check
  packed-backend: check if header starts with "# pack-refs with: "
  packed-backend: check whether the "packed-refs" is regular file
  builtin/refs: get worktrees without reading head information
  t0602: use subshell to ensure working directory unchanged
2025-03-26 16:26:10 +09:00
Junio C Hamano
f50df872a4 Merge branch 'jt/diff-pairs'
A post-processing filter for "diff --raw" output has been
introduced.

* jt/diff-pairs:
  builtin/diff-pairs: allow explicit diff queue flush
  builtin: introduce diff-pairs command
  diff: add option to skip resolving diff statuses
  diff: return diff_filepair from diff queue helpers
2025-03-26 16:26:09 +09:00
Johannes Schindelin
5bb88e89ef test-tool path-utils: support debugging "dubious ownership" issues
This adds a new sub-sub-command for `test-tool`, simply passing through
the command-line arguments to the `is_path_owned_by_current_user()`
function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-25 04:45:56 -07:00
David Mandelberg
778d2f1760 completion: fix bugs with slashes in remote names
Previously, some calls to for-each-ref passed fixed numbers of path
components to strip from refs, assuming that remote names had no slashes
in them. This made completions like:

git push github/dseomn :com<Tab>

Result in:

git push github/dseomn :dseomn/completion-remote-slash

With this patch, it instead results in:

git push github/dseomn :completion-remote-slash

Signed-off-by: David Mandelberg <david@mandelberg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-23 23:03:13 -07:00
David Mandelberg
5637bdc352 completion: add helper to count path components
A follow-up commit will use this with for-each-ref to strip the right
number of path components from refnames.
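
As a rough illustration of the counting step, here is a minimal sketch in C (the real helper lives in the bash completion script and is not shown here). The name `sketch_count_components` is hypothetical, and treating empty components (leading, trailing, or doubled slashes) as not counting is an assumption of this sketch, not something the commit states.

```c
#include <stddef.h>

/*
 * Hypothetical C analogue of a "count path components" helper.
 * Counts the non-empty '/'-separated components of `path`.
 * Assumption of this sketch: empty components are ignored.
 */
static size_t sketch_count_components(const char *path)
{
	size_t count = 0;
	int in_component = 0;

	for (; *path; path++) {
		if (*path == '/') {
			/* A slash ends the current component, if any. */
			in_component = 0;
		} else if (!in_component) {
			/* First character of a new component. */
			in_component = 1;
			count++;
		}
	}
	return count;
}
```

With this sketch, "refs/remotes/github/dseomn" counts as four components while "refs/remotes/origin" counts as three, which is why a fixed strip count cannot be right for both remotes.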

Signed-off-by: David Mandelberg <david@mandelberg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-23 23:03:12 -07:00
Taylor Blau
27afc272c4 midx: implement writing incremental MIDX bitmaps
Now that the pack-bitmap machinery has learned how to read and interact
with an incremental MIDX bitmap, teach the pack-bitmap-write.c machinery
(and relevant callers from within the MIDX machinery) to write such
bitmaps.

The details for doing so are mostly straightforward. The main changes
are as follows:

  - find_object_pos() now makes use of an extra MIDX parameter which is
    used to locate the bit positions of objects which are from previous
    layers (and thus do not exist in the current layer's pack_order
    field).

    (Note also that the pack_order field is moved into struct
    write_midx_context to further simplify the callers for
    write_midx_bitmap()).

  - bitmap_writer_build_type_index() first determines how many objects
    precede the current bitmap layer and offsets the bits it sets in
    each respective type-level bitmap by that amount so they can be OR'd
    together.
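
The offsetting idea in the second point can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a small fixed-size uncompressed bitmap instead of Git's EWAH bitmaps, and every name (`sketch_bitmap`, `sketch_set_in_layer`, `sketch_or`) is hypothetical rather than taken from pack-bitmap-write.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical uncompressed bitmap; callers must keep positions below 256. */
#define SKETCH_WORDS 4

struct sketch_bitmap {
	uint64_t words[SKETCH_WORDS];
};

static void sketch_set(struct sketch_bitmap *b, size_t pos)
{
	b->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

static int sketch_get(const struct sketch_bitmap *b, size_t pos)
{
	return (int)((b->words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * Set the bit for an object at layer-local position `local_pos`,
 * shifted by `preceding`, the number of objects in earlier layers.
 */
static void sketch_set_in_layer(struct sketch_bitmap *b, size_t preceding,
				size_t local_pos)
{
	sketch_set(b, preceding + local_pos);
}

/* OR `src` into `dst`; offset bits from different layers never collide. */
static void sketch_or(struct sketch_bitmap *dst,
		      const struct sketch_bitmap *src)
{
	size_t i;

	for (i = 0; i < SKETCH_WORDS; i++)
		dst->words[i] |= src->words[i];
}
```

Because each layer writes only into its own range of bit positions, OR-ing the per-layer type bitmaps yields one bitmap covering the whole chain.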

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 04:34:16 -07:00
Taylor Blau
484d7adcda repack: begin combining cruft packs with --combine-cruft-below-size
The previous commit changed the behavior of repack's '--max-cruft-size'
to specify a cruft pack-specific override for '--max-pack-size'.

Introduce a new flag, '--combine-cruft-below-size' which is a
replacement for the old behavior of '--max-cruft-size'. This new flag
does explicitly what it says: it combines together cruft packs which are
smaller than a given threshold, and leaves alone ones which are
larger.

This accomplishes the original intent of '--max-cruft-size', which was
to avoid repacking cruft packs larger than the given threshold.

The new behavior is slightly different. Instead of building up small
packs together until the threshold is met, '--combine-cruft-below-size'
packs up *all* cruft packs smaller than the threshold. This means that
we may make a pack much larger than the given threshold (e.g., if you
aggregate 5 packs which are each 99 MiB in size with a threshold of 100
MiB).

But that's OK: the point isn't to restrict the size of the cruft packs
we generate, it's to avoid working with ones that have already grown too
large. If repositories still want to limit the size of the generated
cruft pack(s), they may use '--max-cruft-size'.
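
To make the arithmetic concrete, here is a minimal sketch of the selection rule as described above. It is not the code from builtin/repack.c: `sketch_combine_below` and its parameters are hypothetical, and sizes are plain byte counts.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mark every cruft pack strictly smaller than
 * `threshold` as selected for combining, and return the summed size
 * of the selected packs. Packs at or above the threshold are left alone.
 */
static uint64_t sketch_combine_below(const uint64_t *sizes, size_t nr,
				     uint64_t threshold, int *selected)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		selected[i] = sizes[i] < threshold;
		if (selected[i])
			total += sizes[i];
	}
	return total;
}
```

Fed five 99 MiB packs and a 100 MiB threshold, the sketch selects all five and reports a combined size of 495 MiB, well above the threshold, which matches the example in the message.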

There's some minor test fallout as a result of the slight differences in
behavior between the old meaning of '--max-cruft-size' and the behavior
of '--combine-cruft-below-size'. In the test which is now called
"--combine-cruft-below-size combines packs", we need to use the new flag
over the old one to exercise that test's intended behavior. The
remainder of the changes there are to improve the clarity of the
comments.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 03:42:07 -07:00
Taylor Blau
0855ed966c repack: avoid combining cruft packs with --max-cruft-size
In 37dc6d8104 (builtin/repack.c: implement support for
`--max-cruft-size`, 2023-10-02), we exposed new functionality that
allowed repositories to specify the behavior of when we should combine
multiple cruft packs together.

This feature was designed to ensure that we never repacked cruft packs
which were larger than the given threshold in order to provide tighter
I/O bounds for repositories that have many unreachable objects. In
essence, specifying '--max-cruft-size=N' instructed 'repack' to
aggregate cruft packs together (in order of ascending size) until the
combined size grows past 'N', and then make a new cruft pack whose
contents include the packs we rolled up.

But this isn't quite how it works in practice. Suppose for example that
we have two cruft packs which are each 100MiB in size. One might expect
specifying "--max-cruft-size=200M" would combine these two packs
together, and then avoid repacking them until a pruning GC takes place.
In reality, 'repack' would try to aggregate these together, but would write
a pack that is strictly smaller than 200 MiB (since pack-objects'
"--max-pack-size" provides a strict bound for packs containing more than
one object).

So instead we'll write out a pack that is, say, 199 MiB in size, and
then another 1 MiB pack containing the balance. If we later repack the
repository without adding any new unreachable objects, we'll repeat the
same exercise again, making the same 199 MiB and 1 MiB packs each time.

This happens because of a poor choice to bolt the '--max-cruft-size'
functionality onto pack-objects' '--max-pack-size', forcing us to
generate packs which are always smaller than the provided threshold and
thus subject to repacking.

The following commit will introduce a new flag that implements something
similar to the behavior above. Let's prepare for that by making repack's
'--max-cruft-size' flag behave as a cruft pack-specific override for
'--max-pack-size'.

Do so by temporarily repurposing the 'collapse_small_cruft_packs()'
function to instead generate a cruft pack using the same instructions as
if we didn't specify any maximum pack size. The calling code looks
something like:

    if (args->max_pack_size && !cruft_expiration) {
        collapse_small_cruft_packs(in, args->max_pack_size, existing);
    } else {
        for_each_string_list_item(item, &existing->non_kept_packs)
            fprintf(in, "-%s.pack\n", item->string);
        for_each_string_list_item(item, &existing->cruft_packs)
            fprintf(in, "-%s.pack\n", item->string);
    }

This patch makes collapse_small_cruft_packs() behave identically to the
'else' arm of the conditional above. This repurposing of
'collapse_small_cruft_packs()' is intentional, since it will set us up
nicely to introduce the new behavior in the following commit.

Naturally, there is some test fallout in the test which exercises the
old meaning of '--max-cruft-size'. Mark that test as failing for now to
be dealt with in the following commit. Likewise, add a new test which
explicitly tests the behavior of '--max-cruft-size' to place a hard
limit on the size of any generated cruft pack(s).

Note that this is a breaking change, as it alters the user-visible
behavior of '--max-cruft-size'. But I'm OK changing this behavior in
this instance, since the behavior wasn't accurate to begin with.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 03:42:07 -07:00
Taylor Blau
7fb12bb27e t/t7704-repack-cruft.sh: consolidate write_blob()
A previous commit moved a handful of tests from a different script into
t7704, including one that relies on generating random blobs.

Incidentally, the original home of this test defined its own helper
"write_blob" for doing so, which is identical in function to our
"generate_random_blob" (and is slightly inferior to the latter, which
cleans up after itself).

Rewrite the test that uses "write_blob" to no longer do so and then
remove the function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 03:42:06 -07:00
Taylor Blau
1b01b03e52 t/t7704-repack-cruft.sh: clarify wording in --max-cruft-size tests
Now that a number of new tests have landed in t7704, make sure that they
all make sense and are testing the things they say they are.

Things are mostly OK, but a handful of tests needed tweaks. Those tweaks
are as follows:

  - Use the terms "too large" or "too small" in tests that exercise the
    '--max-cruft-size' behavior. This has historically been treated as a
    threshold beneath which to combine cruft packs, but that will change
    in a subsequent commit. Prepare for that by using a more generic
    term.

  - Remove references to "--max-cruft-size" in the freshening tests.
    These tests provide coverage of our ability to record updated mtimes
    for objects already in cruft packs whose mtimes are upserted from
    various sources (loose objects, finding that object in a new pack,
    another cruft pack, etc.).

    These have nothing to do with the '--max-cruft-size' feature, and in
    fact none of the tests even *use* '--max-cruft-size'. Name them
    appropriately to make it clear that these tests exercise freshening
    behavior, not '--max-cruft-size' behavior.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 03:42:06 -07:00
Taylor Blau
cee95f2670 t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests
The cruft pack feature has two primary test scripts which exercise
various parts of it, which are:

  - t5329-pack-objects-cruft.sh
  - t7704-repack-cruft.sh

The former is designed to test low-level pack generation mechanics at
the 'git pack-objects --cruft'-level, which is plumbing. The latter, on
the other hand, is designed to test the user-facing behavior through
'git repack --cruft', which is porcelain (under the "ancillary
manipulators" sub-section).

At some point a handful of tests which should have been added to the
latter script were instead written to the former. This isn't a huge
deal, but rectifying it is straightforward. Move a handful of
'repack'-related tests out of t5329 and into their rightful home in
t7704.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21 03:42:05 -07:00
Christian Couder
b059339bb3 promisor-remote: fix segfault when remote URL is missing
Using strvec_push() to push `NULL` into a 'strvec' results in a
segfault, because `xstrdup(NULL)` crashes.

So when a URL is missing from the config, let's not push the remote
name and URL into the 'strvec's.

While at it, let's also not push them in case the URL is empty. It's
just not worth the trouble and it's consistent with how Git otherwise
treats missing and empty URLs in the same way.
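
The guard described above can be sketched roughly like this. The sketch does not use Git's strvec API: `sketch_vec`, `sketch_vec_push` and `sketch_record_remote` are hypothetical stand-ins written so the example compiles on its own, and error handling is deliberately simple.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strvec: a growing array of owned strings. */
struct sketch_vec {
	char **v;
	size_t nr, alloc;
};

/* Copies `s` into the vector; `s` must not be NULL (strlen(NULL) is undefined). */
static int sketch_vec_push(struct sketch_vec *vec, const char *s)
{
	char *copy;

	if (vec->nr == vec->alloc) {
		size_t new_alloc = vec->alloc ? 2 * vec->alloc : 4;
		char **grown = realloc(vec->v, new_alloc * sizeof(*grown));
		if (!grown)
			return -1;
		vec->v = grown;
		vec->alloc = new_alloc;
	}
	copy = malloc(strlen(s) + 1);
	if (!copy)
		return -1;
	strcpy(copy, s);
	vec->v[vec->nr++] = copy;
	return 0;
}

/*
 * Record a remote only when it has a non-empty URL.
 * Returns 1 if recorded, 0 if skipped, -1 on allocation failure.
 */
static int sketch_record_remote(struct sketch_vec *names,
				struct sketch_vec *urls,
				const char *name, const char *url)
{
	if (!url || !*url)
		return 0; /* missing and empty URLs are treated alike */
	if (sketch_vec_push(names, name) < 0 ||
	    sketch_vec_push(urls, url) < 0)
		return -1;
	return 1;
}
```

Checking the URL before either push keeps the two vectors the same length, so a name is never recorded without its URL.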

Note that in case of missing or empty URL, Git uses the remote name to
fetch, which can work if the remote is on the same filesystem. So
configurations where the client, server and remote are all on the same
filesystem may need URLs to be configured even if they are the same as
the remote names. But this is a rare case, and the workaround is easy
enough.

We leave improving the strvec API and/or xstrdup() for a future
separate effort.

While at it, let's also use git_config_get_string_tmp() instead of
git_config_get_string() to simplify memory management.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 12:22:33 -07:00
Christian Couder
9e05fbe61b t5710: arrange to delete the client before cloning
If `test_when_finished "rm -rf client"` is run after we clone, it
will not run if the clone failed, so the "client" directory might
not be removed at the end of the test.

`git clone` does try to remove the directory when it fails, but
let's be safe and try to protect against possibly weird clone
failures by moving `test_when_finished "rm -rf client"` before
the clone. It just makes more sense this way around.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 12:22:33 -07:00
Elijah Newren
947e219fb6 am: switch from merge_recursive_generic() to merge_ort_generic()
Switch from merge-recursive to merge-ort.  Adjust the following
testcases due to the switch:

* t4151: This test left an untracked file in the way of the merge.
  merge-recursive could only sometimes tell when untracked files were
  in the way, and by the time it discovers others, it has already made
  too many changes to back out of the merge.  So, instead of writing the
  results to e.g. 'file1' it would instead write them to
  'file1~branch1'.  This is confusing for users, because they might not
  notice 'file1~branch1' and accidentally add and commit 'file1'.
  In contrast, merge-ort correctly notices the file in the way before
  making any changes and aborts.  Since this test didn't care about the
  file in the way, just remove it before calling git-am.

* t4255: Usage of merge-ort allows us to change two known failures into
  successes.

* t6427: As noted a few commits ago, the choice of conflict label for
  diff3 markers for the ancestor commit was previously handled by
  merge-recursive.c rather than by callers.  Since that has now changed,
  `git am` needs to specify that label.  Although the previous conflict
  label ("constructed merge base") was already somewhat slanted
  towards `git am`, let's use wording more along the lines of the
  related command-line flag from `git apply` and function involved to
  tie it more closely to `git am`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 09:49:08 -07:00
Elijah Newren
a16e8efe5c merge-ort: fix merge.directoryRenames=false
There are two issues here.

First, when merge.directoryRenames is set to false, there are a few code
paths that should be turned off.  I missed one; collect_renames() was
still doing some directory rename detection logic unconditionally.  It
ended up not having much effect because
get_provisional_directory_renames() was skipped earlier and not setting
up renames->dir_renames, but the code should still be skipped.

Second, the larger issue is that sometimes we get a cached_pair rename
from a previous commit being replayed mapping A->B, but in a subsequent
commit collect_merge_info() doesn't even recurse into the
directory containing B because there are no source pairings for that
rename that are relevant; we can merge that commit fine without knowing
the rename.  But since the cached renames are added to the normal
renames, when we go to process it and find that B is not part of
opt->priv->paths, we hit the assertion error
  process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
I think we could fix this at the beginning of detect_regular_renames() by
pruning from cached_pairs any entry whose destination isn't in
opt->priv->paths, but it's suboptimal in that we'd kind of like the
cached_pair to be restored afterwards so that it can help the subsequent
commit, but more importantly since it sits at the intersection of
the caching renames optimization and the relevant renames optimization,
and the trivial directory resolution optimization, and I don't currently
have Documentation/technical/remembering-renames.txt fully paged in, I'm
not sure if that's a full solution or a bandaid for the current
testcase.  However, since the remembering renames optimization was the
weakest of the set, and the optimization is far less important when
directory rename detection is off (as that implies far fewer potential
renames), let's just use a bigger hammer to ensure this special case is
fixed: turn off the rename caching.  We do the same thing already when
we encounter rename/rename(1to1) cases (as per `git grep -3
disabling.the.optimization`, though it uses a slightly different
triggering mechanism since it's trying to affect the next time that
merge_check_renames_reusable() is called), and I think it makes sense
to do the same here.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 09:49:04 -07:00
Johannes Schindelin
a9185cc89b t3650: document bug when directory renames are turned off
There is a bug in the way renames are cached that rears its head when
`merge.directoryRenames` is set to false; it results in the following
message:

    merge-ort.c:3002: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
    Aborted

It is quite a curious bug: the same test case will succeed, without any
assertion, if instead run with `merge.directoryRenames=true`.

Further, the assertion does not manifest while replaying the first
commit, it manifests while replaying the _second_ commit of the commit
range. But it does _not_ manifest when the second commit is replayed
individually.

This would indicate that there is an incomplete rename cache left-over
from the first replayed commit which is being reused for the second
commit, and if directory rename detection is enabled, the missing paths
are somehow regenerated.

Incidentally, the same bug can be triggered by modifying t6429 to switch
from merge.directoryRenames=true to merge.directoryRenames=false.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: tweaked the commit message slightly, including adjusting the
 line number of the assertion to the latest version, and the much
 later discovery that a simple t6429 tweak would also display the
 issue.]
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 09:48:57 -07:00
Elijah Newren
a707d4f941 merge-ort: allow rename detection to be disabled
When merge-ort was written, I did not at first allow rename detection to
be disabled, because I suspected that most folks disabling rename
detection were doing so solely for performance reasons.  Since I put a
lot of work into providing dramatic speedups for rename detection
performance as used by the merge machinery, I wanted to know if there
were still real world repositories where rename detection was
problematic from a performance perspective.  We have had years now to
collect such information, and while we never received any such report,
waiting longer with the option disabled seems unlikely to help surface
such issues at this point.  Also, there has been at least one request to
allow rename detection to be disabled for behavioral rather than
performance reasons (see the thread including
https://lore.kernel.org/git/CABPp-BG-Nx6SCxxkGXn_Fwd2wseifMFND8eddvWxiZVZk0zRaA@mail.gmail.com/
), so let's start heeding the config and command line settings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18 09:48:47 -07:00
Junio C Hamano
77f32ba430 Merge branch 'tb/multi-cruft-pack-refresh-fix' into tb/combine-cruft-below-size
* tb/multi-cruft-pack-refresh-fix:
  builtin/pack-objects.c: freshen objects from existing cruft packs
2025-03-17 17:00:38 -07:00
Karthik Nayak
d1270689a1 reflog: implement subcommand to drop reflogs
While 'git-reflog(1)' currently allows users to expire reflogs and
delete individual entries, it lacks functionality to completely remove
reflogs for specific references. This becomes problematic in
repositories where reflogs are not needed but continue to accumulate
entries despite setting 'core.logAllRefUpdates=false'.

Add a new 'drop' subcommand to git-reflog that allows users to delete
the entire reflog for a specified reference. Include an '--all' flag to
enable dropping all reflogs from all worktrees and an add-on flag
'--single-worktree' to drop only the reflogs of the current worktree.

While here, remove an extraneous newline in the file.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17 16:58:11 -07:00
Karthik Nayak
52f2dfb084 reflog: improve error for when reflog is not found
The 'git reflog expire' command prints the error message '<ref> points
nowhere!' when used with a non-existent ref. This message is confusing
and vague. Modify the message to be clearer and more direct.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17 16:58:11 -07:00
Elijah Newren
a373f93370 t7615: be more explicit about diff algorithm used
t7615 is entirely about testing the differences between diff
algorithms, but it doesn't specify any diff algorithm when it
is testing myers.  Given that we have discussed potentially switching
defaults (https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/), it
makes sense in tests that are about different diff algorithms to be
explicit about which one is intended to be used in each test.  Add
that specificity.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17 15:39:03 -07:00
Elijah Newren
9c69ad275e t6423: fix a comment that accidentally reversed two commits
The comment describing testcase 13b of t6423 somehow mixed up commits
A and B in one paragraph.  Fix the references.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17 15:39:03 -07:00
Elijah Newren
554051d691 diffcore-rename: fix BUG when break detection and --follow used together
Prior to commit 9db2ac5616 (diffcore-rename: accelerate rename_dst
setup, 2020-12-11), the function add_rename_dst() resulted in quadratic
runtime since each call inserted the new entry into the array in sorted
order.  The reason for the sorted order requirement was so that
locate_rename_dst(), used when break detection is turned on, could find
the appropriate entry in logarithmic time via bisection on string
comparisons.  (It's better to be quadratic in moving pointers than
quadratic in string comparisons, so this made some sense.)  However,
since break detection always sticks the broken pairs adjacent to each
other, that commit decided to simply append entries to rename_dst, and
record the mapping of (filename) -> (index within rename_dst) via a
strintmap.  Doing this relied on the fact that when adding the source of
a broken pair via register_rename_src(), the next item we'd process was
the other half of the same broken pair and would be added to
rename_dst via add_rename_dst().  This assumption was fine under break
detection alone, but the combination of break detection and
single_follow violated that assumption because of this code:

		else if (options->single_follow &&
			 strcmp(options->single_follow, p->two->path))
			continue; /* not interested */

which would end up skipping calling add_rename_dst() below that point.
Since I knew I was assuming that the dst pair of a break would always be
added right after the src pair of a break, I added a new BUG() directive
as part of that commit later on at time of use that would check my
assumptions held.  That BUG() didn't trip for nearly 4 years...which
sadly meant I had long since forgotten the related details.  Anyway...

When the dst half of a broken pair is skipped like this, it means that
not only could my recorded index be invalid (just past the end of the
array), it could also point to some unrelated dst that just happened to
be the next one added to the array.  So, to fix this, we need to add a
little more safety around the checks for the recorded break_idx.

It turns out that making a testcase to trigger this is quite the
challenge.  I actually added two testcases:
  * One testcase which uses --follow incorrectly (it uses its single
    pathspec to specify something other than a single filename), and
    which triggers the same bug reported by Olaf.  This triggers a
    special case within locate_rename_dst() where idx evaluates to 0
    and rename_dst is NULL, meaning that our return value of
    &rename_dst[idx] happens to evaluate to NULL as well.  This
    addressing of an index into a NULL array hints at deeper problems,
    which are raised in the next testcase...
  * A second testcase which when run under valgrind shows that the code
    actually depends upon uninitialized memory, in particular the entry
    just after the end of the rename_dst array.

In short, when the two rare options -B and --follow are used together,
fix the accidental find of the wrong dst entry (which would often be
uninitialized memory just past the end of the array, but also could
have just been a dst for an unrelated path if no dst was recorded for
the expected path).  Do so by adding a little more care around checking
the recorded indices in break_idx.
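
The two failure modes named above (an index just past the end of the
array, and an index that lands on an unrelated dst) can be illustrated
with a minimal sketch.  This is not the code from the patch: `struct
dst_entry` and `checked_dst_lookup()` are hypothetical names invented for
illustration, and the real fix may check the recorded index differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a rename destination entry. */
struct dst_entry {
	const char *path;
};

/*
 * Return the entry recorded at idx only if idx lies inside the array
 * and the entry there really belongs to the expected path.  Otherwise
 * return NULL so the caller can treat the destination as missing.
 */
struct dst_entry *checked_dst_lookup(struct dst_entry *dst, size_t nr,
				     size_t idx, const char *path)
{
	assert(path);

	if (!dst || idx >= nr)
		return NULL; /* recorded index points past the end */
	if (strcmp(dst[idx].path, path))
		return NULL; /* slot was filled by an unrelated path */
	return &dst[idx];
}
```

Returning NULL in both cases lets a caller handle "no dst was recorded"
in one place instead of trusting a stale index.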

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-14 18:43:28 -07:00
René Scharfe
d39e28e68c xdiff: avoid arithmetic overflow in xdl_get_hunk()
xdl_get_hunk() calculates the maximum number of common lines between two
changes that would fit into the same hunk for the given context options.
It involves doubling and addition and thus can overflow if the terms are
huge.

The type of ctxlen and interhunkctxlen in xdemitconf_t is long, while
the type of the corresponding context and interhunkcontext in struct
diff_options is int.  On many platforms longs are bigger than ints,
which prevents the overflow.  On Windows they have the same range and
the overflow manifests as hunks that are split erroneously and lines
being repeated between them.

Fix the overflow by checking and not going beyond LONG_MAX.  This allows
specifying a huge context line count and getting all lines of a changed
file in a single hunk, as expected.
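
As a rough illustration of "checking and not going beyond LONG_MAX",
the sketch below clamps a doubling followed by an addition.  It assumes
the quantity has the form 2 * ctxlen + interhunkctxlen with non-negative
inputs; the helper name `saturating_max_common()` is hypothetical and
the actual patch may be structured differently.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: compute 2 * ctxlen + interhunkctxlen, clamping
 * the result to LONG_MAX instead of overflowing.  Both inputs are
 * assumed to be non-negative.
 */
long saturating_max_common(long ctxlen, long interhunkctxlen)
{
	long doubled;

	assert(ctxlen >= 0 && interhunkctxlen >= 0);

	/* Doubling overflows once ctxlen exceeds LONG_MAX / 2. */
	if (ctxlen > LONG_MAX / 2)
		return LONG_MAX;
	doubled = 2 * ctxlen;

	/* The addition overflows if the remaining headroom is too small. */
	if (interhunkctxlen > LONG_MAX - doubled)
		return LONG_MAX;
	return doubled + interhunkctxlen;
}
```

Both checks compare against the limit before doing the arithmetic, so
no signed overflow ever takes place.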

Reported-by: Jason Cho <jason11choca@proton.me>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-14 16:19:40 -07:00
Taylor Blau
08f612ba70 builtin/pack-objects.c: freshen objects from existing cruft packs
Once an object is written into a cruft pack, we can only freshen it by
writing a new loose or packed copy of that object with a more recent
mtime.

Prior to 61568efa95 (builtin/pack-objects.c: support `--max-pack-size`
with `--cruft`, 2023-08-28), we typically had at most one cruft pack in
a repository at any given time. So freshening unreachable objects was
straightforward when already rewriting the cruft pack (and its *.mtimes
file).

But 61568efa95 changes things: 'pack-objects' now supports writing
multiple cruft packs when invoked with `--cruft` and the
`--max-pack-size` flag. Cruft packs are rewritten until they reach some
size threshold, at which point they are considered "frozen", and will
only be modified in a pruning GC, or if the threshold itself is
adjusted.

Prior to this patch, however, this process breaks down when we attempt
to freshen an object packed in an earlier cruft pack, and that cruft
pack is larger than the threshold and thus will survive the repack.

When this is the case, it is impossible to freshen objects in cruft
pack(s) when those cruft packs are larger than the threshold. This is
because we would avoid writing them in the new cruft pack entirely, for
a couple of reasons.

 1. When enumerating packed objects via 'add_objects_in_unpacked_packs()'
    we pass the SKIP_IN_CORE_KEPT_PACKS flag, which is used to avoid
    looping over the packs we're going to retain (which are marked as
    kept in-core by 'read_cruft_objects()').

    This means that we will avoid enumerating additional packed copies
    of objects found in any cruft packs which are larger than the given
    size threshold. Thus there is no opportunity to call
    'create_object_entry()' whatsoever.

 2. We likewise will discard the loose copy (if one exists) of any
    unreachable object packed in a cruft pack that is larger than the
    threshold. Here our call path is 'add_unreachable_loose_objects()',
    which uses the 'add_loose_object()' callback.

    That function will eventually land us in 'want_object_in_pack()'
    (via 'add_cruft_object_entry()'), and we'll discard the object as it
    appears in one of the packs which we marked as kept in-core.

This means in effect that it is impossible to freshen an unreachable
object once it appears in a cruft pack larger than the given threshold.

Instead, we should pack an additional copy of an unreachable object we
want to freshen even if it appears in a cruft pack, provided that the
cruft copy has an mtime which is before the mtime of the copy we are
trying to pack/freshen. This is sub-optimal in the sense that it
requires keeping an additional copy of unreachable objects upon
freshening, but we don't have a better alternative without the ability
to make in-place modifications to existing *.mtimes files.

In order to implement this, we have to adjust the behavior of
'want_found_object()'. When 'pack-objects' is told that we're *not*
going to retain any cruft packs (i.e. the set of packs marked as kept
in-core does not contain a cruft pack), the behavior is unchanged.

But when there *is* at least one cruft pack that we're holding onto, it
is no longer sufficient to reject a copy of an object found in that
cruft pack for that reason alone. In this case, we only want to reject a
candidate object when a copy of that object either:

 - exists in a non-cruft pack that we are retaining, regardless of that
   pack's mtime, or

 - exists in a cruft pack with an mtime at least as recent as the copy
   we are debating whether or not to pack, in which case freshening
   would be redundant.

To do this, keep track of whether or not we have any cruft packs in our
in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft'
flag. When we end up in this new special case, we replace a call to
'has_object_kept_pack()' with 'want_cruft_object_mtime()', and only reject
objects when we have a copy in an existing cruft pack with at least as
recent an mtime as our candidate (in which case "freshening" would be
redundant).
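
The rejection rule described above can be sketched as a small decision
function.  This is a simplified model, not the pack-objects code: it
considers a single existing copy, and `struct existing_copy` and
`want_candidate()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical description of where an existing copy of an object lives. */
struct existing_copy {
	int in_kept_pack;  /* the pack is being retained (kept in-core) */
	int in_cruft_pack; /* the retained pack is a cruft pack */
	time_t mtime;      /* mtime recorded for the object in that pack */
};

/*
 * Return 1 if the candidate copy should be packed, 0 if it should be
 * rejected because a retained copy already makes it redundant.
 */
int want_candidate(const struct existing_copy *copy, time_t candidate_mtime)
{
	assert(copy);

	if (!copy->in_kept_pack)
		return 1; /* no retained copy to reject on */
	if (!copy->in_cruft_pack)
		return 0; /* retained non-cruft copy: reject regardless of mtime */
	/* Retained cruft copy: pack only if the candidate is strictly newer. */
	return copy->mtime < candidate_mtime;
}
```

The strict comparison mirrors the "at least as recent" wording: a cruft
copy with an equal mtime already makes freshening redundant.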

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-13 11:48:04 -07:00
Junio C Hamano
870c74987b Merge branch 'tc/zlib-ng-fix'
"git version --build-options" stopped showing zlib version by
mistake due to recent refactoring, which has been corrected.

* tc/zlib-ng-fix:
  help: print zlib-ng version number
  help: include git-zlib.h to print zlib version
2025-03-12 12:06:58 -07:00
Patrick Steinhardt
cec2b6f55a refs/iterator: separate lifecycle from iteration
The ref and reflog iterators have their lifecycle attached to iteration:
once the iterator reaches its end, it is automatically released and the
caller doesn't have to care about that anymore. When the iterator should
be released before it has been exhausted, callers must explicitly abort
the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two
problems:

  - Callsites need to be very careful about when exactly they call
    `ref_iterator_abort()`, as calling the function is only valid when
    the iterator itself still is. This leads to somewhat awkward calling
    patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different
    prefix. This feature isn't supported by any iterator implementation
    except for the reftable iterators anyway, but if it was implemented
    it would allow us to optimize cases where we need to search for
    specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the
iterator anymore once it is exhausted. Instead, callers are now expected
to always call a newly introduced `ref_iterator_free()` function that
deallocates the iterator and its internal state.
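
To make the lifecycle change concrete, here is a toy iterator whose
exhaustion does not release it, so the caller always frees it
explicitly.  It is only an analogy: `toy_iterator` and its functions are
hypothetical and do not reflect the `ref_iterator` interface, and the
seek function merely illustrates the kind of reuse that the decoupling
makes possible.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy iterator over the integers 0..limit-1; not Git's ref_iterator. */
struct toy_iterator {
	int next;
	int limit;
};

struct toy_iterator *toy_iterator_new(int limit)
{
	struct toy_iterator *it = malloc(sizeof(*it));

	assert(it);
	it->next = 0;
	it->limit = limit;
	return it;
}

/*
 * Store the next value in *out and return 1, or return 0 once the
 * iterator is exhausted.  Exhaustion does not free the iterator.
 */
int toy_iterator_advance(struct toy_iterator *it, int *out)
{
	if (it->next >= it->limit)
		return 0;
	*out = it->next++;
	return 1;
}

/* Reposition an iterator, including an exhausted one, for reuse. */
void toy_iterator_seek(struct toy_iterator *it, int start)
{
	it->next = start;
}

/* The caller always releases the iterator, exhausted or not. */
void toy_iterator_free(struct toy_iterator *it)
{
	free(it);
}
```

Because the iterator stays valid after exhaustion, a caller has one
unconditional cleanup path and does not need to track whether the
iterator has already been released.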

Note that the `dir_iterator` is somewhat special because it does not
implement the `ref_iterator` interface, but is only used to implement
other iterators. Consequently, we have to provide `dir_iterator_free()`
instead of `dir_iterator_release()` as the allocated structure itself is
managed by the `dir_iterator` interfaces, as well, and not freed by
`ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which
wasn't really required by any of the iterator implementations anyway.
Furthermore, stop calling `base_ref_iterator_free()` in any of the
backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12 11:31:18 -07:00
Junio C Hamano
de3dec1187 name-rev: remove "--stdin" support
As part of Git 3.0, remove the hidden synonym for "--annotate-stdin"
for real.  As this does not change the fact that it used to be
called "--stdin" in older versions of Git, keep that passage in the
documentation for "--annotate-stdin".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-12 08:48:54 -07:00