Commit Graph

12575 Commits

Author SHA1 Message Date
Junio C Hamano
5d55ad01f5 Merge branch 'tb/fetch-follow-tags-fix'
* tb/fetch-follow-tags-fix:
  fetch: fix following tags when fetching specific OID
2025-03-10 08:45:58 -07:00
Taylor Blau
bd52d9a058 fetch: fix following tags when fetching specific OID
In 3f763ddf28 (fetch: set remote/HEAD if it does not exist, 2024-11-22),
unconditionally adds "HEAD" to the list of ref prefixes we send to the
server.

This breaks a core assumption that the list of prefixes we send to the
server is complete. We must either send all prefixes we care about, or
none at all (in the latter case the server then advertises everything).

The tag following code is careful to only add "refs/tags/" to the list
of prefixes if there are already entries in the prefix list. But because
the new code from 3f763ddf28 runs after the tag code, and because it
unconditionally adds to the prefix list, we may end up with a prefix
list that _should_ have "refs/tags/" in it, but doesn't.

When that is the case, the server does not advertise any tags, and our
auto-following breaks because we never learned about any tags in the
first place.

Fix this by only adding "HEAD" to the ref prefixes when we know that we
are already limiting the advertisement. In either case we'll learn about
HEAD (either through the limited advertisement, or implicitly through a
full advertisement).

Reported-by: Igor Todorovski <itodorov@ca.ibm.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-07 16:15:18 -08:00
Junio C Hamano
cdf458c60e Merge branch 'kn/ref-migrate-skip-reflog'
Usage string of "git refs" has been corrected.

* kn/ref-migrate-skip-reflog:
  refs: show --no-reflog in the help text
2025-03-05 10:37:45 -08:00
Junio C Hamano
feffb34257 Merge branch 'ps/path-sans-the-repository'
The path.[ch] API takes an explicit repository parameter passed
throughout the callchain, instead of relying on the_repository
singleton instance.

* ps/path-sans-the-repository:
  path: adjust last remaining users of `the_repository`
  environment: move access to "core.sharedRepository" into repo settings
  environment: move access to "core.hooksPath" into repo settings
  repo-settings: introduce function to clear struct
  path: drop `git_path()` in favor of `repo_git_path()`
  rerere: let `rerere_path()` write paths into a caller-provided buffer
  path: drop `git_common_path()` in favor of `repo_common_path()`
  worktree: return allocated string from `get_worktree_git_dir()`
  path: drop `git_path_buf()` in favor of `repo_git_path_replace()`
  path: drop `git_pathdup()` in favor of `repo_git_path()`
  path: drop unused `strbuf_git_path()` function
  path: refactor `repo_submodule_path()` family of functions
  submodule: refactor `submodule_to_gitdir()` to accept a repo
  path: refactor `repo_worktree_path()` family of functions
  path: refactor `repo_git_path()` family of functions
  path: refactor `repo_common_path()` family of functions
2025-03-05 10:37:43 -08:00
Junio C Hamano
6dff5de1da refs: show --no-reflog in the help text
We forgot that we must keep the documentation and help text in sync.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 14:51:29 -08:00
Patrick Steinhardt
028f618658 path: adjust last remaining users of the_repository
With the preceding refactorings we now only have a couple of implicit
users of `the_repository` left in the "path" subsystem, all of which
depend on global state via `calc_shared_perm()`. Make the dependency on
`the_repository` explicit by passing the repo as a parameter instead and
adjust callers accordingly.

Note that this change bubbles up into a couple of subsystems that were
previously declared as free from `the_repository`. Instead of marking
all of them as `the_repository`-dependent again, we instead use the
repository that is available in the calling context. There are three
exceptions though with "copy.c", "pack-write.c" and "tempfile.c".
Adjusting these would require us to adapt callsites all over the place,
so this is left for a future iteration.

Mark "path.c" as free from `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Patrick Steinhardt
f1ce861c34 environment: move access to "core.sharedRepository" into repo settings
Similar as with the preceding commit, we track "core.sharedRepository"
via a pair of global variables. Move them into `struct repo_settings` so
that we can instead track them per-repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Patrick Steinhardt
88dd321cfe path: drop git_path() in favor of repo_git_path()
Remove `git_path()` in favor of the `repo_git_path()` family of
functions, which makes the implicit dependency on `the_repository` go
away.

Note that `git_path()` returned a string allocated via `get_pathname()`,
which uses a rotating set of statically allocated buffers. Consequently,
callers didn't have to free the returned string. The same isn't true for
`repo_common_path()`, so we also have to add logic to free the returned
strings.

This refactoring also allows us to remove `repo_common_pathv()` as well
as `get_pathname()` from the public interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Patrick Steinhardt
8ee018d863 rerere: let rerere_path() write paths into a caller-provided buffer
Same as with `get_worktree_git_dir()` a couple of commits ago, the
`rerere_path()` function returns paths that need not be free'd by the
caller because `git_path()` internally uses `get_pathname()`.

Refactor the function to instead accept a caller-provided buffer that
the path will be written into, passing on ownership to the caller. This
refactoring prepares us for the removal of `git_path()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28 13:54:11 -08:00
Junio C Hamano
3c0f4abaf5 Merge branch 'kn/ref-migrate-skip-reflog'
"git refs migrate" can optionally be told not to migrate the reflog.

* kn/ref-migrate-skip-reflog:
  builtin/refs: add '--no-reflog' flag to drop reflogs
2025-02-27 15:23:00 -08:00
Junio C Hamano
9d8cce051a Merge branch 'ua/os-version-capability'
The value of "uname -s" is by default sent over the wire as a part
of the "version" capability.

* ua/os-version-capability:
  agent: advertise OS name via agent capability
  t5701: add setup test to remove side-effect dependency
  version: extend get_uname_info() to hide system details
  version: refactor get_uname_info()
  version: refactor redact_non_printables()
  version: replace manual ASCII checks with isprint() for clarity
2025-02-27 15:23:00 -08:00
Junio C Hamano
e24570b0a3 Merge branch 'jk/check-mailmap-wo-name-fix'
"git check-mailmap" segfault fix.

* jk/check-mailmap-wo-name-fix:
  mailmap: fix check-mailmap with full mailmap line
2025-02-26 08:51:00 -08:00
Junio C Hamano
9b07c152df Merge branch 'pw/merge-tree-stdin-deadlock-fix'
"git merge-tree --stdin" has been improved (including a workaround
for a deadlock).

* pw/merge-tree-stdin-deadlock-fix:
  merge-tree: fix link formatting in html docs
  merge-tree: improve docs for --stdin
  merge-tree: only use basic merge config
  merge-tree: remove redundant code
  merge-tree --stdin: flush stdout to avoid deadlock
2025-02-25 14:19:36 -08:00
Jacob Keller
bb60c52131 mailmap: fix check-mailmap with full mailmap line
I recently had reported to me a crash from a coworker using the recently
added sendemail mailmap support:

  3724814 Segmentation fault      (core dumped) git check-mailmap "bugs@company.xx"

This appears to happen because of the NULL pointer name passed into
map_user(). Fix this by passing "" instead of NULL so that we have a
valid pointer.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 18:27:16 -08:00
Junio C Hamano
ee8020ff40 Merge branch 'ua/update-server-info-sans-the-repository'
Code clean-up.

* ua/update-server-info-sans-the-repository:
  builtin/update-server-info: remove the_repository global variable
2025-02-21 10:35:53 -08:00
Karthik Nayak
89be7d2774 builtin/refs: add '--no-reflog' flag to drop reflogs
The "git refs migrate" subcommand converts the backend used for ref
storage. It always migrates reflog data as well as refs. Introduce an
option to exclude reflogs from migration, allowing them to be discarded
when they are unnecessary.

This is particularly useful in server-side repositories, where reflogs
are typically not expected. However, some repositories may still have
them due to historical reasons, such as bugs, misconfigurations, or
administrative decisions to enable reflogs for debugging. In such
repositories, it would be optimal to drop reflogs during the migration.

To address this, introduce the '--no-reflog' flag, which prevents reflog
migration. When this flag is used, reflogs from the original reference
backend are migrated. Since only the new reference backend remains in
the repository, all previous reflogs are permanently discarded.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-21 09:55:02 -08:00
Junio C Hamano
716b00e6e9 Merge branch 'da/difftool-sans-the-repository'
"git difftool" code clean-up.

* da/difftool-sans-the-repository:
  difftool: eliminate use of USE_THE_REPOSITORY_VARIABLE
  difftool: eliminate use of the_repository
  difftool: eliminate use of global variables
2025-02-18 15:30:32 -08:00
Junio C Hamano
7722b997c6 Merge branch 'jt/rev-list-missing-print-info'
"git rev-list --missing=" learned to accept "print-info" that gives
known details expected of the missing objects, like path and type.

* jt/rev-list-missing-print-info:
  rev-list: extend print-info to print missing object type
  rev-list: add print-info action to print missing object path
2025-02-18 15:30:32 -08:00
Junio C Hamano
e565f37553 Merge branch 'ds/backfill'
Lazy-loading missing files in a blobless clone on demand is costly
as it tends to be one-blob-at-a-time.  "git backfill" is introduced
to help bulk-download necessary files beforehand.

* ds/backfill:
  backfill: assume --sparse when sparse-checkout is enabled
  backfill: add --sparse option
  backfill: add --min-batch-size=<n> option
  backfill: basic functionality and tests
  backfill: add builtin boilerplate
2025-02-18 15:30:31 -08:00
Phillip Wood
54cf5d2da8 merge-tree: only use basic merge config
Commit 9c93ba4d0a (merge-recursive: honor diff.algorithm, 2024-07-13)
replaced init_merge_options() with init_basic_merge_config() for use in
plumbing commands and init_ui_merge_config() for use in porcelain
commands. As "git merge-tree" is a plumbing command it should call
init_basic_merge_config() rather than init_ui_merge_config(). The merge
ort machinery ignores "diff.algorithm" so the behavior is unchanged by
this commit but it future proofs us against any future changes to
init_ui_merge_config().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
Phillip Wood
1b0e5f4499 merge-tree: remove redundant code
real_merge() only ever returns "0" or "1" as it dies if the merge status
is less than zero. Therefore the check for "result < 0" is redundant and
the result variable is not needed. The return value of real_merge() is
ignored because exit status of "git merge-tree --stdin" is "0" for both
successful and conflicted merges (the status of each merge is written to
stdout). The return type of real_merge() is not changed as it is used
for the program's exit status when "--stdin" is not given.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
Phillip Wood
344a107b55 merge-tree --stdin: flush stdout to avoid deadlock
If a process tries to read the output from "git merge-tree --stdin"
before it closes merge-tree's stdin then it deadlocks. This happens
because merge-tree does not flush its output before trying to read
another line of input and means that it is not possible to cherry-pick a
sequence of commits using "git merge-tree --stdin". Fix this by calling
maybe_flush_or_die() before trying to read the next line of
input. Flushing the output after each merge does not seem to affect the
performance, any difference is lost in the noise even after increasing
the number of runs.

$ git rev-list --merges --parents -n100 origin/master |
	sed 's/^[^ ]* //' >/tmp/merges
$ hyperfine -L flush 0,1 --warmup 1 --runs 30 \
	'GIT_FLUSH={flush} ./git merge-tree --stdin </tmp/merges'
Benchmark 1: GIT_FLUSH=0 ./git merge-tree --stdin </tmp/merges
  Time (mean ± σ):     546.6 ms ±  11.7 ms    [User: 503.2 ms, System: 40.9 ms]
  Range (min … max):   535.9 ms … 567.7 ms    30 runs

Benchmark 2: GIT_FLUSH=1 ./git merge-tree --stdin </tmp/merges
  Time (mean ± σ):     546.9 ms ±  12.0 ms    [User: 505.9 ms, System: 38.9 ms]
  Range (min … max):   529.8 ms … 570.0 ms    30 runs

Summary
  'GIT_FLUSH=0 ./git merge-tree --stdin </tmp/merges' ran
    1.00 ± 0.03 times faster than 'GIT_FLUSH=1 ./git merge-tree --stdin </tmp/merges'

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:52:39 -08:00
Usman Akinyemi
6aa09fd872 version: extend get_uname_info() to hide system details
Currently, get_uname_info() function provides the full OS information.
In a following commit, we will need it to provide only the OS name.

Let's extend it to accept a "full" flag that makes it switch between
providing full OS information and providing only the OS name.

We may need to refactor this function in the future if an
`osVersion.format` is added.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
Usman Akinyemi
0a78d61247 version: refactor get_uname_info()
Some code from "builtin/bugreport.c" uses uname(2) to get system
information.

Let's refactor this code into a new get_uname_info() function, so
that we can reuse it in a following commit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 09:05:12 -08:00
Junio C Hamano
c3fffcfe8e Merge branch 'bf/fetch-set-head-fix'
Fetching into a bare repository incorrectly assumed it always used
a mirror layout when deciding to update remote-tracking HEAD, which
has been corrected.

* bf/fetch-set-head-fix:
  fetch set_head: fix non-mirror remotes in bare repositories
  fetch set_head: refactor to use remote directly
2025-02-14 17:53:48 -08:00
Junio C Hamano
5785d9143b Merge branch 'tc/clone-single-revision'
"git clone" learned to make a shallow clone for a single commit
that is not necessarily be at the tip of any branch.

* tc/clone-single-revision:
  builtin/clone: teach git-clone(1) the --revision= option
  parse-options: introduce die_for_incompatible_opt2()
  clone: introduce struct clone_opts in builtin/clone.c
  clone: add tags refspec earlier to fetch refspec
  clone: refactor wanted_peer_refs()
  clone: make it possible to specify --tags
  clone: cut down on global variables in clone.c
2025-02-14 17:53:48 -08:00
Junio C Hamano
998c5f0c75 Merge branch 'ms/refspec-cleanup'
Code clean-up.  cf. <Z6G-toOJjMmK8iJG@pks.im>

* ms/refspec-cleanup:
  refspec: relocate apply_refspecs and related funtions
  refspec: relocate matching related functions
  remote: rename query_refspecs functions
  refspec: relocate refname_matches_negative_refspec_item
  remote: rename function omit_name_by_refspec
2025-02-12 10:08:54 -08:00
Junio C Hamano
5b9d01bc4d Merge branch 'zh/gc-expire-to'
"git gc" learned the "--expire-to" option and passes it down to
underlying "git repack".

* zh/gc-expire-to:
  gc: add `--expire-to` option
2025-02-12 10:08:53 -08:00
Junio C Hamano
07c401d392 Merge branch 'ps/repack-keep-unreachable-in-unpacked-repo'
"git repack --keep-unreachable" to send unreachable objects to the
main pack "git repack -ad" produces did not work when there is no
existing packs, which has been corrected.

* ps/repack-keep-unreachable-in-unpacked-repo:
  builtin/repack: fix `--keep-unreachable` when there are no packs
2025-02-12 10:08:52 -08:00
Junio C Hamano
aae91a86fb Merge branch 'ds/name-hash-tweaks'
"git pack-objects" and its wrapper "git repack" learned an option
to use an alternative path-hash function to improve delta-base
selection to produce a packfile with deeper history than window
size.

* ds/name-hash-tweaks:
  pack-objects: prevent name hash version change
  test-tool: add helper for name-hash values
  p5313: add size comparison test
  pack-objects: add GIT_TEST_NAME_HASH_VERSION
  repack: add --name-hash-version option
  pack-objects: add --name-hash-version option
  pack-objects: create new name-hash function version
2025-02-12 10:08:51 -08:00
Usman Akinyemi
62898b8f5e builtin/update-server-info: remove the_repository global variable
Remove the_repository global variable in favor of the repository
argument that gets passed in "builtin/update-server-info.c".

When `-h` is passed to the command outside a Git repository, the
`run_builtin()` will call the `cmd_update_server_info()` function
with `repo` set to NULL and then early in the function, "parse_options()"
call will give the options help and exit, without having to consult much
of the configuration file. So it is safe to omit reading the config when
`repo` argument the caller gave us is NULL.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10 16:20:21 -08:00
Junio C Hamano
246569bf83 Merge branch 'ps/hash-cleanup'
Further code clean-up on the use of hash functions.  Now the
context object knows what hash function it is working with.

* ps/hash-cleanup:
  global: adapt callers to use generic hash context helpers
  hash: provide generic wrappers to update hash contexts
  hash: stop typedeffing the hash context
  hash: convert hashing context to a structure
2025-02-10 10:18:31 -08:00
Patrick Steinhardt
07242c2a5a path: drop git_common_path() in favor of repo_common_path()
Remove `git_common_path()` in favor of the `repo_common_path()` family
of functions, which makes the implicit dependency on `the_repository` go
away.

Note that `git_common_path()` used to return a string allocated via
`get_pathname()`, which uses a rotating set of statically allocated
buffers. Consequently, callers didn't have to free the returned string.
The same isn't true for `repo_common_path()`, so we also have to add
logic to free the returned strings.

This refactoring also allows us to remove `repo_common_pathv()` from the
public interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:23 -08:00
Patrick Steinhardt
8e4710f011 worktree: return allocated string from get_worktree_git_dir()
The `get_worktree_git_dir()` function returns a string constant that
does not need to be free'd by the caller. This string is computed for
three different cases:

  - If we don't have a worktree we return a path into the Git directory.
    The returned string is owned by `the_repository`, so there is no
    need for the caller to free it.

  - If we have a worktree, but no worktree ID then the caller requests
    the main worktree. In this case we return a path into the common
    directory, which again is owned by `the_repository` and thus does
    not need to be free'd.

  - In the third case, where we have an actual worktree, we compute the
    path relative to "$GIT_COMMON_DIR/worktrees/". This string does not
    need to be released either, even though `git_common_path()` ends up
    allocating memory. But this doesn't result in a memory leak either
    because we write into a buffer returned by `get_pathname()`, which
    returns one out of four static buffers.

We're about to drop `git_common_path()` in favor of `repo_common_path()`,
which doesn't use the same mechanism but instead returns an allocated
string owned by the caller. While we could adapt `get_worktree_git_dir()`
to also use `get_pathname()` and print the derived common path into that
buffer, the whole schema feels a lot like premature optimization in this
context. There are some callsites where we call `get_worktree_git_dir()`
in a loop that iterates through all worktrees. But none of these loops
seem to be even remotely in the hot path, so saving a single allocation
there does not feel worth it.

Refactor the function to instead consistently return an allocated path
so that we can start using `repo_common_path()` in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:23 -08:00
Patrick Steinhardt
3859e39659 path: drop git_path_buf() in favor of repo_git_path_replace()
Remove `git_path_buf()` in favor of `repo_git_path_replace()`. The
latter does essentially the same, with the only exception that it does
not rely on `the_repository` but takes the repo as separate parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
Patrick Steinhardt
bba59f58a4 path: drop git_pathdup() in favor of repo_git_path()
Remove `git_pathdup()` in favor of `repo_git_path()`. The latter does
essentially the same, with the only exception that it does not rely on
`the_repository` but takes the repo as separate parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
Patrick Steinhardt
f5c714e2a7 path: refactor repo_submodule_path() family of functions
As explained in an earlier commit, we're refactoring path-related
functions to provide a consistent interface for computing paths into the
commondir, gitdir and worktree. Refactor the "submodule" family of
functions accordingly.

Note that in contrast to the other `repo_*_path()` families, we have to
pass in the repository as a non-constant pointer. This is because we end
up calling `repo_read_gitmodules()` deep down in the callstack, which
may end up modifying the repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:22 -08:00
Patrick Steinhardt
f9467895d8 submodule: refactor submodule_to_gitdir() to accept a repo
The `submodule_to_gitdir()` function implicitly uses `the_repository` to
resolve submodule paths. Refactor the function to instead accept a repo
as parameter to remove the dependency on global state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07 09:59:21 -08:00
David Aguilar
7c2f291943 difftool: eliminate use of USE_THE_REPOSITORY_VARIABLE
Remove the USE_THE_REPOSITORY_VARIABLE #define now that all
state is passed to each function from callers.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
David Aguilar
a24953f3df difftool: eliminate use of the_repository
Make callers pass a repository struct into each function instead
of relying on the global the_repository variable.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
David Aguilar
8241ae63d8 difftool: eliminate use of global variables
Move difftool's global variables into a difftools_option struct
in preparation for removal of USE_THE_REPOSITORY_VARIABLE.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 13:00:21 -08:00
Toon Claes
337855629f builtin/clone: teach git-clone(1) the --revision= option
The git-clone(1) command has the option `--branch` that allows the user
to select the branch they want HEAD to point to. In a non-bare
repository this also checks out that branch.

Option `--branch` also accepts a tag. When a tag name is provided, the
commit this tag points to is checked out and HEAD is detached. Thus
`--branch` can be used to clone a repository and check out a ref kept
under `refs/heads` or `refs/tags`. But some other refs might be in use
as well. For example Git forges might use refs like `refs/pull/<id>` and
`refs/merge-requests/<id>` to track pull/merge requests. These refs
cannot be selected upon git-clone(1).

Add option `--revision` to git-clone(1). This option accepts a fully
qualified reference, or a hexadecimal commit ID. This enables the user
to clone and check out any revision they want. `--revision` can be used
in conjunction with `--depth` to do a minimal clone that only contains
the blob and tree for a single revision. This can be useful for
automated tests running in CI systems.

Using option `--branch` and `--single-branch` together is a similar
scenario, but serves a different purpose. Using these two options, a
singlet remote tracking branch is created and the fetch refspec is set
up so git-fetch(1) will receive updates on that branch from the remote.
This allows the user work on that single branch.

Option `--revision` on contrary detaches HEAD, creates no tracking
branches, and writes no fetch refspec.

Signed-off-by: Toon Claes <toon@iotcl.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
[jc: removed unnecessary TEST_PASSES_SANITIZE_LEAK from the test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:26:42 -08:00
Toon Claes
9144b9362b parse-options: introduce die_for_incompatible_opt2()
The functions die_for_incompatible_opt3() and
die_for_incompatible_opt4() already exist to die whenever a user
specifies three or four options respectively that are not compatible.

Introduce die_for_incompatible_opt2() which dies when two options that
are incompatible are set.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:54 -08:00
Toon Claes
7a52a8c7d8 clone: introduce struct clone_opts in builtin/clone.c
There is a lot of state stored in global variables in builtin/clone.c.
In the long run we'd like to remove many of those.

Introduce `struct clone_opts` in this file. This struct will be used to
contain all details needed to perform the clone. The struct object can
be thrown around to all the functions that need these details.

The first field we're adding is `wants_head`. In some scenarios
(specifically when both `--single-branch` and `--branch` are given) we
are not interested in `HEAD` on the remote. The field `wants_head` in
`struct clone_opts` will hold this information. We could have put
`option_branch` and `option_single_branch` into that struct instead, but
in a following commit we'll be using `wants_head` as well.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:54 -08:00
Toon Claes
2ca67c6f14 clone: add tags refspec earlier to fetch refspec
In clone.c we call refspec_ref_prefixes() to copy the fetch refspecs
from the `remote->fetch` refspec into `ref_prefixes` of
`transport_ls_refs_options`. Afterwards we add the tags prefix
`refs/tags/` prefix as well. At a later point, in wanted_peer_refs() we
process refs using both `remote->fetch` and `TAG_REFSPEC`.

Simplify the code by appending `TAG_REFSPEC` to `remote->fetch` before
calling refspec_ref_prefixes().

To be able to do this, we set `option_tags` to 0 when --mirror is given.
This is because --mirror mirrors (hence the name) all the refs,
including tags and they do not need to be treated separately.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
Toon Claes
879780f9a1 clone: refactor wanted_peer_refs()
The function wanted_peer_refs() is used to map the refs returned by the
server to refs we will save in our clone.

Over time this function grown to be very complex. Refactor it.

Previously, there was a separate code path for when
`option_single_branch` was set. It resulted in duplicated code and
deeper nested conditions. After this refactor the code path for when
`option_single_branch` is truthy modifies `refs` and then falls through
to the common code path. This approach relies on the `refspec` being set
correctly and thus only mapping refs that are relevant.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
Toon Claes
bc26f7690a clone: make it possible to specify --tags
Option --no-tags was added in 0dab2468ee (clone: add a --no-tags option
to clone without tags, 2017-04-26). At the time there was no need to
support --tags as well, although there was some conversation about
it[1].

To simplify the code and to prepare for future commits, invert the flag
internally. Functionally there is no change, because the flag is
default-enabled passing `--tags` has no effect, so there's no need to
add tests for this.

[1]: https://lore.kernel.org/git/CAGZ79kbHuMpiavJ90kQLEL_AR0BEyArcZoEWAjPPhOFacN16YQ@mail.gmail.com/

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
Toon Claes
7f420a6bda clone: cut down on global variables in clone.c
In clone.c the `struct option` which is used to parse the input options
for git-clone(1) is a global variable. Due to this, many variables that
are used to parse the value into, are also global.

Make `builtin_clone_options` a local variable in cmd_clone() and carry
along all variables that are only used in that function.

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 12:23:53 -08:00
Justin Tobler
3295c35398 rev-list: extend print-info to print missing object type
Additional information about missing objects found in git-rev-list(1)
can be printed by specifying the `print-info` missing action for the
`--missing` option. Extend this action to also print missing object type
information inferred from its containing object. This token follows the
form `type=<type>` and specifies the expected object type of the missing
object.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:32:01 -08:00
Justin Tobler
c6d896bcfd rev-list: add print-info action to print missing object path
Missing objects identified through git-rev-list(1) can be printed by
setting the `--missing=print` option. Additional information about the
missing object, such as its path and type, may be present in its
containing object.

Add the `print-info` missing action for the `--missing` option that,
when set, prints additional insight about the missing object inferred
from its containing object. Each line of output for a missing object is
in the form: `?<oid> [<token>=<value>]...`. The `<token>=<value>` pairs
containing additional information are separated from each other by a SP.
The value is encoded in a token specific fashion, but SP or LF contained
in value are always expected to be represented in such a way that the
resulting encoded value does not have either of these two problematic
bytes. This format is kept generic so it can be extended in the future
to support additional information.

For now, only a missing object path info is implemented. It follows the
form `path=<path>` and specifies the full path to the object from the
top-level tree. A path containing SP or special characters is enclosed
in double-quotes in the C style as needed. In a subsequent commit,
missing object type info will also be added.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-05 09:32:01 -08:00