21131 Commits

Author SHA1 Message Date
Junio C Hamano
c6a5e1a22e Merge branch 'tb/repack-cleanup'
The recent change to "git repack" made it react less nicely to a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.

* tb/repack-cleanup:
  builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
  builtin/repack.c: only repack `.pack`s that exist
2023-07-18 07:28:53 -07:00
Junio C Hamano
6016ee0a71 Merge branch 'tb/fsck-no-progress'
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.

* tb/fsck-no-progress:
  commit-graph.c: avoid duplicated progress output during `verify`
  commit-graph.c: pass progress to `verify_one_commit_graph()`
  commit-graph.c: iteratively verify commit-graph chains
  commit-graph.c: extract `verify_one_commit_graph()`
  fsck: suppress MIDX output with `--no-progress`
  fsck: suppress commit-graph output with `--no-progress`
2023-07-18 07:28:53 -07:00
Junio C Hamano
13ed10efd4 Merge branch 'jc/pathspec-match-with-common-prefix'
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.

* jc/pathspec-match-with-common-prefix:
  dir: match "attr" pathspec magic with correct paths
  t6135: attr magic with path pattern
2023-07-17 11:30:43 -07:00
Junio C Hamano
ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
Junio C Hamano
9187b276e9 Merge branch 'pw/diff-no-index-from-named-pipes'
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.

* pw/diff-no-index-from-named-pipes:
  diff --no-index: support reading from named pipes
  t4054: test diff --no-index with stdin
  diff --no-index: die on error reading stdin
  diff --no-index: refuse to compare stdin to a directory
2023-07-17 11:30:41 -07:00
Junio C Hamano
ce36dea07b Merge branch 'ma/t0091-fixup'
"git bugreport" tests did not test what they wanted to test, which
has been corrected.

* ma/t0091-fixup:
  t0091-bugreport.sh: actually verify some content of report
2023-07-14 10:46:07 -07:00
Junio C Hamano
81ebc54e81 Merge branch 'ks/ref-filter-signature'
The "git for-each-ref" family of commands learned placeholders
related to GPG signature verification.

* ks/ref-filter-signature:
  ref-filter: add new "signature" atom
  t/lib-gpg: introduce new prereq GPG2
2023-07-14 10:46:07 -07:00
Derrick Stolee
0af067276e builtin/repack.c: only repack .packs that exist
In 73320e49ad (builtin/repack.c: only collect fully-formed packs,
2023-06-07), we switched the check for which packs to collect by
starting at the .idx files and looking for matching .pack files. This
avoids trying to repack pack-files that have not had their pack-indexes
installed yet.

However, it does cause maintenance to halt if we find the (problematic,
but not insurmountable) case of a .idx file without a corresponding
.pack file. In an environment where packfile maintenance is a critical
function, such a hard stop is costly and requires human intervention to
resolve (by deleting the .idx file).

This was not the case before. We successfully repacked through this
scenario until the recent change to scan for .idx files.

Further, if we are actually in a case where objects are missing, we
detect this at a different point during the reachability walk.

In other cases, Git prepares its list of packfiles by scanning .idx
files and then only adds it to the packfile list if the corresponding
.pack file exists. It even does so without a warning! (See
add_packed_git() in packfile.c for details.)

This case is much less likely to occur than the failures seen before
73320e49ad. Packfiles are "installed" by writing the .pack file before
the .idx and that process can be interrupted. Packfiles _should_ be
deleted by deleting the .idx first, followed by the .pack file, but
unlink_pack_path() does not do this: it deletes the .pack _first_,
allowing a window where this process could be interrupted. We leave the
consideration of changing this order as a separate concern. Knowing that
this condition is possible from interrupted Git processes and not other
tools lends some weight that Git should be more flexible around this
scenario.

Add a check to see if the .pack file exists before adding it to the list
for repacking. This will stop a number of maintenance failures seen in
production but fixed by deleting the .idx files.

This brings us closer to the case before 73320e49ad in that 'git
repack' will not fail when there is an orphaned .idx file, at least, not
due to the way we scan for packfiles. In the case that the .pack file
was erroneously deleted without copies of its objects in other installed
packfiles, then 'git repack' will fail due to the reachable object walk.

This ensures that automated repacks will no longer be halted by an
orphaned .idx file. The tests in t7700 show both these successful
scenarios and the case of failing if the .pack was truly required.
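
The check described above can be sketched roughly as follows. This is
an illustrative Python sketch, not the C code in builtin/repack.c; the
helper name collect_existing_packs() and the demo directory contents
are assumptions made for the example.

```python
import os
import tempfile


def collect_existing_packs(pack_dir):
    """Return base names of packs whose .idx and .pack files both exist.

    Hypothetical helper: scan for .idx files first, then skip any
    orphaned .idx that has no matching .pack instead of failing.
    """
    names = []
    for entry in sorted(os.listdir(pack_dir)):
        base, ext = os.path.splitext(entry)
        if ext != ".idx":
            continue
        if os.path.exists(os.path.join(pack_dir, base + ".pack")):
            names.append(base)
    return names


def demo():
    """Create one complete pack and one orphaned .idx, then collect."""
    with tempfile.TemporaryDirectory() as pack_dir:
        for name in ("pack-aaa.idx", "pack-aaa.pack", "pack-bbb.idx"):
            open(os.path.join(pack_dir, name), "w").close()
        return collect_existing_packs(pack_dir)
```

In this sketch the orphaned "pack-bbb.idx" is silently skipped, which
mirrors the tolerant behavior the message attributes to
add_packed_git().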

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-11 13:07:50 -07:00
Taylor Blau
9281cd07f0 commit-graph.c: avoid duplicated progress output during verify
When `git commit-graph verify` was taught how to verify commit-graph
chains in 3da4b609bb (commit-graph: verify chains with --shallow mode,
2019-06-18), it produced one line of progress per layer of the
commit-graph chain.

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

This could be somewhat confusing to users, who may wonder why there are
multiple occurrences of "Verifying commits in commit graph".

There are likely good arguments on whether or not there should be
one line of progress output per commit-graph layer. On the one hand, the
existing output shows us verifying each individual layer of the chain.
But on the other hand, the fact that a commit-graph may be stored among
multiple layers is an implementation detail that the caller need not be
aware of.

Clarify this by showing a single progress meter regardless of the number
of layers in the commit-graph chain. After this patch, the output
reflects the logical contents of a commit-graph chain, instead of
showing one line of output per commit-graph layer:

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (136268/136268), done.
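
As a rough illustration of the single-meter idea (a Python sketch, not
the code in commit-graph.c; the function names are hypothetical), the
total is computed once over all layers, using the layer sizes from the
example output above:

```python
def total_commits(layer_sizes):
    """Sum commit counts over all layers so one meter covers the chain."""
    return sum(layer_sizes)


def progress_line(done, total):
    """Format one progress line in the style shown in the message."""
    percent = done * 100 // total
    return f"Verifying commits in commit graph: {percent}% ({done}/{total})"


# Layer sizes taken from the two-line output shown earlier.
layers = [4356, 131912]
total = total_commits(layers)
final_line = progress_line(total, total)
```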

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:45 -07:00
Taylor Blau
39bdd30377 fsck: suppress MIDX output with --no-progress
In a similar spirit as the previous commit, address a bug where `git
fsck` produces output when calling `git multi-pack-index verify` even
when invoked with `--no-progress`.

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying OID order in multi-pack-index: 100% (605677/605677), done.
    Sorting objects by packfile: 100% (605678/605678), done.
    Verifying object offsets: 100% (605678/605678), done.

The three lines produced by `git fsck` come from `git multi-pack-index
verify`, but should be squelched due to `--no-progress`.

The MIDX machinery learned to generate these progress messages as early
as 430efb8a74 (midx: add progress indicators in multi-pack-index
verify, 2019-03-21), but did not respect `--progress` or `--no-progress`
until ad60096d1c (midx: honor the MIDX_PROGRESS flag in
verify_midx_file, 2019-10-21).

But the `git multi-pack-index verify` step was added to fsck in
66ec0390e7 (fsck: verify multi-pack-index, 2018-09-13), pre-dating any
of the above patches.

Pass `--[no-]progress` as appropriate to ensure that we don't produce
output when told not to.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:40 -07:00
Taylor Blau
eda206f611 fsck: suppress commit-graph output with --no-progress
Since e0fd51e1d7 (fsck: verify commit-graph, 2018-06-27), `fsck` runs
`git commit-graph verify` to check the integrity of any commit-graph(s).

Originally, the `git commit-graph verify` step would always print to
stdout/stderr, regardless of whether `fsck` was invoked with
`--[no-]progress`. But in 7371612255 (commit-graph: add
--[no-]progress to write and verify, 2019-08-26), the commit-graph
machinery learned the `--[no-]progress` option, though `fsck` was not
updated to pass this new flag (or not).

This led to seeing output from running `git fsck`, even with
`--no-progress` on repositories that have a commit-graph:

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

Ensure that `fsck` passes `--[no-]progress` as appropriate when calling
`git commit-graph verify`.
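
A minimal sketch of forwarding the flag to the child command, written
in Python for illustration only (the real change is in C; the function
name and the object directory path are assumptions):

```python
def commit_graph_verify_args(object_dir, show_progress):
    """Build the child command line, always stating the progress choice.

    Passing the flag explicitly in both cases means the child never
    falls back to its own default.
    """
    args = ["git", "commit-graph", "verify", "--object-dir", object_dir]
    args.append("--progress" if show_progress else "--no-progress")
    return args


quiet = commit_graph_verify_args(".git/objects", show_progress=False)
noisy = commit_graph_verify_args(".git/objects", show_progress=True)
```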

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:37 -07:00
Junio C Hamano
f4a8fde057 dir: match "attr" pathspec magic with correct paths
The match_pathspec_item() function takes "prefix" value, allowing a
caller to chop off the common leading prefix of pathspec pattern
strings from the path and only use the remainder of the path to
match the pathspec patterns (after chopping the same leading prefix
of them, of course).

This "common leading prefix" optimization has two main features:

 * discard the entries in the in-core index that are outside of the
   common leading prefix; if you are doing "ls-files one/a one/b",
   we know all matches must be from "one/", so first the code
   discards all entries outside the "one/" directory from the
   in-core index.  This allows us to work on a smaller dataset.

 * allow skipping the comparison of the leading bytes when matching
   pathspec with path.  When "ls-files" finds the path "one/a/1" in
   the in-core index given "one/a" and "one/b" as the pathspec,
   knowing that common leading prefix "one/" was found lets the
   pathspec machinery skip comparing the "one/" part, and
   allows it to feed "a/1" down, as long as the pathspec element
   "one/a" gets corresponding adjustment to "a".

When the "attr" pathspec magic is in effect, however, the current
code breaks down.

The attributes, other than the ones that are built-in and the ones
that come from the $GIT_DIR/info/attributes file and the top-level
.gitattributes file, are lazily read from the filesystem on-demand,
as we encounter each path and ask if it matches the pathspec.  For
example, if you say 'git ls-files "(attr:label)sub/"' in a repository
with a file "sub/file" that is given the 'label' attribute in
"sub/.gitattributes":

 * The common prefix optimization finds that "sub/" is the common
   prefix and prunes the in-core index so that it has only entries
   inside that directory.  This is desirable.

 * The code then walks the in-core index, finds "sub/file", and
   eventually asks do_match_pathspec() if it matches the given
   pathspec.

 * do_match_pathspec() calls match_pathspec_item() _after_ stripping
   the common prefix "sub/" from the path, giving it "file", plus
   the length of the common prefix (4 bytes), so that the pathspec
   element "(attr:label)sub/" can be treated as if it were "(attr:label)".

The last one is what breaks the match in the current code, as the
pathspec subsystem ends up asking the attribute subsystem to find
the attribute attached to the path "file".  We need to ask about the
attributes on "sub/file" when calling match_pathspec_attrs(); this
can be done by looking at "prefix" bytes before the beginning of
"name", which is the same trick already used by another piece of the
code in the same match_pathspec_item() function.

Unfortunately this was not discovered so far because the code works
with slightly different arguments, e.g.

 $ git ls-files "(attr:label)sub"
 $ git ls-files "(attr:label)sub/" "no/such/dir/"

would have reported "sub/file" as a path with the 'label' attribute
just fine, because neither would trigger the common prefix
optimization.
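
The breakage and the fix can be modeled in a few lines of Python. This
is only a sketch of the idea: the ATTRS table stands in for what
"sub/.gitattributes" would provide, and the function names are
hypothetical, not Git's.

```python
# Hypothetical attribute table: 'label' is set on "sub/file".
ATTRS = {"sub/file": {"label"}}


def has_attr(path, attr):
    """Look up an attribute by the path as seen from the top level."""
    return attr in ATTRS.get(path, set())


def match_buggy(full_path, prefix_len, attr):
    """Ask about the path with the common prefix stripped ("file")."""
    name = full_path[prefix_len:]
    return has_attr(name, attr)


def match_fixed(full_path, prefix_len, attr):
    """Step back over the prefix bytes so the lookup sees "sub/file"."""
    name = full_path[prefix_len:]
    return has_attr(full_path[:prefix_len] + name, attr)


buggy = match_buggy("sub/file", 4, "label")
fixed = match_fixed("sub/file", 4, "label")
```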

Reported-by: Matthew Hughes <mhughes@uw.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-08 14:36:31 -07:00
Junio C Hamano
b00ec259e7 Merge branch 'jk/fsck-indices-in-worktrees'
Code clarification.

* jk/fsck-indices-in-worktrees:
  fsck: avoid misleading variable name
2023-07-08 11:23:08 -07:00
Junio C Hamano
7f5ad0ca8d Merge branch 'js/empty-index-fixes'
A few places failed to differentiate the case where the index is
truly empty (nothing added) and we haven't yet read from the
on-disk index file, which have been corrected.

* js/empty-index-fixes:
  commit -a -m: allow the top-level tree to become empty again
  split-index: accept that a base index can be empty
  do_read_index(): always mark index as initialized unless erroring out
2023-07-08 11:23:07 -07:00
Junio C Hamano
d52a45cf56 Merge branch 'ks/t4205-test-describe-with-abbrev-fix'
Test update.

* ks/t4205-test-describe-with-abbrev-fix:
  t4205: correctly test %(describe:abbrev=...)
2023-07-08 11:23:07 -07:00
Junio C Hamano
7e360bc626 t6135: attr magic with path pattern
The test coverage on attribute magic combined with path pattern
was a bit thin.  Let's add a few and make sure "(attr:X)sub" and
"(attr:X)sub/" behave the same.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:23:42 -07:00
Junio C Hamano
b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
Junio C Hamano
391414e971 Merge branch 'jk/cherry-pick-revert-status'
During a cherry-pick or revert session that works on multiple
commits, "git status" did not give correct information, which has
been corrected.

* jk/cherry-pick-revert-status:
  fix cherry-pick/revert status when doing multiple commits
2023-07-06 11:54:47 -07:00
Junio C Hamano
84b889bd03 Merge branch 'pw/apply-too-large'
"git apply" punts when it is fed too large a patch input; the error
message it gives when it happens has been clarified.

* pw/apply-too-large:
  apply: improve error messages when reading patch
2023-07-06 11:54:47 -07:00
Junio C Hamano
a9cc3b8fc7 Merge branch 'tl/notes-separator'
'git notes append' was taught '--separator' to specify a string to insert
between paragraphs.

* tl/notes-separator:
  notes: introduce "--no-separator" option
  notes.c: introduce "--[no-]stripspace" option
  notes.c: append separator instead of insert by pos
  notes.c: introduce '--separator=<paragraph-break>' option
  t3321: add test cases about the notes stripspace behavior
  notes.c: use designated initializers for clarity
  notes.c: cleanup 'strbuf_grow' call in 'append_edit'
2023-07-06 11:54:47 -07:00
Junio C Hamano
5a1d9e8f87 Merge branch 'gc/config-partial-submodule-kvi-fix'
Partially revert a sanity check that the rest of the config code
was not ready, to avoid triggering it in a corner case.

* gc/config-partial-submodule-kvi-fix:
  config: don't BUG when both kvi and source are set
2023-07-06 11:54:46 -07:00
Phillip Wood
1e3f26542a diff --no-index: support reading from named pipes
In some shells, such as bash and zsh, it's possible to use process
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with "git diff
--no-index". On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named pipe. On
Linux, the arguments are symlinks to pipes, so git diff "helpfully"
diffs these symlinks, comparing their targets like "pipe:[1234]" and
"pipe:[5678]".

To address this, "diff --no-index" is changed so that if a path given on
the command line is a named pipe or a symbolic link that resolves to a
named pipe, then we read the data to diff from that pipe. This is
implemented by generalizing the code that already exists to handle
reading from stdin when the user passes the path "-".

If the user tries to compare a named pipe to a directory then we die as
we do when trying to compare stdin to a directory.

As process substitution is not supported by POSIX, this change is tested by
using a pipe and a symbolic link to a pipe.
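
The detection rule (a named pipe, or a symbolic link that resolves to
one) can be sketched in Python as below. This is an illustration under
the assumption of a Unix-like system where os.mkfifo() is available;
is_named_pipe() is a hypothetical helper, not a Git function.

```python
import os
import stat
import tempfile


def is_named_pipe(path):
    """True if path is a FIFO or a symlink that resolves to a FIFO."""
    try:
        mode = os.stat(path).st_mode  # os.stat() follows symlinks
    except OSError:
        return False
    return stat.S_ISFIFO(mode)


def demo():
    """Check a FIFO, a symlink to it, and a regular file."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "pipe")
        link = os.path.join(tmp, "link")
        regular = os.path.join(tmp, "file")
        os.mkfifo(fifo)
        os.symlink(fifo, link)
        open(regular, "w").close()
        return is_named_pipe(fifo), is_named_pipe(link), is_named_pipe(regular)
```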

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
Phillip Wood
df521462f0 t4054: test diff --no-index with stdin
"git diff --no-index" supports reading from stdin with the path "-".
There is no test coverage for this so add a regression test before
changing the code in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
Phillip Wood
498198453d diff --no-index: refuse to compare stdin to a directory
When the user runs

    git diff --no-index file directory

we follow the behavior of POSIX diff and rewrite the arguments as

    git diff --no-index file directory/file

Doing that when "file" is "-" (which means "read from stdin") does not
make sense so we should error out if the user asks us to compare "-" to
a directory. This matches the behavior of GNU diff and diff on *BSD.
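
A small Python model of the argument rewriting and the new refusal
follows. It is a sketch only: rewrite_args() is a hypothetical name,
and handling the directory-first order symmetrically is an assumption
of this example.

```python
import os
import tempfile


def rewrite_args(a, b):
    """Rewrite "file dir" to "file dir/file"; refuse "-" versus a directory."""
    a_dir = a != "-" and os.path.isdir(a)
    b_dir = b != "-" and os.path.isdir(b)
    if a_dir == b_dir:
        return a, b  # nothing to rewrite
    if "-" in (a, b):
        raise ValueError("cannot compare stdin to a directory")
    if a_dir:
        return os.path.join(a, os.path.basename(b)), b
    return a, os.path.join(b, os.path.basename(a))


def demo():
    """Return (rewritten pair is as expected, stdin case was refused)."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain.txt")
        subdir = os.path.join(tmp, "dir")
        open(plain, "w").close()
        os.mkdir(subdir)
        rewritten = rewrite_args(plain, subdir)
        try:
            rewrite_args("-", subdir)
            refused = False
        except ValueError:
            refused = True
        return rewritten == (plain, os.path.join(subdir, "plain.txt")), refused
```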

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 14:00:28 -07:00
Martin Ågren
1aa92b8500 t0091-bugreport.sh: actually verify some content of report
In the first test in this script, 'creates a report with content in the
right places', we generate a report and pipe it into our helper
`check_all_headers_populated()`. The idea of the helper is to find all
lines that look like headers ("[Some Header Here]") and to check that
the next line is non-empty. This is supposed to catch erroneous outputs
such as the following:

  [A Header]
  something
  more here

  [Another Header]

  [Too Early Header]
  contents

However, we provide the lines of the bug report as filenames to grep,
meaning we mostly end up spewing errors:

  grep: : No such file or directory
  grep: [System Info]: No such file or directory
  grep: git version:: No such file or directory
  grep: git version 2.41.0.2.gfb7d80edca: No such file or directory

This doesn't disturb the test, which tugs along and reports success, not
really having verified the contents of the report at all.
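
For reference, the check the helper was meant to perform can be
sketched in Python like this. It is not the shell helper from the test
script; empty_sections() and the HEADER pattern are assumptions made
for illustration.

```python
import re

# A header is a whole line of the form "[Some Header Here]".
HEADER = re.compile(r"^\[.+\]$")


def empty_sections(report_text):
    """Return headers whose following line is missing or blank."""
    lines = report_text.splitlines()
    empty = []
    for i, line in enumerate(lines):
        if HEADER.match(line):
            following = lines[i + 1] if i + 1 < len(lines) else ""
            if not following.strip():
                empty.append(line)
    return empty


SAMPLE = (
    "[A Header]\nsomething\nmore here\n\n"
    "[Another Header]\n\n"
    "[Too Early Header]\ncontents\n"
)
found = empty_sections(SAMPLE)
```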

Note that after 788a776069 ("bugreport: collect list of populated
hooks", 2020-05-07), the bug report, which is created in our hook-less
test repo, contains an empty section with the enabled hooks. Thus, even
the intention of our helper is a bit misguided: there is nothing
inherently wrong with having an empty section in the bug report.

Let's instead split this test into three: first verify that we generate
a report at all, then check that the introductory blurb looks the way it
should, then verify that the "[System Info]" seems to contain the right
things. (The "[Enabled Hooks]" section is tested later in the script.)

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:45:46 -07:00
Calvin Wan
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Calvin Wan
da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Junio C Hamano
89d62d5e8e Merge branch 'bc/more-git-var'
Add more "git var" for toolsmiths to learn various locations Git is
configured with either via the configuration or hardcoded defaults.

* bc/more-git-var:
  var: add config file locations
  var: add attributes files locations
  attr: expose and rename accessor functions
  var: adjust memory allocation for strings
  var: format variable structure with C99 initializers
  var: add support for listing the shell
  t: add a function to check executable bit
  var: mark unused parameters in git_var callbacks
2023-07-04 16:08:18 -07:00
Junio C Hamano
812907d16f Merge branch 'ps/revision-stdin-with-options'
The set-up code for the get_revision() API now allows feeding
options like --all and --not in the --stdin mode.

* ps/revision-stdin-with-options:
  revision: handle pseudo-opts in `--stdin` mode
  revision: small readability improvement for reading from stdin
  revision: reorder `read_revisions_from_stdin()`
2023-07-04 16:08:18 -07:00
Junio C Hamano
4c237d2ca2 Merge branch 'tb/gc-recent-object-hook'
Test update.

* tb/gc-recent-object-hook:
  t7701: make annotated tag unreachable
2023-06-29 16:43:21 -07:00
Junio C Hamano
3ea43bbe17 Merge branch 'jc/abort-ll-merge-with-a-signal'
When the external merge driver is killed by a signal, its output
should not be trusted as a resolution with conflicts that is
proposed by the driver, but the code trusted it anyway.

* jc/abort-ll-merge-with-a-signal:
  t6406: skip "external merge driver getting killed by a signal" test on Windows
  ll-merge: killing the external merge driver aborts the merge
2023-06-29 16:43:21 -07:00
Junio C Hamano
a1264a08a1 Merge branch 'en/header-split-cache-h-part-3'
Header files cleanup.

* en/header-split-cache-h-part-3: (28 commits)
  fsmonitor-ll.h: split this header out of fsmonitor.h
  hash-ll, hashmap: move oidhash() to hash-ll
  object-store-ll.h: split this header out of object-store.h
  khash: name the structs that khash declares
  merge-ll: rename from ll-merge
  git-compat-util.h: remove unneccessary include of wildmatch.h
  builtin.h: remove unneccessary includes
  list-objects-filter-options.h: remove unneccessary include
  diff.h: remove unnecessary include of oidset.h
  repository: remove unnecessary include of path.h
  log-tree: replace include of revision.h with simple forward declaration
  cache.h: remove this no-longer-used header
  read-cache*.h: move declarations for read-cache.c functions from cache.h
  repository.h: move declaration of the_index from cache.h
  merge.h: move declarations for merge.c from cache.h
  diff.h: move declaration for global in diff.c from cache.h
  preload-index.h: move declarations for preload-index.c from elsewhere
  sparse-index.h: move declarations for sparse-index.c from cache.h
  name-hash.h: move declarations for name-hash.c from cache.h
  run-command.h: move declarations for run-command.c from cache.h
  ...
2023-06-29 16:43:21 -07:00
Eric Sunshine
6e6a529b57 fsck: avoid misleading variable name
When reporting a problem, `git fsck` emits a message such as:

    missing blob 1234abcd (:file)

However, this can be ambiguous when the problem is detected in the index
of a worktree other than the one in which `git fsck` was invoked. To
address this shortcoming, 592ec63b38 (fsck: mention file path for index
errors, 2023-02-24) enhanced the output to mention the path of the index
when the problem is detected in some other worktree:

    missing blob 1234abcd (.git/worktrees/wt/index:file)

Unfortunately, the variable in fsck_index() which controls whether the
index path should be shown is misleadingly named "is_main_index" which
can be misunderstood as referring to the main worktree (i.e. the one
housing the .git/ repository) rather than to the current worktree (i.e.
the one in which `git fsck` was invoked). Avoid such potential confusion
by choosing a name more reflective of its actual purpose.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 13:58:57 -07:00
Kousik Sanagavarapu
1876a5ae15 t4205: correctly test %(describe:abbrev=...)
The pretty format %(describe:abbrev=<number>) tells describe to use
at least <number> digits of the oid to generate the human-readable
format of the commit-ish.

There are three things to test here:
  - Check that we can describe a commit that is not tagged (that is,
    for example our HEAD is at least one commit ahead of some reachable
    commit which is tagged) with at least <number> digits of the oid
    being used for describing it.

  - Check that when using such a commit-ish, we always use at least
    <number> digits of the oid to describe it.

  - Check that we can describe a tag. This just gives the name of the
    tag irrespective of abbrev (abbrev doesn't make sense here).

Do this, instead of the current test which only tests the last case.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:35 -07:00
Johannes Schindelin
2ee045eea1 commit -a -m: allow the top-level tree to become empty again
In 03267e8656 (commit: discard partial cache before (re-)reading it,
2022-11-08), a memory leak was plugged by discarding any partial index
before re-reading it.

The problem with this memory leak fix is that it was based on an
incomplete understanding of the logic introduced in 7168624c35 (Do not
generate full commit log message if it is not going to be used,
2007-11-28).

That logic was introduced to add a shortcut when committing without
editing the commit message interactively. A part of that logic was to
ensure that the index was read into memory:

	if (!active_nr && read_cache() < 0)
		die(...)

Translation to English: If the index has not yet been read, read it, and
if that fails, error out.

That logic was incorrect, though: It used `!active_nr` as an indicator
that the index was not yet read. Usually this is not a problem because
in the vast majority of instances, the index contains at least one
entry.

And it was natural to do it this way because at the time that condition
was introduced, the `index_state` structure had no explicit flag to
indicate that it was initialized: This flag was only introduced in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), but that commit did not adjust the code path
where no index file was found and a new, pristine index was initialized.

Now, when the index does not contain any entry (which is quite
common in Git's test suite because it starts a great many repositories
from scratch), subsequent calls to `do_read_index()` will mistake the
index not to be initialized, and read it again unnecessarily.

This is a problem because after initializing the empty index e.g. the
`cache_tree` in that index could have been initialized before a
subsequent call to `do_read_index()` wants to ensure an initialized
index. And if that subsequent call mistakes the index not to have been
initialized, it would lead to leaked memory.

The correct fix for that memory leak is to adjust the condition so that
it does not mistake `active_nr == 0` to mean that the index has not yet
been read.

Using the `initialized` flag instead, we avoid that mistake, and as a
bonus we can fix a bug at the same time that was introduced by the
memory leak fix: When deleting all tracked files and then asking `git
commit -a -m ...` to commit the result, Git would internally update the
index, then discard and re-read the index undoing the update, and fail
to commit anything.
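
The difference between the two conditions can be modeled with a toy
index in Python. This is a sketch of the reasoning only; the Index
class and the ensure_read_*() helpers are hypothetical and do not
mirror Git's actual data structures.

```python
class Index:
    """Toy in-core index with an explicit 'initialized' flag."""

    def __init__(self):
        self.entries = []
        self.initialized = False

    def read_from_disk(self, on_disk_entries):
        self.entries = list(on_disk_entries)
        self.initialized = True


def ensure_read_buggy(index, on_disk):
    """Treat 'no entries' as 'not yet read' (the old condition)."""
    if not index.entries:
        index.read_from_disk(on_disk)


def ensure_read_fixed(index, on_disk):
    """Re-read only if the index was never initialized."""
    if not index.initialized:
        index.read_from_disk(on_disk)


def commit_all_after_deleting_everything(ensure_read):
    """Model 'commit -a' after every tracked file was deleted."""
    on_disk = ["a.txt"]
    index = Index()
    index.read_from_disk(on_disk)
    index.entries = []           # in-core update: all files removed
    ensure_read(index, on_disk)  # later code wants an initialized index
    return index.entries
```

With the old condition the in-core update is undone by the re-read;
with the flag, the empty index is kept as the result to commit.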

This fixes https://github.com/git-for-windows/git/issues/4462

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:04 -07:00
Glen Choo
8868b1ebfb config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.

In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.

Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.

The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL value for "kvi" anyway, but make sure
not to change the message.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
Glen Choo
26b669324b config.c: pass ctx with CLI config
Pass config_context when parsing CLI config. To provide the .kvi member,
refactor out kvi_from_param() from the logic that caches CLI config in
configsets. Now that config_context and config_context.kvi are always
present when config machinery calls config callbacks, plumb "kvi" so
that we can remove all calls of current_config_scope() except for
trace2/*.c (which will be handled in a later commit), and remove all
other current_config_*() (the functions themselves and their calls).
Note that this results in .kvi containing a different, more complete
set of information than the mocked up "struct config_source" in
git_config_from_parameters().

Plumbing "kvi" reveals a few places where we've been doing the wrong
thing:

* git_config_parse_parameter() hasn't been setting config source
  information, so plumb "kvi" there too.

* Several sites in builtin/config.c have been calling current_config_*()
  functions outside of config callbacks (indirectly, via the
  format_config() helper), which means they're reading state that isn't
  set correctly:

  * "git config --get-urlmatch --show-scope" iterates config to collect
    values, but then attempts to display the scope after config
    iteration, causing the "unknown" scope to be shown instead of the
    config file's scope. It's clear that this wasn't intended: we knew
    that "--get-urlmatch" couldn't show config source metadata, which is
    why "--show-origin" was marked incompatible with "--get-urlmatch"
    when it was introduced [1]. It was most likely a mistake that we
    allowed "--show-scope" to sneak through.

    Fix this by copying the "kvi" value in the collection phase so that
    it can be read back later. This means that we can now support "git
    config --get-urlmatch --show-origin", but that is left unchanged
    for now.

  * "git config --default" doesn't have config source metadata when
    displaying the default value, so "--show-scope" also results in
    "unknown", and "--show-origin" results in a BUG(). Fix this by
    treating the default value as if it came from the command line (e.g.
    like we do with "git -c" or "git config --file"), using
    kvi_from_param().

[1] https://lore.kernel.org/git/20160205112001.GA13397@sigill.intra.peff.net/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Glen Choo
6021e1d158 config.c: pass ctx in configsets
Pass config_context to config callbacks in configset_iter(), trivially
setting the .kvi member to the cached key_value_info. Then, in config
callbacks that are only used with configsets, use the .kvi member to
replace calls to current_config_*(), and delete current_config_line()
because it has no remaining callers.

This leaves builtin/config.c and config.c as the only remaining users of
current_config_*().

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Glen Choo
a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of them are trivial; each is either adjusting a config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
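
The shape of this change can be sketched in isolation. The following is a
minimal illustration only: every demo_* name is a hypothetical stand-in rather
than Git's actual definition, and holding "kvi" by pointer is an assumption
made for the sketch. It shows a callback that reads source metadata from "ctx"
and a caller outside the config machinery that builds "ctx" from a default
initializer in the spirit of KVI_INIT.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for Git's real types. */
struct demo_key_value_info {
    const char *filename; /* NULL when there is no backing file */
    int linenr;           /* -1 when no line number applies */
};

/* Plays the role the message gives to KVI_INIT: a reasonable default. */
#define DEMO_KVI_INIT { NULL, -1 }

struct demo_config_context {
    const struct demo_key_value_info *kvi; /* assumption: held by pointer */
};

typedef int (*demo_config_fn_t)(const char *key, const char *value,
                                const struct demo_config_context *ctx,
                                void *data);

/* A callback that reads source metadata from ctx, not from globals. */
static int demo_cb(const char *key, const char *value,
                   const struct demo_config_context *ctx, void *data)
{
    int *linenr = data;

    (void)value; /* unused in this sketch */
    assert(ctx && ctx->kvi);
    if (!strcmp(key, "demo.key"))
        *linenr = ctx->kvi->linenr;
    return 0;
}

/* Calling a callback outside config iteration: build ctx from the default. */
static int demo_call_directly(demo_config_fn_t fn, const char *key,
                              const char *value, void *data)
{
    struct demo_key_value_info kvi = DEMO_KVI_INIT;
    struct demo_config_context ctx = { &kvi };

    return fn(key, value, &ctx, data);
}
```

Because the metadata travels in an argument, adding a member to the context
struct later would not require touching every callback signature again.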

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Jacob Keller
a096a889f4 fix cherry-pick/revert status when doing multiple commits
The status report for an in-progress cherry-pick does not show the
current commit if the cherry-pick happens as part of a series of
multiple commits:

 $ git cherry-pick <commit1> <commit2>
 < one of the cherry-picks fails to merge clean >
 Cherry-pick currently in progress.
  (run "git cherry-pick --continue" to continue)
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

 $ git status
 On branch <branch>
 Your branch is ahead of '<upstream>' by 1 commit.
   (use "git push" to publish your local commits)

 Cherry-pick currently in progress.
   (run "git cherry-pick --continue" to continue)
   (use "git cherry-pick --skip" to skip this patch)
   (use "git cherry-pick --abort" to cancel the cherry-pick operation)

The show_cherry_pick_in_progress() function prints "Cherry-pick
currently in progress". That function does have a more verbose print
based on whether the cherry_pick_head_oid is null or not. If it is not
null, then a more helpful message including which commit is actually
being picked is displayed.

The introduction of the "Cherry-pick currently in progress" message
comes from 4a72486de9 ("fix cherry-pick/revert status after commit",
2019-04-17). This commit modified wt_status_get_state() in order to
detect that a cherry-pick was in progress even if the user has used `git
commit` in the middle of the sequence.

The check used to detect this is the call to sequencer_get_last_command.
If the sequencer indicates that the last command was a REPLAY_PICK, then
the state->cherry_pick_in_progress is set to 1 and the
cherry_pick_head_oid is initialized to the null_oid. Similar behavior is
done for the case of REPLAY_REVERT.

It happens that this call of sequencer_get_last_command will always
report the action even if the user hasn't interrupted anything. Thus,
during a range of cherry-picks or reverts, the cherry_pick_head_oid and
revert_head_oid will always be overwritten and initialized to the null
oid.

This results in status always displaying the terse message which does
not include commit information.

Fix this by adding an additional check so that we do not re-initialize
the cherry_pick_head_oid or revert_head_oid if we have already set the
cherry_pick_in_progress or revert_in_progress bits. This ensures that
git status will display the more helpful information when it's available.
Add a test case covering this behavior.
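
The guard described above can be sketched as follows. This is a simplified
model, not the code in wt-status.c: the demo_* types and functions are
hypothetical, and only the cherry-pick half is shown. The point is that the
sequencer-based detection no longer resets the oid to null when an earlier
check has already marked the cherry-pick as in progress.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an object id. */
struct demo_oid { unsigned char hash[20]; };

struct demo_state {
    int cherry_pick_in_progress;
    struct demo_oid cherry_pick_head_oid;
};

enum demo_action { DEMO_REPLAY_REVERT, DEMO_REPLAY_PICK };

static int demo_is_null_oid(const struct demo_oid *oid)
{
    static const struct demo_oid null_oid; /* zero-initialized */

    return !memcmp(oid, &null_oid, sizeof(*oid));
}

/* Earlier detection: the commit being picked is known. */
static void demo_record_pick(struct demo_state *s, const struct demo_oid *oid)
{
    assert(s && oid);
    s->cherry_pick_in_progress = 1;
    s->cherry_pick_head_oid = *oid;
}

/* Later detection via the sequencer's last command. */
static void demo_apply_last_command(struct demo_state *s,
                                    enum demo_action action)
{
    assert(s);
    /* The added check: do not overwrite an oid that is already set. */
    if (action == DEMO_REPLAY_PICK && !s->cherry_pick_in_progress) {
        s->cherry_pick_in_progress = 1;
        memset(&s->cherry_pick_head_oid, 0,
               sizeof(s->cherry_pick_head_oid));
    }
}
```

With the guard in place, the null oid is used only as a fallback for the case
where the sequencer is the sole evidence that a cherry-pick is under way.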

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 15:48:55 -07:00
brian m. carlson
ed773a18c6 var: add config file locations
Much like with attributes files, sometimes programs would like to know
the location of configuration files at the global or system levels.
However, it isn't always clear where these may live, especially for the
system file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

Since other parties cannot intuitively know how Git was compiled and
where it looks for these files, help them by providing variables that
can be queried.  Because we have multiple paths for global config
values, print them in order from highest to lowest priority, and be sure
to split on newlines so that "git var -l" produces two entries for the
global value.

However, be careful not to split all values on newlines, since our
editor values could well contain such characters, and we don't want to
split them in such a case.

Note in the documentation that some values may contain multiple paths
and that callers should be prepared for that fact.  This helps people
write code that will continue to work in the event we allow multiple
items elsewhere in the future.
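
As a rough illustration of the listing behaviour described above, the sketch
below prints one "name=value" entry per line for a variable flagged as
multivalued, and a single entry otherwise, so that embedded newlines in other
values are left alone. The function name, the flag, and the variable name
"DEMO_GLOBAL_CONFIG" are hypothetical; this is not the code in builtin/var.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Print a variable in listing form and return the number of entries
 * written. Only values flagged as multivalued are split on newlines.
 */
static size_t demo_list_var(FILE *out, const char *name, const char *value,
                            int multivalued)
{
    size_t entries = 0;

    assert(out && name && value);
    if (!multivalued) {
        /* Keep the value whole, even if it contains newlines. */
        fprintf(out, "%s=%s\n", name, value);
        return 1;
    }
    while (*value) {
        size_t len = strcspn(value, "\n");

        fprintf(out, "%s=%.*s\n", name, (int)len, value);
        entries++;
        value += len;
        if (*value == '\n')
            value++; /* skip the separator */
    }
    return entries;
}
```

A caller consuming such output should be prepared to see the same variable
name more than once, which matches the note in the documentation.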

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson
576a37fccb var: add attributes files locations
Currently, there are some programs which would like to read and parse
the gitattributes files at the global or system levels.  However, it's
not always obvious where these files live, especially for the system
file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

It's not reasonable to expect all callers of Git to intuitively know
where the Git distributor or user has configured these locations to
be, so add some entries to allow us to determine their location.  Honor
the GIT_ATTR_NOSYSTEM environment variable if one is specified.  Expose
the accessor functions in a way that we can reuse them from within the
var code.

In order to make our paths consistent on Windows and also use the same
form as paths use in "git rev-parse", let's normalize the path before we
return it.  This results in Windows-style paths that use slashes, which
is convenient for making our tests function in a consistent way across
platforms.  Note that this requires that some of our values be freed, so
let's add a flag about whether the value needs to be freed and use it
accordingly.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson
1e65721227 var: add support for listing the shell
On most Unix systems, finding a suitable shell is easy: one simply uses
"sh" with an appropriate PATH value.  However, in many Windows
environments, the shell is shipped alongside Git, and it may or may not
be in PATH, even if Git is.

In such an environment, it can be very helpful to query Git for the
shell it's using, since other tools may want to use the same shell as
well.  To help them out, let's add a variable, GIT_SHELL_PATH, that
points to the location of the shell.

On Unix, we know our shell must be executable to be functional, so
assume that the distributor has correctly configured their environment,
and use that as a basic test.  On Git for Windows, we know that our
shell will be one of a few fixed values, all of which end in "sh" (such
as "bash").  This seems like it might be a nice test on Unix as well,
since it is customary for all shells to end in "sh", but there probably
exist such systems that don't have such a configuration, so be careful
here not to break them.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
brian m. carlson
d6546af75c t: add a function to check executable bit
In line with our other helper functions for paths, let's add a function
to check whether a path is executable, and if not, print a suitable
error message.  Document this function, and note that it must only be
used under the POSIXPERM prerequisite, since it doesn't otherwise work
on Windows.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
Glen Choo
a53f43f900 config: don't BUG when both kvi and source are set
When iterating through config, we read config source metadata from
global values - either a "struct config_source + enum config_scope"
or a "struct key_value_info", using the current_config* functions. Prior
to the series starting from 0c60285147 (config.c: create config_reader
and the_reader, 2023-03-28), we weren't very picky about which values we
should read in which situation; we did note that both groups of values
generally shouldn't be set together, but if both were set,
current_config* preferentially reads key_value_info. When that series
added more structure, we enforced that either the former (when parsing a
config source) can be set, or the latter (when iterating a config set),
but *never* both at the same time. See 9828453ff0 (config.c: remove
current_config_kvi, 2023-03-28) and 5cdf18e7cd (config.c: remove
current_parsing_scope, 2023-03-28).

That was a good simplifying constraint that helped us reason about the
global state, but it turns out that there is at least one situation
where we need both to be set at the same time: in a blobless partial
clone where .gitmodules is missing. "git fetch" in such a repo will
start a config parse over .gitmodules (setting the config_source), and
Git will attempt to lazy-fetch it from the promisor remote. However,
when we try to read the promisor configuration, we start iterating a
config set (setting the key_value_info), and we BUG() out because that's
not allowed any more.

Teaching config_reader to gracefully handle this is somewhat
complicated, but fortunately, there are proposed changes to the config.c
machinery to get rid of this global state, and make the BUG() obsolete
[1]. We should rely on that as the eventual solution, and avoid doing
yet another refactor in the meantime.

Therefore, fix the bug by removing the BUG() check. We're reverting to
an older, less safe state, but that's generally okay since
key_value_info is always preferentially read, so we'd always read the
correct values when we iterate a config set in the middle of a config
parse (like we are here). The reverse would be wrong, but extremely
unlikely to happen since very few callers parse config without going
through a config set.
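
The "preferentially read" behaviour that makes this safe can be pictured with
a small sketch. The demo_* types below are hypothetical simplifications of the
reader state, not the real structures in config.c; they only show the lookup
order when both pieces of state happen to be set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two kinds of global state. */
struct demo_source { const char *name; };   /* set while parsing a source */
struct demo_kvi { const char *filename; };  /* set while iterating a set */

struct demo_reader {
    const struct demo_source *source;
    const struct demo_kvi *kvi;
};

/* Prefer kvi when both are set; fall back to the source being parsed. */
static const char *demo_current_name(const struct demo_reader *reader)
{
    assert(reader);
    if (reader->kvi)
        return reader->kvi->filename;
    if (reader->source)
        return reader->source->name;
    return NULL;
}
```

In the partial-clone scenario above, the inner config-set iteration sets the
kvi, so a lookup made during that iteration reports the inner source rather
than the outer .gitmodules parse.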

[1] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 12:07:47 -07:00
Junio C Hamano
e224f26892 Merge branch 'tb/collect-pack-filenames-fix'
Avoid breakage of "git pack-objects --cruft" due to inconsistency
between the ways the code enumerates packfiles in the repository.

* tb/collect-pack-filenames-fix:
  builtin/repack.c: only collect fully-formed packs
2023-06-26 09:29:50 -07:00
Junio C Hamano
8d5c5a05d7 Merge branch 'jk/commit-use-no-divider-with-interpret-trailers'
When "git commit --trailer=..." invokes the interpret-trailers
machinery, it knows what it feeds to interpret-trailers is a full
log message without any patch, but failed to express that by
passing the "--no-divider" option, which has been corrected.

* jk/commit-use-no-divider-with-interpret-trailers:
  commit: pass --no-divider to interpret-trailers
2023-06-26 09:29:49 -07:00
Phillip Wood
42612e18d2 apply: improve error messages when reading patch
Commit f1c0e3946e (apply: reject patches larger than ~1 GiB, 2022-10-25)
added a limit on the size of patch that apply will process to avoid
integer overflows. The implementation re-used the existing error message
for when we are unable to read the patch. This is unfortunate because (a) it
does not signal to the user that the patch is being rejected because it
is too large and (b) it uses error_errno() without setting errno.

This patch adds a specific error message for the case when a patch is
too large. It also updates the existing message to make it clearer that
it is the patch that cannot be read rather than any other file and marks
both messages for translation. The "git apply" prefix is also dropped to
match most of the rest of the error messages in apply.c (there are still
a few error messages that are prefixed with "git apply" and are not marked
for translation after this patch). The test added in f1c0e3946e is
updated accordingly.
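
The distinction between the two failure modes can be sketched like this. It
is an illustration under assumptions, not the code in apply.c: the enum, the
function, and the exact message wording are hypothetical. The read failure
appends the errno description because errno is meaningful there, while the
size limit reports a fixed message that does not consult errno at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical outcome of trying to read a patch. */
enum demo_read_result {
    DEMO_READ_OK,
    DEMO_READ_FAILED,    /* the read itself failed; errno was set */
    DEMO_READ_TOO_LARGE  /* size limit exceeded; errno is not involved */
};

/* Format an error message for the result into buf. */
static int demo_format_error(enum demo_read_result res, int saved_errno,
                             char *buf, size_t len)
{
    assert(buf && len);
    switch (res) {
    case DEMO_READ_TOO_LARGE:
        /* No errno text: nothing set errno on this path. */
        return snprintf(buf, len, "patch too large");
    case DEMO_READ_FAILED:
        return snprintf(buf, len, "failed to read patch: %s",
                        strerror(saved_errno));
    default:
        buf[0] = '\0';
        return 0;
    }
}
```

Keeping the two cases separate avoids printing a stale or unrelated errno
description when the patch is simply over the limit.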

Reported-by: Premek Vysoky <Premek.Vysoky@microsoft.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26 08:58:50 -07:00
Taylor Blau
25d59524bb t7701: make annotated tag unreachable
In 4dc16e2cb0 (gc: introduce `gc.recentObjectsHook`, 2023-06-07), we
added tests to ensure that prune-able (i.e. unreachable and with mtime
older than the cutoff) objects which are marked as recent via the new
`gc.recentObjectsHook` configuration are unpacked as loose with
`--unpack-unreachable`.

In that test, we also ensure that objects which are reachable from other
unreachable objects which were *not* pruned are kept as well, regardless
of their mtimes. For this, we use an annotated tag pointing at a blob
($obj2) which would otherwise be pruned.

But after pruning, that object is kept around for two reasons. One, the
tag object's mtime wasn't adjusted to be beyond the 1-hour cutoff, so it
would be kept due to its recency regardless. The other reason is
because the tag itself is reachable.

Use mktag to write the tag object directly without pointing a reference
at it, and adjust the mtime of the tag object to be older than the
cutoff to ensure that our `gc.recentObjectsHook` configuration is
working as intended.

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-24 15:50:41 -07:00
Junio C Hamano
34d765e736 t6406: skip "external merge driver getting killed by a signal" test on Windows
The run_command() on the platform does not seem to report death by
signal as the caller expects.  For now, skip the test on Windows.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-23 16:34:40 -07:00