Commit Graph

73280 Commits

Author SHA1 Message Date
Jeff King
720ba25d99 upload-pack: switch deepen-not list to an oid_array
When we see a "deepen-not" line from the client, we verify that the
given name can be resolved as a ref, and then add it to a string list to
be passed later to an internal "rev-list --not" traversal. We record the
actual refname in the string list (so the traversal resolves it again
later), but we'd be better off recording the resolved oid:

  1. There's a tiny bit of wasted work in resolving it twice.

  2. There's a small race condition with simultaneous updates; the later
     traversal may resolve to a different value (or not at all). This
     shouldn't cause any bad behavior (we do not care about the value
     in this first resolution, so whatever value rev-list gets is OK)
     but it could mean a confusing error message (if upload-pack fails
     to resolve the ref it produces a useful message, but a failing
     traversal later results in just "revision walk setup failed").

  3. It makes it simpler to de-duplicate the results. We don't de-dup at
     all right now, but we will in the next patch.

>From the client's perspective the behavior should be the same.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
Jeff King
fae9627470 upload-pack: drop separate v2 "haves" array
When upload-pack sees a "have" line in the v0 protocol, it immediately
calls got_oid() with its argument and potentially produces an ACK
response. In the v2 protocol, we simply record the argument in an
oid_array, and only later process all of the "have" objects by calling
the equivalent of got_oid() on the contents of the array.

This makes some sense, as v2 is a pure request/response protocol, as
opposed to v0's asynchronous negotiation phase. But there's a downside:
a client can send us an infinite number of garbage "have" lines, which
we'll happily slurp into the array, consuming memory. Whereas in v0,
they are limited by the number of objects in the repository (because
got_oid() only records objects we have ourselves, and we avoid
duplicates by setting a flag on the object struct).

We can make v2 behave more like v0 by also calling got_oid() directly
when v2 parses a "have" line. Calling it early like this is OK because
got_oid() itself does not interact with the client; it only confirms
that we have the object and sets a few flags. Note that unlike v0, v2
does not ever (before or after this patch) check the return code of
got_oid(), which lets the caller know whether we have the object. But
again, that makes sense; v0 is using it to asynchronously tell the
client to stop sending. In v2's synchronous protocol, we just discard
those entries (and decide how to ACK at the end of each round).

There is one slight tweak we need, though. In v2's state machine, we
reach the SEND_ACKS state if the other side sent us any "have" lines,
whether they were useful or not. Right now we do that by checking
whether the "have" array had any entries, but if we record only the
useful ones, that doesn't work. Instead, we can add a simple boolean
that tells us whether we saw any have line (even if it was useless).

This lets us drop the "haves" array entirely, as we're now placing
objects directly into the "have_obj" object array (which is where
got_oid() put them in the long run anyway). And as a bonus, we can drop
the secondary "common" array used in process_haves_and_send_acks(). It
was essentially a copy of "haves" minus the objects we do not have. But
now that we are using "have_obj" directly, we know everything in it is
useful. So in addition to protecting ourselves against malicious input,
we should slightly lower our memory usage for normal inputs.

Note that there is one user-visible effect. The trace2 output records
the number of "haves". Previously this was the total number of "have"
lines we saw, but now is the number of useful ones. We could retain the
original meaning by keeping a separate counter, but it doesn't seem
worth the effort; this trace info is for debugging and metrics, and
arguably the count of common oids is at least as useful as the total
count.

Reported-by: Benjamin Flesch <benjaminflesch@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 14:42:01 -08:00
Michael Lohmann
f3fc5d9c91 revision: implement git log --merge also for rebase/cherry-pick/revert
'git log' learned in ae3e5e1ef2 (git log -p --merge [[--] paths...],
2006-07-03) to show commits touching conflicted files in the range
HEAD...MERGE_HEAD, an addition documented in d249b45547 (Document
rev-list's option --merge, 2006-08-04).

It can be useful to look at the commit history to understand what lead
to merge conflicts also for other mergy operations besides merges, like
cherry-pick, revert and rebase.

For rebases and cherry-picks, an interesting range to look at is
HEAD...{REBASE_HEAD,CHERRY_PICK_HEAD}, since even if all the commits
included in that range are not directly part of the 3-way merge,
conflicts encountered during these operations can indeed be caused by
changes introduced in preceding commits on both sides of the history.

For revert, as we are (most likely) reversing changes from a previous
commit, an appropriate range is REVERT_HEAD..HEAD, which is equivalent
to REVERT_HEAD...HEAD and to HEAD...REVERT_HEAD, if we keep HEAD and its
parents on the left side of the range.

As such, adjust the code in prepare_show_merge so it constructs the
range HEAD...$OTHER for OTHER={MERGE_HEAD, CHERRY_PICK_HEAD, REVERT_HEAD
or REBASE_HEAD}. Note that we try these pseudorefs in order, so keep
REBASE_HEAD last since the three other operations can be performed
during a rebase. Note also that in the uncommon case where $OTHER and
HEAD do not share a common ancestor, this will show the complete
histories of both sides since their root commits, which is the same
behaviour as currently happens in that case for HEAD and MERGE_HEAD.

Adjust the documentation of this option accordingly.

Co-authored-by: Johannes Sixt <j6t@kdbg.org>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
[jc: tweaked in j6t's precedence fix that tries REBASE_HEAD last]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:04:39 -08:00
Michael Lohmann
f476143ee6 revision: ensure MERGE_HEAD is a ref in prepare_show_merge
This is done to

 (1) ensure MERGE_HEAD is a ref,
 (2) obtain the oid without any prefixing by refs.c:repo_dwim_ref()
 (3) error out when MERGE_HEAD is a symref.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 10:02:46 -08:00
Johannes Schindelin
24876ebf68 commit-reach(repo_in_merge_bases_many): report missing commits
Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
Johannes Schindelin
207c40e1e4 commit-reach(repo_in_merge_bases_many): optionally expect missing commits
Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This commit changes behavior slightly: unless called from the
`shallow.c` functions that set the `ignore_missing_commits` bit, any
non-existing tip commit that is passed to `repo_in_merge_bases_many()`
will now result in an error.

Note: When encountering missing commits while traversing the commit
history in search for merge bases, with this commit there won't be a
change in behavior just yet, their children will still be interpreted as
root commits. This bug will get fixed by follow-up commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
Johannes Schindelin
e67431d496 commit-reach(paint_down_to_common): plug two memory leaks
When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

The priority queue has a similar issue: There might still be a commit in
that queue.

Let's release both, to address the potential memory leaks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:47:03 -08:00
Christian Couder
a4324babe6 revision: fix --missing=[print|allow*] for annotated tags
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it works with missing commits, not just blobs/trees.

Unfortunately, such a command was still failing with a "fatal: bad
object <oid>" if it was passed a missing commit, blob or tree as an
argument (before the rev walking even begins). This was fixed in a
recent commit.

That fix still doesn't work when an argument passed to the command is
an annotated tag pointing to a missing commit though. In that case
`git rev-list --missing=...` still errors out with a "fatal: bad
object <oid>" error where <oid> is the object ID of the missing
commit.

Let's fix this issue, and also, while at it, let's add tests not just
for annotated tags but also for regular tags and branches.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-28 09:28:18 -08:00
Junio C Hamano
0f9d4d28b7 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 16:04:33 -08:00
Junio C Hamano
ebd46baf99 Merge branch 'jb/doc-interactive-singlekey-do-not-need-perl'
Doc clean-up.

* jb/doc-interactive-singlekey-do-not-need-perl:
  doc: remove outdated information about interactive.singleKey
2024-02-27 16:04:33 -08:00
Junio C Hamano
a56bb9f66a Merge branch 'jk/t0303-clean'
Test clean-up.

* jk/t0303-clean:
  t0303: check that helper_test_clean removes all credentials
2024-02-27 16:04:33 -08:00
Junio C Hamano
70dadd510b Merge branch 'mh/libsecret-empty-password-fix'
Credential helper based on libsecret (in contrib/) has been updated
to handle an empty password correctly.

* mh/libsecret-empty-password-fix:
  libsecret: retrieve empty password
2024-02-27 16:04:32 -08:00
Junio C Hamano
f71ed54f4d Merge branch 'bb/completion-no-grep-into-awk'
Some parts of command line completion script (in contrib/) have
been micro-optimized.

* bb/completion-no-grep-into-awk:
  completion: use awk for filtering the config entries
2024-02-27 16:04:32 -08:00
Junio C Hamano
66b1160141 Merge branch 'km/mergetool-vimdiff-layout-fallback'
Variants of vimdiff learned to honor mergetool.<variant>.layout settings.

* km/mergetool-vimdiff-layout-fallback:
  mergetools: vimdiff: use correct tool's name when reading mergetool config
2024-02-27 16:04:32 -08:00
Junio C Hamano
03f9f1a3a2 Merge branch 'ba/credential-test-clean-fix'
Test clean-up.

* ba/credential-test-clean-fix:
  t/lib-credential: clean additional credential
2024-02-27 16:04:32 -08:00
Junio C Hamano
98793866b9 Merge branch 'rj/tag-column-fix'
"git tag --column" failed to check the exit status of its "git
column" invocation, which has been corrected.

* rj/tag-column-fix:
  tag: error when git-column fails
2024-02-27 16:04:32 -08:00
Junio C Hamano
45072eefef Merge branch 'jc/am-whitespace-doc'
"git am --help" now tells readers what actions are available in
"git am --whitespace=<action>", in addition to saying that the
option is passed through to the underlying "git apply".

* jc/am-whitespace-doc:
  doc: add shortcut to "am --whitespace=<action>"
2024-02-27 16:04:31 -08:00
Patrick Steinhardt
b0f6b6b523 refs/reftable: don't fail empty transactions in repo without HEAD
Under normal circumstances, it shouldn't ever happen that a repository
has no HEAD reference. In fact, git-update-ref(1) would fail any request
to delete the HEAD reference, and a newly initialized repository always
pre-creates it, too.

We have however changed git-clone(1) to partially initialize the
refdb just up to the point where remote helpers can find the
repository. With that change, we are going to run into a situation
where repositories have no refs at all.

Now there is a very particular edge case in this situation: when
preparing an empty ref transacton, we end up returning whatever value
`read_ref_without_reload()` returned to the caller. Under normal
conditions this would be fine: "HEAD" should usually exist, and thus the
function would return `0`. But if "HEAD" doesn't exist, the function
returns a positive value which we end up returning to the caller.

Fix this bug by resetting the return code to `0` and add a test.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 13:53:39 -08:00
Junio C Hamano
b6818ff3b1 Merge branch 'ps/remote-helper-repo-initialization-fix' into ps/reftable-repo-init-fix
* ps/remote-helper-repo-initialization-fix:
  builtin/clone: allow remote helpers to detect repo
2024-02-27 13:53:22 -08:00
Rubén Justo
3574816d98 completion: fix __git_complete_worktree_paths
Use __git to invoke "worktree list" in __git_complete_worktree_paths, to
respect any "-C" and "--git-dir" options present on the command line.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 13:37:24 -08:00
Patrick Steinhardt
199f44cb2e builtin/clone: allow remote helpers to detect repo
In 18c9cb7524 (builtin/clone: create the refdb with the correct object
format, 2023-12-12), we have changed git-clone(1) so that it delays
creation of the refdb until after it has learned about the remote's
object format. This change was required for the reftable backend, which
encodes the object format into the tables. So if we pre-initialized the
refdb with the default object format, but the remote uses a different
object format than that, then the resulting tables would have encoded
the wrong object format.

This change unfortunately breaks remote helpers which try to access the
repository that is about to be created. Because the refdb has not yet
been initialized at the point where we spawn the remote helper, we also
don't yet have "HEAD" or "refs/". Consequently, any Git commands ran by
the remote helper which try to access the repository would fail because
it cannot be discovered.

This is essentially a chicken-and-egg problem: we cannot initialize the
refdb because we don't know about the object format. But we cannot learn
about the object format because the remote helper may be unable to
access the partially-initialized repository.

Ideally, we would address this issue via capabilities. But the remote
helper protocol is not structured in a way that guarantees that the
capability announcement happens before the remote helper tries to access
the repository.

Instead, fix this issue by partially initializing the refdb up to the
point where it becomes discoverable by Git commands.

Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 12:58:57 -08:00
Phillip Wood
72a8d3f027 rebase -i: stop setting GIT_CHERRY_PICK_HELP
Setting this environment variable causes the sequencer to display a
custom message when it stops for the user to resolve conflicts and
remove CHERRY_PICK_HEAD. Setting it in "git rebase" is a vestige of
the scripted implementation, now that it is a builtin command we do
not need to communicate with the sequencer machinery via environment
variables.

Move the conflicts advice to use when rebasing into
sequencer.c so we do not need to pass it via the environment.

Note that we retain the changes in e4301f73ff (sequencer: unset
GIT_CHERRY_PICK_HELP for 'exec' commands, 2024-02-02) just in case
GIT_CHERRY_PICK_HELP is set in the environment when "git rebase" is
run.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 10:33:36 -08:00
Junio C Hamano
e6d5479e7a git: extend --no-lazy-fetch to work across subprocesses
Modeling after how the `--no-replace-objects` option is made usable
across subprocess spawning (e.g., cURL based remote helpers are
spawned as a separate process while running "git fetch"), allow the
`--no-lazy-fetch` option to be passed across process boundaries.

Do not model how the value of GIT_NO_REPLACE_OBJECTS environment
variable is ignored, though.  Just use the usual git_env_bool() to
allow "export GIT_NO_LAZY_FETCH=0" and "unset GIT_NO_LAZY_FETCH"
to be equivalents.

Also do not model how the request is not propagated to subprocesses
we spawn (e.g. "git clone --local" that spawns a new process to work
in the origin repository, while the original one working in the
newly created one) by the "--no-replace-objects" option, as this "do
not lazily fetch from the promisor" is more about a per-request
debugging aid, not "this repository's promisor should not be relied
upon" property specific to a repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:53:14 -08:00
Josh Triplett
e90cc075cc commit: unify logic to avoid multiple scissors lines when merging
prepare_to_commit has some logic to figure out whether merge already
added a scissors line, and therefore it shouldn't add another. Now that
wt_status_add_cut_line has built-in state for whether it has
already added a previous line, just set that state instead, and then
remove that condition from subsequent calls to wt_status_add_cut_line.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:40:47 -08:00
Josh Triplett
688a0a751e commit: avoid redundant scissor line with --cleanup=scissors -v
`git commit --cleanup=scissors -v` prints two scissors lines:
one at the start of the comment lines, and the other right before the
diff. This is redundant, and pushes the diff further down in the user's
editor than it needs to be.

Make wt_status_add_cut_line() remember if it has added a cut line before,
and avoid adding a redundant one.

Add a test for this.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 09:40:46 -08:00
Junio C Hamano
4e89f0e07c doc: clarify the wording on <git-compat-util.h> requirement
The reason why we require the <git-compat-util.h> file to be the
first header file to be included is because it insulates other
header files and source files from platform differences, like which
system header files must be included in what order, and what C
preprocessor feature macros must be defined to trigger certain
features we want out of the system.

We tried to clarify the rule in the coding guidelines document, but
the wording was a bit fuzzy that can lead to misinterpretations like
you can include <xdiff/xinclude.h> only to avoid having to include
<git-compat-util.h> even if you have nothing to do with the xdiff
implementation, for example.  "You do not have to include more than
one of these" was also misleading and would have been puzzling if
you _needed_ to depend on more than one of these approved headers
(answer: you are allowed to include them all if you need the
declarations in them for reasons other than that you want to avoid
including compat-util yourself).

Instead of using the phrase "approved headers", enumerate them as
exceptions, each labeled with its intended audiences, to avoid such
misinterpretations.  The structure also makes it easier to add new
exceptions, so add the description of "t/unit-tests/test-lib.h"
being an exception only for the unit tests implementation as an
example.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Kyle Lippincott <spectral@google.com>
Acked-by: Elijah Newren <newren@gmail.com>
2024-02-27 08:53:32 -08:00
Richard Macklin
40b8076462 rebase: fix typo in autosquash documentation
This is a minor follow-up to cb00f524df (rebase: rewrite
--(no-)autosquash documentation, 2023-11-14) to fix a typo introduced in
that commit.

Signed-off-by: Richard Macklin <code@rmacklin.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-27 08:50:49 -08:00
Junio C Hamano
b3806f7633 git: document GIT_NO_REPLACE_OBJECTS environment variable
This variable is used as the primary way to disable the object
replacement mechanism, with the "--no-replace-objects" command line
option as an end-user visible way to set it, but has not been
documented.

The original reason why it was left undocumented might be because it
was meant as an internal implementation detail, but the thing is,
that our tests use the environment variable directly without the
command line option, and there certainly are folks who learned its
use from there, making it impossible to deprecate or change its
behaviour by now.

Add documentation and note that for this variable, unlike many
boolean-looking environment variables, only the presence matters,
not what value it is set to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 22:08:49 -08:00
Junio C Hamano
a2082dbdd3 Start the 2.45 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 18:10:25 -08:00
Junio C Hamano
7ece6ad823 Merge branch 'ps/ref-tests-update-even-more'
More tests that are marked as "ref-files only" have been updated to
improve test coverage of reftable backend.

* ps/ref-tests-update-even-more:
  t7003: ensure filter-branch prunes reflogs with the reftable backend
  t2011: exercise D/F conflicts with HEAD with the reftable backend
  t1405: remove unneeded cleanup step
  t1404: make D/F conflict tests compatible with reftable backend
  t1400: exercise reflog with gaps with reftable backend
  t0410: convert tests to use DEFAULT_REPO_FORMAT prereq
  t: move tests exercising the "files" backend
2024-02-26 18:10:25 -08:00
Junio C Hamano
65462776c2 Merge branch 'gt/at-is-synonym-for-head-in-add-patch'
Teach "git checkout -p" and friends that "@" is a synonym for
"HEAD".

* gt/at-is-synonym-for-head-in-add-patch:
  add -p tests: remove PERL prerequisites
  add-patch: classify '@' as a synonym for 'HEAD'
2024-02-26 18:10:25 -08:00
Junio C Hamano
cf258a9e4e Merge branch 'kh/column-reject-negative-padding'
"git column" has been taught to reject negative padding value, as
it would lead to nonsense behaviour including division by zero.

* kh/column-reject-negative-padding:
  column: guard against negative padding
  column: disallow negative padding
2024-02-26 18:10:25 -08:00
Junio C Hamano
225f892685 Merge branch 'jc/t9210-lazy-fix'
Adjust use of "rev-list --missing" in an existing tests so that it
does not depend on a buggy failure mode.

* jc/t9210-lazy-fix:
  t9210: do not rely on lazy fetching to fail
2024-02-26 18:10:24 -08:00
Junio C Hamano
9f67cbd0a7 Merge branch 'ps/reftable-iteration-perf'
The code to iterate over refs with the reftable backend has seen
some optimization.

* ps/reftable-iteration-perf:
  reftable/reader: add comments to `table_iter_next()`
  reftable/record: don't try to reallocate ref record name
  reftable/block: swap buffers instead of copying
  reftable/pq: allocation-less comparison of entry keys
  reftable/merged: skip comparison for records of the same subiter
  reftable/merged: allocation-less dropping of shadowed records
  reftable/record: introduce function to compare records by key
2024-02-26 18:10:24 -08:00
Junio C Hamano
274400998b Merge branch 'rs/use-xstrncmpz'
Code clean-up.

* rs/use-xstrncmpz:
  use xstrncmpz()
2024-02-26 18:10:24 -08:00
Junio C Hamano
cf47fb7ec7 Merge branch 'cp/apply-core-filemode'
"git apply" on a filesystem without filemode support have learned
to take a hint from what is in the index for the path, even when
not working with the "--index" or "--cached" option, when checking
the executable bit match what is required by the preimage in the
patch.

* cp/apply-core-filemode:
  apply: code simplification
  apply: correctly reverse patch's pre- and post-image mode bits
  apply: ignore working tree filemode when !core.filemode
2024-02-26 18:10:24 -08:00
Junio C Hamano
b4385bf016 Merge branch 'ps/reftable-backend'
Integrate the reftable code into the refs framework as a backend.

* ps/reftable-backend:
  refs/reftable: fix leak when copying reflog fails
  ci: add jobs to test with the reftable backend
  refs: introduce reftable backend
2024-02-26 18:10:23 -08:00
Jeff Hostetler
558d146d13 fsmonitor: remove custom loop from non-directory path handler
Refactor the code that handles refresh events for pathnames that do
not contain a trailing slash.  Instead of using a custom loop to try
to scan the index and detect if the FSEvent named a file or might be a
directory prefix, use the recently created helper function to do that.

Also update the comments to describe what and why we are doing this.

On platforms that DO NOT annotate FS events with a trailing
slash, if we fail to find an exact match for the pathname
in the index, we do not know if the pathname represents a
directory or simply an untracked file.  Pretend that the pathname
is a directory and try again before assuming it is an untracked
file.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:03 -08:00
Jeff Hostetler
a52482036c fsmonitor: return invalidated cache-entry count on directory event
Teach the refresh callback helper function for directory FSEvents to
return the number of cache-entries that were invalidated in response
to a directory event.

This will be used in a later commit to help determine if the observed
pathname in the FSEvent was a (possibly) case-incorrect directory
prefix (on a case-insensitive filesystem) of one or more actual
cache-entries.

If there exists at least one case-insensitive prefix match, then we
can assume that the directory is a (case-incorrect) prefix of at least
one tracked item rather than a completely unknown/untracked file or
directory.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:03 -08:00
Jeff Hostetler
7c97174dcd fsmonitor: move untracked-cache invalidation into helper functions
Move the call to invalidate the untracked-cache for the FSEvent
pathname into the two helper functions.

In a later commit in this series, we will call these helpers
from other contexts and it safer to include the UC invalidation
in the helpers than to remember to also add it to each helper
call-site.

This has the side-effect of invalidating the UC *before* we
invalidate the ce_flags in the cache-entry.  These activities
are independent and do not affect each other.  Also, by doing
the UC work first, we can avoid worrying about "early returns"
or the need for the usual "goto the end" in each of the
handler functions.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
Jeff Hostetler
48f4cd7155 fsmonitor: refactor untracked-cache invalidation
Update fsmonitor_refresh_callback() to use the new
untracked_cache_invalidate_trimmed_path() to invalidate
the cache using the observed pathname without needing to
modify the caller's buffer.

Previously, we modified the caller's buffer when the observed pathname
contained a trailing slash (and did not restore it).  This wasn't a
problem for the single use-case caller, but felt dirty nontheless.  In
a later commit we will want to invalidate case-corrected versions of
the pathname (using possibly borrowed pathnames from the name-hash or
dir-name-hash) and we may not want to keep the tradition of altering
the passed-in pathname.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
Jeff Hostetler
3e4ffda639 dir: create untracked_cache_invalidate_trimmed_path()
Create a wrapper function for untracked_cache_invalidate_path()
that silently trims a trailing slash, if present, before calling
the wrapped function.

The untracked cache expects to be called with a pathname that
does not contain a trailing slash.  This can make it inconvenient
for callers that have a directory path.  Lets hide this complexity.

This will be used by a later commit in the FSMonitor code which
may receive directory pathnames from an FSEvent.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
Jeff Hostetler
8687c2b067 fsmonitor: refactor refresh callback for non-directory events
Move the code that handles unqualified FSEvents (without a trailing
slash) into a helper function.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
Jeff Hostetler
7a15a62aeb fsmonitor: clarify handling of directory events in callback helper
Improve documentation of the refresh callback helper function
used for directory FSEvents.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:02 -08:00
Jeff Hostetler
e5da3ddbe9 fsmonitor: refactor refresh callback on directory events
Move the code to handle directory FSEvents (containing pathnames with
a trailing slash) into a helper function.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
Jeff Hostetler
32ca706fad t7527: add case-insensitve test for FSMonitor
The FSMonitor client code trusts the spelling of the pathnames in the
FSEvents received from the FSMonitor daemon.  On case-insensitive file
systems, these OBSERVED pathnames may be spelled differently than the
EXPECTED pathnames listed in the .git/index.  This causes a miss when
using `index_name_pos()` which expects the given case to be correct.

When this happens, the FSMonitor client code does not update the state
of the CE_FSMONITOR_VALID bit when refreshing the index (and before
starting to scan the worktree).

This results in modified files NOT being reported by `git status` when
there is a discrepancy in the case-spelling of a tracked file's
pathname.

This commit contains a (rather contrived) test case to demonstrate
this.  A later commit in this series will update the FSMonitor client
code to recognize these discrepancies and update the CE_ bit accordingly.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
Jeff Hostetler
b316552339 name-hash: add index_dir_find()
index_dir_exists() returns a boolean to indicate if there is a
case-insensitive match in the directory name-hash, but does not
provide the caller with the exact spelling of that match.

Create index_dir_find() to do the case-insensitive search *and*
optionally return the spelling of the matched directory prefix in a
provided strbuf.

To avoid code duplication, convert index_dir_exists() to be a trivial
wrapper around the new index_dir_find().

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 15:34:01 -08:00
Johannes Schindelin
4f66942215 neue: remove a bogus empty file
This file has been added as part of 2232a88ab6 (attr: add builtin
objectmode values support, 2023-11-16) and most likely serves no
relevant purpose.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:11:07 -08:00
Philippe Blain
b9e55be740 merge-ort: turn submodule conflict suggestions into an advice
Add a new advice type 'submoduleMergeConflict' for the error message
shown when a non-trivial submodule conflict is encountered, which
was added in 4057523a40 (submodule merge: update conflict error
message, 2022-08-04). That commit mentions making this message an
advice as possible future work.  The message can now be disabled
with the advice mechanism.

Update the tests as the expected message now appears on stderr instead
of stdout.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:07:01 -08:00
Jeff King
5edd126720 read_ref_at(): special-case ref@{0} for an empty reflog
The previous commit special-cased get_oid_basic()'s handling of ref@{n}
for a reflog with n entries. But its special case doesn't work for
ref@{0} in an empty reflog, because read_ref_at() dies when it notices
the empty reflog!

We can make this work by special-casing this in read_ref_at(). It's
somewhat gross, for two reasons:

  1. We have no reflog entry to describe in the "msg" out-parameter. So
     we have to leave it uninitialized or make something up.

  2. Likewise, we have no oid to put in the "oid" out-parameter. Leaving
     it untouched is actually the best thing here, as all of the callers
     will have initialized it with the current ref value via
     repo_dwim_log(). This is rather subtle, but it is how things worked
     in 6436a20284 (refs: allow @{n} to work with n-sized reflog,
     2021-01-07) before we reverted it.

The key difference from 6436a20284 here is that we'll return "1" to
indicate that we _didn't_ find the requested reflog entry. Coupled with
the special-casing in get_oid_basic() in the previous commit, that's
enough to make looking up ref@{0} work, and we can flip 6436a20284's
test back to expect_success.

It also means that the call in show-branch which segfaulted with
6436a20284 (and which is now tested in t3202) remains OK. The caller
notices that we could not find any reflog entry, and so it breaks out of
its loop, showing nothing. This is different from the current behavior
of producing an error, but it's just as reasonable (and is exactly what
we'd do if you asked it to walk starting at ref@{1} but there was only 1
entry).

Thus nobody should actually look at the reflog entry info we return. But
we'll still put in some fake values just to be on the safe side, since
this is such a subtle and confusing interface. Likewise, we'll document
what's going on in a comment above the function declaration. If this
were a function with a lot of callers, the footgun would probably not be
worth it. But it has only ever had two callers in its 18-year existence,
and it seems unlikely to grow more. So let's hold our noses and let
users enjoy the convenience of a simulated ref@{0}.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:05:35 -08:00