Commit Graph

74032 Commits

Author SHA1 Message Date
Taylor Blau
fcb2205b77 midx: implement support for writing incremental MIDX chains
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
Taylor Blau
147c3f6740 t/t5313-pack-bounds-checks.sh: prepare for sub-directories
Prepare for sub-directories to appear in $GIT_DIR/objects/pack by
adjusting the copy, remove, and chmod invocations to perform their
behavior recursively.

This prepares us for the new $GIT_DIR/objects/pack/multi-pack-index.d
directory which will be added in a following commit.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
Taylor Blau
9552c3595a t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
Two years ago, commit ff1e653c8e (midx: respect
'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new
environment variable which caused the test suite to write MIDX bitmaps
after any 'git repack' invocation.

At the time, this was done to help flush out any bugs with MIDX bitmaps
that weren't explicitly covered in the t5326-multi-pack-bitmap.sh
script.

Two years later, that flag has served us well and is no longer providing
meaningful coverage, as the script in t5326 has matured substantially
and covers many more interesting cases than it did back when ff1e653c8e
was originally written.

Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable
as it is no longer serving a useful purpose. More importantly, removing
this variable clears the way for us to introduce a new one to help
similarly flush out bugs related to incremental MIDX chains.

Because these incremental MIDX chains are (for now) incompatible with
MIDX bitmaps, we cannot have both.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Taylor Blau
3592796d0a midx: implement verification support for incremental MIDXs
Teach the verification implementation used by `git multi-pack-index
verify` to perform verification for incremental MIDX chains by
independently validating each layer within the chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Taylor Blau
b80236d0e3 midx: support reading incremental MIDX chains
Now that the MIDX machinery's internals have been taught to understand
incremental MIDXs over the previous handful of commits, the MIDX
machinery itself can begin reading incremental MIDXs.

(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)

The core of this change involves following the order specified in the
MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
adding them to the previous layer's `->base_midx` pointer at each step.

In order to implement this, the `load_multi_pack_index()` function is
taught to call a new `load_multi_pack_index_chain()` function if loading
a non-incremental MIDX failed via `load_multi_pack_index_one()`.

When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. When
a layer was successfully read, it is added to the MIDX chain by calling
`add_midx_to_chain()` which validates the contents of the `BASE` chunk,
performs some bounds checks on the number of combined packs and objects,
and attaches the new MIDX by assigning its `base_midx` pointer to the
existing part of the chain.

As a supplement to this, introduce a new mode in the test-read-midx
test-tool which allows us to read the information for a specific MIDX in
the chain by specifying its trailing checksum via the command-line
arguments like so:

    $ test-tool read-midx .git/objects [checksum]

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Taylor Blau
97fd770ea1 midx: teach midx_fanout_add_midx_fanout() about incremental MIDXs
The function `midx_fanout_add_midx_fanout()` is used to help construct
the fanout table when generating a MIDX by reusing data from an existing
MIDX.

Prepare this function to work with incremental MIDXs by making a few
changes:

  - The bounds checks need to be adjusted to start object lookups taking
    into account the number of objects in the previous MIDX layer (i.e.,
    by starting the lookups at position `m->num_objects_in_base` instead
    of position 0).

  - Likewise, the bounds checks need to end at `m->num_objects_in_base`
    objects after `m->num_objects`.

  - Finally, `midx_fanout_add_midx_fanout()` needs to recur on earlier
    MIDX layers when dealing with an incremental MIDX chain by calling
    itself when given a MIDX with a non-NULL `base_midx`.

Note that after 0c5a62f14b (midx-write.c: do not read existing MIDX with
`packs_to_include`, 2024-06-11), we do not use this function with an
existing MIDX (incremental or not) when generating a MIDX with
--stdin-packs, and likewise for incremental MIDXs.

But it is still used when adding the fanout table from an incremental
MIDX when generating a non-incremental MIDX (without --stdin-packs, of
course).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Taylor Blau
b31f2aac56 midx: teach midx_preferred_pack() about incremental MIDXs
The function `midx_preferred_pack()` is used to determine the identity
of the preferred pack, which is the identity of a unique pack within
the MIDX which is used as a tie-breaker when selecting from which pack
to represent an object that appears in multiple packs within the MIDX.

Historically we have said that the MIDX's preferred pack has the unique
property that all objects from that pack are represented in the MIDX.
But that isn't quite true: a more precise statement would be that all
objects from that pack *which appear in the MIDX* are selected from that
pack.

This helps us extend the concept of preferred packs across a MIDX chain,
where some object(s) in the preferred pack may appear in other packs
in an earlier MIDX layer, in which case those object(s) will not appear
in a subsequent MIDX layer from either the preferred pack or any other
pack.

Extend the concept of preferred packs by using the pack which represents
the object at the first position in MIDX pseudo-pack order belonging to
the current MIDX layer (i.e., at position 'm->num_objects_in_base').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00
Taylor Blau
853165c50a midx: teach midx_contains_pack() about incremental MIDXs
Now that the `midx_contains_pack()` versus `midx_locate_pack()` debacle
has been cleaned up, teach the former about how to operate in an
incremental MIDX-aware world in a similar fashion as in previous
commits.

Instead of using either of the two `midx_for_object()` or
`midx_for_pack()` helpers, this function is split into two: one that
determines whether a pack is contained in a single MIDX, and another
which calls the former in a loop over all MIDXs.

This approach does not require that we change any of the implementation
in what is now `midx_contains_pack_1()` as it still operates over a
single MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
Taylor Blau
5d0ee3f675 midx: remove unused midx_locate_pack()
Commit 307d75bbe6 (midx: implement `midx_locate_pack()`, 2023-12-14)
introduced `midx_locate_pack()`, which was described at the time as a
complement to the function `midx_contains_pack()` which allowed
callers to determine where in the MIDX lexical order a pack appeared, as
opposed to whether or not it was simply contained.

307d75bbe6 suggests that future patches would be added which would
introduce callers for this new function, but none ever were, meaning the
function has gone unused since its introduction.

Clean this up by in effect reverting 307d75bbe6, which removes the
unused functions and inlines its definition back into
`midx_contains_pack()`.

(Looking back through the list archives when 307d75bbe6 was written,
this was in preparation for this[1] patch from back when we had the
concept of "disjoint" packs while developing multi-pack verbatim reuse.
That concept was abandoned before the series was merged, but I never
dropped what would become 307d75bbe6 from the series, leading to the
state prior to this commit).

[1]: https://lore.kernel.org/git/3019738b52ba8cd78ea696a3b800fa91e722eb66.1701198172.git.me@ttaylorr.com/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
Taylor Blau
3b00e35108 midx: teach fill_midx_entry() about incremental MIDXs
In a similar fashion as previous commits, teach the `fill_midx_entry()`
function to work in a incremental MIDX-aware fashion.

This function, unlike others which accept an index into either the
lexical order of objects or packs, takes in an object_id, and attempts
to fill a caller-provided 'struct pack_entry' with the remaining pieces
of information about that object from the MIDX.

The function uses `bsearch_midx()` which fills out the frame-local 'pos'
variable, recording the given object_id's lexical position within the
MIDX chain, if found (if no matching object ID was found, we'll return
immediately without filling out the `pack_entry` structure).

Once given that position, we jump back through the `->base_midx` pointer
to ensure that our `m` points at the MIDX layer which contains the given
object_id (and not an ancestor or descendant of it in the chain). Note
that we can drop the bounds check "if (pos >= m->num_objects)" because
`midx_for_object()` performs this check for us.

After that point, we only need to make two special considerations within
this function:

  - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()`
    is a position in the concatenated lexical order of packs, so we must
    ensure that we subtract `m->num_packs_in_base` before accessing the
    MIDX-local `packs` array.

  - Second, we must avoid translating the `pos` back to a MIDX-local
    index, since we use it as an argument to `nth_midxed_offset()` which
    expects a position relative to the concatenated lexical order of
    objects.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
Taylor Blau
df7ede83be midx: teach nth_midxed_offset() about incremental MIDXs
In a similar fashion as in previous commits, teach the function
`nth_midxed_offset()` about incremental MIDXs.

The given object `pos` is used to find the containing MIDX, and
translated back into a MIDX-local position by assigning the return value
of `midx_for_object()` to it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
Taylor Blau
88f309e095 midx: teach bsearch_midx() about incremental MIDXs
Now that the special cases callers of `bsearch_midx()` have been dealt
with, teach `bsearch_midx()` to handle incremental MIDX chains.

The incremental MIDX-aware version of `bsearch_midx()` works by
repeatedly searching for a given OID in each layer along the
`->base_midx` pointer, stopping either when an exact match is found, or
the end of the chain is reached.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:37 -07:00
Taylor Blau
3f5f1cff92 midx: introduce bsearch_one_midx()
The `bsearch_midx()` function will be extended in a following commit to
search for the location of a given object ID across all MIDXs in a chain
(or the single non-chain MIDX if no chain is available).

While most callers will naturally want to use the updated
`bsearch_midx()` function, there are a handful of special cases that
will want finer control and will only want to search through a single
MIDX.

For instance, the object abbreviation code, which cares about object IDs
near to where we'd expect to find a match in a MIDX. In that case, we
want to look at the nearby matches in each layer of the MIDX chain, not
just a single one).

Split the more fine-grained control out into a separate function called
`bsearch_one_midx()` which searches only a single MIDX.

At present both `bsearch_midx()` and `bsearch_one_midx()` have identical
behavior, but the following commit will rewrite the former to be aware
of incremental MIDXs for the remaining non-special case callers.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
60750e1eb9 midx: teach nth_bitmapped_pack() about incremental MIDXs
In a similar fashion as in previous commits, teach the function
`nth_bitmapped_pack()` about incremental MIDXs by translating the given
`pack_int_id` from the concatenated lexical order to a MIDX-local
lexical position.

When accessing the containing MIDX's array of packs, use the local pack
ID. Likewise, when reading the 'BTMP' chunk, use the MIDX-local offset
when accessing the data within that chunk.

(Note that the both the call to prepare_midx_pack() and the assignment
of bp->pack_int_id both care about the global pack_int_id, so avoid
shadowing the given 'pack_int_id' parameter).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
26afb5afa1 midx: teach nth_midxed_object_oid() about incremental MIDXs
The function `nth_midxed_object_oid()` returns the object ID for a given
object position in the MIDX lexicographic order.

Teach this function to instead operate over the concatenated
lexicographic order defined in an earlier step so that it is able to be
used with incremental MIDXs.

To do this, we need to both (a) adjust the bounds check for the given
'n', as well as record the MIDX-local position after chasing the
`->base_midx` pointer to find the MIDX which contains that object.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
1820bd878c midx: teach prepare_midx_pack() about incremental MIDXs
The function `prepare_midx_pack()` is part of the midx.h API and
loads the pack identified by the MIDX-local 'pack_int_id'. This patch
prepares that function to be aware of an incremental MIDX world.

To do this, introduce the second of the two general purpose helpers
mentioned in the previous commit. This commit introduces
`midx_for_pack()`, which is the pack-specific analog of
`midx_for_object()`, and works in the same fashion.

Like `midx_for_object()`, this function chases down the '->base_midx'
field until it finds the MIDX layer within the chain that contains the
given pack.

Use this function within `prepare_midx_pack()` so that the `pack_int_id`
it expects is now relative to the entire MIDX chain, and that it
prepares the given pack in the appropriate MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
19419821ba midx: teach nth_midxed_pack_int_id() about incremental MIDXs
The function `nth_midxed_pack_int_id()` takes in a object position in
MIDX lexicographic order and returns an identifier of the pack from
which that object was selected in the MIDX.

Currently, the given object position is an index into the lexicographic
order of objects in a single MIDX. Change this position to instead refer
into the concatenated lexicographic order of all MIDXs in a MIDX chain.

This has two visible effects within the implementation of
`prepare_midx_pack()`:

  - First, the given position is now an index into the concatenated
    lexicographic order of all MIDXs in the order in which they appear
    in the MIDX chain.

  - Second the pack ID returned from this function is now also in the
    concatenated order of packs among all layers of the MIDX chain in
    the same order that they appear in the MIDX chain.

To do this, introduce the first of two general purpose helpers, this one
being `midx_for_object()`. `midx_for_object()` takes a double pointer to
a `struct multi_pack_index` as well as an object `pos` in terms of the
entire MIDX chain[^1].

The function chases down the '->base_midx' field until it finds the MIDX
layer within the chain that contains the given object. It then:

  - modifies the double pointer to point to the containing MIDX, instead
    of the tip of the chain, and

  - returns the MIDX-local position[^2] at which the given object can be
    found.

Use this function within `nth_midxed_pack_int_id()` so that the `pos` it
expects is now relative to the entire MIDX chain, and that it returns
the appropriate pack position for that object.

[^1]: As a reminder, this means that the object is identified among the
  objects contained in all layers of the incremental MIDX chain, not any
  particular layer. For example, consider MIDX chain with two individual
  MIDXs, one with 4 objects and another with 3 objects. If the MIDX with
  4 objects appears earlier in the chain, then asking for object 6 would
  return the second object in the MIDX with 3 objects.

[^2]: Building on the previous example, asking for object 6 in a MIDX
  chain with (4, 3) objects, respectively, this would set the double
  pointer to point at the MIDX containing three objects, and would
  return an index to the second object within that MIDX.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
2678a73009 midx: add new fields for incremental MIDX chains
The incremental MIDX chain feature is designed around the idea of
indexing into a concatenated lexicographic ordering of object IDs
present in the MIDX.

When given an object position, the MIDX machinery needs to be able to
locate both (a) which MIDX layer contains the given object, and (b) at
what position *within that MIDX layer* that object appears.

To do this, three new fields are added to the `struct multi_pack_index`:

  - struct multi_pack_index *base_midx;
  - uint32_t num_objects_in_base;
  - uint32_t num_packs_in_base;

These three fields store the pieces of information suggested by their
respective field names. In turn, the `num_objects_in_base` and
`num_packs_in_base` fields are used to crawl backwards along the
`base_midx` pointer to locate the appropriate position for a given
object within the MIDX that contains it.

The following commits will update various parts of the MIDX machinery
(as well as their callers from outside of midx.c and midx-write.c) to be
aware and make use of these fields when performing object lookups.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:36 -07:00
Taylor Blau
6eb1a7d7b0 Documentation: describe incremental MIDX format
Prepare to implement incremental multi-pack indexes (MIDXs) over the
next several commits by first describing the relevant prerequisites
(like a new chunk in the MIDX format, the directory structure for
incremental MIDXs, etc.)

The format is described in detail in the patch contents below, but the
high-level description is as follows.

Incremental MIDXs live in $GIT_DIR/objects/pack/multi-pack-index.d, and
each `*.midx` within that directory has a single "parent" MIDX, which is
the MIDX layer immediately before it in the MIDX chain. The chain order
resides in a file 'multi-pack-index-chain' in the same directory.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:35 -07:00
Junio C Hamano
04f5a52757 Post 2.46-rc0 batch #2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16 11:18:58 -07:00
Junio C Hamano
d6c86368c8 Merge branch 'bc/gitfaq-more'
A handful of entries are added to the GitFAQ document.

* bc/gitfaq-more:
  doc: mention that proxies must be completely transparent
  gitfaq: add entry about syncing working trees
  gitfaq: give advice on using eol attribute in gitattributes
  gitfaq: add documentation on proxies
2024-07-16 11:18:58 -07:00
Junio C Hamano
fe5ba894ec Merge branch 'bc/http-proactive-auth'
The http transport can now be told to send request with
authentication material without first getting a 401 response.

* bc/http-proactive-auth:
  http: allow authenticating proactively
2024-07-16 11:18:57 -07:00
Junio C Hamano
12d49fd028 Merge branch 'jc/where-is-bash-for-ci'
Shell script clean-up.

* jc/where-is-bash-for-ci:
  ci: unify bash calling convention
2024-07-16 11:18:57 -07:00
Junio C Hamano
5d71940dda Merge branch 'ds/advice-sparse-index-expansion'
A new warning message is issued when a command has to expand a
sparse index to handle working tree cruft that are outside of the
sparse checkout.

* ds/advice-sparse-index-expansion:
  advice: warn when sparse index expands
2024-07-16 11:18:56 -07:00
Junio C Hamano
f4c6a0e275 Merge branch 'cb/send-email-sanitize-trailer-addresses'
Address-looking strings found on the trailer are now placed on the
Cc: list after running through sanitize_address by "git send-email".

* cb/send-email-sanitize-trailer-addresses:
  git-send-email: use sanitized address when reading mbox body
2024-07-16 11:18:56 -07:00
Junio C Hamano
ffc8f1142c Merge branch 'en/ort-inner-merge-error-fix'
The "ort" merge backend saw one bugfix for a crash that happens
when inner merge gets killed, and assorted code clean-ups.

* en/ort-inner-merge-error-fix:
  merge-ort: fix missing early return
  merge-ort: convert more error() cases to path_msg()
  merge-ort: upon merge abort, only show messages causing the abort
  merge-ort: loosen commented requirements
  merge-ort: clearer propagation of failure-to-function from merge_submodule
  merge-ort: fix type of local 'clean' var in handle_content_merge ()
  merge-ort: maintain expected invariant for priv member
  merge-ort: extract handling of priv member into reusable function
2024-07-16 11:18:55 -07:00
Junio C Hamano
ad850ef1cf Post 2.46-rc0 batch #1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-15 10:11:44 -07:00
Junio C Hamano
9118e46e81 Merge branch 'cp/unit-test-reftable-record'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-record:
  t-reftable-record: add tests for reftable_log_record_compare_key()
  t-reftable-record: add tests for reftable_ref_record_compare_name()
  t-reftable-record: add index tests for reftable_record_is_deletion()
  t-reftable-record: add obj tests for reftable_record_is_deletion()
  t-reftable-record: add log tests for reftable_record_is_deletion()
  t-reftable-record: add ref tests for reftable_record_is_deletion()
  t-reftable-record: add comparison tests for obj records
  t-reftable-record: add comparison tests for index records
  t-reftable-record: add comparison tests for ref records
  t-reftable-record: add reftable_record_cmp() tests for log records
  t: move reftable/record_test.c to the unit testing framework
2024-07-15 10:11:44 -07:00
Junio C Hamano
f582dc3c5a Merge branch 'jc/disable-push-nego-for-deletion'
"git push" that pushes only deletion gave an unnecessary and
harmless error message when push negotiation is configured, which
has been corrected.

* jc/disable-push-nego-for-deletion:
  push: avoid showing false negotiation errors
2024-07-15 10:11:43 -07:00
Junio C Hamano
fbeed643b9 Merge branch 'ri/doc-show-branch-fix'
Docfix.

* ri/doc-show-branch-fix:
  doc: fix the max number of branches shown by "show-branch"
2024-07-15 10:11:43 -07:00
Junio C Hamano
d319ad5704 Merge branch 'tb/dev-build-pedantic-fix'
Developer build procedure fix.

* tb/dev-build-pedantic-fix:
  config.mak.dev: fix typo when enabling -Wpedantic
2024-07-15 10:11:42 -07:00
Junio C Hamano
76f49679b1 Merge branch 'rs/clang-format-updates'
Custom control structures we invented more recently have been
taught to the clang-format file.

* rs/clang-format-updates:
  clang-format: include kh_foreach* macros in ForEachMacros
2024-07-15 10:11:42 -07:00
Junio C Hamano
ccb74f51c9 Merge branch 'am/gitweb-feed-use-committer-date'
GitWeb update to use committer date consistently in rss/atom feeds.

* am/gitweb-feed-use-committer-date:
  gitweb: rss/atom change published/updated date to committer date
2024-07-15 10:11:41 -07:00
Junio C Hamano
820e796984 Merge branch 'jk/tests-without-dns'
Test suite has been taught not to unnecessarily rely on DNS failing
a bogus external name.

* jk/tests-without-dns:
  t/lib-bundle-uri: use local fake bundle URLs
  t5551: do not confirm that bogus url cannot be used
  t5553: use local url for invalid fetch
2024-07-15 10:11:41 -07:00
Junio C Hamano
cda729581b Merge branch 'gt/unit-test-oidmap'
An existing test of oidmap API has been rewritten with the
unit-test framework.

* gt/unit-test-oidmap:
  t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
2024-07-15 10:11:40 -07:00
Junio C Hamano
b227482ea0 Merge branch 'as/describe-broken-refresh-index-fix'
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any chang, ("git describe --dirty" correctly did
so), which has been corrected.

* as/describe-broken-refresh-index-fix:
  describe: refresh the index when 'broken' flag is used
2024-07-15 10:11:40 -07:00
Junio C Hamano
d8b9b1fc81 Merge branch 'rj/t0613-no-longer-leaks'
A test that no longer leaks has been marked as such.

* rj/t0613-no-longer-leaks:
  t0613: mark as leak-free
2024-07-15 10:11:39 -07:00
Junio C Hamano
84fc58f24b Merge branch 'rj/t0612-no-longer-leaks'
A test that no longer leaks has been marked as such.

* rj/t0612-no-longer-leaks:
  t0612: mark as leak-free
2024-07-15 10:11:39 -07:00
Junio C Hamano
a7dae3bdc8 Git 2.46-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-12 08:41:58 -07:00
Junio C Hamano
e6ae4d6efe Merge branch 'rs/simplify-submodule-helper-super-prefix-invocation'
Code clean-up.

* rs/simplify-submodule-helper-super-prefix-invocation:
  submodule--helper: use strvec_pushf() for --super-prefix
2024-07-12 08:41:58 -07:00
Junio C Hamano
7c01dcd018 Merge branch 'as/pathspec-h-typofix'
Typofix.

* as/pathspec-h-typofix:
  pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"
2024-07-12 08:41:57 -07:00
brian m. carlson
610cbc1dfb http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response.  Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.

However, there may be times when a user would like to authenticate
nevertheless.  For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use.  Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.

Let's make this possible with a new option, "http.proactiveAuth".  This
option specifies a type of authentication which can be used to
authenticate against the host in question.  This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.

If we're in auto mode and we got a username and password, set the
authentication scheme to Basic.  libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves.  In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.

Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided.  Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it.  While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:27:51 -07:00
brian m. carlson
70405acf60 doc: mention that proxies must be completely transparent
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry.  We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
804ecbcfd1 gitfaq: add entry about syncing working trees
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes.  Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.

The technique that many users are using is their preferred cloud syncing
service, which is a bad idea.  Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly.  They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.

We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools.  Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky.  Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that.  While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
c98f78b806 gitfaq: give advice on using eol attribute in gitattributes
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute.  As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.

Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.

In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between.  The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
2101341484 gitfaq: add documentation on proxies
Many corporate environments and local systems have proxies in use.  Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git.  Explicitly call out certain classes that are known to routinely
have problems reported various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.

[0] https://issues.chromium.org/issues/40285192
[1] https://faculty.cc.gatech.edu/~mbailey/publications/ndss17_interception.pdf

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
Junio C Hamano
58696bfcaa ci: unify bash calling convention
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion).  As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.

So let's make sure we start these scripts with either one of these:

	#!/bin/sh
	#!/usr/bin/env bash

Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 16:23:05 -07:00
Junio C Hamano
557ae147e6 The ninteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 14:53:11 -07:00
Junio C Hamano
a43b001cce Merge branch 'ds/sparse-lstat-caching'
The code to deal with modified paths that are out-of-cone in a
sparsely checked out working tree has been optimized.

* ds/sparse-lstat-caching:
  sparse-index: improve lstat caching of sparse paths
  sparse-index: count lstat() calls
  sparse-index: use strbuf in path_found()
  sparse-index: refactor path_found()
  sparse-checkout: refactor skip worktree retry logic
2024-07-08 14:53:11 -07:00
Junio C Hamano
125e389470 Merge branch 'xx/bundie-uri-fixes'
When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs
2024-07-08 14:53:11 -07:00