index-pack: repack local links into promisor packs

Teach index-pack to, when processing the objects in a pack with
--promisor specified on the CLI, repack local objects (and the local
objects that they refer to, recursively) referenced by these objects
into promisor packs.

This prevents the situation in which, when fetching from a promisor
remote, we end up with promisor objects (newly fetched) referring
to non-promisor objects (locally created prior to the fetch). This
situation may arise if the client had previously pushed objects to the
remote, for example. One issue that arises in this situation is that,
if the non-promisor objects become inaccessible except through promisor
objects (for example, if the branch pointing to them has moved to
point to the promisor object that refers to them), then GC will garbage
collect them. There are other ways to solve this, but the simplest
seems to be to enforce the invariant that we don't have promisor objects
referring to non-promisor objects.

This repacking is done from index-pack to minimize the performance
impact. During a fetch, the only time most objects are fully inflated
in memory is when their object ID is computed, so we also scan the
objects (to see which objects they refer to) during this time.

Also to minimize the performance impact, an object is calculated to be
local if it's a loose object or present in a non-promisor pack. (If it's
also in a promisor pack or referred to by an object in a promisor pack,
it is technically already a promisor object. But a misidentification
of a promisor object as a non-promisor object is relatively benign
here - we will thus repack that promisor object into a promisor pack,
duplicating it in the object store, but there is no correctness issue,
just an issue of inefficiency.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Jonathan Tan
2024-11-01 13:11:48 -07:00
committed by Junio C Hamano
parent d9e24ce2ca
commit c08589efdc
4 changed files with 172 additions and 2 deletions

View File

@@ -239,6 +239,7 @@ static enum {
static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE;
static int exclude_promisor_objects;
static int exclude_promisor_objects_best_effort;
static int use_delta_islands;
@@ -4312,6 +4313,18 @@ static int option_parse_cruft_expiration(const struct option *opt UNUSED,
return 0;
}
static int is_not_in_promisor_pack_obj(struct object *obj, void *data UNUSED)
{
struct object_info info = OBJECT_INFO_INIT;
if (oid_object_info_extended(the_repository, &obj->oid, &info, 0))
BUG("should_include_obj should only be called on existing objects");
return info.whence != OI_PACKED || !info.u.packed.pack->pack_promisor;
}
static int is_not_in_promisor_pack(struct commit *commit, void *data) {
return is_not_in_promisor_pack_obj((struct object *) commit, data);
}
int cmd_pack_objects(int argc,
const char **argv,
const char *prefix,
@@ -4424,6 +4437,9 @@ int cmd_pack_objects(int argc,
option_parse_missing_action),
OPT_BOOL(0, "exclude-promisor-objects", &exclude_promisor_objects,
N_("do not pack objects in promisor packfiles")),
OPT_BOOL(0, "exclude-promisor-objects-best-effort",
&exclude_promisor_objects_best_effort,
N_("implies --missing=allow-any")),
OPT_BOOL(0, "delta-islands", &use_delta_islands,
N_("respect islands during delta compression")),
OPT_STRING_LIST(0, "uri-protocol", &uri_protocols,
@@ -4504,10 +4520,18 @@ int cmd_pack_objects(int argc,
strvec_push(&rp, "--unpacked");
}
if (exclude_promisor_objects && exclude_promisor_objects_best_effort)
die(_("options '%s' and '%s' cannot be used together"),
"--exclude-promisor-objects", "--exclude-promisor-objects-best-effort");
if (exclude_promisor_objects) {
use_internal_rev_list = 1;
fetch_if_missing = 0;
strvec_push(&rp, "--exclude-promisor-objects");
} else if (exclude_promisor_objects_best_effort) {
use_internal_rev_list = 1;
fetch_if_missing = 0;
option_parse_missing_action(NULL, "allow-any", 0);
/* revs configured below */
}
if (unpack_unreachable || keep_unreachable || pack_loose_unreachable)
use_internal_rev_list = 1;
@@ -4627,6 +4651,10 @@ int cmd_pack_objects(int argc,
repo_init_revisions(the_repository, &revs, NULL);
list_objects_filter_copy(&revs.filter, &filter_options);
if (exclude_promisor_objects_best_effort) {
revs.include_check = is_not_in_promisor_pack;
revs.include_check_obj = is_not_in_promisor_pack_obj;
}
get_object_list(&revs, rp.nr, rp.v);
release_revisions(&revs);
}