1ab5948141e62b52bcb812b04a901b3efaf1b578
110 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
2efe7958d6 |
Merge branch 'ps/reftable-iteration-perf-part2' into ps/reftable-reflog-iteration-perf
* ps/reftable-iteration-perf-part2: refs/reftable: precompute prefix length reftable: allow inlining of a few functions reftable/record: decode keys in place reftable/record: reuse refname when copying reftable/record: reuse refname when decoding reftable/merged: avoid duplicate pqueue emptiness check reftable/merged: circumvent pqueue with single subiter reftable/merged: handle subiter cleanup on close only reftable/merged: remove unnecessary null check for subiters reftable/merged: make subiters own their records reftable/merged: advance subiter on subsequent iteration reftable/merged: make `merged_iter` structure private reftable/pq: use `size_t` to track iterator index |
||
|
|
43f70eaea0 |
refs/reftable: precompute prefix length
We're recomputing the prefix length on every iteration of the ref
iterator. Precompute it for another speedup when iterating over 1
million refs:
Benchmark 1: show-ref: single matching ref (revision = HEAD~)
Time (mean ± σ): 100.3 ms ± 3.7 ms [User: 97.3 ms, System: 2.8 ms]
Range (min … max): 97.5 ms … 139.7 ms 1000 runs
Benchmark 2: show-ref: single matching ref (revision = HEAD)
Time (mean ± σ): 95.8 ms ± 3.4 ms [User: 92.9 ms, System: 2.8 ms]
Range (min … max): 93.0 ms … 121.9 ms 1000 runs
Summary
show-ref: single matching ref (revision = HEAD) ran
1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
||
|
|
b0f6b6b523 |
refs/reftable: don't fail empty transactions in repo without HEAD
Under normal circumstances, it shouldn't ever happen that a repository has no HEAD reference. In fact, git-update-ref(1) would fail any request to delete the HEAD reference, and a newly initialized repository always pre-creates it, too. We have however changed git-clone(1) to partially initialize the refdb just up to the point where remote helpers can find the repository. With that change, we are going to run into a situation where repositories have no refs at all. Now there is a very particular edge case in this situation: when preparing an empty ref transacton, we end up returning whatever value `read_ref_without_reload()` returned to the caller. Under normal conditions this would be fine: "HEAD" should usually exist, and thus the function would return `0`. But if "HEAD" doesn't exist, the function returns a positive value which we end up returning to the caller. Fix this bug by resetting the return code to `0` and add a test. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
33d15b5435 |
for-each-ref: add new option to include root refs
The git-for-each-ref(1) command doesn't provide a way to print root refs i.e pseudorefs and HEAD with the regular "refs/" prefixed refs. This commit adds a new option "--include-root-refs" to git-for-each-ref(1). When used this would also print pseudorefs and HEAD for the current worktree. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
59c50a96c5 |
refs: stop resolving ref corresponding to reflogs
The reflog iterator tries to resolve the corresponding ref for every
reflog that it is about to yield. Historically, this was done due to
multiple reasons:
- It ensures that the refname is safe because we end up calling
`check_refname_format()`. Also, non-conformant refnames are skipped
altogether.
- The iterator used to yield the resolved object ID as well as its
flags to the callback. This info was never used though, and the
corresponding parameters were dropped in the preceding commit.
- When a ref is corrupt then the reflog is not emitted at all.
We're about to introduce a new `git reflog list` subcommand that will
print all reflogs that the refdb knows about. Skipping over reflogs
whose refs are corrupted would be quite counterproductive in this case
as the user would have no way to learn about reflogs which may still
exist in their repository to help and rescue such a corrupted ref. Thus,
the only remaining reason for why we'd want to resolve the ref is to
verify its refname.
Refactor the code to call `check_refname_format()` directly instead of
trying to resolve the ref. This is significantly more efficient given
that we don't have to hit the object database anymore to list reflogs.
And second, it ensures that we end up showing reflogs of broken refs,
which will help to make the reflog more useful.
Note that this really only impacts the case where the corresponding ref
is corrupt. Reflogs for nonexistent refs would have been returned to the
caller beforehand already as we did not pass `RESOLVE_REF_READING` to
the function, and thus `refs_resolve_ref_unsafe()` would have returned
successfully in that case.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
||
|
|
31f898397b |
refs: drop unused params from the reflog iterator callback
The ref and reflog iterators share much of the same underlying code to iterate over the corresponding entries. This results in some weird code because the reflog iterator also exposes an object ID as well as a flag to the callback function. Neither of these fields do refer to the reflog though -- they refer to the corresponding ref with the same name. This is quite misleading. In practice at least the object ID cannot really be implemented in any other way as a reflog does not have a specific object ID in the first place. This is further stressed by the fact that none of the callbacks except for our test helper make use of these fields. Split up the infrastucture so that ref and reflog iterators use separate callback signatures. This allows us to drop the nonsensical fields from the reflog iterator. Note that internally, the backends still use the same shared infra to iterate over both types. As the backends should never end up being called directly anyway, this is not much of a problem and thus kept as-is for simplicity's sake. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
5e01d83841 |
refs: always treat iterators as ordered
In the preceding commit we have converted the reflog iterator of the "files" backend to be ordered, which was the only remaining ref iterator that wasn't ordered. Refactor the ref iterator infrastructure so that we always assume iterators to be ordered, thus simplifying the code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
6f22780017 |
refs/files: sort merged worktree and common reflogs
When iterating through reflogs in a worktree we create a merged iterator that merges reflogs from both refdbs. The resulting refs are ordered so that instead we first return all worktree reflogs before we return all common refs. This is the only remaining case where a ref iterator returns entries in a non-lexicographic order. The result would look something like the following (listed with a command we introduce in a subsequent commit): ``` $ git reflog list HEAD refs/worktree/per-worktree refs/heads/main refs/heads/wt ``` So we first print the per-worktree reflogs in lexicographic order, then the common reflogs in lexicographic order. This is confusing and not consistent with how we print per-worktree refs, which are exclusively sorted lexicographically. Sort reflogs lexicographically in the same way as we sort normal refs. As this is already implemented properly by the "reftable" backend via a separate selection function, we simply pull out that logic and reuse it for the "files" backend. As logs are properly sorted now, mark the merged reflog iterator as sorted. Tests will be added in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
8a0bebdeae |
refs/reftable: fix leak when copying reflog fails
When copying a ref with the reftable backend we also copy the corresponding log records. When seeking the first log record that we're about to copy fails though we directly return from `write_copy_table()` without doing any cleanup, leaking several allocated data structures. Fix this by exiting via our common cleanup logic instead. Reported-by: Jeff King <peff@peff.net> via Coverity Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com> |
||
|
|
57db2a094d |
refs: introduce reftable backend
Due to scalability issues, Shawn Pearce has originally proposed a new
"reftable" format more than six years ago [1]. Initially, this new
format was implemented in JGit with promising results. Around two years
ago, we have then added the "reftable" library to the Git codebase via
|