Due to scalability issues, Shawn Pearce originally proposed a new
"reftable" format more than six years ago [1]. Initially, this new
format was implemented in JGit with promising results. Around two years
ago, we then added the "reftable" library to the Git codebase via
a4bbd13be3 (Merge branch 'hn/reftable', 2021-12-15). With this we have
landed all the low-level code to read and write reftables. Notably
missing though was the integration of this low-level code into the Git
code base in the form of a new ref backend that ties all of this
together.
This gap is now finally closed by introducing a new "reftable" backend
into the Git codebase. This new backend promises to bring some notable
improvements to Git repositories:
- It becomes possible to do truly atomic writes where either all refs
are committed to disk or none are. This was not possible with the
"files" backend because ref updates were split across multiple loose
files.
- The disk space required to store many refs is reduced, both compared
to loose refs and packed-refs. This is enabled both by the reftable
format being a binary format, which is more compact, and by prefix
compression.
- We can ignore filesystem-specific behaviour as ref names are not
encoded via paths anymore. This means there is no need to handle
case sensitivity on Windows systems or Unicode precomposition on
macOS.
- There is no longer any need to rewrite the complete refdb every time
a ref is deleted, as was the case for packed-refs. This
means that ref deletions are now constant time instead of scaling
linearly with the number of refs.
- We can ignore file/directory conflicts so that it becomes possible
to store both "refs/heads/foo" and "refs/heads/foo/bar".
- Due to this property we can retain reflogs for deleted refs. We have
previously been deleting reflogs together with their refs to avoid
file/directory conflicts, which is not necessary anymore.
- We can properly enumerate all refs. With the "files" backend it is
not easily possible to distinguish between refs and non-refs because
they may live side by side in the gitdir.
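The file/directory conflict mentioned in the list above is a property of
the filesystem rather than of Git itself. The following sketch reproduces
it with plain files in a scratch directory. It does not invoke Git; the
layout merely mimics how the "files" backend stores loose refs, and the
variable names are made up for illustration:

```shell
#!/bin/sh
# Mimic the loose-ref layout of the "files" backend in a scratch directory.
# No Git commands are involved; this only shows the filesystem limitation.
scratch=$(mktemp -d)
mkdir -p "$scratch/refs/heads"

# "refs/heads/foo" is stored as a regular file ...
echo 1111111111111111111111111111111111111111 >"$scratch/refs/heads/foo"

# ... so "refs/heads/foo/bar" would need "foo" to be a directory as well.
if mkdir "$scratch/refs/heads/foo" 2>/dev/null
then
	df_conflict=no
else
	df_conflict=yes
fi
echo "file/directory conflict: $df_conflict"
rm -rf "$scratch"
```

Because the "reftable" format does not encode ref names as paths, it is
not subject to this restriction by design.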
Not all of these improvements are realized with the current "reftable"
backend implementation. At this point, the new backend is supposed to be
a drop-in replacement for the "files" backend that is used by basically
all Git repositories nowadays. It strives for 1:1 compatibility, which
means that, for the most part, a user can expect the same behaviour
regardless of whether they use the "reftable" backend or the "files"
backend.
Most notably, this means we artificially limit the capabilities of the
"reftable" backend to match the limits of the "files" backend. It is not
possible to create refs that would end up with file/directory conflicts,
we do not retain reflogs for deleted refs, and we perform
stricter-than-necessary checks. This is done intentionally for two main
reasons:
- It makes it significantly easier to land the "reftable" backend as
tests behave the same. It would be tough to argue for each and every
single test that doesn't pass with the "reftable" backend.
- It ensures compatibility between repositories that use the "files"
backend and repositories that use the "reftable" backend. Like this,
hosters can migrate their repositories to use the "reftable" backend
without causing issues for clients that use the "files" backend in
their clones.
These artificial limitations are expected to go away in the long term.
Performance-wise things very much depend on the actual workload. The
following benchmarks compare the "files" and "reftable" backends in the
current version:
- Creating N refs in separate transactions shows that the "files"
backend is faster: in the runs below the "reftable" backend takes
roughly 30% to 50% longer. This is not surprising given that creating a
ref only requires us to create a single loose ref. The "reftable"
backend will also perform auto-compaction on updates. In real-world
workloads we would likely also want to pack loose refs,
which would likely change the picture.
Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1)
Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms]
Range (min … max): 1.8 ms … 4.3 ms 133 runs
Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms]
Range (min … max): 2.4 ms … 2.9 ms 132 runs
Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000)
Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s]
Range (min … max): 1.969 s … 1.980 s 3 runs
Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000)
Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s]
Range (min … max): 2.597 s … 2.622 s 3 runs
Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000)
Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s]
Range (min … max): 198.189 s … 198.670 s 3 runs
Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000)
Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s]
Range (min … max): 290.223 s … 298.761 s 3 runs
- Creating N refs in a single transaction shows that the "files"
backend is significantly slower once we start to write many refs.
The "reftable" backend only needs to update two files, whereas the
"files" backend needs to write one file per ref.
Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms]
Range (min … max): 1.8 ms … 2.6 ms 151 runs
Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms]
Range (min … max): 2.4 ms … 3.4 ms 148 runs
Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000)
Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms]
Range (min … max): 148.5 ms … 167.8 ms 15 runs
Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000)
Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms]
Range (min … max): 56.3 ms … 72.9 ms 40 runs
Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000)
Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s]
Range (min … max): 152.165 s … 153.542 s 3 runs
Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s]
Range (min … max): 51.769 s … 52.012 s 3 runs
- Deleting a ref in a fully-packed repository shows that the "files"
backend scales with the number of refs. The "reftable" backend has
constant-time deletions.
Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms]
Range (min … max): 1.6 ms … 2.1 ms 316 runs
Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms]
Range (min … max): 1.7 ms … 2.1 ms 294 runs
Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000)
Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms]
Range (min … max): 1.9 ms … 2.5 ms 287 runs
Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms]
Range (min … max): 1.8 ms … 2.1 ms 217 runs
Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms]
Range (min … max): 224.6 ms … 245.2 ms 6 runs
Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms]
Range (min … max): 2.0 ms … 2.1 ms 3 runs
- Listing all refs shows no significant advantage for either of the
backends. The "files" backend is a bit faster, but not by a
significant margin. When repositories are not packed the "reftable"
backend outperforms the "files" backend because the "reftable"
backend performs auto-compaction.
Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1729 runs
Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.8 ms 1816 runs
Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true)
Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms]
Range (min … max): 4.1 ms … 4.6 ms 645 runs
Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms]
Range (min … max): 4.2 ms … 5.9 ms 643 runs
Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true)
Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s]
Range (min … max): 2.511 s … 2.627 s 10 runs
Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true)
Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s]
Range (min … max): 2.692 s … 2.752 s 10 runs
Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.9 ms 1834 runs
Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.0 ms 1840 runs
Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false)
Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms]
Range (min … max): 13.3 ms … 14.5 ms 208 runs
Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms]
Range (min … max): 4.3 ms … 6.2 ms 624 runs
Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false)
Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s]
Range (min … max): 11.965 s … 12.370 s 10 runs
Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false)
Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s]
Range (min … max): 2.769 s … 2.836 s 10 runs
- Printing a single ref shows no real difference between the "files"
and "reftable" backends.
Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms]
Range (min … max): 1.4 ms … 1.8 ms 1779 runs
Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.5 ms 1753 runs
Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 1.9 ms 1840 runs
Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1831 runs
Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1848 runs
Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1762 runs
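For reference, the two write workloads above differ only in how many
transactions are used. The sketch below is not the benchmark script
behind these numbers. It merely generates input for
`git update-ref --stdin`, which applies all commands it reads in a single
transaction. The ref names, the placeholder object ID and the
`gen_create_commands` helper are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print one "create" command per ref in the format
# understood by `git update-ref --stdin`.
gen_create_commands () {
	count=$1
	oid=$2
	i=1
	while [ "$i" -le "$count" ]
	do
		echo "create refs/heads/branch-$i $oid"
		i=$((i + 1))
	done
}

# Placeholder object ID; in a real repository one would use an existing
# object, for example the output of `git rev-parse HEAD`.
oid=1111111111111111111111111111111111111111
commands=$(gen_create_commands 3 "$oid")
echo "$commands"
```

Piping this output into `git update-ref --stdin` creates all refs in one
transaction, whereas calling `git update-ref refs/heads/branch-$i $oid`
in a loop uses one transaction per ref.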
So overall, performance depends on the use case. Except for many
sequential writes, the "reftable" backend is roughly on par with or
significantly faster than the "files" backend though. Given that the
"files" backend has received 18 years of optimizations by now this can
be seen as a win. Furthermore, we can expect that the "reftable" backend
will become faster over time when attention turns more towards
optimizations.
The complete test suite passes, except for those tests explicitly marked
to require the REFFILES prerequisite. Some tests in t0610 are marked as
failing because they depend on still-in-flight bug fixes. Tests can be
run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT
environment variable to "reftable".
There is a single known conceptual incompatibility with the dumb HTTP
transport. As "info/refs" SHOULD NOT contain the HEAD reference, and
because the "HEAD" file is not valid anymore, it is impossible for the
remote client to figure out the default branch without changing the
protocol. This shortcoming needs to be handled in a subsequent patch
series.
As the reftable library was already introduced a while ago, this
commit message will not go into the details of how exactly the on-disk
format works. Please refer to our preexisting technical documentation at
Documentation/technical/reftable for this.
[1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/
Original-idea-by: Shawn Pearce <spearce@spearce.org>
Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More tests that are supposed to pass the leak sanitizer are marked as such.
* rj/test-with-leak-check:
t0080: mark as leak-free
test-lib: check for TEST_PASSES_SANITIZE_LEAK
t6113: mark as leak-free
t5332: mark as leak-free
Prepare existing tests on refs to work better with non-default
backends.
* ps/tests-with-ref-files-backend:
t: mark tests regarding git-pack-refs(1) to be backend specific
t5526: break test submodule differently
t1419: mark test suite as files-backend specific
t1302: make tests more robust with new extensions
t1301: mark test for `core.sharedRepository` as reffiles specific
t1300: make tests more robust with non-default ref backends
"git diff --no-index file1 file2" segfaulted while invoking the
external diff driver, which has been corrected.
* jk/diff-external-with-no-index:
diff: handle NULL meta-info when spawning external diff
Fetching via protocol v0 over Smart HTTP transport sometimes failed
to correctly auto-follow tags.
* jk/fetch-auto-tag-following-fix:
transport-helper: re-examine object dir after fetching
The labels on conflict markers for the common ancestor, our version,
and the other version are available to custom 3-way merge drivers
via %S, %X, and %Y placeholders.
* ad/custom-merge-placeholder-for-symbolic-pathnames:
merge-ll: expose revision names to custom drivers
Tests on ref API are moved around to prepare for reftable.
* jc/reffiles-tests:
t5312: move reffiles specific tests to t0601
t4202: move reffiles specific tests to t0600
t3903: make drop stash test ref backend agnostic
t1503: move reffiles specific tests to t0600
t1415: move reffiles specific tests to t0601
t1410: move reffiles specific tests to t0600
t1406: move reffiles specific tests to t0600
t1405: move reffiles specific tests to t0601
t1404: move reffiles specific tests to t0600
t1414: convert test to use Git commands instead of writing refs manually
remove REFFILES prerequisite for some tests in t1405 and t2017
t3210: move to t0601
The "disable repository discovery of a bare repository" check,
triggered by setting safe.bareRepository configuration variable to
'explicit', has been loosened to exclude the ".git/" directory inside
a non-bare repository from the check. So you can do "cd .git &&
git cmd" to run a Git command that works on a bare repository without
explicitly specifying $GIT_DIR now.
* kl/allow-working-in-dot-git-in-non-bare-repository:
setup: allow cwd=.git w/ bareRepository=explicit
"git archive --remote=<remote>" learned to talk over the smart
http (aka stateless) transport.
* jx/remote-archive-over-smart-http:
transport-helper: call do_take_over() in process_connect
transport-helper: call do_take_over() in connect_helper
http-backend: new rpc-service for git-upload-archive
transport-helper: protocol v2 supports upload-archive
remote-curl: supports git-upload-archive service
transport-helper: no connection restriction in connect_helper
All conditional "advice" messages show how to turn them off, which
becomes repetitive. Setting advice.* configuration explicitly on
now omits the instruction part.
* rj/advice-disable-how-to-disable:
advice: allow disabling the automatic hint in advise_if_enabled()
"git diff --no-rename A B" did not disable rename detection but did
not trigger an error from the command line parser.
* rs/parse-options-with-keep-unknown-abbrev-fix:
parse-options: simplify positivation handling
parse-options: fully disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
The beginning of the [System Info] section, which should match the
"git version --build-options" output, may not identify our version
as "git version 2.whatever". When built in a repository cloned
without tags, for example, "git version unknown.g00000000" can be a
legit version string.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename detection logic ignored the final line of a file if it is an
incomplete line.
* en/diffcore-delta-final-line-fix:
diffcore-delta: avoid ignoring final 'line' of file
Update to a new feature recently added, "git show-ref --exists".
* tc/show-ref-exists-fix:
builtin/show-ref: treat directory as non-existing in --exists
This test has been leak-free since it was added in e137fe3b29 (unit
tests: add TAP unit test framework, 2023-11-09).
Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
TEST_PASSES_SANITIZE_LEAK must be set before sourcing test-lib.sh, as we
say in t/README:
GIT_TEST_PASSING_SANITIZE_LEAK=true skips those tests that haven't
declared themselves as leak-free by setting
"TEST_PASSES_SANITIZE_LEAK=true" before sourcing "test-lib.sh". This
test mode is used by the "linux-leaks" CI target.
GIT_TEST_PASSING_SANITIZE_LEAK=check checks that our
"TEST_PASSES_SANITIZE_LEAK=true" markings are current. Rather than
skipping those tests that haven't set "TEST_PASSES_SANITIZE_LEAK=true"
before sourcing "test-lib.sh" this mode runs them with
"--invert-exit-code". This is used to check that there's a one-to-one
mapping between "TEST_PASSES_SANITIZE_LEAK=true" and those tests that
pass under "SANITIZE=leak". This is especially useful when testing a
series that fixes various memory leaks with "git rebase -x".
In a recent commit we fixed a test where it was set after sourcing
test-lib.sh, leading to confusing results.
To prevent future oversights, let's add a simple check to ensure the
value for TEST_PASSES_SANITIZE_LEAK remains unchanged at test_done().
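The idea behind such a check can be sketched as follows. This is a
simplified stand-in and not the actual test-lib.sh code: the functions
`mock_lib_sourced` and `mock_test_done` as well as the variable
`value_at_source_time` are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Simplified stand-in for test-lib.sh; all names below are hypothetical.

# Would run when the library is sourced: remember the current value.
mock_lib_sourced () {
	value_at_source_time=$TEST_PASSES_SANITIZE_LEAK
}

# Would run from test_done(): complain if the value changed in between.
mock_test_done () {
	if [ "$value_at_source_time" != "$TEST_PASSES_SANITIZE_LEAK" ]
	then
		echo "error: TEST_PASSES_SANITIZE_LEAK changed after sourcing"
		return 1
	fi
	echo "ok"
}

# Correct order: the annotation comes before "sourcing" the library.
TEST_PASSES_SANITIZE_LEAK=true
mock_lib_sourced
good=$(mock_test_done) || true

# Wrong order: the annotation is set too late and gets flagged.
TEST_PASSES_SANITIZE_LEAK=
mock_lib_sourced
TEST_PASSES_SANITIZE_LEAK=true
bad=$(mock_test_done) || true

echo "$good / $bad"
```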
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has not leaked since a96015a517 (pack-bitmap: plug leak in
find_objects(), 2023-12-14), when the annotation
TEST_PASSES_SANITIZE_LEAK=true was also added.
Unfortunately it was added after test-lib.sh is sourced, which makes
GIT_TEST_PASSING_SANITIZE_LEAK=check fail:
$ make SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check test T=t6113-rev-list-bitmap-filters.sh
...
make[2]: Entering directory '/tmp/git/git/t'
*** t6113-rev-list-bitmap-filters.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - set up bitmapped repo
ok 2 - filters fallback to non-bitmap traversal
ok 3 - blob:none filter
ok 4 - blob:none filter with specified blob
ok 5 - blob:limit filter
ok 6 - blob:limit filter with specified blob
ok 7 - tree:0 filter
ok 8 - tree:0 filter with specified blob, tree
ok 9 - tree:1 filter
ok 10 - object:type filter
ok 11 - object:type filter with --filter-provided-objects
ok 12 - combine filter
ok 13 - combine filter with --filter-provided-objects
ok 14 - bitmap traversal with --unpacked
# passed all 14 test(s)
1..14
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:68: t6113-rev-list-bitmap-filters.sh] Error 1
make[2]: Leaving directory '/tmp/git/git/t'
make[1]: *** [Makefile:55: test] Error 2
make[1]: Leaving directory '/tmp/git/git/t'
make: *** [Makefile:3212: test] Error 2
Let's move the annotation before sourcing test-lib.sh, to make
GIT_TEST_PASSING_SANITIZE_LEAK=check happy.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has been leak-free since it was added in af626ac0e0 (pack-bitmap:
enable reuse from all bitmapped packs, 2023-12-14).
Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both t1409 and t3210 exercise parts of git-pack-refs(1). Given that we
must check the on-disk files to verify whether the backend has indeed
packed refs as expected those test suites are deeply tied to the actual
backend that is in use.
Mark the test suites to depend on the REFFILES backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 10f5c52656 (submodule: avoid auto-discovery in
prepare_submodule_repo_env(), 2016-09-01) we fixed a bug when doing a
recursive fetch with submodules in the case where a submodule is broken
for whatever reason. The test to exercise that the fix works breaks
the submodule by deleting its `HEAD` reference, which will cause us to
not detect the directory as a Git repository.
While this is perfectly fine in theory, this way of breaking the repo
becomes problematic with the current efforts to introduce another refdb
backend into Git. The new reftable backend has a stub HEAD file that
always contains "ref: refs/heads/.invalid" so that tools continue to be
able to detect such a repository. But as the reftable backend will never
delete this file even when asked to delete `HEAD` the current way to
delete the `HEAD` reference will stop working.
Adapt the code to instead delete the objects database. Going back with
this new way to cause breakage confirms that it triggers the infinite
recursion just the same, and there are no equivalent ongoing efforts to
replace the object database with an alternate backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 59c35fac54 (refs/packed-backend.c: implement jump lists to avoid
excluded pattern(s), 2023-07-10) we have implemented logic to handle
excluded refs more efficiently in the "packed" ref backend. This logic
allows us to skip emitting refs completely which we know to not be of
any interest to the caller, which can avoid quite some allocations and
object lookups.
This was wired up via a new `exclude_patterns` parameter passed to the
backend's ref iterator. The backend only needs to handle them on a best
effort basis though, and in fact we only handle it for the "packed-refs"
file, but not for loose references. Consequently, all callers must still
filter emitted refs with those exclude patterns.
The result is that handling exclude patterns is completely optional in
the ref backend, and any future backends may or may not implement it.
Let's thus mark the test for t1419 to depend on the REFFILES prereq.
An alternative would be to introduce a new prereq that tells us whether
the backend under test supports exclude patterns or not. But this does
feel a bit overblown:
- It would either map to the REFFILES prereq, in which case it feels
overengineered because the prereq is only ever relevant to t1419.
- Otherwise, it could auto-detect whether the backend supports exclude
patterns. But this could lead to silent failures in case the support
for this feature breaks at any point in time.
It should thus be good enough to just use the REFFILES prereq for now.
If future backends ever grow support for exclude patterns we can easily
add their respective prereq as another condition for this test suite to
execute.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1302 we exercise logic around "core.repositoryFormatVersion" and
extensions. These tests are not particularly robust against extensions
like the newly introduced "refStorage" extension as we tend to clobber
the repository's config file. We thus overwrite any extensions that were
set, which may render the repository inaccessible in case it has to be
accessed with a non-default ref storage.
Refactor the tests to be more robust:
- Check the DEFAULT_REPO_FORMAT prereq to determine the expected
repository format version. This helps to ensure that we only need to
update the prereq in a central place when new extensions are added.
Furthermore, this allows us to stop seeding the now-unneeded object
ID cache that was only used to figure out the repository version.
- Use a separate repository to rewrite ".git/config" to test
combinations of the repository format version and extensions. This
ensures that we don't break the main test repository. While we could
rewrite these tests to not overwrite preexisting extensions, it
feels cleaner like this so that we can test extensions standalone
without interference from the environment.
- Do not rewrite ".git/config" when exercising the "preciousObjects"
extension.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1301 we verify whether reflog files written by the "files" ref
backend correctly honor permissions when "core.sharedRepository" is set.
The test logic is thus specific to the reffiles backend and will not
work with any other backends.
Mark the test accordingly with the REFFILES prereq.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t1300 test suite exercises the git-config(1) tool. To do so, the
test overwrites ".git/config" to contain custom contents in several
places with code like the following:
```
cat > .git/config <<\EOF
...
EOF
```
While this is easy enough to do, it may create problems when using a
non-default repository format because this causes us to overwrite the
repository format version as well as any potential extensions. With the
upcoming "reftable" ref backend the result is that Git would try to
access refs via the "files" backend even though the repository has been
initialized with the "reftable" backend, which will cause failures when
trying to access any refs.
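The effect can be illustrated without Git at all. In the following
sketch a plain file named "config" in a scratch directory stands in for
".git/config"; the section and key written by the second here-document
are made-up placeholders:

```shell
#!/bin/sh
# Plain-file illustration; "config" stands in for ".git/config".
scratch=$(mktemp -d)

# Roughly what a repository using the "reftable" backend records.
cat >"$scratch/config" <<\EOF
[core]
	repositoryformatversion = 1
[extensions]
	refStorage = reftable
EOF

# A test that rewrites the file wholesale silently drops the extension.
cat >"$scratch/config" <<\EOF
[section]
	key = value
EOF

if grep -q refStorage "$scratch/config"
then
	extension_kept=yes
else
	extension_kept=no
fi
echo "extension kept: $extension_kept"
rm -rf "$scratch"
```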
Ideally, we would rewrite the whole test suite to not depend on state
written by previous tests, but that would result in a lot of changes in
this test suite. Instead, we only refactor tests which access the refdb
to be more robust by using their own separate repositories, which allows
us to be more careful and not discard required extensions.
Note that we also have to touch up how the CUSTOM_CONFIG_FILE gets
accessed. This environment variable contains the relative path to a
custom config file which we're setting up. But because we are now using
subrepositories, this relative path will not be found anymore because
our working directory changes. This issue is addressed by storing the
absolute path to the file in CUSTOM_CONFIG_FILE instead.
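The underlying problem is generic shell behaviour and can be sketched
in isolation. The file and directory names below are placeholders and
not the ones used by t1300:

```shell
#!/bin/sh
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo "[section]" >custom.conf
mkdir subrepo

# Relative path: only valid from the directory it was created in.
relative=custom.conf
# Absolute path: stays valid after changing directories.
absolute=$(pwd)/custom.conf

cd subrepo || exit 1
if test -f "$relative"
then
	relative_found=yes
else
	relative_found=no
fi
if test -f "$absolute"
then
	absolute_found=yes
else
	absolute_found=no
fi
echo "relative: $relative_found, absolute: $absolute_found"

cd / && rm -rf "$scratch"
```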
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running this:
$ touch foo bar
$ chmod +x foo
$ git -c diff.external=echo diff --ext-diff --no-index foo bar
results in a segfault. The issue is that run_diff_cmd() passes a NULL
"xfrm_msg" variable to run_external_diff(), which feeds it to
strvec_push(), causing the segfault. The bug dates back to 82fbf269b9
(run_external_diff: use an argv_array for the command line, 2014-04-19),
though it mostly only ever worked accidentally. Before then, we just
stuck the NULL pointer into a "const char **" array, so our NULL ended
up acting as an extra end-of-argv sentinel (which was OK, because it was
the last thing in the array).
Curiously, though, this is only a problem with --no-index. We set up
xfrm_msg by calling fill_metainfo(). This result may be empty, or may
have text like "index 1234..5678\n", "rename from foo\nrename to
bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the
"other" variable is not NULL. That variable is set when the paths of the
two sides of the diff pair aren't the same (in which case the
destination path becomes "other"). So normally it would kick in only for
a rename, in which case xfrm_msg should not be NULL (it would have the
rename information in it).
But with a "--no-index" of two blobs, we of course have two different
pathnames, and thus end up with a non-NULL "other" filename (which is
always just a repeat of the file2-name), but possibly a NULL xfrm_msg.
So how to fix it? I have a feeling that --no-index always passing
"other" to the external diff command is probably a bug. There was no
rename, and the name is always redundant with existing information we
pass (and this may even cause us to pass a useless "xfrm_msg" that
contains an "index 1234..5678" line). So one option would be to change
that behavior. We don't seem to have ever documented the "other" or
"xfrm_msg" parameters for external diffs.
But I'm not sure what fallout we might have from changing that behavior
now. So this patch takes the less-risky option, and simply teaches
run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
fixes the segfault now, and if we want to change the --no-index "other"
behavior on top, it will handle that, too.
Reported-by: Wilfred Hughes <me@wilfred.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tighten URL checks fsck makes in a URL recorded for submodules.
* vd/fsck-submodule-url-test:
submodule-config.c: strengthen URL fsck check
t7450: test submodule urls
test-submodule: remove command line handling for check-name
submodule-config.h: move check_submodule_url
When $HOME/.gitconfig is missing but the XDG config file is available,
we should write into the latter, not the former. "git gc" and "git
maintenance" wrote into a wrong "global config" file, which has
been corrected.
* kh/maintenance-use-xdg-when-it-should:
maintenance: use XDG config if it exists
config: factor out global config file retrieval
config: rename global config function
config: format newlines
A few tests for "git commit -o <pathspec>" and "git commit -i
<pathspec>" have been added.
* gt/test-commit-o-i-options:
t7501: add tests for --amend --signoff
t7501: add tests for --include and --only
CI for GitLab learned to drive macOS jobs.
* ps/gitlab-ci-macos:
ci: add macOS jobs to GitLab CI
ci: make p4 setup on macOS more robust
ci: handle TEST_OUTPUT_DIRECTORY when printing test failures
Makefile: detect new Homebrew location for ARM-based Macs
t7527: decrease likelihood of racing with fsmonitor daemon
Completion update to prepare for reftable.
* ps/completion-with-reftable-fix:
completion: treat dangling symrefs as existing pseudorefs
completion: silence pseudoref existence check
completion: improve existence check for pseudo-refs
t9902: verify that completion does not print anything
completion: discover repo path in `__git_pseudoref_exists ()`
Tweak a few tests not to manually modify the reference database
(hence easier to work with other backends like reftable).
* jt/tests-with-reftable:
t5541: remove lockfile creation
t1401: remove lockfile creation
Custom merge drivers need access to the names of the revisions they
are working on, so that the merge conflict markers they introduce
can refer to those revisions. The placeholders '%S', '%X' and '%Y'
are introduced to this end.
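As an illustration of how a driver might use these names, here is a
minimal sketch of a whole-file merge driver in Python. It is not part of
this patch. It assumes, following the gitattributes documentation, that
'%S', '%X' and '%Y' expand to the labels of the common ancestor, the
local side and the other side, and that the driver is configured with a
hypothetical line such as
"driver = python3 sketch_driver.py %O %A %B %S %X %Y".

```python
from pathlib import Path
from typing import List, Tuple


def merge_whole_file(base: str, ours: str, theirs: str,
                     ours_label: str, theirs_label: str) -> Tuple[str, bool]:
    """Toy merge: returns (result text, True if the merge was clean)."""
    if ours == theirs or theirs == base:
        return ours, True       # only our side changed, or both agree
    if ours == base:
        return theirs, True     # only their side changed
    # Both sides changed differently: emit labelled conflict markers.
    text = (f"<<<<<<< {ours_label}\n{ours}"
            f"=======\n{theirs}"
            f">>>>>>> {theirs_label}\n")
    return text, False


def main(argv: List[str]) -> int:
    # Argument order matches the hypothetical driver line above.
    base_path, ours_path, theirs_path, _base_label, ours_label, theirs_label = argv[1:7]
    result, clean = merge_whole_file(Path(base_path).read_text(),
                                     Path(ours_path).read_text(),
                                     Path(theirs_path).read_text(),
                                     ours_label, theirs_label)
    Path(ours_path).write_text(result)  # the result goes into the %A file
    return 0 if clean else 1            # non-zero status signals a conflict
```

A real driver would pass the return value of main(sys.argv) to
sys.exit(); that call is left out so the sketch can be imported safely.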
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch fixes a bug where fetch over http (or any helper) using the
v0 protocol may sometimes fail to auto-follow tags. The bug comes from
61c7711cfe (sha1-file: use loose object cache for quick existence check,
2018-11-12). But to explain why (and why this is the right fix), let's
take a step back.
After fetching a pack, the object database has changed, but we may still
hold in-memory caches that are now out of date. Traditionally this was
just the packed_git list, but 61c7711cfe started using a loose-object
cache, as well.
Usually these caches are invalidated automatically. When an expected
object cannot be found, the low-level object lookup routines call
reprepare_packed_git(), which re-scans the set of packs (and thanks to
some preparatory patches ahead of 61c7711cfe, throws away the loose
object cache). But not all calls do this! In some cases we expect that
the object might not exist, and pass OBJECT_INFO_QUICK to tell the
low-level routines not to bother re-scanning. And the tag auto-following
code is one such caller, since we are asking about oids that the other
side has (but we might not have locally).
To deal with this, we explicitly call reprepare_packed_git() ourselves
after fetching a pack; this goes all the way back to 48ec3e5c07
(Incorporate fetched packs in future object traversal, 2008-06-15). But
that only helps if we call fetch_pack() in the main fetch process. When
we're using a transport helper, it happens in a separate sub-process,
and the parent process is left with old values. So this is only a
problem with protocols which require a separate helper process (like
http).
This patch fixes it by teaching the parent process in the transport
helper relationship to make that same reprepare call after the helper
finishes fetching.
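The stale-cache problem and the fix can be modelled with a toy object
store. This is a simplified Python sketch, not Git's C code: the class
and function names are made up for illustration, and the single cache
stands in for both the packed_git list and the loose-object cache.

```python
class ObjectStore:
    """Toy model of an object database with an in-memory existence cache."""

    def __init__(self):
        self.on_disk = set()  # what is really in the object database
        self.cache = None     # lazily filled snapshot of on_disk

    def reprepare(self):
        # Stand-in for reprepare_packed_git(): drop the stale snapshot.
        self.cache = None

    def has_object(self, oid, quick=False):
        if self.cache is None:
            self.cache = set(self.on_disk)
        if oid in self.cache:
            return True
        if quick:
            # Like OBJECT_INFO_QUICK: a miss is expected, do not re-scan.
            return False
        self.reprepare()
        self.cache = set(self.on_disk)
        return oid in self.cache


def helper_fetch(store, oids):
    # Stand-in for the transport helper sub-process: objects land on disk,
    # but the parent's in-memory cache is left untouched.
    store.on_disk.update(oids)


store = ObjectStore()
store.has_object("tag-object", quick=True)              # fills the cache
helper_fetch(store, {"tag-object"})
stale = store.has_object("tag-object", quick=True)      # still a miss
store.reprepare()                                       # the fix, in the parent
fresh = store.has_object("tag-object", quick=True)      # now found
```

In this model "stale" is False and "fresh" is True: the only difference
between the two lookups is the explicit reprepare call made after the
helper has finished.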
You might be left with some lingering questions, like:
1. Why only the v0 protocol, and not v2? It's because in v2 the child
helper doesn't actually run fetch_pack(); it merely establishes a
tunnel over which the main process can talk to the remote side (so
the fetch_pack() and reprepare happen in the main process).
2. Wouldn't we have the same bug even before 61c7711cfe added
the loose object cache? For example, when we store the fetch as a
pack locally, wouldn't our packed_git list still be out of date?
If we store a pack, everything works because other parts of the
fetch process happen to trigger a call to reprepare_packed_git().
In particular, before storing whatever ref was originally
requested, we'll make sure we have the pointed-to object, and that
call happens without the QUICK flag. So in that case we'll see that
we don't know about it, reprepare, and then repeat our lookup. And
now we _do_ know about the pack, and further calls with QUICK will
find its contents.
Whereas when we unpack the result into loose objects, we never get
that same invalidation trigger. We didn't have packs before, and we
don't after. But when we do the loose object lookup, we find the
object. There's no way to realize that we didn't have the object
before the pack, and that having it now means things have changed
(in theory we could do a superfluous cache lookup to see that it
was missing from the old cache; but depending on the tags the other
side showed us, we might not even have filled in that part of the
cache earlier).
3. Why does the included test use "--depth 1"? This is important
because without it, we happen to invalidate the cache as a side
effect of other parts of the fetch process. What happens in a
non-shallow fetch is something like this:
1. we call find_non_local_tags() once before actually getting the
pack, to see if there are any tags we can fill in from what we
already have. This fills in the cache (which is obviously
missing objects we're about to fetch).
2. before fetching the actual pack, fetch_and_consume_refs()
calls check_exist_and_connected(), to see if we even need to
fetch a pack at all. This doesn't use QUICK (though arguably
it could, as it's purely an optimization). And since it sees
there are objects we are indeed missing, that triggers a
reprepare_packed_git() call, which throws out the loose object
cache.
3. after fetching, now we call find_non_local_tags() again. And
since step (2) invalidated our loose object cache, we find
the new objects and create the tags.
So everything works, but mostly due to luck. Whereas in a fetch
with --depth, we skip step 2 entirely, and thus the out-of-date
cache is still in place for step 3, giving us the wrong answer.
So the test works with a small "--depth 1" fetch, which makes sure that
we don't store the pack from the other side, and that we don't trigger
the accidental cache invalidation. And of course it forces the use of
v0 along with using the http protocol.
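The difference between the shallow and non-shallow sequences, before
this fix, can be traced with a small simulation. This is a Python toy
model under simplifying assumptions, not the real code path: the
function name and object names are hypothetical, the cache is a single
set rather than Git's per-directory loose-object cache, and a non-QUICK
miss is modelled as dropping the cache and checking the disk directly.

```python
def tag_seen_after_fetch(shallow: bool) -> bool:
    """Return True if the tag object is found in step 3 (toy model)."""
    on_disk = set()  # loose objects actually present
    cache = None     # loose-object cache; None means "not filled yet"

    def lookup(oid, quick):
        nonlocal cache
        if cache is None:
            cache = set(on_disk)  # lazily fill the cache from disk
        if oid in cache:
            return True
        if quick:
            return False          # QUICK: trust the possibly stale cache
        cache = None              # miss without QUICK: throw the cache away
        return oid in on_disk     # and retry against the real state

    # Step 1: look for tags we can already fill in (QUICK lookup).
    lookup("tag-object", quick=True)
    # Step 2: connectivity check without QUICK; skipped with --depth.
    if not shallow:
        lookup("wanted-commit", quick=False)
    # The helper sub-process unpacks the fetched objects as loose objects.
    on_disk.update({"wanted-commit", "tag-object"})
    # Step 3: look for non-local tags again (QUICK lookup).
    return lookup("tag-object", quick=True)
```

In the model, tag_seen_after_fetch(False) returns True only because
step 2 happened to discard the cache, while tag_seen_after_fetch(True)
returns False because the cache filled in step 1 is still in place.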
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move a few tests into t0601 since they specifically test the packed-refs
file and thus are specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move two tests into t0600 since they write loose reflog refs manually
and thus are specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, the calls to cut(1) are only used to verify that the
contents of the reflog entry look as expected. By replacing these with
git-reflog(1) calls, we can make this test ref-backend agnostic.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0600 with other reffiles specific tests since it
checks for loose refs and is specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test into t0601 with other reffiles pack-refs specific tests
since it checks for individual loose refs and thus is specific to the
reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move these tests to t0600 with other reffiles specific tests since they
do things like take a lock on an individual ref, and write directly into
the reflog refs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0600 with the rest of the tests that are specific to
reffiles. This test reaches into reflog directories manually, and so is
specific to reffiles.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0601 with other reffiles specific pack-refs tests
since it is reffiles specific in that it looks into the loose refs
directory for an assertion.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests modify loose refs manually and are specific to the reffiles
backend. Move these to t0600 to be part of a test suite of reffiles
specific tests.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test can be re-written to use Git commands rather than writing a
manual ref in the reflog. This way this test no longer needs the
REFFILES prerequisite.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests are compatible with the reftable backend and thus do not
need the REFFILES prerequisite. Even though 53af25e4
(t1405: mark test that checks existence as REFFILES, 2022-01-31) marked
these tests to require REFFILES, the reftable backend in its current
state does indeed work with these tests.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move t3210 to t0601, since these tests are reffiles specific in that
they modify loose refs manually. This is part of the effort to
categorize these tests together based on the ref backend they test. When
we upstream the reftable backend, we can add more tests to t06xx. This
way, all tests that test specific ref backend behavior will be grouped
together.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>