admin/git

Author	SHA1	Message	Date
Patrick Steinhardt	de404249ab	t5333: fix missing terminator for sed(1) 's' command In `6aec8d38fd` (t: refactor tests depending on Perl to print data, 2025-04-03) we have changed some of the tests in t4150 to use sed(1) instead of Perl. One of the conversions is broken though: sed: -e expression #1, char 41: unterminated `s' command Curiously enough, the test itself still passes. This is caused by a sequence of failures: 1. The output of sed(1) is piped into git-update-ref(1), and because sed(1) is the upstream command we don't notice that it fails. 2. git-update-ref(1) does not receive any input and thus won't create any references. 3. We then repack the repository with the configured pseudo merges pattern, but as we didn't create any references the pattern doesn't match anything. 4. We use `test_pseudo_merges()` to compute the list of pseudo-merges and write it into a file. This file is empty as there are none. 5. The loop over the pseudo-merges becomes a no-op. 6. The final test succeeds as well because the number of lines in an empty file is obviously the same as the number of unique lines, namely zero. Fix the issue by adding the terminating '\|' to the sed(1) command. Furthermore, make the test a tiny bit more robust by not using it as part of a pipe. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-07 09:12:44 -07:00
Patrick Steinhardt	385e175cb5	t4150: fix warning printed by awk due to escaped '\@' In `6aec8d38fd` (t: refactor tests depending on Perl to print data, 2025-04-03) we have changed one of the tests in t4150 to use awk(1) instead of Perl. The test works, but at least gawk(1) prints a warning now: awk: cmd. line:3: warning: escape sequence `\@' treated as plain `@' Fix this by removing the backslash. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-07-07 09:12:43 -07:00
Junio C Hamano	427b538fc3	Merge branch 'mm/test-in-absolute-home' Tests that compare $HOME and $(pwd), which should be the same directory unless the tests chdir's around, would fail when the user enters the test directory via symbolic links, which has been corrected. * mm/test-in-absolute-home: t: run tests from a normalized working directory	2025-06-09 07:15:51 -07:00
Junio C Hamano	14de3eb344	Merge branch 'js/t5410-tee-hang-workaround' * js/t5410-tee-hang-workaround: t5410: avoid hangs in CI runs in the win+Meson test jobs	2025-06-05 11:56:29 -07:00
Johannes Schindelin	52a86dd26d	t5410: avoid hangs in CI runs in the win+Meson test jobs In the GitHub workflow used in Git's CI builds, the `vs test` jobs use a subset of a specific revision of Git for Windows' SDK to run Git's test suite. This revision is validated by another CI workflow to ensure that said revision _can_ run Git's test suite successfully, skipping buggy updates in Git for Windows' SDK. The `win+Meson test` jobs do things differently, quite differently. They use the Bash of the Git for Windows version that is installed on the runners to run Git's test suite. This difference has consequences. When `68cb0b5253` (builtin/receive-pack: add option to skip connectivity check, 2025-05-20) introduced a test case that uses `tee <file> \| git receive-pack` as `--receive-pack` parameter (imitating an existing pattern in the same test script), it hit just the sweet spot to trigger a bug in the MSYS2 runtime shipped in Git for Windows v2.49.0. This version is the one currently installed on GitHub's runners. The problem is that the `git receive-pack` process finishes while the `tee` process does not need to write anything anymore and therefore does not receive an EOF. Instead, it should receive a SIGPIPE, but the bug in the MSYS2 runtime prevents that from working as intended. As a consequence, the `tee` process waits for more input from the `git.exe send-pack` process but none is coming, and the test script patiently waits until the 6h timeout hits. Only every once in a while, the `git receive-pack` process manages to send an EOF to the `tee` process and no hang occurs. Therefore, the problem can be worked around by cancelling the clearly-hanging job after twenty or so minutes and re-running it, repeating the process about half a dozen times, until the hang was successfully avoided. This bug in the MSYS2 runtime has been fixed in the meantime, which is the reason why the same test case causes no problems in the `win test` and the `vs test` jobs. This will continue to be the case until the Git for Windows version on the GitHub runners is upgraded to a version that distributes a newer MSYS2 runtime version. However, as of time of writing, this _is_ the latest Git for Windows version, and will be for another 1.5 weeks, until Git v2.50.0 is scheduled to appear (and shortly thereafter Git for Windows v2.50.0). Traditionally it takes a while before the runners pick up the new version. We could just wait it out, six hours at a time. Here, I opt for an alternative: Detect the buggy MSYS2 runtime and simply skip the test case. It's not like the `receive-pack` test cases are specific to Windows, and even then, to my chagrin the CI runs in git-for-windows/git spend around ten hours of compute time each and every time to run the entire test suite on all the platforms, even the tests that cover cross-platform code, and for Windows alone we do that three times: with GCC, with MSVC, and with MSVC via Meson. Therefore, I deem it more than acceptable to skip this test case in one of those matrices. For good luck, also the preceding test case is skipped in that scenario, as it uses the same `--receive-pack=tee <file> \| git receive-pack` pattern, even though I never observed that test case to hang in practice. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-06-05 09:45:42 -07:00
Junio C Hamano	c38b74f286	Merge branch 'sj/ref-contents-check-fix' "git verify-refs" (and hence "git fsck --reference") started erroring out in a repository in which secondary worktrees were prepared with Git 2.43 or lower. * sj/ref-contents-check-fix: fsck: ignore missing "refs" directory for linked worktrees	2025-06-03 08:55:23 -07:00
shejialuo	d5b3c38b8a	fsck: ignore missing "refs" directory for linked worktrees "git refs verify" doesn't work if there are worktrees created on Git v2.43.0 or older versions. These versions don't automatically create the "refs" directory, causing the error: error: cannot open directory .git/worktrees/<worktree name>/refs: No such file or directory Since `8f4c00de95` (builtin/worktree: create refdb via ref backend, 2024-01-08), we automatically create the "refs" directory for new worktrees. And in `7c78d819e6` (ref: support multiple worktrees check for refs, 2024-11-20), we assume that all linked worktrees have this directory and would wrongly report an error to the user, thus introducing compatibility issue. Check for ENOENT errno before reporting directory access errors for linked worktrees to maintain backward compatibility. Reported-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-06-02 11:20:19 -07:00
Junio C Hamano	bbe8a3723b	Merge branch 'jc/signed-fast-export-is-experimental' Mark a new feature added during this cycle as experimental and fix its default so that existing users of the fast-export command is not broken. * jc/signed-fast-export-is-experimental: fast-export: --signed-commits is experimental	2025-06-02 09:25:34 -07:00
Mark Mentovai	2d207ed1ec	t: run tests from a normalized working directory Some tests make git perform actions that produce observable pathnames, and have expectations on those paths. Tests run with $HOME set to a $TRASH_DIRECTORY, and with their working directory the same $TRASH_DIRECTORY, although these paths are logically identical, they do not observe the same pathname canonicalization rules and thus might not be represented by strings that compare equal. In particular, no pathname normalization is applied to $TRASH_DIRECTORY or $HOME, while tests change their working directory with `cd -P`, which normalizes the working directory's path by fully resolving symbolic links. t7900's macOS maintenance tests (which are not limited to running on macOS) have an expectation on a path that `git maintenance` forms by using abspath.c strbuf_realpath() to resolve a canonical absolute path based on $HOME. When t7900 runs from a working directory that contains symbolic links in its pathname, $HOME will also contain symbolic links, which `git maintenance` resolves but the test's expectation does not, causing a test failure. Align $TRASH_DIRECTORY and $HOME with the normalized path as used for the working directory by resetting them to match the working directory after it's established by `cd -P`. With all paths in agreement and symbolic links resolved, pathname expectations can be set and met based on string comparison without regard to external environmental factors such as the presence of symbolic links in a path. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Mark Mentovai <mark@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-30 14:55:03 -07:00
Junio C Hamano	5d2812ff3c	Merge branch 'mm/apply-reverse-mode-of-deleted-path' "git apply --index/--cached" when applying a deletion patch in reverse failed to give the mode bits of the path "removed" by the patch to the file it creates, which has been corrected. * mm/apply-reverse-mode-of-deleted-path: apply: set file mode when --reverse creates a deleted file t4129: test that git apply warns for unexpected mode changes	2025-05-30 11:59:17 -07:00
Junio C Hamano	0b4c6baa70	fast-export: --signed-commits is experimental As the design of signature handling is still being discussed, it is likely that the data stream produced by the code in Git 2.50 would have to be changed in such a way that is not backward compatible. Mark the feature as experimental and discourge its use for now. Also flip the default on the generation side to "strip"; users of existing versions would not have passed --signed-commits=strip and will be broken by this change if the default is made to abort, and will be encouraged by the error message to produce data stream with future breakage guarantees by passing --signed-commits option. As we tone down the default behaviour, we no longer need the FAST_EXPORT_SIGNED_COMMITS_NOABORT environment variable, which was not discoverable enough. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-28 10:30:47 -07:00
Junio C Hamano	b4847a4477	Merge branch 'jt/receive-pack-skip-connectivity-check' "git receive-pack" optionally learns not to care about connectivity check, which can be useful when the repository arranges to ensure connectivity by some other means. * jt/receive-pack-skip-connectivity-check: builtin/receive-pack: add option to skip connectivity check t5410: test receive-pack connectivity check	2025-05-28 07:59:56 -07:00
Junio C Hamano	b5afd0a7ee	Merge branch 'kn/passing-leak-tests' Remove the leftover hints to the test framework to mark tests that do not pass the leak checker tests, as they should no longer be needed. * kn/passing-leak-tests: t: remove unexpected SANITIZE_LEAK variables	2025-05-28 07:59:56 -07:00
Junio C Hamano	80f49f2ae7	Merge branch 'en/sequencer-comment-messages' Prefix '#' to the commit title in the "rebase -i" todo file, just like a merge commit being replayed. * en/sequencer-comment-messages: sequencer: make it clearer that commit descriptions are just comments	2025-05-27 13:59:11 -07:00
Junio C Hamano	d8b48af391	Merge branch 'sj/use-mmap-to-check-packed-refs' The code path to access the "packed-refs" file while "fsck" is taught to mmap the file, instead of reading the whole file in the memory. * sj/use-mmap-to-check-packed-refs: packed-backend: mmap large "packed-refs" file during fsck packed-backend: extract snapshot allocation in `load_contents` packed-backend: fsck should warn when "packed-refs" file is empty	2025-05-27 13:59:10 -07:00
Junio C Hamano	6e5fb398d3	Merge branch 'ds/sparse-apply-add-p' "git apply" and "git add -i/-p" code paths no longer unnecessarily expand sparse-index while working. * ds/sparse-apply-add-p: p2000: add performance test for patch-mode commands reset: integrate sparse index with --patch git add: make -p/-i aware of sparse index apply: integrate with the sparse index	2025-05-27 13:59:09 -07:00
Junio C Hamano	f545f401be	Merge branch 'en/merge-tree-check' "git merge-tree" learned an option to see if it resolves cleanly without actually creating a result. * en/merge-tree-check: merge-tree: add a new --quiet flag merge-ort: add a new mergeability_only option	2025-05-27 13:59:08 -07:00
Junio C Hamano	17d9dbd3c2	Merge branch 'jk/no-funny-object-types' Support to create a loose object file with unknown object type has been dropped. * jk/no-funny-object-types: object-file: drop support for writing objects with unknown types hash-object: handle --literally with OPT_NEGBIT hash-object: merge HASH_* and INDEX_* flags hash-object: stop allowing unknown types t: add lib-loose.sh t/helper: add zlib test-tool oid_object_info(): drop type_name strbuf fsck: stop using object_info->type_name strbuf oid_object_info_convert(): stop using string for object type cat-file: use type enum instead of buffer for -t option object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag cat-file: make --allow-unknown-type a noop object-file.h: fix typo in variable declaration	2025-05-27 13:59:08 -07:00
Junio C Hamano	dcb89740a0	Merge branch 'md/userdiff-bash-shell-function' The userdiff pattern for shell scripts has been updated to cope with more bash-isms. * md/userdiff-bash-shell-function: userdiff: extend Bash pattern to cover more shell function forms	2025-05-27 13:59:06 -07:00
Mark Mentovai	1d9a66493b	apply: set file mode when --reverse creates a deleted file Commit `01aff0a` (apply: correctly reverse patch's pre- and post-image mode bits, 2023-12-26) revised reverse_patches() to maintain the desired property that when only one of patch::old_mode and patch::new_mode is set, the mode will be carried in old_mode. That property is generally correct, with one notable exception: when creating a file, only new_mode will be set. Since reversing a deletion results in a creation, new_mode must be set in that case. Omitting handling for this case means that reversing a patch that removes an executable file will not result in the executable permission being set on the re-created file. Existing test coverage for file modes focuses only on mode changes of existing files. Swap old_mode and new_mode in reverse_patches() for what's represented in the patch as a file deletion, as it is transformed into a file creation under reversal. This causes git apply --reverse to set the executable permission properly when re-creating a deleted executable file. Add tests ensuring that git apply sets file modes correctly on file creation, both in the forward and reverse directions. Signed-off-by: Mark Mentovai <mark@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-27 06:48:07 -07:00
Mark Mentovai	2cc8c17d67	t4129: test that git apply warns for unexpected mode changes There is no test covering what commit `01aff0a` (apply: correctly reverse patch's pre- and post-image mode bits, 2023-12-26) addressed. Prior to that commit, git apply was erroneously unaware of a file's expected mode while reverse-patching a file whose mode was not changing. Add the missing test coverage to assure that git apply is aware of the expected mode of a file being patched when the patch does not indicate that the file's mode is changing. This is achieved by arranging a file mode so that it doesn't agree with patch being applied, and checking git apply's output for the warning it's supposed to raise in this situation. Test in both reverse and normal (forward) directions. Signed-off-by: Mark Mentovai <mark@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-27 06:48:07 -07:00
Junio C Hamano	95c79efb8d	Merge branch 'ds/scalar-no-maintenance' Two "scalar" subcommands that adds a repository that hasn't been under "scalar"'s control are taught an option not to enable the scheduled maintenance on it. * ds/scalar-no-maintenance: scalar reconfigure: improve --maintenance docs scalar reconfigure: add --maintenance=<mode> option scalar clone: add --no-maintenance option scalar register: add --no-maintenance option scalar: customize register_dir()'s behavior	2025-05-23 15:34:07 -07:00
Karthik Nayak	368d8c86f7	t: remove unexpected SANITIZE_LEAK variables As of `1fc7ddf35b` (test-lib: unconditionally enable leak checking, 2024-11-20), both the `GIT_TEST_PASSING_SANITIZE_LEAK` and `TEST_PASSES_SANITIZE_LEAK` variables no longer have any meaning, the leak checks are enabled by default. However, some newly added tests include them by mistake. Let's clean this up. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-20 15:09:33 -07:00
Justin Tobler	68cb0b5253	builtin/receive-pack: add option to skip connectivity check During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and walking the objects, which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity computation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of newly written objects can be checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. These boundary objects can be checked in the canonical repository to ensure the new objects connect as expected and thus avoid walking the rest of the object graph. Git itself cannot make the guarantees required for such an optimization as it is possible for a repository to contain an unreachable object that references a missing object without the repository being considered corrupt. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to the server-side. Note that without proper server-side validation of newly received objects handled outside of Git, usage of this option risks corrupting a repository. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-20 11:43:36 -07:00
Justin Tobler	95262afe78	t5410: test receive-pack connectivity check As part of git-recieve-pack(1), the connectivity of objects is checked. Add a test validating that git-receive-pack(1) fails due to an incoming packfile that would leave the repository with missing objects. Instead of creating a new test file, "t5410" is generalized for receive-pack testing. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-20 11:43:36 -07:00
Junio C Hamano	90eedabbf7	Merge branch 'ps/reftable-read-block-perffix' Performance regression in not-yet-released code has been corrected. * ps/reftable-read-block-perffix: reftable: fix perf regression when reading blocks of unwanted type	2025-05-19 16:02:48 -07:00
Junio C Hamano	a9dcacbf2a	Merge branch 'jk/oidmap-cleanup' Code cleanup. * jk/oidmap-cleanup: raw_object_store: drop extra pointer to replace_map oidmap: add size function oidmap: rename oidmap_free() to oidmap_clear()	2025-05-19 16:02:47 -07:00
Junio C Hamano	9af978fa04	Merge branch 'rc/t1001-test-path-is-file' Test update. * rc/t1001-test-path-is-file: t1001: replace 'test -f' with 'test_path_is_file'	2025-05-19 16:02:47 -07:00
Junio C Hamano	4bb72548fc	Merge branch 'sc/bundle-uri-use-all-refs-in-bundle' Bundle-URI feature did not use refs recorded in the bundle other than normal branches as anchoring points to optimize the follow-up fetch during "git clone"; now it is told to utilize all. * sc/bundle-uri-use-all-refs-in-bundle: bundle-uri: add test for bundle-uri clones with tags bundle-uri: copy all bundle references ino the refs/bundle space	2025-05-19 16:02:45 -07:00
Junio C Hamano	0b8d22fd40	Merge branch 'pw/sequencer-reflog-use-after-free' Use-after-free fix in the sequencer. * pw/sequencer-reflog-use-after-free: sequencer: rework reflog message handling sequencer: move reflog message functions	2025-05-19 16:02:44 -07:00
Elijah Newren	29d7bf1951	merge-tree: add a new --quiet flag Git Forges may be interested in whether two branches can be merged while not being interested in what the resulting merge tree is nor which files conflicted. For such cases, add a new --quiet flag which will make use of the new mergeability_only flag added to merge-ort in the previous commit. This option allows the merge machinery to, in the outer layer of the merge: * exit early when a conflict is detected * avoid writing (most) merged blobs/trees to the object store Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 15:09:14 -07:00
Elijah Newren	e42667241d	sequencer: make it clearer that commit descriptions are just comments Every once in a while, users report that editing the commit summaries in the todo list does not get reflected in the rebase operation, suggesting that users are (a) only using one-line commit messages, and (b) not understanding that the commit summaries are merely helpful comments to help them find the right hashes. It may be difficult to correct users' poor commit messages, but we can at least try to make it clearer that the commit summaries are not directives of some sort by inserting a comment character. Hopefully that leads to them looking a little further and noticing the hints at the bottom to use 'reword' or 'edit' directives. Yes, this change may look funny at first since it hardcodes '#' rather than using comment_line_str. However: * comment_line_str exists to allow disambiguation between lines in a commit message and lines that are instructions to users editing the commit message. No such disambiguation is needed for these comments that occur on the same line after existing directives * the exact "comment" character(s) on regular pick lines used aren't actually important; I could have used anything, including completely random variable length text for each line and it'd work because we ignore everything after 'pick' and the hash. * The whole point of this change is to signal to users that they should NOT be editing any part of the line after the hash (and if they do so, their edits will be ignored), while the whole point of comment_line_str is to allow highly flexible editing. So making it more general by using comment_line_str actually feels counterproductive. * The character for merge directives absolutely must be '#'; that has been deeply hardcoded for a long time (see below), and will break if some other comment character is used instead. In a desire to have pick and merge directives be similar, I use the same comment character for both. * Perhaps merge directives could be fixed to not be inflexible about the comment character used, if someone feels highly motivated, but I think that should be done in a separate follow-on patch. Here are (some of?) the locations where '#' has already been hardcoded for a long time for merges: 1) In check_label_or_ref_arg(): case TODO_LABEL: /* * '#' is not a valid label as the merge command uses it to * separate merge parents from the commit subject. / 2) In do_merge(): / * For octopus merges, the arg starts with the list of revisions to be * merged. The list is optionally followed by '#' and the oneline. / merge_arg_len = oneline_offset = arg_len; for (p = arg; p - arg < arg_len; p += strspn(p, " \t\n")) { if (!p) break; if (p == '#' && (!p[1] \|\| isspace(p[1]))) { 3) In label_oid(): if ((buf->len == the_hash_algo->hexsz && !get_oid_hex(label, &dummy)) \|\| (buf->len == 1 && label == '#') \|\| hashmap_get_from_hash(&state->labels, strihash(label), label)) { /* * If the label already exists, or if the label is a * valid full OID, or the label is a '#' (which we use * as a separator between merge heads and oneline), we * append a dash and a number to make it unique. */ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:28:27 -07:00
Derrick Stolee	ecf9ba20e3	p2000: add performance test for patch-mode commands The previous three changes contributed performance improvements to 'git apply', 'git add -p', and 'git reset -p' when using a sparse index. The improvement to 'git apply' also improved 'git checkout -p'. Add performance tests to demonstrate this (and to help validate that performance remains good in the future). In the truncated test output below, we see that the full checkout performance changes within noise expectations, but the sparse index cases improve 33% and then 96% for 'git add -p' and 41% and then 95% for 'git reset -p'. 'git checkout -p' improves immediatley by 91% because it does not need any change to its builtin. Test HEAD~4 HEAD~3 HEAD~2 HEAD~1 ------------------------------------------------------------------------------------- 2000.118: ... git add -p (full-v3) 0.79 0.79 +0.0% 0.82 +3.8% 0.82 +3.8% 2000.119: ... git add -p (full-v4) 0.74 0.76 +2.7% 0.74 +0.0% 0.76 +2.7% 2000.120: ... git add -p (sparse-v3) 1.94 1.28 -34.0% 0.07 -96.4% 0.07 -96.4% 2000.121: ... git add -p (sparse-v4) 1.93 1.28 -33.7% 0.06 -96.9% 0.06 -96.9% 2000.122: ... git checkout -p (full-v3) 1.18 1.18 +0.0% 1.18 +0.0% 1.19 +0.8% 2000.123: ... git checkout -p (full-v4) 1.10 1.12 +1.8% 1.11 +0.9% 1.11 +0.9% 2000.124: ... git checkout -p (sparse-v3) 1.31 0.11 -91.6% 0.11 -91.6% 0.11 -91.6% 2000.125: ... git checkout -p (sparse-v4) 1.29 0.11 -91.5% 0.11 -91.5% 0.11 -91.5% 2000.126: ... git reset -p (full-v3) 0.81 0.80 -1.2% 0.83 +2.5% 0.83 +2.5% 2000.127: ... git reset -p (full-v4) 0.78 0.77 -1.3% 0.77 -1.3% 0.78 +0.0% 2000.128: ... git reset -p (sparse-v3) 1.58 0.92 -41.8% 0.91 -42.4% 0.07 -95.6% 2000.129: ... git reset -p (sparse-v4) 1.58 0.92 -41.8% 0.92 -41.8% 0.07 -95.6% It is worth noting that if our test was more involved and had multiple hunks to evaluate, then the time spent in 'git apply' would dominate due to multiple index loads and writes. As it stands, we need the sparse index improvement in 'git add -p' itself to confirm this performance improvement. Since the change for 'git add -i' is identical, we avoid a second test case for that similar operation. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:02:47 -07:00
Derrick Stolee	efab7dc1f4	reset: integrate sparse index with --patch Similar to the previous change for 'git add -p', the reset builtin checked for integration with the sparse index after possibly redirecting its logic toward the interactive logic. This means that the builtin would expand the sparse index to a full one upon read. Move this check earlier within cmd_reset() to improve performance here. Add tests to guarantee that we are not universally expanding the index. Add behavior tests to check that we are doing the same operations as a full index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:02:47 -07:00
Derrick Stolee	02ed8555f6	git add: make -p/-i aware of sparse index It is slow to expand a sparse index in-memory due to parsing of trees. We aim to minimize that performance cost when possible. 'git add -p' uses 'git apply' child processes to modify the index, but still there are some expansions that occur. It turns out that control flows out of cmd_add() in the interactive cases before the lines that confirm that the builtin is integrated with the sparse index. Moving that integration point earlier in cmd_add() allows 'git add -i' and 'git add -p' to operate without expanding a sparse index to a full one. Add test cases that confirm that these interactive add options work with the sparse index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:01:51 -07:00
Derrick Stolee	952de281fe	apply: integrate with the sparse index The sparse index allows storing directory entries in the index, marked with the skip-wortkree bit and pointing to a tree object. This may be an unexpected data shape for some implementation areas, so we are rolling it out incrementally on a builtin-per-builtin basis. This change enables the sparse index for 'git apply'. The main motivation for this change is that 'git apply' is used as a child process of 'git add -p' and expanding the sparse index for each of those child processes can lead to significant performance issues. The good news is that the actual index manipulation code used by 'git apply' is already integrated with the sparse index, so the only product change is to mark the builtin as allowing the sparse index so it isn't inflated on read. The more involved part of this change is around adding tests that verify how 'git apply' behaves in a sparse-checkout environment and whether or not the index expands in certain operations. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:00:33 -07:00
Moumita Dhar	ea8a71b40d	userdiff: extend Bash pattern to cover more shell function forms The previous function regex required explicit matching of function bodies using `{`, `(`, `((`, or `[[`, which caused several issues: - It failed to capture valid functions where `{` was on the next line due to line continuation (`\`). - It did not recognize functions with single command body, such as `x () echo hello`. Replacing the function body matching logic with `.*$`, ensures that everything on the function definition line is captured. Additionally, the word regex is refined to better recognize shell syntax, including additional parameter expansion operators and command-line options. Signed-off-by: Moumita Dhar <dhar61595@gmail.com> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 11:52:41 -07:00
Jeff King	65a6a79b42	hash-object: stop allowing unknown types When passed the "--literally" option, hash-object will allow any arbitrary string for its "-t" type option. Such objects are only useful for testing or debugging, as they cannot be used in the normal way (e.g., you cannot fetch their contents!). Let's drop this feature, which will eventually let us simplify the object-writing code. This is technically backwards incompatible, but since such objects were never really functional, it seems unlikely that anybody will notice. We will retain the --literally flag, as it also instructs hash-object not to worry about other format issues (e.g., type-specific things that fsck would complain about). The documentation does not need to be updated, as it was always vague about which checks we're loosening (it uses only the phrase "any garbage"). The code change is a bit hard to verify from just the patch text. We can drop our local hash_literally() helper, but it was really just wrapping write_object_file_literally(). We now replace that with calling index_fd(), as we do for the non-literal code path, but dropping the INDEX_FORMAT_CHECK flag. This ends up being the same semantically as what the _literally() code path was doing (modulo handling unknown types, which is our goal). We'll be able to clean up these code paths a bit more in subsequent patches. The existing test is flipped to show that we now reject the unknown type. The additional "extra-long type" test is now redundant, as we bail early upon seeing a bogus type. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 09:43:11 -07:00
Jeff King	b5643b60ac	t: add lib-loose.sh This commit adds a shell library for writing raw loose objects into the object database. Normally this is done with hash-object, but the specific intent here is to allow broken objects that hash-object may not support. We'll convert several cases that use "hash-object --literally" to write objects with invalid types. That works currently, but dropping this dependency will allow us to remove that feature and simplify the object-writing code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 09:43:11 -07:00
Jeff King	f2ed511a2f	t/helper: add zlib test-tool It's occasionally useful when testing or debugging to be able to do raw zlib inflate/deflate operations (e.g., to check the bytes of a specific loose or packed object). Even though zlib's deflate algorithm is used by many other programs, this is surprisingly hard to do in a portable way. E.g., gzip can do this if you manually munge some header bytes. But the result is somewhat arcane, and we don't assume gzip is available anyway. Likewise, pigz will handle raw zlib, but we can't assume it is available. So let's introduce a short test helper for just doing zlib operations. We'll use it in subsequent patches to add some new tests, but it would also have come in handy a few times in the past: - The hard-coded pack data from `3b910d0c5e` (add tests for indexing packs with delta cycles, 2013-08-23) could probably be generated on the fly. - Likewise we could avoid the hard-coded data from `0b1493c2d4` (git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT, 2025-02-25). Though note this would require support for more zlib options. - It would have helped with the debugging documented in `41dfbb2dbe` (howto: add article on recovering a corrupted object, 2013-10-25). I'll leave refactoring existing tests for another day, but I hope the examples above show the general utility. I aimed for simplicity in the code. In particular, it will read all input into a memory buffer, rather than streaming. That makes the zlib loops harder to get wrong (which has been a source of subtle bugs in the past). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 09:43:11 -07:00
Jeff King	4ae0e9423c	fsck: stop using object_info->type_name strbuf When fsck-ing a loose object, we use object_info's type_name strbuf to record the parsed object type as a string. For most objects this is redundant with the object_type enum, but it does let us report the string when we encounter an object with an unknown type (for which there is no matching enum value). There are a few downsides, though: 1. The code to report these cases is not actually robust. Since we did not pass a strbuf to unpack_loose_header(), we only retrieved types from headers up to 32 bytes. In longer cases, we'd simply say "object corrupt or missing". 2. This is the last caller that uses object_info's type_name strbuf support. It would be nice to refactor it so that we can simplify that code. 3. Likewise, we'll check the hash of the object using its unknown type (again, as long as that type is short enough). That depends on the hash_object_file_literally() code, which we'd eventually like to get rid of. So we can simplify things by bailing immediately in read_loose_object() when we encounter an unknown type. This has a few user-visible effects: a. Instead of producing a single line of error output like this: error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object is of unknown type 'bogus': .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 we'll now issue two lines (the first from read_loose_object() when we see the unparsable header, and the second from the fsck code, since we couldn't read the object): error: unable to parse type from header 'bogus 4' of .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object corrupt or missing: .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6 This is a little more verbose, but this sort of error should be rare (such objects are almost impossible to work with, and cannot be transferred between repositories as they are not representable in packfiles). And as a bonus, reporting the broken header in full could help with debugging other cases (e.g., a header like "blob xyzzy\0" would fail in parsing the size, but previously we'd not have showed the offending bytes). b. An object with an unknown type will be reported as corrupt, without actually doing a hash check. Again, I think this is unlikely to matter in practice since such objects are totally unusable. We'll update one fsck test to match the new error strings. And we can remove another test that covered the case of an object with an unknown type _and_ a hash corruption. Since we'll skip the hash check now in this case, the test is no longer interesting. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 09:43:10 -07:00
Jeff King	f227fc7d43	cat-file: make --allow-unknown-type a noop The cat-file command has some minor support for handling objects with "unknown" types. I.e., strings that are not "blob", "commit", "tree", or "tag". In theory this could be used for debugging or experimenting with extensions to Git. But in practice this support is not very useful: 1. You can get the type and size of such objects, but nothing else. Not even the contents! 2. Only loose objects are supported, since packfiles use numeric ids for the types, rather than strings. 3. Likewise you cannot ever transfer objects between repositories, because they cannot be represented in the packfiles used for the on-the-wire protocol. The support for these unknown types complicates the object-parsing code, and has led to bugs such as `b748ddb7a4` (unpack_loose_header(): fix infinite loop on broken zlib input, 2025-02-25). So let's drop it. The first step is to remove the user-facing parts, which are accessible only via cat-file. This is technically backwards-incompatible, but given the limitations listed above, these objects couldn't possibly be useful in any workflow. However, we can't just rip out the option entirely. That would hurt a caller who ran: git cat-file -t --allow-unknown-object <oid> and fed it normal, well-formed objects. There --allow-unknown-type was doing nothing, but we wouldn't want to start bailing with an error. So to protect any such callers, we'll retain --allow-unknown-type as a noop. The code change is fairly small (but we'll able to clean up more code in follow-on patches). The test updates drop any use of the option. We still retain tests that feed the broken objects to cat-file without --allow-unknown-type, as we should continue to confirm that those objects are rejected. Note that in one spot we can drop a layer of loop, re-indenting the body; viewing the diff with "-w" helps there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 09:43:09 -07:00
Junio C Hamano	38fc278819	Merge branch 'jc/t6011-mv-ro-fix' Test fix. * jc/t6011-mv-ro-fix: t6011: fix misconversion from perl to sed	2025-05-15 17:24:57 -07:00
Junio C Hamano	4dda60c9df	Merge branch 'ps/maintenance-missing-tasks' Make repository clean-up tasks "gc" can do available to "git maintenance" front-end. * ps/maintenance-missing-tasks: builtin/maintenance: introduce "rerere-gc" task builtin/gc: move rerere garbage collection into separate function builtin/maintenance: introduce "worktree-prune" task builtin/gc: move pruning of worktrees into a separate function builtin/gc: remove global variables where it is trivial to do builtin/gc: fix indentation of `cmd_gc()` parameters	2025-05-15 17:24:56 -07:00
shejialuo	784ceccb91	packed-backend: fsck should warn when "packed-refs" file is empty We assume the "packed-refs" won't be empty and instead has at least one line in it (even when there are no refs packed, there is the file header line). Because there is no terminating LF in the empty file, we will report "packedRefEntryNotTerminated(ERROR)" to the user. However, the runtime code paths would accept an empty "packed-refs" file, for example, "create_snapshot" would simply return the "snapshot" without checking the content of "packed-refs". So, we should skip checking the content of "packed-refs" when it is empty during fsck. After `694b7a1999` (repack_without_ref(): write peeled refs in the rewritten file, 2013-04-22), we would always write a header into the "packed-refs" file. So, versions of Git that are not too ancient never write such an empty "packed-refs" file. As an empty file often indicates a sign of a filesystem-level issue, the way we want to resolve this inconsistency is not make everybody totally silent but notice and report the anomaly. Let's create a "FSCK_INFO" message id "EMPTY_PACKED_REFS_FILE" to report to the users that "packed-refs" is empty. Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-14 12:32:58 -07:00
Junio C Hamano	330a09e4a5	Merge branch 'kj/glob-path-with-special-char' "git add 'f?o'" did not add 'foo' if 'f?o', an unusual pathname, also existed on the working tree, which has been corrected. * kj/glob-path-with-special-char: dir.c: literal match with wildcard in pathspec should still glob	2025-05-13 14:05:07 -07:00
Junio C Hamano	a4ad13dd19	Merge branch 'ng/xdiff-truly-minimal' "git diff --minimal" used to give non-minimal output when its optimization kicked in, which has been disabled. * ng/xdiff-truly-minimal: xdiff: disable cleanup_records heuristic with --minimal	2025-05-12 14:22:50 -07:00
Junio C Hamano	6dbc41631d	Merge branch 'ds/fix-thin-fix' "git index-pack --fix-thin" used to abort to prevent a cycle in delta chains from forming in a corner case even when there is no such cycle. * ds/fix-thin-fix: index-pack: allow revisiting REF_DELTA chains t5309: create failing test for 'git index-pack' test-tool: add pack-deltas helper	2025-05-12 14:22:49 -07:00
Jeff King	2744646834	oidmap: rename oidmap_free() to oidmap_clear() This function does not free the oidmap struct itself; it just drops all items from the map (using hashmap_clear_() internally). It should be called oidmap_clear(), per CodingGuidelines. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-12 13:06:26 -07:00
Patrick Steinhardt	1970333644	reftable: fix perf regression when reading blocks of unwanted type In `fd888311fb` (reftable/table: move reading block into block reader, 2025-04-07), we have refactored how reftable blocks are read so that most of the logic is contained in the "block.c" subsystem itself. Most importantly, the whole logic to read the data itself is now contained in that subsystem. This change caused a significant performance regression though when reading blocks that aren't of the specific type one is searching for: Benchmark 1: update-ref: create 100k refs (revision = fd888311fbc~) Time (mean ± σ): 2.171 s ± 0.028 s [User: 1.189 s, System: 0.977 s] Range (min … max): 2.117 s … 2.206 s 10 runs Benchmark 2: update-ref: create 100k refs (revision = `fd888311fb`) Time (mean ± σ): 3.418 s ± 0.030 s [User: 2.371 s, System: 1.037 s] Range (min … max): 3.377 s … 3.473 s 10 runs Summary update-ref: create 100k refs (revision = fd888311fbc~) ran 1.57 ± 0.02 times faster than update-ref: create 100k refs (revision = `fd888311fb`) The root caute of the performance regression is that we changed when exactly blocks of an uninteresting type are being discarded. Previous to the refactoring in the mentioned commit we'd load the block data, read its type, notice that it's not the wanted type and discard the block. After the commit though we don't discard the block immediately, but we fully decode it only to realize that it's not the desired type. We then discard the block again, but have already performed a bunch of pointless work. Fix the regression by making `reftable_block_init()` return early in case the block is not of the desired type. This fixes the performance hit: Benchmark 1: update-ref: create 100k refs (revision = HEAD~) Time (mean ± σ): 2.712 s ± 0.018 s [User: 1.990 s, System: 0.716 s] Range (min … max): 2.682 s … 2.741 s 10 runs Benchmark 2: update-ref: create 100k refs (revision = HEAD) Time (mean ± σ): 1.670 s ± 0.012 s [User: 0.991 s, System: 0.676 s] Range (min … max): 1.652 s … 1.693 s 10 runs Summary update-ref: create 100k refs (revision = HEAD) ran 1.62 ± 0.02 times faster than update-ref: create 100k refs (revision = HEAD~) Note that the baseline performance is lower than in the original due to a couple of unrelated performance improvements that have landed since the original commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-12 10:55:24 -07:00

1 2 3 4 5 ...

23682 Commits