admin/git

Author	SHA1	Message	Date
Derrick Stolee	928ef41dd8	repack: add --name-hash-version option The new '--name-hash-version' option for 'git repack' is a simple pass-through to the underlying 'git pack-objects' subcommand. However, this subcommand may have other options and a temporary filename as part of the subcommand execution that may not be predictable or could change over time. The existing test_subcommand method requires an exact list of arguments for the subcommand. This is too rigid for our needs here, so create a new method, test_subcommand_flex. Use it to check that the --name-hash-version option is passing through. Since we are modifying the 'git repack' command, let's bring its usage in line with the Documentation's synopsis. This removes it from the allow list in t0450 so it will remain in sync in the future. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-27 13:21:43 -08:00
Derrick Stolee	fc62e033cd	pack-objects: add --name-hash-version option The previous change introduced a new pack_name_hash_v2() function that intends to satisfy much of the hash locality features of the existing pack_name_hash() function while also distinguishing paths with similar final components of their paths. This change adds a new --name-hash-version option for 'git pack-objects' to allow users to select their preferred function version. This use of an integer version allows for future expansion and a direct way to later store a name hash version in the .bitmap format. For now, let's consider how effective this mechanism is when repacking a repository with different name hash versions. Specifically, we will execute 'git pack-objects' the same way a 'git repack -adf' process would, except we include --name-hash-version=<n> for testing. On the Git repository, we do not expect much difference. All path names are short. This is backed by our results: \| Stage \| Pack Size \| Repack Time \| \|-----------------------\|-----------\|-------------\| \| After clone \| 260 MB \| N/A \| \| --name-hash-version=1 \| 127 MB \| 129s \| \| --name-hash-version=2 \| 127 MB \| 112s \| This example demonstrates how there is some natural overhead coming from the cloned copy because the server is hosting many forks and has not optimized for exactly this set of reachable objects. But the full repack has similar characteristics for both versions. Let's consider some repositories that are hitting too many collisions with version 1. First, let's explore the kinds of paths that are commonly causing these collisions: * "/CHANGELOG.json" is 15 characters, and is created by the beachball [1] tool. Only the final character of the parent directory can differentiate different versions of this file, but also only the two most-significant digits. If that character is a letter, then this is always a collision. Similar issues occur with the similar "/CHANGELOG.md" path, though there is more opportunity for differences In the parent directory. * Localization files frequently have common filenames but differentiates via parent directories. In C#, the name "/strings.resx.lcl" is used for these localization files and they will all collide in name-hash. [1] https://github.com/microsoft/beachball I've come across many other examples where some internal tool uses a common name across multiple directories and is causing Git to repack poorly due to name-hash collisions. One open-source example is the fluentui [2] repo, which uses beachball to generate CHANGELOG.json and CHANGELOG.md files, and these files have very poor delta characteristics when comparing against versions across parent directories. \| Stage \| Pack Size \| Repack Time \| \|-----------------------\|-----------\|-------------\| \| After clone \| 694 MB \| N/A \| \| --name-hash-version=1 \| 438 MB \| 728s \| \| --name-hash-version=2 \| 168 MB \| 142s \| [2] https://github.com/microsoft/fluentui In this example, we see significant gains in the compressed packfile size as well as the time taken to compute the packfile. Using a collection of repositories that use the beachball tool, I was able to make similar comparisions with dramatic results. While the fluentui repo is public, the others are private so cannot be shared for reproduction. The results are so significant that I find it important to share here: \| Repo \| --name-hash-version=1 \| --name-hash-version=2 \| \|----------\|-----------------------\|-----------------------\| \| fluentui \| 440 MB \| 161 MB \| \| Repo B \| 6,248 MB \| 856 MB \| \| Repo C \| 37,278 MB \| 6,755 MB \| \| Repo D \| 131,204 MB \| 7,463 MB \| Future changes could include making --name-hash-version implied by a config value or even implied by default during a full repack. It is important to point out that the name hash value is stored in the .bitmap file format, so we must force --name-hash-version=1 when bitmaps are being read or written. Later, the bitmap format could be updated to be aware of the name hash version so deltas can be quickly computed across the bitmapped/not-bitmapped boundary. To promote the safety of this parameter, the validate_name_hash_version() method will die() if the given name-hash version is incorrect and will disable newer versions if not yet compatible with other features, such as --write-bitmap-index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-01-27 13:21:41 -08:00
Taylor Blau	1c5a712f26	Merge branch 'jk/dumb-http-finalize' The dumb-http code regressed when the result of re-indexing a pack yielded an .idx file that differs in content from the .idx file it downloaded from the remote. This has been corrected by no longer relying on the .idx file we got from the remote. jk/dumb-http-finalize: packfile: use oidread() instead of hashcpy() to fill object_id packfile: use object_id in find_pack_entry_one() packfile: convert find_sha1_pack() to use object_id http-walker: use object_id instead of bare hash packfile: warn people away from parse_packed_git() packfile: drop sha1_pack_index_name() packfile: drop sha1_pack_name() packfile: drop has_pack_index() dumb-http: store downloaded pack idx as tempfile t5550: count fetches in "previously-fetched .idx" test midx: avoid duplicate packed_git entries	2024-11-01 12:53:32 -04:00
Taylor Blau	aebc4bd8ce	Merge branch 'ak/more-typofixes' More typofixes. * ak/more-typofixes: t: fix typos	2024-11-01 12:53:29 -04:00
Taylor Blau	43ac23945c	Merge branch 'rs/grep-lookahead' Fix 'git grep' regression on macOS by disabling lookahead when encountering invalid UTF-8 byte sequences. * rs/grep-lookahead: grep: disable lookahead on error	2024-11-01 12:53:28 -04:00
Taylor Blau	81a5461518	Merge branch 'ak/t1016-cleanup' Test cleanup. * ak/t1016-cleanup: t1016: clean up style	2024-11-01 12:53:27 -04:00
Taylor Blau	59dc0ab83c	Merge branch 'ua/atoi' Replace various calls to atoi() with strtol_i() and strtoul_ui(), and add improved error handling. * ua/atoi: imap: replace atoi() with strtol_i() for UIDVALIDITY and UIDNEXT parsing merge: replace atoi() with strtol_i() for marker size validation daemon: replace atoi() with strtoul_ui() and strtol_i()	2024-11-01 12:53:26 -04:00
Taylor Blau	20ab7fa3b6	Merge branch 'sa/notes-edit' Teach 'git notes add' and 'git notes append' a new '-e' flag, instructing them to open the note in $GIT_EDITOR before saving. * sa/notes-edit: notes: teach the -e option to edit messages in editor	2024-11-01 12:53:25 -04:00
Taylor Blau	8237b49ade	Merge branch 'sk/t9101-cleanup' Test cleanup. * sk/t9101-cleanup: t9101: ensure no whitespace after redirect	2024-11-01 12:53:24 -04:00
Taylor Blau	07c6066f82	Merge branch 'kh/mv-breakage' Demonstrate an assertion failure in 'git mv'. * kh/mv-breakage: t7001: add failure test which triggers assertion	2024-11-01 12:53:21 -04:00
Taylor Blau	a524cc77ad	Merge branch 'ua/t3404-cleanup' Test update. * ua/t3404-cleanup: t3404: replace test with test_line_count() t3404: avoid losing exit status with focus on `git show` and `git cat-file`	2024-11-01 12:53:18 -04:00
Taylor Blau	268fd2fe58	Merge branch 'ps/platform-compat-fixes' Various platform compatibility fixes split out of the larger effort to use Meson as the primary build tool. * ps/platform-compat-fixes: t6006: fix prereq handling with `test_format ()` http: fix build error on FreeBSD builtin/credential-cache: fix missing parameter for stub function t7300: work around platform-specific behaviour with long paths on MinGW t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin t3404: work around platform-specific behaviour on macOS 10.15 t1401: make invocation of tar(1) work with Win32-provided one t/lib-gpg: fix setup of GNUPGHOME in MinGW t/lib-gitweb: test against the build version of gitweb t/test-lib: wire up NO_ICONV prerequisite t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE	2024-11-01 12:53:17 -04:00
Taylor Blau	6aa984af3b	Merge branch 'sk/t7011-cleanup' Test cleanup. * sk/t7011-cleanup: t7011: ensure no whitespace after redirect	2024-10-30 13:08:07 -04:00
Taylor Blau	8d305fdaac	Merge branch 'co/t6050-pipefix' Avoid losing exit status by having Git command being tested on the upstream side of a pipe. * co/t6050-pipefix: t6050: avoid pipes with upstream Git commands	2024-10-30 13:08:06 -04:00
Taylor Blau	cec4461e3f	Merge branch 'ks/t4205-fixup' Testfix. * ks/t4205-fixup: t4205: fix typo in 'NUL termination with --stat'	2024-10-30 13:08:05 -04:00
Taylor Blau	bc627658b0	Merge branch 'ps/reftable-strbuf' Implements a new reftable-specific strbuf replacement to reduce reftable's dependency on Git-specific data structures. * ps/reftable-strbuf: reftable: handle trivial `reftable_buf` errors reftable/stack: adapt `stack_filename()` to handle allocation failures reftable/record: adapt `reftable_record_key()` to handle allocation failures reftable/stack: adapt `format_name()` to handle allocation failures t/unit-tests: check for `reftable_buf` allocation errors reftable/blocksource: adapt interface name reftable: convert from `strbuf` to `reftable_buf` reftable/basics: provide new `reftable_buf` interface reftable: stop using `strbuf_addf()` reftable: stop using `strbuf_addbuf()`	2024-10-30 13:08:01 -04:00
Patrick Steinhardt	dd6003f200	t6006: fix prereq handling with `test_format ()` In `df383b5842` (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we have introduced a new NO_ICONV prerequisite that makes us skip tests in case Git is not compiled with support for iconv. This change subtly broke t6006: while the test suite still passes, some of its tests won't execute because they run into an error. ./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found The broken tests use `test_format ()`, and the mentioned commit simply prepended the new prerequisite to its arguments. But that does not work, as the function is not aware of prereqs at all and will now treat all of its arguments incorrectly. Fix this by making the function aware of prereqs by accepting an optional fourth argument. Adapt the callsites accordingly. Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-28 13:44:38 -04:00
Jeff King	479ab76c9f	packfile: use object_id in find_pack_entry_one() The main function we use to search a pack index for an object is find_pack_entry_one(). That function still takes a bare pointer to the hash, despite the fact that its underlying bsearch_pack() function needs an object_id struct. And so we end up making an extra copy of the hash into the struct just to do a lookup. As it turns out, all callers but one already have such an object_id. So we can just take a pointer to that struct and use it directly. This avoids the extra copy and provides a more type-safe interface. The one exception is get_delta_base() in packfile.c, when we are chasing a REF_DELTA from inside the pack (and thus we have a pointer directly to the mmap'd pack memory, not a struct). We can just bump the hashcpy() from inside find_pack_entry_one() to this one caller that needs it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-25 17:35:46 -04:00
Jeff King	63aca3f7f1	dumb-http: store downloaded pack idx as tempfile This patch fixes a regression in `b1b8dfde69` (finalize_object_file(): implement collision check, 2024-09-26) where fetching a v1 pack idx file over the dumb-http protocol would cause the fetch to fail. The core of the issue is that dumb-http stores the idx we fetch from the remote at the same path that will eventually hold the idx we generate from "index-pack --stdin". The sequence is something like this: 0. We realize we need some object X, which we don't have locally, and nor does the other side have it as a loose object. 1. We download the list of remote packs from objects/info/packs. 2. For each entry in that file, we download each pack index and store it locally in .git/objects/pack/pack-$hash.idx (the $hash is not something we can verify yet and is given to us by the remote). 3. We check each pack index we got to see if it has object X. When we find a match, we download the matching .pack file from the remote to a tempfile. We feed that to "index-pack --stdin", which reindexes the pack, rather than trusting that it has what the other side claims it does. In most cases, this will end up generating the exact same (byte-for-byte) pack index which we'll store at the same pack-$hash.idx path, because the index generation and $hash id are computed based on what's in the packfile. But: a. The other side might have used other options to generate the index. For instance we use index v2 by default, but long ago it was v1 (and you can still ask for v1 explicitly). b. The other side might even use a different mechanism to determine $hash. E.g., long ago it was based on the sorted list of objects in the packfile, but we switched to using the pack checksum in `1190a1acf8` (pack-objects: name pack files after trailer hash, 2013-12-05). The regression we saw in the real world was (3a). A recent client fetching from a server with a v1 index downloaded that index, then complained about trying to overwrite it with its own v2 index. This collision is otherwise harmless; we know we want to replace the remote version with our local one, but the collision check doesn't realize that. There are a few options to fix it: - we could teach index-pack a command-line option to ignore only pack idx collisions, and use it when the dumb-http code invokes index-pack. This would be an awkward thing to expose users to and would involve a lot of boilerplate to get the option down to the collision code. - we could delete the remote .idx file right before running index-pack. It should be redundant at that point (since we've just downloaded the matching pack). But it feels risky to delete something from our own .git/objects based on what the other side has said. I'm not entirely positive that a malicious server couldn't lie about which pack-$hash.idx it has and get us to delete something precious. - we can stop co-mingling the downloaded idx files in our local objects directory. This is a slightly bigger change but I think fixes the root of the problem more directly. This patch implements the third option. The big design questions are: where do we store the downloaded files, and how do we manage their lifetimes? There are some additional quirks to the dumb-http system we should consider. Remember that in step 2 we downloaded every pack index, but in step 3 we may only download some of the matching packs. What happens to those other idx files now? They sit in the .git/objects/pack directory, possibly waiting to be used at a later date. That may save bandwidth for a subsequent fetch, but it also creates a lot of weird corner cases: - our local object directory now has semi-untrusted .idx files sitting around, without their matching .pack - in case 3b, we noted that we might not generate the same hash as the other side. In that case even if we download the matching pack, our index-pack invocation will store it in a different pack-$hash.idx file. And the unmatched .idx will sit there forever. - if the server repacks, it may delete the old packs. Now we have these orphaned .idx files sitting around locally that will never be used (nor deleted). - if we repack locally we may delete our local version of the server's pack index and not realize we have it. So we'll download it again, even though we have all of the objects it mentions. I think the right solution here is probably some more complex cache management system: download the remote .idx files to their own storage directory, mark them as "seen" when we get their matching pack (to avoid re-downloading even if we repack), and then delete them when the server's objects/info/refs no longer mentions them. But since the dumb http protocol is so ancient and so inferior to the smart http protocol, I don't think it's worth spending a lot of time creating such a system. For this patch I'm just downloading the idx files to .git/objects/tmp_pack_*, and marking them as tempfiles to be deleted when we exit (and due to the name, any we miss due to a crash, etc, should eventually be removed by "git gc" runs based on timestamps). That is slightly worse for one case: if we download an idx but not the matching pack, we won't retain that idx for subsequent runs. But the flip side is that we're making other cases better (we never hold on to useless idx files forever). I suspect that worse case does not even come up often, since it implies that the packs are generated to match distinct parts of history (i.e., in practice even in a repo with many packs you're going to end up grabbing all of those packs to do a clone). If somebody really cares about that, I think the right path forward is a managed cache directory as above, and this patch is providing the first step in that direction anyway (by moving things out of the objects/pack/ directory). There are two test changes. One demonstrates the broken v1 index case (it double-checks the resulting clone with fsck to be careful, but prior to this patch it actually fails at the clone step). The other tweaks the expectation for a test that covers the "slightly worse" case to accommodate the extra index download. The code changes are fairly simple. We stop using finalize_object_file() to copy the remote's index file into place, and leave it as a tempfile. We give the tempfile a real ".idx" name, since the packfile code expects that, and thus we make sure it is out of the usual packs/ directory (so we'd never mistake it for a real local .idx). We also have to change parse_pack_index(), which creates a temporary packed_git to access our index (we need this because all of the pack idx code assumes we have that struct). It reads the index data from the tempfile, but prior to this patch would speculatively write the finalized name into the packed_git struct using the pack-$hash we expect to use. I was mildly surprised that this worked at all, since we call verify_pack_index() on the packed_git which mentions the final name before moving the file into place! But it works because parse_pack_index() leaves the mmap-ed data in the struct, so the lazy-open in verify_pack_index() never triggers, and we read from the tempfile, ignoring the filename in the struct completely. Hacky, but it works. After this patch, parse_pack_index() now uses the index filename we pass in to derive a matching .pack name. This is OK to change because there are only two callers, both in the dumb http code (and the other passes in an existing pack-$hash.idx name, so the derived name is going to be pack-$hash.pack, which is what we were using anyway). I'll follow up with some more cleanups in that area, but this patch is sufficient to fix the regression. Reported-by: fox <fox.gbr@townlong-yak.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-25 17:35:46 -04:00
Jeff King	019b21d402	t5550: count fetches in "previously-fetched .idx" test We have a test in t5550 that looks at index fetching over dumb http. It creates two branches, each of which is completely stored in its own pack, then fetches the branches independently. What should (and does) happen is that the first fetch grabs both .idx files and one .pack file, and then the fetch of the second branch re-uses the previously downloaded .idx files (fetching none) and grabs the now-required .pack file. Since the next few patches will be touching this area of the code, let's beef up the test a little by checking that we're downloading the expected items at each step. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-25 17:35:46 -04:00
Jeff King	8b5763e8fa	midx: avoid duplicate packed_git entries When we scan a pack directory to load the idx entries we find into the packed_git list, we skip any of them that are contained in a midx. We then load them later lazily if we actually need to access the corresponding pack, referencing them both from the midx struct and the packed_git list. The lazy-load in the midx code checks to see if the midx already mentions the pack, but doesn't otherwise check the packed_git list. This makes sense, since we should have added any pack to both lists. But there's a loophole! If we call close_object_store(), that frees the midx entirely, but _not_ the packed_git structs, which we must keep around for Reasons[1]. If we then try to look up more objects, we'll auto-load the midx again, which won't realize that we've already loaded those packs, and will create duplicate entries in the packed_git list. This is possibly inefficient, because it means we may open and map the pack redundantly. But it can also lead to weird user-visible behavior. The case I found is in "git repack", which closes and reopens the midx after repacking and then calls update_server_info(). We end up writing the duplicate entries into objects/info/packs. We could obviously de-dup them while writing that file, but it seems like a violation of more core assumptions that we end up with these duplicate structs at all. We can avoid the duplicates reasonably efficiently by checking their names in the pack_map hash. This annoyingly does require a little more than a straight hash lookup due to the naming conventions, but it should only happen when we are about to actually open a pack. I don't think one extra malloc will be noticeable there. [1] I'm not entirely sure of all the details, except that we generally assume the packed_git structs never go away. We noted this restriction in the comment added by `6f1e9394e2` (object: fix leaking packfiles when closing object store, 2024-08-08), but it's somewhat vague. At any rate, if you try freeing the structs in close_object_store(), you can observe segfaults all over the test suite. So it might be fixable, but it's not trivial. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-25 17:35:46 -04:00
Taylor Blau	55d12c24d7	Merge branch 'wm/shortlog-hash' Teaches 'shortlog' to explicitly use SHA-1 when operating outside of a repository. * wm/shortlog-hash: builtin/shortlog: explicitly set hash algo when there is no repo	2024-10-25 14:02:49 -04:00
Taylor Blau	448022a7fb	Merge branch 'bf/t-readme-mention-reftable' Doc update. * bf/t-readme-mention-reftable: t/README: add missing value for GIT_TEST_DEFAULT_REF_FORMAT	2024-10-25 14:02:21 -04:00
Taylor Blau	f25bb60393	Merge branch 'ak/typofix' More typofixes. * ak/typofix: t: fix typos	2024-10-25 14:02:08 -04:00
Taylor Blau	4d334e5205	Merge branch 'ak/typofixes' Typofixes. * ak/typofixes: t: fix typos t/helper: fix a typo t/perf: fix typos t/unit-tests: fix typos contrib: fix typos compat: fix typos	2024-10-25 14:02:04 -04:00
Taylor Blau	55bc7d54ab	Merge branch 'ps/ci-gitlab-windows' Enable Windows-based CI in GitLab. * ps/ci-gitlab-windows: gitlab-ci: exercise Git on Windows gitlab-ci: introduce stages and dependencies ci: handle Windows-based CI jobs in GitLab CI ci: create script to set up Git for Windows SDK t7300: work around platform-specific behaviour with long paths on MinGW	2024-10-25 14:01:21 -04:00
Taylor Blau	6cbcc68ea7	Merge branch 'db/submodule-fetch-with-remote-name-fix' A "git fetch" from the superproject going down to a submodule used a wrong remote when the default remote names are set differently between them. * db/submodule-fetch-with-remote-name-fix: submodule: correct remote name with fetch	2024-10-25 14:01:09 -04:00
Usman Akinyemi	e36f009e69	merge: replace atoi() with strtol_i() for marker size validation Replace atoi() with strtol_i() for parsing conflict-marker-size to improve error handling. Invalid values, such as those containing letters now trigger a clear error message. Update the test to verify invalid input handling. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-24 14:03:44 -04:00
Usman Akinyemi	cc4023477f	daemon: replace atoi() with strtoul_ui() and strtol_i() Replace atoi() with strtoul_ui() for --timeout and --init-timeout (non-negative integers) and with strtol_i() for --max-connections (signed integers). This improves error handling and input validation by detecting invalid values and providing clear error messages. Update tests to ensure these arguments are properly validated. Signed-off-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-24 14:03:43 -04:00
Andrew Kreimer	f56f9d6c0b	t: fix typos Fix typos and grammar in documentation, comments, etc. Via codespell. Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-24 12:45:53 -04:00
Kristoffer Haugsbakk	0fcd473fdd	t7001: add failure test which triggers assertion `git mv a/a.txt a b/` is a nonsense instruction. Instead of failing gracefully the command trips over itself,[1] leaving behind unfinished work: 1. first it moves `a/a.txt` to `b/a.txt`; then 2. tries to move `a/`, including `a/a.txt`; then 3. figures out that it’s in a bad state (assertion); and finally 4. aborts. Now you’re left with a partially-updated index. The command should instead fail gracefully and make no changes to the index until it knows that it can complete a sensible action. For now just add a failing test since this has been known about for a while.[2] † 1: Caused by a `pos >= 0` assertion [2]: https://lore.kernel.org/git/d1f739fe-b28e-451f-9e01-3d2e24a0fe0d@app.fastmail.com/ Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-23 16:36:15 -04:00
Seyi Kuforiji	09bf122507	t9101: ensure no whitespace after redirect This change updates the script to conform to the coding standards outlined in the Git project's documentation. According to the guidelines in Documentation/CodingGuidelines under "Redirection operators", there should be no whitespace after redirection operators. Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-23 14:57:32 -04:00
Seyi Kuforiji	91687cd13f	t7011: ensure no whitespace after redirect This change updates the script to conform to the coding standards outlined in the Git project's documentation. According to the guidelines in Documentation/CodingGuidelines under "Redirection operators", there should be no whitespace after redirection operators. Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-22 15:37:59 -04:00
Taylor Blau	8e08668322	Merge branch 'cw/worktree-relative' An extra worktree attached to a repository points at each other to allow finding the repository from the worktree and vice versa possible. Turn this linkage to relative paths. * cw/worktree-relative: worktree: add test for path handling in linked worktrees worktree: link worktrees with relative paths worktree: refactor infer_backlink() to use *strbuf worktree: repair copied repository and linked worktrees	2024-10-22 14:40:39 -04:00
Taylor Blau	6ca9a05e63	Merge branch 'ps/cache-tree-w-broken-index-entry' Fail gracefully instead of crashing when attempting to write the contents of a corrupt in-core index as a tree object. * ps/cache-tree-w-broken-index-entry: unpack-trees: detect mismatching number of cache-tree/index entries cache-tree: detect mismatching number of index entries cache-tree: refactor verification to return error codes	2024-10-22 14:40:38 -04:00
Chizoba ODINAKA	9e362dd060	t6050: avoid pipes with upstream Git commands In pipes, the exit code of a chain of commands is determined by the final command. In order not to miss the exit code of a failed Git command, avoid pipes instead write output of Git commands into a file. For better debugging experience, instances of "grep" were changed to "test_grep". "test_grep" provides more context in case of a failed "grep". Signed-off-by: Chizoba ODINAKA <chizobajames21@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-22 12:47:27 -04:00
René Scharfe	ce025ae4f6	grep: disable lookahead on error regexec(3) can fail. E.g. on macOS it fails if it is used with an UTF-8 locale to match a valid regex against a buffer containing invalid UTF-8 characters. git grep has two ways to search for matches in a file: Either it splits its contents into lines and matches them separately, or it matches the whole content and figures out line boundaries later. The latter is done by look_ahead() and it's quicker in the common case where most files don't contain a match. Fall back to line-by-line matching if look_ahead() encounters an regexec(3) error by propagating errors out of patmatch() and bailing out of look_ahead() if there is one. This way we at least can find matches in lines that contain only valid characters. That matches the behavior of grep(1) on macOS. pcre2match() dies if pcre2_jit_match() or pcre2_match() fail, but since we use the flag PCRE2_MATCH_INVALID_UTF it handles invalid UTF-8 characters gracefully. So implement the fall-back only for regexec(3) and leave the PCRE2 matching unchanged. Reported-by: David Gstir <david@sigma-star.at> Signed-off-by: René Scharfe <l.s.r@web.de> Tested-by: David Gstir <david@sigma-star.at> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-22 12:45:49 -04:00
Andrew Kreimer	c348192afe	t1016: clean up style Use `test_config`. Remove whitespace after redirect operator. Reported-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-22 12:35:05 -04:00
Kousik Sanagavarapu	a73070fbd4	t4205: fix typo in 'NUL termination with --stat' Correct "expected" to rightly terminate with NUL ie '\0' instead of '0' which may have been typoed. We didn't notice this before because the test is run with "test_expect_failure", meaning the test would have been marked broken anyways. Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-21 17:37:11 -04:00
Abraham Samuel Adekunle	dab0b9e176	notes: teach the -e option to edit messages in editor Notes can be added to a commit using: - "-m" to provide a message on the command line. - -C to copy a note from a blob object. - -F to read the note from a file. When these options are used, Git does not open an editor, it simply takes the content provided via these options and attaches it to the commit as a note. Improve flexibility to fine-tune the note before finalizing it by allowing the messages to be prefilled in the editor and edited after the messages have been provided through -[mF]. Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-21 15:52:48 -04:00
Taylor Blau	c1662a00b6	Merge branch 'ps/maintenance-start-crash-fix' "git maintenance start" crashed due to an uninitialized variable reference, which has been corrected. * ps/maintenance-start-crash-fix: builtin/gc: fix crash when running `git maintenance start`	2024-10-18 13:56:26 -04:00
Taylor Blau	6fe1b8cee0	Merge branch 'ng/rebase-merges-branch-name-as-label' "git rebase --rebase-merges" now uses branch names as labels when able. * ng/rebase-merges-branch-name-as-label: rebase-merges: try and use branch names as labels rebase-update-refs: extract load_branch_decorations load_branch_decorations: fix memory leak with non-static filters	2024-10-18 13:56:22 -04:00
Taylor Blau	020c16bdb9	Merge branch 'aa/t7300-modernize' Test modernization. * aa/t7300-modernize: t7300-clean.sh: use test_path_* helper functions for error logging	2024-10-18 13:54:43 -04:00
Patrick Steinhardt	31eedd1d11	t/unit-tests: check for `reftable_buf` allocation errors Adapt our unit tests to check for allocations errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:59:56 -04:00
Patrick Steinhardt	f177d49163	reftable/blocksource: adapt interface name Adapt the name of the `strbuf` block source to no longer relate to this interface, but instead to the `reftable_buf` interface. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:59:56 -04:00
Patrick Steinhardt	be4c070a3c	reftable: convert from `strbuf` to `reftable_buf` Convert the reftable library to use the `reftable_buf` interface instead of the `strbuf` interface. This is mostly a mechanical change via sed(1) with some manual fixes where functions for `strbuf` and `reftable_buf` differ. The converted code does not yet handle allocation failures. This will be handled in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:59:56 -04:00
Patrick Steinhardt	7fa7e14ebe	reftable: stop using `strbuf_addf()` We're about to introduce our own `reftable_buf` type to replace `strbuf`. One function we'll have to convert is `strbuf_addf()`, which is used in a handful of places. This function uses `snprintf()` internally, which makes porting it a bit more involved: - It is not available on all platforms. - Some platforms like Windows have broken implementations. So by using `snprintf()` we'd also push the burden on downstream users of the reftable library to make available a properly working version of it. Most callsites of `strbuf_addf()` are trivial to convert to not using it. We do end up using `snprintf()` in our unit tests, but that isn't much of a problem for downstream users of the reftable library. While at it, remove a useless call to `strbuf_reset()` in `t_reftable_stack_auto_compaction_with_locked_tables()`. We don't write to the buffer before this and initialize it with `STRBUF_INIT`, so there is no need to reset anything. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:59:55 -04:00
Andrew Kreimer	f1eea0b620	t: fix typos Fix typos in documentation, comments, etc. Via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:14:56 -04:00
Wolfgang Müller	b33001645e	builtin/shortlog: explicitly set hash algo when there is no repo Whilst git-shortlog(1) does not explicitly need any repository information when run without reference to one, it still parses some of its arguments with parse_revision_opt() which assumes that the hash algorithm is set. However, in `c8aed5e8da` (repository: stop setting SHA1 as the default object hash, 2024-05-07) we stopped setting up a default hash algorithm and instead require commands to set it up explicitly. This was done for most other commands like in `ab274909d4` (builtin/diff: explicitly set hash algo when there is no repo, 2024-05-07) but was missed for builtin/shortlog, making git-shortlog(1) segfault outside of a repository when given arguments like --author that trigger a call to parse_revision_opt(). Fix this for now by explicitly setting the hash algorithm to SHA1. Also add a regression test for the segfault. Thanks-to: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Wolfgang Müller <wolf@oriole.systems> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-17 16:10:54 -04:00
Patrick Steinhardt	bb0d76dbf7	t7300: work around platform-specific behaviour with long paths on MinGW Windows by default has a restriction in place to only allow paths up to 260 characters. This restriction can nowadays be lifted by setting a registry key, but is still active by default. In t7300 we have one test that exercises the behaviour of git-clean(1) with such long paths. Interestingly enough, this test fails on my system that uses Windows 10 with mingw-w64 installed via MSYS2: instead of observing ENAMETOOLONG, we observe ENOENT. This behaviour is consistent across multiple different environments I have tried. I cannot say why exactly we observe a different error here, but I would not be surprised if this was either dependent on the Windows version, the version of MinGW, the current working directory of Git or any kind of combination of these. Work around the issue by handling both errors. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-16 17:00:49 -04:00

1 2 3 4 5 ...

23021 Commits