admin/git

Author	SHA1	Message	Date
Junio C Hamano	125e389470	Merge branch 'xx/bundie-uri-fixes' When bundleURI interface fetches multiple bundles, Git failed to take full advantage of all bundles and ended up slurping duplicated objects. * xx/bundie-uri-fixes: unbundle: extend object verification for fetches fetch-pack: expose fsckObjects configuration logic bundle-uri: verify oid before writing refs	2024-07-08 14:53:11 -07:00
Junio C Hamano	3997614c24	Merge branch 'ps/leakfixes-more' More memory leaks have been plugged. * ps/leakfixes-more: (29 commits) builtin/blame: fix leaking ignore revs files builtin/blame: fix leaking prefixed paths blame: fix leaking data for blame scoreboards line-range: plug leaking find functions merge: fix leaking merge bases builtin/merge: fix leaking `struct cmdnames` in `get_strategy()` sequencer: fix memory leaks in `make_script_with_merges()` builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()` apply: fix leaking string in `match_fragment()` sequencer: fix leaking string buffer in `commit_staged_changes()` commit: fix leaking parents when calling `commit_tree_extended()` config: fix leaking "core.notesref" variable rerere: fix various trivial leaks builtin/stash: fix leak in `show_stash()` revision: free diff options builtin/log: fix leaking commit list in git-cherry(1) merge-recursive: fix memory leak when finalizing merge builtin/merge-recursive: fix leaking object ID bases builtin/difftool: plug memory leaks in `run_dir_diff()` object-name: free leaking object contexts ...	2024-07-08 14:53:10 -07:00
Junio C Hamano	ecf7fc600a	Merge branch 'tb/path-filter-fix' The Bloom filter used for path limited history traversal was broken on systems whose "char" is unsigned; update the implementation and bump the format version to 2. * tb/path-filter-fix: bloom: introduce `deinit_bloom_filters()` commit-graph: reuse existing Bloom filters where possible object.h: fix mis-aligned flag bits table commit-graph: new Bloom filter version that fixes murmur3 commit-graph: unconditionally load Bloom filters bloom: prepare to discard incompatible Bloom filters bloom: annotate filters with hash version repo-settings: introduce commitgraph.changedPathsVersion t4216: test changed path filters with high bit paths t/helper/test-read-graph: implement `bloom-filters` mode bloom.h: make `load_bloom_filter_from_graph()` public t/helper/test-read-graph.c: extract `dump_graph_info()` gitformat-commit-graph: describe version 2 of BDAT commit-graph: ensure Bloom filters are read with consistent settings revision.c: consult Bloom filters for root commits t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`	2024-07-08 14:53:10 -07:00
Junio C Hamano	6f75d230a1	Merge branch 'db/date-underflow-fix' date parser updates to be more careful about underflowing epoch based timestamp. * db/date-underflow-fix: date: detect underflow/overflow when parsing dates with timezone offset t0006: simplify prerequisites	2024-07-08 14:53:09 -07:00
Junio C Hamano	4e18cd5ef7	Merge branch 'rj/pager-die-upon-exec-failure' When GIT_PAGER failed to spawn, depending on the code path taken, we failed immediately (correct) or just spew the payload to the standard output (incorrect). The code now always fail immediately when GIT_PAGER fails. * rj/pager-die-upon-exec-failure: pager: die when paging to non-existing command	2024-07-08 14:53:08 -07:00
Junio C Hamano	2d97b4e235	Merge branch 'rs/diff-color-moved-w-no-ext-diff-fix' "git diff --no-ext-diff" when diff.external is configured ignored the "--color-moved" option. * rs/diff-color-moved-w-no-ext-diff-fix: diff: allow --color-moved with --no-ext-diff	2024-07-02 09:59:02 -07:00
Junio C Hamano	ca463101c8	Merge branch 'jk/remote-wo-url' Memory ownership rules for the in-core representation of remote..url configuration values have been straightened out, which resulted in a few leak fixes and code clarification. jk/remote-wo-url: remote: drop checks for zero-url case remote: always require at least one url in a remote t5801: test remote..vcs config t5801: make remote-testgit GIT_DIR setup more robust remote: allow resetting url list config: document remote..url/pushurl interaction remote: simplify url/pushurl selection remote: use strvecs to store remote url/pushurl remote: transfer ownership of memory in add_url(), etc remote: refactor alias_url() memory ownership archive: fix check for missing url	2024-07-02 09:59:01 -07:00
Junio C Hamano	7b472da915	Merge branch 'ps/use-the-repository' A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help transition the codebase to rely less on the availability of the singleton the_repository instance. * ps/use-the-repository: hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE` t/helper: remove dependency on `the_repository` in "proc-receive" t/helper: fix segfault in "oid-array" command without repository t/helper: use correct object hash in partial-clone helper compat/fsmonitor: fix socket path in networked SHA256 repos replace-object: use hash algorithm from passed-in repository protocol-caps: use hash algorithm from passed-in repository oidset: pass hash algorithm when parsing file http-fetch: don't crash when parsing packfile without a repo hash-ll: merge with "hash.h" refs: avoid include cycle with "repository.h" global: introduce `USE_THE_REPOSITORY_VARIABLE` macro hash: require hash algorithm in `empty_tree_oid_hex()` hash: require hash algorithm in `is_empty_{blob,tree}_oid()` hash: make `is_null_oid()` independent of `the_repository` hash: convert `oidcmp()` and `oideq()` to compare whole hash global: ensure that object IDs are always padded hash: require hash algorithm in `oidread()` and `oidclr()` hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions	2024-07-02 09:59:00 -07:00
Junio C Hamano	ae447ed130	Merge branch 'ew/cat-file-unbuffered-tests' The output from "git cat-file --batch-check" and "--batch-command (info)" should not be unbuffered, for which some tests have been added. * ew/cat-file-unbuffered-tests: t1006: ensure cat-file info isn't buffered by default Git.pm: use array in command_bidi_pipe example	2024-07-02 09:58:59 -07:00
Junio C Hamano	b781a3e08e	Merge branch 'jk/fetch-pack-fsck-wo-lock-pack' "git fetch-pack -k -k" without passing "--lock-pack" (which we never do ourselves) did not work at all, which has been corrected. * jk/fetch-pack-fsck-wo-lock-pack: fetch-pack: fix segfault when fscking without --lock-pack	2024-06-27 09:19:59 -07:00
Junio C Hamano	b8d1a1b06c	Merge branch 'jk/t5500-typofix' A helper function shared between two tests had a copy-paste bug, which has been corrected. * jk/t5500-typofix: t5500: fix mistaken $SERVER reference in helper function	2024-06-27 09:19:59 -07:00
Junio C Hamano	6c0bfce914	Merge branch 'kz/merge-fail-early-upon-refresh-failure' When "git merge" sees that the index cannot be refreshed (e.g. due to another process doing the same in the background), it died but after writing MERGE_HEAD etc. files, which was useless for the purpose to recover from the failure. * kz/merge-fail-early-upon-refresh-failure: merge: avoid write merge state when unable to write index	2024-06-27 09:19:58 -07:00
Darcy Burke	9d69789770	date: detect underflow/overflow when parsing dates with timezone offset Overriding the date of a commit to be close to "1970-01-01 00:00:00" with a large enough positive timezone for the equivelant GMT time to be before the epoch is considered valid by `parse_date_basic`. Similar behaviour occurs when using a date close to "2099-12-31 23:59:59" (the maximum date allowed by `tm_to_time_t`) with a large enough negative timezone offset. This leads to an integer underflow or underflow respectively in the commit timestamp, which is not caught by `git-commit`, but will cause other services to fail, such as `git-fsck`, which, for the first case, reports "badDateOverflow: invalid author/committer line - date causes integer overflow". Instead check the timezone offset and fail if the resulting time comes before the epoch "1970-01-01T00:00:00Z" or after the maximum date "2099-12-31T23:59:59Z". Using the REQUIRE_64BIT_TIME prerequisite, make sure that the tests near the end of Git time (aka end of year 2099) are not attempted on purely 32-bit systems, as they cannot express timestamp beyond 2038 anyway. Signed-off-by: Darcy Burke <acednes@gmail.com> [jc: fixups for 32-bit platforms] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 17:07:41 -07:00
Junio C Hamano	a59275d5d6	t0006: simplify prerequisites The system must support 64-bit time and its time_t must be 64-bit wide to pass these tests. Combine these two prerequisites together to simplify the tests. In theory, they could be fulfilled independently and tests could require only one without the other, but in practice, these must come hand-in-hand. Update the "check_parse" test helper to pay attention to the REQUIRE_64BIT_TIME variable, which can be set to the HAVE_64BIT_TIME prerequisite so that a parse test can be skipped on 32-bit systems. This will be used in the next step to skip tests for timestamps near the end of year 2099, as 32-bit systems will not be able to express a timestamp beyond 2038 anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 17:07:26 -07:00
Taylor Blau	5421e7c3a1	commit-graph: reuse existing Bloom filters where possible In an earlier commit, a bug was described where it's possible for Git to produce non-murmur3 hashes when the platform's "char" type is signed, and there are paths with characters whose highest bit is set (i.e. all characters >= 0x80). That patch allows the caller to control which version of Bloom filters are read and written. However, even on platforms with a signed "char" type, it is possible to reuse existing Bloom filters if and only if there are no changed paths in any commit's first parent tree-diff whose characters have their highest bit set. When this is the case, we can reuse the existing filter without having to compute a new one. This is done by marking trees which are known to have (or not have) any such paths. When a commit's root tree is verified to not have any such paths, we mark it as such and declare that the commit's Bloom filter is reusable. Note that this heuristic only goes in one direction. If neither a commit nor its first parent have any paths in their trees with non-ASCII characters, then we know for certain that a path with non-ASCII characters will not appear in a tree-diff against that commit's first parent. The reverse isn't necessarily true: just because the tree-diff doesn't contain any such paths does not imply that no such paths exist in either tree. So we end up recomputing some Bloom filters that we don't strictly have to (i.e. their bits are the same no matter which version of murmur3 we use). But culling these out is impossible, since we'd have to perform the full tree-diff, which is the same effort as computing the Bloom filter from scratch. But because we can cache our results in each tree's flag bits, we can often avoid recomputing many filters, thereby reducing the time it takes to run $ git commit-graph write --changed-paths --reachable when upgrading from v1 to v2 Bloom filters. To benchmark this, let's generate a commit-graph in linux.git with v1 changed-paths in generation order[^1]: $ git clone git@github.com:torvalds/linux.git $ cd linux $ git commit-graph write --reachable --changed-paths $ graph=".git/objects/info/commit-graph" $ mv $graph{,.bak} Then let's time how long it takes to go from v1 to v2 filters (with and without the upgrade path enabled), resetting the state of the commit-graph each time: $ git config commitGraph.changedPathsVersion 2 $ hyperfine -p 'cp -f $graph.bak $graph' -L v 0,1 \ 'GIT_TEST_UPGRADE_BLOOM_FILTERS={v} git.compile commit-graph write --reachable --changed-paths' On linux.git (where there aren't any non-ASCII paths), the timings indicate that this patch represents a speed-up over recomputing all Bloom filters from scratch: Benchmark 1: GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths Time (mean ± σ): 124.873 s ± 0.316 s [User: 124.081 s, System: 0.643 s] Range (min … max): 124.621 s … 125.227 s 3 runs Benchmark 2: GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths Time (mean ± σ): 79.271 s ± 0.163 s [User: 74.611 s, System: 4.521 s] Range (min … max): 79.112 s … 79.437 s 3 runs Summary 'GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths' ran 1.58 ± 0.01 times faster than 'GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths' On git.git, we do have some non-ASCII paths, giving us a more modest improvement from 4.163 seconds to 3.348 seconds, for a 1.24x speed-up. On my machine, the stats for git.git are: - 8,285 Bloom filters computed from scratch - 10 Bloom filters generated as empty - 4 Bloom filters generated as truncated due to too many changed paths - 65,114 Bloom filters were reused when transitioning from v1 to v2. [^1]: Note that this is is important, since `--stdin-packs` or `--stdin-commits` orders commits in the commit-graph by their pack position (with `--stdin-packs`) or in the raw input (with `--stdin-commits`). Since we compute Bloom filters in the same order that commits appear in the graph, we must see a commit's (first) parent before we process the commit itself. This is only guaranteed to happen when sorting commits by their generation number. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:06 -07:00
Taylor Blau	ba5a81d52b	commit-graph: new Bloom filter version that fixes murmur3 The murmur3 implementation in bloom.c has a bug when converting series of 4 bytes into network-order integers when char is signed (which is controllable by a compiler option, and the default signedness of char is platform-specific). When a string contains characters with the high bit set, this bug causes results that, although internally consistent within Git, does not accord with other implementations of murmur3 (thus, the changed path filters wouldn't be readable by other off-the-shelf implementatios of murmur3) and even with Git binaries that were compiled with different signedness of char. This bug affects both how Git writes changed path filters to disk and how Git interprets changed path filters on disk. Therefore, introduce a new version (2) of changed path filters that corrects this problem. The existing version (1) is still supported and is still the default, but users should migrate away from it as soon as possible. Because this bug only manifests with characters that have the high bit set, it may be possible that some (or all) commits in a given repo would have the same changed path filter both before and after this fix is applied. However, in order to determine whether this is the case, the changed paths would first have to be computed, at which point it is not much more expensive to just compute a new changed path filter. So this patch does not include any mechanism to "salvage" changed path filters from repositories. There is also no "mixed" mode - for each invocation of Git, reading and writing changed path filters are done with the same version number; this version number may be explicitly stated (typically if the user knows which version they need) or automatically determined from the version of the existing changed path filters in the repository. There is a change in write_commit_graph(). graph_read_bloom_data() makes it possible for chunk_bloom_data to be non-NULL but bloom_filter_settings to be NULL, which causes a segfault later on. I produced such a segfault while developing this patch, but couldn't find a way to reproduce it neither after this complete patch (or before), but in any case it seemed like a good thing to include that might help future patch authors. The value in t0095 was obtained from another murmur3 implementation using the following Go source code: package main import "fmt" import "github.com/spaolacci/murmur3" func main() { fmt.Printf("%x\n", murmur3.Sum32([]byte("Hello world!"))) fmt.Printf("%x\n", murmur3.Sum32([]byte{0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff})) } Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:06 -07:00
Taylor Blau	08b6ae38c6	t4216: test changed path filters with high bit paths Subsequent commits will teach Git another version of changed path filter that has different behavior with paths that contain at least one character with its high bit set, so test the existing behavior as a baseline. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:06 -07:00
Taylor Blau	57982b8f2a	t/helper/test-read-graph: implement `bloom-filters` mode Implement a mode of the "read-graph" test helper to dump out the hexadecimal contents of the Bloom filter(s) contained in a commit-graph. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:06 -07:00
Taylor Blau	460b15699d	t/helper/test-read-graph.c: extract `dump_graph_info()` Prepare for the 'read-graph' test helper to perform other tasks besides dumping high-level information about the commit-graph by extracting its main routine into a separate function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:05 -07:00
Taylor Blau	cf73936ddf	commit-graph: ensure Bloom filters are read with consistent settings The changed-path Bloom filter mechanism is parameterized by a couple of variables, notably the number of bits per hash (typically "m" in Bloom filter literature) and the number of hashes themselves (typically "k"). It is critically important that filters are read with the Bloom filter settings that they were written with. Failing to do so would mean that each query is liable to compute different fingerprints, meaning that the filter itself could return a false negative. This goes against a basic assumption of using Bloom filters (that they may return false positives, but never false negatives) and can lead to incorrect results. We have some existing logic to carry forward existing Bloom filter settings from one layer to the next. In `write_commit_graph()`, we have something like: if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) { struct commit_graph g = ctx->r->objects->commit_graph; / We have changed-paths already. Keep them in the next graph */ if (g && g->chunk_bloom_data) { ctx->changed_paths = 1; ctx->bloom_settings = g->bloom_filter_settings; } } , which drags forward Bloom filter settings across adjacent layers. This doesn't quite address all cases, however, since it is possible for intermediate layers to contain no Bloom filters at all. For example, suppose we have two layers in a commit-graph chain, say, {G1, G2}. If G1 contains Bloom filters, but G2 doesn't, a new G3 (whose base graph is G2) may be written with arbitrary Bloom filter settings, because we only check the immediately adjacent layer's settings for compatibility. This behavior has existed since the introduction of changed-path Bloom filters. But in practice, this is not such a big deal, since the only way up until this point to modify the Bloom filter settings at write time is with the undocumented environment variables: - GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY - GIT_TEST_BLOOM_SETTINGS_NUM_HASHES - GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS (it is still possible to tweak MAX_CHANGED_PATHS between layers, but this does not affect reads, so is allowed to differ across multiple graph layers). But in future commits, we will introduce another parameter to change the hash algorithm used to compute Bloom fingerprints itself. This will be exposed via a configuration setting, making this foot-gun easier to use. To prevent this potential issue, validate that all layers of a split commit-graph have compatible settings with the newest layer which contains Bloom filters. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Original-test-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:05 -07:00
Taylor Blau	1343c89313	revision.c: consult Bloom filters for root commits The commit-graph stores changed-path Bloom filters which represent the set of paths included in a tree-level diff between a commit's root tree and that of its parent. When a commit has no parents, the tree-diff is computed against that commit's root tree and the empty tree. In other words, every path in that commit's tree is stored in the Bloom filter (since they all appear in the diff). Consult these filters during pathspec-limited traversals in the function `rev_same_tree_as_empty()`. Doing so yields a performance improvement where we can avoid enumerating the full set of paths in a parentless commit's root tree when we know that the path(s) of interest were not listed in that commit's changed-path Bloom filter. Suggested-by: SZEDER Gábor <szeder.dev@gmail.com> Original-patch-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:05 -07:00
Taylor Blau	f88611c6d0	t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` The existing implementation of test_bloom_filters_not_used() asserts that the Bloom filter sub-system has not been initialized at all, by checking for the absence of any data from it from trace2. In the following commit, it will become possible to load Bloom filters without using them (e.g., because the `commitGraph.changedPathVersion` introduced later in this series is incompatible with the hash version with which the commit-graph's Bloom filters were written). When this is the case, it's possible to initialize the Bloom filter sub-system, while still not using any Bloom filters. When this is the case, check that the data dump from the Bloom sub-system is all zeros, indicating that no filters were used. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:52:05 -07:00
Rubén Justo	78f0a5d187	pager: die when paging to non-existing command When trying to execute a non-existent program from GIT_PAGER, we display an error. However, we also send the complete text to the terminal and return a successful exit code. This can be confusing for the user and the displayed error could easily become obscured by a lengthy text. For example, here the error message would be very far above after sending 50 MB of text: $ GIT_PAGER=non-existent t/test-terminal.perl git log \| wc -c error: cannot run non-existent: No such file or directory 50314363 Let's make the error clear by aborting the process and return an error so that the user can easily correct their mistake. This will be the result of the change: $ GIT_PAGER=non-existent t/test-terminal.perl git log \| wc -c error: cannot run non-existent: No such file or directory fatal: unable to execute pager 'non-existent' 0 The behavior change we're introducing in this commit affects two tests in t7006, which is a good sign regarding test coverage and requires us to address it. The first test is 'git skips paging non-existing command'. This test comes from `f7991f01f2` (t7006: clean up SIGPIPE handling in trace2 tests, 2021-11-21,) where a modification was made to a test that was originally introduced in `c24b7f6736` (pager: test for exit code with and without SIGPIPE, 2021-02-02). That original test was, IMHO, in the same direction we're going in this commit. At any rate, this test obviously needs to be adjusted to check the new behavior we are introducing. Do it. The second test being affected is: 'non-existent pager doesnt cause crash', introduced in `f917f57f40` (pager: fix crash when pager program doesn't exist, 2021-11-24). As its name states, it has the intention of checking that we don't introduce a regression that produces a crash when GIT_PAGER points to a nonexistent program. This test could be considered redundant nowadays, due to us already having several tests checking implicitly what a non-existent command in GIT_PAGER produces. However, let's maintain a good belt-and-suspenders strategy; adapt it to the new world. Finally, it's worth noting that we are not changing the behavior if the command specified in GIT_PAGER is a shell command. In such cases, it is: $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log :;non-existent: 1: non-existent: not found died of signal 13 at t/test-terminal.perl line 33. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-25 13:47:13 -07:00
Junio C Hamano	e5ff701d4c	Merge branch 'tb/commit-graph-use-tempfile' "git update-server-info" and "git commit-graph --write" have been updated to use the tempfile API to avoid leaving cruft after failing. * tb/commit-graph-use-tempfile: server-info.c: remove temporary info files on exit commit-graph.c: remove temporary graph layers on exit	2024-06-24 16:39:15 -07:00
Junio C Hamano	2c4aa7ad74	Merge branch 'jc/add-i-retire-usebuiltin-config' For over a year, setting add.interactive.useBuiltin configuration variable did nothing but giving a "this does not do anything" warning. Finally remove it. * jc/add-i-retire-usebuiltin-config: add-i: finally retire add.interactive.useBuiltin	2024-06-24 16:39:14 -07:00
Junio C Hamano	f0a462ecd5	Merge branch 'tb/precompose-getcwd' We forgot to normalize the result of getcwd() to NFC on macOS where all other paths are normalized, which has been corrected. This still does not address the case where core.precomposeUnicode configuration is not defined globally. * tb/precompose-getcwd: macOS: ls-files path fails if path of workdir is NFD	2024-06-24 16:39:14 -07:00
Junio C Hamano	ffa47b75cf	Merge branch 'tb/pseudo-merge-reachability-bitmap' The pseudo-merge reachability bitmap to help more efficient storage of the reachability bitmap in a repository with too many refs has been added. * tb/pseudo-merge-reachability-bitmap: (26 commits) pack-bitmap.c: ensure pseudo-merge offset reads are bounded Documentation/technical/bitmap-format.txt: add missing position table t/perf: implement performance tests for pseudo-merge bitmaps pseudo-merge: implement support for finding existing merges ewah: `bitmap_equals_ewah()` pack-bitmap: extra trace2 information pack-bitmap.c: use pseudo-merges during traversal t/test-lib-functions.sh: support `--notick` in `test_commit_bulk()` pack-bitmap: implement test helpers for pseudo-merge ewah: implement `ewah_bitmap_popcount()` pseudo-merge: implement support for reading pseudo-merge commits pack-bitmap.c: read pseudo-merge extension pseudo-merge: scaffolding for reads pack-bitmap: extract `read_bitmap()` function pack-bitmap-write.c: write pseudo-merge table pseudo-merge: implement support for selecting pseudo-merge commits config: introduce `git_config_double()` pack-bitmap: make `bitmap_writer_push_bitmapped_commit()` public pack-bitmap: implement `bitmap_writer_has_bitmapped_object_id()` pack-bitmap-write: support storing pseudo-merge commits ...	2024-06-24 16:39:13 -07:00
René Scharfe	0f4b0d4cf0	diff: allow --color-moved with --no-ext-diff We ignore the option --color-moved if an external diff program is configured, presumably because its overhead is unnecessary in that case. Respect the option if we don't actually use the external diff, though. Reported-by: lolligerhans@gmx.de Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-24 13:49:41 -07:00
Junio C Hamano	892fd8b89f	Merge branch 'jc/heads-are-branches' The "--heads" option of "ls-remote" and "show-ref" has been been deprecated; "--branches" replaces "--heads". * jc/heads-are-branches: show-ref: introduce --branches and deprecate --heads ls-remote: introduce --branches and deprecate --heads refs: call branches branches	2024-06-20 15:45:17 -07:00
Junio C Hamano	83ac567781	Merge branch 'pw/rebase-i-error-message' When the user adds to "git rebase -i" instruction to "pick" a merge commit, the error experience is not pleasant. Such an error is now caught earlier in the process that parses the todo list. * pw/rebase-i-error-message: rebase -i: improve error message when picking merge rebase -i: pass struct replay_opts to parse_insn_line()	2024-06-20 15:45:15 -07:00
Junio C Hamano	e4ecba994c	Merge branch 'ds/ahead-behind-fix' Fix for a progress bar. * ds/ahead-behind-fix: commit-graph: increment progress indicator	2024-06-20 15:45:14 -07:00
Junio C Hamano	4401639f96	Merge branch 'ps/abbrev-length-before-setup-fix' Setting core.abbrev too early before the repository set-up (typically in "git clone") caused segfault, which as been corrected. * ps/abbrev-length-before-setup-fix: object-name: don't try to abbreviate to lengths greater than hexsz parse-options-cb: stop clamping "--abbrev=" to hash length config: fix segfault when parsing "core.abbrev" without repo	2024-06-20 15:45:13 -07:00
Junio C Hamano	9071453ef6	Merge branch 'rj/format-patch-auto-cover-with-interdiff' "git format-patch --interdiff" for multi-patch series learned to turn on cover letters automatically (unless told never to enable cover letter with "--no-cover-letter" and such). * rj/format-patch-auto-cover-with-interdiff: format-patch: assume --cover-letter for diff in multi-patch series t4014: cleanups in a few tests	2024-06-20 15:45:12 -07:00
Junio C Hamano	5f14d20984	Merge branch 'kn/update-ref-symref' "git update-ref --stdin" learned to handle transactional updates of symbolic-refs. * kn/update-ref-symref: update-ref: add support for 'symref-update' command reftable: pick either 'oid' or 'target' for new updates update-ref: add support for 'symref-create' command update-ref: add support for 'symref-delete' command update-ref: add support for 'symref-verify' command refs: specify error for regular refs with `old_target` refs: create and use `ref_update_expects_existing_old_ref()`	2024-06-20 15:45:12 -07:00
Junio C Hamano	c1322ca474	Merge branch 'gt/unit-test-oidtree' "oidtree" tests were rewritten to use the unit test framework. * gt/unit-test-oidtree: t/: migrate helper/test-oidtree.c to unit-tests/t-oidtree.c	2024-06-20 15:45:10 -07:00
Junio C Hamano	393879d473	Merge branch 'tb/multi-pack-reuse-fix' Assorted fixes to multi-pack-index code paths. * tb/multi-pack-reuse-fix: pack-revindex.c: guard against out-of-bounds pack lookups pack-bitmap.c: avoid uninitialized `pack_int_id` during reuse midx-write.c: do not read existing MIDX with `packs_to_include`	2024-06-20 15:45:10 -07:00
Junio C Hamano	8ba7dbdefb	Merge branch 'rs/diff-exit-code-with-external-diff' "git diff --exit-code --ext-diff" learned to take the exit status of the external diff driver into account when deciding the exit status of the overall "git diff" invocation when configured to do so. * rs/diff-exit-code-with-external-diff: diff: let external diffs report that changes are uninteresting userdiff: add and use struct external_diff t4020: test exit code with external diffs	2024-06-20 15:45:08 -07:00
Jeff King	40d817875d	t5500: fix mistaken $SERVER reference in helper function The end of t5500 contains two tests which use a single helper function, fetch_filter_blob_limit_zero(). It takes a parameter to point to the path of the server repository, which we store locally as $SERVER. The first caller uses the relative path "server", while the second points into the httpd document root. Commit `07ef3c6604` (fetch test: use more robust test for filtered objects, 2019-12-23) refactored some lines, but accidentally switched "$SERVER" to "server" in one spot. That means the second caller is looking at the server directory from the previous test rather than its own. This happens to work out because the "server" directory from the first test is still hanging around, and the contents of the two are identical. But it was clearly not the intended behavior, and is fragile to cleaning up the leftovers from the first test. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 11:06:45 -07:00
Jeff King	96a6621d25	fetch-pack: fix segfault when fscking without --lock-pack The fetch-pack internals have multiple options related to creating ".keep" lock-files for the received pack: - if args.lock_pack is set, then we tell index-pack to create a .keep file. In the fetch-pack plumbing command, this is triggered by passing "-k" twice. - if the caller passes in a pack_lockfiles string list, then we use it to record the path of the keep-file created by index-pack. We get that name by reading the stdout of index-pack. In the fetch-pack command, this is triggered by passing the (undocumented) --lock-pack option; without it, we pass in a NULL string list. So it's possible to ask index-pack to create the lock-file (using "-k -k") but not ask to record it (by avoiding "--lock-pack"). This worked fine until `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22), but now it causes a segfault. Before that commit, if pack_lockfiles was NULL, we wouldn't bother reading the output from index-pack at all. But since that commit, index-pack may produce extra output if we asked it to fsck. So even if nobody cares about the lockfile path, we still need to read it to skip to the output we do care about. We correctly check that we didn't get a NULL lockfile path (which can happen if we did not ask it to create a .keep file at all), but we missed the case where the lockfile path is not NULL (due to "-k -k") but the pack_lockfiles string_list is NULL (because nobody passed "--lock-pack"), and segfault trying to add to the NULL string-list. We can fix this by skipping the append to the string list when either the value or the list is NULL. In that case we must also free the lockfile path to avoid leaking it when it's non-NULL. Nobody noticed the bug for so long because the transport code used by "git fetch" always passes in a pack_lockfiles pointer, and remote-curl (the main user of the fetch-pack plumbing command) always passes --lock-pack. Reported-by: Kirill Smelkov <kirr@nexedi.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 10:58:00 -07:00
Xing Xin	63d903ff52	unbundle: extend object verification for fetches The existing fetch.fsckObjects and transfer.fsckObjects configurations were not fully applied to bundle-involved fetches, including direct bundle fetches and bundle-uri enabled fetches. Furthermore, there was no object verification support for unbundle. This commit extends object verification support in `bundle.c:unbundle` by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When this option is enabled, we append the `--fsck-objects` flag to `git-index-pack`. The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches, where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether to enable this option for `bundle.c:unbundle`, specifically in: - `transport.c:fetch_refs_from_bundle` for direct bundle fetches. - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches. This addition ensures a consistent logic for object verification during fetches. Tests have been added to confirm functionality in the scenarios mentioned above. Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 10:30:08 -07:00
Xing Xin	3079026fc1	bundle-uri: verify oid before writing refs When using the bundle-uri mechanism with a bundle list containing multiple interrelated bundles, we encountered a bug where tips from downloaded bundles were not discovered, thus resulting in rather slow clones. This was particularly problematic when employing the "creationTokens" heuristic. To reproduce this issue, consider a repository with a single branch "main" pointing to commit "A". Firstly, create a base bundle with: git bundle create base.bundle main Then, add a new commit "B" on top of "A", and create an incremental bundle for "main": git bundle create incr.bundle A..main Now, generate a bundle list with the following content: [bundle] version = 1 mode = all heuristic = creationToken [bundle "base"] uri = base.bundle creationToken = 1 [bundle "incr"] uri = incr.bundle creationToken = 2 A fresh clone with the bundle list above should result in a reference "refs/bundles/main" pointing to "B" in the new repository. However, git would still download everything from the server, as if it had fetched nothing locally. So why the "refs/bundles/main" is not discovered? After some digging I found that: 1. Bundles in bundle list are downloaded to local files via `bundle-uri.c:download_bundle_list` or via `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which is called by `bundle-uri.c:unbundle_all_bundles` or called within `bundle-uri.c:fetch_bundles_by_token` for the "creationToken" heuristic. 3. To get all prerequisites of the bundle, the bundle header is read inside `bundle-uri.c:unbundle_from_file` to by calling `bundle.c:read_bundle_header`. 4. Then it calls `bundle.c:unbundle`, which calls `bundle.c:verify_bundle` to ensure the repository contains all the prerequisites. 5. `bundle.c:verify_bundle` calls `parse_object`, which eventually invokes `packfile.c:prepare_packed_git` or `packfile.c:reprepare_packed_git`, filling `raw_object_store->packed_git` and setting `packed_git_initialized`. 6. If `bundle.c:unbundle` succeeds, it writes refs via `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here bundle refs which can target arbitrary objects are written to the repository. 7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions `fetch-pack.c:mark_complete_and_common_ref` and `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to find local tips for negotiation. The `OBJECT_INFO_QUICK` flag prevents `packfile.c:reprepare_packed_git` from being called, resulting in failures to parse OIDs that reside only in the latest bundle. In the example above, when unbunding "incr.bundle", "base.pack" is added to `packed_git` due to prerequisites verification. However, "B" cannot be found for negotiation because it exists in "incr.pack", which is not included in `packed_git`. Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing bundle refs. When `refs.c:refs_update_ref` is called to write the corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`. This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of the refs storage backend. For files backend, it is `files-backend.c:files_transaction_prepare`, and for reftable backend, it is `reftable-backend.c:reftable_be_transaction_prepare`. Both functions eventually call `object.c:parse_object`, which can invoke `packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures that bundle refs point to valid objects and that all tips from bundle refs are correctly parsed during subsequent negotiations. A set of negotiation-related tests for cloning with bundle-uri has been included to demonstrate that downloaded bundles are utilized to accelerate fetching. Additionally, another test has been added to show that bundles with incorrect headers, where refs point to non-existent objects, do not result in any bundle refs being created in the repository. Reviewed-by: Karthik Nayak <karthik.188@gmail.com> Reviewed-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Xing Xin <xingxin.xx@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 10:30:07 -07:00
Eric Wong	75daa42ddf	t1006: ensure cat-file info isn't buffered by default While working on buffering changes to `git cat-file' in a separate patch, I inadvertently made the output of --batch-check and the `info' command of --batch-command buffered as if opt->buffer_output is turned on by default. Buffering by default breaks some 3rd-party Perl scripts using cat-file, but this breakage was not detected anywhere in our test suite. Add a small Perl snippet to test this problem since (AFAIK) other equivalent ways to test this behavior from Bourne shell and/or awk would require racy sleeps, non-portable FIFOs or tedious C code. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-20 10:28:46 -07:00
Kyle Zhao	2e5a636593	merge: avoid write merge state when unable to write index Writing the merge state after the index write fails is meaningless and could potentially cause Git to lose changes. Signed-off-by: Kyle Zhao <kylezhao@tencent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-18 08:13:35 -07:00
Junio C Hamano	4216329457	Merge branch 'ps/no-writable-strings' Building with "-Werror -Wwrite-strings" is now supported. * ps/no-writable-strings: (27 commits) config.mak.dev: enable `-Wwrite-strings` warning builtin/merge: always store allocated strings in `pull_twohead` builtin/rebase: always store allocated string in `options.strategy` builtin/rebase: do not assign default backend to non-constant field imap-send: fix leaking memory in `imap_server_conf` imap-send: drop global `imap_server_conf` variable mailmap: always store allocated strings in mailmap blob revision: always store allocated strings in output encoding remote-curl: avoid assigning string constant to non-const variable send-pack: always allocate receive status parse-options: cast long name for OPTION_ALIAS http: do not assign string constant to non-const field compat/win32: fix const-correctness with string constants pretty: add casts for decoration option pointers object-file: make `buf` parameter of `index_mem()` a constant object-file: mark cached object buffers as const ident: add casts for fallback name and GECOS entry: refactor how we remove items for delayed checkouts line-log: always allocate the output prefix line-log: stop assigning string constant to file parent buffer ...	2024-06-17 15:55:58 -07:00
Junio C Hamano	42b8b5bfd0	Merge branch 'jk/am-retry' "git am" has a safety feature to prevent it from starting a new session when there already is a session going. It reliably triggers when a mbox is given on the command line, but it has to rely on the tty-ness of the standard input. Add an explicit way to opt out of this safety with a command line option. * jk/am-retry: test-terminal: drop stdin handling am: add explicit "--retry" option	2024-06-17 15:55:56 -07:00
Junio C Hamano	40a163f217	Merge branch 'ps/ref-storage-migration' A new command has been added to migrate a repository that uses the files backend for its ref storage to use the reftable backend, with limitations. * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version	2024-06-17 15:55:55 -07:00
Junio C Hamano	4d8ae4d3ca	Merge branch 'jc/format-patch-with-range-diff' The inter/range-diff output has been moved to the end of the patch when format-patch adds it to a single patch, instead of writing it before the patch text, to be consistent with what is done for a cover letter for a multi-patch series. * jc/format-patch-with-range-diff: format-patch: move range/inter diff at the end of a single patch output show_log: factor out interdiff/range-diff generation	2024-06-17 15:55:52 -07:00
Patrick Steinhardt	912d4756cd	t/helper: remove dependency on `the_repository` in "proc-receive" The "proc-receive" test helper implicitly relies on `the_repository` via `parse_oid_hex()`. This isn't necessary though, and in fact the whole command does not depend on `the_repository` at all. Stop setting up `the_repository` and use `parse_oid_hex_any()` to parse object IDs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:35 -07:00
Patrick Steinhardt	8e9a1d0dc2	t/helper: fix segfault in "oid-array" command without repository The "oid-array" test helper can supposedly work without a Git repository, but will in fact crash because `the_repository->hash_algo` is not initialized. This is because `oid_pos()`, which is used by `oid_array_lookup()`, depends on `the_hash_algo->rawsz`. Ideally, we'd adapt `oid_pos()` to not depend on `the_hash_algo` anymore. That is a bigger untertaking though, so instead we fall back to SHA1 when there is no repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Patrick Steinhardt	fa9e009aa7	t/helper: use correct object hash in partial-clone helper The `object_info()` function of the partial-clone helper is responsible for checking the object ID of a repository other than `the_repository`. We use `parse_oid_hex()` in this function though, which means that we still depend on `the_repository->hash_algo`. Fix this by using the object hash of the function-local repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00

1 2 3 4 5 ...

22387 Commits