Understand --checkout, --merge and --rebase synonyms for
--update={checkout,merge,rebase}, as well as the short options that
'git submodule' itself understands.
This removes a difference between the CLI API of "git submodule" and
"git submodule--helper", making it easier to make the latter an alias
for the former. See 48308681b0 (git submodule update: have a
dedicated helper for cloning, 2016-02-29) for the initial addition of
--update.
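For instance, after this change the following invocations should be
interchangeable (a sketch; the submodule path "sub" is just an
example):
    # the long-form option the helper already understood
    git submodule--helper update --update=rebase sub
    # new synonym, matching what "git submodule update" itself accepts
    git submodule--helper update --rebase sub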
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the user-facing "git submodule--helper" commands so that
they'll report their name as being "git submodule". To a user these
commands are internal implementation details, and it doesn't make
sense to emit usage about an internal helper when "git submodule" is
invoked with invalid options.
Before this we'd emit e.g.:
$ git submodule absorbgitdirs --blah
error: unknown option `blah'
usage: git submodule--helper absorbgitdirs [<options>] [<path>...]
[...]
And:
$ git submodule set-url -- --
usage: git submodule--helper set-url [--quiet] <path> <newurl>
[...]
Now we'll start with "usage: git submodule [...]" in both of those
cases. This change does not alter the "list", "name", "clone",
"config" and "create-branch" commands; those are internal-only (as an
aside, their usage info should probably invoke BUG(...)). This only
changes the user-facing commands.
The "status", "deinit" and "update" commands are not included in this
change, because their usage information already used "submodule"
rather than "submodule--helper".
I don't think it's currently possible to emit some of this usage
information in practice, as git-submodule.sh will catch unknown
options, and e.g. it doesn't seem to be possible to get "add" to emit
its usage information from "submodule--helper".
Though that change may be superfluous now, it's also harmless, and
will allow us to eventually dispatch further into "git
submodule--helper" from git-submodule.sh, while emitting the correct
usage output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "absorb-git-dirs" subcommand to "absorbgitdirs", which is
what the "git submodule" command itself has called it since the
subcommand was implemented in f6f8586140 (submodule: add
absorb-git-dir function, 2016-12-12).
Having these two be different will make it more tedious to eventually
dispatch to "git submodule--helper" directly, as we'd need to
retain this name mapping. So let's get rid of this needless
inconsistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust code added in 0060fd1511 (clone --recurse-submodules: prevent
name squatting on Windows, 2019-09-12) to have the internal
--require-init option imply --init, rather than having
"git-submodule.sh" add it implicitly.
This change doesn't make any difference now, but eliminates another
special-case where "git submodule--helper update"'s behavior was
different from "git submodule update". This will make it easier to
eventually replace the cmd_update() function in git-submodule.sh.
We'll still need to keep the distinction between "--init" and
"--require-init" in git-submodule.sh. Once cmd_update() gets
re-implemented in C we'll be able to change variables and other code
related to that, but not yet.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Folks may want to merge histories that have no common ancestry; provide
a flag with the same name as used by `git merge` to allow this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much as `git ls-files` has a -z option, let's add one to merge-tree so
that the conflict-info section can be NUL terminated (and avoid quoting
of unusual filenames).
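For example, a caller that wants to post-process the conflict-info
section without worrying about quoting might do something like this
(a sketch; `tr` is only used here to make the NUL-terminated output
readable):
    git merge-tree --write-tree -z side1 side2 | tr '\0' '\n'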
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the new `detailed` parameter, a new mode can be triggered when
displaying the merge messages: The `detailed` mode prints NUL-delimited
fields of the following form:
<path-count> NUL <path>... NUL <conflict-type> NUL <message>
The `<path-count>` field determines how many `<path>` fields there are.
The intention of this mode is to support server-side operations, where
worktree-less merges can lead to conflicts and depending on the type
and/or path count, the caller might know how to handle said conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like `git merge` updates the index with information of the form
(mode, oid, stage, name)
provide this output for conflicted files for merge-tree as well.
Provide a --name-only option for users to exclude the mode, oid, and
stage and only get the list of conflicted filenames.
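A rough usage sketch of the two output flavors:
    # full conflict information: (mode, oid, stage, name) per entry
    git merge-tree --write-tree side1 side2
    # only the names of the conflicted files
    git merge-tree --write-tree --name-only side1 side2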
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of `git merge-tree --write-tree` will often want to know which
files had conflicts. While they could potentially attempt to parse the
CONFLICT notices printed, those messages are not meant to be machine
readable. Provide a simpler mechanism of just printing the files (in
the same format as `git ls-files` with quoting, but restricted to
unmerged files) in the output before the free-form messages.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `git merge-tree --write-tree`, we previously would only
return an exit status reflecting the cleanness of a merge, and print out
the toplevel tree of the resulting merge. Merges also have
informational messages, such as:
* "Auto-merging <PATH>"
* "CONFLICT (content): ..."
* "CONFLICT (file/directory)"
* etc.
In fact, when non-content conflicts occur (such as file/directory,
modify/delete, add/add with differing modes, rename/rename (1to2),
etc.), these informational messages may be the only notification the
user gets since these conflicts are not representable in the contents
of the file.
Add a --[no-]messages option so that callers can request these messages
be included at the end of the output. Include such messages by default
when there are conflicts, and omit them by default when the merge is
clean.
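A usage sketch of overriding those defaults:
    # always include the informational messages, even on a clean merge
    git merge-tree --write-tree --messages side1 side2
    # never include them, even when there are conflicts
    git merge-tree --write-tree --no-messages side1 side2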
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the ability to perform real merges rather than just trivial
merges (meaning handling three way content merges, recursive ancestor
consolidation, renames, proper directory/file conflict handling, and so
forth). However, unlike `git merge`, the working tree and index are
left alone and no branch is updated.
The only output is:
- the toplevel resulting tree printed on stdout
- exit status of 0 (clean), 1 (conflicts present), anything else
(merge could not be performed; unknown if clean or conflicted)
This output is meant to be used by some higher level script, perhaps in
a sequence of steps like this:
NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2)
test $? -eq 0 || die "There were conflicts..."
NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
git update-ref $BRANCH1 $NEWCOMMIT
Note that higher level scripts may also want to access the
conflict/warning messages normally output during a merge, or have quick
access to a list of files with conflicts. That is not available in this
preliminary implementation, but subsequent commits will add that
ability (meaning that NEWTREE would be a lot more than a tree in the
case of conflicts).
This also marks the traditional trivial merge mode of merge-tree as
deprecated. The trivial merge not only has limited applicability, its
output format is also difficult to work with (and undocumented), and
it will generally be less performant than real merges.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let merge-tree accept a `--write-tree` parameter for choosing real
merges instead of trivial merges, and accept an optional
`--trivial-merge` option to get the traditional behavior. Note that
these accept different numbers of arguments, though, so these names
need not actually be used.
Note that real merges differ from trivial merges in that they handle:
- three way content merges
- recursive ancestor consolidation
- renames
- proper directory/file conflict handling
- etc.
Basically all the stuff you'd expect from `git merge`, just without
updating the index and working tree. The initial shell added here does
nothing more than die with "real merges are not yet implemented", but
that will be fixed in subsequent commits.
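A sketch of the two modes; note the differing argument counts (the
traditional mode takes a base tree plus the two branches), which is
why the option names need not be spelled out:
    # real merge: two branches, merge base(s) computed internally
    git merge-tree --write-tree branch1 branch2
    # traditional trivial merge: explicit base plus the two branches
    git merge-tree --trivial-merge base-tree branch1 branch2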
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding a non-trivial merge capability to merge-tree,
move the existing merge logic for trivial merges into a new function.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive.h defined its own merge_trees() function, different than
the one found in builtin/merge-tree.c. That was okay in the past, but
we want merge-tree to be able to use the merge-ort functions, which will
end up including merge-recursive.h. Rename the function found in
builtin/merge-tree.c to avoid the conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds a command line option analogous to that of GNU
grep(1)'s -m / --max-count, which users might already be used to.
This makes it possible to limit the amount of matches shown in the
output while keeping the functionality of other options such as -C
(show code context) or -p (show containing function), which would be
difficult to do with a shell pipeline (e.g. head(1)).
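For example (a sketch), limiting output to the first match per file
while keeping the enclosing function context, which a plain
`| head -n1` pipeline could not do per file:
    git grep -m 1 -p 'some_pattern'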
Signed-off-by: Carlos López <00xc@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
called the sha1_object_info() API to obtain the type information, but
allowed the call to silently fail when the object was missing locally,
so that we can sanity-check the types opportunistically when the
object did exist.
The implementation was understandable because back then there was no
lazy/on-demand downloading of individual objects from promisor
remotes, which causes a long delay and materializes the object, hence
defeating the point of using "--missing". The design is hurting us
now.
We could bypass the opportunistic type/mode consistency check
altogether when "--missing" is given, but instead, use the
oid_object_info_extended() API and tell it that we are only interested
in objects that locally exist and are immediately available by passing
OBJECT_INFO_SKIP_FETCH_OBJECT bit to it. That way, we will still
retain the cheap and opportunistic sanity check for local objects.
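A usage sketch in a partial clone, where some of the referenced blobs
may not be present locally (the tree-ish is a placeholder):
    # rebuild a tree object without triggering on-demand fetches
    git ls-tree <tree-ish> | git mktree --missing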
Signed-off-by: Richard Oliver <roliver@roku.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After b489b9d9aa (branch: use branch_checked_out() when deleting refs,
2022-06-14), we no longer look at our local "worktrees" variable, since
branch_checked_out() handles it under the hood. The compiler didn't
notice the unused variable because we call functions to initialize and
free it (so it's not totally unused, it just doesn't do anything
useful).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 12d47e3b1f (fetch: use new branch_checked_out() and add tests,
2022-06-14), fetch's update_local_ref() function stopped using its
"worktrees" parameter. It doesn't need it, since the
branch_checked_out() function examines the global worktrees under the
hood.
So we can not only drop the unused parameter from that function, but
also from its entire call chain. And as we do so all the way up to
do_fetch(), we can see that nobody uses it at all, and we can drop the
local variable there entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is an option rather than a command. Make the message convey
this, similar to the other messages in the file.
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some config variables are combinations of multiple words, and we
typically write them in camelCase forms in manpage and translatable
strings. It's not easy to find mismatches for these camelCase config
variables during code reviews, but occasionally they are identified
during localization translations.
To check for mismatched config variables, I introduced a new feature
in the helper program for localization[^1]. The following mismatched
config variables were identified by running that helper program,
e.g. "git-po-helper check-pot".
Lowercase in manpage should use camelCase:
* Documentation/config/http.txt: http.pinnedpubkey
Lowercase in translatable strings should use camelCase:
* builtin/fast-import.c: pack.indexversion
* builtin/gc.c: gc.logexpiry
* builtin/index-pack.c: pack.indexversion
* builtin/pack-objects.c: pack.indexversion
* builtin/repack.c: pack.writebitmaps
* commit.c: i18n.commitencoding
* gpg-interface.c: user.signingkey
* http.c: http.postbuffer
* submodule-config.c: submodule.fetchjobs
Mismatched camelCases, choose the former:
* Documentation/config/transfer.txt: transfer.credentialsInUrl
remote.c: transfer.credentialsInURL
[^1]: https://github.com/git-l10n/git-po-helper
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, the git remote show command will query remotes to show
data about what might be done on a future git fetch. This process
currently does not handle negative refspecs. This can be confusing,
because the show command will list refs as if they would be fetched.
For example, if the fetch refspecs include "^refs/heads/pr/*", it
still displays the following:
* remote jdk19
Fetch URL: git@github.com:openjdk/jdk19.git
Push URL: git@github.com:openjdk/jdk19.git
HEAD branch: master
Remote branches:
master tracked
pr/1 new (next fetch will store in remotes/jdk19)
pr/2 new (next fetch will store in remotes/jdk19)
pr/3 new (next fetch will store in remotes/jdk19)
Local ref configured for 'git push':
master pushes to master (fast-forwardable)
Fix this by adding an additional check inside of get_ref_states. If a
ref matches one of the negative refspecs, mark it as skipped instead of
marking it as new or tracked.
With this change, we now report remote branches that are skipped due to
negative refspecs properly:
* remote jdk19
Fetch URL: git@github.com:openjdk/jdk19.git
Push URL: git@github.com:openjdk/jdk19.git
HEAD branch: master
Remote branches:
master tracked
pr/1 skipped
pr/2 skipped
pr/3 skipped
Local ref configured for 'git push':
master pushes to master (fast-forwardable)
By showing the refs as skipped, it helps clarify that these references
won't actually be fetched.
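For reference, a negative refspec like the one above can be set up as
follows (a sketch, reusing the "jdk19" remote from the example):
    # exclude PR refs from future fetches, then inspect the remote
    git config --add remote.jdk19.fetch '^refs/heads/pr/*'
    git remote show jdk19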
This does not properly handle refs going stale due to a newly added
negative refspec. In addition, git remote prune doesn't handle that
negative refspec case either. Fixing that requires digging into
get_stale_heads and handling the case of a ref which exists on the
remote but is omitted due to a negative refspec locally.
Add a new test case which covers the functionality above, as well as a
new expected failure indicating the poor overlap with stale refs.
Reported-by: Pavel Rappo <pavel.rappo@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24), we added code that first obtains the default remote, and
then adds that to a `strvec`.
However, we never released the default remote's memory.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various error messages that talk about the removal of
"--preserve-merges" in "rebase" have been strengthened, and "rebase
--abort" learned to get out of a state that was left by an earlier
use of the option.
* po/rebase-preserve-merges:
rebase: translate a die(preserve-merges) message
rebase: note `preserve` merges may be a pull config option
rebase: help users when dying with `preserve-merges`
rebase.c: state preserve-merges has been removed
"git revert" learns "--reference" option to use more human-readable
reference to the commit it reverts in the message template it
prepares for the user.
* jc/revert-show-parent-info:
revert: --reference should apply only to 'revert', not 'cherry-pick'
revert: optionally refer to commit in the "reference" format
This is the last current use of find_shared_symref() that can easily be
replaced by branch_checked_out(). The benefit of this switch is that the
code is a bit simpler, but also it is faster on repeated calls.
The remaining uses of find_shared_symref() are non-trivial to remove, so
we probably should not continue in that direction:
* builtin/notes.c uses find_shared_symref() with "NOTES_MERGE_REF"
instead of "HEAD", so it doesn't have an immediate analogue with
branch_checked_out(). Perhaps we should consider extending it to
include that symref in addition to HEAD, BISECT_HEAD, and
REBASE_HEAD.
* receive-pack.c checks to see if a worktree has a checkout for the ref
that is being updated. The tricky part is that it can actually decide
to update the worktree directly instead of just skipping the update.
This all depends on the receive.denyCurrentBranch config option. The
implementation currently cares about receiving the worktree in the
result, so the current branch_checked_out() prototype is
insufficient. This is something to investigate later, though, since a
large number of refs could be updated at the same time and using the
strmap implementation of branch_checked_out() could be beneficial.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching refs from a remote, it is possible that the refspec will
cause us to overwrite a ref that is checked out in a worktree. The
existing logic in builtin/fetch.c uses a possibly-slow mechanism. Update
those sections to use the new, more efficient branch_checked_out()
helper.
These uses were not previously tested, so add a test case that can be
used for these kinds of collisions. There is only one test now, but more
tests will be added as other consumers of branch_checked_out() are
added.
Note that there are two uses in builtin/fetch.c, but only one of the
messages is tested. This is because the tested check is run before
completing the fetch, and the untested check is not reachable without
concurrent updates to the filesystem. Thus, it is beneficial to keep
that extra check for the sake of defense-in-depth. However, we should
not attempt to test the check, as doing so would be too complicated
to be worth the effort. This use in update_local_ref()
also requires a change in the error message because we no longer have
access to the worktree struct, only the path of the worktree. This error
is so rare that making a distinction between the two is not critical.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git show-ref --heads" (and "--tags") still iterated over all the
refs only to discard refs outside the specified area, which has
been corrected.
* tb/show-ref-optim:
builtin/show-ref.c: avoid over-iterating with --heads, --tags
Make use of the stream_loose_object() function introduced in the
preceding commit to unpack large objects. Before this we'd need to
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.
We could use the new streaming interface to unpack all blobs, but
doing so would be much slower, as demonstrated e.g. with this
benchmark using git-hyperfine[0]:
rm -rf /tmp/scalar.git &&
git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
git hyperfine \
-r 2 --warmup 1 \
-L rev origin/master,HEAD -L v "10,512,1k,1m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
Here we'll perform worse with lower core.bigFileThreshold settings
with this change in terms of speed, but we're getting lower memory use
in return:
Summary
'./git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
A better benchmark to demonstrate the benefits of this change is the
following, which creates an artificial repo with 1, 25, 50, 75 and
100MB blobs:
rm -rf /tmp/repo &&
git init /tmp/repo &&
(
cd /tmp/repo &&
for i in 1 25 50 75 100
do
dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
done &&
git add blob.* &&
git commit -mblobs &&
git gc &&
PACK=$(echo .git/objects/pack/pack-*.pack) &&
cp "$PACK" my.pack
) &&
git hyperfine \
--show-output \
-L rev origin/master,HEAD -L v "512,50m,100m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
Using this test we'll always use >100MB of memory on
origin/master (around ~105MB), but max out at e.g. ~55MB if we set
core.bigFileThreshold=50m.
The relevant "Maximum resident set size" lines were manually added
below the relevant benchmark:
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
Maximum resident set size (kbytes): 107080
1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 106968
1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 107032
1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 107072
1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 55704
2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 4564
This shows that if you have enough memory this new streaming method is
slower the lower you set the streaming threshold, but the benefit is
more bounded memory use.
An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
"core.bigFileThreshold" variable[1]. As noted in a detailed overview
of its users in [2] using it has several different meanings.
Still, we consider it good enough to simply re-use it. While it's
possible that someone might want to e.g. consider objects "small" for
the purposes of diffing but "big" for the purposes of writing them
such use-cases are probably too obscure to worry about. We can always
split up "core.bigFileThreshold" in the future if there's a need for
that.
0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the name implies, "get_data(size)" will allocate and return a given
amount of memory. Allocating memory for a large blob object may cause the
system to run out of memory. In preparation for replacing the calls
to "get_data()" that unpack large blob objects in later commits,
refactor "get_data()" to reduce its memory footprint in dry_run mode.
Because in dry_run mode, "get_data()" is only used to check the
integrity of data, and the returned buffer is not used at all, we can
allocate a smaller buffer and use it as zstream output. Make the function
return NULL in the dry-run mode, as no callers use the returned buffer.
The "find [...]objects/?? -type f | wc -l" test idiom being used here
is adapted from the same "find" use added to another test in
d9545c7f46 (fast-import: implement unpack limit, 2016-04-25).
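For context, the idiom counts the loose objects written to the
destination repository; a minimal sketch, assuming a hypothetical
destination repository "dest.git":
    # count loose objects written (expected to stay 0 under a dry run)
    find dest.git/objects/?? -type f | wc -l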
Suggested-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log the "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
More fsmonitor--daemon.
* jh/builtin-fsmonitor-part3: (30 commits)
t7527: improve implicit shutdown testing in fsmonitor--daemon
fsmonitor--daemon: allow --super-prefix argument
t7527: test Unicode NFC/NFD handling on MacOS
t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
t/helper/hexdump: add helper to print hexdump of stdin
fsmonitor: on macOS also emit NFC spelling for NFD pathname
t7527: test FSMonitor on case insensitive+preserving file system
fsmonitor: never set CE_FSMONITOR_VALID on submodules
t/perf/p7527: add perf test for builtin FSMonitor
t7527: FSMonitor tests for directory moves
fsmonitor: optimize processing of directory events
fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
fsm-health-win32: force shutdown daemon if worktree root moves
fsm-health-win32: add polling framework to monitor daemon health
fsmonitor--daemon: stub in health thread
fsmonitor--daemon: rename listener thread related variables
fsmonitor--daemon: prepare for adding health thread
fsmonitor--daemon: cd out of worktree root
fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
unpack-trees: initialize fsmonitor_has_run_once in o->result
...
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
The resolve-undo extension was added to the index in cfc5789a
(resolve-undo: record resolved conflicts in a new index extension
section, 2009-12-25). This extension records the blob object names
and their modes of conflicted paths when the path gets resolved
(e.g. with "git add"), to allow "undoing" the resolution with
"checkout -m path". These blob objects should be guarded from
garbage-collection while we have the resolve-undo information in the
index (otherwise the unresolve operation may try to use a blob object
that has already been pruned away).
But the code called from mark_reachable_objects() for the index
forgets to do so. Teach the add_index_objects_to_pending() helper to
also add objects referred to by the resolve-undo extension.
Also make matching changes to "fsck", which has code that is fairly
similar to the reachability stuff but keeps parallel implementations
of all of it; these may (or may not) someday want to be unified.
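For context, a sketch of the resolve/unresolve cycle whose blobs the
extension records:
    # resolving a conflicted path saves its stage #1/#2/#3 blobs in
    # the resolve-undo extension
    git add path
    # ... which can later be used to recreate the conflicted state
    git checkout -m path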
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
source: <xmqqlevl4ysk.fsf@gitster.g>
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
The path given to the "git multi-pack-index" command by the end user
was compared with the path internally prepared by the tool without
first normalizing either, which led to duplicated paths not being
noticed, which has been corrected.
source: <pull.1221.v2.git.1650911234.gitgitgadget@gmail.com>
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
This is a user-facing message for a situation seen in the wild.
Translate it.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--preserve-merges` option was removed in v2.34.0. However,
users may not be aware that it is also a pull configuration option,
which is still offered by major IDE vendors such as Visual Studio.
Extend the `--preserve-merges` die message to also direct users to
the possible use of the `preserve` option in the `pull.rebase` config.
This is an additional 'belt and braces' information statement.
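A sketch of what an affected user might check and change, assuming
they want the supported `merges` equivalent:
    # see whether "preserve" is still configured anywhere
    git config --show-origin --get-all pull.rebase
    # switch to the supported replacement
    git config --global pull.rebase merges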
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git would die if a "rebase --preserve-merges" was in progress.
Users could neither --quit, --abort, nor --continue the rebase.
Make the `rebase --abort` option available to allow users to remove
traces of any preserve-merges rebase, even if they had upgraded
during a rebase.
One trigger case was an unexpectedly difficult to resolve conflict, as
reported on the `git-users` group.
(https://groups.google.com/g/git-for-windows/c/3jMWbBlXXHM)
Other potential use-cases include git-experts using the portable
'Git on a stick' to help users with an older git version.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option, hidden since feebd2d256 (rebase: hide --preserve-merges
option, 2019-10-18), has now been removed, as stated in the
subsequent release notes. Fix and reflow the option tip.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `show-ref` is combined with the `--heads` or `--tags` options, it
can avoid iterating parts of a repository's references that it doesn't
care about.
But it doesn't take advantage of this potential optimization. When this
command was introduced back in 358ddb62cf (Add "git show-ref" builtin
command, 2006-09-15), `for_each_ref_in()` did exist. But since most
repositories don't have many (any?) references that aren't branches or
tags already, this makes little difference in practice.
Though for repositories with a large imbalance of branches and tags (or,
more likely in the case of server operators, many hidden references),
this can make quite a difference. Take, for example, a repository with
500,000 "hidden" references (all of the form "refs/__hidden__/N"), and
a single branch:
git commit --allow-empty -m "base" &&
seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' |
git update-ref --stdin &&
git pack-refs --all
Outputting the existence of that single branch currently takes on the
order of ~50ms on my machine. The vast majority of this time is wasted
iterating through references that we know we're going to discard.
Instead, teach `show-ref` that it can iterate just "refs/heads" and/or
"refs/tags" when given `--heads` and/or `--tags`, respectively. A few
small interesting things to note:
- When given either option, we can avoid the general-purpose
for_each_ref() call altogether, since we know that it won't give us
any references that we wouldn't filter out already.
- We can make two separate calls to `for_each_fullref_in()` (and
avoid, say, the more specialized `for_each_fullref_in_prefixes()`),
since we know that the set of references enumerated by each is
disjoint, so we'll never see the same reference appear in both
calls.
- We have to use the "fullref" variant (instead of just
`for_each_branch_ref()` and `for_each_tag_ref()`), since we expect
fully-qualified reference names to appear in `show-ref`'s output.
When either of `heads_only` or `tags_only` is set, we can eliminate the
strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we
know that `show_ref()` will never see a non-branch or tag reference.
Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance
`show-ref`'s pattern matching, since `show-ref` patterns match on the
_suffix_ (e.g., the pattern "foo" shows "refs/heads/foo",
"refs/tags/foo", etc., not "foo/*").
Nonetheless, in our synthetic example above, this provides a significant
speed-up ("git" is roughly v2.36, "git.compile" is this patch):
$ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads'
Benchmark 1: git show-ref --heads
Time (mean ± σ): 49.9 ms ± 6.2 ms [User: 45.6 ms, System: 4.1 ms]
Range (min … max): 46.1 ms … 73.6 ms 43 runs
Benchmark 2: git.compile show-ref --heads
Time (mean ± σ): 2.8 ms ± 0.4 ms [User: 1.4 ms, System: 1.2 ms]
Range (min … max): 1.3 ms … 5.6 ms 957 runs
Summary
'git.compile show-ref --heads' ran
18.03 ± 3.38 times faster than 'git show-ref --heads'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.
* tb/cruft-packs:
sha1-file.c: don't freshen cruft packs
builtin/gc.c: conditionally avoid pruning objects via loose
builtin/repack.c: add cruft packs to MIDX during geometric repack
builtin/repack.c: use named flags for existing_packs
builtin/repack.c: allow configuring cruft pack generation
builtin/repack.c: support generating a cruft pack
builtin/pack-objects.c: --cruft with expiration
reachable: report precise timestamps from objects in cruft packs
reachable: add options to add_unseen_recent_objects_to_traversal
builtin/pack-objects.c: --cruft without expiration
builtin/pack-objects.c: return from create_object_entry()
t/helper: add 'pack-mtimes' test-tool
pack-mtimes: support writing pack .mtimes files
chunk-format.h: extract oid_version()
pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
pack-mtimes: support reading .mtimes files
Documentation/technical: add cruft-packs.txt
A workflow change for translators is being proposed.
* jx/l10n-workflow-change:
l10n: Document the new l10n workflow
Makefile: add "po-init" rule to initialize po/XX.po
Makefile: add "po-update" rule to update po/XX.po
po/git.pot: don't check in result of "make pot"
po/git.pot: this is now a generated file
Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES
i18n CI: stop allowing non-ASCII source messages in po/git.pot
Makefile: have "make pot" not "reset --hard"
Makefile: generate "po/git.pot" from stable LOCALIZED_C
Makefile: sort source files before feeding to xgettext
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.
* tb/geom-repack-with-keep-and-max:
builtin/repack.c: ensure that `names` is sorted
t7703: demonstrate object corruption with pack.packSizeLimit
repack: respect --keep-pack with geometric repack
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test