admin/git

Author	SHA1	Message	Date
Josh Steadmon	5fc31180d8	fetch: add trace2 instrumentation Add trace2 regions to fetch-pack.c and builtins/fetch.c to better track time spent in the various phases of a fetch: * listing refs * negotiation for protocol versions v0-v2 * fetching refs * consuming refs Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-03 10:13:17 +09:00
Garima Singh	0bd7f578b2	commit-graph: emit trace2 cmd_mode for each sub-command Emit trace2_cmd_mode() messages for each commit-graph sub-command. The commit graph commands were in flux when trace2 was making it's way to git. Now that we have enough sub-commands in commit-graph, we can label the various modes within them. Distinguishing between read, write and verify is a great start. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-02 18:36:19 +09:00
Junio C Hamano	5a5350940b	Merge branch 'ds/commit-graph-on-fetch' A configuration variable tells "git fetch" to write the commit graph after finishing. * ds/commit-graph-on-fetch: fetch: add fetch.writeCommitGraph config setting	2019-09-30 13:19:32 +09:00
Junio C Hamano	974bdb0205	Merge branch 'bw/rebase-autostash-keep-current-branch' "git rebase --autostash <upstream> <branch>", when <branch> is different from the current branch, incorrectly moved the tip of the current branch, which has been corrected. * bw/rebase-autostash-keep-current-branch: builtin/rebase.c: Remove pointless message builtin/rebase.c: make sure the active branch isn't moved when autostashing	2019-09-30 13:19:32 +09:00
Junio C Hamano	9755f70fe6	Merge branch 'ds/include-exclude' The internal code originally invented for ".gitignore" processing got reshuffled and renamed to make it less tied to "excluding" and stress more that it is about "matching", as it has been reused for things like sparse checkout specification that want to check if a path is "included". * ds/include-exclude: unpack-trees: rename 'is_excluded_from_list()' treewide: rename 'exclude' methods to 'pattern' treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_' treewide: rename 'struct exclude_list' to 'struct pattern_list' treewide: rename 'struct exclude' to 'struct path_pattern'	2019-09-30 13:19:32 +09:00
Junio C Hamano	640f9cd599	Merge branch 'dl/rebase-i-keep-base' "git rebase --keep-base <upstream>" tries to find the original base of the topic being rebased and rebase on top of that same base, which is useful when running the "git rebase -i" (and its limited variant "git rebase -x"). The command also has learned to fast-forward in more cases where it can instead of replaying to recreate identical commits. * dl/rebase-i-keep-base: rebase: teach rebase --keep-base rebase tests: test linear branch topology rebase: fast-forward --fork-point in more cases rebase: fast-forward --onto in more cases rebase: refactor can_fast_forward into goto tower t3432: test for --no-ff's interaction with fast-forward t3432: distinguish "noop-same" v.s. "work-same" in "same head" tests t3432: test rebase fast-forward behavior t3431: add rebase --fork-point tests	2019-09-30 13:19:31 +09:00
Junio C Hamano	d8ce144e11	Merge branch 'jk/misc-uninitialized-fixes' Various fixes to codepaths gcc 9 had trouble following dataflow. * jk/misc-uninitialized-fixes: pack-objects: drop packlist index_pos optimization test-read-cache: drop namelen variable diff-delta: set size out-parameter to 0 for NULL delta bulk-checkin: zero-initialize hashfile_checkpoint pack-objects: use object_id in packlist_alloc() git-am: handle missing "author" when parsing commit	2019-09-30 13:19:30 +09:00
Junio C Hamano	cf861cd7a0	Merge branch 'rs/get-tagged-oid' Code cleanup. * rs/get-tagged-oid: use get_tagged_oid() tag: factor out get_tagged_oid()	2019-09-30 13:19:29 +09:00
Junio C Hamano	fe048e4fd9	Merge branch 'tg/push-all-in-mirror-forbidden' Fix an earlier regression to "git push --all" which should have been forbidden when the target remote repository is set to be a mirror. * tg/push-all-in-mirror-forbidden: push: disallow --all and refspecs when remote.<name>.mirror is set	2019-09-30 13:19:28 +09:00
Junio C Hamano	3ff6af7753	Merge branch 'nd/switch-and-restore' Resurrect a performance hack. * nd/switch-and-restore: checkout: add simple check for 'git checkout -b'	2019-09-30 13:19:27 +09:00
Junio C Hamano	bf6136cab8	Merge branch 'rs/strbuf-detach' Straighten out the use of strbuf_detach() API function. * rs/strbuf-detach: grep: use return value of strbuf_detach() log-tree: always use return value of strbuf_detach()	2019-09-30 13:19:24 +09:00
Bert Wesarg	cda0d497e3	builtin/submodule--helper: fix usage string for 'update-clone' Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-29 07:33:57 +09:00
Elijah Newren	af2abd870b	fast-export: fix exporting a tag and nothing else fast-export allows specifying revision ranges, which can be used to export a tag without exporting the commit it tags. fast-export handled this rather poorly: it would emit a "from :0" directive. Since marks start at 1 and increase, this means it refers to an unknown commit and fast-import will choke on the input. When we are unable to look up a mark for the object being tagged, use a "from $HASH" directive instead to fix this problem. Note that this is quite similar to the behavior fast-export exhibits with commits and parents when --reference-excluded-parents is passed along with an excluded commit range. For tags of excluded commits we do not require the --reference-excluded-parents flag because we always have to tag something. By contrast, when dealing with commits, pruning a parent is always a viable option, so we need the flag to specify that parent pruning is not wanted. (It is slightly weird that --reference-excluded-parents isn't the default with a separate --prune-excluded-parents flag, but backward compatibility concerns resulted in the current defaults.) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-28 18:54:40 +09:00
SZEDER Gábor	2e09c01232	name-rev: avoid cutoff timestamp underflow When 'git name-rev' is invoked with commit-ish parameters, it tries to save some work, and doesn't visit commits older than the committer date of the oldest given commit minus a one day worth of slop. Since our 'timestamp_t' is an unsigned type, this leads to a timestamp underflow when the committer date of the oldest given commit is within a day of the UNIX epoch. As a result the cutoff timestamp ends up far-far in the future, and 'git name-rev' doesn't visit any commits, and names each given commit as 'undefined'. Check whether subtracting the slop from the oldest committer date would lead to an underflow, and use no cutoff in that case. We don't have a TIME_MIN constant, `dddbad728c` (timestamp_t: a new data type for timestamps, 2017-04-26) didn't add one, so do it now. Note that the type of the cutoff timestamp variable used to be signed before `5589e87fd8` (name-rev: change a "long" variable to timestamp_t, 2017-05-20). The behavior was still the same even back then, but the underflow didn't happen when substracting the slop from the oldest committer date, but when comparing the signed cutoff timestamp with unsigned committer dates in name_rev(). IOW, this underflow bug is as old as 'git name-rev' itself. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-28 13:36:04 +09:00
Thomas Gummerer	34933d0eff	stash: make sure to write refreshed cache When converting stash into C, calls to 'git update-index --refresh' were replaced with the 'refresh_cache()' function. That is fine as long as the index is only needed in-core, and not re-read from disk. However in many cases we do actually need the refreshed index to be written to disk, for example 'merge_recursive_generic()' discards the in-core index before re-reading it from disk, and in the case of 'apply --quiet', the 'refresh_cache()' we currently have is pointless without writing the index to disk. Always write the index after refreshing it to ensure there are no regressions in this compared to the scripted stash. In the future we can consider avoiding the write where possible after making sure none of the subsequent calls actually need the refreshed cache, and it is not expected to be refreshed after stash exits or it is written somewhere else already. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-20 09:58:22 -07:00
Thomas Gummerer	e080b34540	merge: use refresh_and_write_cache Use the 'refresh_and_write_cache()' convenience function introduced in the last commit, instead of refreshing and writing the index manually in merge.c Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-20 09:58:22 -07:00
Thomas Gummerer	22184497a3	factor out refresh_and_write_cache function Getting the lock for the index, refreshing it and then writing it is a pattern that happens more than once throughout the codebase, and isn't trivial to get right. Factor out the refresh_and_write_cache function from builtin/am.c to read-cache.c, so it can be re-used in other places in a subsequent commit. Note that we return different error codes for failing to refresh the cache, and failing to write the index. The current caller only cares about failing to write the index. However for other callers we're going to convert in subsequent patches we will need this distinction. Helped-by: Martin Ågren <martin.agren@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-20 09:58:21 -07:00
Garima Singh	7371612255	commit-graph: add --[no-]progress to write and verify Add --[no-]progress to git commit-graph write and verify. The progress feature was introduced in `7b0f229` ("commit-graph write: add progress output", 2018-09-17) but the ability to opt-out was overlooked. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-18 14:23:09 -07:00
Junio C Hamano	627b826834	Merge branch 'md/list-objects-filter-combo' The list-objects-filter API (used to create a sparse/lazy clone) learned to take a combined filter specification. * md/list-objects-filter-combo: list-objects-filter-options: make parser void list-objects-filter-options: clean up use of ALLOC_GROW list-objects-filter-options: allow mult. --filter strbuf: give URL-encoding API a char predicate fn list-objects-filter-options: make filter_spec a string_list list-objects-filter-options: move error check up list-objects-filter: implement composite filters list-objects-filter-options: always supply *errbuf list-objects-filter: put omits set in filter struct list-objects-filter: encapsulate filter components	2019-09-18 11:50:09 -07:00
Junio C Hamano	b9ac6c59b8	Merge branch 'cc/multi-promisor' Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand. * cc/multi-promisor: Move core_partial_clone_filter_default to promisor-remote.c Move repository_format_partial_clone to promisor-remote.c Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} remote: add promisor and partial clone config to the doc partial-clone: add multiple remotes in the doc t0410: test fetching from many promisor remotes builtin/fetch: remove unique promisor remote limitation promisor-remote: parse remote.*.partialclonefilter Use promisor_remote_get_direct() and has_promisor_remote() promisor-remote: use repository_format_partial_clone promisor-remote: add promisor_remote_reinit() promisor-remote: implement promisor_remote_get_direct() Add initial support for many promisor remotes fetch-object: make functions return an error code t0410: remove pipes after git commands	2019-09-18 11:50:09 -07:00
Junio C Hamano	f76bd8c6b1	Merge branch 'js/pre-merge-commit-hook' A new "pre-merge-commit" hook has been introduced. * js/pre-merge-commit-hook: merge: --no-verify to bypass pre-merge-commit hook git-merge: honor pre-merge-commit hook merge: do no-verify like commit t7503: verify proper hook execution	2019-09-18 11:50:08 -07:00
Junio C Hamano	128666753b	Merge branch 'jk/drop-release-pack-memory' xmalloc() used to have a mechanism to ditch memory and address space resources as the last resort upon seeing an allocation failure from the underlying malloc(), which made the code complex and thread-unsafe with dubious benefit, as major memory resource users already do limit their uses with various other mechanisms. It has been simplified away. * jk/drop-release-pack-memory: packfile: drop release_pack_memory()	2019-09-18 11:50:08 -07:00
Junio C Hamano	917a319ea5	Merge branch 'js/rebase-r-strategy' "git rebase --rebase-merges" learned to drive different merge strategies and pass strategy specific options to them. * js/rebase-r-strategy: t3427: accelerate this test by using fast-export and fast-import rebase -r: do not (re-)generate root commits with `--root` and `--onto` t3418: test `rebase -r` with merge strategies t/lib-rebase: prepare for testing `git rebase --rebase-merges` rebase -r: support merge strategies other than `recursive` t3427: fix another incorrect assumption t3427: accommodate for the `rebase --merge` backend having been replaced t3427: fix erroneous assumption t3427: condense the unnecessarily repetitive test cases into three t3427: move the `filter-branch` invocation into the `setup` case t3427: simplify the `setup` test case significantly t3427: add a clarifying comment rebase: fold git-rebase--common into the -p backend sequencer: the `am` and `rebase--interactive` scripts are gone .gitignore: there is no longer a built-in `git-rebase--interactive` t3400: stop referring to the scripted rebase Drop unused git-rebase--am.sh	2019-09-18 11:50:07 -07:00
Elijah Newren	902b90cf42	clean: fix theoretical path corruption cmd_clean() had the following code structure: struct strbuf abs_path = STRBUF_INIT; for_each_string_list_item(item, &del_list) { strbuf_addstr(&abs_path, prefix); strbuf_addstr(&abs_path, item->string); PROCESS(&abs_path); strbuf_reset(&abs_path); } where I've elided a bunch of unnecessary details and PROCESS(&abs_path) represents a big chunk of code rather than an actual function call. One piece of PROCESS was: if (lstat(abs_path.buf, &st)) continue; which would cause the strbuf_reset() to be missed -- meaning that the next path to be handled would have two paths concatenated. This path used to use die_errno() instead of continue prior to commit `396049e5fb` ("git-clean: refactor git-clean into two phases", 2013-06-25), but my understanding of how correct_untracked_entries() works is that it will prevent both dir/ and dir/file from being in the list to clean so this should be dead code and the die_errno() should be safe. But I hesitate to remove it since I am not certain. However, we can fix both this bug and possible similar future bugs by simply moving the strbuf_reset(&abs_path) to the beginning of the loop. It'll result in N calls to strbuf_reset() instead of N-1, but that's a small price to pay to avoid sneaky bugs like this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-17 12:20:36 -07:00
Elijah Newren	ca8b5390db	clean: rewrap overly long line Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-17 12:20:36 -07:00
Elijah Newren	09487f2cba	clean: avoid removing untracked files in a nested git repository Users expect files in a nested git repository to be left alone unless sufficiently forced (with two -f's). Unfortunately, in certain circumstances, git would delete both tracked (and possibly dirty) files and untracked files within a nested repository. To explain how this happens, let's contrast a couple cases. First, take the following example setup (which assumes we are already within a git repo): git init nested cd nested >tracked git add tracked git commit -m init >untracked cd .. In this setup, everything works as expected; running 'git clean -fd' will result in fill_directory() returning the following paths: nested/ nested/tracked nested/untracked and then correct_untracked_entries() would notice this can be compressed to nested/ and then since "nested/" is a directory, we would call remove_dirs("nested/", ...), which would check is_nonbare_repository_dir() and then decide to skip it. However, if someone also creates an ignored file: >nested/ignored then running 'git clean -fd' would result in fill_directory() returning the same paths: nested/ nested/tracked nested/untracked but correct_untracked_entries() will notice that we had ignored entries under nested/ and thus simplify this list to nested/tracked nested/untracked Since these are not directories, we do not call remove_dirs() which was the only place that had the is_nonbare_repository_dir() safety check -- resulting in us deleting both the untracked file and the tracked (and possibly dirty) file. One possible fix for this issue would be walking the parent directories of each path and checking if they represent nonbare repositories, but that would be wasteful. Even if we added caching of some sort, it's still a waste because we should have been able to check that "nested/" represented a nonbare repository before even descending into it in the first place. Add a DIR_SKIP_NESTED_GIT flag to dir_struct.flags and use it to prevent fill_directory() and friends from descending into nested git repos. With this change, we also modify two regression tests added in commit `91479b9c72` ("t7300: add tests to document behavior of clean and nested git", 2015-06-15). That commit, nor its series, nor the six previous iterations of that series on the mailing list discussed why those tests coded the expectation they did. In fact, it appears their purpose was simply to test _existing_ behavior to make sure that the performance changes didn't change the behavior. However, these two tests directly contradicted the manpage's claims that two -f's were required to delete files/directories under a nested git repository. While one could argue that the user gave an explicit path which matched files/directories that were within a nested repository, there's a slippery slope that becomes very difficult for users to understand once you go down that route (e.g. what if they specified "git clean -f -d '*.c'"?) It would also be hard to explain what the exact behavior was; avoid such problems by making it really simple. Also, clean up some grammar errors describing this functionality in the git-clean manpage. Finally, there are still a couple bugs with -ffd not cleaning out enough (e.g. missing the nested .git) and with -ffdX possibly cleaning out the wrong files (paying attention to outer .gitignore instead of inner). This patch does not address these cases at all (and does not change the behavior relative to those flags), it only fixes the handling when given a single -f. See https://public-inbox.org/git/20190905212043.GC32087@szeder.dev/ for more discussion of the -ffd[X?] bugs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-17 12:20:35 -07:00
Elijah Newren	e86bbcf987	clean: disambiguate the definition of -d The -d flag pre-dated git-clean's ability to have paths specified. As such, the default for git-clean was to only remove untracked files in the current directory, and -d existed to allow it to recurse into subdirectories. The interaction of paths and the -d option appears to not have been carefully considered, as evidenced by numerous bugs and a dearth of tests covering such pairings in the testsuite. The definition turns out to be important, so let's look at some of the various ways one could interpret the -d option: A) Without -d, only look in subdirectories which contain tracked files under them; with -d, also look in subdirectories which are untracked for files to clean. B) Without specified paths from the user for us to delete, we need to have some kind of default, so...without -d, only look in subdirectories which contain tracked files under them; with -d, also look in subdirectories which are untracked for files to clean. The important distinction here is that choice B says that the presence or absence of '-d' is irrelevant if paths are specified. The logic behind option B is that if a user explicitly asked us to clean a specified pathspec, then we should clean anything that matches that pathspec. Some examples may clarify. Should git clean -f untracked_dir/file remove untracked_dir/file or not? It seems crazy not to, but a strict reading of option A says it shouldn't be removed. How about git clean -f untracked_dir/file1 tracked_dir/file2 or git clean -f untracked_dir_1/file1 untracked_dir_2/file2 ? Should it remove either or both of these files? Should it require multiple runs to remove both the files listed? (If this sounds like a crazy question to even ask, see the commit message of "t7300: Add some testcases showing failure to clean specified pathspecs" added earlier in this patch series.) What if -ffd were used instead of -f -- should that allow these to be removed? Should it take multiple invocations with -ffd? What if a glob (such as 'tracked') were used instead of spelling out the directory names? What if the filenames involved globs, such as git clean -f '.o' or git clean -f '/*.o' ? The current documentation actually suggests a definition that is slightly different than choice A, and the implementation prior to this series provided something radically different than either choices A or B. (The implementation, though, was clearly just buggy). There may be other choices as well. However, for almost any given choice of definition for -d that I can think of, some of the examples above will appear buggy to the user. The only case that doesn't have negative surprises is choice B: treat a user-specified path as a request to clean all untracked files which match that path specification, including recursing into any untracked directories. Change the documentation and basic implementation to use this definition. There were two regression tests that indirectly depended on the current implementation, but neither was about subdirectory handling. These two tests were introduced in commit `5b7570cfb4` ("git-clean: add tests for relative path", 2008-03-07) which was solely created to add coverage for the changes in commit fb328947c8e ("git-clean: correct printing relative path", 2008-03-07). Both tests specified a directory that happened to have an untracked subdirectory, but both were only checking that the resulting printout of a file that was removed was shown with a relative path. Update these tests appropriately. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-17 12:20:35 -07:00
Masaya Suzuki	b7e2d8bca5	fetch: use oidset to keep the want OIDs for faster lookup During git-fetch, the client checks if the advertised tags' OIDs are already in the fetch request's want OID set. This check is done in a linear scan. For a repository that has a lot of refs, repeating this scan takes 15+ minutes. In order to speed this up, create a oid_set for other refs' OIDs. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-16 13:02:50 -07:00
Jeff King	4c96a77594	list-objects-filter: delay parsing of sparse oid The list-objects-filter code has two steps to its initialization: 1. parse_list_objects_filter() makes sure the spec is a filter we know about and is syntactically correct. This step is done by "rev-list" or "upload-pack" that is going to apply a filter, but also by "git clone" or "git fetch" before they send the spec across the wire. 2. list_objects_filter__init() runs the type-specific initialization (using function pointers established in step 1). This happens at the start of traverse_commit_list_filtered(), when we're about to actually use the filter. It's a good idea to parse as much as we can in step 1, in order to catch problems early (e.g., a blob size limit that isn't a number). But one thing we _shouldn't_ do is resolve any oids at that step (e.g., for sparse-file contents specified by oid). In the case of a fetch, the oid has to be resolved on the remote side. The current code does resolve the oid during the parse phase, but ignores any error (which we must do, because we might just be sending the spec across the wire). This leads to two bugs: - if we're not in a repository (e.g., because it's git-clone parsing the spec), then we trigger a BUG() trying to resolve the name - if we did hit the error case, we still have to notice that later and bail. The code path in rev-list handles this, but the one in upload-pack does not, leading to a segfault. We can fix both by moving the oid resolution into the sparse-oid init function. At that point we know we have a repository (because we're about to traverse), and handling the error there fixes the segfault. As a bonus, we can drop the NULL sparse_oid_value check in rev-list, since this is now handled in the sparse-oid-filter init function. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-16 12:47:37 -07:00
Jeff King	bab28d9f97	builtin/pack-objects: report reused packfile objects To see when packfile reuse kicks in or not, it is useful to show reused packfile objects statistics in the output of upload-pack. Helped-by: James Ramsay <james@jramsay.com.au> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-13 14:40:33 -07:00
Junio C Hamano	9437394661	Merge branch 'cb/fetch-set-upstream' "git fetch" learned "--set-upstream" option to help those who first clone from their private fork they intend to push to, add the true upstream via "git remote add" and then "git fetch" from it. * cb/fetch-set-upstream: pull, fetch: add --set-upstream option	2019-09-09 12:26:37 -07:00
Junio C Hamano	d1a251a1fa	Merge branch 'en/checkout-mismerge-fix' Fix a mismerge that happened in 2.22 timeframe. * en/checkout-mismerge-fix: checkout: remove duplicate code	2019-09-09 12:26:36 -07:00
Junio C Hamano	f4f8dfe127	Merge branch 'ds/feature-macros' A mechanism to affect the default setting for a (related) group of configuration variables is introduced. * ds/feature-macros: repo-settings: create feature.experimental setting repo-settings: create feature.manyFiles setting repo-settings: parse core.untrackedCache commit-graph: turn on commit-graph by default t6501: use 'git gc' in quiet mode repo-settings: consolidate some config settings	2019-09-09 12:26:36 -07:00
Jeff King	dd2e50a84e	commit-graph: turn off save_commit_buffer The commit-graph tool may read a lot of commits, but it only cares about parsing their metadata (parents, trees, etc) and doesn't ever show the messages to the user. And so it should not need save_commit_buffer, which is meant for holding onto the object data of parsed commits so that we can show them later. In fact, it's quite harmful to do so. According to massif, the max heap of "git commit-graph write --reachable" in linux.git before/after this patch (removing the commit graph file in between) goes from ~1.1GB to ~270MB. Which isn't surprising, since the difference is about the sum of the uncompressed sizes of all commits in the repository, and this was equivalent to leaking them. This obviously helps if you're under memory pressure, but even without it, things go faster. My before/after times for that command (without massif) went from 12.521s to 11.874s, a speedup of ~5%. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-09 10:56:50 -07:00
Ben Wijen	bf1e28e0ad	builtin/rebase.c: Remove pointless message When doing 'git rebase --autostash <upstream> <master>' with a dirty worktree a 'HEAD is now at ...' message is emitted, which is pointless as it refers to the old active branch which isn't actually moved. This commit removes the 'HEAD is now at...' message. Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-07 10:17:05 -07:00
Ben Wijen	d2172ef02d	builtin/rebase.c: make sure the active branch isn't moved when autostashing Consider the following scenario: git checkout not-the-master work work work git rebase --autostash upstream master Here 'rebase --autostash <upstream> <branch>' incorrectly moves the active branch (not-the-master) to master (before the rebase). The expected behavior: (58794775:/git-rebase.sh:526) AUTOSTASH=$(git stash create autostash) git reset --hard git checkout master git rebase upstream git stash apply $AUTOSTASH The actual behavior: (6defce2b:/builtin/rebase.c:1062) AUTOSTASH=$(git stash create autostash) git reset --hard master git checkout master git rebase upstream git stash apply $AUTOSTASH This commit reinstates the 'legacy script' behavior as introduced with `58794775`: rebase: implement --[no-]autostash and rebase.autostash Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-07 10:17:05 -07:00
Jeff King	3a37876b5d	pack-objects: drop packlist index_pos optimization Once upon a time, the code to add an object to our packing list in pack-objects all lived in a single function. It computed the position within the hash table once, then used it to check if the object was already present, and if not, to add it. Later, in `2834bc27c1` (pack-objects: refactor the packing list, 2013-10-24), this was split into two functions: packlist_find() and packlist_alloc(). We ended up with an "index_pos" variable that gets passed through several functions to make it from one to the other. The resulting code is rather confusing to follow. The "index_pos" variable is sometimes undefined, if we don't yet have a hash table. This works out in practice because in that case packlist_alloc() won't use it at all, since it will have to create/grow the hash table. But it's hard to verify that, and it does cause gcc 9.2.1's -Wmaybe-uninitialized to complain when compiled with "-flto -O3" (rightfully, since we do pass the uninitialized value as a function parameter, even if nobody ends up using it). All of this is to save computing the hash index again when we're inserting into the hash table, which I found doesn't make a measurable difference in the program runtime (which is not surprising, since we're doing all kinds of other heavyweight things for each object). Let's just drop this index_pos variable entirely, simplifying the code (and pleasing the compiler). We might be better still refactoring this custom hash table to use one of our existing implementations (an oidmap, or a kh_oid_map). I stopped short of that here, but this would be the likely first step towards that anyway. Reported-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-06 11:03:42 -07:00
Jeff King	f1cbd033e2	pack-objects: use object_id in packlist_alloc() The only caller of packlist_alloc() already has a "struct object_id", and we immediately copy the hash they pass us into our own object_id. Let's avoid the unnecessary round-trip to a raw sha1 pointer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-06 11:03:39 -07:00
Jeff King	0dfed92dfd	git-am: handle missing "author" when parsing commit We try to parse the "author" line out of a commit buffer. We handle the case that split_ident_line() doesn't work, but we don't do any error checking that we found an "author" line in the first place! This would cause us to segfault on such a corrupt object. Let's put in an explicit NULL check (we can just die(), which is what a bogus split would do, too). As a bonus, this silences a warning when compiling with gcc 9.2.1 using "-flto -O3", which claims that ident_len may be uninitialized (it would only be if we had a NULL here). Reported-by: Stephan Beyer <s-beyer@gmx.net> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-06 11:03:39 -07:00
René Scharfe	c77722b3ea	use get_tagged_oid() Avoid derefencing ->tagged without checking for NULL by using the convenience wrapper for getting the ID of the tagged object. It die()s when encountering a broken tag instead of segfaulting. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-05 14:11:34 -07:00
Derrick Stolee	65edd96aec	treewide: rename 'exclude' methods to 'pattern' The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames several methods defined in dir.h to make more sense with the renamed 'struct exclude_list' to 'struct pattern_list' and 'struct exclude' to 'struct path_pattern': * last_exclude_matching() -> last_matching_pattern() * parse_exclude() -> parse_path_pattern() In addition, the word 'exclude' was replaced with 'pattern' in the methods below: * add_exclude_list() * add_excludes_from_file_to_list() * add_excludes_from_file() * add_excludes_from_blob_to_list() * add_exclude() * clear_exclude_list() A few methods with the word "exclude" remain. These will be handled seperately. In particular, the method "is_excluded()" is concretely about the .gitignore file relative to a specific directory. This is the important boundary between library and consumer: is_excluded() cares about .gitignore, but is_excluded() calls last_matching_pattern() to make that decision. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-05 14:05:12 -07:00
Derrick Stolee	4ff89ee52c	treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_' The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit replaces 'EXCL_FLAG_' to 'PATTERN_FLAG_' in the names of the flags used on 'struct path_pattern'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-05 14:05:12 -07:00
Derrick Stolee	caa3d55444	treewide: rename 'struct exclude_list' to 'struct pattern_list' The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude_list' to 'struct pattern_list' and renames several variables called 'el' to 'pl'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-05 14:05:11 -07:00
Derrick Stolee	ab8db61390	treewide: rename 'struct exclude' to 'struct path_pattern' The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a list of 'struct exclude' items makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude' to 'struct path_pattern' and renames several variable names to match. 'struct pattern' was already taken by attr.c, and this more completely describes that the patterns are specific to file paths. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-05 14:05:11 -07:00
Derrick Stolee	50f26bd035	fetch: add fetch.writeCommitGraph config setting The commit-graph feature is now on by default, and is being written during 'git gc' by default. Typically, Git only writes a commit-graph when a 'git gc --auto' command passes the gc.auto setting to actualy do work. This means that a commit-graph will typically fall behind the commits that are being used every day. To stay updated with the latest commits, add a step to 'git fetch' to write a commit-graph after fetching new objects. The fetch.writeCommitGraph config setting enables writing a split commit-graph, so on average the cost of writing this file is very small. Occasionally, the commit-graph chain will collapse to a single level, and this could be slow for very large repos. For additional use, adjust the default to be true when feature.experimental is enabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-03 12:06:14 -07:00
Thomas Gummerer	8e4c8af058	push: disallow --all and refspecs when remote.<name>.mirror is set Pushes with --all, or refspecs are disallowed when --mirror is given to 'git push', or when 'remote.<name>.mirror' is set in the config of the repository, because they can have surprising effects. `800a4ab399` ("push: check for errors earlier", 2018-05-16) refactored this code to do that check earlier, so we can explicitly check for the presence of flags, instead of their sideeffects. However when 'remote.<name>.mirror' is set in the config, the TRANSPORT_PUSH_MIRROR flag would only be set after we calling 'do_push()', so the checks would miss it entirely. This leads to surprises for users [1]. Fix this by making sure we set the flag (if appropriate) before checking for compatibility of the various options. 1: https://twitter.com/FiloSottile/status/1163918701462249472 Reported-by: Filippo Valsorda <filippo@ml.filippo.io> Helped-by: Saleem Rashid Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-03 11:16:54 -07:00
Derrick Stolee	313677627a	checkout: add simple check for 'git checkout -b' The 'git switch' command was created to separate half of the behavior of 'git checkout'. It specifically has the mode to do nothing with the index and working directory if the user only specifies to create a new branch and change HEAD to that branch. This is also the behavior most users expect from 'git checkout -b', but for historical reasons it also performs an index update by scanning the working directory. This can be slow for even moderately-sized repos. A performance fix for 'git checkout -b' was introduced by `fa655d8411` (checkout: optimize "git checkout -b <new_branch>" 2018-08-16). That change includes details about the config setting checkout.optimizeNewBranch when the sparse-checkout feature is required. The way this change detected if this behavior change is safe was through the skip_merge_working_tree() method. This method was complex and needed to be updated as new options were introduced. This behavior was essentially reverted by `65f099b` ("switch: no worktree status unless real branch switch happens" 2019-03-29). Instead, two members of the checkout_opts struct were used to distinguish between 'git checkout' and 'git switch': * switch_branch_doing_nothing_is_ok * only_merge_on_switching_branches These settings have opposite values depending on if we start in cmd_checkout or cmd_switch. The message for 64f099b includes "Users of big repos are encouraged to move to switch." Making this change while 'git switch' is still experimental is too aggressive. Create a happy medium between these two options by making 'git checkout -b <branch>' behave just like 'git switch', but only if we read exactly those arguments. This must be done in cmd_checkout to avoid the arguments being consumed by the option parsing logic. This differs from the previous change by fa644d8 in that the config option checkout.optimizeNewBranch remains deleted. This means that 'git checkout -b' will ignore the index merge even if we have a sparse-checkout file. While this is a behavior change for 'git checkout -b', it matches the behavior of 'git switch -c'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Acked-by: Elijah Newren <newren@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-30 10:20:05 -07:00
Denton Liu	414d924beb	rebase: teach rebase --keep-base A common scenario is if a user is working on a topic branch and they wish to make some changes to intermediate commits or autosquash, they would run something such as git rebase -i --onto master... master in order to preserve the merge base. This is useful when contributing a patch series to the Git mailing list, one often starts on top of the current 'master'. While developing the patches, 'master' is also developed further and it is sometimes not the best idea to keep rebasing on top of 'master', but to keep the base commit as-is. In addition to this, a user wishing to test individual commits in a topic branch without changing anything may run git rebase -x ./test.sh master... master Since rebasing onto the merge base of the branch and the upstream is such a common case, introduce the --keep-base option as a shortcut. This allows us to rewrite the above as git rebase -i --keep-base master and git rebase -x ./test.sh --keep-base master respectively. Add tests to ensure --keep-base works correctly in the normal case and fails when there are multiple merge bases, both in regular and interactive mode. Also, test to make sure conflicting options cause rebase to fail. While we're adding test cases, add a missing set_fake_editor call to 'rebase -i --onto master...side'. While we're documenting the --keep-base option, change an instance of "merge-base" to "merge base", which is the consistent spelling. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-27 15:33:40 -07:00
Denton Liu	4effc5bc96	rebase: fast-forward --fork-point in more cases Before, when we rebased with a --fork-point invocation where the fork-point wasn't empty, we would be setting options.restrict_revision. The fast-forward logic would automatically declare that the rebase was not fast-forwardable if it was set. However, this was painting with a very broad brush. Refine the logic so that we can fast-forward in the case where the restricted revision is equal to the merge base, since we stop rebasing at the merge base anyway. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-27 15:33:40 -07:00
Denton Liu	c0efb4c1dd	rebase: fast-forward --onto in more cases Before, when we had the following graph, A---B---C (master) \ D (side) running 'git rebase --onto master... master side' would result in D being always rebased, no matter what. However, the desired behavior is that rebase should notice that this is fast-forwardable and do that instead. Add detection to `can_fast_forward` so that this case can be detected and a fast-forward will be performed. First of all, rewrite the function to use gotos which simplifies the logic. Next, since the options.upstream && !oidcmp(&options.upstream->object.oid, &options.onto->object.oid) conditions were removed in `cmd_rebase`, we reintroduce a substitute in `can_fast_forward`. In particular, checking the merge bases of `upstream` and `head` fixes a failing case in t3416. The abbreviated graph for t3416 is as follows: F---G topic / A---B---C---D---E master and the failing command was git rebase --onto master...topic F topic Before, Git would see that there was one merge base (C), and the merge and onto were the same so it would incorrectly return 1, indicating that we could fast-forward. This would cause the rebased graph to be 'ABCFG' when we were expecting 'ABCG'. With the additional logic, we detect that upstream and head's merge base is F. Since onto isn't F, it means we're not rebasing the full set of commits from master..topic. Since we're excluding some commits, a fast-forward cannot be performed and so we correctly return 0. Add '-f' to test cases that failed as a result of this change because they were not expecting a fast-forward so that a rebase is forced. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-27 15:33:40 -07:00

... 26 27 28 29 30 ...

9307 Commits