In 3afc679b "commit: use generations in paint_down_to_common()",
the queue in paint_down_to_common() was changed to use a priority
order based on generation number before commit date. This served
two purposes:

 1. When generation numbers are present, the walk guarantees
    correct topological relationships, regardless of clock skew in
    commit dates.

 2. It enables short-circuiting the walk when the min_generation
    parameter is added in d7c1ec3e "commit: add short-circuit to
    paint_down_to_common()". This short-circuit keeps commands
    like 'git branch --contains' from needing to walk to a merge
    base when we know the result is false.

The commit message for 3afc679b includes the following sentence:

    This change does not affect the number of commits that are
    walked during the execution of paint_down_to_common(), only
    the order that those commits are inspected.

This statement is incorrect. Because it changes the order in which
the commits are inspected, it changes the order they are added to
the queue, and hence can change the number of loops before the
queue_has_nonstale() method returns true.

This change makes a concrete difference depending on the topology
of the commit graph. For instance, the change has no effect on
computing the merge-base between consecutive versions of the Linux
kernel after v4.9, but 'git merge-base v4.8 v4.9' presents a
performance regression:

        v2.18.0: 0.122s
    v2.19.0-rc1: 0.547s
           HEAD: 0.127s

To determine that this was simply an ordering issue, I inserted
a counter within the while loop of paint_down_to_common() and
found that the loop runs 167,468 times in v2.18.0 and 635,579
times in v2.19.0-rc1.

The topology of this case can be described in a simplified way
here:

      v4.9
       |  \
       |   \
      v4.8  \
       | \   \
       |  \   |
      ...  A  B
       |  /  /
       | /  /
       |/__/
       C

Here, the "..." means "a very long line of commits". By generation
number, A and B have generation one more than C. However, A and B
have commit date higher than most of the commits reachable from
v4.8.
When the walk reaches v4.8, we realize that it has PARENT1
and PARENT2 flags, so everything it can reach is marked as STALE,
including A. B has only the PARENT1 flag, so is not STALE.

When paint_down_to_common() is run using
compare_commits_by_commit_date, A and B are removed from the queue
early and C is inserted into the queue. At this point, C and the
rest of the queue entries are marked as STALE. The loop then
terminates.

When paint_down_to_common() is run using
compare_commits_by_gen_then_commit_date, B is removed from the
queue only after the many commits reachable from v4.8 are explored.
This causes the loop to run longer. The reason for this regression
is simple: the queue order is intended to not explore a commit
until everything that _could_ reach that commit is explored. From
the information gathered by the original ordering, we have no
guarantee that there is not a commit D reachable from v4.8 that
can also reach B. We gained absolute correctness in exchange for
a performance regression.

The performance regression is probably the worse option, since
these incorrect results in paint_down_to_common() are rare. The
topology required for the performance regression is less rare,
but still requires multiple merge commits where the parents differ
greatly in generation number. In our example above, the commit A
is as important as the commit B to demonstrate the problem, since
otherwise the commit C will sit in the queue as non-stale just as
long in both orders.

The solution provided uses the min_generation parameter to decide
if we should use generation numbers in our ordering. When
min_generation is equal to zero, it means that the caller has no
known cutoff for the walk, so we should rely on our commit-date
heuristic as before; this is the case with merge_bases_many().
When min_generation is non-zero, then the caller knows a valuable
cutoff for the short-circuit mechanism; this is the case with
remove_redundant() and in_merge_bases_many().
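
The effect of the two orderings can be reproduced on a toy graph.
What follows is a minimal sketch in C, not the code in commit.c:
every toy_* name, the fixed-size queue with a linear scan, and the
dates and generation numbers given to each commit are assumptions
made for illustration, and the RESULT bookkeeping is left out. Only
the PARENT1, PARENT2 and STALE flags, the roles of the two
comparators and the min_generation rule come from the description
above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the walk described above; not git's actual code. */
enum { PARENT1 = 1, PARENT2 = 2, STALE = 4, TOY_CHAIN = 20, TOY_QUEUE_MAX = 64 };

struct toy_commit {
	unsigned generation, flags;
	long date;                     /* larger means newer */
	struct toy_commit *parents[2]; /* unused slots are NULL */
};
struct toy_queue { struct toy_commit *items[TOY_QUEUE_MAX]; size_t nr; };
typedef int (*toy_compare)(const struct toy_commit *, const struct toy_commit *);

/* A negative result means "a" leaves the queue before "b". */
static int by_date(const struct toy_commit *a, const struct toy_commit *b)
{
	return (a->date < b->date) - (a->date > b->date);
}

static int by_gen_then_date(const struct toy_commit *a, const struct toy_commit *b)
{
	if (a->generation != b->generation)
		return a->generation < b->generation ? 1 : -1;
	return by_date(a, b);
}

/* The fix: use generation order only when the caller has a cutoff. */
toy_compare toy_choose_compare(unsigned min_generation)
{
	return min_generation ? by_gen_then_date : by_date;
}

/* Linear scan instead of a heap; fine for a handful of entries. */
static struct toy_commit *toy_pop(struct toy_queue *q, toy_compare cmp)
{
	size_t best = 0, i;
	struct toy_commit *c;
	for (i = 1; i < q->nr; i++)
		if (cmp(q->items[i], q->items[best]) < 0)
			best = i;
	c = q->items[best];
	q->items[best] = q->items[--q->nr];
	return c;
}

static int toy_has_nonstale(const struct toy_queue *q)
{
	size_t i;
	for (i = 0; i < q->nr; i++)
		if (!(q->items[i]->flags & STALE))
			return 1;
	return 0;
}

/* Returns how many times the main loop runs for the given ordering. */
static int toy_count_loops(struct toy_commit *one, struct toy_commit *two, toy_compare cmp)
{
	struct toy_queue q = { { NULL }, 0 };
	int loops = 0, i;
	one->flags |= PARENT1;
	two->flags |= PARENT2;
	q.items[q.nr++] = one;
	q.items[q.nr++] = two;
	while (toy_has_nonstale(&q)) {
		struct toy_commit *c = toy_pop(&q, cmp);
		unsigned flags = c->flags & (PARENT1 | PARENT2 | STALE);
		loops++;
		if (flags == (PARENT1 | PARENT2))
			flags |= STALE; /* common ancestor: its history is stale */
		for (i = 0; i < 2 && c->parents[i]; i++) {
			struct toy_commit *p = c->parents[i];
			if ((p->flags & flags) == flags)
				continue; /* nothing new to paint */
			p->flags |= flags;
			assert(q.nr < TOY_QUEUE_MAX);
			q.items[q.nr++] = p;
		}
	}
	return loops;
}

/* Builds the v4.9 / v4.8 / A / B / C shape and walks it. */
int toy_example_loops(toy_compare cmp)
{
	struct toy_commit n[TOY_CHAIN + 5] = { { 1 } }, *c = &n[0]; /* n[0] is C */
	struct toy_commit *a = c + TOY_CHAIN + 1, *b = a + 1, *v48 = b + 1, *v49 = v48 + 1;
	int i;
	for (i = 1; i <= TOY_CHAIN; i++) /* the "..." chain, oldest first */
		n[i] = (struct toy_commit){ i + 1, 0, i, { &n[i - 1], NULL } };
	*a = (struct toy_commit){ 2, 0, 1000, { c, NULL } };
	*b = (struct toy_commit){ 2, 0, 1001, { c, NULL } };
	*v48 = (struct toy_commit){ TOY_CHAIN + 2, 0, 2000, { &n[TOY_CHAIN], a } };
	*v49 = (struct toy_commit){ TOY_CHAIN + 3, 0, 2001, { v48, b } };
	return toy_count_loops(v49, v48, cmp);
}
```

In this toy model, with TOY_CHAIN set to 20,
toy_example_loops(by_date) returns 5 and
toy_example_loops(by_gen_then_date) returns 24. The generation
ordering has to drain the whole chain before B leaves the queue, so
the gap grows with the length of the chain. toy_choose_compare(0)
picks the date ordering, mirroring a caller that has no cutoff.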
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git - fast, scalable, distributed revision control system
Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.
Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.
Please read the file INSTALL for installation instructions.
Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.
See Documentation/gittutorial.txt to get started, then see
Documentation/giteveryday.txt for a useful minimum set of commands, and
Documentation/git-.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with man gittutorial or git help tutorial, and the
documentation of each command with man git-<commandname> or git help <commandname>.
CVS users may also want to read Documentation/gitcvs-migration.txt
(man gitcvs-migration or git help cvs-migration if git is
installed).
The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://public-inbox.org/git/, http://marc.info/?l=git and other archival sites.
The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.
The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):
- random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
- stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
- "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
- "goddamn idiotic truckload of sh*t": when it breaks