e53e6b4433f264250c2e586167caf61721b0185c
When traversing trees with an index, the current index pointer
(o->cache_bottom) occasionally has to be temporarily advanced forwards to
match the traversal order of the tree, which is not the same as the sort
order of the index. The existing algorithm that did this (introduced in
730f72840c) would get "stuck" when the
cache_bottom was popped and then repeatedly check the same index entries
over and over. This represents a serious performance regression for
large repositories compared to the old "broken" traversal order.
This commit makes a simple change to mitigate this. Whenever
find_cache_pos sees that the current pos is also the cache_bottom, and
it has already been unpacked, it advances the cache_bottom as well as
the current pos. This prevents the above "sticking" behavior without
dramatically changing the algorithm.
In addition, this commit moves the unpacked check above the
ce_in_traverse_path() check. The simple bitmask check is cheaper, and
in the case described above will be firing quite a bit to advance the
cache_bottom after a tree pop.
This yields considerable performance improvements for large trees.
The following are the number of function calls for "git diff HEAD" on
the Linux kernel tree, with 33,307 files:
Symbol Calls Before Calls After
------------------- ------------ -----------
unpack_callback 35,332 35,332
find_cache_pos 37,357 37,357
ce_in_traverse_path 4,979,473 37,357
do_compare_entry 6,828,181 251,925
df_name_compare 6,828,181 251,925
And on a repository of 187,456 files:
Symbol Calls Before Calls After
------------------- ------------ -----------
unpack_callback 197,958 197,958
find_cache_pos 208,460 208,460
ce_in_traverse_path 37,308,336 208,460
do_compare_entry 156,950,469 2,690,626
df_name_compare 156,950,469 2,690,626
On the latter repository, user time for "git diff HEAD" was reduced from
5.58 to 0.42 seconds. This is compared to 0.30 seconds before the
traversal order fix was implemented.
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…
…
…
…
…
////////////////////////////////////////////////////////////////
GIT - the stupid content tracker
////////////////////////////////////////////////////////////////
"git" can mean anything, depending on your mood.
- random three-letter combination that is pronounceable, and not
actually used by any common UNIX command. The fact that it is a
mispronunciation of "get" may or may not be relevant.
- stupid. contemptible and despicable. simple. Take your pick from the
dictionary of slang.
- "global information tracker": you're in a good mood, and it actually
works for you. Angels sing, and a light suddenly fills the room.
- "goddamn idiotic truckload of sh*t": when it breaks
Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.
Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.
Please read the file INSTALL for installation instructions.
See Documentation/gittutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands, and
Documentation/git-commandname.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with "man gittutorial" or "git help tutorial", and the
documentation of each command with "man git-commandname" or "git help
commandname".
CVS users may also want to read Documentation/gitcvs-migration.txt
("man gitcvs-migration" or "git help cvs-migration" if git is
installed).
Many Git online resources are accessible from http://git-scm.com/
including full documentation and Git related tools.
The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org. To subscribe
to the list, send an email with just "subscribe git" in the body to
majordomo@vger.kernel.org. The mailing list archives are available at
http://marc.theaimsgroup.com/?l=git and other archival sites.
The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.
Description
Languages
C
50.5%
Shell
38.7%
Perl
4.5%
Tcl
3.2%
Python
0.8%
Other
2.1%