Thomas Rast 6942efcfa9 xdiff: load full words in the inner loop of xdl_hash_record
Redo the hashing loop in xdl_hash_record in a way that loads an entire
'long' at a time, using masking tricks to see when and where we found
the terminating '\n'.

I stole inspiration and code from the posts by Linus Torvalds around

  https://lkml.org/lkml/2012/3/2/452
  https://lkml.org/lkml/2012/3/5/6

His method reads the buffer in sizeof(long) increments, and may thus
overrun it by at most sizeof(long)-1 bytes before it sees the final
newline (or hits the buffer length check).  I considered padding out
all buffers by a suitable amount to "catch" the overrun, but

* this does not work for mmap()'d buffers: if you map 4096+8 bytes
  from a 4096 byte file, accessing the last 8 bytes results in a
  SIGBUS on my machine; and

* it would also be extremely ugly because it intrudes deep into the
  unpacking machinery.

So I adapted it to not read beyond the buffer at all.  Instead, it
reads the final partial word byte-by-byte and strings it together.
Then it can use the same logic as before to finish the hashing.
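
To illustrate the idea (this is a minimal sketch, not the code from the
patch), the following shows only the newline detection, without any
hashing.  The names find_newline, has_zero and first_flag_byte are
hypothetical, and the sketch assumes a little-endian machine with 8-bit
bytes.  Full words are loaded with memcpy(); the final partial word is
strung together byte-by-byte so that nothing beyond buf+len is read.

```c
#include <stddef.h>
#include <string.h>

#define ONES  (~0UL / 0xff)   /* 0x0101...01 */
#define HIGHS (ONES * 0x80)   /* 0x8080...80 */
#define NL    (ONES * '\n')   /* 0x0a0a...0a */

/*
 * Nonzero iff some byte of w is zero.  The lowest set 0x80 flag marks
 * the first zero byte; flags above it may be false positives.
 */
static unsigned long has_zero(unsigned long w)
{
	return (w - ONES) & ~w & HIGHS;
}

/* Index of the lowest flagged byte; mask must be nonzero. */
static size_t first_flag_byte(unsigned long mask)
{
	size_t n = 0;
	while (!(mask & 0x80)) {
		mask >>= 8;
		n++;
	}
	return n;
}

/* Offset of the first '\n' in buf[0..len), or len if there is none. */
static size_t find_newline(const char *buf, size_t len)
{
	size_t i = 0, k;
	unsigned long w, mask;

	/* Full words: XOR with NL turns every '\n' byte into zero. */
	while (len - i >= sizeof(w)) {
		memcpy(&w, buf + i, sizeof(w));
		mask = has_zero(w ^ NL);
		if (mask)
			return i + first_flag_byte(mask);
		i += sizeof(w);
	}

	/*
	 * Partial word: assemble the remaining bytes one at a time.
	 * The unused upper bytes stay zero, which is not '\n', so
	 * they can never produce a match.
	 */
	if (i < len) {
		w = 0;
		for (k = 0; k < len - i; k++)
			w |= (unsigned long)(unsigned char)buf[i + k] << (8 * k);
		mask = has_zero(w ^ NL);
		if (mask)
			return i + first_flag_byte(mask);
	}
	return len;
}
```

For example, find_newline("abcdefg\nrest", 12) takes the full-word path
on a 64-bit machine, while find_newline("abc\n", 4) only ever touches
the four bytes it was given.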

So far we enable this only on x86_64, where it provides a nice speedup
for diff-related work:

  Test                                  origin/next      tr/xdiff-fast-hash
  -----------------------------------------------------------------------------
  4000.1: log -3000 (baseline)          0.07(0.05+0.02)  0.08(0.06+0.02) +14.3%
  4000.2: log --raw -3000 (tree-only)   0.37(0.33+0.04)  0.37(0.32+0.04) +0.0%
  4000.3: log -p -3000 (Myers)          1.75(1.65+0.09)  1.60(1.49+0.10) -8.6%
  4000.4: log -p -3000 --histogram      1.73(1.62+0.09)  1.58(1.49+0.08) -8.7%
  4000.5: log -p -3000 --patience       2.11(2.00+0.10)  1.94(1.80+0.11) -8.1%

Perhaps other platforms could also benefit.  However, it does NOT work
on big-endian systems!
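
The byte-order dependence comes from where the first byte in memory
lands inside a loaded word.  As a hedged illustration (the helper names
first_byte_shift and first_byte_of_word are hypothetical and not part of
the patch), a runtime probe can report that position:

```c
#include <limits.h>
#include <string.h>

/*
 * Bit position at which the first byte in memory ends up after a
 * whole 'long' is loaded: 0 on little-endian, the top byte on
 * big-endian.  Mixed-endian layouts are not considered here.
 */
static unsigned first_byte_shift(void)
{
	unsigned long probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first ? 0 : (unsigned)(sizeof(probe) - 1) * CHAR_BIT;
}

/* Load sizeof(long) bytes from p and extract the one that came first. */
static unsigned char first_byte_of_word(const char *p)
{
	unsigned long w;

	memcpy(&w, p, sizeof(w));
	return (unsigned char)(w >> first_byte_shift());
}
```

A mask trick that looks for the lowest flagged byte finds the earliest
byte in memory only when this shift is 0, that is, on little-endian.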

[jc: minimum style and compilation fixes]

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 17:03:25 -07:00

////////////////////////////////////////////////////////////////

	GIT - the stupid content tracker

////////////////////////////////////////////////////////////////

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room.
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.

Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with the help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.

Please read the file INSTALL for installation instructions.

See Documentation/gittutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands, and
Documentation/git-commandname.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with "man gittutorial" or "git help tutorial", and the
documentation of each command with "man git-commandname" or "git help
commandname".

CVS users may also want to read Documentation/gitcvs-migration.txt
("man gitcvs-migration" or "git help cvs-migration" if git is
installed).

Many Git online resources are accessible from http://git-scm.com/
including full documentation and Git-related tools.

The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org (read
Documentation/SubmittingPatches for instructions on patch submission).
To subscribe to the list, send an email with just "subscribe git" in
the body to majordomo@vger.kernel.org. The mailing list archives are
available at http://marc.theaimsgroup.com/?l=git and other archival
sites.

The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.