git-diff --numstat -z: make it machine readable

The "-z" format is all about machine parsability, but showing renamed
paths as "common/{a => b}/suffix" makes it impossible.  The scripts would
never have successfully parsed "--numstat -z -M" in the old format.

This fixes the output format in a (hopefully minimally) backward
incompatible way.

 * The output without -z is not changed.  This has given a good way for
   humans to view added and deleted lines separately, and showing the
   path in a combined, shorter form preserves readability.

 * The output with -z is unchanged for paths that do not involve renames.
   Existing scripts that do not pass -M/-C are not affected at all.

 * The output with -z for a renamed path is shown in a format that can
   easily be distinguished from an unrenamed path.

This is based on Jakub Narebski's patch.  Bugs and documentation typos
are mine.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano
2007-12-11 23:46:30 -08:00
parent 71a9883db2
commit f604652e05
2 changed files with 129 additions and 32 deletions

@@ -84,3 +84,64 @@ all parents.
include::diff-generate-patch.txt[]
other diff formats
------------------
The `--summary` option describes newly added, deleted, renamed and
copied files. The `--stat` option adds a diffstat(1) graph to the
output. These options can be combined with other options, such as
`-p`, and are meant for human consumption.
When showing a change that involves a rename or a copy, `--stat` output
formats the pathnames compactly by combining the common prefix and suffix of
the pathnames. For example, a change that moves `arch/i386/Makefile` to
`arch/x86/Makefile` while modifying 4 lines will be shown like this:
------------------------------------
arch/{i386 => x86}/Makefile | 4 +--
------------------------------------
The `--numstat` option gives the diffstat(1) information but is designed
for easier machine consumption. An entry in `--numstat` output looks
like this:
----------------------------------------
1 2 README
3 1 arch/{i386 => x86}/Makefile
----------------------------------------
That is, from left to right:
. the number of added lines;
. a tab;
. the number of deleted lines;
. a tab;
. pathname (possibly with rename/copy information);
. a newline.
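As a rough illustration of the field layout above, the following Python sketch splits one line of plain `--numstat` output. The function name `parse_numstat_line` is hypothetical, and mapping a non-numeric count to `None` is an assumption of this sketch, not something the text above specifies.

```python
from typing import Optional, Tuple


def parse_numstat_line(line: str) -> Tuple[Optional[int], Optional[int], str]:
    """Split one line of --numstat output (without -z) into its three fields."""
    # Fields are: added <tab> deleted <tab> pathname <newline>.
    # maxsplit=2 leaves the pathname field untouched after the second tab.
    added, deleted, path = line.rstrip("\n").split("\t", 2)

    def to_int(field: str) -> Optional[int]:
        # Assumption of this sketch: a non-numeric count becomes None.
        return int(field) if field.isdigit() else None

    return to_int(added), to_int(deleted), path


rec = parse_numstat_line("3\t1\tarch/{i386 => x86}/Makefile\n")
print(rec)
```

Note that the pathname comes back exactly as displayed: a script cannot reliably tell `arch/{i386 => x86}/Makefile` apart from a file that really has that name, which is why this form suits human readers better than scripts.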
When the `-z` output option is in effect, the output is formatted this way:
----------------------------------------
1 2 README NUL
3 1 NUL arch/i386/Makefile NUL arch/x86/Makefile NUL
----------------------------------------
That is:
. the number of added lines;
. a tab;
. the number of deleted lines;
. a tab;
. a NUL (only exists if renamed/copied);
. pathname in preimage;
. a NUL (only exists if renamed/copied);
. pathname in postimage (only exists if renamed/copied);
. a NUL.
The extra `NUL` before the preimage path in the rename/copy case allows
scripts that read the output to tell whether the current record is a
single-path record or a rename/copy record without reading ahead.
After reading the added and deleted line counts, reading up to the next
`NUL` yields the pathname; if that pathname is empty (the `NUL` comes
immediately), the record is a rename/copy and two paths follow.
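The record layout for `-z` output can be read with a small loop that never looks ahead. The Python below is a minimal sketch under stated assumptions: `NumstatRecord` and `parse_numstat_z` are hypothetical names, the input is the whole output held in memory as bytes, and a non-numeric count is mapped to `None`.

```python
from typing import Iterator, NamedTuple, Optional


class NumstatRecord(NamedTuple):
    added: Optional[int]
    deleted: Optional[int]
    src: bytes            # preimage path (the only path if not renamed/copied)
    dst: Optional[bytes]  # postimage path, None for a single-path record


def _count(field: bytes) -> Optional[int]:
    # Assumption of this sketch: a non-numeric count becomes None.
    return int(field) if field.isdigit() else None


def parse_numstat_z(data: bytes) -> Iterator[NumstatRecord]:
    """Yield records from the bytes produced by --numstat -z."""
    pos = 0
    while pos < len(data):
        # added <tab> deleted <tab>
        tab1 = data.index(b"\t", pos)
        tab2 = data.index(b"\t", tab1 + 1)
        added = _count(data[pos:tab1])
        deleted = _count(data[tab1 + 1:tab2])
        pos = tab2 + 1

        # Read up to the next NUL; an empty result marks a rename/copy.
        nul = data.index(b"\0", pos)
        path = data[pos:nul]
        pos = nul + 1
        if path:
            yield NumstatRecord(added, deleted, path, None)
            continue

        # Rename/copy record: preimage NUL postimage NUL.
        nul = data.index(b"\0", pos)
        src = data[pos:nul]
        pos = nul + 1
        nul = data.index(b"\0", pos)
        dst = data[pos:nul]
        pos = nul + 1
        yield NumstatRecord(added, deleted, src, dst)


sample = b"1\t2\tREADME\x003\t1\t\x00arch/i386/Makefile\x00arch/x86/Makefile\x00"
records = list(parse_numstat_z(sample))
for r in records:
    print(r)
```

Paths are kept as bytes because `-z` output carries pathnames verbatim; decoding them is left to the caller.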