Commit Graph

8398 Commits

Author SHA1 Message Date
Vicent Marti
6b8fda2db1 pack-objects: use bitmaps when packing objects
In this patch, we use the bitmap API to perform the `Counting Objects`
phase in pack-objects, rather than a traditional walk through the object
graph. For a reasonably-packed large repo, the time to fetch and clone
is often dominated by the full-object revision walk during the Counting
Objects phase. Using bitmaps can reduce the CPU time required on the
server (and therefore start sending the actual pack data with less
delay).

For bitmaps to be used, the following must be true:

  1. We must be packing to stdout (as a normal `pack-objects` from
     `upload-pack` would do).

  2. There must be a .bitmap index containing at least one of the
     "have" objects that the client is asking for.

  3. Bitmaps must be enabled (they are enabled by default, but can be
     disabled by setting `pack.usebitmaps` to false, or by using
     `--no-use-bitmap-index` on the command-line).

If any of these is not true, we fall back to doing a normal walk of the
object graph.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository) that show the speedup produced by various
methods:

    [existing graph traversal]
    $ time git pack-objects --all --stdout --no-use-bitmap-index \
			    </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m44.111s
    user    0m42.396s
    sys     0m3.544s

    [bitmaps only, without partial pack reuse; note that
     pack reuse is automatic, so timing this required a
     patch to disable it]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m5.413s
    user    0m5.604s
    sys     0m1.804s

    [bitmaps with pack reuse (what you get with this patch)]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Reusing existing pack: 3237103, done.
    Total 3237103 (delta 0), reused 0 (delta 0)

    real    0m1.636s
    user    0m1.460s
    sys     0m0.172s

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Vicent Marti
0d4455a3ab documentation: add documentation for the bitmap format
This is the technical documentation for the JGit-compatible Bitmap v1
on-disk format.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Junio C Hamano
53b2a5ff30 Merge branch 'jk/name-pack-after-byte-representation'
Two packfiles that contain the same set of objects have
traditionally been named identically, but that made repacking a
repository that is already fully packed without any cruft with a
different packing parameter cumbersome. Update the convention to
name the packfile after the bytestream representation of the data,
not after the set of objects in it.

* jk/name-pack-after-byte-representation:
  pack-objects doc: treat output filename as opaque
  pack-objects: name pack files after trailer hash
  sha1write: make buffer const-correct
2013-12-27 14:58:19 -08:00
Junio C Hamano
6904f9aa5b Merge branch 'zk/difftool-counts'
Show the total number of paths and the number of paths shown so far
when "git difftool" prompts to launch an external diff tool, which
would give users some sense of progress.

* zk/difftool-counts:
  diff.c: fix some recent whitespace style violations
  difftool: display the number of files in the diff queue in the prompt
2013-12-27 14:58:13 -08:00
Junio C Hamano
2b0a564e02 Merge branch 'jk/pull-rebase-using-fork-point'
* jk/pull-rebase-using-fork-point:
  rebase: use reflog to find common base with upstream
  pull: use merge-base --fork-point when appropriate
2013-12-27 14:58:08 -08:00
Junio C Hamano
7cdebd8a20 Merge branch 'jc/push-refmap'
Make "git push origin master" update the same ref that would be
updated by our 'master' when "git push origin" (no refspecs) is run
while the 'master' branch is checked out, which makes "git push"
more symmetric to "git fetch" and more usable for the triangular
workflow.

* jc/push-refmap:
  push: also use "upstream" mapping when pushing a single ref
  push: use remote.$name.push as a refmap
  builtin/push.c: use strbuf instead of manual allocation
2013-12-27 14:57:50 -08:00
Jeff King
65ea9c3c3d cat-file: provide %(deltabase) batch format
It can be useful for debugging or analysis to see which
objects are stored as delta bases on top of others. This
information is available by running `git verify-pack`, but
that is extremely expensive (and is harder than necessary to
parse).

Instead, let's make it available as a cat-file query format,
which makes it fast and simple to get the bases for a subset
of the objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 11:54:26 -08:00
Samuel Bronson
6d8940b562 diff: add diff.orderfile configuration variable
diff.orderfile acts as a default for the -O command line option.

[sb: split up aw's original patch; rework tests and docs, treat option
as pathname]

Signed-off-by: Anders Waldenborg <anders@0x63.nu>
Signed-off-by: Samuel Bronson <naesten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-18 16:39:00 -08:00
Roberto Tyley
615b8f1a8d docs: add filter-branch notes on The BFG
The BFG is a tool specifically designed for the task of removing
unwanted data from Git repository history - a common use-case for which
git-filter-branch has been the traditional workhorse.

It's beneficial to let users know that filter-branch has an alternative
here:

* speed : The BFG is 10-50x faster
  http://rtyley.github.io/bfg-repo-cleaner/#speed
* complexity of configuration : filter-branch is a very flexible tool,
  but demands very careful usage in order to get the desired results
  http://rtyley.github.io/bfg-repo-cleaner/#examples

Obviously, filter-branch has it's advantages too - it permits very
complex rewrites, and doesn't require a JVM - but for the common
use-case of deleting unwanted data, it's helpful to users to be aware
that an alternative exists.

The BFG was released under the GPL in February 2013, and has since seen
widespread production use (The Guardian, RedHat, Google, UK Government
Digital Service), been tested against large repos (~300K commits, ~5GB
packfiles) and received significant positive feedback from users:

http://rtyley.github.io/bfg-repo-cleaner/#feedback

Signed-off-by: Roberto Tyley <roberto.tyley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-18 10:41:41 -08:00
Junio C Hamano
7794a680e6 Sync with 1.8.5.2
* maint:
  Git 1.8.5.2
  cmd_repack(): remove redundant local variable "nr_packs"
2013-12-17 14:12:17 -08:00
Junio C Hamano
b10cd577d8 Update draft release notes to 1.9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 14:05:50 -08:00
Junio C Hamano
173473c287 Merge branch 'kn/gitweb-extra-branch-refs'
Allow gitweb to be configured to show refs out of refs/heads/ as if
they were branches.

* kn/gitweb-extra-branch-refs:
  gitweb: Denote non-heads, non-remotes branches
  gitweb: Add a feature for adding more branch refs
  gitweb: Return 1 on validation success instead of passed input
  gitweb: Move check-ref-format code into separate function
2013-12-17 12:03:33 -08:00
Junio C Hamano
0067272999 Merge branch 'bc/doc-merge-no-op-revert'
* bc/doc-merge-no-op-revert:
  Documentation: document pitfalls with 3-way merge
2013-12-17 11:47:01 -08:00
Junio C Hamano
4d1826d1d9 Merge branch 'fc/trivial'
* fc/trivial:
  remote: fix status with branch...rebase=preserve
  fetch: add missing documentation
  t: trivial whitespace cleanups
  abspath: trivial style fix
2013-12-17 11:46:32 -08:00
Junio C Hamano
f9633716d0 Merge branch 'kb/doc-exclude-directory-semantics'
* kb/doc-exclude-directory-semantics:
  gitignore.txt: clarify recursive nature of excluded directories
2013-12-17 11:44:19 -08:00
Junio C Hamano
5512ac5840 Git 1.8.5.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 11:42:12 -08:00
Junio C Hamano
59f3e3f1e2 Merge branch 'rs/doc-submitting-patches' into maint
* rs/doc-submitting-patches:
  SubmittingPatches: document how to handle multiple patches
2013-12-17 11:38:23 -08:00
Junio C Hamano
5169f5a484 Merge branch 'tr/doc-git-cherry' into maint
* tr/doc-git-cherry:
  Documentation: revamp git-cherry(1)
2013-12-17 11:37:55 -08:00
Junio C Hamano
212607494d Merge branch 'nd/glossary-content-pathspec-markup' into maint
* nd/glossary-content-pathspec-markup:
  glossary-content.txt: fix documentation of "**" patterns
2013-12-17 11:36:54 -08:00
Junio C Hamano
c8394bb466 Merge branch 'jj/doc-markup-gitcli' into maint
* jj/doc-markup-gitcli:
  Documentation/gitcli.txt: fix double quotes
2013-12-17 11:36:38 -08:00
Junio C Hamano
5712dcb209 Merge branch 'jj/doc-markup-hints-in-coding-guidelines' into maint
* jj/doc-markup-hints-in-coding-guidelines:
  State correct usage of literal examples in man pages in the coding standards
2013-12-17 11:36:10 -08:00
Junio C Hamano
ace08c2239 Merge branch 'jj/log-doc' into maint
* jj/log-doc:
  Documentation/git-log.txt: mark-up fix and minor rephasing
  Documentation/git-log: update "--log-size" description
2013-12-17 11:35:41 -08:00
Junio C Hamano
7be001dfbf Merge branch 'jj/rev-list-options-doc' into maint
* jj/rev-list-options-doc:
  Documentation/rev-list-options.txt: fix some grammatical issues and typos
  Documentation/rev-list-options.txt: fix mark-up
2013-12-17 11:34:41 -08:00
Junio C Hamano
e8fcf70cd4 Merge branch 'tb/doc-fetch-pack-url' into maint
* tb/doc-fetch-pack-url:
  git-fetch-pack uses URLs like git-fetch
2013-12-17 11:34:24 -08:00
Junio C Hamano
a4a227a725 Merge branch 'mi/typofixes' into maint
* mi/typofixes:
  contrib: typofixes
  Documentation/technical/http-protocol.txt: typofixes
  typofixes: fix misspelt comments
2013-12-17 11:34:01 -08:00
Jeff King
40a4f5a7bf pack-objects doc: treat output filename as opaque
After 1190a1a (pack-objects: name pack files after trailer hash,
2013-12-05), the SHA-1 used to determine the filename is calculated
differently.  Update the documentation to not guarantee anything more
than that the SHA-1 depends on the pack content somehow.

Hopefully this will discourage readers from depending on the old or
the new calculation.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 11:36:09 -08:00
Junio C Hamano
d7aced95cd Update draft release notes to 1.9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 14:24:39 -08:00
Junio C Hamano
72911f8c18 Merge branch 'cn/thin-push-capability'
Allow receive-pack to insist on receiving a fat pack from "git
push" clients.

* cn/thin-push-capability:
  send-pack: don't send a thin pack to a server which doesn't support it
2013-12-12 14:20:32 -08:00
Junio C Hamano
577aed296a Merge branch 'jk/remove-deprecated'
* jk/remove-deprecated:
  stop installing git-tar-tree link
  peek-remote: remove deprecated alias of ls-remote
  lost-found: remove deprecated command
  tar-tree: remove deprecated command
  repo-config: remove deprecated alias for "git config"
2013-12-12 14:18:34 -08:00
Junio C Hamano
fca26a3430 Merge branch 'rs/doc-submitting-patches'
* rs/doc-submitting-patches:
  SubmittingPatches: document how to handle multiple patches
2013-12-12 14:18:29 -08:00
Junio C Hamano
71fe59f880 Merge branch 'tr/doc-git-cherry'
* tr/doc-git-cherry:
  Documentation: revamp git-cherry(1)
2013-12-12 14:18:24 -08:00
Junio C Hamano
e66ef7ae6f Merge branch 'mh/fetch-tags-in-addition-to-normal-refs'
The "--tags" option to "git fetch" used to be literally a synonym to
a "refs/tags/*:refs/tags/*" refspec, which meant that (1) as an
explicit refspec given from the command line, it silenced the lazy
"git fetch" default that is configured, and (2) also as an explicit
refspec given from the command line, it interacted with "--prune"
to remove any tag that the remote we are fetching from does not
have.

This demotes it to an option; with it, we fetch all tags in
addition to what would be fetched without the option, and it does
not interact with the decision "--prune" makes to see what
remote-tracking refs the local has are missing the remote
counterpart.

* mh/fetch-tags-in-addition-to-normal-refs: (23 commits)
  fetch: improve the error messages emitted for conflicting refspecs
  handle_duplicate(): mark error message for translation
  ref_remote_duplicates(): extract a function handle_duplicate()
  ref_remove_duplicates(): simplify loop logic
  t5536: new test of refspec conflicts when fetching
  ref_remove_duplicates(): avoid redundant bisection
  git-fetch.txt: improve description of tag auto-following
  fetch-options.txt: simplify ifdef/ifndef/endif usage
  fetch, remote: properly convey --no-prune options to subprocesses
  builtin/remote.c:update(): use struct argv_array
  builtin/remote.c: reorder function definitions
  query_refspecs(): move some constants out of the loop
  fetch --prune: prune only based on explicit refspecs
  fetch --tags: fetch tags *in addition to* other stuff
  fetch: only opportunistically update references based on command line
  get_expanded_map(): avoid memory leak
  get_expanded_map(): add docstring
  builtin/fetch.c: reorder function definitions
  get_ref_map(): rename local variables
  api-remote.txt: correct section "struct refspec"
  ...
2013-12-12 14:14:10 -08:00
Krzesimir Nowak
8d646a9bac gitweb: Add a feature for adding more branch refs
Allow extra-branch-refs feature to tell gitweb to show refs from
additional hierarchies in addition to branches in the list-of-branches
view.

Signed-off-by: Krzesimir Nowak <krzesimir@endocode.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 12:37:37 -08:00
Christian Couder
34a332221c Documentation/git-replace: describe --format option
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:53:49 -08:00
Nguyễn Thái Ngọc Duy
82fba2b9d3 git-clone.txt: remove shallow clone limitations
Now that git supports data transfer from or to a shallow clone, these
limitations are not true anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:19 -08:00
Nguyễn Thái Ngọc Duy
eab3296c7e prune: clean .git/shallow after pruning objects
This patch teaches "prune" to remove shallow roots that are no longer
reachable from any refs (e.g. when the relevant refs are removed).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:19 -08:00
Nguyễn Thái Ngọc Duy
16094885ca smart-http: support shallow fetch/clone
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy
0a1bc12b6e receive-pack: allow pushes that update .git/shallow
The basic 8 steps to update .git/shallow does not fully apply here
because the user may choose to accept just a few refs (while fetch
always accepts all refs). The steps are modified a bit.

1-6. same as before. After calling assign_shallow_commits_to_refs at
   step 6, each shallow commit has a bitmap that marks all refs that
   require it.

7. mark all "ours" shallow commits that are reachable from any
   refs. We will need to do the original step 7 on them later.

8. go over all shallow commit bitmaps, mark refs that require new
   shallow commits.

9. setup a strict temporary shallow file to plug all the holes, even
   if it may cut some of our history short. This file is used by all
   hooks. The hooks could use --shallow-file=$GIT_DIR/shallow to
   overcome this and reach everything in current repo.

10. go over the new refs one by one. For each ref, do the reachability
   test if it needs a shallow commit on the list from step 7. Remove
   it if it's reachable from our refs. Gather all required shallow
   commits, run check_everything_connected() with the new ref, then
   install them to .git/shallow.

This mode is disabled by default and can be turned on with
receive.shallowupdate

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy
5dbd767601 receive/send-pack: support pushing from a shallow clone
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
48d25cae22 fetch: add --update-shallow to accept refs that update .git/shallow
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
79d3a236c5 upload-pack: make sure deepening preserves shallow roots
When "fetch --depth=N" where N exceeds the longest chain of history in
the source repo, usually we just send an "unshallow" line to the
client so full history is obtained.

When the source repo is shallow we need to make sure to "unshallow"
the current shallow point _and_ "shallow" again when the commit
reaches its shallow bottom in the source repo.

This should fix both cases: large <N> and --unshallow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
ad491366de make the sender advertise shallow commits to the receiver
If either receive-pack or upload-pack is called on a shallow
repository, shallow commits (*) will be sent after the ref
advertisement (but before the packet flush), so that the receiver has
the full "shape" of the sender's commit graph. This will be needed for
the receiver to update its .git/shallow if necessary.

This breaks the protocol for all clients trying to push to a shallow
repo, or fetch from one. Which is basically the same end result as
today's "is_repository_shallow() && die()" in receive-pack and
upload-pack. New clients will be made aware of shallow upstream and
can make use of this information.

The sender must send all shallow commits that are sent in the
following pack. It may send more shallow commits than necessary.

upload-pack for example may choose to advertise no shallow commits if
it knows in advance that the pack it's going to send contains no
shallow commits. But upload-pack is the server, so we choose the
cheaper way, send full .git/shallow and let the client deal with it.

Smart HTTP is not affected by this patch. Shallow support on
smart-http comes later separately.

(*) A shallow commit is a commit that terminates the revision
    walker. It is usually put in .git/shallow in order to keep the
    revision walker from going out of bound because there is no
    guarantee that objects behind this commit is available.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
John Keeping
ad8261d212 rebase: use reflog to find common base with upstream
Commit 15a147e (rebase: use @{upstream} if no upstream specified,
2011-02-09) says:

	Make it default to 'git rebase @{upstream}'. That is also what
	'git pull [--rebase]' defaults to, so it only makes sense that
	'git rebase' defaults to the same thing.

but that isn't actually the case.  Since commit d44e712 (pull: support
rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
chosen the most recent reflog entry which is an ancestor of the current
branch if it can find one.

Add a '--fork-point' argument to git-rebase that can be used to trigger
this behaviour.  This option is turned on by default if no non-option
arguments are specified on the command line, otherwise we treat an
upstream specified on the command-line literally.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 10:56:30 -08:00
brian m. carlson
c566500032 Documentation: document pitfalls with 3-way merge
Oftentimes people will make the same change in two branches, revert the change
in one branch, and then be surprised when a merge reinstitutes that change when
the branches are merged.  Add an explanatory paragraph that explains that this
occurs and the reason why, so people are not surprised.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 13:42:40 -08:00
Felipe Contreras
379484b551 fetch: add missing documentation
There's no mention of the 'origin' default, or the fact that the
upstream tracking branch remote is used.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 13:24:33 -08:00
Karsten Blees
59856de171 gitignore.txt: clarify recursive nature of excluded directories
Additionally, precedence of negated patterns is exactly as outlined in
the DESCRIPTION section, we don't need to repeat this.

Signed-off-by: Karsten Blees <blees@dcon.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 10:55:48 -08:00
Zoltan Klinger
ee7fb0b1d4 difftool: display the number of files in the diff queue in the prompt
When --prompt option is set, git-difftool displays a prompt for each
modified file to be viewed in an external diff program.  At that
point, it could be useful to display a counter and the total number
of files in the diff queue.

Below is the current difftool prompt for the first of 5 modified files:

    Viewing: 'diff.c'
    Launch 'vimdiff' [Y/n]:

Consider the modified prompt:

    Viewing (1/5): 'diff.c'
    Launch 'vimdiff' [Y/n]:

The current GIT_EXTERNAL_DIFF mechanism does not tell the number of
paths in the diff queue nor the current counter.  To make this
"counter/total" info available for GIT_EXTERNAL_DIFF programs
without breaking existing ones by doing the following:

 - Keep track of the number of paths shown so far in diff_options;

 - Export two new environment variables from run_external_diff() to
   show the total number of paths (from diff_queue_struct) and the
   current value of the counter (from diff_options); and

 - Update git-difftool--helper to use these two environment variables.

Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 14:00:27 -08:00
Nguyễn Thái Ngọc Duy
ef79b1f870 Support pathspec magic :(exclude) and its short form :!
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 13:00:39 -08:00
Nguyễn Thái Ngọc Duy
8b7cb51a9d glossary-content.txt: rephrase magic signature part
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 13:00:38 -08:00
Junio C Hamano
077f43447c Start 1.9 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 11:20:04 -08:00