This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.
When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.
However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:
- Point references at the reachability tips of any objects you
consider precious, which may be undesirable or infeasible if there
are many such objects.
- Track them via the reflog, which may be undesirable since the
reflog's lifetime is limited to that of the reference it's tracking
(and callers may want to keep those unreachable objects around for
longer).
- Extend the grace period, which may keep around other objects that
the caller *does* want to discard.
- Manually modify the mtimes of objects you want to keep. If those
objects are already loose, this is easy enough to do (you can just
enumerate and `touch -m` each one).
But if they are packed, you will either end up modifying the mtimes
of *all* objects in that pack, or be forced to write out a loose
copy of that object, both of which may be undesirable. Even worse,
if they are in a cruft pack, that requires modifying its `*.mtimes`
file by hand, since there is no exposed plumbing for this.
- Force the caller to construct the pack of objects they want
to keep themselves, and then mark the pack as kept by adding a
".keep" file. This works, but is burdensome for the caller, and
having extra packs is awkward as you roll forward your cruft pack.
This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.
The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.
This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".
Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:
- If generating a cruft pack, the caller is able to retain additional
objects via the cruft pack, even if they would have otherwise been
pruned due to their age.
- If not generating a cruft pack, the caller is likewise able to
retain additional objects as loose.
A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.
But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.
Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
159 lines
6.6 KiB
Plaintext
159 lines
6.6 KiB
Plaintext
gc.aggressiveDepth::
|
|
The depth parameter used in the delta compression
|
|
algorithm used by 'git gc --aggressive'. This defaults
|
|
to 50, which is the default for the `--depth` option when
|
|
`--aggressive` isn't in use.
|
|
+
|
|
See the documentation for the `--depth` option in
|
|
linkgit:git-repack[1] for more details.
|
|
|
|
gc.aggressiveWindow::
|
|
The window size parameter used in the delta compression
|
|
algorithm used by 'git gc --aggressive'. This defaults
|
|
to 250, which is a much more aggressive window size than
|
|
the default `--window` of 10.
|
|
+
|
|
See the documentation for the `--window` option in
|
|
linkgit:git-repack[1] for more details.
|
|
|
|
gc.auto::
|
|
When there are approximately more than this many loose
|
|
objects in the repository, `git gc --auto` will pack them.
|
|
Some Porcelain commands use this command to perform a
|
|
light-weight garbage collection from time to time. The
|
|
default value is 6700.
|
|
+
|
|
Setting this to 0 disables not only automatic packing based on the
|
|
number of loose objects, but any other heuristic `git gc --auto` will
|
|
otherwise use to determine if there's work to do, such as
|
|
`gc.autoPackLimit`.
|
|
|
|
gc.autoPackLimit::
|
|
When there are more than this many packs that are not
|
|
marked with `*.keep` file in the repository, `git gc
|
|
--auto` consolidates them into one larger pack. The
|
|
default value is 50. Setting this to 0 disables it.
|
|
Setting `gc.auto` to 0 will also disable this.
|
|
+
|
|
See the `gc.bigPackThreshold` configuration variable below. When in
|
|
use, it'll affect how the auto pack limit works.
|
|
|
|
gc.autoDetach::
|
|
Make `git gc --auto` return immediately and run in background
|
|
if the system supports it. Default is true.
|
|
|
|
gc.bigPackThreshold::
|
|
If non-zero, all non-cruft packs larger than this limit are kept
|
|
when `git gc` is run. This is very similar to
|
|
`--keep-largest-pack` except that all non-cruft packs that meet
|
|
the threshold are kept, not just the largest pack. Defaults to
|
|
zero. Common unit suffixes of 'k', 'm', or 'g' are supported.
|
|
+
|
|
Note that if the number of kept packs is more than gc.autoPackLimit,
|
|
this configuration variable is ignored, all packs except the base pack
|
|
will be repacked. After this the number of packs should go below
|
|
gc.autoPackLimit and gc.bigPackThreshold should be respected again.
|
|
+
|
|
If the amount of memory estimated for `git repack` to run smoothly is
|
|
not available and `gc.bigPackThreshold` is not set, the largest pack
|
|
will also be excluded (this is the equivalent of running `git gc` with
|
|
`--keep-largest-pack`).
|
|
|
|
gc.writeCommitGraph::
|
|
If true, then gc will rewrite the commit-graph file when
|
|
linkgit:git-gc[1] is run. When using `git gc --auto`
|
|
the commit-graph will be updated if housekeeping is
|
|
required. Default is true. See linkgit:git-commit-graph[1]
|
|
for details.
|
|
|
|
gc.logExpiry::
|
|
If the file gc.log exists, then `git gc --auto` will print
|
|
its content and exit with status zero instead of running
|
|
unless that file is more than 'gc.logExpiry' old. Default is
|
|
"1.day". See `gc.pruneExpire` for more ways to specify its
|
|
value.
|
|
|
|
gc.packRefs::
|
|
Running `git pack-refs` in a repository renders it
|
|
unclonable by Git versions prior to 1.5.1.2 over dumb
|
|
transports such as HTTP. This variable determines whether
|
|
'git gc' runs `git pack-refs`. This can be set to `notbare`
|
|
to enable it within all non-bare repos or it can be set to a
|
|
boolean value. The default is `true`.
|
|
|
|
gc.cruftPacks::
|
|
Store unreachable objects in a cruft pack (see
|
|
linkgit:git-repack[1]) instead of as loose objects. The default
|
|
is `true`.
|
|
|
|
gc.pruneExpire::
|
|
When 'git gc' is run, it will call 'prune --expire 2.weeks.ago'
|
|
(and 'repack --cruft --cruft-expiration 2.weeks.ago' if using
|
|
cruft packs via `gc.cruftPacks` or `--cruft`). Override the
|
|
grace period with this config variable. The value "now" may be
|
|
used to disable this grace period and always prune unreachable
|
|
objects immediately, or "never" may be used to suppress pruning.
|
|
This feature helps prevent corruption when 'git gc' runs
|
|
concurrently with another process writing to the repository; see
|
|
the "NOTES" section of linkgit:git-gc[1].
|
|
|
|
gc.worktreePruneExpire::
|
|
When 'git gc' is run, it calls
|
|
'git worktree prune --expire 3.months.ago'.
|
|
This config variable can be used to set a different grace
|
|
period. The value "now" may be used to disable the grace
|
|
period and prune `$GIT_DIR/worktrees` immediately, or "never"
|
|
may be used to suppress pruning.
|
|
|
|
gc.reflogExpire::
|
|
gc.<pattern>.reflogExpire::
|
|
'git reflog expire' removes reflog entries older than
|
|
this time; defaults to 90 days. The value "now" expires all
|
|
entries immediately, and "never" suppresses expiration
|
|
altogether. With "<pattern>" (e.g.
|
|
"refs/stash") in the middle the setting applies only to
|
|
the refs that match the <pattern>.
|
|
|
|
gc.reflogExpireUnreachable::
|
|
gc.<pattern>.reflogExpireUnreachable::
|
|
'git reflog expire' removes reflog entries older than
|
|
this time and are not reachable from the current tip;
|
|
defaults to 30 days. The value "now" expires all entries
|
|
immediately, and "never" suppresses expiration altogether.
|
|
With "<pattern>" (e.g. "refs/stash")
|
|
in the middle, the setting applies only to the refs that
|
|
match the <pattern>.
|
|
+
|
|
These types of entries are generally created as a result of using `git
|
|
commit --amend` or `git rebase` and are the commits prior to the amend
|
|
or rebase occurring. Since these changes are not part of the current
|
|
project most users will want to expire them sooner, which is why the
|
|
default is more aggressive than `gc.reflogExpire`.
|
|
|
|
gc.recentObjectsHook::
|
|
When considering whether or not to remove an object (either when
|
|
generating a cruft pack or storing unreachable objects as
|
|
loose), use the shell to execute the specified command(s).
|
|
Interpret their output as object IDs which Git will consider as
|
|
"recent", regardless of their age. By treating their mtimes as
|
|
"now", any objects (and their descendants) mentioned in the
|
|
output will be kept regardless of their true age.
|
|
+
|
|
Output must contain exactly one hex object ID per line, and nothing
|
|
else. Objects which cannot be found in the repository are ignored.
|
|
Multiple hooks are supported, but all must exit successfully, else the
|
|
operation (either generating a cruft pack or unpacking unreachable
|
|
objects) will be halted.
|
|
|
|
gc.rerereResolved::
|
|
Records of conflicted merge you resolved earlier are
|
|
kept for this many days when 'git rerere gc' is run.
|
|
You can also use more human-readable "1.month.ago", etc.
|
|
The default is 60 days. See linkgit:git-rerere[1].
|
|
|
|
gc.rerereUnresolved::
|
|
Records of conflicted merge you have not resolved are
|
|
kept for this many days when 'git rerere gc' is run.
|
|
You can also use more human-readable "1.month.ago", etc.
|
|
The default is 15 days. See linkgit:git-rerere[1].
|