Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
the maintenance tasks are performed in the background. git-gc(1) has
some special logic though to not perform _all_ housekeeping tasks in the
background: both references and reflogs are still handled synchronously
in the foreground.
This split exists because otherwise it may easily happen that git-gc(1)
keeps the "packed-refs" file locked for an extended amount of time,
where the next Git command that wants to modify any reference could now
fail. This was especially important in the past, where git-gc(1) was
still executed directly as part of our automatic maintenance: git-gc(1)
was invoked via `git gc --auto --detach`, so we knew to handle most of
the maintenance tasks in the background while doing those parts that may
cause locking issues in the foreground.
We have since moved to git-maintenance(1), which is a more flexible
replacement for git-gc(1). By default this command runs git-gc(1), only,
but it can be configured to run different tasks, as well. This command
does not know about the split between maintenance tasks that should run
before and after detach though, and this has led to several bug reports
about spurious locking errors for the "packed-refs" file.
Prepare for a fix by introducing this split for maintenance tasks. Note
that this commit does not yet change any of the tasks, so there should
not (yet) be a change in behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The typedefs for `maintenance_task_fn` and `maintenance_auto_fn` are
somewhat confusingly not true function pointers. As such, any user of
those typedefs needs to manually add the pointer to make use of them.
Fix this by making these true function pointers.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract the function to run maintenance tasks. This function will be
reused in a subsequent commit where we introduce a split between
maintenance tasks that run before and after daemonizing the process.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When configuring maintenance tasks run by git-maintenance(1) we do so by
modifying the global array of tasks directly. This is already quite bad
on its own, as global state makes for logic that is hard to follow.
Even more importantly though we use multiple different fields to track
whether or not a task should be run:
- "enabled" tracks the "maintenance.*.enabled" config key. This field
disables execution of a task, unless the user has explicitly asked
for the task.
- "selected_order" tracks the order in which jobs have been asked for
by the user via the "--task=" command line option. It overrides
everything else, but only has an effect if at least one job has been
selected.
- "schedule" tracks the schedule priority for a job, that is how often
it should run. This field only plays a role when the user has passed
the "--schedule=" command line option.
All of this makes it non-trivial to figure out which job really should
be running right now. The logic to configure these fields and the logic
that interprets them is distributed across multiple functions, making it
even harder to follow it.
Refactor the logic so that we stop modifying global state. Instead, we
now compute which jobs should be run in `initialize_task_config()`,
represented as an array of jobs to run that is stored in the options
structure. Like this, all logic becomes self-contained and any users of
this array only need to iterate through the tasks and execute them one
by one.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--task=" option explicitly allows the user to say which maintenance
tasks should be run, whereas "--schedule=" only respects the maintenance
strategy configured for a specific repository. As such, it is not
sensible to accept both options at the same time.
Mark them as incompatible with one another. While at it, also convert
the existing logic that marks "--auto" and "--schedule=" as incompatible
to use `die_for_incompatible_opt2()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users of git-maintenance(1) can explicitly ask it to run specific tasks
by passing the `--task=` command line option. This option can be passed
multiple times, which causes us to execute tasks in the same order as
the tasks have been provided by the user.
The order in which tasks are run is computed in `task_option_parse()`:
every time we parse such a command line argument, we modify the global
array of tasks by seting the selected index for that specific task.
This has two downsides:
- We modify global state, which makes it hard to follow the logic.
- The configuration of tasks is split across multiple different
functions, so it is not easy to figure out the different factors
that play a role in selecting tasks.
Refactor the logic so that `task_option_parse()` does not modify global
state anymore. Instead, this function now only collects the list of
configured tasks. The logic to configure ordering of the respective
tasks is then deferred to `initialize_task_config()`.
This refactoring solves the second problem, that the configuration of
tasks is spread across multiple different locations. The first problem,
that we modify global state, will be fixed in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two different variables that track the quietness for git-gc(1):
- The local variable `quiet`, which we wire up.
- The `quiet` field of `struct maintenance_run_opts`.
This leads to confusion which of these variables should be used and what
the respective effect is.
Simplify this logic by dropping the local variable in favor of the
options field.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the array of maintenance tasks to use designated field
initializers. This makes it easier to add more fields to the struct
without having to modify all tasks.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dependency on the_repository variable has been reduced from the
code paths in "git replay".
* en/replay-wo-the-repository:
replay: replace the_repository with repo parameter passed to cmd_replay ()
Teach "git send-email" to also consult `hostname -f` for mail
domain to compute the identity given to SMTP servers.
* ag/send-email-hostname-f:
send-email: try to get fqdn by running hostname -f on Linux and macOS
CI settings at GitLab has been updated to run MSVC based Meson job
automatically (as opposed to be done only upon manual request).
* ps/ci-gitlab-enable-msvc-meson-job:
gitlab-ci: always run MSVC-based Meson job
Two "scalar" subcommands that adds a repository that hasn't been
under "scalar"'s control are taught an option not to enable the
scheduled maintenance on it.
* ds/scalar-no-maintenance:
scalar reconfigure: improve --maintenance docs
scalar reconfigure: add --maintenance=<mode> option
scalar clone: add --no-maintenance option
scalar register: add --no-maintenance option
scalar: customize register_dir()'s behavior
win+Meson CI pipeline, unlike other pipelines for Windows,
used to build artifacts in develper mode, which has been changed to
build them in release mode for consistency.
* js/ci-build-win-in-release-mode:
ci(win+Meson): build in Release mode
Performance regression in not-yet-released code has been corrected.
* ps/reftable-read-block-perffix:
reftable: fix perf regression when reading blocks of unwanted type
Code cleanup.
* jk/oidmap-cleanup:
raw_object_store: drop extra pointer to replace_map
oidmap: add size function
oidmap: rename oidmap_free() to oidmap_clear()
The `send-email` documentation has been updated with OAuth2.0
related examples.
* ag/doc-send-email:
docs: add credential helper for outlook and gmail in OAuth list of helpers
docs: improve send-email documentation
send-mail: improve checks for valid_fqdn
Bundle-URI feature did not use refs recorded in the bundle other
than normal branches as anchoring points to optimize the follow-up
fetch during "git clone"; now it is told to utilize all.
* sc/bundle-uri-use-all-refs-in-bundle:
bundle-uri: add test for bundle-uri clones with tags
bundle-uri: copy all bundle references ino the refs/bundle space
Provide an overview of the set of functions used for manipulating
`json_writer`s, by describing what functions should be used for
each JSON-related task.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Helped-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make repository clean-up tasks "gc" can do available to "git
maintenance" front-end.
* ps/maintenance-missing-tasks:
builtin/maintenance: introduce "rerere-gc" task
builtin/gc: move rerere garbage collection into separate function
builtin/maintenance: introduce "worktree-prune" task
builtin/gc: move pruning of worktrees into a separate function
builtin/gc: remove global variables where it is trivial to do
builtin/gc: fix indentation of `cmd_gc()` parameters
The fallback implementation of open_nofollow() depended on
open("symlink", O_NOFOLLOW) to set errno to ELOOP, but a few BSD
derived systems use different errno, which has been worked around.
* cf/wrapper-bsd-eloop:
wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
Replace the_repository everywhere with repo, feed repo from cmd_replay()
to all the other functions in the file that need it, and remove the
UNUSED annotation on repo.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --maintenance option for 'scalar reconfigure' has three possible
values. Improve the documentation by specifying the option in the -h
help menu and usage information.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`hostname` is a popular command available on both Linux and macOS. As
per the man-page[1], `hostname -f` command returns the fully qualified
domain name (FQDN) of the system. The current Net::Domain perl module
being used in the script for the same has been quite unrealiable in many
cases. Thankfully, we now have a better check for valid_fqdn, which does
reject the invalid FQDNs given by this module properly, but at the same
time, it will result in a fallback to 'localhost.localdomain' being
used. `hostname -f` has been quite reliable (probably even more reliable
than the Net::Domain module) and before falling back to
'localhost.localdomain', we should try to use it. Interestingly, the
`hostname` command is actually used by perl modules like Net::Domain[2]
and Sys::Hostname[3] to get the hostname. So, lets give `hostname -f` a
chance as well!
[1]: https://man7.org/linux/man-pages/man1/hostname.1.html
[2]: https://github.com/Perl/perl5/blob/blead/cpan/libnet/lib/Net/Domain.pm#L88
[3]: https://github.com/Perl/perl5/blob/blead/ext/Sys-Hostname/Hostname.pm#L93
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add 'f?o'" did not add 'foo' if 'f?o', an unusual pathname,
also existed on the working tree, which has been corrected.
* kj/glob-path-with-special-char:
dir.c: literal match with wildcard in pathspec should still glob
Code clean-up around stale CI elements and building with Visual Studio.
* js/ci-buildsystems-cleanup:
config.mak.uname: drop the `vcxproj` target
contrib/buildsystems: drop support for building . vcproj/.vcxproj files
ci: stop linking the `prove` cache
With 7304bd2bc3 (ci: wire up Visual Studio build with Meson,
2025-01-22) we have introduced a CI job that builds and tests Git with
Microsoft Visual Studio via Meson. This job is only being executed by
default on GitHub Workflows though -- on GitLab CI it is marked as a
"manual" job, so the developer has to actively trigger these jobs.
The consequence of this split is that any breakage specific to this job
is only noticed by developers who mainly work with GitHub. Let's improve
this situation by also running the job by default on GitLab CI.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --minimal" used to give non-minimal output when its
optimization kicked in, which has been disabled.
* ng/xdiff-truly-minimal:
xdiff: disable cleanup_records heuristic with --minimal
"git index-pack --fix-thin" used to abort to prevent a cycle in
delta chains from forming in a corner case even when there is no
such cycle.
* ds/fix-thin-fix:
index-pack: allow revisiting REF_DELTA chains
t5309: create failing test for 'git index-pack'
test-tool: add pack-deltas helper
Further refinement on CI messages when an optional external
software is unavailable (e.g. due to third-party service outage).
* jc/ci-skip-unavailable-external-software:
ci: download JGit from maven, not eclipse.org
ci: update the message for unavailble third-party software
Further code clean-up in the object-store layer.
* ps/object-store-cleanup:
object-store: drop `repo_has_object_file()`
treewide: convert users of `repo_has_object_file()` to `has_object()`
object-store: allow fetching objects via `has_object()`
object-store: move function declarations to their respective subsystems
object-store: move and rename `odb_pack_keep()`
object-store: drop `loose_object_path()`
object-store: move `struct packed_git` into "packfile.h"