Prepare to implement incremental multi-pack indexes (MIDXs) over the
next several commits by first describing the relevant prerequisites
(like a new chunk in the MIDX format, the directory structure for
incremental MIDXs, etc.)
The format is described in detail in the patch contents below, but the
high-level description is as follows.
Incremental MIDXs live in $GIT_DIR/objects/pack/multi-pack-index.d, and
each `*.midx` within that directory has a single "parent" MIDX, which is
the MIDX layer immediately before it in the MIDX chain. The chain order
resides in a file 'multi-pack-index-chain' in the same directory.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of entries are added to the GitFAQ document.
* bc/gitfaq-more:
doc: mention that proxies must be completely transparent
gitfaq: add entry about syncing working trees
gitfaq: give advice on using eol attribute in gitattributes
gitfaq: add documentation on proxies
The http transport can now be told to send request with
authentication material without first getting a 401 response.
* bc/http-proactive-auth:
http: allow authenticating proactively
A new warning message is issued when a command has to expand a
sparse index to handle working tree cruft that are outside of the
sparse checkout.
* ds/advice-sparse-index-expansion:
advice: warn when sparse index expands
Address-looking strings found on the trailer are now placed on the
Cc: list after running through sanitize_address by "git send-email".
* cb/send-email-sanitize-trailer-addresses:
git-send-email: use sanitized address when reading mbox body
The "ort" merge backend saw one bugfix for a crash that happens
when inner merge gets killed, and assorted code clean-ups.
* en/ort-inner-merge-error-fix:
merge-ort: fix missing early return
merge-ort: convert more error() cases to path_msg()
merge-ort: upon merge abort, only show messages causing the abort
merge-ort: loosen commented requirements
merge-ort: clearer propagation of failure-to-function from merge_submodule
merge-ort: fix type of local 'clean' var in handle_content_merge ()
merge-ort: maintain expected invariant for priv member
merge-ort: extract handling of priv member into reusable function
A test in reftable library has been rewritten using the unit test
framework.
* cp/unit-test-reftable-record:
t-reftable-record: add tests for reftable_log_record_compare_key()
t-reftable-record: add tests for reftable_ref_record_compare_name()
t-reftable-record: add index tests for reftable_record_is_deletion()
t-reftable-record: add obj tests for reftable_record_is_deletion()
t-reftable-record: add log tests for reftable_record_is_deletion()
t-reftable-record: add ref tests for reftable_record_is_deletion()
t-reftable-record: add comparison tests for obj records
t-reftable-record: add comparison tests for index records
t-reftable-record: add comparison tests for ref records
t-reftable-record: add reftable_record_cmp() tests for log records
t: move reftable/record_test.c to the unit testing framework
"git push" that pushes only deletion gave an unnecessary and
harmless error message when push negotiation is configured, which
has been corrected.
* jc/disable-push-nego-for-deletion:
push: avoid showing false negotiation errors
Custom control structures we invented more recently have been
taught to the clang-format file.
* rs/clang-format-updates:
clang-format: include kh_foreach* macros in ForEachMacros
GitWeb update to use committer date consistently in rss/atom feeds.
* am/gitweb-feed-use-committer-date:
gitweb: rss/atom change published/updated date to committer date
Test suite has been taught not to unnecessarily rely on DNS failing
a bogus external name.
* jk/tests-without-dns:
t/lib-bundle-uri: use local fake bundle URLs
t5551: do not confirm that bogus url cannot be used
t5553: use local url for invalid fetch
An existing test of oidmap API has been rewritten with the
unit-test framework.
* gt/unit-test-oidmap:
t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
"git describe --dirty --broken" forgot to refresh the index before
seeing if there is any chang, ("git describe --dirty" correctly did
so), which has been corrected.
* as/describe-broken-refresh-index-fix:
describe: refresh the index when 'broken' flag is used
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry. We know that while the
FAQ is very useful, users sometimes are less likely to read in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes. Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.
The technique that many users are using is their preferred cloud syncing
service, which is a bad idea. Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly. They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.
We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools. Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky. Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that. While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute. As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.
Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.
In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between. The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many corporate environments and local systems have proxies in use. Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git. Explicitly call out certain classes that are known to routinely
have problems reported various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.
[0] https://issues.chromium.org/issues/40285192
[1] https://faculty.cc.gatech.edu/~mbailey/publications/ndss17_interception.pdf
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion). As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.
So let's make sure we start these scripts with either one of these:
#!/bin/sh
#!/usr/bin/env bash
Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to deal with modified paths that are out-of-cone in a
sparsely checked out working tree has been optimized.
* ds/sparse-lstat-caching:
sparse-index: improve lstat caching of sparse paths
sparse-index: count lstat() calls
sparse-index: use strbuf in path_found()
sparse-index: refactor path_found()
sparse-checkout: refactor skip worktree retry logic
When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.
* xx/bundie-uri-fixes:
unbundle: extend object verification for fetches
fetch-pack: expose fsckObjects configuration logic
bundle-uri: verify oid before writing refs
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.
* tb/path-filter-fix:
bloom: introduce `deinit_bloom_filters()`
commit-graph: reuse existing Bloom filters where possible
object.h: fix mis-aligned flag bits table
commit-graph: new Bloom filter version that fixes murmur3
commit-graph: unconditionally load Bloom filters
bloom: prepare to discard incompatible Bloom filters
bloom: annotate filters with hash version
repo-settings: introduce commitgraph.changedPathsVersion
t4216: test changed path filters with high bit paths
t/helper/test-read-graph: implement `bloom-filters` mode
bloom.h: make `load_bloom_filter_from_graph()` public
t/helper/test-read-graph.c: extract `dump_graph_info()`
gitformat-commit-graph: describe version 2 of BDAT
commit-graph: ensure Bloom filters are read with consistent settings
revision.c: consult Bloom filters for root commits
t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
date parser updates to be more careful about underflowing epoch
based timestamp.
* db/date-underflow-fix:
date: detect underflow/overflow when parsing dates with timezone offset
t0006: simplify prerequisites
When GIT_PAGER failed to spawn, depending on the code path taken,
we failed immediately (correct) or just spew the payload to the
standard output (incorrect). The code now always fail immediately
when GIT_PAGER fails.
* rj/pager-die-upon-exec-failure:
pager: die when paging to non-existing command
"git archive --add-virtual-file=<path>:<contents>" never paid
attention to the --prefix=<prefix> option but the documentation
said it would. The documentation has been corrected.
* jc/archive-prefix-with-add-virtual-file:
archive: document that --add-virtual-file takes full path
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.
When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:
1. Users may not have a full understanding of which files are inside or
outside of their sparse-checkout. This is more common in monorepos
that manage the sparse-checkout using custom tools that map build
dependencies into sparse-checkout definitions.
2. In some cases, an empty directory could exist outside the
sparse-checkout and these empty directories are not reported by 'git
status' and friends.
3. If the user has '.gitignore' or 'exclude' files, then 'git status'
will squelch the warnings and not demonstrate any problems.
In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.
The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.
The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The number to be displayed is calculated by the following defined in
object.h:
#define REV_SHIFT 2
#define MAX_REVS (FLAG_BITS - REV_SHIFT)
FLAG_BITS is currently 28, so 26 is the correct number.
Signed-off-by: Rikita Ishikawa <lagrange.resolvent@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The author date is used for published/updated date in the rss/atom
feed stream. Change it to the committer date that reflects the
"published/updated" definition better and makes rss/atom feeds more
linear. Gitlab/Github rss/atom feeds use the committer date.
Additionally, to be consistent, also use the committer date to
determine the date of the last commit to send in the feed
instead of the author date.
Signed-off-by: Jesús Ariel Cabello Mateos <080ariel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/j6t/git-gui:
git-gui: fix inability to quit after closing another instance
git-gui: sv.po: Update Swedish translation (576t0f0u)
git-gui: note the new maintainer
Makefile(s): do not enforce "all indents must be done with tab"
Makefile(s): avoid recipe prefix in conditional statements
doc: switch links to https
doc: update links to current pages
git-gui: po: fix typo in French "aperçu"
The problem can be reproduced on Linux with this sequence:
1. Run git gui from a terminal.
2. Edit the commit message and wait for at least 2 seconds.
3. Terminate the instance from the terminal, for example with Ctrl-C,
to simulate crash. This leaves the file .git/GITGUI_BCK behind.
4. Start two instances of git gui &.
At this point the first instance can be closed (it renames
.git/GITGUI_BCK to .git/GITGUI_MSG), but the seconds brings an error
message about the absent file and cannot be closed thereafter and must
be killed from the command line.
The renaming that happens by the first instance is the correct action
and need not be repeated by the second instance. It is the correct
action to ignore the failed renaming.
On the other hand, the second instance could just edit the commit
message again, wait 2 seconds to write GITGUI_BCK, and then can be
closed without failing. At this point, since the user has edited the
message, it is again correct to preserve the edited version in
GITGUI_MSG.
* os/catch-rename:
git-gui: fix inability to quit after closing another instance
The command for generating the list of ForEachMacros searches for
macros whose name contains the string "for_each". Include those whose
name contains "foreach" as well. That brings in kh_foreach and
kh_foreach_value from khash.h.
Regenerating the list also brings in hashmap-based macros added by
87571c3f71 (hashmap: use *_entry APIs for iteration, 2019-10-06),
f0e63c4113 (hashmap: use *_entry APIs to wrap container_of, 2019-10-06),
4fa1d501f7 (strmap: add functions facilitating use as a string->int map,
2020-11-05), b70c82e6ed (strmap: add more utility functions,
2020-11-05), and 1201eb628a (strmap: add a strset sub-type, 2020-11-06).
for_each_abbrev is no longer found because its definition was removed by
d850b7a545 (cocci: apply the "cache.h" part of "the_repository.pending",
2023-03-28). Note that it had been a false positive, though, as it had
been a function wrapper, not a for-like macro.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ebd2e4a13a (Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format
better, 2021-09-28), we tightened our Makefile's behavior to only enable
-Wpedantic when compiling with either gcc5/clang4 or greater as older
compiler versions did not have support for -Wpedantic.
Commit ebd2e4a13a was looking for either "gcc5" or "clang4" to appear in
the COMPILER_FEATURES variable, combining the two "$(filter ...)"
searches with an "$(or ...)".
But ebd2e4a13a has a typo where instead of writing:
ifneq ($(or ($filter ...),$(filter ...)),)
we wrote:
ifneq (($or ($filter ...),$(filter ...)),)
Causing our Makefile (when invoked with DEVELOPER=1, and a sufficiently
recent compiler version) to barf:
$ make DEVELOPER=1
config.mak.dev:13: extraneous text after 'ifneq' directive
[...]
Correctly combine the results of the two "$(filter ...)" operations by
using "$(or ...)", not "$or".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the conversions in f19b9165 (merge-ort: convert more error()
cases to path_msg(), 2024-06-19) accidentally lost the early return.
Restore it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.
Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.
In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiate-tip" parameters
that stops the negotiate-only fetch from being run, which by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.
Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with. One fewer process spawned, one
fewer "alarming" message given the user.
Signed-off-by: Junio C Hamano <gitster@pobox.com>