Commit Graph

21131 Commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
62f3a45bb4 t/lib-patch-mode.sh: fix ignored exit codes
Fix code added in b319ef70a9 (Add a small patch-mode testing library,
2009-08-13) to use &&-chaining.

This avoids losing both the exit code of a "git" and the "cat"
processes.

This fixes cases where we'd have e.g. missed memory leaks under
SANITIZE=leak, this code doesn't leak now as far as I can tell, but I
discovered it while looking at leaks in related code.

For "verify_saved_head()" we could make use of "test_cmp_rev" with
some changes, but it uses "git rev-parse --verify", and this existing
test does not. I think it could safely use it, but let's avoid the
while-at-it change, and narrowly fix the exit code problem.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:41 -08:00
Ævar Arnfjörð Bjarmason
fb18dd2831 auto-crlf tests: don't lose exit code in loops and outside tests
Change the functions which are called from within
"test_expect_success" to add the "|| return 1" idiom to their
for-loops, so we won't lose the exit code of "cp", "git" etc.

Then for those setup functions that aren't called from a
"test_expect_success" we need to put the setup code in a
"test_expect_success" as well. It would not be enough to properly
&&-chain these, as the calling code is the top-level script itself. As
we don't run the tests with "set -e" we won't report failing commands
at the top-level.

The "checkout" part of this would miss memory leaks under
SANITIZE=leak, this code doesn't leak (the relevant "git checkout"
leak has been fixed), but in a past version of git we'd continue past
this failure under SANITIZE=leak when these invocations had errored
out, even under "--immediate".

For checkout_files() we could run one test_expect_success() instead of
the 5 we run now in a loop.

But as this function added in [1] is already taking pains to split up
its setup into phases (there are 5 more "test_expect_success()" at the
end of it already, see [1]), let's follow that existing convention.

1. 343151dcbd (t0027: combinations of core.autocrlf, core.eol and text, 2014-06-29)

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:30:41 -08:00
Ævar Arnfjörð Bjarmason
20b813d7d3 add: remove "add.interactive.useBuiltin" & Perl "git add--interactive"
Since [1] first released with Git v2.37.0 the built-in version of "add
-i" has been the default. That built-in implementation was added in
[2], first released with Git v2.25.0.

At this point enough time has passed to allow for finding any
remaining bugs in this new implementation, so let's remove the
fallback code.

As with similar migrations for "stash"[3] and "rebase"[4] we're
keeping a mention of "add.interactive.useBuiltin" in the
documentation, but adding a warning() to notify any outstanding users
that the built-in is now the default. As with [5] and [6] we should
follow-up in the future and eventually remove that warning.

1. 0527ccb1b5 (add -i: default to the built-in implementation,
   2021-11-30)
2. f83dff60a7 (Start to implement a built-in version of `git add
   --interactive`, 2019-11-13)
3. 8a2cd3f512 (stash: remove the stash.useBuiltin setting,
   2020-03-03)
4. d03ebd411c (rebase: remove the rebase.useBuiltin setting,
   2019-03-18)
5. deeaf5ee07 (stash: remove documentation for `stash.useBuiltin`,
   2022-01-27)
6. 9bcde4d531 (rebase: remove transitory rebase.useBuiltin setting &
   env, 2021-03-23)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 15:03:34 -08:00
Kostya Farber
d912a603ed t5000: modernise archive and :(glob) test
To match present day coding guiding codelines let's:

- use <<-EOF, so we can indent all lines to the
  the same level for this test

- use <<\EOF to notify the reader that no interpolation
  is expected in the body

Signed-off-by: Kostya Farber <kostya.farber@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-06 14:14:20 -08:00
Johannes Schindelin
3aef76ffd4 Sync with 2.38.4
* maint-2.38:
  Git 2.38.4
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:39 +01:00
Johannes Schindelin
6487e9c459 Sync with 2.37.6
* maint-2.37:
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:28 +01:00
Johannes Schindelin
16004682f9 Sync with 2.36.5
* maint-2.36:
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:38:31 +01:00
Johannes Schindelin
40843216c5 Sync with 2.35.7
* maint-2.35:
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:37:52 +01:00
Johannes Schindelin
6a53a59bf9 Sync with 2.34.7
* maint-2.34:
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:44 +01:00
Johannes Schindelin
a7237f5ae9 Sync with 2.33.7
* maint-2.33:
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:16 +01:00
Johannes Schindelin
87248c5933 Sync with 2.32.6
* maint-2.32:
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:56 +01:00
Johannes Schindelin
aeb93d7da2 Sync with 2.31.7
* maint-2.31:
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:08 +01:00
Johannes Schindelin
e14d6b8408 Sync with 2.30.8
* maint-2.30:
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:24:06 +01:00
Junio C Hamano
a3033a68ac Merge branch 'ps/apply-beyond-symlink' into maint-2.30
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.

* ps/apply-beyond-symlink:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:12:16 +01:00
Junio C Hamano
2c6e5b32aa Merge branch 'en/rebase-incompatible-opts'
"git rebase" often ignored incompatible options instead of
complaining, which has been corrected.

* en/rebase-incompatible-opts:
  rebase: provide better error message for apply options vs. merge config
  rebase: put rebase_options initialization in single place
  rebase: fix formatting of rebase --reapply-cherry-picks option in docs
  rebase: clarify the OPT_CMDMODE incompatibilities
  rebase: add coverage of other incompatible options
  rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
  rebase: fix docs about incompatibilities with --root
  rebase: remove --allow-empty-message from incompatible opts
  rebase: flag --apply and --merge as incompatible
  rebase: mark --update-refs as requiring the merge backend
2023-02-03 16:08:21 -08:00
Patrick Steinhardt
fade728df1 apply: fix writing behind newly created symbolic links
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:

```
 $ git init repo
 Initialized empty Git repository in /tmp/repo/.git/
 $ cd repo/
 $ ln -s dir symlink
 $ git apply - <<EOF
 diff --git a/symlink/file b/symlink/file
 new file mode 100644
 index 0000000..e69de29
 EOF
 error: affected file 'symlink/file' is beyond a symbolic link
```

This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.

Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.

Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 14:41:31 -08:00
Jeff King
b08edf709d t/lib-httpd: increase ssl key size to 2048 bits
Recent versions of openssl will refuse to work with 1024-bit RSA keys,
as they are considered insecure. I didn't track down the exact version
in which the defaults were tightened, but the Debian-package openssl 3.0
on my system yields:

  $ LIB_HTTPD_SSL=1 ./t5551-http-fetch-smart.sh -v -i
  [...]
  SSL Library Error: error:0A00018F:SSL routines::ee key too small
  1..0 # SKIP web server setup failed

This could probably be overcome with configuration, but that's likely
to be a headache (especially if it requires touching /etc/openssl).
Let's just pick a key size that's less outrageously out of date.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
Jeff King
d113449e26 t/lib-httpd: drop SSLMutex config
The SSL config enabled by setting LIB_HTTPD_SSL does not work with
Apache versions greater than 2.2, as more recent versions complain about
the SSLMutex directive. According to
https://httpd.apache.org/docs/current/upgrading.html:

  Directives AcceptMutex, LockFile, RewriteLock, SSLMutex,
  SSLStaplingMutex, and WatchdogMutexPath have been replaced with a
  single Mutex directive. You will need to evaluate any use of these
  removed directives in your 2.2 configuration to determine if they can
  just be deleted or will need to be replaced using Mutex.

Deleting this line will just use the system default, which seems
sensible. The original came as part of faa4bc35a0 (http-push: add
regression tests, 2008-02-27), but no specific reason is given there (or
on the mailing list) for its presence.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
Jeff King
edd060dc84 t/lib-httpd: bump required apache version to 2.4
Apache 2.4 has been out since early 2012, almost 11 years. And its
predecessor, 2.2, has been out of support since its last release in
2017, over 5 years ago. The last mention on the mailing list was from
around the same time, in this thread:

  https://lore.kernel.org/git/20171231023234.21215-1-tmz@pobox.com/

We can probably assume that 2.4 is available everywhere. And the stakes
are fairly low, as the worst case is that such a platform would skip the
http tests.

This lets us clean up a few minor version checks in the config file, but
also revert f1f2b45be0 (tests: adjust the configuration for Apache 2.2,
2016-05-09). Its technique isn't _too_ bad, but certainly required a bit
more explanation than the 2.4 version it replaced. I manually confirmed
that the test in t5551 still behaves as expected (if you replace
"cadabra" with "foo", the server correctly rejects the request).

It will also help future patches which will no longer have to deal with
conditional config for this old version.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
Jeff King
d762617079 t/lib-httpd: bump required apache version to 2.2
Apache 2.2 was released in 2005, almost 18 years ago. We can probably
assume that people are running a version at least that old (and the
stakes for removing it are fairly low, as the worst case is that they
would not run the http tests against their ancient version).

Dropping support for the older versions cleans up the config file a
little, and will also enable us to bump the required version further
(with more cleanups) in a future patch.

Note that the file actually checks for version 2.1. In apache's
versioning scheme, odd numbered versions are for development and even
numbers are for stable releases. So 2.1 and 2.2 are effectively the same
from our perspective.

Older versions would just fail to start, which would generally cause us
to skip the tests. However, we do have version detection code in
lib-httpd.sh which produces a nicer error message, so let's update that,
too. I didn't bother handling the case of "3.0", etc. Apache has been on
2.x for 21 years, with no signs of bumping the major version.  And if
they eventually do, I suspect there will be enough breaking changes that
we'd need to update more than just the numeric version check. We can
worry about that hypothetical when it happens.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-01 10:10:34 -08:00
Derrick Stolee
026df9e047 bundle-uri: test missing bundles with heuristic
The creationToken heuristic uses a different mechanism for downloading
bundles from the "standard" approach. Specifically: it uses a concrete
order based on the creationToken values and attempts to download as few
bundles as possible. It also modifies local config to store a value for
future fetches to avoid downloading bundles, if possible.

However, if any of the individual bundles has a failed download, then
the logic for the ordering comes into question. It is important to avoid
infinite loops, assigning invalid creation token values in config, but
also to be opportunistic as possible when downloading as many bundles as
seem appropriate.

These tests were used to inform the implementation of
fetch_bundles_by_token() in bundle-uri.c, but are being added
independently here to allow focusing on faulty downloads. There may be
more cases that could be added that result in modifications to
fetch_bundles_by_token() as interesting data shapes reveal themselves in
real scenarios.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
c429bed102 bundle-uri: store fetch.bundleCreationToken
When a bundle list specifies the "creationToken" heuristic, the Git
client downloads the list and then starts downloading bundles in
descending creationToken order. This process stops as soon as all
downloaded bundles can be applied to the repository (because all
required commits are present in the repository or in the downloaded
bundles).

When checking the same bundle list twice, this strategy requires
downloading the bundle with the maximum creationToken again, which is
wasteful. The creationToken heuristic promises that the client will not
have a use for that bundle if its creationToken value is at most the
previous creationToken value.

To prevent these wasteful downloads, create a fetch.bundleCreationToken
config setting that the Git client sets after downloading bundles. This
value allows skipping that maximum bundle download when this config
value is the same value (or larger).

To test that this works correctly, we can insert some "duplicate"
fetches into existing tests and demonstrate that only the bundle list is
downloaded.

The previous logic for downloading bundles by creationToken worked even
if the bundle list was empty, but now we have logic that depends on the
first entry of the list. Terminate early in the (non-sensical) case of
an empty bundle list.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
7f0cc04f2c fetch: fetch from an external bundle URI
When a user specifies a URI via 'git clone --bundle-uri', that URI may
be a bundle list that advertises a 'bundle.heuristic' value. In that
case, the Git client stores a 'fetch.bundleURI' config value storing
that URI.

Teach 'git fetch' to check for this config value and download bundles
from that URI before fetching from the Git remote(s). Likely, the bundle
provider has configured a heuristic (such as "creationToken") that will
allow the Git client to download only a portion of the bundles before
continuing the fetch.

Since this URI is completely independent of the remote server, we want
to be sure that we connect to the bundle URI before creating a
connection to the Git remote. We do not want to hold a stateful
connection for too long if we can avoid it.

To test that this works correctly, extend the previous tests that set
'fetch.bundleURI' to do follow-up fetches. The bundle list is updated
incrementally at each phase to demonstrate that the heuristic avoids
downloading older bundles. This includes the middle fetch downloading
the objects in bundle-3.bundle from the Git remote, and therefore not
needing that bundle in the third fetch.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
4074d3c7e1 clone: set fetch.bundleURI if appropriate
Bundle providers may organize their bundle lists in a way that is
intended to improve incremental fetches, not just initial clones.
However, they do need to state that they have organized with that in
mind, or else the client will not expect to save time by downloading
bundles after the initial clone. This is done by specifying a
bundle.heuristic value.

There are two types of bundle lists: those at a static URI and those
that are advertised from a Git remote over protocol v2.

The new fetch.bundleURI config value applies for static bundle URIs that
are not advertised over protocol v2. If the user specifies a static URI
via 'git clone --bundle-uri', then Git can set this config as a reminder
for future 'git fetch' operations to check the bundle list before
connecting to the remote(s).

For lists provided over protocol v2, we will want to take a different
approach and create a property of the remote itself by creating a
remote.<id>.* type config key. That is not implemented in this change.

Later changes will update 'git fetch' to consume this option.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
7903efb717 bundle-uri: download in creationToken order
The creationToken heuristic provides an ordering on the bundles
advertised by a bundle list. Teach the Git client to download bundles
differently when this heuristic is advertised.

The bundles in the list are sorted by their advertised creationToken
values, then downloaded in decreasing order. This avoids the previous
strategy of downloading bundles in an arbitrary order and attempting
to apply them (likely failing in the case of required commits) until
discovering the order through attempted unbundling.

During a fresh 'git clone', it may make sense to download the bundles in
increasing order, since that would prevent the need to attempt
unbundling a bundle with required commits that do not exist in our empty
object store. The cost of testing an unbundle is quite low, and instead
the chosen order is optimizing for a future bundle download during a
'git fetch' operation with a non-empty object store.

Since the Git client continues fetching from the Git remote after
downloading and unbundling bundles, the client's object store can be
ahead of the bundle provider's object store. The next time it attempts
to download from the bundle list, it makes most sense to download only
the most-recent bundles until all tips successfully unbundle. The
strategy implemented here provides that short-circuit where the client
downloads a minimal set of bundles.

However, we are not satisfied by the naive approach of downloading
bundles until one successfully unbundles, expecting the earlier bundles
to successfully unbundle now. The example repository in t5558
demonstrates this well:

 ---------------- bundle-4

       4
      / \
 ----|---|------- bundle-3
     |   |
     |   3
     |   |
 ----|---|------- bundle-2
     |   |
     2   |
     |   |
 ----|---|------- bundle-1
      \ /
       1
       |
 (previous commits)

In this repository, if we already have the objects for bundle-1 and then
try to fetch from this list, the naive approach will fail. bundle-4
requires both bundle-3 and bundle-2, though bundle-3 will successfully
unbundle without bundle-2. Thus, the algorithm needs to keep this in
mind.

A later implementation detail will store the maximum creationToken seen
during such a bundle download, and the client will avoid downloading a
bundle unless its creationToken is strictly greater than that stored
value. For now, if the client seeks to download from an identical
bundle list since its previous download, it will download the
most-recent bundle then stop since its required commits are already in
the object store.

Add tests that exercise this behavior, but we will expand upon these
tests when incremental downloads during 'git fetch' make use of
creationToken values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
512fccf8a5 bundle-uri: parse bundle.<id>.creationToken values
The previous change taught Git to parse the bundle.heuristic value,
especially when its value is "creationToken". Now, teach Git to parse
the bundle.<id>.creationToken values on each bundle in a bundle list.

Before implementing any logic based on creationToken values for the
creationToken heuristic, parse and print these values for testing
purposes.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
c93c3d2fa4 bundle-uri: parse bundle.heuristic=creationToken
The bundle.heuristic value communicates that the bundle list is
organized to make use of the bundle.<id>.creationToken values that may
be provided in the bundle list. Those values will create a total order
on the bundles, allowing the Git client to download them in a specific
order and even remember previously-downloaded bundles by storing the
maximum creation token value.

Before implementing any logic that parses or uses the
bundle.<id>.creationToken values, teach Git to parse the
bundle.heuristic value from a bundle list. We can use 'test-tool
bundle-uri' to print the heuristic value and verify that the parsing
works correctly.

As an extra precaution, create the internal 'heuristics' array to be a
list of (enum, string) pairs so we can iterate through the array entries
carefully, regardless of the enum values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:48 -08:00
Derrick Stolee
7bc73e7b61 t5558: add tests for creationToken heuristic
As documented in the bundle URI design doc in 2da14fad8f (docs:
document bundle URI standard, 2022-08-09), the 'creationToken' member of
a bundle URI allows a bundle provider to specify a total order on the
bundles.

Future changes will allow the Git client to understand these members and
modify its behavior around downloading the bundles in that order. In the
meantime, create tests that add creation tokens to the bundle list. For
now, the Git client correctly ignores these unknown keys.

Create a new test helper function, test_remote_https_urls, which filters
GIT_TRACE2_EVENT output to extract a list of URLs passed to
git-remote-https child processes. This can be used to verify the order
of these requests as we implement the creationToken heuristic. For now,
we need to sort the actual output since the current client does not have
a well-defined order that it applies to the bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
Derrick Stolee
d9fd674c8b bundle: verify using check_connected()
When Git verifies a bundle to see if it is safe for unbundling, it first
looks to see if the prerequisite commits are in the object store. This
is an easy way to "fail fast" but it is not a sufficient check for
updating refs that guarantee closure under reachability. There could
still be issues if those commits are not reachable from the repository's
references. The repository only has guarantees that its object store is
closed under reachability for the objects that are reachable from
references.

Thus, the code in verify_bundle() has previously had the additional
check that all prerequisite commits are reachable from repository
references. This is done via a revision walk from all references,
stopping only if all prerequisite commits are discovered or all commits
are walked. This uses a custom walk to verify_bundle().

This check is more strict than what Git applies to fetched pack-files.
In the fetch case, Git guarantees that the new references are closed
under reachability by walking from the new references until walking
commits that are reachable from repository refs. This is done through
the well-used check_connected() method.

To better align with the restrictions required by 'git fetch',
reimplement this check in verify_bundle() to use check_connected(). This
also simplifies the code significantly.

The previous change added a test that verified the behavior of 'git
bundle verify' and 'git bundle unbundle' in this case, and the error
messages looked like this:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <extant-commit>

However, by changing the revision walk slightly within check_connected()
and using its quiet mode, we can omit those messages. Instead, we get
only this message, tailored to describing the current state of the
repository:

  error: some prerequisite commits exist in the object store,
         but are not connected to the repository's history

(Line break added here for the commit message formatting, only.)

While this message does not include any object IDs, there is no
guarantee that those object IDs would help the user diagnose what is
going on, as they could be separated from the prerequisite commits by
some distance. At minimum, this situation describes the situation in a
more informative way than the previous error messages.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
Derrick Stolee
e72171f085 bundle: test unbundling with incomplete history
When verifying a bundle, Git checks first that all prerequisite commits
exist in the object store, then adds an additional check: those
prerequisite commits must be reachable from references in the
repository.

This check is stronger than what is checked for refs being added during
'git fetch', which simply guarantees that the new refs have a complete
history up to the point where it intersects with the current reachable
history.

However, we also do not have any tests that check the behavior under
this condition. Create a test that demonstrates its behavior.

In order to construct a broken history, perform a shallow clone of a
repository with a linear history, but whose default branch ('base') has
a single commit, so dropping the shallow markers leaves a complete
history from that reference. However, the 'tip' reference adds a
shallow commit whose parent is missing in the cloned repository. Trying
to unbundle a bundle with the 'tip' as a prerequisite will succeed past
the object store check and move into the reachability check.

The two errors that are reported are of this form:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <present-commit>

These messages are not particularly helpful for the person running the
unbundle command, but they do prevent the command from succeeding.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 08:57:47 -08:00
Junio C Hamano
777afaaa5c Merge branch 'tb/t0003-invoke-dd-more-portably'
Test portability fix.

* tb/t0003-invoke-dd-more-portably:
  t0003: call dd with portable blocksize
2023-01-30 14:24:23 -08:00
Junio C Hamano
abf2bb895b Merge branch 'jk/hash-object-fsck'
"git hash-object" now checks that the resulting object is well
formed with the same code as "git fsck".

* jk/hash-object-fsck:
  fsck: do not assume NUL-termination of buffers
  hash-object: use fsck for object checks
  fsck: provide a function to fsck buffer without object struct
  t: use hash-object --literally when created malformed objects
  t7030: stop using invalid tag name
  t1006: stop using 0-padded timestamps
  t1007: modernize malformed object tests
2023-01-30 14:24:22 -08:00
Junio C Hamano
4ac326f64f Merge branch 'po/pretty-format-columns-doc'
Clarify column-padding operators in the pretty format string.

* po/pretty-format-columns-doc:
  doc: pretty-formats note wide char limitations, and add tests
  doc: pretty-formats describe use of ellipsis in truncation
  doc: pretty-formats document negative column alignments
  doc: pretty-formats: delineate `%<|(` parameter values
  doc: pretty-formats: separate parameters from placeholders
2023-01-30 14:24:22 -08:00
Derrick Stolee
dea6308892 scalar: only warn when background maintenance fails
A user reported issues with 'scalar clone' and 'scalar register' when
working in an environment that had locked down the ability to run
'crontab' or 'systemctl' in that those commands registered as _failures_
instead of opportunistically reporting a success with just a warning
about background maintenance.

As a workaround, they can use GIT_TEST_MAINT_SCHEDULER to fake a
successful background maintenance, but this is not a viable strategy for
long-term.

Update 'scalar register' and 'scalar clone' to no longer fail by
modifying register_dir() to only warn when toggle_maintenance(1) fails.

Since background maintenance is a "nice to have" and not a requirement
for a working repository, it is best to move this from hard error to
gentle warning.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
Derrick Stolee
eeea9ae165 t921*: test scalar behavior starting maintenance
A user recently reported issues with 'scalar register' and 'scalar
clone' in that they failed when the system had permissions locked down
so both 'crontab' and 'systemctl' commands failed when trying to enable
background maintenance.

This hard error is undesirable, but let's create tests that demonstrate
this behavior before modiying the behavior. We can use
GIT_TEST_MAINT_SCHEDULER to guarantee failure and check the exit code
and error message.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
Derrick Stolee
008217cb4a t: allow 'scalar' in test_must_fail
This will enable scalar tests to use the test_must_fail helper, when
necessary.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-27 12:38:26 -08:00
Junio C Hamano
d26e26a3f5 Merge branch 'cw/fetch-remote-group-with-duplication'
"git fetch <group>", when "<group>" of remotes lists the same
remote twice, unnecessarily failed when parallel fetching was
enabled, which has been corrected.

* cw/fetch-remote-group-with-duplication:
  fetch: fix duplicate remote parallel fetch bug
2023-01-27 08:51:41 -08:00
Junio C Hamano
531d13d4d2 Merge branch 'km/send-email-with-v-reroll-count'
"git send-email -v 3" used to be expanded to "git send-email
--validate 3" when the user meant to pass them down to
"format-patch", which has been corrected.

* km/send-email-with-v-reroll-count:
  send-email: relay '-v N' to format-patch
2023-01-27 08:51:40 -08:00
Junio C Hamano
557d93a146 Merge branch 'cb/grep-pcre-ucp'
"grep -P" learned to use Unicode Character Property to grok
character classes when processing \b and \w etc.

* cb/grep-pcre-ucp:
  grep: correctly identify utf-8 characters with \{b,w} in -P
2023-01-27 08:51:40 -08:00
Elijah Newren
eddfcd8ece rebase: provide better error message for apply options vs. merge config
When config which selects the merge backend (currently,
rebase.autosquash=true or rebase.updateRefs=true) conflicts with other
options on the command line (such as --whitespace=fix), make the error
message specifically call out the config option and specify how to
override that config option on the command line.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
Elijah Newren
796abac7e1 rebase: add coverage of other incompatible options
The git-rebase manual noted several sets of incompatible options, but
we were missing tests for a few of these.  Further, we were missing
code checks for one of these, which could result in command line
options being silently ignored.

Also, note that adding a check for autosquash means that using
--whitespace=fix together with the config setting rebase.autosquash=true
will trigger an error.  A subsequent commit will improve the error
message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
Elijah Newren
ffeaca177a rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
--[no-]reapply-cherry-picks was traditionally only supported by the
sequencer.  Support was added for the apply backend, when --keep-base is
also specified, in commit ce5238a690 ("rebase --keep-base: imply
--reapply-cherry-picks", 2022-10-17).  Make the code error out when
--[no-]reapply-cherry-picks is specified AND the apply backend is used
AND --keep-base is not specified.  Also, clarify a number of comments
surrounding the interaction of these flags.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
Elijah Newren
b8ad365640 rebase: fix docs about incompatibilities with --root
In commit 5dacd4abdd ("git-rebase.txt: document incompatible options",
2018-06-25), I added notes about incompatibilities between options for
the apply and merge backends.  Unfortunately, I inverted the condition
when --root was incompatible with the apply backend.  Fix the
documentation, and add a testcase that verifies the documentation
matches the code.

While at it, the documentation for --root also tried to cover some of
the backend differences between the apply and merge backends in relation
to reapplying cherry picks.  The information:
  * assumed that the apply backend was the default (it isn't anymore)
  * was written before --reapply-cherry-picks became an option
  * was written before the detailed information on backend differences
All of these factors make the sentence under --root about reapplying
cherry picks contradict information that is now available elsewhere in
the manual, and the other references are correct.  So just strike this
sentence.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:53 -08:00
Elijah Newren
7d718c552b rebase: flag --apply and --merge as incompatible
Previously, we flagged options which implied --apply as being
incompatible with options which implied --merge.  But if both options
were given explicitly, then we didn't flag the incompatibility.  The
same is true with --apply and --interactive.  Add the check, and add
some testcases to verify these are also caught.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:52 -08:00
Elijah Newren
1207599e83 rebase: mark --update-refs as requiring the merge backend
--update-refs is built in terms of the sequencer, which requires the
merge backend.  It was already marked as incompatible with the apply
backend in the git-rebase manual, but the code didn't check for this
incompatibility and warn the user.  Check and error now.

While at it, fix a typo in t3422...and fix some misleading wording
(most options which used to be am-specific have since been implemented
in the merge backend as well).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-25 09:20:52 -08:00
Taylor Blau
bffc762f87 dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.

If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.

As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.

Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
Taylor Blau
cf8f6ce02a clone: delay picking a transport until after get_repo_path()
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.

That sequence is:

 - the starting state is that the local path https:/example.com/foo is a
   symlink that points to ../../../.git/modules/foo. So it's dangling.

 - get_repo_path() sees that no such path exists (because it's
   dangling), and thus we do not canonicalize it into an absolute path

 - because we're using --separate-git-dir, we create .git/modules/foo.
   Now our symlink is no longer dangling!

 - we pass the url to transport_get(), which sees it as an https URL.

 - we call get_repo_path() again, on the url. This second call was
   introduced by f38aa83f9a (use local cloning if insteadOf makes a
   local URL, 2014-07-17). The idea is that we want to pull the url
   fresh from the remote.c API, because it will apply any aliases.

And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.

The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.

This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.

Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).

We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.

[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
Taylor Blau
58325b93c5 t5619: demonstrate clone_local() with ambiguous transport
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.

Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.

However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.

This happens in a few parts:

 1. We first call `get_repo_path()` to see if the remote is a local
    path. If it is, we replace the repo name with its absolute path.

 2. We then call `transport_get()` on the repo name and decide how to
    access it. If it was turned into an absolute path in the previous
    step, then we should always treat it like a file.

 3. We use `get_repo_path()` again, and set `is_local` as appropriate.
    But it's already too late to rewrite the repo name as an absolute
    path, since we've already fed it to the transport code.

The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):

    http://127.0.0.1:NNNN/dumb/sub.git

However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.

To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").

This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.

This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.

The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.

In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.

This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").

The appropriate fix is two-fold:

 - Resolve the transport later on (to avoid using the local
   clone optimization with a non-local transport).

 - Avoid reading through the top-level "objects" directory when
   (correctly) using the clone_local() optimization.

This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.

[^1]: Provided that any target directory does not contain symbolic
  links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
  disallow `--local` clones with symlinks, 2022-07-28) will abort the
  clone.

Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
Junio C Hamano
ebed06a3e9 Merge branch 'zh/scalar-progress'
"scalar" learned to give progress bar.

* zh/scalar-progress:
  scalar: show progress if stderr refers to a terminal
2023-01-23 13:39:52 -08:00
Junio C Hamano
5287319bf8 Merge branch 'ds/omit-trailing-hash-in-index'
Quickfix for a topic already in 'master'.

* ds/omit-trailing-hash-in-index:
  t1600: fix racy index.skipHash test
2023-01-23 13:39:52 -08:00