Commit Graph

12451 Commits

Author SHA1 Message Date
Jeff King
521e04e6e8 sparse-checkout: free pattern list in sparse_checkout_list()
In sparse_checkout_list(), we create a pattern_list that needs to
eventually be cleared. We remember to do so in the regular code path,
but the cone-mode path does an early return, and forgets to clean up.

We could fix the leak by adding a new call to clear_pattern_list(). But
we can simplify even further by just skipping the early return, pushing
the other code path (which consists now of only one line!) into an else
block. That also matches the same cone/non-cone if/else used in some
other functions.

This fixes 15 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King
008f59d2d6 sparse-checkout: free sparse_filename after use
We allocate a heap buffer via get_sparse_checkout_filename(). Most calls
remember to free it, but sparse_checkout_init() forgets to, causing a
leak. Ironically, it remembers to do so in the error return paths, but
not in the path that makes it all the way to the function end!

Fixing this clears up 6 leaks from t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King
a14d49ca84 sparse-checkout: refactor temporary sparse_checkout_patterns
In update_working_directory(), we take in a pattern_list, attach it to
the repository index by assigning it to index->sparse_checkout_patterns,
and then call unpack_trees. Afterwards, we remove it by setting
index->sparse_checkout_patterns back to NULL.

But there are two possible leaks here:

  1. If the index already had a populated sparse_checkout_patterns,
     we've obliterated it. We can fix this by saving and restoring it,
     rather than always setting it back to NULL.

  2. We may call the function with a NULL pattern_list, expecting it to
     use the on-disk sparse file. In that case, the index routines will
     lazy-load the sparse patterns automatically. But now at the end of
     the function when we restore the patterns, we'll leak those
     lazy-loaded ones!

     We can fix this by freeing the pattern list before overwriting its
     pointer whenever it does not match what was passed in (in practice
     this should only happen when the passed-in list is NULL, but this
     is erring on the defensive side).

Together these remove 48 indirect leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King
d765fa0331 sparse-checkout: always free "line" strbuf after reading input
In add_patterns_from_input(), we may read lines from a file with a loop
like this:

  while (!strbuf_getline(&line, file)) {
	...
	strbuf_to_cone_pattern(&line, pl);
  }
  /* we don't strbuf_release(&line) here! */

This generally is OK because strbuf_to_cone_pattern() consumes the
buffer via strbuf_detach(). But we can leak in a few cases:

  1. We don't always consume the buffer! If the line ends up empty after
     trimming, we leave strbuf_to_cone_pattern() without detaching. In
     most cases this is OK, because a subsequent getline() call will use
     the same buffer. But if you had an empty line at the end of file,
     for example, it would leak.

  2. Even if strbuf_to_cone_pattern() always consumed the buffer,
     there's a subtle issue with strbuf_getline(). As we saw in
     94e2aa555e (strbuf: fix leak when `appendwholeline()` fails with
     EOF, 2024-05-27), it's possible for it to return EOF with an
     allocated buffer (e.g., if the underlying getdelim() call saw an
     error). So we should always strbuf_release() after finishing a read
     loop like this.

Note that even the code to read patterns from argv has the same problem.
Because that also uses strbuf_to_cone_pattern(), we stuff each argv
entry into a strbuf. It uses the same "line" strbuf as the getline code,
but we should position the strbuf_release() to cover both code paths.

This fixes at least 9 leaks found in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:43 -07:00
Jeff King
c3324649ed sparse-checkout: reuse --stdin buffer when reading patterns
When we read patterns from --stdin, we loop on strbuf_getline(), and
detach each line we read to pass into add_pattern(). This used to be
necessary because add_pattern() required that the pattern strings remain
valid while the pattern_list was in use. But it also created a leak,
since we didn't record the detached buffers anywhere else.

Now that add_pattern() has been modified to make its own copy of the
strings, we can stop detaching and fix the leak. This fixes 4 leaks
detected in t1091.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05 09:51:42 -07:00
Junio C Hamano
607c3d372e show-ref: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Junio C Hamano
b773fb8822 ls-remote: introduce --branches and deprecate --heads
We call the tips of branches "heads", but this command calls the
option to show only branches "--heads", which confuses the branches
themselves and the tips of branches.

Straighten the terminology by introducing "--branches" option that
limits the output to branches, and deprecate "--heads" option used
that way.

We do not plan to remove "--heads" or "-h" yet; we may want to do so
at Git 3.0, in which case, we may need to start advertising upcoming
removal with an extra warning when they are used.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Junio C Hamano
a096e70c78 refs: call branches branches
These things in refs/heads/ hierarchy are called "branches" in human
parlance.  Replace REF_HEADS with REF_BRANCHES to make it clearer.

No end-user visible change intended at this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 15:07:08 -07:00
Jeff King
db83b64cda sparse-checkout: clear patterns when init() sees existing sparse file
In sparse_checkout_init(), we first try to load patterns from an
existing file. If we found any, we return immediately, but end up
leaking the patterns we parsed. Fixing this reduces the number of leaks
in t7002 from 9 down to 5.

Note that there are two other exits from the function, but they don't
need the same treatment:

  - if we can't resolve HEAD, we write out a hard-coded sparse file and
    return. But we know the pattern list is empty there, since we didn't
    find any in the on-disk file and we haven't yet added any of our
    own.

  - otherwise, we do populate the list and then tail-call into
    write_patterns_and_update(). But that function frees the
    pattern_list itself, so we don't need to.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
Jeff King
4d7f95ed1f sparse-checkout: pass string literals directly to add_pattern()
The add_pattern() function takes a pattern string, but neither makes a
copy of it nor takes ownership of the memory. So it is the caller's
responsibility to make sure the string hangs around as long as the
pattern_list which references it.

There are a few cases in sparse-checkout where we use string literal
patterns by stuffing them into a strbuf, detaching the buffer, and then
passing the result into add_pattern(). This creates a leak when the
pattern_list is eventually cleared, since we don't retain a copy of the
detached buffer to free.

But we can observe that the whole strbuf dance is unnecessary. The point
was presumably[1] to satisfy the lifetime requirement of the string. But
string literals have static duration; we can count on them lasting for
the whole program.

So we can fix the leak by just passing them directly. And as a bonus,
that simplifies the code. The leaks can be seen in t7002, which drops
from 25 leaks to 22 with this patch. It also makes t3602 and t1090
leak-free.

In the long run, we will also want to clean up this (undocumented!)
memory lifetime requirement of add_pattern(). But that can come in a
later patch; passing the string literals directly will be the right
thing either way.

[1] The code in question comes from 416adc8711 (sparse-checkout: update
    working directory in-process for 'init', 2019-11-21) and 99dfa6f970
    (sparse-checkout: use in-process update for disable subcommand,
    2019-11-21), but I didn't see anything in their commit messages or
    on the list explaining the strbufs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:23 -07:00
Jeff King
2181fe6e46 sparse-checkout: free string list in write_cone_to_file()
We use a string list to hold sorted and de-duped patterns, but don't
free it before leaving the function, causing a leak.

This drops the number of leaks found in t7002 from 27 to 25.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-04 10:38:22 -07:00
Junio C Hamano
03b0e7d3a7 Merge branch 'ps/leakfixes' into ps/leakfixes-more
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-06-03 13:08:33 -07:00
Junio C Hamano
f8da12adcf Merge branch 'jc/fix-2.45.1-and-friends-for-maint'
Adjust jc/fix-2.45.1-and-friends-for-2.39 for more recent
maintenance track.

* jc/fix-2.45.1-and-friends-for-maint:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-30 14:15:17 -07:00
Junio C Hamano
6c5be97e4e Merge branch 'jc/undecided-is-not-necessarily-sha1-fix'
The base topic started to make it an error for a command to leave
the hash algorithm unspecified, which revealed a few commands that
were not ready for the change.  Give users a knob to revert back to
the "default is sha-1" behaviour as an escape hatch, and start
fixing these breakages.

* jc/undecided-is-not-necessarily-sha1-fix:
  apply: fix uninitialized hash function
  builtin/hash-object: fix uninitialized hash function
  builtin/patch-id: fix uninitialized hash function
  t1517: test commands that are designed to be run outside repository
  setup: add an escape hatch for "no more default hash algorithm" change
2024-05-30 14:15:14 -07:00
Junio C Hamano
988499e295 Merge branch 'ps/refs-without-the-repository-updates'
Further clean-up the refs subsystem to stop relying on
the_repository, and instead use the repository associated to the
ref_store object.

* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-30 14:15:13 -07:00
Junio C Hamano
a60c21b720 Merge branch 'ps/undecided-is-not-necessarily-sha1'
Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
2024-05-30 14:15:11 -07:00
Phillip Wood
0c26738aa4 rebase -i: pass struct replay_opts to parse_insn_line()
This new parameter will be used in the next commit. As adding the
parameter requires quite a few changes to plumb it through the call
chain these are separated into their own commit to avoid cluttering up
the next commit with incidental changes.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 10:02:56 -07:00
Jeff King
64f8502b40 mv: replace src_dir with a strvec
We manually manage the src_dir array with ALLOC_GROW. Using a strvec is
a little more ergonomic, and makes the memory ownership more clear. It
does mean that we copy the strings (which were otherwise just pointers
into the "sources" strvec), but using the same rationale as 9fcd9e4e72
(builtin/mv duplicate string list memory, 2024-05-27), it's just not
enough to be worth worrying about here.

As a bonus, this gets rid of some "int"s used for allocation management
(though in practice these were limited to command-line sizes and thus
not overflowable).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Jeff King
d58a687705 mv: factor out empty src_dir removal
This pulls the loop added by b6f51e3db9 (mv: cleanup empty
WORKING_DIRECTORY, 2022-08-09) into a sub-function. That reduces clutter
in cmd_mv() and makes it easier to see that the lifetime of the
a_src_dir strbuf is limited to this code (and thus its cleanup doesn't
need to go after the "out" label).

Another option would be to just declare the strbuf inside the loop,
since it is only used there. But this refactor retains the existing
property that we can reuse the allocated buffer for each iteration of
the loop. That optimization is probably overkill, but I think the
sub-function is more readable anyway, and then keeping the optimization
is basically free.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Jeff King
cc65e085e4 mv: move src_dir cleanup to end of cmd_mv()
Commit b6f51e3db9 (mv: cleanup empty WORKING_DIRECTORY, 2022-08-09)
added an auxiliary array where we store directory arguments that we see
while processing the incoming arguments. After actually moving things,
we then use that array to remove now-empty directories, and then
immediately free the array.

But if the actual move queues any errors in only_match_skip_worktree,
that can cause us to jump straight to the "out" label to clean up,
skipping the free() and leaking the array.

Let's push the free() down past the "out" label so that we always clean
up (the array is initialized to NULL, so this is always safe). We'll
hold on to the memory a little longer than necessary, but clarity is
more important than micro-optimizing here.

Note that the adjacent "a_src_dir" strbuf does not suffer the same
problem; it is only allocated during the removal step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30 08:55:29 -07:00
Junio C Hamano
a3f0e2a064 Merge branch 'ps/leakfixes' into jk/leakfixes
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-30 08:54:58 -07:00
Junio C Hamano
5529cba09f Merge branch 'ps/leakfixes' into ps/no-writable-strings
* ps/leakfixes:
  builtin/mv: fix leaks for submodule gitfile paths
  builtin/mv: refactor to use `struct strvec`
  builtin/mv duplicate string list memory
  builtin/mv: refactor `add_slash()` to always return allocated strings
  strvec: add functions to replace and remove strings
  submodule: fix leaking memory for submodule entries
  commit-reach: fix memory leak in `ahead_behind()`
  builtin/credential: clear credential before exit
  config: plug various memory leaks
  config: clarify memory ownership in `git_config_string()`
  builtin/log: stop using globals for format config
  builtin/log: stop using globals for log config
  convert: refactor code to clarify ownership of check_roundtrip_encoding
  diff: refactor code to clarify memory ownership of prefixes
  config: clarify memory ownership in `git_config_pathname()`
  http: refactor code to clarify memory ownership
  checkout: clarify memory ownership in `unique_tracking_name()`
  strbuf: fix leak when `appendwholeline()` fails with EOF
  transport-helper: fix leaking helper name
2024-05-29 09:32:24 -07:00
Ghanshyam Thakkar
a70f8f19ad strbuf: introduce strbuf_addstrings() to repeatedly add a string
In a following commit we are going to port code from
"t/helper/test-sha256.c", t/helper/test-hash.c and "t/t0015-hash.sh" to
a new "t/unit-tests/t-hash.c" file using the recently added unit test
framework.

To port code like: perl -e "$| = 1; print q{aaaaaaaaaa} for 1..100000;"
we are going to need a new strbuf_addstrings() function that repeatedly
adds the same string a number of times to a buffer.

Such a strbuf_addstrings() function would already be useful in
"json-writer.c" and "builtin/submodule-helper.c" as both of these files
already have code that repeatedly adds the same string. So let's
introduce such a strbuf_addstrings() function in "strbuf.{c,h}" and use
it in both "json-writer.c" and "builtin/submodule-helper.c".

We use the "strbuf_addstrings" name as this way strbuf_addstr() and
strbuf_addstrings() would be similar for strings as strbuf_addch() and
strbuf_addchars() for characters.

Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Co-authored-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-29 09:09:39 -07:00
Junio C Hamano
b32f298264 Merge branch 'jc/format-patch-more-aggressive-range-diff'
The default "creation-factor" used by "git format-patch" has been
raised to make it more aggressively find matching commits.

* jc/format-patch-more-aggressive-range-diff:
  format-patch: run range-diff with larger creation-factor
2024-05-28 11:17:10 -07:00
Junio C Hamano
ee8537ebc9 Merge branch 'tb/pack-bitmap-write-cleanups'
The pack bitmap code saw some clean-up to prepare for a follow-up topic.

* tb/pack-bitmap-write-cleanups:
  pack-bitmap: introduce `bitmap_writer_free()`
  pack-bitmap-write.c: avoid uninitialized 'write_as' field
  pack-bitmap: drop unused `max_bitmaps` parameter
  pack-bitmap: avoid use of static `bitmap_writer`
  pack-bitmap-write.c: move commit_positions into commit_pos fields
  object.h: add flags allocated by pack-bitmap.h
2024-05-28 11:17:07 -07:00
Junio C Hamano
00ffa1cb1c Merge branch 'ps/builtin-config-cleanup'
Code clean-up to reduce inter-function communication inside
builtin/config.c done via the use of global variables.

* ps/builtin-config-cleanup: (21 commits)
  builtin/config: pass data between callbacks via local variables
  builtin/config: convert flags to a local variable
  builtin/config: track "fixed value" option via flags only
  builtin/config: convert `key` to a local variable
  builtin/config: convert `key_regexp` to a local variable
  builtin/config: convert `regexp` to a local variable
  builtin/config: convert `value_pattern` to a local variable
  builtin/config: convert `do_not_match` to a local variable
  builtin/config: move `respect_includes_opt` into location options
  builtin/config: move default value into display options
  builtin/config: move type options into display options
  builtin/config: move display options into local variables
  builtin/config: move location options into local variables
  builtin/config: refactor functions to have common exit paths
  config: make the config source const
  builtin/config: check for writeability after source is set up
  builtin/config: move actions into `cmd_config_actions()`
  builtin/config: move legacy options into `cmd_config()`
  builtin/config: move subcommand options into `cmd_config()`
  builtin/config: move legacy mode into its own function
  ...
2024-05-28 11:17:07 -07:00
Junio C Hamano
16a592f132 Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.

* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-28 11:17:06 -07:00
Patrick Steinhardt
ebdbefa4fe builtin/mv: fix leaks for submodule gitfile paths
Similar to the preceding commit, we have effectively given tracking
memory ownership of submodule gitfile paths. Refactor the code to start
tracking allocated strings in a separate `struct strvec` such that we
can easily plug those leaks. Mark now-passing tests as leak free.

Note that ideally, we wouldn't require two separate data structures to
track those paths. But we do need to store `NULL` pointers for the
gitfile paths such that we can indicate that its corresponding entries
in the other arrays do not have such a path at all. And given that
`struct strvec`s cannot store `NULL` pointers we cannot use them to
store this information.

There is another small gotcha that is easy to miss: you may be wondering
why we don't want to store `SUBMODULE_WITH_GITDIR` in the strvec. This
is because this is a mere sentinel value and not actually a string at
all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:03 -07:00
Patrick Steinhardt
52a7dab439 builtin/mv: refactor to use struct strvec
Memory allocation patterns in git-mv(1) are extremely hard to follow:
We copy around string pointers into manually-managed arrays, some of
which alias each other, but only sometimes, while we also drop some of
those strings at other times without ever daring to free them.

While this may be my own subjective feeling, it seems like others have
given up as the code has multiple calls to `UNLEAK()`. These are not
sufficient though, and git-mv(1) is still leaking all over the place
even with them.

Refactor the code to instead track strings in `struct strvec`. While
this has the effect of effectively duplicating some of the strings
without an actual need, it is way easier to reason about and fixes all
of the aliasing of memory that has been going on. It allows us to get
rid of the `UNLEAK()` calls and also fixes leaks that those calls did
not paper over.

Mark tests which are now leak-free accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt
9fcd9e4e72 builtin/mv duplicate string list memory
makes the next patch easier, where we will migrate to the paths being
owned by a strvec. given that we are talking about command line
parameters here it's also not like we have tons of allocations that this
would save

while at it, fix a memory leak

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt
3d231f7b82 builtin/mv: refactor add_slash() to always return allocated strings
The `add_slash()` function will only conditionally return an allocated
string when the passed-in string did not yet have a trailing slash. This
makes the memory ownership harder to track than really necessary.

It's dubious whether this optimization really buys us all that much. The
number of times we execute this function is bounded by the number of
arguments to git-mv(1), so in the typical case we may end up saving an
allocation or two.

Simplify the code to unconditionally return allocated strings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:02 -07:00
Patrick Steinhardt
96c1655095 builtin/credential: clear credential before exit
We never release memory associated with `struct credential`. Fix this
and mark the corresponding test as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:01 -07:00
Patrick Steinhardt
1b261c20ed config: clarify memory ownership in git_config_string()
The out parameter of `git_config_string()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
Patrick Steinhardt
83024d98f7 builtin/log: stop using globals for format config
This commit does the exact same as the preceding commit, only for the
format configuration instead of the log configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:20:00 -07:00
Patrick Steinhardt
106a54aecb builtin/log: stop using globals for log config
We're using global variables to store the log configuration. Many of
these can be set both via the command line and via the config, and
depending on how they are being set, they may contain allocated strings.
This leads to hard-to-track memory ownership and memory leaks.

Refactor the code to instead use a `struct log_config` that is being
allocated on the stack. This allows us to more clearly scope the
variables, track memory ownership and ultimately release the memory.

This also prepares us for a change to `git_config_string()`, which will
be adapted to have a `char **` out parameter instead of `const char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Patrick Steinhardt
6073b3b5c3 config: clarify memory ownership in git_config_pathname()
The out parameter of `git_config_pathname()` is a `const char **` even
though we transfer ownership of memory to the caller. This is quite
misleading and has led to many memory leaks all over the place. Adapt
the parameter to instead be `char **`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:59 -07:00
Patrick Steinhardt
cc395d6b47 checkout: clarify memory ownership in unique_tracking_name()
The function `unique_tracking_name()` returns an allocated string, but
does not clearly indicate this because its return type is `const char *`
instead of `char *`. This has led to various callsites where we never
free its returned memory at all, which causes memory leaks.

Plug those leaks and mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:58 -07:00
René Scharfe
36d900d2b0 difftool: add env vars directly in run_file_diff()
Add the environment variables of the child process directly using
strvec_push() instead of building an array out of them and then adding
that using strvec_pushv().  The new code is shorter and avoids magic
array index values and fragile array padding.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 08:55:59 -07:00
Junio C Hamano
d36cc0d5a4 Merge branch 'fixes/2.45.1/2.44' into jc/fix-2.45.1-and-friends-for-maint
* fixes/2.45.1/2.44:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:59:12 -07:00
Junio C Hamano
863c0ed71e Merge branch 'fixes/2.45.1/2.43' into fixes/2.45.1/2.44
* fixes/2.45.1/2.43:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:35 -07:00
Junio C Hamano
3c562ef2e6 Merge branch 'fixes/2.45.1/2.42' into fixes/2.45.1/2.43
* fixes/2.45.1/2.42:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:58:11 -07:00
Junio C Hamano
73339e4dc2 Merge branch 'fixes/2.45.1/2.41' into fixes/2.45.1/2.42
* fixes/2.45.1/2.41:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:43 -07:00
Junio C Hamano
4f215d214f Merge branch 'fixes/2.45.1/2.40' into fixes/2.45.1/2.41
* fixes/2.45.1/2.40:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 16:57:02 -07:00
Junio C Hamano
48440f60a7 Merge branch 'jc/fix-2.45.1-and-friends-for-2.39' into fixes/2.45.1/2.40
Revert overly aggressive "layered defence" that went into 2.45.1
and friends, which broke "git-lfs", "git-annex", and other use
cases, so that we can rebuild necessary counterparts in the open.

* jc/fix-2.45.1-and-friends-for-2.39:
  Revert "fsck: warn about symlink pointing inside a gitdir"
  Revert "Add a helper function to compare file contents"
  clone: drop the protections where hooks aren't run
  tests: verify that `clone -c core.hooksPath=/dev/null` works again
  Revert "core.hooksPath: add some protection while cloning"
  init: use the correct path of the templates directory again
  hook: plug a new memory leak
  ci: stop installing "gcc-13" for osx-gcc
  ci: avoid bare "gcc" for osx-gcc job
  ci: drop mention of BREW_INSTALL_PACKAGES variable
  send-email: avoid creating more than one Term::ReadLine object
  send-email: drop FakeTerm hack
2024-05-24 12:29:36 -07:00
Taylor Blau
4722e06edc pack-bitmap: move some initialization to bitmap_writer_init()
The pack-bitmap-writer machinery uses a oidmap (backed by khash.h) to
map from commits selected for bitmaps (by OID) to a bitmapped_commit
structure (containing the bitmap itself, among other things like its XOR
offset, etc.)

This map was initialized at the end of `bitmap_writer_build()`. New
entries are added in `pack-bitmap-write.c::store_selected()`, which is
called by the bitmap_builder machinery (which is responsible for
traversing history and generating the actual bitmaps).

Reorganize when this field is initialized and when entries are added to
it so that we can quickly determine whether a commit is a candidate for
pseudo-merge selection, or not (since it was already selected to receive
a bitmap, and thus storing it in a pseudo-merge would be redundant).

The changes are as follows:

  - Introduce a new `bitmap_writer_init()` function which initializes
    the `writer.bitmaps` field (instead of waiting until the end of
    `bitmap_writer_build()`).

  - Add map entries in `push_bitmapped_commit()` (which is called via
    `bitmap_writer_select_commits()`) with OID keys and NULL values to
    track whether or not we *expect* to write a bitmap for some given
    commit.

  - Validate that a NULL entry is found matching the given key when we
    store a selected bitmap.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:41 -07:00
Junio C Hamano
7593d66928 Merge branch 'la/hide-trailer-info'
The trailer API has been reshuffled a bit.

* la/hide-trailer-info:
  trailer unit tests: inspect iterator contents
  trailer: document parse_trailers() usage
  trailer: retire trailer_info_get() from API
  trailer: make trailer_info struct private
  trailer: make parse_trailers() return trailer_info pointer
  interpret-trailers: access trailer_info with new helpers
  sequencer: use the trailer iterator
  trailer: teach iterator about non-trailer lines
  trailer: add unit tests for trailer iterator
  Makefile: sort UNIT_TEST_PROGRAMS
2024-05-23 11:04:27 -07:00
Junio C Hamano
939d49e9bd Merge branch 'kn/ref-transaction-symref' into kn/update-ref-symref
* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-23 09:38:59 -07:00
Junio C Hamano
0ff6d23a0f Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration
* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-23 09:14:32 -07:00
Junio C Hamano
e55f364398 Merge branch 'ps/refs-without-the-repository-updates' into ps/ref-storage-migration
* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-23 09:14:08 -07:00
Johannes Schindelin
873a466ea3 clone: drop the protections where hooks aren't run
As part of the security bug-fix releases v2.39.4, ..., v2.45.1, I
introduced logic to safeguard `git clone` from running hooks that were
installed _during_ the clone operation.

The rationale was that Git's CVE-2024-32002, CVE-2021-21300,
CVE-2019-1354, CVE-2019-1353, CVE-2019-1352, and CVE-2019-1349 should
have been low-severity vulnerabilities but were elevated to
critical/high severity by the attack vector that allows a weakness where
files inside `.git/` can be inadvertently written during a `git clone`
to escalate to a Remote Code Execution attack by virtue of installing a
malicious `post-checkout` hook that Git will then run at the end of the
operation without giving the user a chance to see what code is executed.

Unfortunately, Git LFS uses a similar strategy to install its own
`post-checkout` hook during a `git clone`; In fact, Git LFS is
installing four separate hooks while running the `smudge` filter.

While this pattern is probably in want of being improved by introducing
better support in Git for Git LFS and other tools wishing to register
hooks to be run at various stages of Git's commands, let's undo the
clone protections to unbreak Git LFS-enabled clones.

This reverts commit 8db1e8743c (clone: prevent hooks from running
during a clone, 2024-03-28).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-21 12:33:08 -07:00