Commit Graph

303 Commits

Author SHA1 Message Date
Junio C Hamano
5575c713c2 Merge branch 'ps/reftable-alloc-failures'
The reftable library is now prepared to expect that the memory
allocation function given to it may fail to allocate and to deal
with such an error.

* ps/reftable-alloc-failures: (26 commits)
  reftable/basics: fix segfault when growing `names` array fails
  reftable/basics: ban standard allocator functions
  reftable: introduce `REFTABLE_FREE_AND_NULL()`
  reftable: fix calls to free(3P)
  reftable: handle trivial allocation failures
  reftable/tree: handle allocation failures
  reftable/pq: handle allocation failures when adding entries
  reftable/block: handle allocation failures
  reftable/blocksource: handle allocation failures
  reftable/iter: handle allocation failures when creating indexed table iter
  reftable/stack: handle allocation failures in auto compaction
  reftable/stack: handle allocation failures in `stack_compact_range()`
  reftable/stack: handle allocation failures in `reftable_new_stack()`
  reftable/stack: handle allocation failures on reload
  reftable/reader: handle allocation failures in `reader_init_iter()`
  reftable/reader: handle allocation failures for unindexed reader
  reftable/merged: handle allocation failures in `merged_table_init_iter()`
  reftable/writer: handle allocation failures in `reftable_new_writer()`
  reftable/writer: handle allocation failures in `writer_index_hash()`
  reftable/record: handle allocation failures when decoding records
  ...
2024-10-10 14:22:25 -07:00
Junio C Hamano
4861bbf85a Merge branch 'ak/typofix-2.46-maint'
Typofixes.

* ak/typofix-2.46-maint:
  perl: fix a typo
  mergetool: fix a typo
  reftable: fix a typo
  trace2: fix typos
2024-10-04 14:21:40 -07:00
Patrick Steinhardt
2179b5c831 reftable/basics: fix segfault when growing names array fails
When growing the `names` array fails we would end up with a `NULL`
pointer. This causes two problems:

  - We would run into a segfault because we try to free names that we
    have assigned to the array already.

  - We lose track of the old array and cannot free its contents.

Fix this issue by using a temporary variable. Like this we do not
clobber the old array that we tried to reallocate, which will remain
valid when a call to realloc(3P) fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-04 07:59:31 -07:00
Andrew Kreimer
a54601c38b reftable: fix a typo
Fix a typo in comments.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03 12:06:51 -07:00
Patrick Steinhardt
35730302e9 reftable/basics: ban standard allocator functions
The reftable library uses pluggable allocators, which means that we
shouldn't ever use the standard allocator functions. But it is an easy
mistake to make to accidentally use e.g. free(3P) instead of the
reftable-specific `reftable_free()` function, and we do not have any
mechanism to detect this misuse right now.

Introduce a couple of macros that ban the standard allocators, similar
to how we do it in "banned.h".

Note that we do not ban the following two classes of functions:

  - Macros like `FREE_AND_NULL()` or `REALLOC_ARRAY()`. As those expand
    to code that contains already-banned functions we'd get a compiler
    error even without banning those macros explicitly.

  - Git-specific allocators like `xmalloc()` and friends. The primary
    reason is that there are simply too many of them, so we're rather
    aiming for best effort here. Furthermore, the eventual goal is to
    make them unavailable in the reftable library place by not pulling
    them in via "git-compat-utils.h" anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
Patrick Steinhardt
24e0ade65b reftable: introduce REFTABLE_FREE_AND_NULL()
We have several calls to `FREE_AND_NULL()` in the reftable library,
which of course uses free(3P). As the reftable allocators are pluggable
we should rather call the reftable specific function, which is
`reftable_free()`.

Introduce a new macro `REFTABLE_FREE_AND_NULL()` and adapt the callsites
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
Patrick Steinhardt
daa59e9c43 reftable: fix calls to free(3P)
There are a small set of calls to free(3P) in the reftable library. As
the reftable allocators are pluggable we should rather call the reftable
specific function, which is `reftable_free()`.

Convert the code accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:56 -07:00
Patrick Steinhardt
12b9078066 reftable: handle trivial allocation failures
Handle trivial allocation failures in the reftable library and its unit
tests.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
Patrick Steinhardt
51afc709dc reftable/tree: handle allocation failures
The tree interfaces of the reftable library handle both insertion and
searching of tree nodes with a single function, where the behaviour is
altered between the two via an `insert` bit. This makes it quit awkward
to handle allocation failures because on inserting we'd have to check
for `NULL` pointers and return an error, whereas on searching entries we
don't have to handle it as an allocation error.

Split up concerns of this function into two separate functions, one for
inserting entries and one for searching entries. This makes it easy for
us to check for allocation errors as `tree_insert()` should never return
a `NULL` pointer now. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
Patrick Steinhardt
d0501c8c9d reftable/pq: handle allocation failures when adding entries
Handle allocation failures when adding entries to the pqueue. Adapt its
only caller accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
Patrick Steinhardt
2d5dbb37b2 reftable/block: handle allocation failures
Handle allocation failures in `block_writer_init()` and
`block_reader_init()`. This requires us to bubble up error codes into
`writer_reinit_block_writer()`. Adapt call sites accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:55 -07:00
Patrick Steinhardt
cd6a47167e reftable/blocksource: handle allocation failures
Handle allocation failures in the blocksource code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
Patrick Steinhardt
cc6a9af5d7 reftable/iter: handle allocation failures when creating indexed table iter
Handle allocation failures in `new_indexed_table_ref_iter()`. While at
it, rename the function to match our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
Patrick Steinhardt
5b67cc6477 reftable/stack: handle allocation failures in auto compaction
Handle allocation failures in `reftable_stack_auto_compact()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
Patrick Steinhardt
694af039f5 reftable/stack: handle allocation failures in stack_compact_range()
Handle allocation failures in `stack_compact_range()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
Patrick Steinhardt
5dbe266212 reftable/stack: handle allocation failures in reftable_new_stack()
Handle allocation failures in `reftable_new_stack()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:54 -07:00
Patrick Steinhardt
dce75e15ff reftable/stack: handle allocation failures on reload
Handle allocation failures in `reftable_stack_reload_once()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
Patrick Steinhardt
0a8372f509 reftable/reader: handle allocation failures in reader_init_iter()
Handle allocation failures in `reader_init_iter()`. This requires us to
also adapt `reftable_reader_init_*_iterator()` to bubble up the new
error codes. Adapt callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
Patrick Steinhardt
18da600293 reftable/reader: handle allocation failures for unindexed reader
Handle allocation failures when creating unindexed readers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
Patrick Steinhardt
802c0646ac reftable/merged: handle allocation failures in merged_table_init_iter()
Handle allocation failures in `merged_table_init_iter()`. While at it,
merge `merged_iter_init()` into the function. It only has a single
caller and merging them makes it easier to handle allocation failures
consistently.

This change also requires us to adapt `reftable_stack_init_*_iterator()`
to bubble up the new error codes of `merged_table_iter_init()`. Adapt
callsites accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:53 -07:00
Patrick Steinhardt
74d1c18757 reftable/writer: handle allocation failures in reftable_new_writer()
Handle allocation failures in `reftable_new_writer()`. Adapt the
function to return an error code to return such failures. While at it,
rename it to match our code style as we have to touch up every callsite
anyway.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
Patrick Steinhardt
b680af2dba reftable/writer: handle allocation failures in writer_index_hash()
Handle allocation errors in `writer_index_hash()`. Adjust its only
caller in `reftable_writer_add_ref()` accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
Patrick Steinhardt
31f5b972e0 reftable/record: handle allocation failures when decoding records
Handle allocation failures when decoding records. While at it, fix some
error codes to be `REFTABLE_FORMAT_ERROR`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
Patrick Steinhardt
ea194f9c46 reftable/record: handle allocation failures on copy
Handle allocation failures when copying records. While at it, convert
from `xstrdup()` to `reftable_strdup()`. Adapt callsites to check for
error codes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:52 -07:00
Patrick Steinhardt
eef7bcdafe reftable/basics: handle allocation failures in parse_names()
Handle allocation failures in `parse_names()` by returning `NULL` in
case any allocation fails. While at it, refactor the function to return
the array directly instead of assigning it to an out-pointer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
Patrick Steinhardt
6593e147d3 reftable/basics: handle allocation failures in reftable_calloc()
Handle allocation failures in `reftable_calloc()`.

While at it, remove our use of `st_mult()` that would cause us to die on
an overflow. From the caller's point of view there is not much of a
difference between arguments that are too large to be multiplied and a
request that is too big to handle by the allocator: in both cases the
allocation cannot be fulfilled. And in neither of these cases do we want
the reftable library to die.

While we could use `unsigned_mult_overflows()` to handle the overflow
gracefully, we instead open-code it to further our goal of converting
the reftable codebase to become a standalone library that can be reused
by external projects.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
Patrick Steinhardt
7f0969febf reftable: introduce reftable_strdup()
The reftable library provides the ability to swap out allocators. There
is a gap here though, because we continue to use `xstrdup()` even in the
case where all the other allocators have been swapped out.

Introduce `reftable_strdup()` that uses `reftable_malloc()` to do the
allocation.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
Patrick Steinhardt
a5a15a4514 reftable/basics: merge "publicbasics" into "basics"
The split between "basics" and "publicbasics" is somewhat arbitrary and
not in line with how we typically structure code in the reftable
library. While we do indeed split up headers into a public and internal
part, we don't do that for the compilation unit itself. Furthermore, the
declarations for "publicbasics.c" are in "reftable-malloc.h", which
isn't in line with our naming schema, either.

Fix these inconsistencies by:

  - Merging "publicbasics.c" into "basics.c".

  - Renaming "reftable-malloc.h" to "reftable-basics.h" as the public
    header.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:51 -07:00
Patrick Steinhardt
bcd5a4059a reftable/error: introduce out-of-memory error code
The reftable library does not use the same memory allocation functions
as the rest of the Git codebase. Instead, as the reftable library is
supposed to be usable as a standalone library without Git, it provides a
set of pluggable memory allocators.

Compared to `xmalloc()` and friends these allocators are _not_ expected
to die when an allocation fails. This design choice is concious, as a
library should leave it to its caller to handle any kind of error. While
it is very likely that the caller cannot really do much in the case of
an out-of-memory situation anyway, we are not the ones to make that
decision.

Curiously though, we never handle allocation errors even though memory
allocation functions are allowed to fail. And as we do not plug in Git's
memory allocator via `reftable_set_alloc()` either the consequence is
that we'd instead segfault as soon as we run out of memory.

While the easy fix would be to wire up `xmalloc()` and friends, it
would only fix the usage of the reftable library in Git itself. Other
users like libgit2, which is about to revive its efforts to land a
backend for reftables, wouldn't be able to benefit from this solution.

Instead, we are about to do it the hard way: adapt all allocation sites
to perform error checking. Introduce a new error code for out-of-memory
errors that we will wire up in subsequent steps.

This commit also serves as the motivator for all the remaining steps in
this series such that we do not have to repeat the same arguments in
every single subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-02 07:53:50 -07:00
Junio C Hamano
ab68c70a8b Merge branch 'ps/reftable-concurrent-writes'
Give timeout to the locking code to write to reftable.

* ps/reftable-concurrent-writes:
  refs/reftable: reload locked stack when preparing transaction
  reftable/stack: allow locking of outdated stacks
  refs/reftable: introduce "reftable.lockTimeout"
2024-09-30 16:16:14 -07:00
Patrick Steinhardt
80e7342ea8 reftable/stack: allow locking of outdated stacks
In `reftable_stack_new_addition()` we first lock the stack and then
check whether it is still up-to-date. If it is not we return an error to
the caller indicating that the stack is outdated.

This is overly restrictive in our ref transaction interface though: we
lock the stack right before we start to verify the transaction, so we do
not really care whether it is outdated or not. What we really want is
that the stack is up-to-date after it has been locked so that we can
verify queued updates against its current state while we know that it is
locked for concurrent modification.

Introduce a new flag `REFTABLE_STACK_NEW_ADDITION_RELOAD` that alters
the behaviour of `reftable_stack_init_addition()` in this case: when we
notice that it is out-of-date we reload it instead of returning an error
to the caller.

This logic will be wired up in the reftable backend in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:25 -07:00
Patrick Steinhardt
bc39b6a796 refs/reftable: introduce "reftable.lockTimeout"
When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.

The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.

Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.

Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24 09:45:25 -07:00
Patrick Steinhardt
0a148a8eda reftable/reader: make table iterator reseekable
In 67ce50ba26 (Merge branch 'ps/reftable-reusable-iterator', 2024-05-30)
we have refactored the interface of reftable iterators such that they
can be reused in theory. This patch series only landed the required
changes on the interface level, but didn't yet implement the actual
logic to make iterators reusable.

As it turns out almost all of the infrastructure already does support
re-seeking. The only exception is the table iterator, which does not
reset its `is_finished` bit. Do so and add a couple of tests that verify
that we can re-seek iterators.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16 13:57:19 -07:00
Chandra Pratap
15e29ea1c6 t: move reftable/stack_test.c to the unit testing framework
reftable/stack_test.c exercises the functions defined in
reftable/stack.{c, h}. Migrate reftable/stack_test.c to the
unit testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to be in-line with unit-tests'
standards.

Since some of the tests use set_test_hash() defined by
reftable/test_framework.{c, h} but these files are not
'#included' in the test file, copy this function in the
ported test file.

With the migration of stack test to the unit-tests framework,
"test-tool reftable" becomes a no-op. Hence, get rid of everything
that uses "test-tool reftable" alongside everything that is used
to implement it.

While at it, alphabetically sort the cmds[] list in
helper/test-tool.c by moving the entry for "dump-reftable".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08 13:24:03 -07:00
Junio C Hamano
17636cdf3b Merge branch 'ps/reftable-concurrent-compaction'
The code path for compacting reftable files saw some bugfixes
against concurrent operation.

* ps/reftable-concurrent-compaction:
  reftable/stack: fix segfault when reload with reused readers fails
  reftable/stack: reorder swapping in the reloaded stack contents
  reftable/reader: keep readers alive during iteration
  reftable/reader: introduce refcounting
  reftable/stack: fix broken refnames in `write_n_ref_tables()`
  reftable/reader: inline `reader_close()`
  reftable/reader: inline `init_reader()`
  reftable/reader: rename `reftable_new_reader()`
  reftable/stack: inline `stack_compact_range_stats()`
  reftable/blocksource: drop malloc block source
2024-09-03 09:15:03 -07:00
Junio C Hamano
839b808325 Merge branch 'cp/unit-test-reftable-block'
Another test for reftable library ported to the unit test framework.

* cp/unit-test-reftable-block:
  t-reftable-block: mark unused argv/argc
  t-reftable-block: add tests for index blocks
  t-reftable-block: add tests for obj blocks
  t-reftable-block: add tests for log blocks
  t-reftable-block: remove unnecessary variable 'j'
  t-reftable-block: use xstrfmt() instead of xstrdup()
  t-reftable-block: use block_iter_reset() instead of block_iter_close()
  t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
  t-reftable-block: use reftable_record_equal() instead of check_str()
  t-reftable-block: release used block reader
  t: harmonize t-reftable-block.c with coding guidelines
  t: move reftable/block_test.c to the unit testing framework
2024-08-29 11:08:16 -07:00
Junio C Hamano
d4d677704d Merge branch 'ps/reftable-drop-generic'
The code in the reftable library has been cleaned up by discarding
unused "generic" interface.

* ps/reftable-drop-generic:
  reftable: mark unused parameters in empty iterator functions
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators
2024-08-29 11:08:16 -07:00
Jeff King
e49d2472d2 reftable: mark unused parameters in empty iterator functions
These unused parameters were marked in a68ec8683a (reftable: mark unused
parameters in virtual functions, 2024-08-17), but the functions were
moved to a new file in a parallel branch via f2406c81b9
(reftable/generic: move generic iterator code into iterator interface,
2024-08-22).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 10:09:56 -07:00
Junio C Hamano
2b30d66c43 Merge branch 'jk/mark-unused-parameters'
Mark unused parameters as UNUSED to squelch -Wunused warnings.

* jk/mark-unused-parameters:
  t-hashmap: stop calling setup() for t_intern() test
  scalar: mark unused parameters in dummy function
  daemon: mark unused parameters in non-posix fallbacks
  setup: mark unused parameter in config callback
  test-mergesort: mark unused parameters in trivial callback
  t-hashmap: mark unused parameters in callback function
  reftable: mark unused parameters in virtual functions
  reftable: drop obsolete test function declarations
  reftable: ignore unused argc/argv in test functions
  unit-tests: ignore unused argc/argv
  t/helper: mark more unused argv/argc arguments
  oss-fuzz: mark unused argv/argc argument
  refs: mark unused parameters in do_for_each_reflog_helper()
  refs: mark unused parameters in ref_store fsck callbacks
  update-ref: mark more unused parameters in parser callbacks
  imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-26 11:32:23 -07:00
Junio C Hamano
668843e6d8 Merge branch 'cp/unit-test-reftable-readwrite'
* cp/unit-test-reftable-readwrite:
  t-reftable-readwrite: add test for known error
  t-reftable-readwrite: use 'for' in place of infinite 'while' loops
  t-reftable-readwrite: use free_names() instead of a for loop
  t: move reftable/readwrite_test.c to the unit testing framework
2024-08-23 09:02:35 -07:00
Patrick Steinhardt
85da2a2ab6 reftable/stack: fix segfault when reload with reused readers fails
It is expected that reloading the stack fails with concurrent writers,
e.g. because a table that we just wanted to read just got compacted.
In case we decided to reuse readers this will cause a segfault though
because we unconditionally release all new readers, including the reused
ones. As those are still referenced by the current stack, the result is
that we will eventually try to dereference those already-freed readers.

Fix this bug by incrementing the refcount of reused readers temporarily.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:48 -07:00
Patrick Steinhardt
1302ed68d4 reftable/stack: reorder swapping in the reloaded stack contents
The code flow of how we swap in the reloaded stack contents is somewhat
convoluted because we switch back and forth between swapping in
different parts of the stack.

Reorder the code to simplify it. We now first close and unlink the old
tables which do not get reused before we update the stack to point to
the new stack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt
89eada4ea1 reftable/reader: keep readers alive during iteration
The lifetime of a table iterator may survive the lifetime of a reader
when the stack gets reloaded. Keep the reader from being released by
increasing its refcount while the iterator is still being used.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt
d857469d85 reftable/reader: introduce refcounting
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt
4ac2fd9b4a reftable/stack: fix broken refnames in write_n_ref_tables()
The `write_n_ref_tables()` helper function writes N references in
separate tables. We never reset the computed name of those references
though, leading us to end up with unexpected names.

Fix this by resetting the buffer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt
00e130a6bb reftable/reader: inline reader_close()
Same as with the preceding commit, we also provide a `reader_close()`
function that allows the caller to close a reader without freeing it.
This is unnecessary now that all users will have an allocated version of
the reader.

Inline it into `reftable_reader_free()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt
2de3c0d345 reftable/reader: inline init_reader()
Most users use an allocated version of the `reftable_reader`, except for
some tests. We are about to convert the reader to become refcounted
though, and providing the ability to keep a reader on the stack makes
this conversion harder than necessary.

Update the tests to use `reftable_reader_new()` instead to prepare for
this change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
Patrick Steinhardt
a0218203cd reftable/reader: rename reftable_new_reader()
Rename the `reftable_new_reader()` function to `reftable_reader_new()`
to match our coding guidelines.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
Patrick Steinhardt
a52bac9ac0 reftable/stack: inline stack_compact_range_stats()
The only difference between `stack_compact_range_stats()` and
`stack_compact_range()` is that the former updates stats on failure,
whereas the latter doesn't. There are no callers anymore that do not
want their stats updated though, making the indirection unnecessary.

Inline the stat updates into `stack_compact_range()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
Patrick Steinhardt
afdafade1a reftable/blocksource: drop malloc block source
The reftable blocksource provides a generic interface to read blocks via
different sources, e.g. from disk or from memory. One of the block
sources is the malloc block source, which can in theory read data from
memory. We nowadays also have a strbuf block source though, which
provides essentially the same functionality with better ergonomics.

Adapt the only remaining user of the malloc block source in our tests
to use the strbuf block source, instead, and remove the now-unused
malloc block source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00