Commit Graph

260 Commits

Author SHA1 Message Date
Seyi Kuforiji
ee0a88dadb t/unit-tests: convert reftable readwrite test to use clar
Adapt reftable readwrite test file to use clar by using clar assertions
where necessary.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:03 -07:00
Seyi Kuforiji
18a992b7b7 t/unit-tests: convert reftable table test to use clar
Adapt reftable table test file to use clar by using clar assertions
where necessary.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:03 -07:00
Seyi Kuforiji
a0aaa85c0c t/unit-tests: convert reftable pq test to use clar
Adapt reftable priority queue test file to use clar by using clar
assertions where necessary.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:03 -07:00
Seyi Kuforiji
c7784ba600 t/unit-tests: convert reftable merged test to use clar
Adapt reftable merged test file to use clar testing framework by using
clar assertions where necessary.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:02 -07:00
Seyi Kuforiji
a83bf04d8b t/unit-tests: convert reftable block test to use clar
Adapt reftable block test file to use clar testing framework by using
clar assertions where necessary.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:02 -07:00
Seyi Kuforiji
ed5dcbf2f0 t/unit-tests: convert reftable basics test to use clar test framework
Adapt reftable basics test file to clar by using clar assertions
where necessary.Break up test edge case to improve modularity and
clarity.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:02 -07:00
Seyi Kuforiji
5dd5c4e345 t/unit-tests: implement clar specific reftable test helper functions
Helper functions defined in `t/unit-tests/lib-reftable.{c,h}` are
required for the reftable-related test files to run. In the current
implementation these functions are designed to conform with our
homegrown unit-testing structure. So in other to convert the reftable
test files, there is need for a clar specific implementation of these
helper functions.

Implement equivalent helper functions in `lib-reftable-clar.{c,h}` to
use clar. These functions conform with the clar testing framework and
become available for all reftable-related test files implemented using
the clar testing framework, which requires them. This will be used by
subsequent commits.

Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-24 11:46:01 -07:00
Junio C Hamano
90eedabbf7 Merge branch 'ps/reftable-read-block-perffix'
Performance regression in not-yet-released code has been corrected.

* ps/reftable-read-block-perffix:
  reftable: fix perf regression when reading blocks of unwanted type
2025-05-19 16:02:48 -07:00
Jeff King
2744646834 oidmap: rename oidmap_free() to oidmap_clear()
This function does not free the oidmap struct itself; it just drops all
items from the map (using hashmap_clear_() internally). It should be
called oidmap_clear(), per CodingGuidelines.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 13:06:26 -07:00
Patrick Steinhardt
1970333644 reftable: fix perf regression when reading blocks of unwanted type
In fd888311fb (reftable/table: move reading block into block reader,
2025-04-07), we have refactored how reftable blocks are read so that
most of the logic is contained in the "block.c" subsystem itself. Most
importantly, the whole logic to read the data itself is now contained in
that subsystem.

This change caused a significant performance regression though when
reading blocks that aren't of the specific type one is searching for:

    Benchmark 1: update-ref: create 100k refs (revision = fd888311fbc~)
      Time (mean ± σ):      2.171 s ±  0.028 s    [User: 1.189 s, System: 0.977 s]
      Range (min … max):    2.117 s …  2.206 s    10 runs

    Benchmark 2: update-ref: create 100k refs (revision = fd888311fb)
      Time (mean ± σ):      3.418 s ±  0.030 s    [User: 2.371 s, System: 1.037 s]
      Range (min … max):    3.377 s …  3.473 s    10 runs

    Summary
      update-ref: create 100k refs (revision = fd888311fbc~) ran
        1.57 ± 0.02 times faster than update-ref: create 100k refs (revision = fd888311fb)

The root caute of the performance regression is that we changed when
exactly blocks of an uninteresting type are being discarded. Previous to
the refactoring in the mentioned commit we'd load the block data, read
its type, notice that it's not the wanted type and discard the block.
After the commit though we don't discard the block immediately, but we
fully decode it only to realize that it's not the desired type. We then
discard the block again, but have already performed a bunch of pointless
work.

Fix the regression by making `reftable_block_init()` return early in
case the block is not of the desired type. This fixes the performance
hit:

    Benchmark 1: update-ref: create 100k refs (revision = HEAD~)
      Time (mean ± σ):      2.712 s ±  0.018 s    [User: 1.990 s, System: 0.716 s]
      Range (min … max):    2.682 s …  2.741 s    10 runs

    Benchmark 2: update-ref: create 100k refs (revision = HEAD)
      Time (mean ± σ):      1.670 s ±  0.012 s    [User: 0.991 s, System: 0.676 s]
      Range (min … max):    1.652 s …  1.693 s    10 runs

    Summary
      update-ref: create 100k refs (revision = HEAD) ran
        1.62 ± 0.02 times faster than update-ref: create 100k refs (revision = HEAD~)

Note that the baseline performance is lower than in the original due to
a couple of unrelated performance improvements that have landed since
the original commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12 10:55:24 -07:00
Junio C Hamano
a819a3da85 Merge branch 'ps/reftable-api-revamp'
Overhaul of the reftable API.

* ps/reftable-api-revamp:
  reftable/table: move printing logic into test helper
  reftable/constants: make block types part of the public interface
  reftable/table: introduce iterator for table blocks
  reftable/table: add `reftable_table` to the public interface
  reftable/block: expose a generic iterator over reftable records
  reftable/block: make block iterators reseekable
  reftable/block: store block pointer in the block iterator
  reftable/block: create public interface for reading blocks
  git-zlib: use `struct z_stream_s` instead of typedef
  reftable/block: rename `block_reader` to `reftable_block`
  reftable/block: rename `block` to `block_data`
  reftable/table: move reading block into block reader
  reftable/block: simplify how we track restart points
  reftable/blocksource: consolidate code into a single file
  reftable/reader: rename data structure to "table"
  reftable: fix formatting of the license header
2025-04-29 14:21:30 -07:00
Junio C Hamano
779534d5a7 Merge branch 'sk/clar-trailer-urlmatch-norm-test'
A few traditional unit tests have been rewritten to use the clar
framework.

* sk/clar-trailer-urlmatch-norm-test:
  t/unit-tests: convert urlmatch-normalization test to clar
  t/unit-tests: convert trailer test to use clar
2025-04-16 13:54:18 -07:00
Junio C Hamano
7b03646f85 Merge branch 'js/comma-semicolon-confusion'
Code clean-up.

* js/comma-semicolon-confusion:
  detect-compiler: detect clang even if it found CUDA
  clang: warn when the comma operator is used
  compat/regex: explicitly mark intentional use of the comma operator
  wildmatch: avoid using of the comma operator
  diff-delta: avoid using the comma operator
  xdiff: avoid using the comma operator unnecessarily
  clar: avoid using the comma operator unnecessarily
  kwset: avoid using the comma operator unnecessarily
  rebase: avoid using the comma operator unnecessarily
  remote-curl: avoid using the comma operator unnecessarily
2025-04-15 13:50:17 -07:00
Junio C Hamano
6e2a3b8ae0 Merge branch 'ps/reftable-sans-compat-util'
Make the code in reftable library less reliant on the service
routines it used to borrow from Git proper, to make it easier to
use by external users of the library.

* ps/reftable-sans-compat-util:
  Makefile: skip reftable library for Coccinelle
  reftable: decouple from Git codebase by pulling in "compat/posix.h"
  git-compat-util.h: split out POSIX-emulating bits
  compat/mingw: split out POSIX-related bits
  reftable/basics: introduce `REFTABLE_UNUSED` annotation
  reftable/basics: stop using `SWAP()` macro
  reftable/stack: stop using `sleep_millisec()`
  reftable/system: introduce `reftable_rand()`
  reftable/reader: stop using `ARRAY_SIZE()` macro
  reftable/basics: provide wrappers for big endian conversion
  reftable/basics: stop using `st_mult()` in array allocators
  reftable: stop using `BUG()` in trivial cases
  reftable/record: don't `BUG()` in `reftable_record_cmp()`
  reftable/record: stop using `BUG()` in `reftable_record_init()`
  reftable/record: stop using `COPY_ARRAY()`
  reftable/blocksource: stop using `xmmap()`
  reftable/stack: stop using `write_in_full()`
  reftable/stack: stop using `read_in_full()`
2025-04-08 11:43:14 -07:00
Patrick Steinhardt
0f8ee94b63 reftable/constants: make block types part of the public interface
Now that reftable blocks can be read individually via the public
interface it becomes necessary for callers to be able to distinguish the
different types of blocks. Expose the relevant constants.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:12 -07:00
Patrick Steinhardt
da89659365 reftable/table: introduce iterator for table blocks
Introduce a new iterator that allows the caller to iterate through all
blocks contained in a table. This gives users more fine-grained control
over how exactly those blocks are being read and exposes information to
callers that was previously inaccessible.

This iterator will be required by a future patch series that adds
consistency checks for the reftable backend. In addition to that though
we will also reimplement `reftable_table_print_blocks()` on top of this
new iterator in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:12 -07:00
Patrick Steinhardt
50d8459477 reftable/block: expose a generic iterator over reftable records
Expose a generic iterator over reftable records and expose it via the
public interface. Together with an upcoming iterator for reftable blocks
contained in a table this will allow users to trivially iterate through
blocks and their respective records individually.

This functionality will be used to implement consistency checks for the
reftable backend, which requires more fine-grained control over how we
read data.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:12 -07:00
Patrick Steinhardt
6da48a5e00 reftable/block: make block iterators reseekable
Refactor the block iterators so that initialization and seeking are
different from one another. This makes the iterator trivially reseekable
by storing the pointer to the block at initialization time, which we can
then reuse on every seek.

This refactoring prepares the code for exposing a `reftable_iterator`
interface for blocks in a subsequent commit. Callsites are adjusted
accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:11 -07:00
Patrick Steinhardt
12a9aa8cb7 reftable/block: rename block_reader to reftable_block
The `block_reader` structure is used to access parsed data of a reftable
block. The structure is currently treated as an internal implementation
detail and not exposed via our public interfaces. The functionality
provided by the structure is useful to external users of the reftable
library though, for example when implementing consistency checks that
need to scan through the blocks manually.

Rename the structure to `reftable_block` now that the name has been made
available in the preceding commit. This name is in line with the naming
schema used for other data structures like `reftable_table` in that it
describes the underlying entity that it provides access to.

The new data structure isn't yet exposed via the public interface, which
is left for a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:10 -07:00
Patrick Steinhardt
2b3362c10d reftable/block: rename block to block_data
The `reftable_block` structure associates a byte slice with a block
source. As such it only holds the data of a reftable block without
actually encoding any of the details for how to access that data.

Rename the structure to instead be called `reftable_block_data`. Besides
clarifying that this really only holds data, it also allows us to rename
the `reftable_block_reader` to `reftable_block` in the next commit, as
this is the structure that actually encapsulates access to the reftable
blocks.

Rename the `struct reftable_block_reader::block` member accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:10 -07:00
Patrick Steinhardt
fd888311fb reftable/table: move reading block into block reader
The logic to read blocks from a reftable is scattered across both the
table and the block subsystems. Besides causing somewhat fuzzy
responsibilities, it also means that we have to awkwardly pass around
the ownership of blocks between the subsystems.

Refactor the code so that we stop passing the block when initializing a
reader, but instead by passing in the block source plus the offset at
which we're supposed to read a block. Like this, the ownership of the
block itself doesn't need to get handed over as the block reader is the
one owning the block right from the start.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:10 -07:00
Patrick Steinhardt
1ac4e5e83d reftable/blocksource: consolidate code into a single file
The code that implements block sources is distributed across a couple of
files. Consolidate all of it into "reftable/blocksource.c" and its
accompanying header so that it is easier to locate and more self
contained.

While at it, rename some of the functions to have properly scoped names.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:09 -07:00
Patrick Steinhardt
b648bd6549 reftable/reader: rename data structure to "table"
The `struct reftable_reader` subsystem encapsulates a table that has
been read from the disk. As such, the current name of that structure is
somewhat hard to understand as it only talks about the fact that we read
something from disk, without really giving an indicator _what_ that is.

Furthermore, this naming schema doesn't really fit well into how the
other structures are named: `reftable_merged_table`, `reftable_stack`,
`reftable_block` and `reftable_record` are all named after what they
encapsulate.

Rename the subsystem to `reftable_table`, which directly gives a hint
that the data structure is about handling the individual tables part of
the stack.

While this change results in a lot of churn, it prepares for us exposing
the APIs to third-party callers now that the reftable library is a
standalone library that can be linked against by other projects.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07 14:53:09 -07:00
Junio C Hamano
c7c4e5e419 Merge branch 'ps/reftable-sans-compat-util' into ps/reftable-api-revamp
* ps/reftable-sans-compat-util:
  Makefile: skip reftable library for Coccinelle
  reftable: decouple from Git codebase by pulling in "compat/posix.h"
  git-compat-util.h: split out POSIX-emulating bits
  compat/mingw: split out POSIX-related bits
  reftable/basics: introduce `REFTABLE_UNUSED` annotation
  reftable/basics: stop using `SWAP()` macro
  reftable/stack: stop using `sleep_millisec()`
  reftable/system: introduce `reftable_rand()`
  reftable/reader: stop using `ARRAY_SIZE()` macro
  reftable/basics: provide wrappers for big endian conversion
  reftable/basics: stop using `st_mult()` in array allocators
  reftable: stop using `BUG()` in trivial cases
  reftable/record: don't `BUG()` in `reftable_record_cmp()`
  reftable/record: stop using `BUG()` in `reftable_record_init()`
  reftable/record: stop using `COPY_ARRAY()`
  reftable/blocksource: stop using `xmmap()`
  reftable/stack: stop using `write_in_full()`
  reftable/stack: stop using `read_in_full()`
2025-04-01 19:05:13 +09:00
Johannes Schindelin
0fbbb2c9f5 clar: avoid using the comma operator unnecessarily
The comma operator is a somewhat obscure C feature that is often used by
mistake and can even cause unintentional code flow. In this instance, it
makes the code harder to read than necessary, too. Better use a
semicolon instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-28 17:38:09 -07:00
Seyi Kuforiji
7d4212b8f3 t/unit-tests: convert urlmatch-normalization test to clar
Adapt urlmatch-normalization test file to use clar testing framework by
using clar assertions where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-04 10:01:43 -08:00
Seyi Kuforiji
bc9342771b t/unit-tests: convert trailer test to use clar
Adapt trailer test file to use clar testing framework by using clar
assertions where necessary. Split test into individual test functions
for clarity and maintainability. Each test case now has its own
function, making it easier to isolate failures and improve test
readability.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-04 09:59:17 -08:00
Seyi Kuforiji
149585079f t/unit-tests: convert oidtree test to use clar test framework
Adapt oidtree test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidtree__initialize` handles the to set up of the global
oidtree variable and `test_oidtree__cleanup` frees the oidtree when all
tests are completed.

With this change, `check_each` stops at the first error encountered,
making it easier to address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:23 -08:00
Seyi Kuforiji
69bc044def t/unit-tests: convert oidmap test to use clar test framework
Adapt oidmap test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidmap__initialize` handles the to set up of the global
oidmap map with predefined key-value pairs, and `test_oidmap__cleanup`
frees the oidmap and its entries when all tests are completed.

The test loops through all entries to detect multiple errors. With this
change, it stops at the first error encountered, making it easier to
address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Seyi Kuforiji
869a1edf44 t/unit-tests: convert oid-array test to use clar test framework
Adapt oid-array test script to clar framework by using clar assertions
where necessary. Remove descriptions from macros to reduce
redundancy, and move test input arrays to global scope for reuse across
multiple test functions. Introduce `test_oid_array__initialize()` to
explicitly initialize the hash algorithm.

These changes streamline the test suite, making individual tests
self-contained and reducing redundant code.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Seyi Kuforiji
a16a2ee312 t/unit-tests: implement clar specific oid helper functions
`get_oid_arbitrary_hex()` and `init_hash_algo()` are both required for
oid-related tests to run without errors. In the current implementation,
both functions are defined and declared in the
`t/unit-tests/lib-oid.{c,h}` which is utilized by oid-related tests in
the homegrown unit tests structure.

Adapt functions in lib-oid.{c,h} to use clar. Both these functions
become available for oid-related test files implemented using the clar
testing framework, which requires them. This will be used by subsequent
commits.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Patrick Steinhardt
e676694298 reftable/basics: provide wrappers for big endian conversion
We're using a mixture of big endian conversion functions provided by
both the reftable library, but also by the Git codebase. Refactor the
code so that we exclusively use reftable-provided wrappers in order to
untangle us from the Git codebase.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 10:55:37 -08:00
Patrick Steinhardt
6f6127decd reftable/record: don't BUG() in reftable_record_cmp()
The reftable library aborts with a bug in case `reftable_record_cmp()`
is invoked with two records of differing types. This would cause the
program to die without the caller being able to handle the error, which
is not something we want in the context of library code. And it ties us
to the Git codebase.

Refactor the code such that `reftable_record_cmp()` returns an error
code separate from the actual comparison result. This requires us to
also adapt some callers up the callchain in a similar fashion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 10:55:36 -08:00
Patrick Steinhardt
9d9fac0f34 reftable/record: stop using BUG() in reftable_record_init()
We're aborting the program via `BUG()` in case `reftable_record_init()`
was invoked with an unknown record type. This is bad because we may now
die in library code, and because it makes us depend on the Git codebase.

Refactor the code such that `reftable_record_init()` can return an error
code to the caller. Adapt any callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 10:55:36 -08:00
Junio C Hamano
82522a9e2c Merge branch 'kn/reflog-migration-fix-followup'
Code clean-up.

* kn/reflog-migration-fix-followup:
  reftable: prevent 'update_index' changes after adding records
  refs: use 'uint64_t' for 'ref_update.index'
  refs: mark `ref_transaction_update_reflog()` as static
2025-02-14 17:53:48 -08:00
Junio C Hamano
6f0b72205d Merge branch 'sk/unit-tests-0130'
Convert a handful of unit tests to work with the clar framework.

* sk/unit-tests-0130:
  t/unit-tests: convert strcmp-offset test to use clar test framework
  t/unit-tests: convert strbuf test to use clar test framework
  t/unit-tests: adapt example decorate test to use clar test framework
  t/unit-tests: convert hashmap test to use clar test framework
2025-02-10 10:18:31 -08:00
Seyi Kuforiji
af8bf677c1 t/unit-tests: convert strcmp-offset test to use clar test framework
Adapt strcmp-offset test script to clar framework by using clar
assertions where necessary. Introduce `test_strcmp_offset__empty()` to
verify `check_strcmp_offset()` behavior when both input strings are
empty. This ensures the function correctly handles edge cases and
returns expected values.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
Seyi Kuforiji
4b995465b2 t/unit-tests: convert strbuf test to use clar test framework
Adapt strbuf test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
Seyi Kuforiji
e0f807bdad t/unit-tests: adapt example decorate test to use clar test framework
Introduce `test_example_decorate__initialize()` to explicitly set up
object IDs and retrieve corresponding objects before tests run. This
ensures a consistent and predictable test state without relying on data
from previous tests.

Add `test_example_decorate__cleanup()` to clear decorations after each
test, preventing interference between tests and ensuring each runs in
isolation.

Adapt example decorate test script to clar framework by using clar
assertions where necessary. Previously, tests relied on data written by
earlier tests, leading to unintended dependencies between them. This
explicitly initializes the necessary state within
`test_example_decorate__readd`, ensuring it does not depend on prior
test executions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
Seyi Kuforiji
38b066ee76 t/unit-tests: convert hashmap test to use clar test framework
Adapts hashmap test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
Patrick Steinhardt
0578f1e66a global: adapt callers to use generic hash context helpers
Adapt callers to use generic hash context helpers instead of using the
hash algorithm to update them. This makes the callsites easier to reason
about and removes the possibility that the wrong hash algorithm is used
to update the hash context's state. And as a nice side effect this also
gets rid of a bunch of users of `the_hash_algo`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:11 -08:00
Patrick Steinhardt
7346e340f1 hash: stop typedeffing the hash context
We generally avoid using `typedef` in the Git codebase. One exception
though is the `git_hash_ctx`, likely because it used to be a union
rather than a struct until the preceding commit refactored it. But now
that it is a normal `struct` there isn't really a need for a typedef
anymore.

Drop the typedef and adapt all callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 10:06:10 -08:00
Junio C Hamano
a17fd7dd3a Merge branch 'ps/reftable-sign-compare'
The reftable/ library code has been made -Wsign-compare clean.

* ps/reftable-sign-compare:
  reftable: address trivial -Wsign-compare warnings
  reftable/blocksource: adjust `read_block()` to return `ssize_t`
  reftable/blocksource: adjust type of the block length
  reftable/block: adjust type of the restart length
  reftable/block: adapt header and footer size to return a `size_t`
  reftable/basics: adjust `hash_size()` to return `uint32_t`
  reftable/basics: adjust `common_prefix_size()` to return `size_t`
  reftable/record: handle overflows when decoding varints
  reftable/record: drop unused `print` function pointer
  meson: stop disabling -Wsign-compare
2025-01-28 13:02:24 -08:00
Junio C Hamano
8d335468ec Merge branch 'sk/unit-tests'
Move a few more unit tests to the clar test framework.

* sk/unit-tests:
  t/unit-tests: convert reftable tree test to use clar test framework
  t/unit-tests: adapt priority queue test to use clar test framework
  t/unit-tests: convert mem-pool test to use clar test framework
  t/unit-tests: handle dashes in test suite filenames
2025-01-28 13:02:22 -08:00
Karthik Nayak
017bd89239 reftable: prevent 'update_index' changes after adding records
The function `reftable_writer_set_limits()` allows updating the
'min_update_index' and 'max_update_index' of a reftable writer. These
values are written to both the writer's header and footer.

Since the header is written during the first block write, any subsequent
changes to the update index would create a mismatch between the header
and footer values. The footer would contain the newer values while the
header retained the original ones.

To protect against this bug, prevent callers from updating these values
after any record is written. To do this, modify the function to return
an error whenever the limits are modified after any record adds. Check
for record adds within `reftable_writer_set_limits()` by checking the
`last_key` and `next` variable. The former is updated after each record
added, but is reset at certain points. The latter is set after writing
the first block.

Modify all callers of the function to anticipate a return type and
handle it accordingly. Add a unit test to also ensure the function
returns the error as expected.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 09:51:36 -08:00
Patrick Steinhardt
ffe6643668 reftable/block: adapt header and footer size to return a size_t
The functions `header_size()` and `footer_size()` return a positive
integer representing the size of the header and footer, respectively,
dependent on the version of the reftable format. Similar to the
preceding commit, these functions return a signed integer though, which
is nonsensical given that there is no way for these functions to return
negative.

Adapt the functions to return a `size_t` instead to fix a couple of sign
comparison warnings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
Patrick Steinhardt
57adf71b93 reftable/basics: adjust hash_size() to return uint32_t
The `hash_size()` function returns the number of bytes used by the hash
function. Weirdly enough though, it returns a signed integer for its
size even though the size obviously cannot ever be negative. The only
case where it could be negative is if the function returned an error
when asked for an unknown hash, but we assert(3p) instead.

Adjust the type of `hash_size()` to be `uint32_t` and adapt all places
that use signed integers for the hash size to follow suit. This also
allows us to get rid of a couple asserts that we had which verified that
the size was indeed positive, which further stresses the point that this
refactoring makes sense.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
Patrick Steinhardt
5ac65f0d6b reftable/basics: adjust common_prefix_size() to return size_t
The `common_prefix_size()` function computes the length of the common
prefix between two buffers. As such its return value will always be an
unsigned integer, as the length cannot be negative. Regardless of that,
the function returns a signed integer, which is nonsensical and causes a
couple of -Wsign-compare warnings all over the place.

Adjust the function to return a `size_t` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:29 -08:00
Patrick Steinhardt
072e3aa3a5 reftable/record: handle overflows when decoding varints
The logic to decode varints isn't able to detect integer overflows: as
long as the buffer still has more data available, and as long as the
current byte has its 0x80 bit set, we'll continue to add up these values
to the result. This will eventually cause the `uint64_t` to overflow, at
which point we'll return an invalid result.

Refactor the function so that it is able to detect such overflows. The
implementation is basically copied from Git's own `decode_varint()`,
which already knows to handle overflows. The only adjustment is that we
also take into account the string view's length in order to not overrun
it. The reftable documentation explicitly notes that those two encoding
schemas are supposed to be the same:

    Varint encoding
    ^^^^^^^^^^^^^^^

    Varint encoding is identical to the ofs-delta encoding method used
    within pack files.

    Decoder works as follows:

    ....
    val = buf[ptr] & 0x7f
    while (buf[ptr] & 0x80) {
      ptr++
      val = ((val + 1) << 7) | (buf[ptr] & 0x7f)
    }
    ....

While at it, refactor `put_var_int()` in the same way by copying over
the implementation of `encode_varint()`. While `put_var_int()` doesn't
have an issue with overflows, it generates warnings with -Wsign-compare.
The implementation of `encode_varint()` doesn't, is battle-tested and at
the same time way simpler than what we currently have.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 14:20:28 -08:00
Junio C Hamano
85cf8801c8 Merge branch 'sk/unit-test-hash'
Test update.

* sk/unit-test-hash:
  t/unit-tests: convert hash to use clar test framework
2025-01-21 08:44:55 -08:00