Commit Graph

72906 Commits

Author SHA1 Message Date
Patrick Steinhardt
ce1f213cc9 reftable/block: reuse zstream state on inflation
When calling `inflateInit()` and `inflate()`, the zlib library will
allocate several data structures for the underlying `zstream` to keep
track of various information. Thus, when inflating repeatedly, it is
possible to optimize memory allocation patterns by reusing the `zstream`
and then calling `inflateReset()` on it to prepare it for the next chunk
of data to inflate.

This is exactly what the reftable code is doing: when iterating through
reflogs we need to potentially inflate many log blocks, but we discard
the `zstream` every single time. Instead, as we reuse the `block_reader`
for each of the blocks anyway, we can initialize the `zstream` once and
then reuse it for subsequent inflations.

Refactor the code to do so, which leads to a significant reduction in
the number of allocations. The following measurements were done when
iterating through 1 million reflog entries. Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,552 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 302 allocs, 180 frees, 88,352 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
15a60b747e reftable/block: open-code call to uncompress2()
The reftable format stores log blocks in a compressed format. Thus,
whenever we want to read such a block we first need to decompress it.
This is done by calling the convenience function `uncompress2()` of the
zlib library, which is a simple wrapper that manages the lifecycle of
the `zstream` structure for us.

While nice for one-off inflation of data, when iterating through reflogs
we will likely end up inflating many such log blocks. This requires us
to reallocate the state of the `zstream` every single time, which adds
up over time. It would thus be great to reuse the `zstream` instead of
discarding it after every inflation.

Open-code the call to `uncompress2()` such that we can start reusing the
`zstream` in the subsequent commit. Note that our open-coded variant is
different from `uncompress2()` in two ways:

  - We do not loop around `inflate()` until we have processed all input.
    As our input is limited by the maximum block size, which is 16MB, we
    should not hit limits of `inflate()`.

  - We use `Z_FINISH` instead of `Z_NO_FLUSH`. Quoting the `inflate()`
    documentation: "inflate() should normally be called until it returns
    Z_STREAM_END or an error. However if all decompression is to be
    performed in a single step (a single call of inflate), the parameter
    flush should be set to Z_FINISH."

    Furthermore, "Z_FINISH also informs inflate to not maintain a
    sliding window if the stream completes, which reduces inflate's
    memory footprint."

Other than that this commit is expected to be functionally equivalent
and does not yet reuse the `zstream`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
dd347bbce6 reftable/block: reuse uncompressed blocks
The reftable backend stores reflog entries in a compressed format and
thus needs to uncompress blocks before one can read records from it.
For each reflog block we thus have to allocate an array that we can
decompress the block contents into. This block is being discarded
whenever the table iterator moves to the next block. Consequently, we
reallocate a new array on every block, which is quite wasteful.

Refactor the code to reuse the uncompressed block data when moving the
block reader to a new block. This significantly reduces the number of
allocations when iterating through many compressed blocks. The following
measurements are done with `git reflog list` when listing 100k reflogs.
Before:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 45,755 allocs, 45,633 frees, 254,779,456 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,473 bytes in 122 blocks
    total heap usage: 23,028 allocs, 22,906 frees, 162,813,547 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
b00bcb7c49 reftable/reader: iterate to next block in place
The table iterator has to iterate towards the next block once it has
yielded all records of the current block. This is done by creating a new
table iterator, initializing it to the next block, releasing the old
iterator and then copying over the data.

Refactor the code to instead advance the table iterator in place. This
is simpler and unlocks some optimizations in subsequent patches. Also,
it allows us to avoid some allocations.

The following measurements show a single matching ref out of 1 million
refs. Before this change:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 315 allocs, 190 frees, 107,027 bytes allocated

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
bcdc586db0 reftable/block: move ownership of block reader into struct table_iter
The table iterator allows the caller to iterate through all records in a
reftable table. To do so it iterates through all blocks of the desired
type one by one, where for each block it creates a new block iterator
and yields all its entries.

One of the things that is somewhat confusing in this context is who owns
the block reader that is being used to read the blocks and pass them to
the block iterator. Intuitively, as the table iterator is responsible
for iterating through the blocks, one would assume that this iterator is
also responsible for managing the lifecycle of the reader. And while it
somewhat is, the block reader is ultimately stored inside of the block
iterator.

Refactor the code such that the block reader is instead fully managed by
the table iterator. Instead of passing the reader to the block iterator,
we now only end up passing the block data to it. Despite clearing up the
lifecycle of the reader, it will also allow for better reuse of the
reader in subsequent patches.

The following benchmark prints a single matching ref out of 1 million
refs. Before:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 6,607 allocs, 6,482 frees, 509,635 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 7,235 allocs, 7,110 frees, 301,481 bytes allocated

Note that while there are more allocation and free calls now, the
overall number of bytes allocated is significantly lower. The number of
allocations will be reduced significantly by the next patch though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
b371221a60 reftable/block: introduce block_reader_release()
Introduce a new function `block_reader_release()` that releases
resources acquired by the block reader. This function will be extended
in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
aac8c03cc4 reftable/block: better grouping of functions
Function definitions and declaration of `struct block_reader` and
`struct block_iter` are somewhat mixed up, making it hard to see which
functions belong together. Rearrange them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
42c7bdc36d reftable/block: merge block_iter_seek() and block_reader_seek()
The function `block_iter_seek()` is merely a simple wrapper around
`block_reader_seek()`. Merge those two functions into a new function
`block_iter_seek_key()` that more clearly says what it is actually
doing.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Patrick Steinhardt
3122d44025 reftable/block: rename block_reader_start()
The function `block_reader_start()` does not really apply to the block
reader, but to the block iterator. It's name is thus somewhat confusing.
Rename it to `block_iter_seek_start()` to clarify.

We will rename `block_reader_seek()` in similar spirit in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:36:09 -07:00
Junio C Hamano
8f7582d995 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 11:31:39 -07:00
Junio C Hamano
d8360a86ed Merge branch 'tb/midx-write'
Code clean-up by splitting code responsible for writing midx files
into its own file.

* tb/midx-write:
  midx-write.c: use `--stdin-packs` when repacking
  midx-write.c: check count of packs to repack after grouping
  midx-write.c: factor out common want_included_pack() routine
  midx-write: move writing-related functions from midx.c
2024-04-12 11:31:39 -07:00
Junio C Hamano
28dc93bab0 Merge branch 'rs/t-prio-queue-cleanup'
t-prio-queue test has been cleaned up by using C99 compound
literals; this is meant to also serve as a weather-balloon to smoke
out folks with compilers who have trouble compiling code that uses
the feature.

* rs/t-prio-queue-cleanup:
  t-prio-queue: simplify using compound literals
2024-04-12 11:31:39 -07:00
Junio C Hamano
7fbe3ead19 Merge branch 'ps/reftable-binsearch-updates'
Reftable code clean-up and some bugfixes.

* ps/reftable-binsearch-updates:
  reftable/block: avoid decoding keys when searching restart points
  reftable/record: extract function to decode key lengths
  reftable/block: fix error handling when searching restart points
  reftable/block: refactor binary search over restart points
  reftable/refname: refactor binary search over refnames
  reftable/basics: improve `binsearch()` test
  reftable/basics: fix return type of `binsearch()` to be `size_t`
2024-04-12 11:31:39 -07:00
Junio C Hamano
847af43a3a Merge branch 'jc/checkout-detach-wo-tracking-report'
"git checkout/switch --detach foo", after switching to the detached
HEAD state, gave the tracking information for the 'foo' branch,
which was pointless.

Tested-by: M Hickford <mirth.hickford@gmail.com>
cf. <CAGJzqsmE9FDEBn=u3ge4LA3ha4fDbm4OWiuUbMaztwjELBd7ug@mail.gmail.com>

* jc/checkout-detach-wo-tracking-report:
  checkout: omit "tracking" information on a detached HEAD
2024-04-12 11:31:39 -07:00
Junio C Hamano
d8800f630a Merge branch 'rs/imap-send-use-xsnprintf'
Code clean-up and duplicate reduction.

* rs/imap-send-use-xsnprintf:
  imap-send: use xsnprintf to format command
2024-04-12 11:31:38 -07:00
Junio C Hamano
d842e22ebb Merge branch 'js/merge-tree-3-trees'
Match the option argument type in the help text to the correct type
updated by a recent series.

* js/merge-tree-3-trees:
  merge-tree: fix argument type of the `--merge-base` option
2024-04-12 11:31:38 -07:00
Johannes Schindelin
0c6ee971fb merge-tree: fix argument type of the --merge-base option
In 5f43cf5b2e (merge-tree: accept 3 trees as arguments, 2024-01-28), I
taught `git merge-tree` to perform three-way merges on trees. This
commit even changed the manual page to state that the `--merge-base`
option takes a tree-ish rather than requiring a commit.

But I forgot to adjust the in-program help text. This patch fixes that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-12 09:10:43 -07:00
Junio C Hamano
436d4e5b14 The seventeenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-10 10:00:09 -07:00
Junio C Hamano
f43863e686 Merge branch 'jc/t2104-style-update'
Coding style fixes.

* jc/t2104-style-update:
  t2104: style fixes
2024-04-10 10:00:09 -07:00
Junio C Hamano
280b74ce18 Merge branch 'kn/clarify-update-ref-doc'
Doc update, as a preparation to enhance "git update-ref --stdin".

* kn/clarify-update-ref-doc:
  githooks: use {old,new}-oid instead of {old,new}-value
  update-ref: use {old,new}-oid instead of {old,new}value
2024-04-10 10:00:08 -07:00
Junio C Hamano
a4a1453ad1 Merge branch 'vs/complete-with-set-u-fix'
Another "set -u" fix for the bash prompt (in contrib/) script.

* vs/complete-with-set-u-fix:
  completion: protect prompt against unset SHOWUPSTREAM in nounset mode
  completion: fix prompt with unset SHOWCONFLICTSTATE in nounset mode
2024-04-10 10:00:08 -07:00
Junio C Hamano
aaf524cfb0 Merge branch 'rs/mem-pool-size-t-safety'
size_t arithmetic safety.

* rs/mem-pool-size-t-safety:
  mem-pool: use st_add() in mem_pool_strvfmt()
2024-04-10 10:00:08 -07:00
Junio C Hamano
dc89c59951 Merge branch 'ds/typofix-core-config-doc'
Typofix.

* ds/typofix-core-config-doc:
  config: fix some small capitalization issues, as spotted
2024-04-10 10:00:08 -07:00
Junio C Hamano
91ec36f2cc The sixteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-09 14:31:45 -07:00
Junio C Hamano
8f31543f3d Merge branch 'rj/use-adv-if-enabled'
Use advice_if_enabled() API to rewrite a simple pattern to
call advise() after checking advice_enabled().

* rj/use-adv-if-enabled:
  add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO
  add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC
  add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
2024-04-09 14:31:45 -07:00
Junio C Hamano
eacfd581d2 Merge branch 'ps/pack-refs-auto'
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".

Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>

* ps/pack-refs-auto:
  builtin/gc: pack refs when using `git maintenance run --auto`
  builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
  t6500: extract objects with "17" prefix
  builtin/gc: move `struct maintenance_run_opts`
  builtin/pack-refs: introduce new "--auto" flag
  builtin/pack-refs: release allocated memory
  refs/reftable: expose auto compaction via new flag
  refs: remove `PACK_REFS_ALL` flag
  refs: move `struct pack_refs_opts` to where it's used
  t/helper: drop pack-refs wrapper
  refs/reftable: print errors on compaction failure
  reftable/stack: gracefully handle failed auto-compaction due to locks
  reftable/stack: use error codes when locking fails during compaction
  reftable/error: discern locked/outdated errors
  reftable/stack: fix error handling in `reftable_stack_init_addition()`
2024-04-09 14:31:45 -07:00
Junio C Hamano
a6abddab1e Merge branch 'es/test-cron-safety'
The test script had an incomplete and ineffective attempt to avoid
clobbering the testing user's real crontab (and its equivalents),
which has been completed.

* es/test-cron-safety:
  test-lib: fix non-functioning GIT_TEST_MAINT_SCHEDULER fallback
2024-04-09 14:31:45 -07:00
Junio C Hamano
989bf45394 Merge branch 'rj/add-p-explicit-reshow'
"git add -p" and other "interactive hunk selection" UI has learned to
skip showing the hunk immediately after it has already been shown, and
an additional action to explicitly ask to reshow the current hunk.

* rj/add-p-explicit-reshow:
  add-patch: do not print hunks repeatedly
  add-patch: introduce 'p' in interactive-patch
2024-04-09 14:31:44 -07:00
Junio C Hamano
4b4081034b Merge branch 'mg/editorconfig-makefile'
The .editorconfig file has been taught that a Makefile uses HT
indentation.

* mg/editorconfig-makefile:
  editorconfig: add Makefiles to "text files"
2024-04-09 14:31:44 -07:00
Junio C Hamano
58dd7e4b11 Merge branch 'ja/doc-markup-updates'
Documentation rules has been explicitly described how to mark-up
literal parts and a few manual pages have been updated as examples.

* ja/doc-markup-updates:
  doc: git-clone: do not autoreference the manpage in itself
  doc: git-clone: apply new documentation formatting guidelines
  doc: git-init: apply new documentation formatting guidelines
  doc: allow literal and emphasis format in doc vs help tests
  doc: rework CodingGuidelines with new formatting rules
2024-04-09 14:31:44 -07:00
Junio C Hamano
4697c8a445 Merge branch 'dg/myfirstobjectwalk-updates'
Update a more recent tutorial doc.

* dg/myfirstobjectwalk-updates:
  MyFirstObjectWalk: add stderr to pipe processing
  MyFirstObjectWalk: fix description for counting omitted objects
  MyFirstObjectWalk: fix filtered object walk
  MyFirstObjectWalk: fix misspelled "builtins/"
  MyFirstObjectWalk: use additional arg in config_fn_t
2024-04-09 14:31:44 -07:00
Junio C Hamano
39b2c6f77e Merge branch 'jc/advice-sans-trailing-whitespace'
The "hint:" messages given by the advice mechanism, when given a
message with a blank line, left a line with trailing whitespace,
which has been cleansed.

* jc/advice-sans-trailing-whitespace:
  advice: omit trailing whitespace
2024-04-09 14:31:43 -07:00
Junio C Hamano
8289a36f87 Merge branch 'jc/apply-parse-diff-git-header-names-fix'
"git apply" failed to extract the filename the patch applied to,
when the change was about an empty file created in or deleted from
a directory whose name ends with a SP, which has been corrected.

* jc/apply-parse-diff-git-header-names-fix:
  t4126: fix "funny directory name" test on Windows (again)
  t4126: make sure a directory with SP at the end is usable
  apply: parse names out of "diff --git" more carefully
2024-04-09 14:31:43 -07:00
Junio C Hamano
19981daefd The fifteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 10:49:49 -07:00
Junio C Hamano
dce1e0b6da Merge branch 'jk/core-comment-string'
core.commentChar used to be limited to a single byte, but has been
updated to allow an arbitrary multi-byte sequence.

* jk/core-comment-string:
  config: add core.commentString
  config: allow multi-byte core.commentChar
  environment: drop comment_line_char compatibility macro
  wt-status: drop custom comment-char stringification
  sequencer: handle multi-byte comment characters when writing todo list
  find multi-byte comment chars in unterminated buffers
  find multi-byte comment chars in NUL-terminated strings
  prefer comment_line_str to comment_line_char for printing
  strbuf: accept a comment string for strbuf_add_commented_lines()
  strbuf: accept a comment string for strbuf_commented_addf()
  strbuf: accept a comment string for strbuf_stripspace()
  environment: store comment_line_char as a string
  strbuf: avoid shadowing global comment_line_char name
  commit: refactor base-case of adjust_comment_line_char()
  strbuf: avoid static variables in strbuf_add_commented_lines()
  strbuf: simplify comment-handling in add_lines() helper
  config: forbid newline as core.commentChar
2024-04-05 10:49:49 -07:00
Junio C Hamano
3256584c36 Merge branch 'rs/config-comment'
"git config" learned "--comment=<message>" option to leave a
comment immediately after the "variable = value" on the same line
in the configuration file.

* rs/config-comment:
  config: allow tweaking whitespace between value and comment
  config: fix --comment formatting
  config: add --comment option to add a comment
2024-04-05 10:49:49 -07:00
Junio C Hamano
7774cfed62 The fourteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 10:56:20 -07:00
Junio C Hamano
17381ab62a Merge branch 'bl/cherry-pick-empty'
Allow git-cherry-pick(1) to automatically drop redundant commits via
a new `--empty` option, similar to the `--empty` options for
git-rebase(1) and git-am(1). Includes a soft deprecation of
`--keep-redundant-commits` as well as some related docs changes and
sequencer code cleanup.

* bl/cherry-pick-empty:
  cherry-pick: add `--empty` for more robust redundant commit handling
  cherry-pick: enforce `--keep-redundant-commits` incompatibility
  sequencer: do not require `allow_empty` for redundant commit options
  sequencer: handle unborn branch with `--allow-empty`
  rebase: update `--empty=ask` to `--empty=stop`
  docs: clean up `--empty` formatting in git-rebase(1) and git-am(1)
  docs: address inaccurate `--empty` default with `--exec`
2024-04-03 10:56:20 -07:00
Junio C Hamano
d988e80bd3 Merge branch 'bl/pretty-shorthand-config-fix'
The "--pretty=<shortHand>" option of the commands in the "git log"
family, defined as "[pretty] shortHand = <expansion>" should have
been looked up case insensitively, but was not, which has been
corrected.

* bl/pretty-shorthand-config-fix:
  pretty: find pretty formats case-insensitively
  pretty: update tests to use `test_config`
2024-04-03 10:56:20 -07:00
Junio C Hamano
4cc302e886 Merge branch 'rs/strbuf-expand-bad-format'
Code clean-up.

* rs/strbuf-expand-bad-format:
  cat-file: use strbuf_expand_bad_format()
  factor out strbuf_expand_bad_format()
2024-04-03 10:56:20 -07:00
Junio C Hamano
f046355ec3 Merge branch 'rs/midx-use-strvec-pushf'
Code clean-up.

* rs/midx-use-strvec-pushf:
  midx: use strvec_pushf() for pack-objects base name
2024-04-03 10:56:20 -07:00
Junio C Hamano
188e94250a Merge branch 'pb/test-scripts-are-build-targets'
The t/README file now gives a hint on running individual tests in
the "t/" directory with "make t<num>-*.sh t<num>-*.sh".

* pb/test-scripts-are-build-targets:
  t/README: mention test files are make targets
2024-04-03 10:56:19 -07:00
Junio C Hamano
e4193dcf12 Merge branch 'ds/grep-doc-updates'
Documentation updates.

* ds/grep-doc-updates:
  grep docs: describe --no-index further and improve formatting a bit
  grep docs: describe --recurse-submodules further and improve formatting a bit
2024-04-03 10:56:19 -07:00
Junio C Hamano
e76218cad3 Merge branch 'az/grep-group-error-message-update'
Error message clarification.

* az/grep-group-error-message-update:
  grep: improve errors for unmatched ( and )
2024-04-03 10:56:19 -07:00
Junio C Hamano
eda72ddc18 Merge branch 'jc/release-notes-entry-experiment'
Introduce an experimental protocol for contributors to propose the
topic description to be used in the "What's cooking" report, the
merge commit message for the topic, and in the release notes and
document it in the SubmittingPatches document.

* jc/release-notes-entry-experiment:
  SubmittingPatches: release-notes entry experiment
2024-04-03 10:56:19 -07:00
Junio C Hamano
e139bb1006 Merge branch 'jk/remote-helper-object-format-option-fix'
The implementation and documentation of "object-format" option
exchange between the Git itself and its remote helpers did not
quite match, which has been corrected.

* jk/remote-helper-object-format-option-fix:
  transport-helper: send "true" value for object-format option
  transport-helper: drop "object-format <algo>" option
  transport-helper: use write helpers more consistently
2024-04-03 10:56:18 -07:00
Patrick Steinhardt
d51d8cc368 reftable/block: avoid decoding keys when searching restart points
When searching over restart points in a block we decode the key of each
of the records, which results in a memory allocation. This is quite
pointless though given that records it restart points will never use
prefix compression and thus store their keys verbatim in the block.

Refactor the code so that we can avoid decoding the keys, which saves us
some allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
Patrick Steinhardt
cd75790707 reftable/record: extract function to decode key lengths
We're about to refactor the binary search over restart points so that it
does not need to fully decode the record keys anymore. To do so we will
need to decode the record key lengths, which is non-trivial logic.

Extract the logic to decode these lengths from `refatble_decode_key()`
so that we can reuse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
Patrick Steinhardt
f9e88544f5 reftable/block: fix error handling when searching restart points
When doing the binary search over restart points in a block we need to
decode the record keys. This decoding step can result in an error when
the block is corrupted, which we indicate to the caller of the binary
search by setting `args.error = 1`. But the only caller that exists
mishandles this because it in fact performs the error check before
calling `binsearch()`.

Fix this bug by checking for errors at the right point in time.
Furthermore, refactor `binsearch()` so that it aborts the search in case
the callback function returns a negative value so that we don't
needlessly continue to search the block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00
Patrick Steinhardt
77307a61d6 reftable/block: refactor binary search over restart points
When seeking a record in our block reader we perform a binary search
over the block's restart points so that we don't have to do a linear
scan over the whole block. The logic to do so is quite intricate though,
which makes it hard to understand.

Improve documentation and rename some of the functions and variables so
that the code becomes easier to understand overall. This refactoring
should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 09:16:50 -07:00