There exist a bunch of call sites in the reftable backend that want to
create iterators for a reftable stack. This is rather convoluted right
now, where you always have to go via the merged table. And it is about
to become even more convoluted when we split up iterator initialization
and seeking in the next commit.
Introduce convenience functions that allow the caller to create an
iterator from a reftable stack directly without going through the merged
table. Adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the interfaces exposed by `struct reftable_reader` and `struct
table_iterator` such that they support iterator reuse. This is done by
separating initialization of the iterator and seeking on it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the interfaces exposed by `struct reftable_table` and `struct
reftable_iterator` such that they support iterator reuse. This is done
by separating initialization of the iterator and seeking on it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable iterators are created by seeking on the parent structure of a
corresponding record. For example, to create an iterator for the merged
table you would call `reftable_merged_table_seek_ref()`. Most notably,
it is not posible to create an iterator and then seek it afterwards.
While this may be a bit easier to reason about, it comes with two
significant downsides. The first downside is that the logic to find
records is split up between the parent data structure and the iterator
itself. Conceptually, it is more straight forward if all that logic was
contained in a single place, which should be the iterator.
The second and more significant downside is that it is impossible to
reuse iterators for multiple seeks. Whenever you want to look up a
record, you need to re-create the whole infrastructure again, which is
quite a waste of time. Furthermore, it is impossible to optimize seeks,
such as when seeking the same record multiple times.
To address this, we essentially split up the concerns properly such that
the parent data structure is responsible for setting up the iterator via
a new `init_iter()` callback, whereas the iterator handles seeks via a
new `seek()` callback. This will eventually allow us to call `seek()` on
the iterator multiple times, where every iterator can potentially
optimize for certain cases.
Note that at this point in time we are not yet ready to reuse the
iterators. This will be left for a future patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When seeking on a merged table, we perform the seek for each of the
subiterators. If the subiterator has the desired record we add it to the
priority queue, otherwise we skip it and don't add it to the stack of
subiterators hosted by the merged table.
The consequence of this is that the index of the subiterator in the
merged table does not necessarily correspond to the index of it in the
merged iterator. Next to being potentially confusing, it also means that
we won't easily be able to re-seek the merged iterator because we have
no clear connection between both of the data structures.
Refactor the code so that the index stays the same in both structures.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To initialize a `struct merged_iter`, we need to seek all subiterators
to the wanted record and then add their results to the priority queue
used to sort the records. This logic is split up across two functions,
`merged_table_seek_record()` and `merged_iter_init()`. The scope of
these functions is somewhat weird though, where `merged_iter_init()` is
only responsible for adding the records of the subiterators to the
priority queue.
Clarify the scope of those functions such that `merged_iter_init()` is
only responsible for initializing the iterator's structure. Performing
the subiterator seeks are now part of `merged_table_seek_record()`.
This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the seeking functions accept a `struct reftable_reader` as input
such that they can use the reader to look up the respective blocks.
Refactor the code to instead set up the reader as a member of `struct
table_iter` during initialization such that we don't have to pass the
reader on every single call.
This step is required to move seeking of records into the generic
`struct reftable_iterator` infrastructure.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have both `reader_seek()` and `reader_seek_internal()`, where the
former function only exists so that we can exit early in case the given
table has no records of the sought-after type.
Merge these two functions into one.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "reftable/reader.c" we implement two different interfaces:
- The reftable reader contains the logic to read reftables.
- The table iterator is used to iterate through a single reftable read
by the reader.
The way those two types are used in the code is somewhat confusing
though because seeking inside a table is implemented as if it was part
of the reftable reader, even though it is ultimately more of a detail
implemented by the table iterator.
Make the boundary between those two types clearer by renaming functions
that seek records in a table such that they clearly belong to the table
iterator's logic.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `reader_seek_internal()` we either end up doing an indexed seek when
there is one or a linear seek otherwise. These two code paths are
disjunct without a good reason, where the indexed seek will cause us to
exit early.
Refactor the two code paths such that it becomes possible to share a bit
more code between them.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing an indexed seek we need to walk down the multi-level index
until we finally hit a record of the desired indexed type. This loop
performs a copy of the index iterator on every iteration, which is both
hard to understand and completely unnecessary.
Refactor the code so that we use a single iterator to walk down the
indices, only.
Note that while this should improve performance, the improvement is
negligible in all but the most unreasonable repositories. This is
because the effect is only really noticeable when we have to walk down
many levels of indices, which is not something that a repository would
typically have. So the motivation for this change is really only about
readability.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `block_reader_restart_offset()` gets the offset of the
`i`th restart point. `i` is a signed integer though, which is certainly
not the correct type to track indices like this. Furthermore, both
callers end up passing a `size_t`.
Refactor the code to use a `size_t` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The color parsing code learned to handle 12-bit RGB colors, spelled
as "#RGB" (in addition to "#RRGGBB" that is already supported).
* bb/rgb-12-bit-colors:
color: add support for 12-bit RGB colors
t/t4026-color: add test coverage for invalid RGB colors
t/t4026-color: remove an extra double quote character
Command line completion support for zsh (in contrib/) has been
updated to stop exposing internal state to end-user shell
interaction.
* dk/zsh-git-repo-path-fix:
completion: zsh: stop leaking local cache variable
zsh can pretend to be a normal shell pretty well except for some
glitches that we tickle in some of our scripts. Work them around
so that "vimdiff" and our test suite works well enough with it.
* bc/zsh-compatibility:
vimdiff: make script and tests work with zsh
t4046: avoid continue in &&-chain for zsh
When the user responds to a prompt given by "git add -p" with an
unsupported command, list of available commands were given, which
was too much if the user knew what they wanted to type but merely
made a typo. Now the user gets a much shorter error message.
* rj/add-p-typo-reaction:
add-patch: response to unknown command
add-patch: do not show UI messages on stderr
Command line completion script (in contrib/) learned to complete
"git symbolic-ref" a bit better (you need to enable plumbing
commands to be completed with GIT_COMPLETION_SHOW_ALL_COMMANDS).
* rh/complete-symbolic-ref:
completion: add docs on how to add subcommand completions
completion: improve docs for using __git_complete
completion: add 'symbolic-ref'
The singleton index_state instance "the_index" has been eliminated
by always instantiating "the_repository" and replacing references
to "the_index" with references to its .index member.
* ps/the-index-is-no-more:
repository: drop `initialize_the_repository()`
repository: drop `the_index` variable
builtin/clone: stop using `the_index`
repository: initialize index in `repo_init()`
builtin: stop using `the_index`
t/helper: stop using `the_index`
The credential helper protocol, together with the HTTP layer, have
been enhanced to support authentication schemes different from
username & password pair, like Bearer and NTLM.
* bc/credential-scheme-enhancement:
credential: add method for querying capabilities
credential-cache: implement authtype capability
t: add credential tests for authtype
credential: add support for multistage credential rounds
t5563: refactor for multi-stage authentication
docs: set a limit on credential line length
credential: enable state capability
credential: add an argument to keep state
http: add support for authtype and credential
docs: indicate new credential protocol fields
credential: add a field called "ephemeral"
credential: gate new fields on capability
credential: add a field for pre-encoded credentials
http: use new headers for each object request
remote-curl: reset headers on new request
credential: add an authtype field
Tests to ensure interoperability between reftable written by jgit
and our code have been added and enabled in CI.
* ps/ci-test-with-jgit:
t0612: add tests to exercise Git/JGit reftable compatibility
t0610: fix non-portable variable assignment
t06xx: always execute backend-specific tests
ci: install JGit dependency
ci: make Perforce binaries executable for all users
ci: merge scripts which install dependencies
ci: fix setup of custom path for GitLab CI
ci: merge custom PATH directories
ci: convert "install-dependencies.sh" to use "/bin/sh"
ci: drop duplicate package installation for "linux-gcc-default"
ci: skip sudo when we are already root
ci: expose distro name in dockerized GitHub jobs
ci: rename "runs_on_pool" to "distro"
Code to write out reftable has seen some optimization and
simplification.
* ps/reftable-write-optim:
reftable/block: reuse compressed array
reftable/block: reuse zstream when writing log blocks
reftable/writer: reset `last_key` instead of releasing it
reftable/writer: unify releasing memory
reftable/writer: refactorings for `writer_flush_nonempty_block()`
reftable/writer: refactorings for `writer_add_record()`
refs/reftable: don't recompute committer ident
reftable: remove name checks
refs/reftable: skip duplicate name checks
refs/reftable: perform explicit D/F check when writing symrefs
refs/reftable: fix D/F conflict error message on ref copy
RGB color parsing currently supports 24-bit values in the form #RRGGBB.
As in Cascading Style Sheets (CSS [1]), also allow to specify an RGB color
using only three digits with #RGB.
In this shortened form, each of the digits is – again, as in CSS –
duplicated to convert the color to 24 bits, e.g. #f1b specifies the same
color as #ff11bb.
In color.h, remove the '0x' prefix in the example to match the actual
syntax.
[1] https://developer.mozilla.org/en-US/docs/Web/CSS/hex-color
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure that the RGB color parser rejects invalid characters and
invalid lengths.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is most probably just an editing left-over from cb357221a4 (t4026:
test "normal" color, 2014-11-20) which added this test.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d44e5267ea (diff-lib: plug minor memory leaks in do_diff_cache(),
2020-11-14) added the call to diff_setup_done() to release the memory
of the parseopt member of struct diff_options that repo_init_revisions()
had allocated via repo_diff_setup() and prep_parse_options().
189e97bc4b (diff: remove parseopts member from struct diff_options,
2022-12-01) did away with that allocation; diff_setup_done() doesn't
release any memory anymore. So stop calling this function on the blank
diffopt member before it is overwritten, as this is no longer necessary.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Completing commands like "git rebase" in one repository will leak the
local __git_repo_path into the shell's environment so that completing
commands after changing to a different repository will give the old
repository's references (or none at all).
The bug report on the mailing list [1] suggests one simple way to observe
this yourself:
Enter the following commands from some directory:
mkdir a b b/c
for d (a b); git -C $d init && git -C $d commit --allow-empty -m init
cd a
git branch foo
pushd ../b/c
git branch bar
Now type these:
git rebase <TAB>… # completion for bar available; C-c to abort
declare -p __git_repo_path # outputs /path/to/b/.git
popd
git branch # outputs foo, main
git rebase <TAB>… # completion candidates are bar, main!
Ideally, the last typed <TAB> should be yielding foo, main.
Commit beb6ee7163 (completion: extract repository discovery from
__gitdir(), 2017-02-03) anticipated this problem by marking
__git_repo_path as local in __git_main and __gitk_main for Bash
completion but did not give the same mark to _git for Zsh completion.
Thus make __git_repo_path local for Zsh completion, too.
[1]: https://lore.kernel.org/git/CALnO6CBv3+e2WL6n6Mh7ZZHCX2Ni8GpvM4a-bQYxNqjmgZdwdg@mail.gmail.com/
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out. Now it keeps going.
* js/for-each-repo-keep-going:
maintenance: running maintenance should not stop on errors
for-each-repo: optionally keep going on an error
In addition to building the objects needed, try to link the objects
that are used in fuzzer tests, to make sure at least they build
without bitrot, in Linux CI runs.
* js/build-fuzz-more-often:
fuzz: link fuzz programs with `make all` on Linux
Advertise "git contacts", a tool for newcomers to find people to
ask review for their patches, a bit more in our developer
documentation.
* la/doc-use-of-contacts-when-contributing:
SubmittingPatches: demonstrate using git-contacts with git-send-email
SubmittingPatches: add heading for format-patch and send-email
SubmittingPatches: dedupe discussion of security patches
SubmittingPatches: discuss reviewers first
SubmittingPatches: quote commands
SubmittingPatches: mention GitGitGadget
SubmittingPatches: clarify 'git-contacts' location
MyFirstContribution: mention contrib/contacts/git-contacts
The "--rfc" option of "git format-patch" learned to take an
optional string value to be used in place of "RFC" to tweak the
"[PATCH]" on the subject header.
* jc/format-patch-rfc-more:
format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
format-patch: allow --rfc to optionally take a value, like --rfc=WIP
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".
* ds/format-patch-rfc-and-k:
format-patch: ensure that --rfc and -k are mutually exclusive
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.
* xx/disable-replace-when-building-midx:
midx: disable replace objects
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.
* pw/rebase-m-signoff-fix:
rebase -m: fix --signoff with conflicts
sequencer: store commit message in private context
sequencer: move current fixups to private context
sequencer: start removing private fields from public API
sequencer: always free "struct replay_opts"
When the user gives an unknown command to the "add -p" prompt, the list
of accepted commands with their explanation is given. This is the same
output they get when they say '?'.
However, the unknown command may be due to a user input error rather
than the user not knowing the valid command.
To reduce the likelihood of user confusion and error repetition, instead
of displaying the list of accepted commands, display a short error
message with the unknown command received, as feedback to the user.
Include a reminder about the current command '?' in the new message, to
guide the user if they want help.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no need to show some UI messages on stderr, and yet doing so
may produce some undesirable results, such as messages appearing in an
unexpected order.
Let's use stdout for all UI messages, and adjusts the tests accordingly.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>