The code to read status from subprocess reads one packet line and
tries to find "status=<foo>". It is way overkill to split the line
into an array of two strbufs to extract <foo>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
environment.c:get_git_namespace() learns the raw namespace from an
environment variable, splits it at "/", and appends them after
"refs/namespaces/"; the reason why it splits first is so that an
empty string resulting from double slashes can be omitted.
The split pieces do not need to be edited in any way, so an array of
strbufs is a wrong data structure to use. Instead split into a
string list and use the pieces from there.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing an old-style GIT_CONFIG_PARAMETERS environment
variable, the code parses key=value pairs by splitting them at '='
into an array of strbuf's. As strbuf_split() leaves the delimiter
at the end of the split piece, the code has to manually trim it.
If we split with string_list_split(), that becomes unnecessary.
Retire the use of strbuf_split() from this code path.
Note that the max parameter of string_list_split() is of
an ergonomically iffy design---it specifies the maximum number of
times the function is allowed to split, which means that in order to
split a text into up to 2 pieces, you have to pass 1, not 2.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading copy instructions from the standard input, the program
reads a line, splits it into tokens at whitespace, and trims each of
the tokens before using. We no longer need to use strbuf just to be
able to trim, as string_list_split*() family now can trim while
splitting a string.
Retire the use of strbuf_split() from this code path.
Note that this loop is a bit sloppy in that it ensures at least
there are two tokens on each line, but ignores if there are extra
tokens on the line. Tightening it is outside the scope of this
series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading merge instructions from the standard input, the program
reads from the standard input, splits the line into tokens at
whitespace, and trims each of them before using. We no longer need
to use strbuf just for trimming, as string_list_split*() family can
trim while splitting a string.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/clean.c:filter_by_patterns_cmd() interactively reads a line
that has exclude patterns from the user and splits the line into a
list of patterns. It uses the strbuf_split() so that each split
piece can then trimmed.
There is no need to use strbuf anymore, thanks to the recent
enhancement to string_list_split*() family that allows us to trim
the pieces split into a string_list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callee parse_choice() only needs to access a NUL-terminated
string; instead of insisting to take a pointer to a strbuf, just
take a pointer to a character array.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/clean.c:parse_choice() is fed a single line of input, which
is space or comma separated list of tokens, and a list of menu
items. It parses the tokens into number ranges (e.g. 1-3 that means
the first three items) or string prefix (e.g. 's' to choose the menu
item "(s)elect") that specify the elements in the menu item list,
and tells the caller which ones are chosen.
For parsing the input string, it uses strbuf_split() to split it
into bunch of strbufs. Instead use string_list_split_in_place(),
for a few reasons.
* strbuf_split() is a bad API function to use, that yields an array
of strbuf that is a bad data structure to use in general.
* string_list_split_in_place() allows you to split with "comma or
space"; the current code has to preprocess the input string to
replace comma with space because strbuf_split() does not allow
this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you pass a structure by value, the callee can modify the
contents of the structure that was passed in without having to worry
about changing the structure the caller has. Passing structure by
value sometimes (but not very often) can be a valid way to give
callee a temporary variable it can freely modify.
But not a structure with members that are pointers, like a strbuf.
builtin/clean.c:list_and_choose() reads a line interactively from
the user, and passes the line (in a strbuf) to parse_choice() by
value, which then munges by replacing ',' with ' ' (to accept both
comma and space separated list of choices). But because the strbuf
passed by value still shares the underlying character array buf[],
this ends up munging the caller's strbuf contents.
This is a catastrophe waiting to happen. If the callee causes the
strbuf to be reallocated, the buf[] the caller has will become
dangling, and when the caller does strbuf_release(), it would result
in double-free.
Stop calling the function with misleading call-by-value with strbuf.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf is a very good data structure to work with string data
without having to worry about running past the end of the string,
but strbuf_split() is a wrong API and an array of strbuf that the
function produces is a wrong thing to use in general. You do not
edit these N strings split out of a single strbuf simultaneously.
Often it is much better off to split a string into string_list and
work with the resulting strings.
wt-status.c:abbrev_oid_in_line() takes one line of rebase todo list
(like "pick e813a0200a title"), and
for instructions that has an object name as the second token on the
line, replace the object name with its unique abbreviation. After
splitting these tokens out of a single line, no simultaneous edit on
any of these pieces of string that takes advantage of strbuf API
takes place. The final string is composed with strbuf API, but
these split pieces are merely used as pieces of strings and there is
no need for them to be stored in individual strbuf.
Instead, split the line into a string_list, and compose the final
string using these pieces.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/string-list-split:
string-list: split-then-remove-empty can be done while splitting
string-list: optionally omit empty string pieces in string_list_split*()
diff: simplify parsing of diff.colormovedws
string-list: optionally trim string pieces split by string_list_split*()
string-list: unify string_list_split* functions
string-list: align string_list_split() with its _in_place() counterpart
string-list: report programming error with BUG
Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, a common pattern
to split a string into a string list and then remove empty items in
the resulting list is no longer needed. Instead, just tell the
string_list_split*() to omit empty ones while splitting.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
omitted from the resulting string list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to parse this configuration variable, whose value is a
comma-separated list of known tokens like "ignore-space-change" and
"ignore-all-space", uses string_list_split() to split the value into
pieces, and then places each piece of string in a strbuf to trim,
before comparing the result with the list of known tokens.
Thanks to the previous steps, now string_list_split() can trim the
resulting pieces before it places them in the string list. Use it
to simplify the code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks to the previous step, the only difference between these two
related functions is that string_list_split() works on a string
without modifying its contents (i.e. taking "const char *") and the
resulting pieces of strings are their own copies in a string list,
while string_list_split_in_place() works on a mutable string and the
resulting pieces of strings come from the original string.
Consolidate their implementations into a single helper function, and
make them a thin wrapper around it. We can later add an extra flags
parameter to extend both of these functions by updating only the
internal helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The string_list_split_in_place() function was updated by 52acddf3
(string-list: multi-delimiter `string_list_split_in_place()`,
2023-04-24) to take more than one delimiter characters, hoping that
we can later use it to replace our uses of strtok(). We however did
not make a matching change to the string_list_split() function,
which is very similar.
Before giving both functions more features in future commits, allow
string_list_split() to also take more than one delimiter characters
to make them closer to each other.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Passing a string list that has .strdup_strings bit unset to
string_list_split(), or one that has .strdup_strings bit set to
string_list_split_in_place(), is a programmer error. Do not use
die() to abort the execution. Use BUG() instead.
As a developer-facing message, the message string itself should
be a lot more concise, but let's keep the original one for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit" that concludes a conflicted merge failed to notice and remove
existing comment added automatically (like "# Conflicts:") when the
core.commentstring is set to 'auto'.
* ac/auto-comment-char-fix:
config: set comment_line_str to "#" when core.commentChar=auto
commit: avoid scanning trailing comments when 'core.commentChar' is "auto"
The pop_most_recent_commit() function can have quite expensive
worst case performance characteristics, which has been optimized by
using prio-queue data structure.
* rs/pop-recent-commit-with-prio-queue:
commit: use prio_queue_replace() in pop_most_recent_commit()
prio-queue: add prio_queue_replace()
commit: convert pop_most_recent_commit() to prio_queue
Document that we do not require "real" name when signing your
patches off.
* bc/contribution-under-non-real-names:
SubmittingPatches: allow non-real name contributions
Meson-based build did not handle libexecdir setting correctly,
which has been corrected.
* rj/meson-libexecdir-fix:
po/meson.build: add missing 'ga' language code
meson: fix installation when -Dlibexexdir is set
Clean-up compat/bswap.h mess.
* ss/compat-bswap-revamp:
bswap.h: provide a built-in based version of bswap32/64 if possible
bswap.h: remove optimized x86 version of bswap32/64
bswap.h: always overwrite ntohl/ ntohll macros
bswap.h: define GIT_LITTLE_ENDIAN on msvc as little endian
bswap.h: add support for __BYTE_ORDER__
GIT_TEST_INSTALLED was not honored in the recent topic related to
SHA256 hashes, which has been corrected.
* kl/test-installed-fix:
test-lib: respect GIT_TEST_INSTALLED when querying default hash
Declare weather-balloon we raised for "bool" type 18 months ago a
success and officially allow using the type in our codebase.
* pw/adopt-c99-bool-officially:
strbuf: convert predicates to return bool
git-compat-util: convert string predicates to return bool
CodingGuidelines: allow the use of bool
Clean up the way how signature on commit objects are exported to
and imported from fast-import stream.
* cc/fast-import-export-signature-names:
fast-(import|export): improve on commit signature output format
Our <sane-ctype.h> header file relied on that the system-supplied
<ctype.h> header is not later included, which would override our
macro definitions, but "amazon linux" broke this assumption. Fix
this by preemptively including <ctype.h> near the beginning of
<sane-ctype.h> ourselves.
* ps/sane-ctype-workaround:
sane-ctype: fix compiler error on Amazon Linux 2
Lift the limitation to use changed-path filter in "git log" so that
it can be used for a pathspec with multiple literal paths.
* ly/changed-paths-traversal:
bloom: optimize multiple pathspec items in revision
revision: make helper for pathspec to bloom keyvec
bloom: replace struct bloom_key * with struct bloom_keyvec
bloom: rename function operates on bloom_key
bloom: add test helper to return murmur3 hash
* 'master' of https://github.com/j6t/gitk: (21 commits)
gitk: remove header of now empty section "General options"
gitk: separate upstream refs when using the sort-by-type option
gitk: make 'sort-refs-by-type' optional and persistent
gitk: sort by ref type on the 'tags and heads' view
gitk: choosefont - remove a stray debugging line
gitk: allow horizontal commit-graph scrolling
gitk: update aqua scrolling for TclTk 8.6 / TIP171
gitk: update x11 scrolling for TclTk 8.6 / TIP 171
gitk: update win32 scrolling for Tk 8.6 / TIP 171
gitk: mousewheel scrolling functions for Tk 8.6
gitk: wheel scrolling multiplier preference
gitk: separate x11 / win32 / aqua Mouse bindings
gitk: remove non-ttk support code
gitk: replace ${NS} with ttk
gitk: always use themed Tk (ttk)
gitk: use $config_variables as list for save/restore
gitk: remove implementations for Tcl/Tk < 8.6
gitk: Make TclTk 8.6 the minimum, allow 8.7
gitk: remove code targeting git <= 1.7.2
gitk: require git >= 2.20
...
An earlier commit remove the only option that was available under
"General options". We don't need the header for the empty section.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
* mr/sort-refs-by-type:
gitk: separate upstream refs when using the sort-by-type option
gitk: make 'sort-refs-by-type' optional and persistent
gitk: sort by ref type on the 'tags and heads' view
Optimize pop_most_recent_commit() by adding the first parent using the
more efficient prio_queue_peek() and prio_queue_replace() instead of
prio_queue_get() and prio_queue_put().
On my machine this neutralizes the performance hit it took in Git's own
repository when we converted it to prio_queue two patches ago (git_pq):
$ hyperfine -w3 -L git ./git_2.50.1,./git_pq,./git '{git} rev-parse :/^Initial.revision'
Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision
Time (mean ± σ): 1.073 s ± 0.003 s [User: 1.053 s, System: 0.019 s]
Range (min … max): 1.069 s … 1.078 s 10 runs
Benchmark 2: ./git_pq rev-parse :/^Initial.revision
Time (mean ± σ): 1.077 s ± 0.002 s [User: 1.057 s, System: 0.018 s]
Range (min … max): 1.072 s … 1.079 s 10 runs
Benchmark 3: ./git rev-parse :/^Initial.revision
Time (mean ± σ): 1.069 s ± 0.003 s [User: 1.049 s, System: 0.018 s]
Range (min … max): 1.065 s … 1.074 s 10 runs
Summary
./git rev-parse :/^Initial.revision ran
1.00 ± 0.00 times faster than ./git_2.50.1 rev-parse :/^Initial.revision
1.01 ± 0.00 times faster than ./git_pq rev-parse :/^Initial.revision
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function to replace the top element of the queue that basically
does the same as prio_queue_get() followed by prio_queue_put(), but
without the work by prio_queue_get() to rebalance the heap. It can be
used to optimize loops that get one element and then immediately add
another one. That's common e.g., with commit history traversal, where
we get out a commit and then put in its parents.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pop_most_recent_commit() calls commit_list_insert_by_date() for parent
commits, which is itself called in a loop. This can lead to quadratic
complexity if there are many merges. Replace the commit_list with a
prio_queue to ensure logarithmic worst case complexity and convert all
three users.
Add a performance test that exercises one of them using a pathological
history that consists of 50% merges and 50% root commits to demonstrate
the speedup:
Test v2.50.1 HEAD
----------------------------------------------------------------------
1501.2: rev-parse ':/65535' 2.48(2.47+0.00) 0.20(0.19+0.00) -91.9%
Alas, sane histories don't benefit from the conversion much, and
traversing Git's own history takes a 1% performance hit on my machine:
$ hyperfine -w3 -L git ./git_2.50.1,./git '{git} rev-parse :/^Initial.revision'
Benchmark 1: ./git_2.50.1 rev-parse :/^Initial.revision
Time (mean ± σ): 1.071 s ± 0.004 s [User: 1.052 s, System: 0.017 s]
Range (min … max): 1.067 s … 1.078 s 10 runs
Benchmark 2: ./git rev-parse :/^Initial.revision
Time (mean ± σ): 1.079 s ± 0.003 s [User: 1.060 s, System: 0.017 s]
Range (min … max): 1.074 s … 1.083 s 10 runs
Summary
./git_2.50.1 rev-parse :/^Initial.revision ran
1.01 ± 0.00 times faster than ./git rev-parse :/^Initial.revision
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui has _search_exe as needed to give the executable suffix
(.exe) on Windows. But, the prior commit eliminated the only user of
this variable. Delete it.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
gitexec looks up and caches the method to execute git subcommands using
the long deprecated dashed form if found in $(git--exec-path). But,
git_read and git_write now use the dashless form, by-passing gitexec.
This leaves two remaining uses of gitexec: one during startup to define
use of an ssh_key helper, and one in the about dialog box. These are
neither performance critical nor likely to be called more than once, so
do not justify an otherwise unused cacheing system.
Let's change those two uses, making gitexec unused. This allows removing
gitexec and _git_cmd.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
git-gui implements its own approach to locating and running various git
subcommands, bypassing git's capabilities for running git-*. This was
written in 2007: at that time, many git commands were shell-scripts
stored in $(git --exec-path), git's run-command api was not well adapted
to Windows and had serious performance issues when it worked at all, and
running subcommand 'git foo' as 'git-foo' was common and fully supported.
On Windows, git-gui searches $(git --exec-path) for builtin commands,
then attempts to find an interpreter on PATH to run those, invoking
these differently than on other platforms. For instance, the explicit
shebang #!/usr/bin/perl found in a script will be run by the first Perl
interpreter found on $PATH, which might not be at that specific location
so could be different than what git would run.
The various issues leading to the current implemention no longer exist.
Most git commands are now builtins, links to run those are not installed
in $(git --exec-path) by default (the "dashless" form is recommended
instead), and git's run-command api works well everywhere.
So, let's use git to launch its subcommands on all platforms. Do so by
modifying procs git_read and git_write to use the "dashless" form for
invoking git commands, avoiding the search for git-<foo>. This leaves
_git_cmd unused with cleanup in a later patch.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
git-gui's default clone method is git-clone's default, and this uses
hardlinks rather than copying the objects directory for local
repositories. However, this method explicitly fails if a symlink (or
.gitfile) exists in the path to the objects directory. Thus, the default
clone option fails for worktrees created by git-new-workdir or
git-worktree. git-gui's original do_clone trapped this error for a
symlinked git-new-workdir tree, directly falling back to a full clone,
while the updated git-gui using git-clone does not. (The old do_clone
could not handle gitfile linked worktrees, however).
Let's apply the more friendly fallback to a full clone in both these
cases where git-clone behavior throws an error on the default method.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
git-gui clones a repository by invoking git-plumbing commands, in proc
do_clone, rather than using git-clone. The justification was that the
low-level commands are guaranteed to provide a stable interface, while
the higher level commands such as git-clone may not be stable. This
approach requires git-gui to continually evolve by mirroring new
features in git itself, which has not happened, while the user interface
in git-clone has proven very stable. Also, git-gui does directly call
many other non-plumbing commands in git's repertoire.
do_clone's last significant functionality change was in 2015, and
updates are required for shallow clones, the reftable backend, cloning
from linked worktrees, and perhaps other features and bugs. For
instance, I had reports of git-gui failing to correctly clone
repositories prior to 2015, resulting in essentially the patch given
here. The only significant work was supporting .gitfile linked worktrees
unknown to do_clone, but supported by git-clone, and none regarding the
interface to git-clone itself. That interface is clearly stable enough
to not be a problem.
Supporting new use-cases with this requires exposing new options in the
clone dialog, then passing flags to git-clone. This avoids updating
do_clone to understand those options, reducing the maintenance burdens.
So, teach git-gui to use git-clone. This change is in one patch as
there is no obvious incremental path to migration. The existing dialog /
options / status screen are unchanged, the known user-visible changes
are that cloning from a working directory linked by a gitfile now works,
there is no auto-fallback to a full copy when cloning linked workdirs
and worktrees (meaning git-clone fails unless a full or shared copy is
selected), and messages displayed are from git-clone.
Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
The gpg.program configuration variable, which names a pathname to
the (custom) GPG compatible program, can now be spelled with ~tilde
expansion.
* jb/gpg-program-variable-is-a-pathname:
gpg-interface: expand gpg.program as a path