The problem, is I based the code on the implementation in ripgrep. But
while ripgrep is writing directly to the stream, I am using a Formatter,
which means I have to write characters, not raw bytes.
Thus we need to percent encode all non-ascii bytes (or we could switch
to writing bytes directly, but that would be more complicated, and I
think percent encoding is safer anyway).
The previous behaviour was designed to mimic the output buffering of
typical UNIX tools: line-buffered if stdout is a TTY, and fully-buffered
otherwise. More precicely, when printing to a terminal, fd would flush
explicitly after printing any buffered results, then flush after every
single result once streaming mode started. When not printing to a
terminal, fd never explicitly flushed, so writes would only happen as
the BufWriter filled up.
The new behaviour actually unifies the TTY and non-TTY cases: we flush
after printing the buffered results, then once we start streaming, we
flush after each batch, but *only when the channel is empty*. This
provides a good balance: if the channel is empty, the receiver thread
might as well flush before it goes to sleep waiting for more results.
If the channel is non-empty, we might as well process those results
before deciding to flush.
For TTYs, this should improve performance by consolidating write() calls
without sacrificing interactivity. For non-TTYs, we'll be flushing more
often, but only when the receiver would otherwise have nothing to do,
thus improving interactivity without sacrificing performance. This is
particularly handy when fd is piped into another command (such as head
or grep): with the old behaviour, fd could wait for the whole traversal
to finish before printing anything. With the new behaviour, fd will
print those results soon after they are received.
Fixes#1313.
Set a limit of how many threads fd will use by default. On hosts that
have a large number of cores, using additional threads has diminishing
returns, and having large numbers of threads increases the setup cost.
Thus we don't necessarily want to use the same number of threads as we
have cores.
Fixes: #1203
I think this might help startup time a little bit, since it looks like
on linux at least the std implementation uses syscalls to determine the
parallelism, whereas num_cpus was processing the /proc/cpuinfo file.
It also removes another dependency.
The users crate is no longer maintained. There is a maintained fork
called uzers, but we already have a dependency on nix, and that has the
functionality we need.