On Tue, Jun 24, 2025 at 02:22:21AM -0400, Eric Sunshine wrote:

> > diff --git a/t/t0612-reftable-jgit-compatibility.sh b/t/t0612-reftable-jgit-compatibility.sh
> > @@ -112,14 +112,11 @@ test_expect_success 'JGit can read multi-level index' '
> > -	awk "
> > -	    BEGIN {
> > -		print \"start\";
> > -		for (i = 0; i < 10000; i++)
> > -		    printf \"create refs/heads/branch-%d HEAD\n\", i;
> > -		print \"commit\";
> > -	    }
> > -	" >input &&
> > +	{
> > +		echo start &&
> > +		test_seq -f "create refs/heads/branch-%d HEAD" 10000 &&
> > +		echo commit
> > +	} >input &&
>
> I had suggested[1] an effectively equivalent change to Patrick for a
> couple tests in the nearby t0610, but he rejected[2] the idea due to
> the pure-shell version being significantly slower than the `awk`
> version.
>
> Pondering his response today, I wondered if it would make sense to
> replace our pure-shell `test_seq` with an implementation via `awk`,
> however, if most of our sequence vend only a small set of numbers,
> then the startup cost of `awk` would probably swamp any savings,
> especially on Windows where process startup is extremely slow. Taking
> that into account, I further wondered if we could see an overall win
> by taking a hybrid approach in which we employ the pure-shell version
> if vending a small set of numbers, but fall over to an `awk` version
> if vending a lot of numbers, especially as in the test above or the
> tests in t0610. Anyhow, food for thought, or not, if you're not hungry
> for thought food.

Ah, interesting. I didn't time it at all, as my general intuition for
shell performance is that counting process spawns overrides everything
else (though admittedly it is usually O(n) processes vs O(1), and here
we are going from one extra process to zero).
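
To make the comparison concrete, here is a rough sketch of the two
strategies side by side. This is not the real test_seq from
t/test-lib-functions.sh; seq_shell and seq_awk are names made up for
illustration, and both assume a printf-style format with a single %d:

```shell
# Hypothetical stand-in for "test_seq -f <fmt> <n>": a pure-shell loop
# that spawns no extra process. POSIX sh has no "local", so the loop
# variables are deliberately prefixed to avoid clobbering the caller's.
seq_shell () {
	seq_fmt=$1 seq_n=$2 seq_i=1
	while test "$seq_i" -le "$seq_n"
	do
		printf "$seq_fmt\n" "$seq_i"
		seq_i=$((seq_i + 1))
	done
}

# Hypothetical awk equivalent: one process spawn, then the loop runs
# inside awk. Assumes the format contains no backslash escapes, since
# "awk -v" would interpret them.
seq_awk () {
	awk -v fmt="$1" -v n="$2" '
		BEGIN {
			for (i = 1; i <= n; i++)
				printf(fmt "\n", i)
		}
	'
}
```

Running each with the same format at 10000 and 50000 under something
like bash's `time` is one way to see where the crossover lands on a
given machine.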

I did a few timings, and it looks like the shell wins at 10,000 on my
system, but awk wins at 50,000 (though there is a lot of run-to-run
noise; I think awk might even win at 10,000 on a loaded system, as this
is such a light load that CPU frequency throttling comes into play).

I assumed that the culprit was a lack of buffering, but I don't think
so. awk seems to issue 10,000 write() calls. I guess it is just internal
shell overhead in issuing commands. Where is a JIT byte-code shell
interpreter when we need one? ;)

My inclination is not to worry about it too much. At 10,000 I think we
are talking about a few milliseconds. There's so much more low-hanging
fruit if somebody wants to optimize the test suite. IMHO readability is
more important here (and if we really want to optimize, doing it inside
test_seq would be better).

-Peff