I can only guess it was a matter of alignment, which could depend on what precedes and follows each of the instantiations. Also, one of them was at the end of a section, so maybe that matters somehow. I posted unoptimized code because otherwise everything got inlined, and I didn't feel like writing a longer template that wouldn't be.
Anyway, it nicely illustrates the point that you can't be 100% sure what you get because it's an optimization. A more reliable (and efficient) way is to have a language mechanism which avoids generating duplicate code in the first place.
ML had parametric polymorphism in the '70s, Java has had it since 2004 (generic containers and sorting included), and Rust designers still haven't heard of it in the 2010s.
BTW, speaking of Rust, is it even possible to write a similar wrapper for stdlib's qsort by any means whatsoever in that language?
In C++ I had to resort to function templates and default arguments; not sure if Rust has those...
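To make the question concrete, here is a minimal sketch of the kind of wrapper I mean. It is not the exact code I posted earlier: the names default_compare and sort_array are made up for illustration, and it assumes T is trivially copyable and has operator<.

```cpp
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical adapter: turns T's operator< into a qsort-style comparator.
// One small instantiation per element type T.
template <typename T>
int default_compare(const void* a, const void* b) {
    const T& x = *static_cast<const T*>(a);
    const T& y = *static_cast<const T*>(b);
    return (x < y) ? -1 : (y < x) ? 1 : 0;
}

// Hypothetical typed wrapper around std::qsort.
// The default argument picks the comparator for T, so callers can write
// sort_array(data, count) without mentioning void* or sizeof.
template <typename T>
void sort_array(T* data, std::size_t count,
                int (*cmp)(const void*, const void*) = &default_compare<T>) {
    // qsort moves elements as raw bytes, so T must be trivially copyable.
    static_assert(std::is_trivially_copyable<T>::value,
                  "sort_array requires a trivially copyable element type");
    std::qsort(data, count, sizeof(T), cmp);
}

// Usage (sketch):
//   int a[] = {3, 1, 2};
//   sort_array(a, 3);   // sorts ascending using default_compare<int>
```

The only code instantiated per type here is the tiny comparator and the thin wrapper, both of which may well get inlined away; the sorting loop itself lives once in the C library.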