I've played with QOI some more.
While it performs rather well, and not too far from PNG, for a whole range of images (including the author's test set), it performs pretty poorly on some others. In particular, I have images generated with complex textures, for which the compression ratio falls to just about 1/2, while the PNG equivalent is almost 1/10 of that!
But that wasn't the end of it. I looked for a simple and fast compression that I could apply on top of QOI (a bit in the same vein as PNG, which does some pre-filtering and then uses - I think - 'deflate' to compress the result). I found XLZ, which is a simple and fast LZ77 compression:
https://github.com/banebyte5115/xlz
It does happen to complement QOI pretty well: QOI+XLZ is still MUCH faster for encoding (and even decoding) than PNG, it performs very well on a wider range of images than QOI alone, and the combined code is just a few KBytes. The images I mentioned above, which were "tough" for QOI alone, get compressed BETTER than with the highest-level PNG.
While QOI can easily be turned into a streaming compression and decompression, XLZ is another beast. As it's LZ77, it does require a significant amount of memory: basically, it requires a "sliding window". But you can always work around that for memory-limited applications by compressing the data in smaller, independent chunks. It might not be quite as efficient compression-wise, but it's workable.
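To illustrate the chunking workaround, here is a minimal sketch in Python. It does NOT use XLZ (I'm not quoting its API here); zlib from the standard library stands in as a generic LZ77-family compressor. The function names, the 4096-byte chunk size and the 4-byte length prefix are all just assumptions for the example.

```python
import struct
import zlib

CHUNK_SIZE = 4096  # hypothetical value; pick whatever fits your memory budget


def compress_chunked(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress each chunk independently, so the compressor never needs
    to look at more than chunk_size bytes of input at once.
    zlib is only a stand-in for an LZ77-style compressor such as XLZ."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        packed = zlib.compress(data[start:start + chunk_size])
        # Prefix each compressed chunk with its length (little-endian u32)
        # so the decoder knows where the next chunk starts.
        out += struct.pack("<I", len(packed))
        out += packed
    return bytes(out)


def decompress_chunked(blob: bytes) -> bytes:
    """Inverse of compress_chunked: read a length prefix, decompress
    that many bytes, repeat until the blob is exhausted."""
    out = bytearray()
    pos = 0
    while pos < len(blob):
        (size,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        out += zlib.decompress(blob[pos:pos + size])
        pos += size
    return bytes(out)
```

Since every chunk starts with an empty history, matches can never reach back into a previous chunk. That is where the loss in compression efficiency comes from, and smaller chunks make it worse.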
So, I'm definitely considering chaining the two for some applications.
I suggest treating the source code of both projects as "reference implementations" (as the author of QOI states) rather than production-ready code. For any serious project, I'd thus suggest rewriting them with your own constraints, code style and coding rules, if they apply.