Resistors are never required per se. It's just that LEDs need a current-limiting circuit, especially because of their negative Vf temperature coefficient: the hotter they get, the lower their forward voltage.
Due to that funkiness, a plain voltage supply won't work reliably, even if carefully tuned: a small shift in Vf causes a large change in current. If you add a series resistor, the LED current becomes less dependent on Vf variations, though it still depends on them. And here's the trade-off: the more voltage you drop over the resistor, the better the current regulation gets, but the more power is wasted in the resistor. Or you can drop only a tiny part of the total voltage over the resistor, but then it's close to not having a resistor at all, i.e., LED variations start significantly affecting current again.
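To put rough numbers on that trade-off, here is a minimal Python sketch. It assumes a simple model where the LED is a fixed voltage drop in series with the resistor, and all the values (3.0 V nominal Vf, a 0.2 V drop in Vf as the LED heats up, 20 mA target current) are illustrative assumptions, not figures for any particular LED.

```python
def led_current(v_supply, v_f, r_series):
    """Current through an LED in series with a resistor (fixed-drop LED model)."""
    return (v_supply - v_f) / r_series


def current_shift(v_resistor_drop, v_f_nominal=3.0, v_f_shift=-0.2, i_target=0.020):
    """Relative change in LED current when Vf moves by v_f_shift volts.

    The supply and resistor are sized so that the nominal LED runs at i_target
    with v_resistor_drop volts across the resistor. Defaults are assumptions.
    """
    v_supply = v_f_nominal + v_resistor_drop
    r_series = v_resistor_drop / i_target
    i_new = led_current(v_supply, v_f_nominal + v_f_shift, r_series)
    return (i_new - i_target) / i_target


if __name__ == "__main__":
    v_f = 3.0  # assumed nominal forward voltage
    for drop in (0.2, 1.0, 3.0, 9.0):
        efficiency = v_f / (v_f + drop)  # share of supply power reaching the LED
        print(f"resistor drop {drop:4.1f} V: "
              f"current shift {current_shift(drop):+7.1%}, "
              f"efficiency {efficiency:6.1%}")
```

In this model the relative current change is simply the Vf shift divided by the resistor drop, so with only 0.2 V across the resistor the current doubles, while with 1 V across it the current still rises by 20%. Getting the shift down to a few percent costs most of the supply power in the resistor.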
If you just connect LEDs in parallel, the one with the lowest Vf hogs the most current, gets hotter, drops its Vf even further and hogs even more current. Direct parallel connections are undesirable, but acceptable when the LEDs are matched (same batch, similar Vf) and thermally coupled together, and when the bond wires add a bit of resistance.
But as you are clearly after a really good and efficient circuit, don't parallel any LEDs, with or without resistors. If the string cannot be 16 LEDs long, then use multiple strings, each with its own driver, i.e., parallel the drivers, not the LEDs.
Now if you want good efficiency and a sane integration level / size, the driver should not be an ancient switcher IC like the LM2596. Sure, it was great in the 1990s. Look in the LED driver IC category at distributors; there are plenty. If you take efficiency seriously and the string is shorter than some 8 LEDs, synchronous rectification may be a requirement, as it gets rid of some 0.5 V of extra drop.
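As a rough illustration of why the rectifier drop matters more for short strings, the sketch below estimates the freewheeling-diode loss relative to LED power. It assumes a buck topology in continuous conduction that is otherwise ideal, a 24 V input, 3.0 V per LED and a 0.5 V diode drop; the topology and all of these numbers are assumptions for illustration, and the conduction loss of the synchronous FET that would replace the diode is not modeled.

```python
def diode_loss_fraction(n_leds, v_in=24.0, v_f_led=3.0, v_diode=0.5):
    """Freewheeling-diode loss as a fraction of LED output power.

    Assumes an otherwise ideal buck converter in continuous conduction, where
    the diode carries the output current for the (1 - D) part of each cycle.
    All default values are illustrative assumptions.
    """
    v_out = n_leds * v_f_led
    if not 0 < v_out < v_in:
        raise ValueError("string voltage must be between 0 and v_in for a buck")
    duty = v_out / v_in  # ideal buck duty cycle D = Vout / Vin
    # P_diode = v_diode * I * (1 - D) and P_led = v_out * I, so I cancels out.
    return v_diode * (1.0 - duty) / v_out


if __name__ == "__main__":
    for n in (1, 2, 4, 7):
        print(f"{n} LED(s): diode loss is {diode_loss_fraction(n):.1%} of LED power")
```

Under these assumptions the diode costs about 6% of the LED power for a 2-LED string but only a fraction of a percent for a 7-LED string, which is the sense in which shorter strings push you toward synchronous rectification.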
In any case, if efficiency is truly important, the choice of LEDs dominates here. It doesn't make a lot of sense to optimize the driver from 80% to 98% efficiency if the LEDs themselves are some cheap 80 lm/W crap while parts with over 200 lm/W are easily available.
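The comparison is easy to check with a back-of-the-envelope calculation. The sketch below multiplies LED efficacy by driver efficiency to get lumens per watt drawn from the supply, using the figures mentioned above; it ignores thermal derating and optical losses, and the function name is just a hypothetical helper.

```python
def system_efficacy(led_lm_per_w, driver_efficiency):
    """Lumens per watt of input power: LED efficacy times driver efficiency."""
    return led_lm_per_w * driver_efficiency


if __name__ == "__main__":
    cases = [
        ("80 lm/W LEDs, 80% driver", 80.0, 0.80),
        ("80 lm/W LEDs, 98% driver", 80.0, 0.98),
        ("200 lm/W LEDs, 80% driver", 200.0, 0.80),
        ("200 lm/W LEDs, 98% driver", 200.0, 0.98),
    ]
    for label, lm_per_w, eff in cases:
        print(f"{label}: {system_efficacy(lm_per_w, eff):.1f} lm per input watt")
```

Polishing the driver from 80% to 98% lifts the cheap LEDs from 64 to about 78 lm per input watt, while swapping to 200 lm/W LEDs gives 160 lm per input watt even with the 80% driver.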