You could make a circuit that dumps the excess charge into the neighboring cell(s), and so on down the chain until the last cell is fully charged. I'm not real sure what happens if you overcharge such a circuit, though: each clamp dumps charge into its neighbor, and you're still applying power, but it has nowhere to go. You'd need a battery manager on the outside that notes when all cells are fully charged and either disconnects the charger (terminal voltage goes up) or switches in a dummy load (terminal current stays high).
Maybe a voltage balancing chain would be better than a clamping chain...
Anyway, for the lossy version, it's only supposed to cost a small loss in charge efficiency, at most the same as the maximum mismatch between cells. If one cell is at 90% and the rest are at 100%, the full pack is limited to that 90% figure, and at first glance that cell has to burn the remaining 10% on every discharge-charge cycle. But a 90% discharge means the 100%-ers aren't fully discharged either, so they only need 90% put back, and the loss is more like 10% of 10%. So it's not so bad at all: a few percent of charge efficiency, not the full difference. That's the average case. The worst case (all cells start at "0%" charge) is still the 10% loss case, but again that's only 10% for that one cell, so for a 4S pack, say, it's still 2.5% max overall.
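To put numbers on that worst case, here's a rough Python sketch. It assumes things the post doesn't spell out: every cell starts empty, the same series current flows through all of them, charging stops when the largest cell is full, each clamp burns whatever its cell can't hold, and the cells sit at about the same voltage so charge stands in for energy. The function name and the capacity list are made up for illustration.

```python
def worst_case_clamp_loss(capacities):
    """Fraction of delivered charge burned in the clamps when all cells start empty.

    capacities: relative capacity of each series cell (e.g. 0.9 for a 90% cell).
    Assumes an idealized lossy clamp that shunts all charge a full cell cannot take.
    """
    if not capacities or min(capacities) <= 0:
        raise ValueError("need at least one cell with positive capacity")

    largest = max(capacities)
    # Each cell sees the same charge (series current) until the largest cell is full.
    delivered = largest * len(capacities)
    # A smaller cell fills early; its clamp burns the rest.
    burned = sum(largest - c for c in capacities)
    return burned / delivered


# Hypothetical 4S pack: one 90% cell, three 100% cells.
pack_4s = [0.9, 1.0, 1.0, 1.0]
print(f"worst-case loss: {worst_case_clamp_loss(pack_4s):.1%}")
```

For the 4S example this comes out at about 2.5%, the same figure as above: the weak cell burns 10% of its share, spread over four cells. A perfectly matched pack gives zero.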
So yeah, a few percent is down in the noise of normal variation and charge efficiency to begin with, so it's really not a bad method in general.
Tim