If the only difference between the capacitors were their capacitance, voltage derating would indeed solve the balancing problem. The smaller capacitor charges to a higher voltage than the bigger one, so you would pick the maximum total voltage such that the smaller capacitor just reaches its limit: for example, if Vsmaller = 5V when Vbigger = 4V, you would limit Vtotal to 9V.
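As a rough check of that derating arithmetic, here is a minimal sketch assuming two ideal series capacitors charged from empty with no leakage. The 8 F and 10 F values and the function names are made up for illustration; they are chosen only because they reproduce the 5V / 4V split.

```python
def series_voltages(c_small, c_big, v_total):
    """Split v_total across two ideal series capacitors charged from empty."""
    # The same charge Q flows through both capacitors and V = Q / C,
    # so the smaller capacitor ends up with the larger voltage.
    c_series = c_small * c_big / (c_small + c_big)
    q = c_series * v_total
    return q / c_small, q / c_big


def max_total_voltage(c_small, c_big, v_rated):
    """Largest stack voltage that keeps the smaller capacitor at v_rated."""
    # v_small = v_total * c_big / (c_small + c_big), solved for v_total.
    return v_rated * (1 + c_small / c_big)


# Hypothetical values: 8 F and 10 F in series, each rated for 5 V.
v_small, v_big = series_voltages(8.0, 10.0, 9.0)
v_limit = max_total_voltage(8.0, 10.0, 5.0)
print(f"Vsmaller = {v_small:.2f} V, Vbigger = {v_big:.2f} V, limit = {v_limit:.2f} V")
```

With these numbers the smaller capacitor sits at its 5V rating exactly when the stack is at 9V, which is the derated maximum.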
But balancing is also needed to account for different leakage currents. Think about it: if one of the capacitors discharges on its own faster than its buddy, the one that discharges less sees a larger part of the total voltage as you keep charging the series string.
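To get a feel for how fast this drift builds up, here is a rough sketch. It assumes two equal capacitors and treats each leakage as a constant current, which is a simplification since real leakage varies with voltage, temperature and age. The 10 F, 50 µA and 10 µA figures are invented for illustration.

```python
def voltage_drift(c_farads, i_leak_a, i_leak_b, seconds):
    """Voltage by which cell B ends up above cell A after `seconds`.

    Assumes equal capacitance and constant leakage currents (a simplification).
    """
    # Each cell integrates (series current - its own leakage). The series
    # current is common to both cells, so only the leakage difference remains.
    return (i_leak_a - i_leak_b) * seconds / c_farads


# Hypothetical cells: 10 F each, leaking 50 uA and 10 uA, over one day.
drift = voltage_drift(10.0, 50e-6, 10e-6, 24 * 3600)
print(f"imbalance after 24 h: {drift:.3f} V")
```

Under these assumptions a 40 µA leakage mismatch shifts the split by roughly a third of a volt per day, regardless of how well the capacitances are matched.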
The classic options are:
Parallel resistors: the idea is simply to increase the leakage of both capacitors enough that the original (smaller) leakage current becomes only a tiny part of the new, higher leakage current; the explicit resistors dominate and form a voltage divider. The obvious drawback is that these resistors draw current all the time, discharging the capacitors. Good for very short storage periods only.
Zener diodes / series diodes / etc.: unlike resistors, these devices have an exponential rise in current as a function of applied voltage. They are pretty good at keeping the voltage from exceeding the maximum, but they are still not on-off devices (just exponential), meaning they discharge the capacitors significantly near their clamping voltage. Also, the voltage curve of zeners / diodes is highly temperature dependent.
Active circuit: with a voltage reference + comparator, you can build a circuit which behaves pretty much like an "ideal zener": below the threshold, the leakage/quiescent current can be tiny enough not to matter, and above the threshold, it can shunt significant current. For a classic and simple implementation, see the TL431.
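To put rough numbers on the parallel-resistor option, here is a minimal sketch of the steady-state split of a stack held at a fixed voltage. It assumes each capacitor's leakage can be modeled as a fixed equivalent resistance, which is only an approximation. The 500k and 100k leakage values, the 10k balancing resistors and the 5V stack are hypothetical.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def steady_state_split(v_total, r_leak_a, r_leak_b, r_bal=None):
    """Return (v_a, v_b, standing_current) for a series pair held at v_total.

    Leakage is modeled as a fixed resistance across each capacitor
    (an assumption); r_bal is an optional resistor across each capacitor.
    """
    r_a = r_leak_a if r_bal is None else parallel(r_leak_a, r_bal)
    r_b = r_leak_b if r_bal is None else parallel(r_leak_b, r_bal)
    # In DC steady state the capacitors carry no current, so the voltage
    # divides like a plain resistive divider.
    current = v_total / (r_a + r_b)
    return current * r_a, current * r_b, current


# Hypothetical leakage: 500 kOhm vs 100 kOhm equivalent, 5 V stack.
va_raw, vb_raw, i_raw = steady_state_split(5.0, 500e3, 100e3)
va_bal, vb_bal, i_bal = steady_state_split(5.0, 500e3, 100e3, r_bal=10e3)
print(f"no resistors:  {va_raw:.2f} V / {vb_raw:.2f} V, {i_raw * 1e6:.0f} uA")
print(f"10k resistors: {va_bal:.2f} V / {vb_bal:.2f} V, {i_bal * 1e6:.0f} uA")
```

In this example the split improves from roughly 4.2V / 0.8V to roughly 2.6V / 2.4V, but the standing current rises from under 10 µA to a few hundred µA, which is the trade-off described for the resistor option.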