For some reason I couldn't leave that linearisation technique behind, and had to come up with an intuitive explanation:

TL;DR: adding 3 shifted sigmoid-like shapes reduces to adding only 2, because the 3rd is almost flat when far from the origin, so it doesn't change much what the middle diff pair does anyway. Geometrically, this is like sliding 2 parallel bars and adding the 3 points marked 1, 2, 3. The distance between the imaginary sliding bars has to be chosen such that the two bars cannot both fit on the same side, or else the linearity gets worse.

In terms of adding slopes it is the same. Visually, it is intuitive that adding the intersections of the first derivative (the hump-shaped plot) with the 2 bars tends to average to a more constant value than the hump itself. A more constant value here means a more constant slope, which means a more linear transfer function, which means less distortion.
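The slope argument can be checked numerically. What follows is a minimal sketch, assuming each pair is an ideal BJT differential pair with a normalised transfer function tanh(v / (2*VT)), so that its slope is the sech^2 hump. The 50 mV shift, the +/-20 mV window and the names pair_slope and slope_ripple are illustrative choices, not values taken from the schematic:

```python
import math

VT = 0.02585  # thermal voltage in volts near room temperature (assumed)

def pair_slope(v, shift=0.0):
    """Slope (the 'hump') of one ideal pair: d/dx tanh(x), x = (v - shift) / (2*VT)."""
    return 1.0 / math.cosh((v - shift) / (2 * VT)) ** 2

def slope_ripple(shifts, v_max=0.02, steps=81):
    """Relative slope variation, (max - min) / max, for inputs in -v_max..+v_max."""
    slopes = []
    for i in range(steps):
        v = -v_max + 2 * v_max * i / (steps - 1)
        # Total slope is the sum of the humps of all shifted pairs.
        slopes.append(sum(pair_slope(v, s) for s in shifts))
    return (max(slopes) - min(slopes)) / max(slopes)

single = slope_ripple([0.0])
triplet = slope_ripple([-0.05, 0.0, 0.05])
print(f"slope ripple, single pair : {single:.3f}")
print(f"slope ripple, three pairs : {triplet:.3f}")
```

A smaller ripple means the summed slope stays closer to constant over the window, which is the "averaging of the hump" described above.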

Now, if adding shifted sigmoids reduces distortion, the same effect should be achieved no matter how we shift the transfer function (TF). Instead of shifting the TF with transistors of different areas, we can use normal differential pairs (transistors of the same area) and shift the TF by other means, for example with a DC bias.

0mV - 0.292460%

5mV - 0.287210%

10mV - 0.271911%

20mV - 0.217110%

50mV - 0.032649% <--- min distort 0.03% (+/-50mV value comes from a brief browse, not optimized)

100mV - 0.140207%
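A sweep like the one above can be approximated without a circuit simulator. This is only a sketch under assumptions: three ideal tanh pairs with equal tail currents shifted by -offset, 0 and +offset, a 10 mV peak sine input (the amplitude behind the figures above is not stated), and harmonics 2 to 9 counted in the THD. The absolute numbers are therefore not expected to match the simulated ones, only the trend:

```python
import math

VT = 0.02585  # thermal voltage in volts (assumed)

def triplet(v, offset):
    """Summed output of three ideal pairs shifted by -offset, 0 and +offset."""
    return sum(math.tanh((v - s) / (2 * VT)) for s in (-offset, 0.0, offset))

def thd(offset, amplitude=0.01, n=256, max_harmonic=9):
    """THD of the triplet output for a sine input, using a direct DFT."""
    out = [triplet(amplitude * math.sin(2 * math.pi * k / n), offset)
           for k in range(n)]

    def magnitude(h):
        re = sum(y * math.cos(2 * math.pi * h * k / n) for k, y in enumerate(out))
        im = sum(y * math.sin(2 * math.pi * h * k / n) for k, y in enumerate(out))
        return math.hypot(re, im)

    harmonics = math.sqrt(sum(magnitude(h) ** 2
                              for h in range(2, max_harmonic + 1)))
    return harmonics / magnitude(1)

for offset_mv in (0, 5, 10, 20, 50, 100):
    print(f"{offset_mv:>3} mV  THD = {100 * thd(offset_mv / 1000):.4f} %")
```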

The advantage is that instead of (1+13) + (1+1) + (13+1) = 30 units of area, the same effect can be achieved with 3 equal pairs, all BJTs with the same 1x area, so a total of 6x area instead of 30x. A 1x area BJT might also be faster than a 13x area one, which allows a higher working frequency.
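The two ways of shifting are linked by the standard BJT relation that an emitter-area ratio A between the two transistors of a pair shifts its transfer function by VT*ln(A). A small sketch of that conversion and of the area count above, assuming VT = 25.85 mV; the function names are made up for illustration:

```python
import math

VT = 0.02585  # thermal voltage in volts (assumed)

def offset_from_area_ratio(ratio):
    """Input offset (V) of a pair whose two transistors have this area ratio."""
    return VT * math.log(ratio)

def area_ratio_from_offset(offset):
    """Area ratio that would produce the same shift as a DC offset (V)."""
    return math.exp(offset / VT)

# Area bookkeeping, in units of one 1x transistor.
asymmetric_area = (1 + 13) + (1 + 1) + (13 + 1)
symmetric_area = 3 * (1 + 1)

print(f"13:1 pair shift        : {1000 * offset_from_area_ratio(13):.1f} mV")
print(f"ratio matching 50 mV   : {area_ratio_from_offset(0.05):.2f}")
print(f"area, asymmetric pairs : {asymmetric_area}x")
print(f"area, DC-shifted pairs : {symmetric_area}x")
```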

If more area is available, more symmetrical pairs can be paralleled to reduce the noise factor.

Another advantage of symmetrical (DC-shifted) pairs is that the overall distortion is lower, at least in simulation, than for the same schematic implemented with asymmetrical pairs (see the attached FFT detail vs. the previous FFT detail: min THD 0.03% vs. 0.05% with asymmetrical areas).

If the end application is at low frequency, then instead of 3 differential pairs with 3 distinct DC offsets we can use only one differential pair, and switch the other input by +/-50mV really fast.

If the switching is at a much higher frequency than the signal, then the output will average the +/-50mV of "shaking", and it will behave as if there were 3 differential pairs instead of just one pair of BJTs. The shaking signal must be averaged out of the output (the output must be low-pass filtered in case this does not happen naturally).
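The shake-and-average idea can be tried in a few lines. This is a rough sketch, not a circuit simulation: it assumes one ideal tanh pair, a 10 mV peak sine input, a shake 64 times faster than the signal, and a moving average over one shake cycle standing in for the low-pass filter. The levels -50, 0 and +50 mV are chosen to mimic the three pairs; a pure two-level +/-50 mV square would use only the two outer values. All names are illustrative:

```python
import math

VT = 0.02585  # thermal voltage in volts (assumed)

def shaken_output(levels, amplitude=0.01, shake_cycles=64):
    """One signal period through a single ideal pair. The input offset steps
    through `levels` once per shake cycle; the output is averaged per cycle."""
    step = len(levels)
    n = shake_cycles * step
    raw = [math.tanh((amplitude * math.sin(2 * math.pi * k / n)
                      + levels[k % step]) / (2 * VT))
           for k in range(n)]
    # Crude low-pass: circular moving average over exactly one shake cycle.
    return [sum(raw[(k + j) % n] for j in range(step)) / step for k in range(n)]

def thd(samples, max_harmonic=9):
    """THD over harmonics 2..max_harmonic of one period of samples."""
    n = len(samples)

    def magnitude(h):
        re = sum(y * math.cos(2 * math.pi * h * k / n) for k, y in enumerate(samples))
        im = sum(y * math.sin(2 * math.pi * h * k / n) for k, y in enumerate(samples))
        return math.hypot(re, im)

    return math.sqrt(sum(magnitude(h) ** 2
                         for h in range(2, max_harmonic + 1))) / magnitude(1)

thd_plain = thd(shaken_output([0.0]))
thd_shaken = thd(shaken_output([-0.05, 0.0, 0.05]))
print(f"THD without shaking : {100 * thd_plain:.4f} %")
print(f"THD with shaking    : {100 * thd_shaken:.4f} %")
```

In this idealised model the averaged output of the single shaken pair is the mean of the three shifted transfer functions, which is why it behaves like the three-pair circuit up to a gain factor.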

Even more conveniently, what the above paragraph describes is nothing but addition (not modulation). Addition can be done by simply putting the voltage sources in series, or maybe with 2 summing resistors, as in the last draft schematic in the lower right corner.

Note that shaking with a sinusoidal HF signal, instead of a square one, will still reduce distortion compared with a normal, unshaken input signal. (Sinusoidal shaking should be easier to implement than square; analysis is required to tell which one linearizes the amplifier better.)
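A first look at the sine-versus-square question can be had by averaging an ideal pair over one shake period and computing the THD of the averaged transfer function. The sketch below assumes a tanh pair, a 10 mV peak signal, perfect averaging, and the same 50 mV peak for both shake shapes. Since each shape has its own best amplitude, this does not settle which one is better in general:

```python
import math

VT = 0.02585   # thermal voltage in volts (assumed)
PEAK = 0.05    # shake peak in volts, same for both shapes (assumed)

def no_shake(phase):
    return 0.0

def square_shake(phase):
    return PEAK if math.sin(phase) >= 0 else -PEAK

def sine_shake(phase):
    return PEAK * math.sin(phase)

def averaged_tf(v, shake, phases=64):
    """Output of one ideal pair averaged over one full shake period."""
    total = 0.0
    for p in range(phases):
        phase = 2 * math.pi * (p + 0.5) / phases
        total += math.tanh((v + shake(phase)) / (2 * VT))
    return total / phases

def thd(shake, amplitude=0.01, n=128, max_harmonic=9):
    """THD of the averaged transfer function for a sine input."""
    out = [averaged_tf(amplitude * math.sin(2 * math.pi * k / n), shake)
           for k in range(n)]

    def magnitude(h):
        re = sum(y * math.cos(2 * math.pi * h * k / n) for k, y in enumerate(out))
        im = sum(y * math.sin(2 * math.pi * h * k / n) for k, y in enumerate(out))
        return math.hypot(re, im)

    return math.sqrt(sum(magnitude(h) ** 2
                         for h in range(2, max_harmonic + 1))) / magnitude(1)

for name, shake in (("none", no_shake), ("square", square_shake), ("sine", sine_shake)):
    print(f"{name:>6} shake  THD = {100 * thd(shake):.4f} %")
```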

Both adding the shaking jumps to the input signal and low-pass filtering the output, if necessary, can be done outside of an existing amplifier. The existing amplifier doesn't have to be modified, and doesn't have to be specially designed to include parallel differential pairs, as long as it has enough bandwidth for the fast "shaking" of the useful signal.

In conclusion, the same core principle of adding shifted transfer functions can be implemented in any of these ways:

- by additional asymmetrical differential pairs

- by additional symmetrical differential pairs with different DC offsets

- without additional differential pairs, by time-multiplexing a DC offset superimposed on the input signal and averaging the amplifier output

The first 2 methods require specialized amplifiers; the 3rd may be applicable to already existing amplifiers.