So all three pins of Q2 and Q4 are tied together? No need for any current-balancing measures? Would it work with only 3 transistors?
You want the "center transistor" to conduct 2X the bias of the input differential transistors, thus the choice of 2 paralleled transistors. A single device would cause an offset current in the output. On an IC, transistors are almost identical, especially with a tight layout; with discrete devices you'll see some variation.
WRT the input resistors having a bias drop, it's low at (Ic/Beta)*R, which is just 1 mV with Ic of 1 mA, Beta of 100 and R of 100Ω. Nothing to worry about for discrete use, as the individual transistors will likely have more of a Vbe difference than 1 mV!
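If you want to plug in your own numbers, here is a minimal Python sketch of that (Ic/Beta)*R arithmetic. The function name is just made up for illustration, and the 50Ω line assumes the same 1 mA and Beta of 100 as the Protoboard case, which may not match an actual chip design.

```python
def base_resistor_drop(ic_amps, beta, r_ohms):
    """Voltage across a series base resistor caused by base current Ib = Ic / beta."""
    base_current = ic_amps / beta      # Ib in amps
    return base_current * r_ohms       # V = Ib * R, in volts

# Protoboard values from the post: Ic = 1 mA, Beta = 100, R = 100 ohms
drop_100 = base_resistor_drop(1e-3, 100, 100.0)

# Assumption for comparison only: same Ic and Beta with a 50 ohm resistor
drop_50 = base_resistor_drop(1e-3, 100, 50.0)

print(f"100 ohm: {drop_100 * 1e3:.2f} mV")
print(f" 50 ohm: {drop_50 * 1e3:.2f} mV")
```

Halving the resistor halves the drop, and either way it is small next to the Vbe spread you would expect from unmatched discrete parts.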
In our chip designs these resistors were 50Ω, but we didn't have any 50Ω resistors on hand, so we just used 100Ω instead on the Protoboard.
Honestly we didn't try to match any of the 2N3904s, just grabbed 4 from the parts tub, a handful of resistors and plugged things into a Protoboard.
Here's what the kludged up Protoboard looks like. The white dots on the transistors are for device identification: White for 2N3904, Red for 2N3906, Blue for 2N7000. We don't see well and this helps locate devices in the parts tub; resistors are another story.
Best