Once you have the control part figured out, you need to address the actual audio switching. You will not be able to switch audio signals cleanly with BJTs. A BJT acts like a current-controlled current source (with some current gain, the hFE of the transistor), so it won't work well in this application.
You might have success with FETs. A FET acts as a voltage-controlled variable resistance, so it can act as a switch for analog signals. Even easier, get some variation of an "analog switch" or "analog mux" IC, such as the 4053, 4052, 4066, etc. (the prefix varies by manufacturer and logic family: CD4066, 74HC4066, etc.). These package up FETs along with the appropriate circuitry to drive them, and accept digital logic inputs to control everything.
In either case, you need to realize that the FET isn't a perfect switch: its on-resistance varies with the signal level. This will cause distortion unless you take care to design out the effect of this variable resistance. In general, all that is needed is a low source impedance driving the input of the FETs and a high load impedance on the other side. Then the variation of resistance in the FET has virtually no effect on the signal level. In practice, this means buffering each input with an op-amp wired as a unity-gain follower and sending the outputs of those to the inputs of the analog mux IC (or discrete FETs if you choose). The output of the mux/FETs should then connect to another op-amp wired as a unity-gain follower. You'll want to AC couple (DC block) at some point too, as some offset is likely to be introduced.
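To put rough numbers on the buffering argument, here is a minimal sketch that treats the source, the switch's on-resistance, and the load as a simple resistive divider. All component values are assumptions picked for illustration (they are not taken from any datasheet), and the function names are made up for this example. It estimates how much the signal level shifts as the on-resistance moves between two values, first with a moderate load and then with the switch sitting between two op-amp followers.

```python
import math


def divider_gain(r_source, r_on, r_load):
    """Fraction of the source voltage that reaches the load (series divider)."""
    return r_load / (r_source + r_on + r_load)


def gain_variation_db(r_source, r_on_min, r_on_max, r_load):
    """Level change in dB as the switch resistance swings from min to max."""
    g_hi = divider_gain(r_source, r_on_min, r_load)
    g_lo = divider_gain(r_source, r_on_max, r_load)
    return 20 * math.log10(g_hi / g_lo)


# Assumed on-resistance swing over the signal range (illustrative only).
R_ON_MIN = 80.0   # ohms
R_ON_MAX = 120.0  # ohms

# Case 1: no buffers, assumed 1 kohm source feeding an assumed 10 kohm load.
unbuffered = gain_variation_db(1_000.0, R_ON_MIN, R_ON_MAX, 10_000.0)

# Case 2: op-amp followers on both sides, assumed ~1 ohm source and
# ~1 Mohm load seen by the switch.
buffered = gain_variation_db(1.0, R_ON_MIN, R_ON_MAX, 1_000_000.0)

print(f"unbuffered level variation: {unbuffered:.5f} dB")
print(f"buffered level variation:   {buffered:.5f} dB")
```

In this simple model it is mainly the high load impedance after the switch that swamps the change in on-resistance. The sketch only covers the level shift; it does not model the distortion spectrum, crosstalk, or any frequency-dependent effects.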
The above is OK if you want a completely solid-state solution. I'd personally be tempted to use relays, since they remove the need for the additional op-amps and will have much less crosstalk. If you go with relays, be absolutely certain to use signal relays (a.k.a. telecom relays), not power relays. They really are different, and a power relay used for small signals will end up making bad connections after a while.