Generally, the B-E junction has higher capacitance. Both the C-B and B-E junctions work for overvoltage protection; the difference is the frequency on the line. At DC and low frequencies either option is OK, but for high speed only C-B.
I think you got that reversed; the base-collector junction is slow and the base-emitter junction is fast. Sometimes the base-collector junction is used because its slow recovery time clamps the RF signal for a short time after the overload is removed, like a PIN diode would. Usually, though, the base-collector junction is used for its higher breakdown voltage.
I've seen that done a few times; my assumption was that it was to get diodes with thermal characteristics that match the transistors in the circuit. I'm sure one of our resident analog experts can correct me or elaborate though.
That is only very rarely the reason. Usually it is to take advantage of lower leakage than common diodes or the extremely high speed of the base-emitter junction. Even in the past, 12 volt <500 ps switching diodes were specialty items because of low demand; most circuits would use a small-signal Schottky diode instead.
TO-225 or TO-220 transistors are sometimes used as thermal sensors instead of axial diodes because the tab makes for convenient mounting.
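For anyone wanting to try the thermal sensor idea, here is a rough sketch of turning a measured forward voltage into a temperature estimate. It assumes the junction is biased at a fixed current and that the forward voltage falls by roughly 2 mV per degree C, which is only a rule of thumb for silicon; the real slope depends on the part and the bias current, so calibrate against a known temperature. The function name, the 0.650 V reference point, and the default slope are all made up for illustration.

```python
def junction_temp_c(v_measured, v_ref, t_ref_c=25.0, tempco_v_per_c=-2.0e-3):
    """Estimate junction temperature (deg C) from forward voltage.

    v_measured     -- forward voltage now, in volts, at the fixed bias current
    v_ref          -- forward voltage recorded at the calibration temperature
    t_ref_c        -- calibration temperature in deg C (assumed 25 C here)
    tempco_v_per_c -- assumed slope in volts per deg C (about -2 mV/C for silicon)
    """
    # Linear model: each degree of heating lowers the forward voltage by |tempco|.
    return t_ref_c + (v_measured - v_ref) / tempco_v_per_c


if __name__ == "__main__":
    # Hypothetical calibration point: 0.650 V at 25 C.
    # A reading of 0.600 V is 50 mV lower, so the estimate is about 25 C hotter.
    print(round(junction_temp_c(0.600, 0.650), 1))
```

A one-point calibration like this only fixes the offset; measuring at a second known temperature would let you replace the assumed slope with a measured one.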