The Cortex-M series uses a 16-bit instruction set (the Thumb instruction set in ARM speak). Older microcontrollers often used the ARM7TDMI core, which supports both the full-blown 32-bit instruction set and the 16-bit Thumb set. Using the Thumb instruction set usually resulted in a huge space saving at the cost of a minor performance reduction.
Cortex-M cores use the mixed 16/32-bit Thumb2 instruction set, except the M0/M0+, which use almost pure 16-bit Thumb1 instructions plus just a few 32-bit system management instructions. Choose Thumb2 or ARMv7-M for both. None of the Cortex-M cores support the original ARM instructions at all.
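To make the mixed 16/32-bit encoding concrete, here is a minimal sketch of how a decoder can tell the two sizes apart. It relies on the rule in the ARMv6-M/ARMv7-M architecture manuals that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction. The function names are made up for illustration and are not part of any toolchain.

```python
def thumb_instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("expected a 16-bit halfword")
    top5 = first_halfword >> 11  # bits [15:11]
    # These three prefixes mark the first half of a 32-bit encoding.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_thumb_stream(halfwords):
    """Group halfwords into instructions, each a tuple of 1 or 2 halfwords.

    A stream that ends in the middle of a 32-bit instruction yields a
    final 1-tuple; this sketch does not try to detect that case.
    """
    instructions = []
    i = 0
    while i < len(halfwords):
        n = thumb_instruction_length(halfwords[i]) // 2
        instructions.append(tuple(halfwords[i:i + n]))
        i += n
    return instructions
```

For example, the stream 0xBF00 (NOP), 0xF3BF 0x8F4F (DSB SY), 0x4770 (BX LR) splits into one 16-bit, one 32-bit and one 16-bit instruction. DSB is one of the few 32-bit instructions the M0/M0+ accept.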
ARM7TDMI implements ARMv4T which, as you say, has both the original 32-bit ARM instructions and the 16-bit Thumb instructions. Any given function normally uses purely one or the other, with the state changed by a BX to an address whose bit 0 selects Thumb, although it's possible on some CPUs to play tricks such as ADD PC,PC,#1 to switch to Thumb mode before executing the next instruction.
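The state switch can be modelled in a few lines. This is a simplified sketch of BX semantics only: bit 0 of the target selects Thumb state and is cleared from the address that is actually fetched. It also assumes the usual ARM-state rule that reading PC yields the current instruction's address plus 8. The helper names are hypothetical.

```python
def bx(target: int):
    """Model BX Rm: return (new_pc, thumb_state)."""
    thumb_state = bool(target & 1)       # bit 0 set -> continue in Thumb state
    new_pc = target & ~1 & 0xFFFFFFFF    # bit 0 is not part of the fetch address
    return new_pc, thumb_state


def arm_pc_read(instruction_address: int) -> int:
    """Value seen when an ARM-state instruction reads PC (address + 8)."""
    return instruction_address + 8


# Common idiom, with the ARM code at a hypothetical address 0x8000:
#   0x8000: ADD r0, pc, #1   ; r0 = 0x8000 + 8 + 1 = 0x8009
#   0x8004: BX  r0           ; continue at 0x8008 in Thumb state
r0 = arm_pc_read(0x8000) + 1
new_pc, thumb_state = bx(r0)
```

Because of the +8 offset, the Thumb code lands immediately after the BX, so no padding is needed between the two instruction sets.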
In the days when ARM7TDMI ruled, there was often only a 16-bit bus between CPU and memory, so every 32-bit ARM instruction needed two bus accesses to fetch. In that case Thumb was faster as well in almost all cases, and you usually only switched to ARM mode for specialised instructions that the stripped-down Thumb mode didn't provide.
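A back-of-the-envelope sketch of the fetch cost on a narrow bus follows. It counts instruction fetches only and ignores wait states, data accesses and any prefetch or cache. The instruction counts are invented for illustration (the assumption being that Thumb needs somewhat more instructions for the same work); they are not measurements.

```python
def bus_accesses(instruction_count: int, instruction_bits: int, bus_bits: int = 16) -> int:
    """Bus accesses needed just to fetch the given number of instructions."""
    per_instruction = -(-instruction_bits // bus_bits)  # ceiling division
    return instruction_count * per_instruction


# Hypothetical routine: 100 ARM instructions versus 130 Thumb instructions.
arm_fetches = bus_accesses(100, 32)    # 2 accesses per instruction
thumb_fetches = bus_accesses(130, 16)  # 1 access per instruction
```

With these made-up numbers the ARM version needs 200 fetch accesses and the Thumb version 130, so Thumb wins on a 16-bit bus despite executing more instructions. On a 32-bit bus the same function gives 100 against 130, which is where the ARM-mode speed advantage comes from.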