The BD139 / BD140 is indeed not TO-220 but a smaller package (TO-126), and it fits a breadboard better.
They are almost as cheap as TO-92 parts, while TO-220 parts are bound to be a lot more expensive, so in the end you won't save much. The thicker TO-220 pins are also harder on breadboards. Higher-power BJTs tend to have a lower hFE, and this is already clearly starting to show with the BD139 (which is also a 40+ year old design).
But instead of trying to save on transistors, I'd advise going the other way around.
You can get assortment boxes of different transistors cheaply:
https://www.aliexpress.com/wholesale?SearchText=transistor+assortment
And in bulk they are probably cheaper still.
Devise some tests (too high a voltage, current, or power) that will let the smoke out of some of these transistors, but not out of others.
There is no better way to learn than to see the differences between these transistors, and a bit of smoke makes it fun too.
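To plan such a test on paper first, you can check an operating point against the three absolute maximum ratings. Below is a rough Python sketch. The `Limits` class, the `exceeded` helper and the numbers are all made up for illustration (they are only roughly in the range of a small TO-92 and a medium TO-126 part), so take the real values from the datasheets of the parts you actually have.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Absolute maximum ratings (illustrative values, NOT from a datasheet)."""
    vce_max: float  # maximum collector-emitter voltage, in volts
    ic_max: float   # maximum collector current, in amperes
    p_max: float    # maximum power dissipation, in watts


def exceeded(limits, vce, ic):
    """Return the names of the ratings exceeded at the point (vce, ic)."""
    over = []
    if vce > limits.vce_max:
        over.append("voltage")
    if ic > limits.ic_max:
        over.append("current")
    if vce * ic > limits.p_max:  # dissipation is roughly Vce * Ic
        over.append("power")
    return over


# Hypothetical parts; replace the numbers with real datasheet values.
parts = {
    "small TO-92": Limits(vce_max=40.0, ic_max=0.2, p_max=0.6),
    "medium TO-126": Limits(vce_max=80.0, ic_max=1.5, p_max=1.25),
}

for name, limits in parts.items():
    # 10 V at 100 mA is 1 W: within voltage and current ratings for both,
    # but above the assumed power rating of the small part.
    print(name, exceeded(limits, vce=10.0, ic=0.1))
```

With these assumed numbers, an operating point of 10 V and 100 mA stays inside the voltage and current ratings of both parts, but only the small one is over its power rating. That is exactly the kind of test point that separates the two.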
Tests also do not have to be destructive. What happens if you put two different transistors in a long-tailed pair, and why does that happen?
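As a starting point for the long-tailed pair question, here is a small Python sketch using the ideal exponential BJT model, Ic = Is * exp(Vbe / Vt). It neglects base currents and the Early effect. The function names and the saturation currents are hypothetical and only serve to show the trend. A real measurement will differ, and working out why is part of the exercise.

```python
import math

VT = 0.02585  # thermal voltage in volts, at roughly 300 K


def tail_split(i_tail, is1, is2, v_diff=0.0):
    """Collector currents (ic1, ic2) of an idealised long-tailed pair.

    i_tail   : tail current in amperes
    is1, is2 : saturation currents of the two transistors (assumed values)
    v_diff   : base voltage of transistor 1 minus that of transistor 2, in volts
    """
    # With a shared emitter node, ic1 / ic2 = (is1 / is2) * exp(v_diff / VT).
    ratio = (is1 / is2) * math.exp(v_diff / VT)
    ic2 = i_tail / (1.0 + ratio)
    ic1 = i_tail - ic2
    return ic1, ic2


def offset_voltage(is1, is2):
    """Input voltage difference that makes both collector currents equal."""
    return VT * math.log(is2 / is1)


# Matched pair: the 1 mA tail current splits evenly.
print(tail_split(1e-3, 1e-14, 1e-14))

# Mismatched pair (hypothetical 2:1 ratio in Is): uneven split at zero input.
print(tail_split(1e-3, 2e-14, 1e-14))

# The offset voltage needed to rebalance the mismatched pair.
print(offset_voltage(2e-14, 1e-14))
```

In this model a mismatch in Is shows up as an uneven current split at zero input, which is the same thing as an input offset voltage.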
There are also grey areas where a (saturated) BJT refuses to conduct more current without necessarily getting damaged.
Examining such areas is very educational. In education, the why is always more important than the how.
To tell the different transistors apart you can dip them in some paint instead of reading the tiny part numbers, or just write them off as single use.
But it also depends on the age group and what you want to teach them about transistors.