I have worked with my own soft core CPUs and looked at others. I prefer the MISC-style processor (Minimal Instruction Set Computer), which is pretty much a simple stack machine (also called NOSC for No Operand Set Computer). I also program in assembly/Forth, since the two are very similar on a stack machine and you don't need fancy debugging tools.
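To show what "no operand" means in practice, here is a minimal sketch of a stack machine in Python. It is not modeled on any particular core; the `run` function and the handful of instruction names are just assumptions for illustration. The point is that instructions never name registers or addresses, they only work on the top of the stack, which is why assembly for such a machine reads almost like Forth.

```python
def run(program):
    """Execute a list of instructions on a tiny data stack and return the stack."""
    stack = []
    for instr in program:
        if isinstance(instr, int):
            # A literal is pushed onto the data stack.
            stack.append(instr)
        elif instr == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif instr == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif instr == "dup":
            # Copy the top of the stack.
            stack.append(stack[-1])
        elif instr == "swap":
            # Exchange the top two stack items.
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif instr == "drop":
            stack.pop()
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    return stack


# The Forth phrase "3 4 + dup *" computes (3 + 4) squared.
print(run([3, 4, "+", "dup", "*"]))  # [49]
```

Every operation here takes its inputs from the stack and puts its result back, so the instruction encoding needs no operand fields at all.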
Here is a generic reference to a wide variety of soft core CPUs, from the simplest to the most complex. If you spend some time looking at this data (it's a LOT btw) you will find some things that blow your mind.
https://opencores.org/projects/up_core_list/summary
https://opencores.org/projects/up_core_list/downloads
https://opencores.org/usercontent/doc/1523749899
There is a MicroBlaze that only uses 260 6LUTs in a kintex-7-3. Yeah, the list's author doesn't try to normalize the reporting across LUT types, he just tells you what type of LUTs were used: 4 input, 6 input, or "A", which I assume means Altera?
Even more interesting is the 320 6LUT RISC-V in a virtex-u-2 by Jan Gray that gets 1171.9 KIPS per LUT! It is 1 clock per instruction and runs at up to 375 MHz. Ohhh... it's proprietary, only in the list for reference...
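The KIPS per LUT number can be checked from the other figures quoted. This is a quick sketch assuming the metric is simply instructions per second divided by LUT count; the list itself may apply other adjustments that I'm not accounting for, and `kips_per_lut` is just a name I made up for the calculation.

```python
def kips_per_lut(clock_mhz, clocks_per_instruction, luts):
    """Thousands of instructions per second delivered per LUT.

    Assumes throughput is simply clock rate divided by clocks per instruction.
    """
    instructions_per_second = clock_mhz * 1e6 / clocks_per_instruction
    return instructions_per_second / 1e3 / luts


# 375 MHz, 1 clock per instruction, 320 LUTs
print(kips_per_lut(375, 1, 320))  # 1171.875
```

That rounds to the 1171.9 in the table, so at least for this entry the figure of merit looks like plain MHz over LUTs.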
If you are interested in the stack processors (labeled "Forth" in the tables), check out the J1 and J1a from James Bowman. The J1 is a simple design, 1 clock per instruction at up to 400 MHz. It is in Verilog, but I believe it has been translated to VHDL, which is not hard even if you do it yourself. Rather than worry about language issues, just look at what registers and logic the Verilog is describing and write that in VHDL. It is most likely pretty straightforward code.
Anyway, lots and lots of data to analyze there. The guy who maintains the list is Jim Brakefield. He must really love this stuff to do all this work. He's been keeping this list for at least 10 years, probably longer.