But I would first consider whether designing your own ISA is really worth it, or whether implementing your soft core using a well-known ISA that already has support in major compilers wouldn't be MUCH easier. Of course RISC-V comes to mind, but you could consider other ISAs (such as MIPS, PowerPC or whatever else) - just be aware of possible licensing issues.
Why design your own ISA?
1) fun, education, challenge.
OK, that's unarguable.
2) so no one else can reverse-engineer or rip off your valuable code, compiled to your undocumented and possibly wacky ISA.
Ooookay.
3) because there aren't any existing good ISAs that you're allowed to use without getting sued and that fit your requirements.
That used to be true, even not so long ago, but it's almost certainly not true now.
There are obviously some fairly diverse ISAs such as the ZPU already mentioned, or the J1. Most of those seem to be both an ISA and an implementation, and it's probably not all that useful to make a different implementation. The same goes for ancient ISAs that you could probably get away with using such as 8080/Z80, 6502, 8051. Also, OpenRISC.
If things don't have a good C compiler -- preferably gcc or LLVM -- then that rules them out for many uses. Most of the 8 bit CPUs make a terrible target for high level languages -- it's possible, but both the code density and the speed suck relative to hand-written assembly language. Real large programs for them consist of hand-written assembly language for the speed-critical parts, plus an interpreter for a better designed ISA.
Most old ISAs are also *full*. There's no room in the encoding left for custom instructions, so if you want to add something extra there's nowhere convenient to put it.
It's not even as if the different ISAs are all that different to each other. The biggest difference is perhaps between stack-based ones and register-based ones. But even then they all have:
- a way to load a constant literal value
- a way to load a value from memory at a calculated address, or to store a calculated value there
- arithmetic operations: add, subtract, and, or, xor, left shift, right shift (usually both logical and "arithmetic" provided)
- a way to test a value against zero, or two values against each other, and choose different execution paths
- a way to temporarily transfer execution to another place, remembering where it came from so it can return later, and to do this in a nested way
After this it's just pretty much meaningless variations in the number of registers, in the details of the binary encoding, and of possibly combining several of the above things into a single instruction. No new ISA since 1985 has made combined operations to any extent greater than adding together a register, a shifted register, and a constant to get an address for load/store (or to write back to a result register) -- and many don't do more than register plus constant.
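To make that concrete, here's a rough sketch of a toy register machine that has nothing but the five capabilities listed above. Everything in it (the opcode names, the `(op, a, b, c)` tuple layout, the eight registers) is invented for illustration and isn't taken from any real ISA.

```python
# Toy register machine. The opcode names and tuple layout are made up for
# this sketch and do not correspond to any real ISA.
def run(program, memory, max_steps=10_000):
    regs = [0] * 8   # r0..r7, all plain general-purpose registers here
    pc = 0
    for _ in range(max_steps):
        op, a, b, c = program[pc]
        pc += 1
        if op == "li":        # load a constant literal
            regs[a] = b
        elif op == "lw":      # load from a calculated address
            regs[a] = memory[regs[b] + c]
        elif op == "sw":      # store to a calculated address
            memory[regs[b] + c] = regs[a]
        elif op == "add":     # arithmetic (sub/and/or/xor/shifts look the same)
            regs[a] = regs[b] + regs[c]
        elif op == "bne":     # compare two values, choose a path
            if regs[a] != regs[b]:
                pc = c
        elif op == "jal":     # call: remember where we came from
            regs[a] = pc
            pc = b
        elif op == "jr":      # return via the saved address
            pc = regs[a]
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached")

# Sum memory[0..2] in a subroutine, then store the result in memory[3].
PROGRAM = [
    ("li", 1, 0, 0),     # 0: r1 = 0 (index)
    ("li", 2, 3, 0),     # 1: r2 = 3 (element count)
    ("li", 3, 0, 0),     # 2: r3 = 0 (running sum)
    ("li", 5, 1, 0),     # 3: r5 = 1 (increment)
    ("jal", 7, 7, 0),    # 4: call the loop at 7, return address in r7
    ("sw", 3, 0, 3),     # 5: memory[r0 + 3] = r3
    ("halt", 0, 0, 0),   # 6
    ("lw", 4, 1, 0),     # 7: r4 = memory[r1 + 0]
    ("add", 3, 3, 4),    # 8: r3 += r4
    ("add", 1, 1, 5),    # 9: r1 += 1
    ("bne", 1, 2, 7),    # 10: loop while r1 != r2
    ("jr", 7, 0, 0),     # 11: return
]

memory = [3, 4, 5, 0]
regs = run(PROGRAM, memory)
print(memory)
```

Nested calls aren't shown, but they need nothing new: the callee would save r7 to memory with `sw` before its own `jal` and reload it with `lw` afterwards.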
I think by now everyone here knows I'm a fan of RISC-V :-) :-)
Why?
- it's free to use
- it's well supported by gcc and LLVM and by a rapidly growing number of libraries and OSes
- it's got a lot of convenient encoding space reserved for custom instructions
- it's very modular. Registers can be 32 or 64 bits. There can be 16 or 32 registers. gcc and LLVM can target any combination, including down to compiling any C/C++ program to just 37 fixed-size 32 bit opcodes -- which is more like eight instruction types with simple variations: register to register arithmetic (10), register and 12 bit immediate arithmetic (9), conditional branch with 12 bit offset from PC (6), load (5) or store (3) with 12 bit offset from register, jump and link to address in register plus 12 bit offset (1), jump and link to PC + 20 bit offset shifted left by 1 (1), load 20 bit constant shifted left by 12 into a register and optionally add the PC (2). There are only four basic instruction formats.
- There are a lot of different open source (and commercial) implementations with different size / speed / features trade-offs, with more every month. The smallest is around 200 LUTs plus 250 flip-flops. The biggest right now are Out-of-Order cores with per-cycle throughput comparable to Arm A72 or Intel Nehalem or maybe Sandy Bridge.
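As a rough illustration of how regular the encoding is, here's a minimal sketch of a field extractor for 32 bit RISC-V instruction words. The field positions and the four custom-0..custom-3 major opcodes are as I read them in the unprivileged spec (formats R, I, S and U, with the branch and jump formats being shuffled variants of S and U); the function and dictionary names are my own, and the branch/jump immediates are left out to keep it short.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

# Major opcodes set aside for custom extensions (my reading of the spec's
# opcode map; custom-2 and custom-3 are shared with a possible RV128).
CUSTOM_OPCODES = {0x0B: "custom-0", 0x2B: "custom-1",
                  0x5B: "custom-2", 0x7B: "custom-3"}

def decode(word):
    """Pull every fixed-position field out of a 32 bit instruction word.

    Which fields are meaningful depends on the format the opcode selects;
    this sketch simply extracts all of them.
    """
    word &= 0xFFFFFFFF
    fields = {
        "opcode": word & 0x7F,           # bits 6..0
        "rd":     (word >> 7) & 0x1F,    # bits 11..7
        "funct3": (word >> 12) & 0x7,    # bits 14..12
        "rs1":    (word >> 15) & 0x1F,   # bits 19..15
        "rs2":    (word >> 20) & 0x1F,   # bits 24..20
        "funct7": (word >> 25) & 0x7F,   # bits 31..25
        # I-type: 12 bit immediate in bits 31..20
        "imm_i":  sign_extend(word >> 20, 12),
        # S-type: immediate split across bits 31..25 and 11..7
        "imm_s":  sign_extend(((word >> 25) << 5) | ((word >> 7) & 0x1F), 12),
        # U-type: 20 bit constant already sitting in bits 31..12
        "imm_u":  word & 0xFFFFF000,
    }
    fields["custom"] = CUSTOM_OPCODES.get(fields["opcode"])
    return fields

addi = decode(0xFFF50513)   # addi a0, a0, -1
print(addi["rd"], addi["rs1"], addi["imm_i"])
```

The register fields sit in the same bit positions in every format, which is a large part of why a decoder for this ISA stays so small in hardware.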