It's also much easier to test each component individually in the HDL of your choice than to try writing target programs to do the same thing. When you're satisfied that the separate modules of your CPU work, then you might want to run some compiled code. Just a suggestion.
Yes, good advice. The truth is that the guy who built this project has hardly tested it. He said "yes", but that is not true: he has been working on this SoC for 3 years, so he thought that so much time was enough to be sure of his work. The truth is that he has never tested it deeply, and he has never put a real, complex application on it. He has tested just a few parts in an extremely simplified context, and this is why he asked me to try his SoC with something from the real world. So I put a few real-world routines on it, and bang, now we all know that the pipeline is buggy.
As I wrote, I am not good at HDL; I am just trying to help. To do that, I am planning to split the testing activity into 2+1 sub-activities:
1) asking an HDL friend of mine to do the hard work of putting everything into ModelSim and writing a lot of test benches (I have to learn about it; I am not good at it, so I can't help directly)
2) asking another friend to use his ChipScope (Xilinx technology) to understand exactly what is wrong, using Spartan-3E and Spartan-6 boards
3) supporting them with test applications, such as assembly code with positive/negative test cases, etc.
I hope that combining them will demystify things.
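To make point 3 a bit more concrete, here is a minimal sketch of how I might generate shared test vectors, so that the ModelSim test benches and the on-chip assembly tests check against the same expected values. Everything in it is an assumption for illustration: the 32-bit register width, the choice of an ADD operation, and the names `add32`, `make_vectors` and `check` are hypothetical and not taken from the SoC.

```python
MASK32 = 0xFFFFFFFF  # assumption: the SoC has 32-bit registers


def add32(a, b):
    """Reference model of a 32-bit add that wraps around on overflow."""
    return (a + b) & MASK32


def make_vectors():
    """Build (a, b, expected, should_pass) tuples from a few edge values."""
    edge_values = [0, 1, 0x7FFFFFFF, 0x80000000, MASK32]
    vectors = []
    for a in edge_values:
        for b in edge_values:
            good = add32(a, b)
            # Positive case: the expected value is correct.
            vectors.append((a, b, good, True))
            # Negative case: the expected value is wrong on purpose,
            # to prove that the checker really detects a mismatch.
            vectors.append((a, b, good ^ 1, False))
    return vectors


def check(observed, expected):
    """Compare a value observed on the target with the expected one."""
    return observed == expected
```

The negative cases are there only to test the test: if a checker reports "pass" on a vector that is wrong on purpose, the checker itself is broken.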
As the "firmware guy", as far as I have observed, I can say:
1) adding NOP after branches avoids unexpected behavior
2) adding NOP after load/store avoids unexpected behavior
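To apply these two workarounds systematically instead of by hand, a small script could pad the assembly listing before it goes to the assembler. This is only a sketch under assumptions: the mnemonics in `BRANCHES` and `MEMORY_OPS` are hypothetical placeholders (I am not quoting the real ISA here), `#` is assumed to be the comment character, and `pad_with_nops` is a name I made up.

```python
# Hypothetical mnemonics: replace them with the ones of the real ISA.
BRANCHES = {"beq", "bne", "jmp", "call", "ret"}
MEMORY_OPS = {"ld", "st"}


def pad_with_nops(lines, pad_branches=True, pad_memory=True, nop="nop"):
    """Return a copy of the listing with a NOP after selected instructions."""
    padded = []
    for line in lines:
        padded.append(line)
        # Drop a trailing comment (assumption: '#' starts a comment).
        code = line.split("#", 1)[0].strip()
        tokens = code.split()
        # Skip a leading label such as "loop:".
        if tokens and tokens[0].endswith(":"):
            tokens = tokens[1:]
        # Ignore empty lines, label-only lines and assembler directives.
        if not tokens or tokens[0].startswith("."):
            continue
        mnemonic = tokens[0].lower()
        if (pad_branches and mnemonic in BRANCHES) or (
            pad_memory and mnemonic in MEMORY_OPS
        ):
            padded.append("\t" + nop)
    return padded


if __name__ == "__main__":
    listing = [
        "loop:",
        "\tld r1, 0(r2)",
        "\tadd r1, r1, r3",
        "\tbne r1, r0, loop",
    ]
    print("\n".join(pad_with_nops(listing)))
```

The two switches are there on purpose: building once with only `pad_branches` and once with only `pad_memory` could help to find out which class of instruction really needs the NOP, which is useful information for the people looking at the pipeline.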
-O0 is the winning option: it has always succeeded with every kind of complex application I have put on this SoC. I have never seen any unexpected behavior with -O0, I think because it works exactly this way: stuffing in a lot of NOPs in order to avoid hazards.
But the real cause may also be related to register collisions to/from the mux, or to other things that I think ChipScope may clarify.
I don't know this technology very well, but if I have understood correctly, ChipScope is a logic analyzer from Xilinx. Unfortunately it is not included in the ISE WebPACK that I have (ISE v10.1 on Linux and ISE v14 on Windows); my friend has a full license, so he can use it.