I don't think this is easy to find, because the memory bus design is usually up to the implementer of the device (e.g. in the ARM ecosystem), and they rarely share this information.
In addition, consider the complexity of a peripheral bus. It can have a dozen nodes connected to it, so there is no way you would want a direct fan-out of that bus on a chip. Some vendors may actually connect it as a network-on-chip structure. In low-power designs, you may find that the "clock enable" bit is actually a power control bit, meaning the peripheral bus may need to cross, in both directions, a dozen clock and power domains.
Even SRAM blocks may need to be shut down in sleep mode, which also requires bridges. These may (or may not) add a few cycles of delay.
Caches are a real PITA, because they are tuned to improve average performance. It's hard to be deterministic about caches, because their state depends heavily on what the program has executed and accessed recently. I suppose you could extend an existing (open source) sim to handshake with the memory subsystem, so you could introduce memory arbiters, flash wait states, etc.
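To give a rough feel for what that handshaking could look like, here is a minimal Python sketch of a toy memory subsystem with per-region wait states and a fixed-priority arbiter. Everything in it is an assumption made up for illustration: the `Region` and `MemorySubsystem` classes, the `grant_in_order` helper, the addresses, and the wait-state counts do not come from any real simulator or chip.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    base: int
    size: int
    wait_states: int  # extra cycles before the slave signals "ready" (hypothetical)


class MemorySubsystem:
    """Toy shared bus: only one transfer can be in flight at a time."""

    def __init__(self, regions):
        self.regions = regions
        self.busy_until = 0  # first cycle at which the bus is free again

    def access(self, addr, issue_cycle):
        """Return the cycle at which the requester would see 'ready'."""
        for region in self.regions:
            if region.base <= addr < region.base + region.size:
                break
        else:
            raise ValueError(f"unmapped address {addr:#x}")
        start = max(issue_cycle, self.busy_until)  # stall while the bus is busy
        done = start + 1 + region.wait_states      # 1 base cycle + wait states
        self.busy_until = done
        return done


def grant_in_order(bus, requests, cycle=0):
    """Fixed-priority arbiter: requests are (master, addr) pairs all raised
    in the same cycle; earlier entries in the list win the bus first."""
    return {master: bus.access(addr, cycle) for master, addr in requests}


# Invented memory map: slow flash (3 wait states) and zero-wait-state SRAM.
REGIONS = [
    Region("flash", 0x0800_0000, 0x10_0000, 3),
    Region("sram", 0x2000_0000, 0x1_0000, 0),
]

bus = MemorySubsystem(REGIONS)
result = grant_in_order(bus, [("dma", 0x0800_0000), ("cpu", 0x2000_0000)])
print(result)  # {'dma': 4, 'cpu': 5}
```

On its own, the CPU's SRAM read would finish at cycle 1; behind a DMA flash read it finishes at cycle 5. A core model would call something like `access` instead of assuming a fixed load latency, and that is exactly the kind of timing detail you can only guess at without vendor documentation.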
But honestly at this point I think you're almost better off running it on real silicon.