If you want to port it to a non-ARM platform, you should give some consideration to the following:
- clz: the ARM code uses clz (count leading zeros) extensively, and you will have to find an implementation of it for your target platform. Luckily, some compilers have a built-in implementation of clz. A generic clz will vary greatly in efficiency.
- addressing or paging: the code has tons of tables and compiles to over 100 KB on most platforms. On platforms that use paging, you have to handle that manually. This means extensive but simple changes to the code.
- numeric constants: ARM has a 32-bit int vs. 16-bit on most 8-bitters, so all the constants will need to be written with a UL suffix. Simple but extensive, as there are thousands of them.
- flash space: all in, you are looking at over 100 KB of flash space for the DSP lib. If your compiler does not optimize away unused code, you have to use a large MCU.
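To make the clz point concrete, here is a minimal sketch of a portable replacement, assuming a C99 compiler with `<stdint.h>`. The names `clz32` and `clz32_generic` are made up for illustration and are not part of the ARM library. It uses the GCC/Clang builtin `__builtin_clzl` where available and falls back to a shift-and-test loop-free version elsewhere. It returns 32 for an input of 0, which is what the ARM CLZ instruction does.

```c
#include <limits.h>
#include <stdint.h>

/* Generic fallback (hypothetical name): binary search over the bits,
   five tests and no loop. Returns 32 when x is 0. */
static inline uint8_t clz32_generic(uint32_t x)
{
    uint8_t n = 0;

    if (x == 0) {
        return 32;
    }
    if ((x & 0xFFFF0000UL) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000UL) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000UL) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000UL) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000UL) == 0) { n += 1; }
    return n;
}

/* Wrapper (hypothetical name): use the compiler builtin when there is one. */
static inline uint8_t clz32(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    /* __builtin_clzl is undefined for 0, so handle that case here.
       unsigned long may be wider than 32 bits, so subtract the extra bits. */
    if (x == 0) {
        return 32;
    }
    return (uint8_t)(__builtin_clzl((unsigned long)x)
                     - (int)(sizeof(unsigned long) * CHAR_BIT - 32));
#else
    return clz32_generic(x);
#endif
}
```

On a small 8-bit core without a barrel shifter, even the fallback costs a fair number of cycles per call, so it is worth checking whether your compiler offers an intrinsic first.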
I think in the end, you will find that porting it to a 32-bit or larger chip, ARM or otherwise, is a lot easier and more practical.
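As a rough illustration of the numeric constants issue, here is a small sketch, again assuming C99 and `<stdint.h>`. The macro and function names are hypothetical and do not come from the ARM sources. In standard C, a literal too big for int is already given a wider type, so the places that tend to bite are expressions whose operands all fit in a 16-bit int, such as a shift or a multiply. The `INT32_C` / `UINT32_C` macros do the same job as a hand-written L or UL suffix.

```c
#include <stdint.h>

/* On a 16-bit int target, (1 << 20) is evaluated as int and overflows.
   Forcing the operand to 32 bits keeps the result correct everywhere. */
#define TABLE_SCALE   (UINT32_C(1) << 20)      /* hypothetical constant */
#define Q31_ONE_HALF  INT32_C(0x40000000)      /* hypothetical constant */

/* Same problem with multiplies: widen one operand before multiplying,
   otherwise i * 4096 is computed in 16-bit int on small targets. */
static inline uint32_t scale_index(uint16_t i)
{
    return (uint32_t)i * UINT32_C(4096);
}
```

With thousands of constants in the tables, a scripted search-and-replace is probably the only sane way to apply this kind of change.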