Check the SPI clock with a scope for runt pulses. I've seen 16-bit transfers built by cascading two calls to an 8-bit transfer subroutine, where one 8-bit call leaves the clock low on exit and the next call drives the clock high right away on entry, so the slave gets an extra clock pulse as a glitch between the two bytes.
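To make the failure mode concrete, here is a minimal simulation sketch, not anyone's actual driver. It assumes a bit-banged, idle-high clock with the slave counting rising edges; `sck_write`, `xfer8`, `edges_for_16bit` and the `park_low` flag are hypothetical names made up for illustration. With `park_low` set, the 8-bit routine leaves SCK low on exit, and the next call's entry code pulls it back high, which the slave counts as one more clock.

```c
#include <stdint.h>

static int sck_level;     /* current simulated SCK level */
static int rising_edges;  /* rising edges a slave would count */

/* Drive the simulated SCK pin and count low-to-high transitions. */
static void sck_write(int level)
{
    if (level && !sck_level)
        rising_edges++;
    sck_level = level;
}

/* Put the simulated bus in its idle state (clock high, no edges seen). */
static void bus_reset(void)
{
    sck_level = 1;
    rising_edges = 0;
}

/* Hypothetical 8-bit bit-banged transfer for an idle-high clock.
 * park_low != 0 models the bug: the routine leaves SCK low on exit. */
static void xfer8(uint8_t byte, int park_low)
{
    sck_write(1);                      /* entry: force the idle level */
    for (int bit = 7; bit >= 0; bit--) {
        sck_write(0);                  /* falling edge */
        (void)((byte >> bit) & 1);     /* MOSI would be driven here */
        sck_write(1);                  /* rising edge: slave samples */
    }
    if (park_low)
        sck_write(0);                  /* bug: clock parked at the wrong level */
}

/* A 16-bit transfer built from two cascaded 8-bit calls.
 * Returns the number of rising edges the slave would have seen. */
static int edges_for_16bit(uint16_t word, int park_low)
{
    bus_reset();
    xfer8((uint8_t)(word >> 8), park_low);
    xfer8((uint8_t)(word & 0xFF), park_low);
    return rising_edges;
}
```

In this model `edges_for_16bit(0xA55A, 1)` counts 17 rising edges instead of the expected 16, while `edges_for_16bit(0xA55A, 0)` counts 16. On real hardware the extra edge can be only a few instruction cycles wide, which is why it tends to show up on the scope as a runt.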