An SPI transfer that fails or becomes unreliable when short wires are added is typical of a wrongly configured clock/data timing relationship. With a wrong configuration, setup/hold times are marginal and the circuit shows exactly the strange behavior the OP reported. It will be gone with a valid digital configuration. Trying to make the wrong digital design work better with some analog mods is a waste of effort.
Regards, Dieter
Yeah. A bit more about this:
Synchronous digital logic (e.g. as used in digital ICs like microprocessors) is based on data and clock changing "simultaneously". But this involves a carefully designed clock tree. It is a clock edge which triggers the change of data. The data then propagates through gates which have some guaranteed minimum delay. Timing modelling then guarantees that the changed data value gets to its destination sufficiently long after the clock transition - this minimum acceptable time is called hold time.
This arrangement, "output data as soon as acceptable but not any sooner", leaves maximum possible time for data delays before the next clock transition - the margin that needs to be left at the end of the slot is called setup time.
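To put rough numbers on hold and setup margins, here is a minimal Python sketch of the usual slack calculation for a path that launches and captures on the same clock edge. The function name, the parameters and all the nanosecond values are made up for illustration; none of them come from the OP's circuit.

```python
def same_edge_slack(period_ns, delay_min_ns, delay_max_ns,
                    t_setup_ns, t_hold_ns, skew_ns=0.0):
    """Return (setup_slack, hold_slack) in ns for launch and capture on the same edge.

    skew_ns > 0 means the capturing clock edge arrives later than the
    launching clock edge. Negative slack means the requirement is violated.
    """
    # Slowest data must be stable t_setup before the NEXT capture edge.
    setup_slack = period_ns + skew_ns - delay_max_ns - t_setup_ns
    # Fastest data must not change until t_hold after the CURRENT capture edge.
    hold_slack = delay_min_ns - skew_ns - t_hold_ns
    return setup_slack, hold_slack


# Hypothetical 10 MHz clock, data delay between 2 ns and 8 ns.
print(same_edge_slack(100, 2, 8, 5, 1))             # matched clocks
print(same_edge_slack(100, 2, 8, 5, 1, skew_ns=3))  # capture clock 3 ns late
```

With these assumed values the hold slack goes from +1 ns to -2 ns with just 3 ns of extra clock delay, while the setup slack stays huge. That is the kind of margin a clock tree is designed to protect.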
But SPI does not do this kind of delay management and matching. If you try to run SPI like "normal" synchronous logic, hold time violations are very likely. Or it gets even worse than that: the data is sampled before it is even output, violating setup time and also being off by one bit.
SPI instead is specified to clock the data in on the opposite clock edge from the one it is output on: expressed in time, exactly halfway through the bit time slot. This completely prevents the issue of data being sampled "too early". Of course it also cuts the time margin to the next transition in half, but that is generally OK for the clock rates used in SPI.
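The same kind of back-of-the-envelope calculation for opposite-edge sampling, again as a hedged Python sketch with invented names and numbers (an ideal 50% duty cycle is assumed):

```python
def opposite_edge_slack(period_ns, delay_min_ns, delay_max_ns,
                        t_setup_ns, t_hold_ns, skew_ns=0.0):
    """Return (setup_slack, hold_slack) in ns when data is launched on one
    clock edge and captured on the opposite edge, half a period later.

    skew_ns > 0 means the capture edge arrives later than nominal.
    """
    half = period_ns / 2
    # Slowest data must be stable t_setup before the mid-slot capture edge.
    setup_slack = half + skew_ns - delay_max_ns - t_setup_ns
    # The next data change happens half a period (plus the minimum delay)
    # after the capture edge, so hold gains a half-period cushion.
    hold_slack = half + delay_min_ns - skew_ns - t_hold_ns
    return setup_slack, hold_slack


# Same hypothetical 10 MHz clock and 3 ns skew that would break
# same-edge sampling.
print(opposite_edge_slack(100, 2, 8, 5, 1, skew_ns=3))
```

With these assumed numbers both margins are around 40 to 50 ns, so a few nanoseconds of extra wire delay in either direction no longer matters. The price is that the setup margin is roughly half a period smaller than in the same-edge case.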
Now if the clock configuration is incorrect, SPI tries to operate like normal synchronous logic, inputting the data at the exact same time as it's being output. If the system were designed for this, like normal synchronous logic, it would correctly read the previous bit. But since it is not delay-matched for this purpose, it has pretty much equal chances of sampling the previous bit, the current bit, or the invalid transition area between them, which could sample in either value or result in metastability (which is an interesting failure condition in itself).
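The three outcomes can be illustrated with a small Python sketch that classifies what a receiver captures when it samples on the launching edge. This is a simplified model with hypothetical names and values: the launch edge is at t = 0, the receiver samples at t = skew_ns, and the data line may change anywhere between delay_min_ns and delay_max_ns.

```python
def sampled_on_launch_edge(delay_min_ns, delay_max_ns, skew_ns,
                           t_setup_ns, t_hold_ns):
    """Classify what is captured when sampling on the same edge that
    launches the data (i.e. with a wrong SPI clock configuration)."""
    # The receiver needs stable data throughout this window around its sample point.
    window_start = skew_ns - t_setup_ns
    window_end = skew_ns + t_hold_ns
    if window_end <= delay_min_ns:
        return "previous bit"   # line has not started changing yet
    if window_start >= delay_max_ns:
        return "current bit"    # line has already settled to the new value
    return "undefined"          # window overlaps the transition


# Assumed data delay of 2..8 ns, setup 5 ns, hold 1 ns; only the skew varies.
for skew in (0, 3, 15):
    print(skew, sampled_on_launch_edge(2, 8, skew, 5, 1))
```

In this model, changing the relative clock/data delay by a few nanoseconds (a short piece of wire, for example) is enough to move from one outcome to another.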