I think PIC16 does not have hardware multiply?
I guess not. I couldn't find any information about hardware multiplication/presence of math accel peripherial in the PIC16F877A datasheet.
A sequence of left shifts and additions is equivalent to multiplication. For example,
a*10 is exactly equivalent to
(a<<1)+(a<<3), when
a is any unsigned integer type with overflow ignored. The result is slightly different, if instead of an addition, exclusive or (
^) is used.
(When using both addition and subtraction and a constant
N-bit multiplier, at most
N/2 operations are needed. For example,
a*15 is exactly equivalent to
(a<<4)-a when
a is any integer type with modular/wraparound overflow. Again, results are slightly different when exclusive or is used instead of subtraction.)
The behaviour when treated as little-endian unsigned 16-bit words, each bit only changing that bit position and more significant positions, indicates that any combination of shifts stay on the more significant side, i.e. there are at least as many left shifts as there are right shifts.
The behaviour when only the highest bits of the word are set (resulting in all zeros in specific result words) indicates that there likely is a right shift by one or two bits (depending on which of the four result words is involved) at the combine stage.
In any case, the problem at hand is not to replicate the sequence of instructions of how the PIC calculates the cryptographic hash, but to discover the mathematical representation of that calculation. Because of the many equivalences (like multiplication being decomposable to shifts and additions, or shifts and additions and subtractions, and vice versa), and because this algorithm does not look like a proper cryptographic hash (since its output evinces clear patterns; in a proper cryptographic hash, almost half the bits should change whenever a single bit in the input changes; so called 'avalanche effect') but written by someone not versed in cryptography at all, it is basically guaranteed the mathematical representation is much simpler than the code itself.
As a comparison, consider my favourite random number generator, Xorshift 64*, by George Marsaglia. It uses a single 64-bit nonzero number
x as its state. The state is updated and a new pseudorandom 64-bit number
y generated using simply
x = x ^ (x >> 12); x = x ^ (x << 25); x = x ^ (x >> 27); y = x * 2685821657736338717;Even though it is this simple (and horribly fast, compared to other generators), the sequence of
y is fairly random, surpassing many currently used pseudorandom number generators touted as "best". Indeed, if we use only the 32 most significant bits
z of those
y,
z = y >> 32;the sequence of outputs passes the most stringent statistical randomness tests currently widely used: the BigCrush tests of the TestU01 framework.
In fact, up to 40 most significant bits can be used. The full 64-bit output only fails the MatrixRank test of BigCrush.
In comparison, Mersenne Twister (MT19937) fails
two tests in the BigCrush test set. A modified TinyMT does pass BigCrush test set also, but has seen much less investigation than the Xorshift family of functions.
However, most of the choices of the shift sizes and the final multiplier in such linear-feedback shift register pseudorandom number generators, do not yield good results. In practice, they need to be tested for randomness and patterns. This is why pseudorandom number generator investigators and specialists say one should not just try to create their own variant: most choices are bad, and you're unlikely to hit a good combination without proper verification.
The step from randomness to cryptographic hashes is huge, and involves irreversibility: making it impossible to derive
any source value (
plaintext) from the result value (
ciphertext). When two source values yield the same result, it is called a
collision. Various attack models, like appending a short stub to a source value to force a collision, exist. This is why cryptographic hashes are often designed from ground up using different models, and exhibit features like a change in any single bit in the source value changes half the bits on average in the result value.
Fortunately, none of that is evident here: this is not a proper cryptograhic hash, just a hash that shows clear patterns.
If it was a proper cryptographic hash,
I wouldn't even bother trying to reverse it. Even when the algorithm itself is known, but not the initialization vector or secret that is mixed with the plaintext (depending on the algorithm type), reverse-engineering can be impossible within a human lifetime.
(I am not a cryptographer myself, and only know enough to write secure software using proper cryptographic principles and techniques without failing horribly.)