Products > Programming

Suggestions wanted for hashing function

(1/2) > >>

HwAoRrDk:
My scenario is as follows:

I have an arbitrary length array of bytes. There is a limited set of certain byte values within this array I'm interested in. For each instance of a value in that set, I know it's offset in the array. So, therefore, each value of interest can be uniquely identified by the pair of <offset, value>.

I want to hash these multi-byte 'IDs' down to a single-byte index value in a limited range - for the sake of example, say, 0-32. Is there some not-too-complicated hashing algorithm that can be applied here? I want something fast and suitable for running on an 8-bit microcontroller.

I've looked at Pearson hashing, as that gives a single-byte result (i.e. 0-255), but I'm not sure it can be modified to produce values in an even smaller range. I've read the original paper, and it does mention smaller output ranges, but that's only in the context of string inputs that have a limited character set (e.g. alphanumeric characters only). I did think perhaps I could just use the standard algorithm and then divide the result by 8 (or >> 3), but I don't know whether that will just render it pointless by introducing collisions.

Any suggestions?

Mechatrommer:
each byte 0-32? there are 2 bytes of that type <offset, value>? thats 1024 combination. hashing it to 1 byte value is collision guaranteed.

HwAoRrDk:
Sorry, I don't think I explained very well. Let me give an example.

Given a set of IDs (each comprising pair of 8 bit value and 16 bit offset, concatenated):
0x840007
0x9b0068
0xa10074
0x9b0135
...etc.

I want to hash each into a value between 0 and 31. E.g. the first might be equal to 12, the second 29, third 3, etc.

I want to do this because the hashed value will form an index into an array of memory that cannot be made big enough to encompass using the raw ID as an index (which would need 16 million entries!).

bd139:
Just use modulo division by the number of slots if it requires no cryptographic strength and colissions are not a problem. Don't make it more complicated than you need it to be until you need it to be.

Mechatrommer:

--- Quote from: HwAoRrDk on April 03, 2020, 03:39:45 pm ---Sorry, I don't think I explained very well. Let me give an example.

Given a set of IDs (each comprising pair of 8 bit value and 16 bit offset, concatenated):
0x840007
0x9b0068
0xa10074
0x9b0135
...etc.

I want to hash each into a value between 0 and 31. E.g. the first might be equal to 12, the second 29, third 3, etc.

I want to do this because the hashed value will form an index into an array of memory that cannot be made big enough to encompass using the raw ID as an index (which would need 16 million entries!).

--- End quote ---
that is even worse than i thought. 16 millions entries into 32 entries. you dont mind collision do you? if you dont as bd139 said, modulo is one way, another way is get X bits value from ID 8 bits value that you think unique and Y bits from 16 bits offset. combine them X + Y = 5 bits (32 entries). or XOR them all the 3 bytes and discard either 3 bits of LSB or MSB, or the mainstream basic (as in literature similar to checksum)... shift bits and XOR 3 times and take 5 bits. ymmv.

Navigation

[0] Message Index

[#] Next page

There was an error while thanking
Thanking...
Go to full version
Powered by SMFPacks Advanced Attachments Uploader Mod