Author Topic: Comparing HEX Files  (Read 1053 times)


Offline ko4nrbs (Topic starter)

  • Regular Contributor
  • *
  • Posts: 69
  • Country: us
Comparing HEX Files
« on: April 27, 2025, 03:52:16 pm »
Is there any way to use MPLAB IDE to compare two HEX files?

I have some older HEX files that I believe are the same as some others I have but wanted to double check them by comparing for differences if any.

Bill
 

Online RoGeorge

  • Super Contributor
  • ***
  • Posts: 7326
  • Country: ro
Re: Comparing HEX Files
« Reply #1 on: April 27, 2025, 04:00:06 pm »
Use 'fc /b <file1> <file2>' from a command prompt in Windows, or 'cmp <file1> <file2>' from a terminal in Linux.

Offline tunk

  • Super Contributor
  • ***
  • Posts: 1221
  • Country: no
Re: Comparing HEX Files
« Reply #2 on: April 27, 2025, 04:06:19 pm »
Or you could calculate hash values, e.g.:
https://en.wikipedia.org/wiki/SHA-2
 

Offline Doctorandus_P

  • Super Contributor
  • ***
  • Posts: 4102
  • Country: nl
Re: Comparing HEX Files
« Reply #3 on: April 27, 2025, 07:04:01 pm »
What do you want to compare precisely?

If it's output from the same toolchain, then even just the file size gives a half-reasonable result.
If you want to compare the content, even if one file is in Intel Hex format and the other in Motorola S-record, then it's a whole other story.

Files in either the Intel or Motorola format can also be different while the binary content they encode is the same.
 

Offline bson

  • Supporter
  • ****
  • Posts: 2575
  • Country: us
Re: Comparing HEX Files
« Reply #4 on: April 27, 2025, 08:32:55 pm »
You can also convert them to normalized hex values (like all lowercase, without any '0x' prefix or trailing 'h' suffix, or such), one per line, and diff them.  This will tell you if they're mostly the same with just some bytes inserted or deleted, or slightly rearranged.

Some parts of hex files aren't data: TekHex, for example, can also include segments and symbol names, and many formats have a start address.
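The "one normalized value per line" idea above could be sketched roughly like this for Intel hex. This is only an illustration, not anyone's actual tool: the function name ihex_normalize and its buffer-based interface are made up for the example, and it deliberately ignores addresses and extended-address records, so it only shows the data bytes in file order.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one ASCII hex digit (either case), or -1 if c is not one. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Two hex digits at p as one byte, or -1. Stops at the first bad
 * character, so it never reads past a terminating NUL. */
static int hex_byte(const char *p)
{
    int hi = hex_digit((unsigned char)p[0]);
    if (hi < 0) return -1;
    int lo = hex_digit((unsigned char)p[1]);
    return lo < 0 ? -1 : hi * 16 + lo;
}

/* Hypothetical helper: write each data byte of every type-00 record to
 * out as two lowercase hex digits plus '\n'. Returns the number of data
 * bytes, or -1 on a malformed record, bad checksum, or full buffer. */
long ihex_normalize(const char *text, char *out, size_t cap)
{
    static const char lower_hex[] = "0123456789abcdef";
    assert(text != NULL && out != NULL && cap > 0);
    size_t used = 0;
    long total = 0;
    const char *p = text;
    while (*p != '\0') {
        if (*p != ':') { p++; continue; }     /* not a record mark */
        p++;
        int len = hex_byte(p);                /* peek at the length */
        if (len < 0) return -1;
        unsigned sum = 0;
        int type = -1;
        /* length, 2 address bytes, type, len data bytes, checksum */
        for (int i = 0; i < len + 5; i++) {
            int b = hex_byte(p);
            if (b < 0) return -1;
            if (i == 3) type = b;
            if (type == 0 && i >= 4 && i < len + 4) {
                if (used + 4 > cap) return -1; /* 3 chars + final NUL */
                out[used++] = lower_hex[b >> 4];
                out[used++] = lower_hex[b & 15];
                out[used++] = '\n';
                total++;
            }
            sum += (unsigned)b;
            p += 2;
        }
        if ((sum & 0xFFu) != 0) return -1;    /* checksum must cancel */
    }
    out[used] = '\0';
    return total;
}
```

Run both files through something like this, write the results out, and an ordinary diff then ignores case and record-length differences.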
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 892
Re: Comparing HEX Files
« Reply #5 on: April 27, 2025, 08:53:58 pm »
WinMerge from winmerge.org will do visual comparisons, assuming you are using Windows of course.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 5141
  • Country: nz
Re: Comparing HEX Files
« Reply #6 on: April 27, 2025, 11:41:02 pm »
Quote
Files in either the Intel or Motorola format can also be different while the binary content they encode is the same.

Well, yes, trivially true in many ways, but the output from a given tool should be consistent in practice.

In Intel hex format:

- each a-fA-F can be arbitrarily upper- or lower-case

- each record within a 64k segment contains a 16 bit start address and 8 bit number of bytes in the record. One file might have 16 bytes per record, another might have 1, another might change every time.

- start addresses don't have to be contiguous or even sorted

- addresses less than 1 MB have the same problem as 8086: there are many ways to do 16*segment + offset that give the same memory address. Addresses from 1 MB to 4 GB are unique.
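To make the last point concrete, here is a small sketch of the two address calculations. The function names are invented for the example; the arithmetic is the usual 16*segment + offset for extended segment address records (type 02) and upper-16-bits-plus-offset for extended linear address records (type 04).

```c
#include <assert.h>
#include <stdint.h>

/* Address formed from a type-02 (extended segment address) record. */
uint32_t ihex_segment_address(uint16_t segment, uint16_t offset)
{
    return (uint32_t)segment * 16u + offset;
}

/* Address formed from a type-04 (extended linear address) record. */
uint32_t ihex_linear_address(uint16_t upper, uint16_t offset)
{
    return ((uint32_t)upper << 16) | offset;
}

/* Three different encodings that all land on address 0x10000. */
void check_aliasing(void)
{
    assert(ihex_segment_address(0x1000, 0x0000) == 0x10000u);
    assert(ihex_segment_address(0x0FFF, 0x0010) == 0x10000u);
    assert(ihex_linear_address(0x0001, 0x0000) == 0x10000u);
}
```

So two files can disagree on every address field in this range and still describe the same memory.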

I have a 50 line Intel hex decoder in

https://github.com/brucehoult/trv/blob/main/trv.c

It will accept any spec-conforming ihex file, but is also more permissive in that it ignores all whitespace. Records don't have to end with a newline, but conversely it will accept any number of spaces, tabs, LF, CR between any two printing characters (hex digits and ':'). It does check checksums and makes sure colons are in the right places and only the right places.

I've never had a need to read Motorola hex.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 892
Re: Comparing HEX Files
« Reply #7 on: April 28, 2025, 07:19:59 am »
A rant on Intel Hex...
Quote
It will accept any spec conforming ihex file, but is also more permissive
For any Intel hex format parser it is quite simple- find a RECORD MARK (':') then process a record as defined by the specification. Find the next RECORD MARK, repeat. Any bytes in the file outside of a record can be ignored, as they do not matter in any way, and anything after an end of file record can also be ignored.

The Intel 'Hexadecimal Object File Format Specification' never mentions anything other than records, which have a RECORD MARK to mark the start of a record, and will have a record length so the record end can be known. There was never a need to mention line endings or anything else outside a record, as the record specification is all that is needed. Quite simple and quite complete, without specifying anything beyond what was needed. If it had been created in current times, I imagine it would end up with a 100 page specification document instead of the ~11 Intel used.

Today, if you look up the Intel hex format many web sites will mention line endings which have nothing to do with the specification. You will also find many parsers that will cough up on something like not having a record mark as the first char in the file, or not having a record mark as the first char after a line ending, and so on. In most cases the hex generator and the hex consumer come from the same toolchain so as long as they are happy with each other then all is fine.

My guess is that the specification morphed over time (for some) because of the generator side: they see the generator-produced hex as the format specification, not as something conforming to a specification. It's good for a generator to add line endings to a hex file, but that is something allowed by the specification, not a requirement. It also shows that on the generator side the code writer has taken the liberty of adding line endings which are not in the specification (no problem), while the code writer for the parser typically allows no liberty in processing hex files and adds requirements which are not part of the specification. In many cases the code writer for the parser makes their own job more difficult than it needs to be by not following the specification.

I doubt Intel had the intention of allowing comments to be inside a hex file, but by not creating more specification than was needed comments are possible as long as a record mark is not used (and the parser sticks to the specification).
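A rough sketch of the "find a record mark, process the record, repeat" approach described above. It is an illustration under my own assumptions, not code from any existing parser: ihex_count_records is a made-up name, the input is a NUL-terminated string rather than a file, and it only validates and counts records instead of storing the data.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one ASCII hex digit (either case), or -1 if c is not one. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Two hex digits at p as one byte, or -1. Stops at the first bad
 * character, so it never reads past a terminating NUL. */
static int hex_byte(const char *p)
{
    int hi = hex_digit((unsigned char)p[0]);
    if (hi < 0) return -1;
    int lo = hex_digit((unsigned char)p[1]);
    return lo < 0 ? -1 : hi * 16 + lo;
}

/* Count records up to and including the end-of-file record (type 01).
 * Anything outside a record is skipped; anything after the end-of-file
 * record is never looked at. Returns -1 on a malformed record, a bad
 * checksum, or a missing end-of-file record. */
int ihex_count_records(const char *text)
{
    assert(text != NULL);
    int count = 0;
    const char *p = text;
    while (*p != '\0') {
        if (*p != ':') { p++; continue; }   /* outside a record: ignore */
        p++;
        int len = hex_byte(p);              /* peek at the length field */
        if (len < 0) return -1;
        unsigned sum = 0;
        int type = -1;
        /* length, 2 address bytes, type, len data bytes, checksum */
        for (int i = 0; i < len + 5; i++) {
            int b = hex_byte(p);
            if (b < 0) return -1;           /* record must be contiguous */
            if (i == 3) type = b;
            sum += (unsigned)b;
            p += 2;
        }
        if ((sum & 0xFFu) != 0) return -1;  /* all bytes must sum to 0 */
        count++;
        if (type == 1) return count;        /* end-of-file record */
    }
    return -1;
}
```

With this, a comment line before a record is harmless as long as it contains no ':' of its own, and line endings never need to be mentioned at all.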
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 5141
  • Country: nz
Re: Comparing HEX Files
« Reply #8 on: April 28, 2025, 09:20:29 am »
Quote
A rant on Intel Hex...
Quote
It will accept any spec conforming ihex file, but is also more permissive
For any Intel hex format parser it is quite simple- find a RECORD MARK (':') then process a record as defined by the specification. Find the next RECORD MARK, repeat. Any bytes in the file outside of a record can be ignored, as they do not matter in any way, and anything after an end of file record can also be ignored.

I've never heard of or seen any Intel Hex file with anything other than the records (and line endings) in it, but if you want to allow that then no problems. Just change line 60 from:

Code:
if (nextHexFileChar() != ':') die("Expected : in hex file");

to

Code:
while (nextHexFileChar() != ':'){};
 

Offline __greg__

  • Newbie
  • Posts: 6
  • Country: pl
Re: Comparing HEX Files
« Reply #9 on: April 28, 2025, 09:38:35 am »
You can use the srec_cmp tool from the SRecord package.

It does not care about the text file format.
Data lines may be out of order, too.
And they may have different lengths, too.

It compares the whole memory content described (in any way possible) in hex file.
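The idea of comparing memory content rather than text can be sketched as below. To be clear, this is not how srec_cmp is implemented; it is a toy with made-up names (struct image, image_put, image_equal), limited to a 64 KiB address space, and it assumes each image starts out zero-initialized and is filled by some decoder calling image_put once per data record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_SIZE 65536u

/* Toy memory image: the bytes plus a map of which addresses were set.
 * Must start zero-initialized (e.g. a static object). */
struct image {
    unsigned char data[IMAGE_SIZE];
    unsigned char used[IMAGE_SIZE];
};

/* Store n bytes at addr, as a decoder would for one data record. */
void image_put(struct image *img, uint16_t addr,
               const unsigned char *bytes, size_t n)
{
    assert(img != NULL && bytes != NULL);
    assert((size_t)addr + n <= IMAGE_SIZE);
    memcpy(img->data + addr, bytes, n);
    memset(img->used + addr, 1, n);
}

/* Equal when the same addresses are populated with the same values,
 * regardless of record order or record length in the source files. */
int image_equal(const struct image *a, const struct image *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->used, b->used, IMAGE_SIZE) == 0
        && memcmp(a->data, b->data, IMAGE_SIZE) == 0;
}
```

Two files that split the same bytes into different records, or list them in a different order, end up with equal images.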
 
The following users thanked this post: ledtester

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 892
Re: Comparing HEX Files
« Reply #10 on: April 28, 2025, 02:27:11 pm »
Quote
but if you want to allow that then no problems
The rant wasn't about your code, just about how some specification that was so simple eventually took on added requirements which do not exist. The 'but is also more permissive' phrase must have triggered me, as the specification already allows for any char(s) outside a record with a record mark char being the exception. I think in the distant past I wanted to add comments to some hex files for some reason, but of course the programmer did not like that. Certainly not a big problem and not very important but I do notice the deviation every time I run into a web site explaining the Intel hex format or see source code that parses hex files.

Quote
I've never heard of or seen any Intel Hex file with anything other than the records (and line endings) in it
You must have felt some need to deal with hex files that do not conform to the norm by consuming white space as you do, even though that will never be seen as you say. Whether Intel intended it (likely) or not, by using a record mark and record length they managed to eliminate dealing with various combinations of line endings or anything else outside of a record.

Not that it matters, but your proposed code change only allows white space outside a record (anything other than a record mark should also be allowable), and also allows white space inside a record which I would argue is not allowed. Making minimal changes and leaving function names as-is, a couple lines would probably take care of it and conform to the spec-
37    while ( ch = fgetc(hex_file), (ch != ':') && (ch != EOF) ){}
44    int ch = fgetc(hex_file);


edit-
binutils in its ihex_scan function also will not like seeing anything outside a record other than \r or \n, so it's a common thing to misinterpret/reinterpret the Intel hex format.

It's not a common need, but comments in a hex file could be useful in some cases and should have been allowed by toolchains early on. It's a little late now, as too many ships have already sailed with their new interpretations. A simple example would be an AVR: they have a mechanism to deal with fuses/EEPROM and other non-flash memory spaces in a hex file, and it would have been nice if comments could be generated to mark these records. Once in a while you need to view the hex file to see what is actually generated for fuses, or to see that some address-specified section got placed where you intended, and if comments between records were allowed (which the Intel hex format certainly allows) then it could have been helpful in some cases and would have no effect on the records.

These parsers DO typically allow comments after the end of file record, where they usually quit scanning the file, but some comments would only be useful if they preceded a record.
« Last Edit: April 28, 2025, 11:41:26 pm by cv007 »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4410
  • Country: us
Re: Comparing HEX Files
« Reply #11 on: April 28, 2025, 11:59:24 pm »
The xxx-objcopy utility will convert hex files to bin files, which are more likely to be comparable, at least if you're looking at a strict "equal/not-equal" comparison. Convert the bin files back to .hex files (with the same formatting details) if you want to search for particular differences. (Hmm. I suspect objcopy will read in a conforming hex file and write out its "standard format" hex file, all in one command...)
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 5141
  • Country: nz
Re: Comparing HEX Files
« Reply #12 on: April 29, 2025, 12:39:51 am »
Quote
I've never heard of or seen any Intel Hex file with anything other than the records (and line endings) in it
You must have felt some need to deal with hex files that do not conform to the normal by consuming white space as you do, even though that will never be seen as you say.

I was concerned primarily with three things:

1) dealing with any combination of CRs and LFs, including transmission methods that caused double-spacing of lines.

2) the format allows 255 data bytes in a record, plus the address, length, and checksum. Once hex encoded that's over 500 characters. Some transmission methods may add newlines to wrap such long lines (or to break up an original that had no line breaks at all). This can insert CR and/or LF between almost any pair of characters, possibly even splitting up the two characters making up an 8 or 16 bit hex value. Once you add support for this, also allowing spaces and tabs and other whitespace between any two graphical characters is easy and doesn't seem harmful.

3) some (usually smaller) files can come from copy&paste from web pages or other documents that use indentation with spaces or tabs to set off "code".
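The three cases above all come down to one small primitive: a "next character" routine that silently drops whitespace wherever it appears. A minimal sketch, reading from a string cursor instead of a file; the name next_graphic_char is made up here and this is not the actual code in trv.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return the next non-whitespace character and advance *cursor past it.
 * Returns '\0' (without advancing) once the end of the string is hit.
 * CR, LF, space, tab etc. are skipped wherever they occur, including in
 * the middle of a two-digit hex value. */
int next_graphic_char(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    while (isspace((unsigned char)**cursor))
        (*cursor)++;
    int c = (unsigned char)**cursor;
    if (c != '\0')
        (*cursor)++;
    return c;
}
```

A record parser built on top of this sees the same stream of ':' and hex digits whether the text was double-spaced, re-wrapped, or indented.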

Quote
Not that it matters, but your proposed code change only allows white space outside a record (anything other than a record mark should also be allowable), and also allows white space inside a record which I would argue is not allowed.

I guess you meant "not only".

See line-wrapping in reason #2 above.

The whole point of a hex file format is that it is a TEXT format, not a binary one, so it should be resilient to the kinds of things that happen if text is distributed as text, e.g. in the body of an email, not as an attachment.

Ignoring anything outside a record starting with a ":" is perhaps a useful addition to what I already did, though it won't help with the common case of SMTP or HTTP headers. Fortunately that's much easier to manually edit out than repairing unwanted line wrapping which can only really be reliably done by parsing records to find their length. i.e. by something that understands ihex format.
 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 892
Re: Comparing HEX Files
« Reply #13 on: April 29, 2025, 01:32:57 am »
Quote
Ignoring anything outside a record starting with a ":" is perhaps a useful addition
Leaving the whitespace check intact, or changing it to only accept chars which are ascii hex (ignoring anything else, not only whitespace, with the exception of the record mark), would get you something where the records could be split in any way other than with ascii hex chars (or a record mark). The format only allows for ascii hex chars, plus the record mark, so ignoring everything else, anywhere in the file, is also an option that does not really break the specification. That a record is contiguous is only implied in the specification, and a record field is specified to be only ascii hex chars, so after eliminating any non ascii hex chars between fields you still end up with an intact record. No one is going to intentionally produce this kind of hex file, but if one shows up this way for some reason you can still parse it successfully.

As a parser of hex files, you have more freedom being on the receiving end, and can be more selective in what gets processed. The generator of hex files can have an unknown parser on the other end, which typically has their own ideas about what is valid (hence my original rant).

I have written hex parsers for my own use a number of times (using different languages, not because I can't get it right the first time), and have used the 'ignore anything but records' method without a problem. My next one, if ever needed, should also extend this to ignore any char that is not a record mark or an ascii hex char.

edit- on second thought, I wouldn't want to use any hex file that could not put records together contiguously, whatever may have caused the breakup. Put comments or whatever else outside a record- fine, already ignoring that, make me reassemble a broken record- not my problem, that is a generator/sender/transport problem better solved on that end.
« Last Edit: April 29, 2025, 09:57:39 pm by cv007 »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 5141
  • Country: nz
Re: Comparing HEX Files
« Reply #14 on: April 29, 2025, 03:58:09 am »
Quote
ignore any char that is not a record mark or an ascii hex char.

That will trip up on random colons in the file, for example the SMTP and HTTP headers I mentioned, but not only.

Maybe ignore anything that can't manage to keep the record size, start address, and record type contiguous i.e. 8 hex digits after the colon. Or, at most, allow whitespace in that part but not other non hex characters.
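The stricter first option (a colon only counts as a record mark if the 8 hex digits of size, address and type follow it directly) could look something like this. The function name is made up and this is just a sketch of the check, not a full parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if p points at ':' immediately followed by 8 hex digits
 * (record size, 16-bit address, record type), else 0. Stops at the
 * first non-hex character, so it never reads past a terminating NUL. */
int looks_like_record_start(const char *p)
{
    assert(p != NULL);
    if (*p != ':')
        return 0;
    for (int i = 1; i <= 8; i++) {
        if (!isxdigit((unsigned char)p[i]))
            return 0;
    }
    return 1;
}
```

A header colon such as the one in "Content-Type: text/plain" fails the check because a space follows it, while a genuine record start passes.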

 

Offline cv007

  • Frequent Contributor
  • **
  • Posts: 892
Re: Comparing HEX Files
« Reply #15 on: April 29, 2025, 06:53:07 am »
I guess I'm a little unclear how these hex files would end up in some lower layer of the internet.
 

