Author Topic: [solved] how do you compare two folders? to extract source diff  (Read 1256 times)

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Usually when I work with Linux kernels, I download the source, and immediately create a git branch labelled "original code", so I can commit my changes on it, and this way it's easy to extract patches.

Now I am working on two large kernels; my colleagues didn't use Git, and the differences between kernel-A and kernel-B are unclear.

Kernel-A and kernel-B are derived from the same source, but they are two different branches, both buggy, and each with a different set of things that work and things that don't.

Within each main folder, the sources are distributed in several sub-folders, with a deep hierarchy. I see that some sub-folders have been renamed, deleted or added, as well as some of the contained files.

This is bad without Git, because it's not immediately clear how a file was renamed or where it was moved.

How would you compare two folders, to extract diff, in this case?  :D
« Last Edit: July 13, 2021, 07:02:27 pm by DiTBho »
 

Offline esepecesito

  • Regular Contributor
  • *
  • Posts: 62
  • Country: de
Re: how do you compare two folders? to extract source diff
« Reply #1 on: July 11, 2021, 04:09:28 pm »
If they are not completely different, run "ls -lR > /tmp/old" in one tree, do the same for the other version into another file, then diff the two listings. Crude, but may work.
Another possibility with for:
for file in d1/*.cpp; do
    diff "$file" "d2/${file##*/}"
done

You can also do it with find. I cannot do it off the top of my head, but something like this, run from the directory that contains both d1 and d2:
(cd d1 && find . -type f -exec sh -c 'diff "$1" "../d2/$1"' sh {} \;)
I have not tested it, but something along those lines should do.
 
The following users thanked this post: DiTBho

Offline evb149

  • Super Contributor
  • ***
  • Posts: 1923
  • Country: us
Re: how do you compare two folders? to extract source diff
« Reply #2 on: July 11, 2021, 05:07:56 pm »
Well under linux of course there are several 'diff' related tools.
And a UI tool called 'meld'.

But since you know these kernels came from git, maybe you can take the 'A' project, find out what git branch it was based on ('AO'), check out that branch, and populate the 'A' files on top of 'AO'. Then you have some git-able differences from 'AO' to 'A'.

Do the same for distribution 'B' and 'BO'.

Then you can use git's history tools to relate 'AO' and 'BO' to each other (assuming you have the public git history from kernel.org that connects the parent branches), as well as the deltas 'AO' -> 'A' and 'BO' -> 'B', so you can relate the 'local feature branches' too.

Then the problem can become the same as trying to relate 2.6.33 to 2.6.28, or whatever history relationship exists for the ancestry of the kernels in question, with your usual IDEs and git-aware tools.

 
The following users thanked this post: DiTBho

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 7301
  • Country: fr
Re: how do you compare two folders? to extract source diff
« Reply #3 on: July 11, 2021, 05:32:00 pm »
Any decent diff tool will work on directories and recursively.

But, well, I don't know how different your two kernels are. Good luck analyzing every single difference in all source files. That could be very time-consuming work.
 

Online magic

  • Super Contributor
  • ***
  • Posts: 3682
  • Country: pl
Re: how do you compare two folders? to extract source diff
« Reply #4 on: July 11, 2021, 06:06:50 pm »
Kids these days ::)

Quote
If you must generate your patches by hand, use ``diff -up`` or ``diff -uprN``
to create patches.  Git generates patches in this form by default; if
you're using ``git``, you can skip this section entirely.

edit
I see that some sub-folders have been renamed, deleted or added, as well as some of the contained files.
Never mind, that's gonna suck big time. Not sure what could find renamed files and directories, particularly if they have been modified too.
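
For the parts of the tree that were not renamed, the `diff -uprN` form quoted above still applies. A minimal sketch, assuming a `diff` that accepts those four flags (GNU diffutils does); the function name `make_tree_patch` and the paths in the usage comment are made up for illustration.

```shell
#!/bin/sh
# make_tree_patch OLD NEW OUT
# Write a recursive unified diff from OLD to NEW into OUT.
#   -u  unified format
#   -p  show the enclosing C function in each hunk header
#   -r  recurse into sub-directories
#   -N  treat a file missing on one side as empty, so added and
#       removed files show up in the patch
make_tree_patch() {
    status=0
    diff -uprN "$1" "$2" > "$3" || status=$?
    # diff exits 0 when the trees are identical, 1 when they differ,
    # and 2 or more on a real error; only the last case is a failure.
    [ "$status" -le 1 ]
}

# Usage (hypothetical paths):
#   make_tree_patch kernel-A kernel-B /tmp/a-to-b.patch
```

A renamed file appears in such a patch as one full deletion plus one full addition, which is why renames need separate handling.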
« Last Edit: July 11, 2021, 06:11:01 pm by magic »
 
The following users thanked this post: DiTBho

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 7301
  • Country: fr
Re: how do you compare two folders? to extract source diff
« Reply #5 on: July 11, 2021, 06:25:35 pm »
Ah yes, if there are renamed files and directories, all the worse.
A decent VCS can handle renamed stuff and keep track of what has been renamed to what.

Now working on such a large code base as the Linux kernel without any VCS is complete madness.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Re: how do you compare two folders? to extract source diff
« Reply #6 on: July 11, 2021, 06:30:36 pm »
But since you know these kernels came from git

No, they come from neither Git nor SVN; that's exactly the problem.
There is no history, just files and folders, and I have to somehow reconstruct the history.


There is also a big problem: I would need something like "bisect" (which is a common tool with Git, at least for Linux), because there are things that work with kernel-A but not with kernel-B, and vice-versa.


Frankly, it's a full mess ... it seems like a punishment, I must have done something bad in my life to deserve it  :-//
 

Offline evb149

  • Super Contributor
  • ***
  • Posts: 1923
  • Country: us
Re: how do you compare two folders? to extract source diff
« Reply #7 on: July 11, 2021, 06:41:32 pm »
Well I meant "came from git" in a loose sense.

If you know that your kernel source tree descends from a locally customized variant of, say, 3.14.2, then you can at least grab the kernel source tree from the Linux git repository and check out the 3.14.2 revision. That should agree well in file and directory structure with your tree, apart from whatever modifications were made locally and never committed to Git.

If you have a 3.14.2 based kernel but there are a LOT of directory moves / renames / merges and such then yes that is a total mess and you'd almost have to write a program to try to estimate where a given file may have "originally come from" if it is say 72% similar to some other file in the stock linux 3.14.2 git tree and maybe do something based on that.

I'm not sure how you could end up with a local kernel copy and not even know what version kernel it is based on even if someone didn't maintain the .git metadata / repo structures corresponding to it.  So maybe "reconnecting" the changes to the closest available git ancestor and then treating that as two diverged feature branches is the best bet.
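
A rough sketch of that "reconnecting" step, assuming git is installed and that `REPO` is a clone which already contains the chosen ancestor as a tag or commit. The function names `import_tree` and `show_renames`, the branch names and the paths in the usage comment are all hypothetical.

```shell
#!/bin/sh
# import_tree REPO BASE_REF BRANCH SRC_DIR
# Create BRANCH at BASE_REF inside REPO, replace the tracked files with
# the contents of SRC_DIR, and commit the result as one snapshot.
import_tree() {
    repo=$1 base=$2 branch=$3 src=$4
    git -C "$repo" checkout -q -b "$branch" "$base" || return 1
    # Drop every tracked file first, so that files deleted or moved in
    # SRC_DIR are recorded as deletions instead of silently surviving.
    git -C "$repo" rm -r -q . || return 1
    cp -R "$src"/. "$repo"/ || return 1
    git -C "$repo" add -A || return 1
    git -C "$repo" commit -q -m "Import $branch on top of $base"
}

# show_renames REPO FROM TO
# List changes between two refs; -M asks git to pair up renamed files
# by content similarity (R<score> lines in the output).
show_renames() {
    git -C "$1" diff -M --name-status "$2" "$3"
}

# Usage (hypothetical tag and paths):
#   import_tree linux v3.14.2 kernel-A /srv/kernel-A
#   git -C linux checkout -q v3.14.2
#   import_tree linux v3.14.2 kernel-B /srv/kernel-B
#   show_renames linux kernel-A kernel-B
```

Once both snapshots are branches off the same ancestor, the usual git tooling applies to them. On a tree the size of the kernel, git may skip inexact rename detection unless `diff.renameLimit` is raised.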

 
The following users thanked this post: DiTBho

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 7301
  • Country: fr
Re: how do you compare two folders? to extract source diff
« Reply #8 on: July 11, 2021, 06:43:03 pm »
I would first use a tool or write a script to compare directories and file names, and give you what differs. At this point, you can just compare names, and list files and directories that exist on one side and not the other. Some diff tools have this feature. That will give you new and renamed files/directories. If there aren't too many of them, reconstructing what is just a rename and what is new should not take too much time.

Next step would be to do a full diff on all files. You can rename the renamed files/directories back to their original name before that, or you can just leave them as is and diff them separately.
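
The name-only first pass described above could look roughly like this. It is a sketch using only `find`, `sort` and `comm`; the function name `list_only_in` is invented here, and paths containing newlines are not handled.

```shell
#!/bin/sh
# list_only_in A B
# Print the relative paths (files and directories) that exist under A
# but not under B. File contents are not looked at, only names.
list_only_in() {
    a_list=$(mktemp)
    b_list=$(mktemp)
    # Paths are listed relative to each root so the two lists are comparable.
    (cd "$1" && find . -print | LC_ALL=C sort) > "$a_list"
    (cd "$2" && find . -print | LC_ALL=C sort) > "$b_list"
    # comm -23 keeps lines unique to the first list; it needs both
    # inputs sorted in the same collation, hence LC_ALL=C throughout.
    LC_ALL=C comm -23 "$a_list" "$b_list"
    rm -f "$a_list" "$b_list"
}

# Usage (hypothetical directory names):
#   list_only_in kernel-A kernel-B > only-in-A.txt
#   list_only_in kernel-B kernel-A > only-in-B.txt
```

A renamed directory shows up in both outputs, once under its old name and once under its new one, which gives a short list of candidates to pair up by hand.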
 
The following users thanked this post: DiTBho

Online Doctorandus_P

  • Super Contributor
  • ***
  • Posts: 1726
  • Country: nl
Re: how do you compare two folders? to extract source diff
« Reply #9 on: July 11, 2021, 09:13:57 pm »
I like:
http://meldmerge.org/

It's not the greatest source code comparison program, and it's written in Python, so it is quite slow, but it is open source and does work.

It can also compare sub directories recursively, and it's easy to click around with a mouse.
It does not work well with renamed directories though. It will find the different directory names, but then does not compare the files in those directories.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Re: how do you compare two folders? to extract source diff
« Reply #10 on: July 11, 2021, 09:39:28 pm »
Thanks guys, I feel really lost.

I just bought a licensed copy of "Beyond Compare Pro" for Windows. Its activation key will probably be sent by email tomorrow, because it requires human interaction. I will use it over Samba on a laptop, with Meld and MeldMerge locally on the Linux server.

I read good reviews about it; it's a bit pricey, but it seems to have a lot of serious, advanced features.

We will see.
 

Offline AntiProtonBoy

  • Frequent Contributor
  • **
  • Posts: 938
  • Country: au
  • I think I passed the Voight-Kampff test.
Re: how do you compare two folders? to extract source diff
« Reply #11 on: July 12, 2021, 01:20:50 am »
I think Double Commander should be able to visually compare files for you.

Example:
https://doublecmd.sourceforge.io/static_gallery_mirror/page278_1.html
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Re: how do you compare two folders? to extract source diff
« Reply #12 on: July 12, 2021, 09:02:16 am »
Double Commander

oh, nice! and it's written in fpc (freepascal)  :D
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 7301
  • Country: fr
Re: how do you compare two folders? to extract source diff
« Reply #13 on: July 12, 2021, 06:06:03 pm »
Now that you invested in Beyond Compare... not much point in using other, less powerful compare tools. I for one have been using WinMerge (on Windows) for years. Works fairly well, and still maintained. Not as many features as Beyond Compare though. https://github.com/WinMerge/winmerge/releases/

As I said, I would personally have still probably written tools/scripts to "preprocess" the huge amount of directories and files, and after that, I would probably have used a visual compare tool such as the above for actually merging code in source files that are different. I think that would have saved time in the end. But I sort of like writing tools, so...
 

Online 2N3055

  • Super Contributor
  • ***
  • Posts: 3927
  • Country: hr
Re: how do you compare two folders? to extract source diff
« Reply #14 on: July 12, 2021, 06:34:23 pm »
+1 for WinMerge.
 

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Re: how do you compare two folders? to extract source diff
« Reply #15 on: July 12, 2021, 07:07:23 pm »
I would personally have still probably written tools/scripts to "preprocess" the huge amount of directories and files, and after that, I would probably have used ...

With which algorithms? and to do what?

Yesterday, while I was waiting for my license to be activated, I wrote a C program that can find "added" and "removed" files and folders, but the real problem is understanding "what" a file or folder was renamed to.

My program can index each file, and I can re-use some old A.I. algorithms to extract a "match percentage": say, document X in kernel-A is probably the ancestor of document Y in kernel-B, the documents have different file names, and the match percentage is ...
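
One very simple way to get such a match percentage is to count shared lines. This is only a sketch of that idea, not the algorithm of any particular tool; the metric (distinct lines common to both files, relative to the larger file) and the function names `similarity_pct` and `best_match` are assumptions made for illustration.

```shell
#!/bin/sh
# similarity_pct FILE1 FILE2
# Print an integer from 0 to 100: the number of distinct lines the two
# files share, as a percentage of the larger distinct-line count.
similarity_pct() {
    s1=$(mktemp)
    s2=$(mktemp)
    LC_ALL=C sort -u "$1" > "$s1"
    LC_ALL=C sort -u "$2" > "$s2"
    # $(( ... )) also strips the padding some wc implementations print.
    common=$(( $(LC_ALL=C comm -12 "$s1" "$s2" | wc -l) ))
    n1=$(( $(wc -l < "$s1") ))
    n2=$(( $(wc -l < "$s2") ))
    rm -f "$s1" "$s2"
    max=$n1
    if [ "$n2" -gt "$max" ]; then max=$n2; fi
    if [ "$max" -eq 0 ]; then
        echo 100            # two empty files count as identical
    else
        echo $(( common * 100 / max ))
    fi
}

# best_match FILE DIR
# Print "PCT PATH" for the file under DIR most similar to FILE.
best_match() {
    find "$2" -type f | while IFS= read -r cand; do
        printf '%s %s\n' "$(similarity_pct "$1" "$cand")" "$cand"
    done | LC_ALL=C sort -rn | head -n 1
}
```

Running `best_match` for every file that exists only in kernel-A, against the files that exist only in kernel-B, yields rename candidates to confirm by hand. The metric ignores line order and repeated lines, so it is best treated as a coarse filter.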

"Beyond Compare" does this analysis. I don't know how it does it  :-//
 

Offline AntiProtonBoy

  • Frequent Contributor
  • **
  • Posts: 938
  • Country: au
  • I think I passed the Voight-Kampff test.
Re: how do you compare two folders? to extract source diff
« Reply #16 on: July 13, 2021, 02:35:11 am »
Total Commander is also a good one for Windows users.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 7301
  • Country: fr
Re: how do you compare two folders? to extract source diff
« Reply #17 on: July 13, 2021, 03:43:28 pm »
I would personally have still probably written tools/scripts to "preprocess" the huge amount of directories and files, and after that, I would probably have used ...

With which algorithms? and to do what?

I think I said it already? First compare files and folders, and just output the ones that are different on each side, without trying to automatically figure out which are potential renames and which are just completely different. Of course there are other tools that can do this. WinMerge can also do that for you. But writing your own gives you the opportunity to list differences in the exact way that you want, and that will make your life easier. That's the whole point. Most visual diff tools will just give you a long list of differences, which, when there are many files and folders, is not necessarily the best way to deal with that.

"Beyond Compare" does this analysis. I don't know how it does it  :-//

Now if you meant - which is I guess why you asked which algorithms I would use - not just listing which files or directories are different, but also determining which are potential renames, then that's more complex of course. If Beyond Compare does this, then it can be handy. But, I was just assuming that (unless the teams working on code were completely nuts) the number of file/directory renames was likely small compared to the total number of files and directories, and thus, that presenting differences in a way most convenient for you was more important than trying to automatically determine which were renames and which were just new stuff, something that I suggested doing manually in the end.

Now if Beyond Compare does everything you want and presents things in a convenient way for you, that's all good.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Frequent Contributor
  • **
  • Posts: 962
  • Country: gb
Re: how do you compare two folders? to extract source diff
« Reply #18 on: July 13, 2021, 07:02:00 pm »
I'm pretty tired, I've been working on this thing non-stop for two days, let me say ... Beyond Compare is really worth all the money it costs, it's really super helpful!

It allowed me to more or less bisect and isolate 84 files which are related and account for most of the differences; then, cross-comparing things (between kernel-A and kernel-B), I found that 5 of them show several problems, which are really silly:
  • a PHY not correctly initialized, the eth0 was up and seemed to work, but unable to communicate with the physical layer
  • a "UNIX-V" flag not correctly checked, and misspelled, this somehow passed the compile-check but then the kernel crashed as soon as it tried to mount something
  • a script trying to redefine CC but in the wrong way, this dropped all the passed cflags and made the compiler build a couple of modules in the wrong way
  • a couple of very bad mistakes in the DTB file, two fields appeared truncated, probably due to a bad copy&paste

Really silly mistakes, I think I was lucky to catch them so fast. Now I need to sweep out the garbage and all the temporary debug flags from the final trunk; I am in the process of creating a git repo for everything, and forcing my colleagues to use it.


edit:
thanks again guys  ;D
« Last Edit: July 13, 2021, 07:06:24 pm by DiTBho »
 

Offline bd139

  • Super Contributor
  • ***
  • Posts: 20506
  • Country: gb
Re: [solved] how do you compare two folders? to extract source diff
« Reply #19 on: July 13, 2021, 09:19:37 pm »
Been there. We had a file share and the only person allowed to edit stuff on it had to obtain the wooden spoon. This was until I bought a second wooden spoon and cocked their entire process up. This resulted in us moving to SVN and proper traceable version control  :phew:

Beyond Compare is also excellent for mirroring directories to removable media, aka backups. Excellent tool and well worth paying for. Have been a paying customer for about 15 years now :)
 
The following users thanked this post: DiTBho

