EEVblog Electronics Community Forum
Topic started by: krish2487 on October 02, 2020, 12:01:43 pm
-
Hello folks,
I have been doing some reading about adopting good practices while versioning the code I write.
I was hoping to get some ideas and suggestions on how folks here version their code.
I was hoping to get some insight into strategies, tools, utilities, practices: anything and everything that might be of
some help to others who are (like me) neither well informed nor formally trained.
While the topic mentions git, any suggestions pertaining to good practices, independent of the VCS, are welcome. :-)
Likewise with the term "programming": I am more interested in such a system for firmware, but every other kind
of programming is fine, I suppose.
I'll go first.
I have been looking into and trying to learn about git flow + commitizen as a couple of tools that go hand in glove to implement a
structured approach to firmware releases and to enforce (for myself, for now) a consistent conventional commit style and semver style.
My belief is that firmware moves less rapidly than web technology or desktop software, which is why a more formalized
strategy like git flow makes sense. Also, most of us underestimate the longevity of code, especially on microcontrollers.
I was wondering if other experienced folks already have such a system / workflow in place that is tried and tested.
I realise one workflow / suggestion / style is not a silver bullet to all problems, but I was hoping to get a bit more insight into other practices without reinventing the wheel myself, hence the rather broad
-
Random thoughts:
* Make sure you adopt Semantic Versioning (https://semver.org/) for your project. Such versioning has the format of "MAJOR.MINOR.PATCH+BUILD". For example "1.4.8+a0ef1".
* Choose git as your version control system.
* Every time you create a release for your project, update the current version number and then create a git tag "MAJOR.MINOR.PATCH" for the current commit. The BUILD is implicitly the first 5 hex digits of the current commit hash.
* You can query the current version from your repo with the command "git describe --tags --long --abbrev=5", then use a regex pattern "(\d+)\.(\d+)\.(\d+)-\d+-\w(\w+)" to search on the output string to extract and compose the correct "MAJOR.MINOR.PATCH+BUILD" semantic version. This can be integrated into your program's version strings, etc.
* When working with git, make sure you have at least two branches: "master" and "develop". The "master" branch tracks stable and production ready code. No build errors allowed. The "develop" branch is where you do all the business work, where things are in flux, or broken. Merge "develop" into "master" when you are ready to make a new release.
* If you are using GitHub, the above tagging and versioning strategy plays well with GitHub Releases.
* I don't really have a good recommendation for third-party tools to automate this, but I do recommend writing scripts to perform all these tasks.
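A minimal sketch of the regex step from the list above, in Python. The function name semver_from_describe is hypothetical, and the sample string is an assumed example of what "git describe --tags --long --abbrev=5" prints for a commit three commits past the tag 1.4.8 (tag, commit count, then "g" plus the abbreviated hash). Note that --abbrev sets a minimum length, so git may print more than 5 hex digits when it needs them for uniqueness.

```python
import re

# Pattern quoted in the post: TAG, "-", commits since tag, "-", "g" + abbreviated hash.
DESCRIBE_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-\d+-\w(\w+)")


def semver_from_describe(describe: str) -> str:
    """Compose MAJOR.MINOR.PATCH+BUILD from `git describe --tags --long` output."""
    match = DESCRIBE_RE.search(describe)
    if match is None:
        raise ValueError(f"unexpected git describe output: {describe!r}")
    major, minor, patch, build = match.groups()
    return f"{major}.{minor}.{patch}+{build}"


if __name__ == "__main__":
    # Assumed sample output; in a real build script this string would come
    # from running git describe in the repository.
    print(semver_from_describe("1.4.8-3-ga0ef1"))  # prints 1.4.8+a0ef1
```

A build script could write the returned string into a generated header or source file so the firmware can report it at runtime.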
-
I recommend the distributed Fossil SCM after using it since 2017 as a one man band, developing mostly Forth related embedded projects.
I arrived at Fossil after using CVS from 1996, then Mercurial and Bazaar about three years later.
Fossil was designed for SQLite by the SQLite developer, and a single file contains the project repo, a wiki, bug tracker, web server, etc.
This brief page describes the Fossil advantages much better than I can:
https://sqlite.org/whynotgit.html
-
Random thoughts:
* Make sure you adopt Semantic Versioning (https://semver.org/) for your project. Such versioning has the format of "MAJOR.MINOR.PATCH+BUILD". For example "1.4.8+a0ef1".
* Choose git as your version control system.
* Every time you create a release for your project, update the current version number and then create a git tag "MAJOR.MINOR.PATCH" for the current commit. The BUILD is implicitly the first 5 hex digits of the current commit hash.
* You can query the current version from your repo with the command "git describe --tags --long --abbrev=5", then use a regex pattern "(\d+)\.(\d+)\.(\d+)-\d+-\w(\w+)" to search on the output string to extract and compose the correct "MAJOR.MINOR.PATCH+BUILD" semantic version. This can be integrated into your program's version strings, etc.
More or less my thoughts.
I would like to suggest one more addition: a build ID for uniquely identifying firmware builds.
https://interrupt.memfault.com/blog/gnu-build-id-for-firmware
With the build ID coupled with the commit hash and semver, it would be trivial to identify a rogue binary and troubleshoot it.
* When working with git, make sure you have at least two branches: "master" and "develop". The "master" branch tracks stable and production ready code. No build errors allowed. The "develop" branch is where you do all the business work, where things are in flux, or broken. Merge "develop" into "master" when you are ready to make a new release.
* If you are using GitHub, the above tagging and versioning strategy plays well with GitHub Releases.
* I don't really have a good recommendation for third-party tools to automate this, but I do recommend writing scripts to perform all these tasks.
Exactly the idea behind git flow.
https://nvie.com/posts/a-successful-git-branching-model/
And you do have bash and other shell helper utilities to help automate this flow.
https://github.com/nvie/gitflow
The last piece of the puzzle is to standardize the format of commit messages and changelog management, which I feel is fairly well done by this tool:
https://commitizen-tools.github.io/commitizen/
This utility will help standardize the commit messages in a conventional commit style.
https://www.conventionalcommits.org/en/v1.0.0/
I was wondering if others had / have a similar workflow or suggestions to improve it, for a more structured approach to code management.
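To make the link between conventional commits and semver concrete, here is a rough Python sketch. It is not commitizen's implementation; the regex and the function name suggested_bump are assumptions made for illustration. It follows the mapping given in the Conventional Commits 1.0.0 text: "fix" implies a PATCH bump, "feat" a MINOR bump, and a "!" after the type/scope or a "BREAKING CHANGE:" footer a MAJOR bump.

```python
import re

# Header shape: <type>[(scope)][!]: <description>
HEADER_RE = re.compile(
    r"^(?P<type>[a-zA-Z]+)(?:\((?P<scope>[^()]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)


def suggested_bump(message: str) -> str:
    """Return 'major', 'minor', 'patch' or 'none' for one commit message."""
    lines = message.splitlines()
    header = HEADER_RE.match(lines[0]) if lines else None
    if header is None:
        raise ValueError("not a conventional commit header")
    # A breaking change is flagged by "!" in the header or by a footer line.
    breaking = header["breaking"] is not None or any(
        line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:")) for line in lines[1:]
    )
    if breaking:
        return "major"
    if header["type"] == "feat":
        return "minor"
    if header["type"] == "fix":
        return "patch"
    return "none"  # docs, chore, refactor, ... imply no version bump


if __name__ == "__main__":
    # Hypothetical firmware commit messages.
    print(suggested_bump("feat(adc): add oversampling mode"))
    print(suggested_bump("fix: handle UART overrun"))
```

A check like this could run in a commit-msg hook or in CI, which is roughly the role a tool such as commitizen fills in the workflow described above.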
I recommend the distributed Fossil SCM after using it since 2017 as a one man band, developing mostly Forth related embedded projects.
I arrived at Fossil after using CVS from 1996, then Mercurial and Bazaar about three years later.
Fossil was designed for SQLite by the SQLite developer, and a single file contains the project repo, a wiki, bug tracker, web server, etc.
This brief page describes the Fossil advantages much better than I can:
https://sqlite.org/whynotgit.html
I did have a look at Fossil SCM a while ago. It is easy to use.
However, one big caveat is that the rest of the world (or at least a large part of it) has standardized on git as a version control tool.
Any collaboration or sharing of code repos or work becomes much harder with Fossil, given the time needed to familiarise developers with it.
Not because the tool is hard, but because of the time and resources spent for people to adopt it.
-
Hi,
The Gitflow workflow is designed for libraries and tools that have immutable releases: a release, once released, must never be changed again; bugs are fixed in a new one and users are forced to upgrade to the newest one.
Now decide if you ever need to fix and re-release old releases. If not, you can go for Gitflow.
-
Hi,
The Gitflow workflow is designed for libraries and tools that have immutable releases: a release, once released, must never be changed again; bugs are fixed in a new one and users are forced to upgrade to the newest one.
Now decide if you ever need to fix and re-release old releases. If not, you can go for Gitflow.
Isn't that a good thing for firmware?
It provides traceability to an extremely granular level. And even bug fixes on the firmware will warrant a new release anyway. The overarching process of upgrading firmware to a bug-free / stable one does not change irrespective of the workflow. Git flow just allows us to trace a bug back to an immutable release, as you said. Isn't firmware, and its version control, immutable since it is running on hardware? Unless the devices have OTA capability, any release of a firmware is basically the device's only firmware unless it is manually upgraded.
-
I did have a look at Fossil SCM a while ago. It is easy to use.
However, one big caveat is that the rest of the world (or at least a large part of it) has standardized on git as a version control tool.
Any collaboration or sharing of code repos or work becomes much harder with Fossil, given the time needed to familiarise developers with it.
Not because the tool is hard, but because of the time and resources spent for people to adopt it.
All true, but how many here will actually collaborate with others on their embedded projects?
Git was developed by and is used by the Linux kernel developers, who are the perfect example of online collaboration between many people worldwide who all program in C and who all use GCC/Clang for a common OS. You won't find Linux developers versioning schematics, PCB layouts, pictures of housings and enclosures like you will in embedded.
In the embedded world (unlike Linux), everything is very different: the compilers, the MCUs, the CAD systems for PCB design, schematic capture and so on. There is very little common ground in the software unless you use a monolithic system like Arduino.
So does it make sense for someone to struggle with Git every day for their blinky project in the (unlikely) anticipation of others joining in, or is it perhaps more sensible to use an SCM such as Fossil that will make their one-man-band development easy and fast?
However, Fossil also does everything Git does, and a few things Git doesn't, so on the unlikely chance a project does take off with multiple developers, Fossil can easily handle it because it's a distributed SCM just like Git.
I'm not disagreeing with you because you're right: every man and his dog is using Git, it's as common as Microsoft Windows and C. But based on my observations of the many embedded hobbyists I have met, most of them only use Git like I use wget, as an easy way to download something from the internet.
-
I'd suggest you try to keep it simple. Avoid adding complexity to the process unless the advantages outweigh the disadvantages.
The philosophy I try to use is called branch late. Basically, you avoid branching and merging unless needed to bugfix a release after head has moved on. I think it is the simplest approach, and it works well for a solo worker or a small team and on any VCS.
Just check code in to master, and then at some point make a release. Continue checking in to master. If you need to go back and make a minor fix on the release then go and make a branch at the release point. Check in the fix and also merge it forward into head.
It is easy to become a slave to the VCS. Tools like git are amazing compared to what we used to have, but they are also complex and you don't have to use all of it. Consider the team size and skillset. Best practice wisdom on the net may relate to big projects with many people working on them, where some of the team are gurus in the VCS.
-
I would like to associate myself with hendorog's comment ;)
-
Unless your project never talks to anything else, you will, over time, have security issues. Keep in mind that while you fix them with a new version or a patch, which base version you patch them in becomes very important.
Failing to keep this in mind has been the downfall of many widely used "Open" packages. Some were so bad that XKCD got in on the act, describing why the security bug was so bad.
For example, I'm on version 1.2.3.4 of your project. It does everything I want; sure, there might be some bugs in it and some features not working exactly right, but my tests of the overall application stack are good. Now a CERT issue hits and your answer is that the fix is in 1.3.2.2 or greater. Fine, but for 1.3.2.2 we have to rush to get that version, test it, test it with our app and then get it into production without taking down our customers for too long. Then we find out a few months later that 1.3.0.0 has a very nasty security issue in it, because of a new feature that we are not using and that no one really cares about yet. But you have only been putting security fixes in the very latest version of the project. If only you had released a 1.2.3.5 with the security fix in it, we wouldn't be in this mess. Now we are stuck with a deadline of under 24 hours to fix major apps.
This is why you see people like Red Hat taking an "Open" project, backporting the security fix and putting out a 1.2.3.4-002 version, because the testing of 1.3 with the new and changed features will take too long.
Now, how does one do this with git? I don't care; in fact, a poor man's source control, where only one guy can put in changes and controls them, works just as well. It's all a matter of setting out your rules, following them, and thinking about having more than one actively supported branch.
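As a small illustration of the "more than one actively supported branch" rule, the Python sketch below plans one fix release per supported release line instead of only the newest. The helper names and the list of supported versions are hypothetical; "1.3.2.1" is an assumed predecessor of the 1.3.2.2 mentioned in the example above.

```python
def next_maintenance_release(version: str) -> str:
    """Bump only the last numeric component, e.g. 1.2.3.4 -> 1.2.3.5."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def plan_backports(supported_versions):
    """Map each supported release to the version that will carry the fix."""
    return {v: next_maintenance_release(v) for v in supported_versions}


if __name__ == "__main__":
    # Hypothetical set of release lines that still receive security fixes.
    for current, fixed in plan_backports(["1.2.3.4", "1.3.2.1"]).items():
        print(f"{current} -> {fixed}")
```

The point of the sketch is the policy, not the arithmetic: users on 1.2.3.4 get a 1.2.3.5 containing only the fix, so they are not forced onto the 1.3 feature set under deadline pressure.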
-
The Gitflow workflow is designed for libraries and tools that have immutable releases: a release, once released, must never be changed again; bugs are fixed in a new one and users are forced to upgrade to the newest one.
Now decide if you ever need to fix and re-release old releases. If not, you can go for Gitflow.
In those situations, I would just fork a new maintenance repo just for that version. That way you don't have to disrupt the main fork.
-
All true, but how many here will actually collaborate with others on their embedded projects?
Git was developed by and is used by the Linux kernel developers, who are the perfect example of online collaboration between many people worldwide who all program in C and who all use GCC/Clang for a common OS. You won't find Linux developers versioning schematics, PCB layouts, pictures of housings and enclosures like you will in embedded.
In the embedded world (unlike Linux), everything is very different: the compilers, the MCUs, the CAD systems for PCB design, schematic capture and so on. There is very little common ground in the software unless you use a monolithic system like Arduino.
So does it make sense for someone to struggle with Git every day for their blinky project in the (unlikely) anticipation of others joining in, or is it perhaps more sensible to use an SCM such as Fossil that will make their one-man-band development easy and fast?
However, Fossil also does everything Git does, and a few things Git doesn't, so on the unlikely chance a project does take off with multiple developers, Fossil can easily handle it because it's a distributed SCM just like Git.
I'm not disagreeing with you because you're right: every man and his dog is using Git, it's as common as Microsoft Windows and C. But based on my observations of the many embedded hobbyists I have met, most of them only use Git like I use wget, as an easy way to download something from the internet.
I agree with you. Git, or rather GitHub, provides a single place where people can pull, push, share, fork and what not. This, coupled with the fact that it is more or less ubiquitous, gives it a rather large market share, which unfortunately is hard to beat. I am not being argumentative here, just stating the situation as it is. Even for embedded projects, people turn either to GitHub or, now, to GitLab.
I'd suggest you try to keep it simple. Avoid adding complexity to the process unless the advantages outweigh the disadvantages.
The philosophy I try to use is called branch late. Basically, you avoid branching and merging unless needed to bugfix a release after head has moved on. I think it is the simplest approach, and it works well for a solo worker or a small team and on any VCS.
Just check code in to master, and then at some point make a release. Continue checking in to master. If you need to go back and make a minor fix on the release then go and make a branch at the release point. Check in the fix and also merge it forward into head.
It is easy to become a slave to the VCS. Tools like git are amazing compared to what we used to have, but they are also complex and you don't have to use all of it. Consider the team size and skillset. Best practice wisdom on the net may relate to big projects with many people working on them, where some of the team are gurus in the VCS.
Believe it or not, that is how my work presently is. However, the reason for this post is also due to it. There are some issues I have noticed in the workflow which led me down this rabbit hole.
1. Isolation between features - what I mean by that is that, since there are no explicit feature branches, developing multiple features simultaneously does not allow compartmentalized development. This leads to a lot of frustration in debugging: which commit led to what, and why.
2. Bug fixes are inevitably a mess.
3. Releases are not atomic to a feature - again, since there are no feature branches, every feature being implemented needs to be complete before a release can be made.
4. Lack of coherent correlation between features, bugfixes, releases and tags - this should be self-explanatory. Since everything is coupled in a linear timeline, any commit message clarity is drowned in noise.
5. Troubleshooting - I feel branching helps split from a point and troubleshoot from a given commit. This strategy will fail if all the development happens on the same branch, since we cannot predict the effect of other intermediate commits on a given bug / feature.
6. Root Cause Analysis - again self-explanatory. Trying to perform any root cause analysis is next to impossible with a linear git timeline.
To be clear, I am not arguing with you, I am just stating some of my experiences. The points I mentioned above can be worked around by doing some fancy git stuff like rebase, bisect, squash and what not, but if you are going to that effort then you might as well branch and do it right the first time, no?
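For reference, the idea behind "git bisect" can be sketched as a binary search over a linear history. This is a simplified Python illustration, not git's actual implementation: the function name first_bad_commit and the commit labels are hypothetical, and it assumes the oldest commit is known good, the newest is known bad, and the history flips from good to bad exactly once.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history c0..c7 where the bug was introduced in c5.
    history = [f"c{i}" for i in range(8)]
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 5))  # prints c5
```

In practice is_bad would build and test the firmware at that commit; the search needs only about log2(N) such checks for N commits.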
In those situations, I would just fork a new maintenance repo just for that version. That way you don't have to disrupt the main fork.
Yes, but what about cases where the same changes in the maintenance repo need to be integrated into the main fork? Again, isn't the immutability of firmware the very reason why git flow would be appropriate for this sort of scenario? You would almost certainly have a very coherent branch and flow timeline.
Isolating a problem and developing a hotfix would be easier, wouldn't it?
Unless your project never talks to anything else, you will, over time, have security issues. Keep in mind that while you fix them with a new version or a patch, which base version you patch them in becomes very important.
Failing to keep this in mind has been the downfall of many widely used "Open" packages. Some were so bad that XKCD got in on the act, describing why the security bug was so bad.
For example, I'm on version 1.2.3.4 of your project. It does everything I want; sure, there might be some bugs in it and some features not working exactly right, but my tests of the overall application stack are good. Now a CERT issue hits and your answer is that the fix is in 1.3.2.2 or greater. Fine, but for 1.3.2.2 we have to rush to get that version, test it, test it with our app and then get it into production without taking down our customers for too long. Then we find out a few months later that 1.3.0.0 has a very nasty security issue in it, because of a new feature that we are not using and that no one really cares about yet. But you have only been putting security fixes in the very latest version of the project. If only you had released a 1.2.3.5 with the security fix in it, we wouldn't be in this mess. Now we are stuck with a deadline of under 24 hours to fix major apps.
This is why you see people like Red Hat taking an "Open" project, backporting the security fix and putting out a 1.2.3.4-002 version, because the testing of 1.3 with the new and changed features will take too long.
Now, how does one do this with git? I don't care; in fact, a poor man's source control, where only one guy can put in changes and controls them, works just as well. It's all a matter of setting out your rules, following them, and thinking about having more than one actively supported branch.
I sort of lost you there. To be clear, are you advocating for a git-flow sort of approach or against it? From what I can understand, it looks like you are advocating for the git branching strategy.
-
I realise one workflow / suggestion / style is not a silver bullet to all problems, but I was hoping to get a bit more insight into other practices without reinventing the wheel myself, hence the rather broad
Just use gitflow. Many tools have built-in support for it to assist you. It is not part of core git, but Git for Windows even bundles a git-flow extension.
You can do other workflows, but that will just be confusing for others if you don't write down the rules properly.
Gitflow is what most people will have seen or used before. At least some form of it.
And besides, even gitflow isn't concrete.
It's perfectly fine to take a release from a few years ago and backport a bugfix on it. You'd have an additional "master" branch dangling there, but so what.
You'd basically see that as an in-repo fork.
-
Yes, but what about cases where the same changes in the maintenance repo need to be integrated into the main fork?
I'd assume hotfixes should be quite small, and not too much of a burden to merge back into the main fork via a pull request, or to port over manually.
I reckon this methodology would be adequate if the old firmware version has reached end of life and you want to keep it in maintenance mode only. That way you can retire the fork at some point and forget about it.
Again, isn't the immutability of firmware the very reason why git flow would be appropriate for this sort of scenario? You would almost certainly have a very coherent branch and flow timeline.
Isolating a problem and developing a hotfix would be easier, wouldn't it?
Sure, if that works, then use it. :)