All non-trivial FSMs grow like Topsy until they become intractable - i.e. you can't accurately predict how they will operate. I've even seen an FSM similar to yours that had grown over the years until the turnaround time (customer request to deployed feature) was typically six months. Completely ridiculous of course, and it sank the product.
Yeah, I don't doubt it. The benefit of using something like a state machine should be that you fit your system to a construct you can implement quickly/reliably. And there's an added benefit of engineers being able to understand logic flow in a FSM better than other programming construct. But of course most of us aren't programmers and we can kludge even the best of code into a mess.
Nominally they were programmers. Personally I wouldn't have let them anywhere near a keyboard.
Are there any articles or examples of low-level FSM implementation you would recommend? I didn't find too much in the past specific to microcontrollers, made me want to share.
The trouble is I learned/reinvented various techniques before URLs even existed.
Have a look for Harel StateCharts. IMNSHO, while complete, they are a little overcomplicated.
An old low-level C technique is to have a 2D array of function pointers. Dimension 1 is the current state S (encoded as an integer), dimension 2 is the incoming event E (also encoded as an integer). Each pointer refers to the function to be executed when event E occurs in state S; the function takes whatever actions are necessary and specifies the next state (variant: the next state is encoded in a second parallel 2D array of integers).
Logging consists of recording <time,E,S> tuples. The set of states and events is fully explicitly enumerated, and it is unambiguous what happens when event E happens in state S1 or S2 or...
If you need a new action, then you don't have to dismember entangled switch statements, you just add a new function and point to it.
If an event "should not occur" then that can either be encoded as an empty function or, more usefully as a WTF-function.
Note that the execution time is constant and trivial - an array lookup and indirect jump to a function. Unless the state-event space is very sparse, which is unlikely for real-time systems, the 2D array is also space efficient.
If hierarchical FSMs better represent your problem, then encode encode each event as a class E, each state as a class C (where the class hierarchy matches the state hierarchy), and have a virtual function in classes C for each event E. Then provide methods in the relevant C for each relevant E. "Should not happen" E will trickle up to the base/root class.
Have a look at how various "state machine compilers" implement the FSMs in different languages. The better ones will target different languages and have alternative target implementations depending on the constraints.
One (old) compiler was Bob Martin's "State Map Compiler", see
http://smc.sourceforge.net/SmcFaq.htm#RipOff but there are many others.