Refactoring is what you do later on, when you discover an ACTUAL requirement that cannot be easily implemented in the current design. Or when the current design's maintenance cost is higher than the cost of replacing it with a simpler, easier-to-maintain design.
You refactor the code -- that is, change its structure without changing the USED functionality AT ALL -- until it is in a form where the functionality you now know is needed becomes easy to implement. Unused features and parts of the design are (or should be) removed.
Refactoring things to minimize code complexity and the cost of code maintenance is a very important part of my workflow.
Story time, feel free to skip/ignore:
My own design approach differs from the most common ones in that I like to test important features in modular test programs/firmware before using them in a design.
For example, let's say you have an embedded design where, for some reason, you need many lightweight timeouts that often/typically get cancelled before they elapse. (This is very common in systems programming, especially in any kind of internet or web service, and a situation I have lots of experience with; that's why I'm using it as an example here.) My preferred solution is a combination of two structures: a binary min-heap in a fixed-size array, with each entry containing the timeout time (when it elapses/triggers) and the timeout index/id; and, in parallel, an array with one entry per timeout, each entry containing that timeout's current min-heap index, plus any associated information needed by the implementation (like an elapsed flag, or whatever happens when the timeout elapses).
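To make the shape of this concrete, here is a minimal sketch in C. The type widths, the capacity, and all names are illustrative assumptions, not a fixed design:

```c
#include <stdint.h>

#define MAX_TIMEOUTS 64u  /* maximum number of concurrent timeouts (assumption) */

/* One min-heap entry: when the timeout elapses, and which timeout it is. */
struct heap_entry {
    uint32_t deadline;  /* absolute timestamp at which this timeout triggers */
    uint8_t  id;        /* index into the timeouts[] array below */
};

/* Per-timeout bookkeeping, indexed by timeout id. */
struct timeout_slot {
    uint8_t heap_index; /* current position of this timeout in heap[] */
    uint8_t flags;      /* e.g. in-use and elapsed flags */
    /* ...plus whatever is needed when the timeout actually elapses */
};

static struct heap_entry   heap[MAX_TIMEOUTS];     /* min-heap, ordered by deadline */
static struct timeout_slot timeouts[MAX_TIMEOUTS]; /* parallel array, indexed by id */
static uint8_t             heap_count;             /* live entries in heap[] */
```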
I start by creating a basic min-heap implementation, with the entries using appropriate types for the target processor and the maximum number of concurrent events. (I typically use an unsigned integer type for the timestamp. If the maximum allowed timeout duration is N bits wide, then N+2 bits suffice for the timestamp, giving me absolute reliability according to tests I've done before. Others can do it in N+1 bits, but often end up with off-by-one errors near maximum-duration timeouts. With non-power-of-two durations, a timestamp range of three times the maximum timeout duration suffices.)
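As a hedged sketch of the wraparound-safe comparison this implies -- assuming a free-running uint32_t tick counter and a maximum timeout duration that fits in 30 bits, so the N+2 margin holds -- the whole trick is that unsigned subtraction is modular in C:

```c
#include <stdbool.h>
#include <stdint.h>

/* Maximum timeout duration in ticks; must fit in N bits when the
   timestamp type is at least N+2 bits wide (here: a 30-bit maximum
   duration in a 32-bit timestamp). These limits are assumptions. */
#define MAX_TIMEOUT_TICKS (((uint32_t)1 << 30) - 1)

/* Returns true if timestamp 'a' is at or before timestamp 'b', even if
   the tick counter wrapped around between them. Valid whenever the two
   timestamps are at most MAX_TIMEOUT_TICKS apart. */
static bool time_le(uint32_t a, uint32_t b)
{
    return (uint32_t)(b - a) <= MAX_TIMEOUT_TICKS;
}
```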
The test implementation is run on a fully-hosted system (Linux), with inputs supplied from files, so I can easily construct test cases, including the worst cases I can imagine – those being the most important to test for in my opinion.
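For illustration, such a hosted test driver can be as small as the sketch below; the one-operation-per-line input format and the entry-point names are made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical entry points of the timeout module under test. */
extern bool timeout_insert(uint8_t id, uint32_t deadline);
extern void timeout_cancel(uint8_t id);

/* Reads one operation per line from a file (or stdin):
   "I <id> <deadline>" inserts a timeout, "C <id>" cancels one.
   Worst-case inputs are then just text files kept under version control. */
int main(int argc, char **argv)
{
    FILE *in = (argc > 1) ? fopen(argv[1], "r") : stdin;
    char op;
    unsigned id, deadline;

    if (!in) {
        perror("fopen");
        return EXIT_FAILURE;
    }
    while (fscanf(in, " %c", &op) == 1) {
        if (op == 'I' && fscanf(in, " %u %u", &id, &deadline) == 2)
            timeout_insert((uint8_t)id, (uint32_t)deadline);
        else if (op == 'C' && fscanf(in, " %u", &id) == 1)
            timeout_cancel((uint8_t)id);
        else
            break;  /* stop at malformed input so failures are visible */
    }
    return EXIT_SUCCESS;
}
```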
When the basic min-heap works as I need it to, I save it as a separate unit test, then continue by adding the identifier array for canceling timeouts and triggering timeout events. If there are different event mechanisms, I test those separately, without the min-heap at all.
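Cancellation is where the parallel identifier array earns its keep: look up the heap position by id, move the last heap entry into that hole, and restore the heap property. A sketch, reusing the hypothetical types from above and assuming sift_up()/sift_down() helpers that keep each timeouts[].heap_index in sync on every swap:

```c
/* Cancel the timeout with the given id. Uses heap[], timeouts[], and
   heap_count from the earlier sketch. */
static void timeout_cancel(uint8_t id)
{
    uint8_t hole = timeouts[id].heap_index;

    heap_count--;
    if (hole != heap_count) {
        /* Fill the hole with the last heap entry... */
        heap[hole] = heap[heap_count];
        timeouts[heap[hole].id].heap_index = hole;
        /* ...then restore heap order. The moved entry may need to go
           up or down; compare against the parent to decide. In real
           code this comparison must be wraparound-safe, as in the
           time_le() sketch above. */
        if (hole > 0 && heap[hole].deadline < heap[(hole - 1) / 2].deadline)
            sift_up(hole);
        else
            sift_down(hole);
    }
}
```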
When that sub-system has been tested and works, I add comments describing my current ideas on how it will be used in the final firmware/application, but do not integrate it yet. I make sure I have all the important modules working in isolation first.
When I have all the modules tested, I start the integration work. While one could just add all the modules at once, I prefer to do it one module at a time, because that way I minimize the area where bugs and errors and issues can arise. After adding a module, I do full testing, including the new tests for the added module, of course.
If the firmware/application is complex enough, I often realize that two (or three) modules could/should be merged, because that would yield better, easier-to-maintain code. (If a merge adds to the code complexity, I consider it to have negative value, because the code I write tends to get used for a long time, so the cost of maintenance is as important to me as the cost of creation.) I end up refactoring the modules in a separate unit test/module, usually not within the actual firmware/application directly. (By that point I have usually learned more about exactly which features matter, so I tend to rewrite the tests instead of just reusing the existing ones from the modules; but test coverage should always expand here, never shrink.) Reintegrating into the firmware/application is then done just like for any other module.
This leads to things like having to carefully document the internal interfaces to each subsystem. That, in turn, makes refactoring easier, because from the interfaces actually used in the firmware/application, you can tell whether parts of the modules are unused and can be discarded or refactored. It also explains why I find modular programming and software minimalism (and their combination, the Unix philosophy) so useful: this approach has a proven historical track record of producing good results.
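As an illustration of what such interface documentation might look like -- a hypothetical header, with all names and signatures invented for the example -- the point is that every entry point the firmware actually calls is easy to grep for:

```c
/* timeout.h -- internal interface of the timeout module (illustrative). */
#ifndef TIMEOUT_H
#define TIMEOUT_H

#include <stdbool.h>
#include <stdint.h>

/* Arm timeout 'id' to elapse at absolute tick 'deadline'.
   Returns false if 'id' is already armed or out of range. */
bool timeout_insert(uint8_t id, uint32_t deadline);

/* Cancel a previously armed timeout. Does nothing if 'id' is not armed. */
void timeout_cancel(uint8_t id);

/* Process all timeouts with deadline at or before 'now', running their
   elapsed handling; called from the main loop or a periodic tick. */
void timeout_poll(uint32_t now);

#endif /* TIMEOUT_H */
```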
This serves me well, but makes me somewhat less productive in terms of 'lines of code of new features per day'. My bug rate is much lower than typical, and I tend to keep core documentation up to date because I rely on it myself, so I consider this well worth it; but many people, especially manager types worrying about production costs, often disagree. (I still do struggle with writing truly useful code comments, though. I should have learned to do that from the get-go, instead of later on!) So, I do not recommend adopting this approach for those who do paid commercial work, because it can reduce their attractiveness in employers' eyes. I am not limited by that, as I currently only work for free or for myself.