I need the controls anyway, as one of the things I'm going to use it for is testing how software behaves under a sudden node restart, or testing stuff that can potentially hang the board and require a power cycle. But yeah, it's worth it for the added protection alone (especially considering that the power supply can output 15A on the 5V bus); it even has discharge control.
It also allows for neat tricks like "scale down to one node when there are no jobs to run in the cluster, scale up when there is work to do".
Solving it in software is easy enough (and I did it while testing), but it needs to be done for every single image I try to use with it. That makes testing new ones (as in "not made/modified by me (yet)") annoying, as I'd have to power on the whole thing, then power on a single board after the switch has booted, then ssh to it and add the modifications (or try to fix it "blindly" by mounting the SD card directly). I will probably add that anyway as a second layer, or just use static IPs; it's not like that will change often.
BTW, the totally lame but effective software solution might be to just insert a random delay before the DHCP request to avoid the 'thundering herd' of clients hitting the DHCP server. If this is in some startup script written in bash, sleep $(($RANDOM / 1000)) should give you a random sleep between 0 and 32 seconds.
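For what it's worth, a minimal sketch of that random delay wrapped in a function, assuming bash (where $RANDOM yields 0..32767). The name random_delay and the default of 32 seconds are made up for illustration; the modulo gives a very slightly uneven distribution, which shouldn't matter for spreading out boot-time requests.

```shell
#!/usr/bin/env bash
# random_delay [MAX]  (hypothetical helper name)
# Print a random whole number of seconds in the range 0..MAX inclusive.
# $RANDOM is a bash builtin yielding 0..32767, so keep MAX below 32768.
random_delay() {
    local max=${1:-32}
    echo $(( RANDOM % (max + 1) ))
}

# Intended use in a startup script, just before the DHCP client starts:
#   sleep "$(random_delay 32)"
```

Unlike dividing by 1000, the upper bound here is a parameter, so it can be tuned without redoing the arithmetic.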
It was not that problem; it was just that the switch took longer to boot than the ARM boards, long enough for DHCP to time out. I "fixed" it with a simple sleep. Then I modified some boot options and it stopped working because boot took longer... The "real" fix was just retrying in a loop until it worked, but that ain't exactly pretty.
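A rough sketch of what such a retry loop could look like in bash, not necessarily what was actually used here. The function name retry_until_ok, the attempt limit, and the delay are all assumptions, and the DHCP client invocation in the trailing comment (dhclient on eth0) is just an example of a command that exits non-zero when it fails to get a lease.

```shell
#!/usr/bin/env bash
# retry_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]  (hypothetical helper)
# Run COMMAND until it exits 0, sleeping DELAY_SECONDS between attempts.
# Returns 0 on the first success, 1 if all MAX_TRIES attempts fail.
retry_until_ok() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Example use at boot (client and interface name are assumptions):
#   retry_until_ok 30 5 dhclient -1 eth0
```

Bounding the number of attempts means a board with a genuinely dead link eventually gives up instead of looping forever, and unlike a fixed sleep it does not break when boot timing changes.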
I wonder why those TI parts are 5x as expensive as the Diodes ones... they pretty much do the same thing.