| Boeing Starliner: 2 SW bugs found, patched, uploaded in-flight to avoid disaster |
| splin:
It seems the software problems were worse than previously reported. The known problem was a timer error that set the clock 11 hours off, but there was another serious bug: https://www.theregister.co.uk/2020/02/10/more_software_errors_beset_boeings_calamity_capsule/
--- Quote ---Firstly, that timer wasn't the only software glitch. The Service Module (SM) Disposal Sequence was incorrectly translated into the SM Integrated Propulsion Controller (IPC). The result was that rather than performing a burn to dispose of the SM prior to re-entry, the bug could have actually sent the SM bouncing off the Crew Module. Fortunately, the team noticed that second error while reviewing the code following the first, and uploaded the fix prior to landing. --- End quote ---
The comments are well worth reading, specifically this:
--- Quote ---Re: "re-verifying flight software code" It’s worse. They cut and pasted their own code, from the capsule to the service module, but then didn’t update what lookup tables it used. If not spotted and rushed patched, when the two parts separated, the service module would have used completely wrong thrust, then try to self correct using even more incorrect thrust and so on until it’s out of thrust or has crashed back into the capsule. Boeing sent multiple, untested, unvalidated software patches, written on the fly to the star liner whilst on mission, just to get it to return safely and it still failed to reach the iss. The approach and docking at the iss hasn’t been tested. Let’s not forget this was a proof it all works mission. That without direct intervention would have resulted in total loss. --- End quote ---
I'm not sure what the source for this claim is, but if true, I'd struggle to come to a much different conclusion than this later comment:
--- Quote ---This situation in this case is a whole different kettle of fuck up. Its not a one point failure situation - its the exact opposite - you'd be hard pressed to find anything that these people did right. I think my A level computer studies group could have done better (in BASIC) than this shower. They cut/pasted code and worst of all, look up tables, into software for a different system, didn't check it visually or otherwise, didn't simulate it on ground systems that should have been available as part of the contract. --- End quote ---
The reviewer, if there was one, didn't verify the contents of the look-up tables, which are just as important as the code itself - unforgivable. :palm:
This wasn't one of those obscure and totally unexpected bugs, such as a subtle compiler fault that very rarely introduces a latent data error into the software state, one that only reveals itself long after the initial source of the error and in a totally unrelated subsystem, or a race condition that only occurs in very specific circumstances not exercised by even the most extensive test suites. In such cases one can have a great deal of sympathy with the developers when these faults make it undetected into a released product.
The atmosphere must be very uncomfortable in Boeing's development teams right now. Hmm, my tablet auto-corrected Boeing to Boring - how wrong could it get? :-DD |
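To make the lookup-table point concrete, here is a minimal sketch of the kind of check that would reject a routine pasted from one vehicle and still wired to the other vehicle's table. Nothing here comes from Boeing's flight software: the class names, thruster IDs and thrust figures are all hypothetical, and the only idea shown is "tag each table with the vehicle it was derived for and refuse a mismatch at construction time".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrustTable:
    """Hypothetical lookup table: nominal thrust per thruster, tagged by vehicle."""
    vehicle: str           # vehicle the table was derived for
    thrust_newtons: dict   # thruster id -> nominal thrust in newtons


# Illustrative values only, not real Starliner data.
CREW_MODULE_TABLE = ThrustTable("crew_module", {"rcs_1": 100.0, "rcs_2": 100.0})
SERVICE_MODULE_TABLE = ThrustTable("service_module", {"oms_1": 1500.0, "oms_2": 1500.0})


class BurnController:
    """Hypothetical controller that converts a requested impulse into a burn time."""

    def __init__(self, vehicle: str, table: ThrustTable):
        # Refuse to start if the table was built for a different vehicle.
        if table.vehicle != vehicle:
            raise ValueError(
                f"table for {table.vehicle!r} used on {vehicle!r} controller"
            )
        self.vehicle = vehicle
        self.table = table

    def burn_time(self, thruster: str, impulse_ns: float) -> float:
        """Seconds of burn needed to deliver impulse_ns (newton-seconds)."""
        return impulse_ns / self.table.thrust_newtons[thruster]
```

With this in place, `BurnController("service_module", CREW_MODULE_TABLE)` raises `ValueError` instead of quietly computing burn times from the wrong thrust figures. A check like this only catches a wrong table, not wrong values inside the right table, so it is no substitute for review and ground simulation.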
| SilverSolder:
If true, these are signs of near total decay at Boeing - needs a "back to basics" cultural change. |
| edy:
Whatever happened to "measure 1,000,000 times, cut once"? I would have expected there to be extensive simulator and systems testing, and not just on one but on several independently made simulation/testing platforms, just to hammer out every possible issue there could ever be. I guess the code is so huge now that it is becoming impossible to manage. That, or there could be several generations of code in there now that newer people in the development cycle are having a hard time understanding. Perhaps a "reboot" and recoding from basic fundamentals is needed every once in a while to build up the system from scratch again (although that may also introduce errors... isn't there a concept of "leave spaghetti code alone if it seems to work"?). Anyway, there is still a lot of assuming as to exactly what went wrong, but it seems like one problem (timing error) was compounded by another (copying code without associated tables) when they tried to fix it mid-mission. Honestly I didn't even know Boeing and NASA were up to this... I think it is cool that they are using unmanned flights for supply missions. |
| Tomorokoshi:
What is it they say? "Test what you fly, fly what you test". https://appel.nasa.gov/2002/10/01/test-what-you-fly/ |
| WastelandTek:
--- Quote from: SilverSolder on February 11, 2020, 07:11:59 pm --- If true, these are signs of near total decay at Boeing - needs a "back to basics" cultural change. --- End quote --- It's not just Boeing, it's culture-wide. We can't even do the choo choo trains any more, and there never seem to be any real consequences for the perpetrators of failure. |