
Interrupt routine duration


tggzzz:

--- Quote from: T3sl4co1l on August 15, 2022, 02:07:16 am ---
--- Quote from: tggzzz on August 14, 2022, 11:10:40 pm ---You can't test quality into a design. Tests can only prove the presence of faults, not their absence.

I'm sure you are perfectly aware of that, but it horrifies me every time someone is surprised by that new (to them) concept.
--- End quote ---

Sure you can. That's what they said about semiconductors. The yield might've been shite early on (like the <1% of early JP transistor lines, or certain Intel lines, etc.), but all they needed was a few parts that worked, and refinements to gradually bring that up.

Mind, this is a case where the design is essentially correct, there's just finitely many errors that occur in manufacturing (process impurities, dust, etc.), and they only need to get lucky enough to find one free of defects.

--- End quote ---

And as you realise, that testing is very different. Trying to use design validation/verification tests to detect replication errors is not fruitful!


--- Quote ---Does that work in software?  Maybe.  But it's worth noting, if nothing else, the meaning behind that statement, and where it may or may not apply.  It was applied erroneously to them, by those righteous hardware reliability types.  We must always be cognizant of the limitations of our knowledge, like this.

So, software.  Well, fundamentally it's a design, not a production process.  So, the above is out (and, to be clear, I'm not trying to force the above meaning into current context!).

And, I know where you're coming from.  To be clear: software design is something that -- given adequately comprehensive specifications -- we can prove to work, perfectly.  Not just "beyond a shadow of a doubt", not anything that could be tested (complete test coverage is combinatorial in complexity, it can't be done in general!), perfect proof.

--- End quote ---

Validation vs verification is relevant at this point!


--- Quote ---Assuming all the toolchain and stuff is working correctly, I mean, but a lot of work goes into those, along similar lines, when you're asking for something so reliable.  Provable stacks exist, from transistor level to compiler.

Now, I'm not quite sure if you're talking about formally proven systems here, or more informally, but it's good to know in any case that it's out there, and doable.

--- End quote ---

I've always rather liked the strategy Boeing used with the 777. Hatley & Pirbhai's technique was to have one stack for the specifications, using whichever techniques were suitable for each component of the design. It was almost executable (unlike the later UML kitchen sink). They also had a separate, independent stack for the implementation. The only part they thought worth automating was a database linking every specification artefact with a corresponding implementation artefact - which could be hardware/software/mechanical (or in the case of the 737 MAX, maybe wetware!). That ensured that nothing fell down the gaps between the floorboards.


--- Quote ---AFAIK, provable computing is not very often used, even in high-rel circles, just because it's, I don't know, so much of a pain, so different from regular development processes?

--- End quote ---

It does seem to require specialists with domain knowledge and maths experience, which is usually an empty set!

Back in the 80s I was beguiled by the promise of formal methods. They did have some success (IBM's CICS, Transputer's floating point), but then at least one unpleasant failure (RSRE's VIPER processor, where they almost managed to have formal traceability from instruction set specification to transistors).

Then I realised that even if formal methods could become practical, they would always run into the issue of interacting with things that aren't formally specified.


--- Quote ---And most of the time, it doesn't matter: if the thing does what it needs to, most of the time, and is reasonably tolerant of nonstandard inputs (as fuzzing can cover -- whether formally, or by the crude efforts of testers), who cares, ship it.  Some customers will eventually hit the edge cases, and maybe you patch those things up on an as-needed basis.  Maybe the thing is still chock full of disastrous bugs (like RCE), but who's ever going to activate them?  And what does it matter if it's not a life-support function, or connected to billions of other nodes (as where viruses can spread)?

So, to be clear, it depends on the level of competency required.  Provable computing is just another option in the toolbox.

Clearly, you're approaching things from a high-rel standpoint.  That's an important responsibility.  But it's also not something that can be applied in general.  At least, not with developers and toolchains where they are right now.

And that's even assuming that every project was specified perfectly to begin with.  Clients or managers come to engineers for solutions, not for mathematical proofs; it's up to the engineers to figure out if proofs are warranted, or if winging it will suffice.  And for 99.9% of everything, the latter is true, and so things are.

--- End quote ---

How a system works is usually easy and boring. It is much more interesting and important to understand how it fails, and how that failure is detected and corrected.

I would be satisfied with people that

* can spot where salesmen are effectively claiming they've solved the Byzantine Generals or Dining Philosophers problems
* ensure there are manual correction mechanisms that can overcome inevitable automated failures
* don't seriously believe that a system is working because none of the unit tests has failed
* realise that unit tests aren't going to be much use with, say, ACID transactional properties
Everybody has seen the first two; I've seen the last two :(


--- Quote ---And, I also mention testing for a couple reasons:
1. It's the most basic way to figure out how something works (or doesn't). 

--- End quote ---

Not really w.r.t. "works" (for "doesn't work", yes).

The nearest is using subtle failures (especially in wetware) to give glimpses as to how things normally operate.


--- Quote ---It can be exceedingly inefficient (trivially, say, how do you test a 1000 year timer?), but to the extent anything can be learned by doing it, in any particular case -- that's at least some information rather than complete ignorance, or guesswork.
2. There's "test driven development".  Which, I don't even have any good ways to do, in most embedded projects; most of the tools I have, don't come with test suites, so I can't even run tests to confirm they work on my platform.  And most embedded platforms have no meaningful way of confirming results, other than what I've put into them (e.g. debug port).  In relatively few cases, I can write a function in C, and test it on the PC -- exhaustively if need be (a lazy, and often infeasible method, but when it is, it's no less effective than direct proof).

TDD can be equivalent to proof, even without exhaustive testing, if all code paths can be interrogated and checked; granted, this is also, in general, not something you're often going to have (the code paths are invisible to the test harness, and highly nonlinear against the input, i.e. how the compiler decides to create branches may vary erratically with how the input is formulated or structured).  Though, this hints at something which can: if we add flags into every code path, and fuzz until we find the extent of which inputs, given other inputs, activate those paths, we can attempt to solve for all of them -- and as a result, know how many we're yet missing.

--- End quote ---

In theory, yes. In practice, not really. There is neither the time nor expertise to create decent tests that will detect flaws. At best there will be "happy days" (?daze?) unit tests, and idiotically trivial unit tests that give an appearance of code coverage.

Whether or not the unit tests are sufficient to discover edge-case errors is rarely explored, since it is perceived as too expensive.
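To make the path-flag idea from the quote concrete, here is a crude sketch (the function, flags, and input range are all invented for illustration): each branch sets a bit, and a fuzz loop then reports how many distinct paths the random inputs actually reached - the unset flags are exactly the "how many we're yet missing" figure.

--- Code: ---
#include <stdio.h>
#include <stdlib.h>

/* One flag bit per code path; the fuzz loop ORs them together. */
enum { PATH_NEG = 1u << 0, PATH_SMALL = 1u << 1, PATH_BIG = 1u << 2 };

/* Toy function under test: three paths, three flags. */
static int classify(int x, unsigned *paths)
{
    if (x < 0)   { *paths |= PATH_NEG;   return -1; }
    if (x < 100) { *paths |= PATH_SMALL; return  0; }
    *paths |= PATH_BIG;
    return 1;
}

int main(void)
{
    unsigned seen = 0;

    /* Crude fuzzing: hammer the function with random inputs and see which
       paths light up.  With a typical 31-bit RAND_MAX the narrow 0..99 band
       will probably never be drawn, so the report will usually say 2 of 3 -
       which is the point: the missing flag tells you what is still untested. */
    for (long i = 0; i < 100000; i++)
        (void)classify(rand() - RAND_MAX / 2, &seen);

    unsigned hit = (seen & 1u) + ((seen >> 1) & 1u) + ((seen >> 2) & 1u);
    printf("paths hit: %u of 3\n", hit);
    return 0;
}
--- End code ---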


--- Quote ---TDD I think is mainly a level-up in responsibility, where the project is persistent enough to not only be worth writing tests for, but to accumulate tests over time as bugs are found (write a test for it, to prevent it popping up in later refactoring!), while evolving new features -- extending an API while keeping it backwards-compatible, say.  It's far more agile than drawing up a comprehensive provable spec every time, and it's reliable enough for commercial application.  (So, it would figure that I haven't been exposed to it; I simply don't work on a scale where that's useful, besides the practicability issue.)

--- End quote ---

TDD is a useful organisational tool to break atherosclerotic waterfall processes. Too often converts then have a pseudo-religious faith in TDD's powers, and think it is sufficient. It isn't, of course.

TDD is highly beneficial, but neither necessary nor sufficient.


--- Quote ---(And maybe I'm overstating how much trouble it is to do provable computing, or something in the spirit of it, if not formal.  I don't work with it either, and curious readers should read up on it instead.)

--- End quote ---

Having observed, from a distance, people paid to explore the possibilities - you're not overstating it.


--- Quote ---And fuzzing, while it's still not going to be exhaustive; anywhere that we can ensure, or at least expect, linearity between ranges (i.e., a contiguous range of inputs does nothing different with respect to execution), we are at least very unlikely to need test coverage there.  (Insert Pentium FDIV bug here. :P )

Actually, heh, I wonder how that's affected by branch-free programming.  One would want to include flags equivalent to program branches.  So, it's not something that can be obviously discovered from the machine code, for example; the compiler may emit branchless style instructions instead of literally implementing a control statement.  It might not even branch in the source, if similar [branch-free] techniques are used (like logical and bit operators, and bit or array vectorization tricks).

--- End quote ---

Fuzzing, in any of its forms, is a useful extra tool in the armoury, especially where it is able to spot cracks between organisational boundaries, or grossly naive programming presumptions.
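As an aside on the branch-free point in the quote: the same selection can be written with or without a control statement, and the compiler is free to emit either form, which is exactly why the branch structure isn't obviously discoverable from the machine code. A hedged sketch (names invented):

--- Code: ---
#include <stdint.h>

/* Conventional form: the source has a branch, though the compiler may
   well emit a conditional move or arithmetic instead. */
static int32_t clamp_lo_branchy(int32_t x, int32_t lo)
{
    return (x < lo) ? lo : x;
}

/* Branch-free form: a mask derived from the comparison selects lo or x.
   Same result, but no control-flow edge exists even in the source. */
static int32_t clamp_lo_branchless(int32_t x, int32_t lo)
{
    int32_t mask = -(int32_t)(x < lo);   /* all ones if x < lo, else 0 */
    return (lo & mask) | (x & ~mask);
}
--- End code ---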

<snipped points of violent agreement>

uer166:

--- Quote from: tggzzz on August 14, 2022, 06:43:01 pm ---
--- Quote from: uer166 on August 14, 2022, 12:50:32 am ---
--- Quote from: Siwastaja on August 13, 2022, 06:11:29 am ---My programs have all basically had empty while(1); loops for years, I do all in ISRs after init.

--- End quote ---

I used to be a proponent for co-operative schedulers (no interrupts allowed at all). But with the exception of some regulatory and reliability edge cases, the nested ISR way you mention is generally easier to understand, and can still be designed to pretty high assurance levels.

--- End quote ---

If the customer doesn't need it to be reliable, then I can deliver something arbitrarily fast, cheap, and soon :)

I've seen nested ISRs become hairy and unpredictable, especially where realtime constraints and long-duration processing is involved.

Short ISRs generating events handled by a cooperative scheduler seems a good middle ground. Such events are processed identically to events generated by applications within the cooperative scheduler.

--- End quote ---

The issue with this is that you've erased most of the guarantees of a "true" co-op scheduler once you add an interrupt. Namely stuff like:

* No issues with data consistency. This means no physical way to get corrupt data due to non-atomicity
* Fully deterministic execution: you can always say that at time T+<arbitrary ms>, you're executing task X
* Worst case analysis triviality given some constraints
* Easy runtime checking of health, such as a baked-in task sequence check, deadline check, etc etc
The problem with true co-op of course is the massive headache that it causes when you try to implement complex stuff: everything needs to be synchronized to the tick rate, no task can take longer to execute than the tick rate, total ban on interrupts, etc. Lots of fast stuff needs to be delegated to pure hardware. A lot can be done in this architecture, but even stuff like life-saving UL991/UL1998 GFCI equipment doesn't need this kind of assurance. Maybe a jet turbine FADEC is a different situation of course..
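For illustration, a minimal time-triggered co-op skeleton might look like this (a sketch only; the task names and the tick_elapsed() hook are invented, and the latter is assumed to be provided by the hardware layer). Every task runs to completion from a static table, synchronised to the tick, with nothing else allowed to preempt it:

--- Code: ---
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    void     (*run)(void);   /* must return well before the next tick */
    uint32_t period;         /* run every 'period' ticks */
    uint32_t phase;          /* offset so tasks don't pile onto one tick */
} task_t;

static void task_sample_adc(void)    { /* ... */ }
static void task_run_control(void)   { /* ... */ }
static void task_kick_watchdog(void) { /* ... */ }

static const task_t tasks[] = {
    { task_sample_adc,     1, 0 },   /* every tick */
    { task_run_control,    5, 1 },   /* every 5th tick, offset by 1 */
    { task_kick_watchdog, 10, 2 },
};

/* Assumed hardware hook: returns true exactly once per tick period,
   e.g. by polling a timer compare flag.  No interrupts anywhere. */
extern bool tick_elapsed(void);

int main(void)
{
    uint32_t tick = 0;
    for (;;) {
        while (!tick_elapsed()) { /* idle until the next tick */ }
        for (unsigned i = 0; i < sizeof tasks / sizeof tasks[0]; i++) {
            if ((tick % tasks[i].period) == tasks[i].phase)
                tasks[i].run();
        }
        tick++;
    }
}
--- End code ---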

uer166:

--- Quote from: TC on August 14, 2022, 01:14:10 pm ---RE: hardware vs. software safety...

Safety of software will generally require that the custom code is certified, that the tools that are used to develop it are certified, that numerous constraints for the development of that software are enforced (like no object-oriented languages, MISRA C, etc.), safety-certified RTOS, safety-certified hardware, etc.

--- End quote ---

Your code of course needs to be certified, but the tools generally don't need to be. It's perfectly fine to use a specific GCC release in 99.9% of cases, and stick with it. The exact same is true of the place/route tools for the CPLD: I bet you'd use the normal vendor tools to convert your VHDL into the CPLD config. Of course if you switch the GCC version + recompile, and the issued binary is different, you need to re-certify and get the notified bodies involved, but like, just don't change the code..


--- Quote ---In contrast, it is much easier to do a fault analysis of something like a CPLD that implements a safe-torque off function in hardware. Consider also the performance of the safety function (i.e. real-time performance of hardware vs. a bunch of code).

So this is a good example of why hardware-based safety-functions are so important.
--- End quote ---

Is it? Modern MCUs are more ASICs than MCUs. There is plenty of hardware in the MCU to do a guaranteed shutdown and end up in a regulatory-defined risk-averse state. I've done UL2231 GFCI systems for shock protection implemented almost 100% in STM32G4 firmware and internal peripherals, since there's really no way to do it in pure hardware anymore. It used to be just some opamps and comparators, but times are changing: the new requirements are much too tight and complex nowadays.

tggzzz:

--- Quote from: uer166 on August 15, 2022, 05:37:49 pm ---
--- Quote from: tggzzz on August 14, 2022, 06:43:01 pm ---
--- Quote from: uer166 on August 14, 2022, 12:50:32 am ---
--- Quote from: Siwastaja on August 13, 2022, 06:11:29 am ---My programs have all basically had empty while(1); loops for years, I do all in ISRs after init.

--- End quote ---

I used to be a proponent for co-operative schedulers (no interrupts allowed at all). But with the exception of some regulatory and reliability edge cases, the nested ISR way you mention is generally easier to understand, and can still be designed to pretty high assurance levels.

--- End quote ---

If the customer doesn't need it to be reliable, then I can deliver something arbitrarily fast, cheap, and soon :)

I've seen nested ISRs become hairy and unpredictable, especially where realtime constraints and long-duration processing is involved.

Short ISRs generating events handled by a cooperative scheduler seems a good middle ground. Such events are processed identically to events generated by applications within the cooperative scheduler.

--- End quote ---

The issue with this is that you've erased most of the guarantees of a "true" co-op scheduler once you add an interrupt. Namely stuff like:

* No issues with data consistency. This means no physical way to get corrupt data due to non-atomicity
* Fully deterministic execution: you can always say that at time T+<arbitrary ms>, you're executing task X
* Worst case analysis triviality given some constraints
* Easy runtime checking of health, such as a baked-in task sequence check, deadline check, etc etc
The problem with true co-op of course is the massive headache that it causes when you try to implement complex stuff: everything needs to be synchronized to the tick rate, no task can take longer to execute than the tick rate, total ban on interrupts, etc. Lots of fast stuff needs to be delegated to pure hardware. A lot can be done in this architecture, but even stuff like life-saving UL991/UL1998 GFCI equipment doesn't need this kind of assurance. Maybe a jet turbine FADEC is a different situation of course..

--- End quote ---

None of those points is useful; they are all true in any embedded realtime system! (Exception: xCORE+xC systems with their (regrettably) unique architecture concepts and design-time guarantees. See my other posts for those!)

The point is to ensure the ISRs merely read the peripheral and atomically insert a message in a queue; that queue is atomically read by the cooperative scheduler whenever it chooses what to do next.

Now I know you can't do such atomic actions in most C standards. In that case assembler is required, but that is eminently tractable since it executes only in a few well-designated places.

See my other posts about interrupt and task priorities; there should be two of each: normal and panic.
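A rough sketch of that ISR-to-scheduler queue (names invented; it assumes a single-core MCU where aligned 32-bit loads and stores are atomic - otherwise reach for C11 atomics or the short assembler sequences mentioned above):

--- Code: ---
#include <stdint.h>
#include <stdbool.h>

/* Minimal single-producer/single-consumer event queue: the ISR is the
   only writer of 'head', the scheduler loop the only writer of 'tail'. */

#define QUEUE_LEN 16u                   /* power of two */

typedef struct { uint8_t type; uint8_t data; } event_t;

static event_t           queue[QUEUE_LEN];
static volatile uint32_t head;          /* written only by the ISR */
static volatile uint32_t tail;          /* written only by the main loop */

/* Called from the ISR: read the peripheral, post an event, return. */
bool event_post(event_t ev)
{
    uint32_t h = head;
    if (h - tail == QUEUE_LEN)
        return false;                   /* queue full: count it, don't block */
    queue[h % QUEUE_LEN] = ev;
    head = h + 1;                       /* publish only after the payload is
                                           written; some toolchains need a
                                           compiler barrier here */
    return true;
}

/* Called from the cooperative scheduler whenever it decides what to do next. */
bool event_get(event_t *ev)
{
    uint32_t t = tail;
    if (t == head)
        return false;                   /* nothing pending */
    *ev = queue[t % QUEUE_LEN];
    tail = t + 1;
    return true;
}
--- End code ---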

uer166:

--- Quote from: tggzzz on August 15, 2022, 06:38:27 pm ---None of those points is useful; they are all true in any embedded realtime system!

--- End quote ---

Huh? How are they not useful if they can create a provably correct realtime system? None of those points is true normally: you don't get any of those points/guarantees in a run-of-the-mill embedded system. With time-triggered co-op you're getting true determinism at the expense of it being an absolute dog to design for and being generally inflexible/fragile, but that is a trade-off you might want to make in some contexts.

Think about it: as you execute your code in a co-op scheduler, there is nothing that can interrupt your instructions, change control flow, or modify any state. This reduces the overall state space of your system by many orders of magnitude. As soon as you have even one interrupt, your tasks can get interrupted at any point in control flow, and all those guarantees go out the window. TT co-op more-or-less erases a very large subset of hard-to-reproduce bugs, which means you don't need to mitigate or deal with them at all.

Now is that something you'd do as a matter of course? Hell no, it's in a similar category to provably correct, fully statically analyzed systems: it's limiting, difficult, slow, sub-optimal in terms of resource use, and inflexible. P.s.: quit shilling for xCORE, I've read enough about it and it's uninteresting to me and probably to most people on these forums.
