Author Topic: Event driven cadence and starvation.  (Read 388 times)


Offline paulca (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4318
  • Country: gb
Event driven cadence and starvation.
« on: November 12, 2024, 01:15:20 pm »
Thinking out loud ....

I have a flow in an asynchronous message bus (MQTT) system which is experiencing oscillations due to event starvation.

The original design allowed the sensors to govern the event cadence.  This worked absolutely fine when all my temperature sensors output every 10 seconds.

I have replaced them with battery operated sensors which are a lot more frugal about their transmissions and will not resend a "same value" twice within (by default) 10 minutes.

My system uses transient, expiring state to control the downstream.  So, when a temperature arrives for a zone and heating is required, a message is published to demand heating with an expiry of 1 minute.  If that requires a radiator, it gets switched on for 3 minutes.  If it requires the boiler (it will), that gets switched on for 10 minutes.  Every time a temperature arrives for that zone while there are active "timers", the timers are reset.  Once updates stop, the schedule stops requesting heating, and the demands and control states expire.

These 1min, 3min, 10min are the core "temporal hysteresis" preventing "oscillation around the trigger".  There are other forms of hysteresis not relevant.

So, the 1min timeout doesn't really affect anything on its own.  It has to expire first before the timers for the radiator and boiler start counting down.  So a single "below spec" temp which causes at least one schedule to request heating will keep any relevant radiator on for a minimum of 4 minutes.  The boiler will be on for a minimum of 11 minutes.
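
A rough sketch of that timer arithmetic, and of what happens when readings get sparse. It assumes every reading is "below spec" and that each reading re-arms the device for the demand expiry plus the device hold time. The function name `on_intervals` and its parameters are made up for illustration and are not taken from the real schedules.

```python
def on_intervals(reading_times_s, demand_expiry_s, hold_s):
    """Return merged (on, off) intervals, in seconds, for one device.

    Assumption: each below-spec reading at time t keeps the device on
    until t + demand_expiry_s + hold_s, and a reading that arrives while
    the device is still on simply extends the current interval.
    """
    intervals = []
    for t in sorted(reading_times_s):
        off = t + demand_expiry_s + hold_s
        if intervals and t <= intervals[-1][1]:
            start, prev_off = intervals[-1]
            intervals[-1] = (start, max(prev_off, off))
        else:
            intervals.append((t, off))
    return intervals


# Two readings 10 minutes apart (a sensor that only resends every 10 min).
print(on_intervals([0, 600], 60, 180))  # radiator: 1 min demand + 3 min hold
print(on_intervals([0, 600], 60, 600))  # boiler:   1 min demand + 10 min hold
```

Under these assumptions the boiler's 11 minutes bridges the 10 minute gap, but the radiator switches off at 4 minutes and has to wait for the next reading, which is the starvation described above.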

Originally I set all the temp sensors' config to send "at least every 5 minutes".  Trouble is, I have quite a few of these now and I am replacing at least one battery every few months.  That's not a problem in itself, but usually they lose their config when you swap the battery and reconfiguring them is a PITA.  So, some slip through and go back to only updating every 10mins if the temp doesn't change.

I want to decouple the "cadence" from the sensors' transmission rate.  So I basically need either a synthetic event that causes all schedules to rerun against all retained values, or a "repeater" service on the bus which will retransmit temperature data more frequently.  I "could" make the schedules "poll" retained state, but I still want to keep the instant response to events.

The repeater service sounds more fitting, as then nothing else needs to know it exists.  Compartmentalisation of responsibility.

The repeater service will subscribe to the tree of topics needing synthetic events of retained state.
When a value is received its timestamp is checked for expiry.  If expired it's dropped.  If not expired it is retained in a map.
The service will periodically wake a thread and process the contents of the map: data which still hasn't expired will be retransmitted, and data that has expired will be removed from the map.  (EDIT:  The service shall not receive, or shall ignore, its own retransmissions!  EDIT:  resent="true" ;))

An instance of the service will take a topic expression to operate on, a period to refresh data over, and the "max age" of data to retransmit before discarding.

That period can be 1 minute, giving me a near guaranteed 1 minute cadence on the consumers.

The max age for a domestic air temp can be 30 minutes.  The metric here is not how fast the temperature can change, but how slowly it can.  30mins gives the heating system enough time to have responded.  Even if the temperature remains the same, and the sensor never resends the same value, we will keep assuming the temperature has not changed and so we keep the heating on.  If it still hasn't changed after 30mins, something is broken, so the value will expire and the heating will turn off.
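
A minimal sketch of the repeater logic described above, kept independent of any MQTT client library. It assumes messages are dicts carrying a "timestamp" in epoch seconds and that retransmissions are marked with a "resent" flag. The class name, the `publish` callback, the injectable `clock` and the topic name are all hypothetical.

```python
import time


class Repeater:
    """Caches the latest unexpired message per topic and re-publishes it."""

    def __init__(self, publish, max_age_s, clock=time.time):
        self._publish = publish      # callable(topic, message), supplied by caller
        self._max_age_s = max_age_s  # e.g. 30 minutes for domestic air temp
        self._clock = clock          # injectable so the logic can be tested
        self._latest = {}            # topic -> most recent live message

    def _expired(self, message):
        return self._clock() - message["timestamp"] > self._max_age_s

    def on_message(self, topic, message):
        """Handle one message arriving from the bus."""
        if message.get("resent"):
            return                   # ignore our own retransmissions
        if self._expired(message):
            return                   # already too old: drop it
        self._latest[topic] = message

    def refresh(self):
        """Run once per period: resend live data, forget expired data."""
        for topic, message in list(self._latest.items()):
            if self._expired(message):
                del self._latest[topic]
            else:
                # Keep the original timestamp so age is still measured
                # from the real sensor reading.
                self._publish(topic, {**message, "resent": True})


# Usage with a fake clock and a list standing in for the bus.
sent = []
now = [1000.0]
repeater = Repeater(lambda t, m: sent.append((t, m)),
                    max_age_s=30 * 60, clock=lambda: now[0])
repeater.on_message("home/lounge/temperature",
                    {"value": 19.5, "timestamp": 1000.0})
now[0] += 60
repeater.refresh()       # reading is 1 minute old: resent
now[0] += 30 * 60
repeater.refresh()       # reading is past max age: removed, not resent
print(len(sent))
```

In a real service `refresh()` would be driven by a timer at the chosen period (1 minute here) and `publish` would wrap whatever bus client is in use.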

I'm going to start writing a book on "How to overcomplicate domestic heating control".
« Last Edit: November 12, 2024, 01:20:32 pm by paulca »
"What could possibly go wrong?"
Current Open Projects:  STM32F411RE+ESP32+TFT for home IoT (NoT) projects.  Child's advent xmas countdown toy.  Digital audio routing board.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9095
  • Country: fi
Re: Event driven cadence and starvation.
« Reply #1 on: November 12, 2024, 01:34:56 pm »
Quote
I'm going to start writing a book on "How to overcomplicate domestic heating control".

I think you summarized it pretty well :-+

Probably you are overqualified for the task and ended up designing an overmodular, overdistributed system. And based on something as finicky and underwhelming as MQTT :-DD

I think, if you want to keep pursuing complexity, the next step is to reinvent MQTT, you could clearly have more features on that.

But seriously though, doing computation and decision making centrally would hugely simplify everything: you get simple access to all relevant state. I have seen a pretty decent HVAC control system which is one huge Python script.

Then again, I don't understand why you don't just simply store the latest temperature measurement where it is being received (and where the decision is made). Only watch the message age for producing warnings.

And there are much saner ways to prevent control loop oscillations than just ensuring a minimum on-time. P control has over a hundred years of successful legacy in HVAC; every mechanical thermostat has an electromechanical P controller, 99% of people just never realize it. So if you are close to the setpoint, run PWM. Combat slow sensor response by coupling a fast ramp into the signal, exactly like the old mechanical thermostats do (when the heater switches on, a small resistor starts heating up the sensor). If you are far too cold or far too hot, then the load is on or off 100% of the time, but the closer you get to the setpoint, the closer you get to a 50% (or so) duty cycle. The period can be anything suitable, like 60 seconds, depending on what you control (actual boiler, or mixing valve, or electric heater).
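
A small sketch of that time-proportioning idea, assuming a purely proportional law centred on 50% duty at the setpoint. The function names, the 2 degree proportional band and the 60 second period are illustrative choices, not values from any particular controller.

```python
def duty_cycle(setpoint_c, measured_c, band_c=2.0):
    """Proportional duty cycle: 0.5 at the setpoint, clamped to 0..1.

    band_c is the full width of the proportional band in degrees C:
    half a band too cold gives 100%, half a band too hot gives 0%.
    """
    error = setpoint_c - measured_c
    duty = 0.5 + error / band_c
    return min(1.0, max(0.0, duty))


def on_seconds(setpoint_c, measured_c, period_s=60.0, band_c=2.0):
    """Seconds the load is switched on within one PWM period."""
    return duty_cycle(setpoint_c, measured_c, band_c) * period_s


print(duty_cycle(21.0, 18.0))   # far too cold: fully on
print(duty_cycle(21.0, 21.0))   # at setpoint: about half on
print(on_seconds(21.0, 20.5))   # slightly cold: on for most of the period
```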
« Last Edit: November 12, 2024, 01:40:12 pm by Siwastaja »
 

Offline Postal2

  • Frequent Contributor
  • **
  • Posts: 731
  • Country: ru
Re: Event driven cadence and starvation.
« Reply #2 on: November 12, 2024, 02:13:57 pm »
 

Offline paulca (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4318
  • Country: gb
Re: Event driven cadence and starvation.
« Reply #3 on: November 13, 2024, 11:30:09 am »
Quote
I'm going to start writing a book on "How to overcomplicate domestic heating control".
I think, if you want to keep pursuing complexity, the next step is to reinvent MQTT, you could clearly have more features on that.

Already been there.  MQTT does not timestamp messages and provides no way to do so.  It will "retain" messages though.  So, how do you know how old the message is?  You don't.  So I already have a proxy component which translates all the messages into a "normalized" form.  Basically converting them to single value messages with a "value" field, a timestamp field and a few other conventions and nice to haves.
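
As a sketch of what such a normalising proxy might do, assuming the raw payload is a bare numeric string and the normalised form is JSON. Only the "value" and timestamp fields come from the description above; the function names and the exact field name "timestamp" are guesses for illustration.

```python
import json
import time


def normalize(raw_payload, clock=time.time):
    """Wrap a bare sensor payload as a single-value, timestamped message."""
    return json.dumps({"value": float(raw_payload), "timestamp": clock()})


def age_seconds(message_json, clock=time.time):
    """Age of a normalised message, which also works for retained ones."""
    return clock() - json.loads(message_json)["timestamp"]


# A reading stamped at t=100 and inspected a minute later.
msg = normalize("19.5", clock=lambda: 100.0)
print(msg)
print(age_seconds(msg, clock=lambda: 160.0))
```

With the timestamp carried in the payload, a consumer that receives a retained message on subscribe can tell how stale it is.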

Quote
But seriously though, doing computation and decision making centrally would hugely simplify everything: you get simple access to all relevant state. I have seen a pretty decent HVAC control system which is one huge Python script.

Then again, I don't understand why you don't just simply store the latest temperature measurement where it is being received (and where the decision is made). Only watch the message age for producing warnings.

The "last message on cache" for each topic of interest is stored on the consumers (schedules).  However, the schedules' only impetus to run is as an event handler on temperature data.

The control is "centralised" only insofar as they all share the same message bus and there are no restrictions on what data can be subscribed to.  So, for example, the automated lights not only know the motion trigger, they can also get the luminosity in that room (if available) and the garage roof solar panel output... to decide if it's day or night.

It is decentralised in that each nano-service runs as its own Python script and is a bus client.  Some "workflow" style services are purely bus consumers and publishers and touch nothing else.  Like the heating scheduler for example.  This is by design.  Making a single monolithic Python script would doubtless result in a large, sprawling, unmanageable nest of cross-dependent states.  I saw it coming and decided I would avoid it by forcing myself to only communicate via the network between "components".

They do end up running 95% on one box though.  A couple run elsewhere for non-functional reasons.

Quote
And there are much saner ways to prevent control loop oscillations than just ensuring a minimum on-time. P control has over a hundred years of successful legacy in HVAC;

I don't disagree on the functional side.  However, on the non-functional side of things there are overarching restrictions that prevent proper P control.  I assume you mean proportional or integral back-off algorithms in my software lingo.

The main one is that I have no proportional control over the primary boiler.  I have ON and OFF.  Therefore a good portion of my control flow is designed to not cycle a lot.  I already feel uncomfortable when it's cycling 20+ times a day and look for ways to avoid that.

The radiators are different.  They are just thermoelectric actuators.  They can be PWMed... although I currently control them with mechanical relays, which makes the PWM period about 1 minute, and even then it could become annoying to hear: click, clack, click, clack.

I do have set point ramping code, but it's designed to identify "upcoming" schedule changes and predict when to turn the heating on in advance to "head it off".  It just projects a potential heating performance ramp and checks to see if it will meet the target temp at the target time.  If it would overshoot, it takes no action.  If it would undershoot, it raises a request for heating.  This process effectively forms a coupled loop.  If the heating outperforms the "predicted" ramp, the future scheduler will stop requesting.  The loop will effectively back off as it approaches the point where the 'hard' schedule kicks in.
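
Roughly, that look-ahead check could be sketched as below, assuming a simple linear heating ramp. The function name, the parameters and the 2 degrees per hour ramp rate are invented for illustration and are not the actual scheduler code.

```python
def should_request_heating(current_c, target_c, seconds_to_target,
                           ramp_c_per_hour):
    """Project a linear heating ramp from now to the schedule change.

    If the projection would reach or overshoot the target there is still
    slack, so no action is taken. If it would undershoot, heating has to
    be requested now.
    """
    projected_c = current_c + ramp_c_per_hour * seconds_to_target / 3600.0
    return projected_c < target_c


# Target of 20 C, with an assumed ramp of 2 C per hour.
print(should_request_heating(17.0, 20.0, 7200, 2.0))  # 2 h left: still slack
print(should_request_heating(17.0, 20.0, 3600, 2.0))  # 1 h left: request heat
print(should_request_heating(18.5, 20.0, 3600, 2.0))  # room warmed faster than predicted: back off
```

Re-running the check on every temperature event gives the back-off behaviour: once the room is ahead of the projected ramp, the request is dropped until it falls behind again.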

More appropriately... If I buy an EMS controller for the boiler I can then modulate the loop temp and take advantage of the boiler's heat source modulation and pump modulation abilities.  Instead of turning the boiler OFF, I can proportionally back off the modulation as zones approach targets.

Again, however, there is a catch.  My boiler is oversized.  That's a common thing installers do here, as the price difference between a 20kW boiler and a 30kW one is a few hundred quid and they last 10-12 years.  So why not, right?

Well, the catch is, should you want to, say, run the heating loop at 40°C during the "edges of summer", then the larger boiler is unlikely to be able to modulate its flame down far enough and so will start to self-cycle instead.  Not efficient.  Stopping and starting the unit costs gas and maintenance.

So, I'm basically stuck with "long cycle" times on the boiler, and with making the flow control maximise the time the heating is on.
 

Offline paulca (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4318
  • Country: gb
Re: Event driven cadence and starvation.
« Reply #4 on: November 13, 2024, 11:42:20 am »
As to being overcomplicated.

What I could do is just set the minimum cadence to longer than the longest sensor output period.  Just make the heating stay on for 15 minutes and avoid starvation.  Give them a bigger stomach.
 

