Individual device consumption from aggregate instantaneous power timeseries.
paulca:
Bit of a mouthful. 

I have a detailed time series of instantaneous Watts being consumed.  I have a few data analytics tools.  How hard can it be, right?

Skipping over a lot of steps, the source data looks like this:


--- Code: ---+-------------------+----------+------+
|               time|       key| value|
+-------------------+----------+------+
|1678892412041096000|mainsPower|382.96|
|1678892442053409000|mainsPower|393.56|
|1678892448316177000|mainsPower|418.38|
|1678892458305550000|mainsPower|398.81|
|1678892472059483000|mainsPower|391.73|
|1678892502071421000|mainsPower|390.59|
|1678892532074947000|mainsPower|389.02|
|1678892562082924000|mainsPower|390.51|
|1678892592156734000|mainsPower|403.92|
|1678892622162091000|mainsPower|   392|
|1678892638103360000|mainsPower|379.91|
|1678892651098937000|mainsPower|399.94|
|1678892652170208000|mainsPower|393.23|
|1678892682185678000|mainsPower|387.42|
|1678892712182986000|mainsPower|387.84|
|1678892729019943000|mainsPower|466.35|
|1678892733018366000|mainsPower| 398.8|
|1678892742194999000|mainsPower|396.54|
|1678892753001237000|mainsPower|460.11|
|1678892756999136000|mainsPower|408.65|
+-------------------+----------+------+

--- End code ---

So far I have converted the epoch time to a proper timestamp and derived the chronological deltas in the wattage, a rounded version, and their abs() value.
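
Roughly, in PySpark terms (simplified sketch; assume the raw DataFrame df has the time/key/value columns above, and the derived column names are approximate):

--- Code: ---from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Sketch only: actual column names may differ.
w = Window.partitionBy("key").orderBy("time")

df2 = (df
       # nanosecond epoch -> seconds -> timestamp
       .withColumn("timestamp", (F.col("time") / 1e9).cast("timestamp"))
       # chronological delta in the wattage, plus rounded and abs() versions
       .withColumn("delta", F.col("value") - F.lag("value").over(w))
       .withColumn("delta_2", F.round("delta", 0))
       .withColumn("abs_delta", F.abs("delta_2")))

df2.orderBy("abs_delta").show(truncate=False)

--- End code ---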

Ordering by the ABS value does start to show promise:


--- Code: ---+-------------------+----------+-------+-----------------------+-------------------+-------+---------+
|time               |key       |value  |timestamp              |delta              |delta_2|abs_delta|
+-------------------+----------+-------+-----------------------+-------------------+-------+---------+
|1678973146796417000|mainsPower|492.91 |2023-03-16 13:25:46:000|-100.07999999999998|-100.0 |100.0    |
|1678910098765749000|mainsPower|434.95 |2023-03-15 19:54:58:000|-100.00000000000006|-100.0 |100.0    |
|1678955387327490000|mainsPower|643.03 |2023-03-16 08:29:47:000|100.17999999999995 |100.0  |100.0    |
|1678971862984229000|mainsPower|459.95 |2023-03-16 13:04:22:000|-100.02000000000004|-100.0 |100.0    |
|1678972967735389000|mainsPower|599.19 |2023-03-16 13:22:47:000|100.70000000000005 |101.0  |101.0    |
|1678973512903132000|mainsPower|498.69 |2023-03-16 13:31:52:000|-100.84999999999997|-101.0 |101.0    |
|1678972043090063000|mainsPower|561.52 |2023-03-16 13:07:23:000|-100.78999999999996|-101.0 |101.0    |
|1678975416882210000|mainsPower|523.49 |2023-03-16 14:03:36:000|-101.36000000000001|-101.0 |101.0    |
|1678972293340457000|mainsPower|456.83 |2023-03-16 13:11:33:000|-101.12000000000006|-101.0 |101.0    |
|1678972548598568000|mainsPower|442.36 |2023-03-16 13:15:48:000|-100.73000000000002|-101.0 |101.0    |
|1678971751988803000|mainsPower|466.29 |2023-03-16 13:02:31:000|-101.19999999999999|-101.0 |101.0    |
|1678972552598249000|mainsPower|525.47 |2023-03-16 13:15:52:000|-100.74000000000001|-101.0 |101.0    |
|1678972664635737000|mainsPower|518.88 |2023-03-16 13:17:44:000|-100.70000000000005|-101.0 |101.0    |
|1678970718564290000|mainsPower|533.16 |2023-03-16 12:45:18:000|-100.64999999999998|-101.0 |101.0    |
|1678972724653133000|mainsPower|438.7  |2023-03-16 13:18:44:000|-101.30000000000001|-101.0 |101.0    |
|1678970564699165000|mainsPower|2362.87|2023-03-16 12:42:44:000|101.58999999999969 |102.0  |102.0    |
|1678975831698850000|mainsPower|494.92 |2023-03-16 14:10:31:000|-101.78000000000003|-102.0 |102.0    |
|1678972391481375000|mainsPower|585.24 |2023-03-16 13:13:11:000|102.29000000000002 |102.0  |102.0    |
... snip...

--- End code ---

Without getting into exotics, and without even aiming at the end goal yet, I tried to pull some histograms and statistics together.

De-noise: truncate the abs deltas to 2 significant figures. 
Group by the resulting deltas and sum their non-absed real values (i.e. the signed deltas). Order ascending.
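
In (Py)Spark terms, something like this (sketch; the log10/pow trick is just one way to get 2 significant figures):

--- Code: ---# 2 significant figures: scale by the order of magnitude, round, scale back.
# (Spark has no built-in sig-fig round; log10 needs abs_delta > 0.)
mag = F.pow(F.lit(10.0), F.floor(F.log10("abs_delta")) - 1)

coherence = (df2
             .filter(F.col("abs_delta") > 0)
             .withColumn("sig_fig_abs_delta",
                         F.round(F.col("abs_delta") / mag) * mag)
             .groupBy("sig_fig_abs_delta")
             .agg(F.sum("delta").alias("sum_delta"))
             # lower |sum| = the +/- deltas cancel = more "coherent"
             .orderBy(F.abs("sum_delta")))

coherence.show()

--- End code ---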

This gives me a much more encouraging view.  I call this statistic the "coherence" across that group.  Lower is better.

--- Code: ---+-----------------+-------------------+
|sig_fig_abs_delta|         sum(delta)|
+-----------------+-------------------+
|            380.0|-0.8699999999999477|
|            910.0| 1.7599999999998772|
|            870.0|  1.849999999999909|
|            780.0|    3.2800000000002|
|           1000.0|-3.5200000000000955|
|           1800.0|-3.7000000000000455|
|            200.0|  7.199999999999875|
|            230.0|  8.620000000000118|
|           1300.0|-22.550000000000637|
|            150.0| -118.9099999999998|
|            180.0|  176.1400000000001|
|            340.0|            -344.62|
|            350.0| 348.02000000000004|
|            360.0| 357.81000000000006|
|            370.0| -368.5899999999999|
|            390.0| 393.13000000000005|
|            400.0|  396.7499999999998|
|            210.0| 407.14000000000044|
|            410.0|            -410.78|
|            440.0| -442.4100000000001|
+-----------------+-------------------+
only showing top 20 rows

--- End code ---

I also counted the various instances of the 2-sig-fig-abs-deltas.
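
That's just a count over the same grouping (sketch, reusing df2 and mag from above):

--- Code: ---counts = (df2
          .filter(F.col("abs_delta") > 0)
          .withColumn("sig_fig_abs_delta",
                      F.round(F.col("abs_delta") / mag) * mag)
          .groupBy("sig_fig_abs_delta")
          .count()
          .orderBy(F.desc("count")))

counts.show()

--- End code ---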


--- Code: ---+-----------------+-----+
|sig_fig_abs_delta|count|
+-----------------+-----+
|            110.0|   46|
|            150.0|   41|
|            100.0|   39|
|            120.0|   36|
|           1400.0|   33|
|            140.0|   27|
|            170.0|   27|
|           1500.0|   26|
|            160.0|   24|
|            130.0|   24|
|            180.0|   23|
|            210.0|   22|
|           2200.0|   20|
|            190.0|   19|
|            200.0|   16|
|           1700.0|   16|
|            220.0|   14|
|            230.0|   10|
|           1800.0|   10|
|            250.0|    9|
+-----------------+-----+
only showing top 20 rows

--- End code ---

The catch is, neither actually says much individually.  I need to combine them... one moment please...
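
The combination is just a join of the two groupings plus a count / |sum| score (sketch; a high score means lots of events that net out to almost nothing, which is what a repeating on/off device should look like):

--- Code: ---combined = (coherence.join(counts, "sig_fig_abs_delta")
            .withColumn("count_over_sum",
                        F.col("count") / F.abs("sum_delta"))
            .orderBy(F.desc("count_over_sum")))

combined.show()

--- End code ---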


--- Code: ---+-----------------+------------------+-----+--------------------+
|sig_fig_abs_delta|        sum(delta)|count|      count_over_sum|
+-----------------+------------------+-----+--------------------+
|            200.0| 7.199999999999875|   16|   2.222222222222261|
|            230.0| 8.620000000000118|   10|   1.160092807424578|
|            910.0|1.7599999999998772|    2|  1.1363636363637157|
|            870.0| 1.849999999999909|    2|  1.0810810810811342|
|            780.0|   3.2800000000002|    2|  0.6097560975609384|
|            180.0| 176.1400000000001|   23| 0.13057794935846478|
|            210.0|407.14000000000044|   22|0.054035466915557245|
|            160.0|1578.8199999999997|   24|0.015201226232249404|
|            170.0|           1867.13|   27|0.014460696362867074|
|            220.0|1330.0100000000004|   14|0.010526236644837253|
|            360.0|357.81000000000006|    3|0.008384338056510437|
|            260.0|            777.47|    5| 0.00643111631316964|
|            240.0|1193.9099999999999|    7|0.005863088507508942|
|            270.0|           1077.95|    6|0.005566120877591725|
|            330.0| 981.5700000000002|    5|0.005093880212312926|
|           1200.0|           1140.96|    5|0.004382274575795821|
|            250.0|           2247.42|    9|0.004004591932082121|
|            280.0|           1113.02|    4|0.003593825807263122|
|            310.0|             625.8|    2|0.003195909236177...|
|            320.0| 644.7900000000001|    2|0.003101785077311993|
+-----------------+------------------+-----+--------------------+
only showing top 20 rows

--- End code ---

Interesting.  Some candidates.

The next approach has to be more exotic: actually matching up pairs of + and - deltas in time and seeing how that plays out with some "sanity test" stats like the above.
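
Something like this greedy matcher, maybe (plain Python, untested sketch; tol is the +/- grace allowed between an on-step and its off-step):

--- Code: ---# events: chronological list of (timestamp, delta) pairs.
def pair_events(events, tol=0.05):
    open_on = []   # +deltas still waiting for a matching -delta
    pairs = []
    for ts, d in events:
        if d > 0:
            open_on.append((ts, d))
        elif d < 0:
            # claim the earliest open +delta of similar magnitude
            for i, (ts_on, d_on) in enumerate(open_on):
                if abs(d_on + d) <= tol * d_on:
                    pairs.append(((ts_on, d_on), (ts, d)))
                    open_on.pop(i)
                    break
    return pairs   # each pair is a candidate device on/off cycle

--- End code ---

The matched pairs (and whatever is left unmatched) could then go through the same sort of sanity-check stats as above.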

Anyone worked on stuff like this?  Any "information theorists" who can help with the more exotic pivots etc.?

I believe the companies who normally offer this stuff do it by running machine learning on datasets from all their customers and devices, publishing likely candidates to end devices but allowing users to override them.

Surely it should be achievable somehow with a single user's data set.  I currently have only 1 day at max resolution before it's downsampled.  I might expand that to a month to give me more of a distribution for analysis like this.
paulca:
Actually, I need to work backwards towards the data source to work out what its actual sampling and max message rates are.  If it's limiting its updates to "on-changed and 1 message every 30 seconds", I can flash it with Tasmota and fix that!
paulca:
These two:

--- Code: ---|            200.0| 7.199999999999875|   16|   2.222222222222261|
|            230.0| 8.620000000000118|   10|   1.160092807424578|

--- End code ---

They have a high count, and the count is an event count.*  When all of those instances are added up, including noise and errors, they are asymmetric by only 7 or 8 watts: 16 events at the ~200 W level netting to just 7.2 W means the + and - steps cancel to within about half a watt per event.

These HAVE to be devices... or it's just a data anomaly and me cherry-picking what I want to see.

* an uneven count needs a +/- delta grace margin?