EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: hamster_nz on November 18, 2020, 08:59:57 pm

Title: Measuring intervals...
Post by: hamster_nz on November 18, 2020, 08:59:57 pm
From http://www.madore.org/~david/computers/unix-leap-seconds.html :
Quote
one should consequently avoid taking the difference between values of gettimeofday() to measure delays. Instead, use clock_gettime(CLOCK_MONOTONIC,...): the gettimeofday() function should only be used to obtain the current date and time (as a wall clock), not to measure intervals (as a stopwatch) except if those intervals span over several months (or, perhaps more to the point, if they are expected to survive over a reboot).

I've never used clock_gettime(CLOCK_MONOTONIC,...). I will from now on.
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 18, 2020, 11:48:41 pm
Yup, it's one of those things that often get overlooked.

(Just like read() and write() returning a short count, or -1 with errno == EINTR if a signal is delivered to a handler installed without SA_RESTART flag.  We really, really need better POSIX C learning materials, concentrating on POSIX stuff like getline() instead of fgets(); nftw() or scandir() or glob() instead of opendir()/readdir()/closedir(), and so on. :rant:)
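
For reference, a minimal sketch of the kind of read() loop that handles both the short count and the EINTR case; this is purely illustrative, not taken from any particular codebase:
Code: [Select]
/* Illustrative sketch: a read() wrapper that handles short counts and
   EINTR (signal delivered to a handler installed without SA_RESTART). */
#include <errno.h>
#include <unistd.h>

ssize_t read_all(int fd, void *buf, size_t len)
{
    char   *p = buf;
    size_t  have = 0;

    while (have < len) {
        ssize_t n = read(fd, p + have, len - have);
        if (n > 0)
            have += (size_t)n;          /* short count: keep reading */
        else if (n == 0)
            break;                      /* end of input */
        else if (errno != EINTR)
            return -1;                  /* real error, errno already set */
        /* n == -1 && errno == EINTR: interrupted by a signal, just retry */
    }
    return (ssize_t)have;
}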

Whenever using clock_gettime(), the following utility functions come in handy:
Code: [Select]
#include <stdint.h>
#include <time.h>

static inline double difftimespec(const struct timespec after, const struct timespec before)
{
    return (double)(after.tv_sec - before.tv_sec)
         + (double)(after.tv_nsec - before.tv_nsec) / 1000000000.0;
}

static inline int64_t difftimespec_ns(const struct timespec after, const struct timespec before)
{
    return (int64_t)(after.tv_sec - before.tv_sec) * (int64_t)1000000000
         + (int64_t)(after.tv_nsec - before.tv_nsec);
}

static inline int64_t difftimespec_us(const struct timespec after, const struct timespec before)
{
    return (int64_t)(after.tv_sec - before.tv_sec) * (int64_t)1000000
         + (int64_t)(after.tv_nsec - before.tv_nsec) / 1000;
}

static inline int64_t difftimespec_ms(const struct timespec after, const struct timespec before)
{
    return (int64_t)(after.tv_sec - before.tv_sec) * (int64_t)1000
         + (int64_t)(after.tv_nsec - before.tv_nsec) / 1000000;
}
The first one is the equivalent of standard C difftime() but for struct timespec; the other three provide the difference in nanoseconds, microseconds, and milliseconds (rounding toward zero).  The casts are very deliberate: with them, the result is correct even if the time_t type is signed and wraps around.  (This holds on all POSIXy systems I know of.)  The nanosecond version works for intervals up to slightly over 292 years, the microsecond version slightly over 292,000 years, and the millisecond version slightly over 292,000,000 years, positive or negative.

The double version is nanosecond-precise for intervals up to 104 days, microsecond-precise up to 285 years, and millisecond-precise for almost 285,421 years.  The order of operations ensures that even when the .tv_sec values are large, there is no cancellation error, as only the difference is converted to floating point, not the times themselves.

Of course, the clocks themselves do not necessarily increment every nanosecond; one needs to use clock_getres() to find out the resolution of a particular clock.
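
For example (a quick sketch only, assuming the helpers above are pasted into the same file), one can measure an interval like this:
Code: [Select]
/* Sketch: timing a section of code with CLOCK_MONOTONIC and the helpers above. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec  res, before, after;

    /* Resolution of the clock (how finely it can actually tick). */
    if (clock_getres(CLOCK_MONOTONIC, &res) == 0)
        printf("CLOCK_MONOTONIC resolution: %lld.%09ld s\n",
               (long long)res.tv_sec, res.tv_nsec);

    clock_gettime(CLOCK_MONOTONIC, &before);
    /* ... the work being measured goes here ... */
    clock_gettime(CLOCK_MONOTONIC, &after);

    printf("Elapsed: %.9f s (%lld ns)\n",
           difftimespec(after, before),
           (long long)difftimespec_ns(after, before));
    return 0;
}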
Title: Re: Measuring intervals...
Post by: golden_labels on November 19, 2020, 08:32:39 am
If you are on Linux, you may be interested in CLOCK_BOOTTIME instead of CLOCK_MONOTONIC for measuring real world time differences. The difference comes if the system is suspended. But be aware that it’s not like CLOCK_BOOTTIME is “simply better” — both of them have their own proper uses.
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 08:36:55 am
Yes; the difference is that CLOCK_MONOTONIC does not advance while the system is suspended, but CLOCK_BOOTTIME does.

The most reliable documentation is available at the Linux man-pages project, https://man7.org/linux/man-pages/man3/clock_gettime.3.html.  This is the upstream for distribution man pages.
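
For illustration, here is a quick sketch (Linux-specific, since CLOCK_BOOTTIME is not in POSIX) that samples both clocks; their difference is roughly the total time the system has spent suspended since boot:
Code: [Select]
/* Sketch: sample CLOCK_MONOTONIC and CLOCK_BOOTTIME (Linux-specific).
   The difference is approximately the time spent suspended since boot. */
#define _GNU_SOURCE   /* for CLOCK_BOOTTIME on older glibc */
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec mono, boot;

    clock_gettime(CLOCK_MONOTONIC, &mono);
    clock_gettime(CLOCK_BOOTTIME, &boot);

    double suspended = (double)(boot.tv_sec - mono.tv_sec)
                     + (double)(boot.tv_nsec - mono.tv_nsec) / 1000000000.0;

    printf("Time suspended since boot: about %.3f seconds\n", suspended);
    return 0;
}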
Title: Re: Measuring intervals...
Post by: magic on November 19, 2020, 02:28:52 pm
Is CLOCK_BOOTTIME guaranteed to be monotonic, though?
I suppose interesting things could happen if, for example, Windows decides to adjust DST in the meantime.
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 04:32:59 pm
Yes, CLOCK_BOOTTIME is monotonic.

Windows standard libraries don't have clock_gettime(), though; it's POSIX stuff.  WSL has had several clock_gettime() related bugs, so if it yields garbage on WSL, I think it is just to be expected.  (Anyone relying on WSL for anything meaningful needs to get their noggin checked, anyway.  Same goes for relying on Wine on Linux to replicate Windows behaviour exactly, too.)
Title: Re: Measuring intervals...
Post by: magic on November 19, 2020, 04:56:20 pm
No, I ask what happens if the RTC is adjusted back while Linux is hibernated.
Hopefully it notices that something is amiss and doesn't decrement CLOCK_BOOTTIME on resume.

But it's one of those things I would like to try before trusting ;)
Title: Re: Measuring intervals...
Post by: golden_labels on November 19, 2020, 05:58:29 pm
CLOCK_BOOTTIME has exactly the same guarantees as CLOCK_MONOTONIC, but in addition keeps track of time while the system is hibernated.

For practical reasons the question of what happens if you modify the RTC is irrelevant. At this point you have intentionally introduced an error into the underlying environment. You might as well overwrite a fragment of kernel memory with random data and ask if CLOCK_BOOTTIME or CLOCK_MONOTONIC remains monotonic. If the RTC is invalid, all timekeeping becomes unreliable.

If you are asking to satisfy your curiosity, you may TIAS™, but any answer you get will apply only to that particular test. Since neither POSIX nor Linux gives any guarantees about how timekeeping behaves in the face of damage to the platform, to stay on the safe side one must assume it gets borked.
Title: Re: Measuring intervals...
Post by: SiliconWizard on November 19, 2020, 06:00:30 pm
Quote
Yes, CLOCK_BOOTTIME is monotonic.

Windows standard libraries don't have clock_gettime(), though; it's POSIX stuff.  WSL has had several clock_gettime() related bugs, so if it yields garbage on WSL, I think it is just to be expected.  (Anyone relying on WSL for anything meaningful needs to get their noggin checked, anyway.  Same goes for relying on Wine on Linux to replicate Windows behaviour exactly, too.)

As said above, CLOCK_MONOTONIC or CLOCK_BOOTTIME  - it all depends on what you want to achieve. For code profiling/benchmarking for instance, CLOCK_MONOTONIC would be the way to go.

I have written a small library for cross-platform timing. On Windows it uses QueryPerformanceFrequency() and QueryPerformanceCounter(), which is the closest equivalent I have found. It works well.
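
Something along these lines, perhaps; this is only a rough sketch of such a wrapper, not SiliconWizard's actual library, and the function name is made up for illustration:
Code: [Select]
/* Rough sketch of a cross-platform monotonic timer (illustrative only). */
#if defined(_WIN32)
#include <windows.h>

double monotonic_seconds(void)
{
    static LARGE_INTEGER freq;              /* counter ticks per second */
    LARGE_INTEGER now;

    if (freq.QuadPart == 0)
        QueryPerformanceFrequency(&freq);   /* query once, it never changes */
    QueryPerformanceCounter(&now);
    return (double)now.QuadPart / (double)freq.QuadPart;
}
#else
#include <time.h>

double monotonic_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1000000000.0;
}
#endif
An interval is then just the difference of two monotonic_seconds() calls.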
Title: Re: Measuring intervals...
Post by: golden_labels on November 19, 2020, 06:25:54 pm
More examples:
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 06:38:45 pm
The difference between CLOCK_BOOTTIME and CLOCK_MONOTONIC is adjusted by kernel/time/timekeeping.c:__timekeeping_inject_sleeptime(), which indeed only increments the difference, never decrements it.  It is called by kernel/time/timekeeping.c:timekeeping_resume() (https://elixir.bootlin.com/linux/v5.9.9/source/kernel/time/timekeeping.c#L1675), which uses one of three sources to measure/estimate the suspended time: a clock source active in suspend, a persistent clock, or RTC, in this order of preference.

If the RTC somehow went backwards – say, it lost power and reset to zero – then CLOCK_BOOTTIME would not include the suspend/hibernation duration; it would be as if the suspend duration were zero.  In no case would either CLOCK_BOOTTIME or CLOCK_MONOTONIC go backwards.

There is quite some interesting logic in how the suspend duration is measured/estimated.  It is not exactly precise (especially because the RTC has only one-second resolution), so the code tries hard to avoid introducing drift and the like.

Hibernation (suspend to disk) is implemented in Linux as a variation on suspend, so the above applies to hibernation also.

Note: I'm not saying there are no bugs in the Linux kernel in this area; I am saying that the kernel does guarantee CLOCK_BOOTTIME and CLOCK_MONOTONIC monotonicity, and if they are ever observed going backwards, it is a kernel bug that the developers clearly want to fix.  (They've done quite a lot of kernel timekeeping work in the last few years, including upgrading to 64-bit time: signed 64-bit integer with only positive values valid, giving about 292 year range at nanosecond precision, in theory.  Current implementation has range limits, as in certain situations a scaled value is maintained, avoiding a costly division per update.)
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 06:58:20 pm
Quote
Data acquisition with timestamps should record or detect weirdly large time steps, as that may indicate data was lost, delayed, queued or — if the timing is data itself — it may be garbage. Using CLOCK_BOOTTIME detects all this, including hibernation.
Yes - and preferably recorded/stored as the difference to CLOCK_BOOTTIME at data acquisition start.

It is very useful to record that difference either as a human-readable decimal number with nine decimal digits, or as a 64-bit (signed or unsigned) integer count of nanoseconds.  In the header/summary section, report the data acquisition start time, of course.  I prefer a variant of the ISO 8601 format, namely YYYY-MM-DD HH:MM:SS.sssssssss in UTC.  (The ISO 8601 equivalent would be, for example, YYYY-MM-DDTHH:MM:SSZ.)
The fields are listed in decreasing order of significance, just like the digits of a decimal number, so it even sorts correctly as text.  Local conventions be damned, data acquisition is science.
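For what it's worth, producing that format from CLOCK_REALTIME is straightforward; a minimal sketch:
Code: [Select]
/* Sketch: print the current CLOCK_REALTIME time as
   "YYYY-MM-DD HH:MM:SS.sssssssss" in UTC. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec  now;
    struct tm        utc;
    char             stamp[32];

    clock_gettime(CLOCK_REALTIME, &now);
    gmtime_r(&now.tv_sec, &utc);                       /* UTC, not local time */
    strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &utc);
    printf("%s.%09ld\n", stamp, (long)now.tv_nsec);
    return 0;
}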

If data size is an issue, then using a 32-bit (signed/unsigned) integer at microsecond precision gives you about a 35/71 minute range, or about 24/49 days at millisecond precision. (At nanosecond precision the range is only just over two/four seconds, so not very useful.)
Title: Re: Measuring intervals...
Post by: magic on November 19, 2020, 07:15:26 pm
Quote
__timekeeping_inject_sleeptime(), which indeed only increments the difference, never decrements it.
Right, it seems that timespec64_valid_strict will reject negative values.

Quote
For practical reasons the question of what happens if you modify the RTC is irrelevant. At this point you have intentionally introduced an error into the underlying environment.
No shit, man. People do it all the time, for reasons ranging from ignorance to the well-calculated and outright nefarious. Software had better deal with it.

Besides, I gave a pretty innocent scenario in which it may happen: Linux is hibernated, Windows runs and enables DST or changes the time zone or whatever. Resume Linux, oopsie.
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 08:28:58 pm
Dual-booting Windows with Linux was historically rife with those RTC issues.  Eventually Linux devs just punted and added support for maintaining RTC in local time.

Nowadays, it is enough to run 'sudo timedatectl set-local-rtc 1' (and 'sudo timedatectl set-timezone Europe/Helsinki' to set the zone) to ensure Linux maintains the RTC in local time (while internally still working off UTC).  Then, if one hibernates Linux, boots into Windows, then reboots/hibernates Windows and wakes up in Linux, the RTC is maintained correctly in both OSes, and BOOTTIME jumps by the duration (to within one second or so) relative to MONOTONIC.

Well, possibly plus or minus one hour, if the local time zone observes Daylight Saving Time and Linux wakes up during the changeover.
Title: Re: Measuring intervals...
Post by: magic on November 19, 2020, 08:57:17 pm
Arch Linux has supported local time RTC since forever but it's madness. Besides problems with DST, try changing the time zone in one (but only one) OS and see what happens ::)
When I used to dual boot, I always switched Win$h!t to UTC instead.
Title: Re: Measuring intervals...
Post by: golden_labels on November 19, 2020, 09:11:28 pm
Users may do many weird or unsafe things. I never claimed that no one changes the RTC. I am saying that, from the perspective of Linux, doing so is no different from breaking the underlying platform, and that there are no guarantees about the behavior of either clock.

Whichever you choose — the local time or UTC for the RTC — it is expected not to change and to always represent approximately that time. This goes way beyond the monotonicity of CLOCK_MONOTONIC and CLOCK_BOOTTIME. If the UTC time is not monotonic then, for example, filesystems will contain timestamps from the future, certificate verification is broken, updates may be delayed, the node will be useless in a distributed system, or sessions on web servers will not work as expected.
Title: Re: Measuring intervals...
Post by: Nominal Animal on November 19, 2020, 09:34:31 pm
Quote
When I used to dual boot, I always switched Win$h!t to UTC instead.
You can do that now?  Nice!

All machines should internally work off UTC, especially if network-connected (NTP).  Even if your organization/provider does not have their own NTP servers, there are country-specific pools (cc.pool.ntp.org), Ubuntu pools (n.ubuntu.pool.ntp.org), and so on, that one is welcome to use.

Quote
Whichever you choose — the local time or UTC for the RTC — it is expected not to change and to always represent approximately that time. This goes way beyond the monotonicity of CLOCK_MONOTONIC and CLOCK_BOOTTIME. If the UTC time is not monotonic then, for example, filesystems will contain timestamps from the future, certificate verification is broken, updates may be delayed, the node will be useless in a distributed system, or sessions on web servers will not work as expected.
Which is why NTP is so important for distributed systems.  And it is nice to know that Linux at least tries to do the right thing when something odd happens with the timekeeping system.

That said, I do find it extremely annoying when applications get in a tizzy just because CLOCK_REALTIME went backwards or forwards a bit, or something similar.  I prefer defensive coding, as in writing e.g. for (i = 0; i < n; i++) instead of for (i = 0; i != n; i++).  (I just don't trust us human programmers to get everything right; it's nice when the code can recover from an occasional glitch [that does not lose data].)
Title: Re: Measuring intervals...
Post by: magic on November 20, 2020, 06:16:43 am
I don't know if you can do it now, but there was a registry key you could set on older NT, up to at least XP :)
Title: Re: Measuring intervals...
Post by: Siwastaja on November 28, 2020, 09:55:00 am
Yes, you can configure Windows to use UTC time; I've done it for Win7 and Win10 on dual-boot systems, so the feature has not been removed. It's an easy fix with regedit.

Although, back when I had a lot of complex PCB design to do on Windows, I liked the fact that the time shown was two hours behind the actual time, keeping me happily working for longer: "oh, it's still just 8 pm, moar coffee!"