That's cool. Talk is cheap though.
So, just show us an example of a "real-time task" running with a few µs latency on Windows and vanilla Linux. Measure real latency, not self-convincing tricks.
On Linux, it's just not possible with the vanilla kernel and default options, because of how the scheduler and interrupt handling work. By default, interrupt handlers run in hard-IRQ context rather than as schedulable threads, so you can't prioritize your task against them. That's not the case if you either use an RT-patched kernel, or if you enable the 'threadirqs' option on the kernel boot line, which forces most handlers into kernel threads. That helps. You can use benchmarks from the rt-tests package to get reliable figures, the 'cyclictest' program for instance.
With an RT-patched kernel, I can get max latencies (it's the max that matters, of course, not the min or average) of about 15-20µs. That's already pretty good, and I've never seen better than this on any machine I've dealt with. Maybe it can happen, if all stars are aligned. With a vanilla kernel (latest) and the threadirqs option, I can get a max latency of about 150-200µs. All that on pretty fast machines.
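To make "max latency" concrete, here is a rough sketch in C of the kind of measurement this is about: sleep until an absolute deadline, check how late the wake-up was, and keep the worst case. It is not the cyclictest source, just a simplified illustration; the function name max_wakeup_latency_ns and its parameters are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from *a to *b. */
static int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/* Hypothetical helper (name and signature are assumptions of this sketch).
 * Sleeps until 'loops' absolute deadlines spaced 'interval_ns' apart and
 * returns the worst wake-up lateness in nanoseconds, or -1 on error. */
int64_t max_wakeup_latency_ns(int loops, long interval_ns)
{
    struct timespec next, now;
    int64_t max_ns = 0;

    if (loops <= 0 || interval_ns <= 0 || interval_ns >= 1000000000L)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < loops; i++) {
        /* Advance the absolute deadline by one interval. */
        next.tv_nsec += interval_ns;
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }

        /* clock_nanosleep returns an error number directly, not -1. */
        int rc;
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;

        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        /* Lateness = actual wake-up time minus requested deadline. */
        int64_t late = ts_diff_ns(&next, &now);
        if (late > max_ns)
            max_ns = late;
    }
    return max_ns;
}
```

Calling something like max_wakeup_latency_ns(100000, 200000) on an otherwise loaded machine gives a first impression, but a short run on an idle system will look far better than reality. For figures you can trust, run cyclictest itself for a long time under load.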
That's a benchmark, and of course, to get that in your applications, you need to carefully craft your programs, configure your kernel properly and create threads with a high priority.
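As a minimal sketch of the "high priority" part, assuming Linux: switch the calling thread to the SCHED_FIFO policy and lock the process memory so page faults don't add latency. The function name make_caller_realtime is made up for the example, and the call needs root or CAP_SYS_NICE (or a suitable RLIMIT_RTPRIO), otherwise it fails with EPERM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>

/* Hypothetical helper (name is an assumption of this sketch).
 * Puts the calling thread under SCHED_FIFO at 'priority' and locks
 * current and future pages in RAM. Returns 0 on success, -1 on failure
 * with errno set. */
int make_caller_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    /* On Linux the valid SCHED_FIFO range is normally 1..99. */
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO)) {
        errno = EINVAL;
        return -1;
    }

    /* On Linux, pid 0 means the calling thread. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;

    /* Avoid page faults in the time-critical path. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;

    return 0;
}
```

Call it once at the start of the time-critical thread, before entering the loop. A SCHED_FIFO thread that never blocks can starve everything below it on that CPU, so keep the loop sleeping on a timer or an event.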
On Windows, I don't have recent latency benchmarks. Up to Win 7, though, I never saw it achieve lower latencies than the Linux kernel.