I'll stick to my story, and claim you are not thinking this through correctly. You can re-read my post.
I have no way to know what the prescale is set to on the old pic, but it will be /1, /2, /4, etc. With a 20MHz clock and a Fosc/4 of 5MHz, the only prescale that makes sense (for the 18 value you load) is /2, which gives a timer clock period of 400ns.
The new pic is at Fosc/4 of 8MHz, with a timer prescale of /4 for a 500ns timer clock period (which is in the info you provided).
Your timer load values are there to produce an overflow, so look at the number of counts you are leaving for the timer to count: for the old pic it is 256-18=238, for the new pic it is 256-59=197.
To get 100us on the old pic, you needed 250 timer counts (250*400ns), and you are loading 18 for 238 counts to go till overflow. You needed 250, so you must have lost 12 getting to the isr and setting the timer. 12 timer counts is 4.8us, and 4.8us is 24 Fosc/4 clocks. It's taking you 24 instruction clocks to set the timer value.
To get 100us on the new pic, you need 200 timer counts (200*500ns), and you are loading 59 for 197 counts to go till overflow. You needed 200, so you must have lost 3 getting to the isr and setting the timer. 3 timer counts is 1.5us, and 1.5us is 12 Fosc/4 clocks. It's taking you 12 instruction clocks to set the timer value.
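If you want to check the arithmetic yourself, here is a rough sketch in plain C (meant for a PC, not the pic). The function names are just mine for illustration, and it assumes the /2 prescale I guessed for the old pic and an 8-bit timer that counts up and overflows at 256.

```c
#include <assert.h>
#include <stdint.h>

/* Timer tick period in ns for a given instruction clock (Fosc/4, in Hz)
   and timer prescale. */
static uint32_t tick_ns(uint32_t fcy_hz, uint32_t prescale)
{
    assert(fcy_hz > 0);
    return (uint32_t)((1000000000ULL * prescale) / fcy_hz);
}

/* Counts an 8-bit up-counting timer still has to go before it overflows. */
static uint32_t counts_to_overflow(uint8_t load)
{
    return 256u - load;
}

/* Timer counts needed to cover the wanted period. */
static uint32_t counts_needed(uint32_t period_ns, uint32_t fcy_hz,
                              uint32_t prescale)
{
    return period_ns / tick_ns(fcy_hz, prescale);
}

/* Instruction clocks that must have gone by between the overflow and the
   reload, judging by the load value that gives the wanted period. */
static uint32_t lost_instr_clocks(uint32_t period_ns, uint32_t fcy_hz,
                                  uint32_t prescale, uint8_t load)
{
    uint32_t lost_counts = counts_needed(period_ns, fcy_hz, prescale)
                         - counts_to_overflow(load);
    /* One timer count is 'prescale' instruction clocks. */
    return lost_counts * prescale;
}
```

Calling lost_instr_clocks(100000, 5000000, 2, 18) covers the old pic and lost_instr_clocks(100000, 8000000, 4, 59) covers the new one. If your old prescale is something other than /2, change that argument and see whether the result still makes sense.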
So, the new pic with the non-pro compiler is getting to the point of setting the timer value quicker (12 instruction clocks vs 24) than the pro version on the old pic. That can be explained by the automatic saving of registers on interrupt, which the old pic does not have. You can also read the assembly listing to see it.
If you are using something else for the old pic timer prescale, just tell us and I'll adjust my story.
In addition, you can skip the whole thing and use one of the pwm timers, which is 16-bit and can simply produce an interrupt every 100us, on time, every time. Set it once, and never touch it again. You still have interrupt latency, which can vary by something like 3-5 clocks depending on what instruction is being interrupted, but that no longer changes the period because you are not reloading the timer in the isr.
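For the set-once approach, the only number you need is the period value. Here is a hedged sketch of that calculation in plain C. It assumes the timer counts from 0 up to a period register and then rolls over, so one period is (value + 1) timer ticks. That is a common arrangement, but check the datasheet for your part. The function name is made up, and I left out register names on purpose since I don't know which timer you would use.

```c
#include <assert.h>
#include <stdint.h>

/* Period register value for a timer ASSUMED to count 0..PR and then roll
   over, so one period is (PR + 1) timer ticks. */
static uint16_t period_reg_value(uint32_t period_ns, uint32_t fcy_hz,
                                 uint32_t prescale)
{
    uint64_t ticks = ((uint64_t)period_ns * fcy_hz)
                   / (1000000000ULL * prescale);
    /* The tick count has to fit a 16-bit period register. */
    assert(ticks >= 1 && ticks <= 65536);
    return (uint16_t)(ticks - 1);
}
```

With Fosc/4 of 8MHz and a /1 prescale, period_reg_value(100000, 8000000, 1) comes out to 799, which is 800 ticks of 125ns.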