1-5 nanoseconds is not particularly hard to achieve. You can generate the pulse with anything (a microcontroller, a 555, or a debounced push-button) and then use a fast device to square it up. Fast logic gates such as the 74AUC series or (LV)PECL will easily do this, with resistors to set the output level appropriately; in fact they are overkill, as they have sub-nanosecond rise times. A single 74AUC gate won't supply the needed output current, but you can put multiple gates in parallel to increase the available drive current. The PECL families are emitter-coupled bipolar logic, so the pulse amplitude isn't as accurate as with a CMOS gate. If you are happy with the slower end of the 1-5 ns range, you can use correspondingly slower logic families.
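As a rough illustration of the paralleled-gate idea, here is a minimal sketch that treats each gate as an ideal voltage source with some output resistance plus its own series resistor, all feeding a 50 ohm load. The function name, the 2.5 V supply, and the 10 ohm gate output resistance are assumptions made for the example, not datasheet values; check the real part before relying on the numbers.

```python
def paralleled_gate_drive(vcc, n_gates, r_out, r_series, r_load=50.0):
    """Estimate the DC high level of N paralleled gates into a resistive load.

    Simplified model (an assumption, not a datasheet figure): each gate is an
    ideal source at vcc with output resistance r_out, followed by its own
    series resistor r_series. All branches join at the load.
    """
    r_branch = r_out + r_series          # resistance of one gate branch
    r_source = r_branch / n_gates        # N equal branches in parallel
    v_load = vcc * r_load / (r_load + r_source)  # resistive divider
    i_total = v_load / r_load            # current delivered to the load
    i_per_gate = i_total / n_gates       # share carried by each gate
    return r_source, v_load, i_per_gate


# Hypothetical numbers: four gates at 2.5 V, assumed 10 ohm output resistance,
# 190 ohm series resistor per gate so the combined source looks like 50 ohms.
r_source, v_load, i_per_gate = paralleled_gate_drive(2.5, 4, 10.0, 190.0)
print(f"source resistance: {r_source:.1f} ohm")
print(f"amplitude into 50 ohm: {v_load:.2f} V")
print(f"current per gate: {i_per_gate * 1e3:.2f} mA")
```

Picking the series resistors so the combined source resistance equals the line impedance halves the amplitude at the load, but it also splits the load current evenly between the gates, which is the point of paralleling them.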
Even the later generations of bipolar TTL can produce an edge that fast. I swapped out the 74LS140 dual 4-input NAND buffer in my function generator for an Advanced Schottky replacement to get a faster edge on its sync output pulse.
Personally, my choice would be to use a 74LVC125 tri-state buffer (1) to drive a transistor cascode, which in turn drives a parallel-terminated source. This allows easy adjustment of the pulse amplitude and makes it easier to get a very clean output. It is essentially a modern version of how many reference-level pulse generators, such as the Tektronix PG506, work.
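To show why the amplitude is easy to adjust with this arrangement, here is a small sketch that assumes the cascode behaves as a switched current source feeding a 50 ohm source termination in parallel with a 50 ohm load. That model, the function names, and the 1 V target are my assumptions for illustration, not details taken from the PG506 or any specific design.

```python
def parallel_resistance(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def current_for_amplitude(v_amplitude, r_term=50.0, r_load=50.0):
    """Switched current needed for a given pulse amplitude at the load.

    Assumes an ideal current switch driving the source termination and the
    terminated load in parallel, so the amplitude is simply I * (R_term || R_load).
    """
    return v_amplitude / parallel_resistance(r_term, r_load)


# Hypothetical target: a 1 V pulse into a terminated 50 ohm load.
i_needed = current_for_amplitude(1.0)
print(f"switched current: {i_needed * 1e3:.0f} mA")
```

Under this model the amplitude scales linearly with the switched current, so trimming one current-setting resistor or reference adjusts the output level without touching the edge-shaping logic.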
But AC (advanced CMOS) logic, or many of the more recent CMOS logic families such as LVC, can generate a 1 nanosecond edge directly.
(1) The LVC logic family can operate at 5 volts for higher performance, and I think it is the fastest commonly available CMOS logic family when used this way.