I had further thoughts about this. Once the propagation delay of real-world gates is taken into consideration, the "solution" I gave yesterday may be unworkable.
Problem 1. D1 has a propagation delay, so A2 will not pass the input pulse until D1 Q goes high. The tail end of the input is not delayed, so the pulse is shortened.
Problem 2. Delays propagating from D3 to D4 might delay D4 Qbar for too long, causing a glitch between the original pulse and the extended pulse.
Problem 3. For the same reason as Problem 2, the extended pulse is too long.
All of this can be addressed by rejigging the circuit, but I can't model all of it in LTspice.
Solution to Problem 1. Clock D1 from the inverted pulse from A1. This means the rising edge of the input pulse arrives well after D1 Q goes high, and D1 lowers Q only after the falling edge of the input pulse, so the full pulse passes through A2 and A3. If the rise and fall delays are equal, the output pulse is the same length as the input pulse, just delayed by two gate delays.
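To put rough numbers on Problem 1 and its fix, here is a minimal Python sketch. It is an idealised model, not a substitute for the LTspice run: it treats the gating flip-flop as a window that opens and closes at fixed times and gives every gate the same delay for both edges. The function name and all the nanosecond figures are made up for illustration.

```python
def gated_pulse(in_rise, in_fall, enable_at, disable_at, gate_delay, n_gates=2):
    """Return (output start time, output width) in ns for a gated pulse.

    The pulse only passes while the gate is enabled, so the leading edge is
    the later of the input rise and the enable time, and the trailing edge is
    the earlier of the input fall and the disable time. The result is then
    shifted by n_gates equal gate delays (an assumption, see above).
    """
    lead = max(in_rise, enable_at)
    trail = min(in_fall, disable_at)
    width = max(0, trail - lead)
    return lead + n_gates * gate_delay, width


# Problem 1: the enable arrives 15 ns after the input rises (hypothetical
# D1 delay), so 15 ns is clipped off the front of a 100 ns pulse.
print(gated_pulse(0, 100, enable_at=15, disable_at=10**6, gate_delay=10))

# Fix: the enable is already high before the input rises and drops only
# after it falls, so the full 100 ns passes, delayed by two gate delays.
print(gated_pulse(0, 100, enable_at=-50, disable_at=115, gate_delay=10))
```

With these made-up values the first case gives an 85 ns pulse and the second the full 100 ns pulse starting 20 ns late. Unequal rise and fall delays are not modelled here.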
Solution to Problems 2 and 3. D3 can be clocked by the output of A3 and cleared by an RC circuit from D3 Q slightly less than 150 ns later. D flip-flops can be used this way as one-shots, especially ones with Schmitt-trigger inputs. The output of D3 Qbar goes to A3 in place of the output of D4, so D4 is not required. The D3 Qbar transition occurs while the input pulse is still high, so A3 will not glitch. The RC time constant is chosen to give the right pulse extension.
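As a starting point for picking the RC values, here is a rough Python sketch using the standard first-order RC step response. The 5 V supply, the 2.5 V clear threshold, the 100 pF capacitor and the 140 ns target are all assumptions for illustration. The real threshold depends on the logic family, and the flip-flop's own propagation delays are not included.

```python
import math


def rc_delay(r_ohms, c_farads, v_start, v_final, v_threshold):
    """Time in seconds for an RC node heading from v_start toward v_final
    to cross v_threshold, from v(t) = v_final + (v_start - v_final)*exp(-t/RC)."""
    return r_ohms * c_farads * math.log((v_final - v_start) / (v_final - v_threshold))


def resistor_for_delay(t_seconds, c_farads, v_start, v_final, v_threshold):
    """Resistance in ohms that gives the requested threshold-crossing time."""
    return t_seconds / (c_farads * math.log((v_final - v_start) / (v_final - v_threshold)))


# Assumed values: node charges from 0 V toward 5 V, the clear input trips at
# 2.5 V, C = 100 pF, and the target is 140 ns (a bit under 150 ns).
r = resistor_for_delay(140e-9, 100e-12, 0.0, 5.0, 2.5)
print(f"R is roughly {r:.0f} ohms")
print(f"check: {rc_delay(r, 100e-12, 0.0, 5.0, 2.5) * 1e9:.1f} ns")
```

With a half-supply threshold the delay is about 0.69 RC, so these assumed values land near 2 kΩ. The result would need trimming on the bench, since component tolerances and the actual input threshold shift the timing.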
D3 could be replaced by any one-shot that has an enable input (driven by the output of D2). Depending on its output polarity, there is a spare gate on the quad NAND that could be used as an inverter.
Is it worth continuing this thought thread?
I've attached the LTspice diagram with the Problem 1 solution. The pulse extension still works in LTspice but is unlikely to in reality.