Presuming you are talking about digital signals, I just code it directly in xC for the XMOS processors. For a standard two-row alphanumeric LCD with RW, E, and DATA lines, the code is:
static void lcdWriteByteNoWait(int byte) {
    uint32_t tStart;
    lcdRW <: 0 @ tStart; // set the RW line low, and record the time
    // high nybble
    lcdE @ (tStart+Tsp1) <: 1; // wait until Tsp1 later, then set the E line active
    lcdData @ (tStart+Tsp1+Tpw-Tsp2) <: (byte >> 4); // wait until it is "safe" to change the data lines, and then change them
    lcdE @ (tStart+Tsp1+Tpw) <: 0;
    // low nybble
    lcdE @ (tStart+Tsp1+Tcyc) <: 1;
    lcdData @ (tStart+Tsp1+Tcyc+Tpw-Tsp2) <: (byte & 0xf);
    lcdE @ (tStart+Tsp1+Tcyc+Tpw) <: 0;
    // complete the two cycles; use RW to avoid optimisation
    // and to give an external hint it has finished
    lcdRW @ (tStart+Tsp1+Tcyc+Tcyc) <: 1;
}
where the magic time constants are read directly from the data sheet:
// interface timing in 100MHz ticks
#define Tsp1 (6) // min 40ns
#define Tsp2 (10) // min 80ns
#define Thd2 (2) // min 10ns
#define Tpw (26) // min 230ns
#define Td (14) // min 120ns
#define Tcyc (60) // min 500ns
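As a sanity check on those numbers, here is a minimal sketch in plain C (not xC) that converts each data sheet minimum into 100 MHz ticks, confirms the chosen constants meet it, and works out the port-time offsets used in lcdWriteByteNoWait relative to tStart. The helper names (ns_to_ticks_ceil, count_violations, write_byte_offsets) are hypothetical and not part of any XMOS library; the only assumption is that one tick is 10 ns, as the "100MHz ticks" comment implies.

```c
#include <assert.h>
#include <stddef.h>

/* One tick of the 100 MHz reference clock, in nanoseconds (assumed from the post). */
#define TICK_NS 10

/* Constants copied from the post (units: 100 MHz ticks). */
#define Tsp1 (6)
#define Tsp2 (10)
#define Thd2 (2)
#define Tpw  (26)
#define Td   (14)
#define Tcyc (60)

/* Hypothetical helper: smallest tick count that covers min_ns. */
static int ns_to_ticks_ceil(int min_ns) {
    return (min_ns + TICK_NS - 1) / TICK_NS;
}

/* One row per constant: chosen tick count and the data sheet minimum in ns. */
struct timing { const char *name; int ticks; int min_ns; };

static const struct timing timings[] = {
    {"Tsp1", Tsp1, 40},  {"Tsp2", Tsp2, 80},  {"Thd2", Thd2, 10},
    {"Tpw",  Tpw,  230}, {"Td",   Td,   120}, {"Tcyc", Tcyc, 500},
};

/* Hypothetical helper: number of constants shorter than their minimum. */
static int count_violations(void) {
    int bad = 0;
    for (size_t i = 0; i < sizeof timings / sizeof timings[0]; i++) {
        if (timings[i].ticks < ns_to_ticks_ceil(timings[i].min_ns)) {
            bad++;
        }
    }
    return bad;
}

/* Hypothetical helper: offsets (in ticks after tStart) of the seven timed
 * outputs in lcdWriteByteNoWait, in program order. */
static void write_byte_offsets(int out[7]) {
    out[0] = Tsp1;                      /* E high, high nybble    */
    out[1] = Tsp1 + Tpw - Tsp2;         /* data = high nybble     */
    out[2] = Tsp1 + Tpw;                /* E low                  */
    out[3] = Tsp1 + Tcyc;               /* E high, low nybble     */
    out[4] = Tsp1 + Tcyc + Tpw - Tsp2;  /* data = low nybble      */
    out[5] = Tsp1 + Tcyc + Tpw;         /* E low                  */
    out[6] = Tsp1 + Tcyc + Tcyc;        /* RW high, write is done */
}

/* Aborts (via assert) if any constant is below its data sheet minimum. */
static void assert_timings_ok(void) {
    assert(count_violations() == 0);
}
```

Calling assert_timings_ok() once at start-up is a cheap guard if the constants are ever retuned. With the values above, the whole byte write spans 126 ticks, i.e. 1.26 us at 10 ns per tick.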
A good feature of the XMOS processors is that they can handle other time-critical I/O and processing at the same time. In my frequency counter application, that means comms over a USB link, plus counting transitions in 62.5 MHz input streams.