I agree 100% with Ian.M, and will elaborate a bit. Even if some of the data is delayed relative to other bits, what you need to think about is the setup and hold times of the device that is receiving the data. Skip the idea that data can be aligned really tightly through tricky PCB trace routing and latched at infinitesimally small moments in time - there's no need for this, and it never works.
Synchronous logic will have a global clock, and you can use either transition to generate or latch data, the 0->1 or 1->0 transition. If your data gets generated a little after the clock line's 0->1 transition, arrange for the receiver to clock it at the clock line's 1->0 transition, maybe by using an inverting clock input or an explicit inverter on the clock line. That way, you have one half clock cycle for the data to stabilize and charge the input circuit of the receiving device before you tell the receiver to latch the data, and you have another half cycle where the sender's data is still stable, so that the receiving device can properly latch the data without it changing.
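To put numbers on that, here is a rough Python sketch of the setup and hold margins you get with opposite-edge clocking. It assumes a 50% duty cycle and no clock skew between sender and receiver, and the function name, parameter names and example values are all made up for illustration - plug in the figures from your own datasheets.

```python
def opposite_edge_margins(period_ns, t_co_min_ns, t_co_max_ns, t_flight_ns,
                          t_setup_ns, t_hold_ns):
    """Return (setup_margin_ns, hold_margin_ns) for data launched on the
    rising edge and captured on the following falling edge.

    Assumes a 50% duty cycle and zero clock skew (simplifications).
    """
    half = period_ns / 2.0
    # Latest moment the data is valid at the receiver, measured from the
    # launching (rising) edge: slowest clock-to-output plus trace delay.
    data_valid = t_co_max_ns + t_flight_ns
    # Setup: data must be stable t_setup before the capturing edge at T/2.
    setup_margin = half - data_valid - t_setup_ns
    # Hold: the data next changes after the following rising edge, which is
    # half a cycle plus the fastest clock-to-output and the trace delay
    # after the capturing edge.
    hold_margin = half + t_co_min_ns + t_flight_ns - t_hold_ns
    return setup_margin, hold_margin


if __name__ == "__main__":
    # Hypothetical part: 10 MHz clock, 5-15 ns clock-to-output, 1 ns of
    # trace delay, 10 ns setup and 5 ns hold at the receiver.
    setup, hold = opposite_edge_margins(100.0, 5.0, 15.0, 1.0, 10.0, 5.0)
    print(f"setup margin: {setup:.1f} ns, hold margin: {hold:.1f} ns")
```

Both margins have to be positive. Note that the trace delay only eats into the setup margin by its own (tiny) amount, which is why matching trace lengths buys you nothing here.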
You can alternatively skew the clock of the data producing circuit by one half cycle (generate on clock 1->0) if you want to keep the controller's clock set to latch on a rising (0->1) clock edge - it's only the relative clock transitions that matter. But, to repeat, the important trick is to produce the data and latch the data on opposite transitions of the clock line, so that you get a free half cycle of clock delay to let everything settle. With any sensible clock frequency, a half cycle will be much longer than the time a signal takes to cross your entire PCB, so squiggly traces are not needed at all, nor are they reliable. Always check the setup and hold times of the receiver circuit (the GPIO pins on your microcontroller?) to make sure - that, along with the delays and skew of your external logic, will determine the minimum half-period of your clock, and therefore the maximum clock frequency. It's likely that this will still be an extremely small number, allowing a very fast clock if you need it.
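If you want to see how short that half-period can get, something like the following Python sketch does the arithmetic. The 7 ps/mm trace delay is only my ballpark assumption for a typical FR4 board, and the function names and example numbers are hypothetical; substitute the real values for your parts and stackup.

```python
PS_PER_MM = 7.0  # assumed ballpark propagation delay on FR4; check your stackup


def trace_delay_ns(length_mm, ps_per_mm=PS_PER_MM):
    """Rough signal delay along a PCB trace of the given length."""
    return length_mm * ps_per_mm / 1000.0


def min_half_period_ns(t_co_min_ns, t_co_max_ns, t_setup_ns, t_hold_ns,
                       length_mm, skew_ns=0.0):
    """Shortest half-period that still meets setup and hold when data is
    launched on one clock edge and captured on the opposite edge."""
    flight = trace_delay_ns(length_mm)
    # Setup: slowest output delay + trace + receiver setup must fit in T/2.
    setup_limit = t_co_max_ns + flight + t_setup_ns + skew_ns
    # Hold: the data must stay put for t_hold after the capturing edge;
    # it next changes T/2 + fastest output delay + trace later.
    hold_limit = t_hold_ns - t_co_min_ns - flight + skew_ns
    return max(setup_limit, hold_limit)


def max_clock_mhz(half_period_ns):
    """Clock frequency in MHz for a given half-period in nanoseconds."""
    return 1000.0 / (2.0 * half_period_ns)


if __name__ == "__main__":
    # Hypothetical numbers: 5-15 ns clock-to-output, 10 ns setup, 5 ns hold,
    # a 100 mm trace and 1 ns of skew in the external logic.
    half = min_half_period_ns(5.0, 15.0, 10.0, 5.0, 100.0, skew_ns=1.0)
    print(f"trace delay: {trace_delay_ns(100.0):.2f} ns")
    print(f"min half-period: {half:.2f} ns, max clock: {max_clock_mhz(half):.1f} MHz")
```

With numbers like these the 100 mm trace contributes well under a nanosecond, while the logic delays and setup time contribute tens of nanoseconds, so the trace length is lost in the noise.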