Now, this can be written a bit better, but moving bits out at 1 clock cycle per bit in C is, I would think, impossible, since I can't think of a way to do it even in assembly. Maybe 2 cycles per bit, by prestoring 12 variables and then using a movf followed by a movwf, line by line.
Instead, I would sacrifice one additional cycle so that I only affect 1 pin on port B and don't have to generate the 12 or 16 bytes ahead of time. To do this, I would use only #asm with btfsc and bcf/bsf on your I/O pin inside your C code. You get 3 instruction cycles per bit coming out of the pin, plus a reference low gap between consecutive high bits so you can see the spacing on your scope. The listing below is how I would go about this.
When I say it comes out fast, I mean that with a 40 MHz PIC (10 MIPS) this comes out at 3.3333 megabits/sec, plus my fat setup pulse. You need not worry about the C compiler ever altering the timing, since I only bit-test the low and high bytes of a 2-byte integer directly in memory. It's been a while since I've added inline assembly to C routines with MPLAB, so I hope I got everything OK.
(Note: you need to have set testval_h & testval_l at the beginning of your C code. They can be RAM variables or point directly at your ADC registers.)
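For the note above, here is a minimal sketch of filling the two bytes from a 12-bit sample in plain C. It assumes the sample is right-justified in a 16-bit value. The helper name load_testval is hypothetical, and whether your compiler lets inline assembly address file-scope variables by these names is something to check for your toolchain.

```c
#include <stdint.h>

/* Assumption: file-scope variables that the inline assembly addresses by name. */
uint8_t testval_h;   /* bits 11..8 of the sample, in the low nibble */
uint8_t testval_l;   /* bits 7..0 of the sample */

/* Hypothetical helper: split a right-justified 12-bit sample into two bytes. */
void load_testval(uint16_t sample)
{
    testval_h = (uint8_t)((sample >> 8) & 0x0Fu);  /* keep only bits 11..8 */
    testval_l = (uint8_t)(sample & 0xFFu);         /* low byte as-is */
}
```

Call load_testval() once per sample before the assembly runs, so the btfsc lines test stable data.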
#asm
bsf PORTB,0 ; I'm using B0 in this test
bsf PORTB,1 ; Optional trigger for scope on a second channel
bcf PORTB,1
bcf PORTB,0 ; End of the fat setup pulse (4 instructions), so the scope can trigger on pulse width (x ns high).
btfsc testval_h,3
bsf PORTB,0
bcf PORTB,0
btfsc testval_h,2
bsf PORTB,0
bcf PORTB,0
btfsc testval_h,1
bsf PORTB,0
bcf PORTB,0
btfsc testval_h,0
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,7
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,6
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,5
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,4
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,3
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,2
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,1
bsf PORTB,0
bcf PORTB,0
btfsc testval_l,0
bsf PORTB,0
bcf PORTB,0
#endasm
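To see what the listing above should put on the scope, here is a rough host-side model in C that produces one sample of B0 per instruction cycle. It is a sketch under simplifying assumptions: the pin changes at the end of the instruction that writes it, and a btfsc that skips costs 2 cycles, so every bit slot is 3 cycles either way. The function names and constants are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    HEADER_CYCLES  = 4,   /* bsf B0, bsf B1, bcf B1, bcf B0 */
    CYCLES_PER_BIT = 3,   /* btfsc + bsf + bcf, or btfsc (skip, 2 cycles) + bcf */
    DATA_BITS      = 12,
    TRACE_LEN      = HEADER_CYCLES + CYCLES_PER_BIT * DATA_BITS
};

/* Hypothetical model: write the level of B0 after each instruction cycle. */
size_t render_b0_trace(uint16_t value, uint8_t trace[TRACE_LEN])
{
    size_t n = 0;
    trace[n++] = 1;   /* bsf PORTB,0 */
    trace[n++] = 1;   /* bsf PORTB,1 (B0 unchanged) */
    trace[n++] = 1;   /* bcf PORTB,1 (B0 unchanged) */
    trace[n++] = 0;   /* bcf PORTB,0 */
    for (int bit = DATA_BITS - 1; bit >= 0; --bit) {
        trace[n++] = 0;                               /* btfsc: pin still low */
        trace[n++] = (uint8_t)((value >> bit) & 1u);  /* bsf ran, or was skipped */
        trace[n++] = 0;                               /* bcf: back to low */
    }
    return n;
}

/* Read the value back by sampling the middle cycle of each 3-cycle slot. */
uint16_t decode_b0_trace(const uint8_t trace[TRACE_LEN])
{
    uint16_t value = 0;
    for (int i = 0; i < DATA_BITS; ++i) {
        uint8_t level = trace[HEADER_CYCLES + i * CYCLES_PER_BIT + 1];
        value = (uint16_t)((value << 1) | level);
    }
    return value;
}
```

In this model the whole burst is 40 instruction cycles, which would be 4 us at 10 MIPS, and a 1 bit shows up as a single-cycle high in the middle of its slot.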
Using the equivalent movf seq(x),0 followed by movwf LATB gives you 2 CPU cycles per bit, but all of port B will be affected, and you need to prepare and store all of the seq(x) values ahead of time, which costs many more CPU cycles. You can also add a NOP between my bsf & bcf above to make everything an even 4 CPU cycles per bit with fatter high bits. You can also change those NOPs to an alternating bsf PORTB,1 and bcf PORTB,1 to generate a high and low latch clock for a second channel on your scope.
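To make the trade-off concrete, here is a sketch in C of the prep work the movf/movwf variant needs, plus the bit rate for each variant. The names build_seq, bit_rate_hz and SEQ_LEN are hypothetical, and putting the data on bit 0 of each byte is my assumption to match B0 above.

```c
#include <stdint.h>

#define SEQ_LEN 12   /* one prestored byte per output bit */

/* Hypothetical helper: build the bytes that would be written to LATB, MSB first.
   Bit 0 carries the data; the other 7 pins get whatever other_pins says. */
void build_seq(uint16_t value, uint8_t other_pins, uint8_t seq[SEQ_LEN])
{
    for (int i = 0; i < SEQ_LEN; ++i) {
        uint8_t bit = (uint8_t)((value >> (SEQ_LEN - 1 - i)) & 1u);
        seq[i] = (uint8_t)((other_pins & 0xFEu) | bit);
    }
}

/* Bit rate in bits/sec, assuming one instruction cycle is Fosc/4. */
uint32_t bit_rate_hz(uint32_t fosc_hz, uint32_t cycles_per_bit)
{
    return fosc_hz / 4u / cycles_per_bit;
}
```

At 40 MHz this works out to 5 Mbit/s for the 2-cycle movf/movwf version, about 3.33 Mbit/s for the 3-cycle btfsc version, and 2.5 Mbit/s once the NOP is added. The 12 loop passes in build_seq are the "many more CPU cycles ahead of time" mentioned above.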
The nature of my code, with its fat header & a blank 0 between bits, makes the output easy to decode visually on a single trace of your oscilloscope display.