It's not how I would do it, but the Clear() function in post 10 works fine.
The output below simply prints some info about the pointers, then runs Clear() 1024/32 = 32 times. The last run shows that the pointers match. The match happens before the increment, because the 'end' pointer is one 'position' short of the end, which is a little different from what most would do.
pScreen: 0x80000058
lcdptr: 0x80000058
lcdptrPlus127: 0x80000450
[Clear start][lcdptr: 0x80000058]
lcdptr: 0x80000060
lcdptr: 0x80000068
lcdptr: 0x80000070
lcdptr: 0x80000070
[Clear end][lcdptr: 0x80000078]
[Clear start][lcdptr: 0x80000078]
lcdptr: 0x80000080
lcdptr: 0x80000088
lcdptr: 0x80000090
lcdptr: 0x80000090
[Clear end][lcdptr: 0x80000098]
...
...
[Clear start][lcdptr: 0x80000438]
lcdptr: 0x80000440
lcdptr: 0x80000448
lcdptr: 0x80000450
lcdptr: 0x80000450
lcdptr==lcdptrPlus127
[Clear end][lcdptr: 0x80000458]
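To make the log easier to follow, here is a rough sketch of the pointer layout it implies. This is not the Clear() from post 10, only a reconstruction under assumptions read off the addresses: 8-byte elements (the pointer advances by 8 each step), 128 elements in the 1024-byte buffer, 4 elements cleared per call, and an 'end' pointer at element 127 (0x80000058 + 127*8 = 0x80000450). The names `screen` and `clear_chunk` are made up for illustration.

```c
#include <stdint.h>

#define SCREEN_WORDS   128  /* assumed: 1024 bytes / 8-byte elements */
#define WORDS_PER_CALL 4    /* assumed: 32 bytes cleared per call, per the log */

static uint64_t screen[SCREEN_WORDS];   /* hypothetical stand-in for pScreen */
static uint64_t *lcdptr = screen;
/* 'end' pointer refers to the last element, not one past it */
static uint64_t *const lcdptrPlus127 = screen + (SCREEN_WORDS - 1);

/* Clears WORDS_PER_CALL elements starting at lcdptr.
 * Returns 1 if this call wrote the last element of the buffer, else 0. */
int clear_chunk(void)
{
    int hit_end = 0;
    for (int i = 0; i < WORDS_PER_CALL; i++) {
        *lcdptr = 0;
        /* the comparison sees lcdptr's value BEFORE the increment */
        if (lcdptr++ == lcdptrPlus127) {
            hit_end = 1;
        }
    }
    return hit_end;
}
```

Under these assumptions the first 31 calls return 0 and the 32nd returns 1, leaving lcdptr one element past the buffer, which lines up with the final [Clear end] address of 0x80000458 in the log.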
FYI, running this Clear() function 32 times takes 340 cp0 cycles (680 cpu cycles). The cls function I showed earlier takes 308 cpu cycles, and memset takes 424 cpu cycles. So splitting the work across multiple Clear() calls costs an extra 372 cpu cycles compared with cls (680 - 308).
I don't know all the details of the C rules or which compilers follow them, but these two produce the same result in xc32:
if ((lcdptr++) == lcdptrPlus127)
if (lcdptr++ == lcdptrPlus127)
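For what it's worth, this should hold for any conforming C compiler, not just xc32: postfix ++ binds tighter than ==, and its result is the value before the increment, so the extra parentheses change nothing. A minimal check, with a made-up function name and a small local buffer standing in for the real pointers:

```c
/* Returns 1 if both spellings behave identically:
 * each compares the pre-increment value, then advances the pointer by one. */
int parens_make_no_difference(void)
{
    int buf[4] = {0};
    int *a = buf;
    int *b = buf;
    int *end = buf;              /* equal to the pre-increment value of a and b */

    int r1 = ((a++) == end);     /* with the extra parentheses */
    int r2 = (b++ == end);       /* without them */

    return r1 == 1 && r2 == 1 && a == buf + 1 && b == buf + 1;
}
```

The parentheses would only matter with the prefix form, since ++lcdptr == lcdptrPlus127 compares the value after the increment.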