You can simulate and test to confirm. Usually 0x0 would point to the LSB of the 16-bit side and 0xF to the MSB, since you are basically addressing in order of magnitude. However, that is based on what I have used in Intel's Quartus.
If you want to change the order, you can XOR your address on the chosen side of the RAM with 0x00F, which flips the low 4 address bits.
Or, swap the bits around on the 16 bit side.
There is no added delay or logic consumption in doing so in an FPGA.
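To see why the two options above give the same result, here is a small behavioural model in Python (not HDL). It is only a sketch: the names `read_bit` and `reverse_bits` are made up for illustration, and it assumes the LSB-first mapping described above, with the memory held as a list of 16-bit words.

```python
WORD_BITS = 16  # width of the wide port (assumed)

def read_bit(mem, addr, flip=False):
    """Read one bit through the narrow port of a list of 16-bit words.

    Assumes narrow address 0x0 maps to the LSB of word 0.
    With flip=True the address is XORed with 0x00F first, which
    reverses the bit order inside each word and leaves the word
    index (the upper address bits) unchanged.
    """
    if flip:
        addr ^= 0x00F
    word = mem[addr >> 4]             # upper bits select the word
    return (word >> (addr & 0xF)) & 1  # low 4 bits select the bit

def reverse_bits(word):
    """Swap the 16 data bits end for end (just wiring in hardware)."""
    out = 0
    for i in range(WORD_BITS):
        out |= ((word >> i) & 1) << (WORD_BITS - 1 - i)
    return out

mem = [0x8001, 0x0002]
swapped = [reverse_bits(w) for w in mem]

# Flipping the address and swapping the data bits agree at every address.
same = all(
    read_bit(mem, a, flip=True) == read_bit(swapped, a)
    for a in range(len(mem) * WORD_BITS)
)
```

In this model `same` comes out true, which is the point: XORing the address and swapping the data bits are two views of the same reordering.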
Worst case, you can make the RAM 16 bit x 16 bit and either pick the bit you want out of the 16-bit output, or use a bit write mask to write 1 bit at a time.
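The fallback can be modelled the same way. This is a rough Python sketch, not HDL: `write_bit_masked` and `read_bit_selected` are hypothetical names, and the LSB-first bit numbering is an assumption you can flip to suit your design.

```python
def write_bit_masked(mem, bit_addr, value):
    """Write a single bit into a 16-bit-wide RAM using a bit write mask.

    Only the masked bit of the addressed word changes; the other
    15 bits keep their old value.
    """
    word_addr = bit_addr >> 4
    mask = 1 << (bit_addr & 0xF)           # one-hot write mask
    data = 0xFFFF if value else 0x0000     # replicate the bit on all lanes
    mem[word_addr] = (mem[word_addr] & ~mask & 0xFFFF) | (data & mask)

def read_bit_selected(mem, bit_addr):
    """Read the full 16-bit word, then select the one bit you want."""
    word = mem[bit_addr >> 4]
    return (word >> (bit_addr & 0xF)) & 1

ram = [0x0000, 0xFFFF]
write_bit_masked(ram, 3, 1)    # set bit 3 of word 0
write_bit_masked(ram, 31, 0)   # clear bit 15 of word 1
```

Because the bit select is now in your own logic, the ordering is whatever you write it to be, rather than whatever the vendor's mixed-width RAM happens to use.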
(Additional note: in some of my code, I just have a 'parameter' to select the endianness...)