To me, the formula looks like a first-order approximation: actual current consumption is almost certainly non-linear in clock speed, VCC, and temperature. I suspect they picked this linear form as an educated guess and then validated that real devices stay within a certain error band of it.
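Since I don't have the exact formula in front of me, purely for illustration, a first-order model of this kind would have the general shape (I_0, k_f, k_V, k_T and the reference points are placeholders, not Xilinx's actual coefficients):

$$ I_{CC}(f, V_{CC}, T) \approx I_0 + k_f \, f + k_V \, (V_{CC} - V_{ref}) + k_T \, (T - T_{ref}) $$

with the coefficients chosen so that measured currents stay within the promised error band over the supported operating range.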
I'd be surprised if they did not simply measure these values with a minimal benchmark design: program the chip with a design that continuously addresses all the BRAMs and has an external read/write toggle, let it run for a second, and average the chip's current consumption. Repeat for a couple of chips from the same series. To account for the current drawn by the benchmark's own logic driving the BRAMs, someone at Xilinx could take the routed design, remove the BRAMs from it while leaving everything else in place, measure again, and subtract the two results; a sketch of that arithmetic is below.
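As a toy illustration of the "measure with and without the BRAMs, then subtract" idea (all numbers, the BRAM count, and the device samples here are made up for the sketch, not Xilinx data):

```python
# Hypothetical sketch: BRAM current = current of the full benchmark design
# minus current of the same routed design with the BRAMs stripped out,
# averaged over a few samples of the same device family.

from statistics import mean

# Made-up per-chip measurements in amperes, each averaged over ~1 s of runtime.
full_design_current  = [0.412, 0.405, 0.418, 0.409]  # benchmark exercising all BRAMs
bram_removed_current = [0.251, 0.247, 0.256, 0.249]  # same routing, BRAMs removed
brams_per_device     = 140                           # assumed block RAM count on the part

# BRAM contribution per chip, then averaged across chips.
per_chip_bram_current = [f - b for f, b in zip(full_design_current, bram_removed_current)]
avg_bram_current = mean(per_chip_bram_current)

print(f"average BRAM current: {avg_bram_current * 1e3:.1f} mA total, "
      f"{avg_bram_current / brams_per_device * 1e6:.1f} uA per BRAM")
```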
I would guess that they simulated the components of the BRAM, but never the whole block at once.