A smaller value increases the margin against noise, but also increases power consumption during transactions: whenever a device outputs '0', a current I = Vcc / Rpu flows through the pull-up resistor.
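To put rough numbers on that, here is a quick sketch in Python. The 3.3 V supply and the list of resistor values are just example figures I picked, the function name is made up for illustration, and the driver's own resistance is ignored:

```python
def pullup_current_ma(vcc, r_pullup_ohms):
    """Current through the pull-up while the line is held low, in mA (I = Vcc / Rpu)."""
    return vcc / r_pullup_ohms * 1000.0

# Example values only: a 3.3 V bus and a few common pull-up resistors.
for r in (1000, 2200, 4700, 10000):
    print(f"Rpu = {r:>5} ohm -> {pullup_current_ma(3.3, r):.2f} mA while the line is low")
```

This current only flows while SDA or SCL is actually being pulled low, so the average depends on bus activity.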
I usually just default to 4.7k unless it's a high-speed bus, or the bus has enough devices on it that 4.7k is uncomfortably close to the maximum value (obtained with calculations similar to the ones you presented), in which case I reduce it to 2.2k. Or, obviously, some nearby value if that helps BOM optimization.
For a low-power design, increasing the bus speed rather than the pull-up value, and getting back to sleep sooner, is likely the better choice.
I also prefer to leave a bit of safety margin on the low side because the driver's Rds_on (i.e., the transistor's on-resistance) may be underestimated in some formulas. This output pin resistance forms a voltage divider with the pull-up, raising the "zero" voltage above zero, which again is bad for noise margins. For example, if the output resistance Rio = 50 ohms, Rpu = 500 ohms and Vdd = 3.3V, the zero level will be 50/550 * 3.3V = 0.3V. This may still be acceptable, but just barely so. For this reason alone, I never go below 1kOhm. If I have to go lower "for speed", I2C is clearly the wrong choice.
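The divider arithmetic is easy to redo for other values. A minimal sketch, assuming the driver behaves like a plain resistor to ground (a simplification; a real output transistor is not perfectly linear), with a made-up function name and the 50 ohm figure from the example above:

```python
def low_level_voltage(vdd, r_pullup_ohms, r_driver_ohms):
    """Bus voltage when the driver pulls low, modelling the driver as a resistor."""
    return vdd * r_driver_ohms / (r_pullup_ohms + r_driver_ohms)

# Assumed example: Vdd = 3.3 V, driver output resistance 50 ohm.
for r_pu in (500, 1000, 2200, 4700):
    v_low = low_level_voltage(3.3, r_pu, 50)
    print(f"Rpu = {r_pu:>4} ohm -> 'zero' level = {v_low:.3f} V")
```

With these assumptions the 500 ohm case gives the 0.3V from above, and the low level drops quickly as the pull-up value goes up.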
But as you found out, the range of suitable resistor values is generally wide. When unsure, pick something in the middle of that range.