0. Don't. I2C is terrible for board-to-board and off-board connections.
1. At the very least, plan on using shielded multiconductor cable, and put the shield through one or more pins of the connector, or preferably the connector shell itself.
2. What other requirements do you have? This is very ambiguous so far. Does it need to be IP rated (dust or water sealed)? (Your looking at M12 might suggest so, but DIN not so much.)
What's wrong with cheap and abundant DE-9? It has more pins than you need, but so what; 3-wire serial is still in common use on them.
Mini-DIN is quite compact and cheap, and it's shielded. I'm not sure whether you can get any with retention clips or locks; that'd be something to look into, I guess.
Even the plain old Molex SL or KK family can be used in through-chassis applications, but obviously you aren't getting anything close to IP67 with them.
3. Yes, ESD protection is probably a good idea (at both ends?), and so is filtering. 2.2k pull-ups, a 100 to 470p cap to GND, and a ferrite bead (300 ohm or so, 0603 or bigger?) towards the cable are recommended. I2C is awful for off-board connections, but with shielding, filtering and protection, you at least have a better chance at success.
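To get a feel for what those filter values do to the edges, here is a rough rise-time check. It's only a sketch: the cable capacitance (100 pF/m) and pin capacitance (20 pF) are assumed figures, not something from your setup, and the ferrite bead is ignored. The 30%-to-70% rise time of an RC-limited line is R * C * ln(0.7/0.3), and the limits in the table are the ones from the I2C-bus specification.

```python
import math

# 30%-to-70% rise time of an RC-limited open-drain line:
# t_r = R * C * ln(0.7 / 0.3), roughly 0.8473 * R * C
RISE_FACTOR = math.log(0.7 / 0.3)

# Maximum rise times from the I2C-bus specification, in seconds
MAX_RISE_S = {"standard (100 kHz)": 1000e-9, "fast (400 kHz)": 300e-9}


def total_capacitance_f(c_filter_f, cable_len_m,
                        cable_pf_per_m=100.0, c_pins_f=20e-12):
    """Filter cap plus cable plus pins. The cable and pin figures are
    assumed typical values; measure or look up your own."""
    return c_filter_f + cable_len_m * cable_pf_per_m * 1e-12 + c_pins_f


def rise_time_s(r_pullup_ohm, c_total_f):
    """Rise time of one bus line pulled up through r_pullup_ohm."""
    return RISE_FACTOR * r_pullup_ohm * c_total_f


# Both ends of the suggested cap range, with 2.2k pull-ups and 1 m of cable
for c_filter in (100e-12, 470e-12):
    t_r = rise_time_s(2200, total_capacitance_f(c_filter, 1.0))
    for mode, limit in MAX_RISE_S.items():
        verdict = "ok" if t_r <= limit else "too slow"
        print(f"{c_filter * 1e12:.0f}p: t_r = {t_r * 1e9:.0f} ns, {mode}: {verdict}")
```

If the numbers come out marginal for your cable, the usual levers are a smaller cap, a stiffer pull-up, or a slower bus clock.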
4. If you're hard set on saving the button pin, you might do some signal abuse: leave the pull-ups off the host side, and tie them to the button on the sensor end. The lines are then low by default, until the button is pressed, which pulls them up and permits communication. Presumably the button cannot be pressed for less time than a bus cycle takes (this can be enforced with the help of a capacitor), so when the host detects both lines going high, it tries talking.
Note: I don't recall what LL --> HH transition means to a normal I2C host, if it'll get confused by that or what -- worst case I think, you'll have to disable it, monitor the lines in software, then when it becomes available, enable and reset the controller. And put a timeout on it, so it doesn't try and wait forever for the sensor to deliver data that it can't because the button's let go.
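The worst-case software approach could look something like the sketch below. Everything here is hypothetical: `read_lines`, `enable_controller`, `disable_controller` and `read_sensor` stand in for whatever your platform provides, and the timing values are placeholders. The clock and sleep functions are passed in only so the logic can be exercised without hardware.

```python
import time


class SensorTimeout(Exception):
    """Raised when the sensor never answers after the button press."""


def wait_for_button_and_read(read_lines, enable_controller, disable_controller,
                             read_sensor, stable_s=0.005, timeout_s=0.1,
                             poll_s=0.001, clock=time.monotonic,
                             sleep=time.sleep):
    """Wait until SCL and SDA are both high for stable_s, then try one read.

    All four callbacks are hypothetical platform hooks:
      read_lines()          -> (scl, sda) pin levels, truthy when high
      enable_controller()   -> enable and reset the I2C peripheral
      disable_controller()  -> release the pins to plain inputs
      read_sensor()         -> one transfer; raises OSError on NACK/bus error
    """
    disable_controller()  # watch the pins in software while the bus idles low
    high_since = None
    while True:
        scl, sda = read_lines()
        now = clock()
        if scl and sda:
            if high_since is None:
                high_since = now
            if now - high_since >= stable_s:
                break  # both lines have stayed high long enough
        else:
            high_since = None  # a line dropped; start over
        sleep(poll_s)

    enable_controller()
    deadline = clock() + timeout_s
    try:
        while clock() < deadline:
            try:
                return read_sensor()
            except OSError:
                sleep(poll_s)  # button may have bounced or been released
        raise SensorTimeout("no answer before the timeout; button released?")
    finally:
        disable_controller()  # always go back to monitoring mode
```

The first loop deliberately has no timeout, since it's just waiting for someone to press the button; only the transfer itself is bounded.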
Another possibility is just a big fat pulldown resistor, and monitoring supply current to the sensor. This potentially limits selection (you can't use a sensor that draws more than [threshold] mA), and puts a hard limit on maximum cable length (but even for thin wires, this should be much longer than the typical 1 m length given).
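A back-of-envelope version of that, assuming the button simply switches the resistor across the supply at the sensor end and the host compares supply current against a fixed threshold. The wire resistance (0.21 ohm/m, about what thin 28 AWG gives) and the example sensor currents are made-up illustrative numbers, and the voltage drop is only approximate.

```python
def plan_button_sense(v_supply, r_pulldown_ohm, i_sensor_min_a,
                      i_sensor_max_a, cable_len_m, ohm_per_m=0.21):
    """Pick a DC current threshold for the button-plus-resistor trick.

    ohm_per_m is an assumed wire resistance; the current flows out on the
    supply wire and back on ground, so the loop is twice the cable length.
    """
    r_loop = 2 * cable_len_m * ohm_per_m
    i_button = v_supply / (r_pulldown_ohm + r_loop)
    released_max = i_sensor_max_a            # most the sensor draws alone
    pressed_min = i_sensor_min_a + i_button  # least drawn with button down
    if pressed_min <= released_max:
        raise ValueError("sensor draws too much; use a smaller resistor")
    return {
        "i_button_a": i_button,
        "threshold_a": (released_max + pressed_min) / 2,
        # approximate worst-case drop seen at the sensor while pressed
        "v_drop_v": (i_sensor_max_a + i_button) * r_loop,
    }


# Example: 3.3 V supply, 330 ohm resistor, sensor drawing 0.5 to 2 mA, 1 m cable
plan = plan_button_sense(3.3, 330, 0.0005, 0.002, 1.0)
print(f"threshold: {plan['threshold_a'] * 1e3:.2f} mA, "
      f"cable drop: {plan['v_drop_v'] * 1e3:.1f} mV")
```

The `ValueError` case is the "limits selection" problem: once the sensor's own draw reaches what the resistor adds, there's no threshold left to pick.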
A more refined method might look at fluctuations in supply current, so that an oscillator (at the sensor end) draws pulses periodically, or pumps charge back and forth through the supply; less current is consumed, and there's no absolute (DC) threshold, so that much more load current can be drawn. Downsides: higher complexity, and the sensor itself can generate interfering signals (i.e. its current consumption probably isn't perfectly stable, but fluctuates while it works).
Probably, the cost of one measly wire isn't going to outweigh the complexity of any of these methods, and that's alright.
Tim