Call me silly, but my own preference in this kind of case would be to create a logic interface to the external matrix, and then interface it to my microcontroller using UART over a digital isolator like ISO6721 (1/1) or ISO6742DWR (2/2).
Looking at Mouser, ATtiny417 is around 1€ apiece, and can be programmed in C/C++ with the same toolchain (GCC or clang) as the MK66 and RT1062, although the backend is different (AVR instead of ARMv7E-M Cortex-M4/M7). ISO6721BDR is 1.17€ in hundreds and 1.62€ in singles. For the 16 input signal lines, I'd use 0603 resistor arrays, 10k and 3.3k (isolated/separate), in voltage divider configuration (3.3/13.3≃0.248, 12V×3.3/(3.3+10)≃2.98V). With 5% resistors, the factor is 0.248±0.019.
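To double-check the divider arithmetic above, here is a minimal sketch in C that computes the nominal and worst-case divider factors. The names divider_range and divider_ratio are made up for this illustration, and it assumes the two resistors vary independently (they come from separate arrays), which is the pessimistic case.

```c
/* Nominal and worst-case ratio of a resistive divider.
   Hypothetical helper, not from any library. */
typedef struct {
    double min;   /* lowest possible ratio  */
    double nom;   /* ratio with exact parts */
    double max;   /* highest possible ratio */
} divider_range;

/* r_top:    resistor from the input to the tap (10k here)
   r_bottom: resistor from the tap to ground (3.3k here)
   tol:      relative tolerance, e.g. 0.05 for 5% parts */
static divider_range divider_ratio(double r_top, double r_bottom, double tol)
{
    divider_range r;

    r.nom = r_bottom / (r_bottom + r_top);

    /* Lowest ratio: smallest bottom resistor with largest top resistor. */
    r.min = r_bottom * (1.0 - tol)
          / (r_bottom * (1.0 - tol) + r_top * (1.0 + tol));

    /* Highest ratio: largest bottom resistor with smallest top resistor. */
    r.max = r_bottom * (1.0 + tol)
          / (r_bottom * (1.0 + tol) + r_top * (1.0 - tol));

    return r;
}
```

Calling divider_ratio(10e3, 3.3e3, 0.05) gives roughly 0.230 to 0.267 around the nominal 0.248, so a 12V input should land between about 2.76V and 3.21V at the tap.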
The trick is that I would power the ATtiny417 and ISO6742 from the output of a power opamp in unity buffer configuration (total maximum current draw is about 20mA), with the opamp input fed from a similar voltage divider. The dividers need to be chosen so that, taking the resistor variance into account, the divided input voltages won't exceed the supply voltage. Perhaps TI TLV9301IDBVR, at about 0.50€.
The main loop on the ATtiny417 would check the row selector inputs at regular intervals (basically the main loop is written around the row selector checking, with a delay to ensure minimum iteration interval). When the row selector inputs stabilize (only a specific one remains low for three consecutive main loop iterations), the column data inputs are checked, the matrix state updated accordingly, and applicable events sent.
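The "stable for three consecutive iterations" check could look something like the following sketch. It is plain C with no hardware access, so how the row selector pins are actually sampled is left out. The 8 active-low row selectors packed into one byte, and the names row_tracker, single_low_row and row_tracker_update, are my assumptions for illustration only.

```c
#include <stdint.h>

#define STABLE_ITERATIONS 3   /* consecutive identical samples required */

typedef struct {
    int8_t  candidate;  /* row seen low on the previous iteration, or -1 */
    uint8_t count;      /* consecutive iterations the candidate was seen */
} row_tracker;

/* Returns the index of the single low bit in rows (active-low),
   or -1 if no row or more than one row is low. */
static int8_t single_low_row(uint8_t rows)
{
    uint8_t low = (uint8_t)~rows;
    if (low == 0 || (low & (low - 1)) != 0)
        return -1;

    int8_t index = 0;
    while (!(low & 1)) {
        low >>= 1;
        index++;
    }
    return index;
}

/* Call once per main loop iteration with the sampled row selectors.
   Returns the row index exactly once, on the iteration where it has
   been the only low row for STABLE_ITERATIONS samples; otherwise -1. */
static int8_t row_tracker_update(row_tracker *t, uint8_t rows)
{
    int8_t row = single_low_row(rows);

    if (row < 0 || row != t->candidate) {
        /* Changed or ambiguous: start counting again. */
        t->candidate = row;
        t->count = (row >= 0) ? 1 : 0;
        return -1;
    }

    if (t->count < STABLE_ITERATIONS) {
        t->count++;
        if (t->count == STABLE_ITERATIONS)
            return row;   /* time to read the column inputs */
    }
    return -1;
}
```

The tracker starts out as { -1, 0 }. When row_tracker_update() returns a row index, the main loop would read the column inputs for that row and compare them to the stored matrix state to generate events.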
I personally would prefer UART for the communications if possible, because then the ATtiny could always send one byte per event: 0..63 for keypresses, 64..127 for autorepeats, and 192..255 for releases. This leaves 0xAA = 0b10101010 = 170 free for synchronization (although the ATtiny417 internal oscillator is 20MHz ± 2%), with code (c<128) indicating matrix key (c&63) press or autorepeat event. The other side of the UART could be used to request N sync bytes, change key dead time, initial repeat delay, autorepeat interval, etc.
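The one-byte event encoding described above can be sketched in C like this. The byte ranges and the 0xAA sync value are from the description; the function and enum names are made up for this example, and treating everything in 128..191 as "other" is just my assumption about how a receiver might handle the unassigned codes.

```c
#include <stdint.h>

#define EVENT_SYNC 0xAAu   /* 170, inside the unused 128..191 range */

typedef enum {
    EVENT_PRESS,    /*   0..63  */
    EVENT_REPEAT,   /*  64..127 */
    EVENT_RELEASE,  /* 192..255 */
    EVENT_OTHER     /* 128..191: sync byte and unassigned codes */
} event_kind;

/* Sender side (ATtiny): key is the matrix key number, 0..63. */
static uint8_t encode_press(uint8_t key)   { return (uint8_t)(key & 63u); }
static uint8_t encode_repeat(uint8_t key)  { return (uint8_t)(64u | (key & 63u)); }
static uint8_t encode_release(uint8_t key) { return (uint8_t)(192u | (key & 63u)); }

/* Receiver side: classify byte c and store the key number in *key.
   *key is only meaningful when the result is not EVENT_OTHER. */
static event_kind decode_event(uint8_t c, uint8_t *key)
{
    *key = (uint8_t)(c & 63u);
    if (c < 64u)
        return EVENT_PRESS;
    if (c < 128u)
        return EVENT_REPEAT;
    if (c >= 192u)
        return EVENT_RELEASE;
    return EVENT_OTHER;
}
```

For example, key 5 would be sent as 5 when pressed, 69 on each autorepeat, and 197 when released.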
Not only would the 12V keyboard matrix and its associated logic be electrically isolated from the MCU side, but this also makes the UART input on the MK66/RT1064 MCU an event queue, simplifying the main loop a lot. The UART interrupt is trivial, as it just stuffs the received byte into a circular buffer of sufficient size; and, depending on what exactly the buttons are, it can even discard the key releases if the duration of each keypress is not important.
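A minimal sketch of such a circular buffer, in portable C. The names event_queue, event_queue_put and event_queue_get, and the size of 64, are my own assumptions; the actual UART interrupt handler (which depends on the MCU and framework used) would just read the received byte and call event_queue_put(). It relies on there being exactly one writer (the interrupt) and one reader (the main loop), and on single-byte index accesses being atomic, which I believe holds on Cortex-M.

```c
#include <stdint.h>

#define QUEUE_SIZE 64u   /* assumed size; must be a power of two */

typedef struct {
    volatile uint8_t head;              /* written only by the interrupt */
    volatile uint8_t tail;              /* written only by the main loop */
    volatile uint8_t data[QUEUE_SIZE];
} event_queue;

/* Called from the UART receive interrupt with the received byte.
   Returns 1 if stored, 0 if the queue was full and the byte dropped. */
static int event_queue_put(event_queue *q, uint8_t c)
{
    uint8_t next = (uint8_t)((q->head + 1u) & (QUEUE_SIZE - 1u));
    if (next == q->tail)
        return 0;
    q->data[q->head] = c;
    q->head = next;
    return 1;
}

/* Called from the main loop. Returns 1 and stores the oldest byte
   in *c, or returns 0 if the queue is empty. */
static int event_queue_get(event_queue *q, uint8_t *c)
{
    if (q->tail == q->head)
        return 0;
    *c = q->data[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) & (QUEUE_SIZE - 1u));
    return 1;
}
```

One slot is always left empty to tell a full queue from an empty one, so this holds at most 63 pending events. To ignore releases, the interrupt could simply skip the put when the received byte is 192 or greater.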
I'd also make it a separate unit from the main MCU: just switch that few-euro module if it burns out. The isolator ensures any oddness on the 12V side should be limited to that side, which helps with expensive MCUs like RT1064 (currently 20€+ in singles, 15€+ even in hundreds). I like the modularity here, and how using a secondary MCU simplifies the overall design, and even helps with repairability (considering the 12V keyboard matrix a glitchy black box).
Now you are absolutely free to roll your eyes; even I am not sure if this suggestion would work! Again, I'm just a hobbyist, so I don't know whether this would make any business sense even if it does work; and I'm almost certainly missing things experienced EEs know to avoid here.