1) Does anyone know how this https://github.com/promach/DDR/blob/main/phase_detector.v works internally?
I haven't looked at the code, so I can only comment on the general principle.
Master and slave data are delayed with an offset and sampled with the same clock. When there's an edge in the data, the VALID signal produces a pulse and INCDEC is set to '1' or '0' depending on whether the master and slave samples are different or the same. The code is supposed to aggregate these readings. If there is a skew (that is, INCDEC is '1' most of the time or '0' most of the time), your code is supposed to correct the delay.
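To make the VALID/INCDEC rule concrete, here is a minimal Python sketch of one sampling step as I described it. It is not taken from phase_detector.v: the function name `phase_detect` and its arguments are made up for illustration, and I'm assuming the "edge" is detected by comparing the current master sample with the previous one.

```python
def phase_detect(prev_master, master, slave):
    """Model one sampling step of the phase detector described above.

    prev_master: master sample from the previous clock (assumed edge reference)
    master, slave: current samples of the two differently delayed data copies
    Returns (valid, incdec).
    """
    # VALID pulses only when the data actually changed (an edge was seen).
    valid = prev_master != master
    # INCDEC is '1' when the two samples disagree, '0' when they agree.
    # Without an edge the reading carries no information, so report 0.
    incdec = int(master != slave) if valid else 0
    return valid, incdec
```

For example, `phase_detect(0, 1, 0)` sees an edge with disagreeing samples and returns `(True, 1)`, while `phase_detect(1, 1, 0)` has no edge and returns `(False, 0)`.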
The diagram you posted is supposed to be read from top to bottom. The top two lines produce the same reading for both master and slave, thus INCDEC is set to '0'. The delay is then decremented, which shifts the data relative to the clock - look at the second set of lines. But the master and slave readings are still the same, so INCDEC is still '0'. So they decrement the delay again - look at the third set of lines. Still the same, so they do another decrement - look at the fourth (last) set of lines. This time the master and slave readings are different, so INCDEC is now '1'. Therefore they increment the delay by one, going back to the third set of lines. This continues indefinitely, keeping the data aligned to the clock.
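The walk through the diagram can be mimicked with a toy loop. This is only a sketch under a simplifying assumption: the readings differ whenever the delay tap is at or below some hypothetical `boundary`, and every reading is acted on immediately (no filtering). The names `track`, `delay`, and `boundary` are mine, not from the repository.

```python
def track(delay, boundary, steps):
    """Toy model of the calibration loop from the diagram.

    Assumption: master and slave samples differ (INCDEC = 1) when
    delay <= boundary, and agree (INCDEC = 0) otherwise.
    Returns the sequence of delay values visited.
    """
    history = [delay]
    for _ in range(steps):
        incdec = 1 if delay <= boundary else 0
        # INCDEC = 1 increments the delay, INCDEC = 0 decrements it.
        delay += 1 if incdec else -1
        history.append(delay)
    return history
```

Starting from `track(5, 2, 6)`, the delay is decremented three times until the readings differ, and then it keeps stepping back and forth between the two taps next to the boundary, which is the "continues indefinitely" behaviour.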
The pdcount simply adds some stability. Instead of doing delay increments/decrements immediately, they accumulate INCDEC values in the pdcount register. If the INCDEC values are mixed (meaning the alignment is close to optimum), pdcount varies around "10000". If ones start to prevail, pdcount moves to "11111" and they do the increment. If zeroes start to prevail, pdcount moves to "00000" and they do the decrement. Of course you can use any number of bits - the more bits you use, the more stable and less responsive the calibration becomes.
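A rough Python model of such a pdcount filter is below. Again, this is a sketch of the principle rather than the actual Verilog: the class name `PdCounter` and the method `update` are hypothetical, and resetting the counter to the midpoint after each step is my assumption about how it would typically be done.

```python
class PdCounter:
    """Up/down counter that filters INCDEC readings before a delay step."""

    def __init__(self, bits=5):
        self.mid = 1 << (bits - 1)    # "10000" for 5 bits
        self.top = (1 << bits) - 1    # "11111" for 5 bits
        self.count = self.mid

    def update(self, valid, incdec):
        """Feed one reading; return +1, -1, or 0 as the delay step to apply."""
        if not valid:
            return 0                  # no data edge, nothing to accumulate
        self.count += 1 if incdec else -1
        if self.count >= self.top:    # ones prevail: increment the delay
            self.count = self.mid     # assumed: restart from the midpoint
            return 1
        if self.count <= 0:           # zeroes prevail: decrement the delay
            self.count = self.mid
            return -1
        return 0
```

With 5 bits it takes a net surplus of 15 ones (or 16 zeroes) before the delay moves, and evenly mixed readings never trigger a step. Passing a larger `bits` value makes the loop calmer but slower to react.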
The drawback is that you need data to maintain calibration. If the data stops, the calibration stops too. As applied to DDR3, this means that you need to read data periodically, or you lose calibration. That's where the calibrated delays in 7-series come in handy.