LOL, so you don't want to play the guitar, you want to play with the DSP techniques, sorry for not reading in full the first post.
The first idea that comes to mind (and I think the most simple) will be to implement a homodyne detection algorithm (AKA lock-in amplifier).
https://en.wikipedia.org/wiki/Homodyne_detection (https://en.wikipedia.org/wiki/Homodyne_detection)
https://en.wikipedia.org/wiki/Lock-in_amplifier (https://en.wikipedia.org/wiki/Lock-in_amplifier)
Continuously multiply your samples with the reference frequency, then integrate. I didn't put it on paper, but it should work even if the reference is not sinusoidal, so you can multiply with a square wave instead of a reference sinus. To avoid multiplication, you can use a square wave of value +/-1.
Even simpler, you can use a reference square wave of 0/1, which in the end will be to continuously count your samples, and add only those overlapping with the 1 semi-alternance of the reference square wave. So, all you have to do is to pick which samples to add, and which samples to skip in a running window adder. You will need a second reference square wave which is 90 degree shifted (quadrature) and do a running adder for it, too. The guitar string is tuned on the reference frequency when (adder1*adder1 + adder2*adder2) is maxim.
The math of the lock-in amplifier is nicely explained in https://www.youtube.com/watch?v=rzzliN_vTKs (https://www.youtube.com/watch?v=rzzliN_vTKs) between minute 4:00 and 8:00. Should work, too, with a square 0/1 instead of a sinus reference, so you get rid of most of the multiplications and use addition instead.
Now, the only problem is that the amplitude of the string is not constant, so we can turn it into a saturated square wave too, like we did with the reference frequency. In the end we will just add (AKA integrate) the signum of the sampled signal, but we will add only those samples where the reference signal is 1, and discard those where the value of the reference oscillator is zero.
The guitar string is tuned when sigma1*sigma1+sigma2*sigma2 is maxim, where sigma1, sigma2 are something like this:
If reference == 1 then
sigma1 = sum(signum(sample))
else
discard sample;
If reference_quadrature == 1 then
sigma2 = sum(signum(sample))
else
discard sample;
Didn't test it, thought, so I hope I'm not assuming anything fundamentally wrong here ^-^.
Success!
It actually turns out that the I2S sample rate was 44100, not 48000, and I had a stupid coding error - in DSP code there are seems to be a million ways to do stupid...
I know that the code still isn't technically correct (I think I've got time going the wrong way), but it works and I know have a tuned 12-string, that I can put back under the bed.
In case anybody discovers this in the future and says 'but what did they do?", here it the guts:
#define MAX_REF_LENGTH 2048
/* note rx_buf is stereo samples, and only one channel is used */
void do_dsp(int offset, int length) {
static int first = 1;
int i,s;
if(first) {
/***********************************
* Build reference sin/cos waves
***********************************/
for(i = 0; i < N_STRING; i++) {
int j;
phase_per_sample[i] = freqs[i]/SAMPLE_RATE;
phase_from_tuned[i] = 0.0;
display_phase[i] = 0;
for(j = 0; j < MAX_REF_LENGTH; j++) {
kernel_cos[i][j] += cos(j * phase_per_sample[i]*PI*2);
kernel_sin[i][j] += sin(j * phase_per_sample[i]*PI*2);
}
}
first = 0;
}
for(s = 0; s < N_STRING; s++) {
int i;
float x1 = 0.0, y1 = 0.0;
float x2 = 0.0, y2 = 0.0;
float a1, a2, diff;
/* Find the phase angle at the start of the buffer */
for(i = 0; i < length; i++) {
x1 += kernel_cos[s][i] * rx_buf[2*i];
y1 += kernel_sin[s][i] * rx_buf[2*i];
}
a1 = atan2f(y1,x1)/(2*PI);
levels[s] = x1*x1+y1*y1;
/* Find the phase angle at an offset through the buffer */
x2 = 0; y2 = 0;
for(i = 0; i < length; i++) {
x2 += kernel_cos[s][i] * rx_buf[2*(i+offset)];
y2 += kernel_sin[s][i] * rx_buf[2*(i+offset)];
}
a2 = atan2f(y2,x2)/(2*PI);
/* adjust the phase angle of the early signal by 'offset' samples */
a1 -= phase_per_sample[s]*offset-floor(phase_per_sample[s]*offset);
if(a2 < 0.0) a2 += 1.0;
if(a2 < 0.0) a2 += 1.0;
if(a1 < 0.0) a1 += 1.0;
/* Work out the phase difference */
diff = a2-a1;
/* Clamp the phase angle to +/- 0.5 */
if(diff >= 0.5) diff -= 1.0;
if(diff < -0.5) diff += 1.0;
phase_from_tuned[s] = diff;
}
}
Oh, and my tuner implementation is usable with offset of 2000 and length of 2000.
Here is an "this might just be working" video:
https://www.youtube.com/watch?v=8ty29MD_TBI (https://www.youtube.com/watch?v=8ty29MD_TBI)