I am familiar with the L1 basis pursuit and sparse reconstruction methods of Donoho and Candes.

With only four standards, I think there is barely enough information to determine a six-point calibration uniquely (given that one is only finding S11 and S21). More data would be needed, but obtaining reliable standards would be tricky. Perhaps, for example, one could use shorted stubs of various lengths as standards, and fit the lengths of the stubs as additional unknowns.
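To make the stub idea concrete, here is a rough sketch of fitting a stub length from its reflection coefficient. It assumes an ideal lossless shorted line, for which the reflection is -exp(-2j*beta*length), and uses synthetic "measured" data. The function name, the velocity factor of 0.66, the frequency range, and the grid search are all my own illustrative choices, not anything taken from a real calibration kit.

```python
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def shorted_stub_gamma(freq_hz, length_m, velocity_factor=0.66):
    """Reflection coefficient of an ideal lossless shorted stub (assumed model)."""
    beta = 2 * np.pi * freq_hz / (velocity_factor * C0)  # phase constant, rad/m
    return -np.exp(-2j * beta * length_m)


# Synthetic "measurement": a stub whose length we pretend not to know.
freqs = np.linspace(1e6, 300e6, 101)
true_length = 0.123  # metres, hypothetical
measured = shorted_stub_gamma(freqs, true_length)

# Brute-force fit: choose the candidate length with the smallest squared error.
candidates = np.linspace(0.05, 0.20, 1501)
errors = [np.sum(np.abs(shorted_stub_gamma(freqs, L) - measured) ** 2)
          for L in candidates]
best_length = candidates[int(np.argmin(errors))]
print(f"fitted length: {best_length:.4f} m")
```

A real stub has loss and a connector discontinuity, so in practice the fit would need more parameters than just the length, which again means more data.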

I think L1 minimization is only good for sparse problems, and I don't think VNA calibration is sparse, unless you plan to basically only find a limited subset of calibration corrections that best correct the data. But if you want to find a fully dense set of calibration variables, there is no way around simply needing more data.

The problem is highly nonlinear. One way to solve it in practice is to use iteratively reweighted least squares, which can be set up so that it minimizes an L1 norm. This is a variation on the Levenberg-Marquardt method: you compute the Jacobian matrix of partial derivatives of the data with respect to the calibration parameters, then solve the linearized least-squares system (in effect, pseudo-inverting the Jacobian) for the best-fit parameters, adjusting the weights as you go to favor one parameter over another. The parameters represent hypotheses about which model best fits the calibration standards.
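As a minimal sketch of the reweighting idea only, here is iteratively reweighted least squares on a small linear problem, where the matrix A plays the role of the Jacobian at one linearization step. This is a simplification: the actual calibration problem is nonlinear and would need an outer loop that recomputes the Jacobian. The function name, the synthetic data, and the 1/|residual| weighting with a small floor are my assumptions for illustration.

```python
import numpy as np


def irls_l1(A, b, n_iter=50, eps=1e-8):
    """Approximately minimize ||A x - b||_1 by iteratively reweighted least squares."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # start from the ordinary L2 solution
    for _ in range(n_iter):
        r = A @ x - b
        # Weighting each squared residual by 1/|r| turns the sum into sum |r|.
        # The floor eps avoids division by zero for residuals that vanish.
        w = 1.0 / np.maximum(np.abs(r), eps)
        sw = np.sqrt(w)
        # Weighted least squares: scale the rows of A and b by sqrt(w).
        x = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return x


# Synthetic test: exact linear data with a few gross outliers.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true
b[::10] += 20.0  # corrupt 4 of the 40 observations

x_l2 = np.linalg.lstsq(A, b, rcond=None)[0]
x_l1 = irls_l1(A, b)
print("L2 error:", np.linalg.norm(x_l2 - x_true))
print("L1 error:", np.linalg.norm(x_l1 - x_true))
```

The L1 fit largely ignores the corrupted rows, whereas the plain least-squares fit is pulled toward them. That robustness is the main practical reason to bother with the reweighting.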

For the VNA I built, I wrote down a brief description of the calibration method I used. It was very simple and worked:

https://github.com/profdc9/VNA/blob/master/doc/VNA%20calibration%20method.pdf

Might I suggest that the NanoVNA (and my design) have some problems that probably make a method like this overkill and not likely to result in much improvement in practice? Designs using the SA612 are susceptible to thermal drift, even though they are partially thermally compensated. Also, the codec clock is not synchronized with the SI5351A clock on the NanoVNA, so there is phase error that accumulates over a longer interval. I have triggered the data acquisition off of the intermediate frequency signal to try to remedy this problem and have obtained higher dynamic range as a result.