I realize this is an old thread, but I was just researching this exact same subject and thought what I found might prove useful.
The first question you have to answer is how fast you need to be able to handle quadrature input. This is determined by the resolution of the quadrature encoder (number of steps per revolution) and the maximum RPM you need to handle: step rate in Hz = steps/revolution x RPM / 60.
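As a rough sketch of that arithmetic (the function names here are my own, not from any library, and the numbers are only illustrative), something like this in C works out the step rate you have to keep up with, or the top RPM a given decoder can follow:

```c
#include <stdint.h>

/* Step (count) rate in Hz for a given resolution and shaft speed.
 * steps_per_rev is the count AFTER any x2/x4 decoding, e.g. a
 * 128-line disk read in x4 mode gives 512 steps per revolution. */
double quad_step_rate_hz(uint32_t steps_per_rev, double rpm)
{
    return (double)steps_per_rev * rpm / 60.0;
}

/* Highest RPM that a decoder limited to max_step_rate_hz can follow. */
double quad_max_rpm(uint32_t steps_per_rev, double max_step_rate_hz)
{
    return max_step_rate_hz * 60.0 / (double)steps_per_rev;
}
```

For example, quad_step_rate_hz(512, 1000.0) comes out to roughly 8.5 kHz, so anything that has to track 512 steps/revolution at 1000 RPM needs to sample or count comfortably faster than that.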
For handling low-speed input, by all means bit-bang it in software. Bit-banging can handle higher-speed input too, but you may want to dedicate a processor to just the quadrature input processing to ensure you don't lose track of position.
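For anyone who hasn't bit-banged quadrature before, here is a minimal sketch of the usual lookup-table approach in C. The type and function names are hypothetical, and I'm assuming you call quad_update() from a pin-change interrupt or a fast polling loop with the current A and B levels (0 or 1). Which direction counts as positive depends on how A and B are wired, so swap them if it counts backwards.

```c
#include <stdint.h>

typedef struct {
    uint8_t prev_state;  /* last sampled (A << 1) | B */
    int32_t position;    /* running count in x4 steps */
} quad_decoder_t;

/* Indexed by (prev_state << 2) | new_state.
 * +1 / -1 for a valid single-bit transition, 0 for no change or
 * for an invalid two-bit jump (a missed step or noise). */
static const int8_t QUAD_TABLE[16] = {
     0, +1, -1,  0,
    -1,  0,  0, +1,
    +1,  0,  0, -1,
     0, -1, +1,  0
};

/* Capture the starting pin levels so the first update is not miscounted. */
void quad_init(quad_decoder_t *d, uint8_t a, uint8_t b)
{
    d->prev_state = (uint8_t)(((a & 1u) << 1) | (b & 1u));
    d->position = 0;
}

/* Feed the current A/B levels; updates the position by -1, 0 or +1. */
void quad_update(quad_decoder_t *d, uint8_t a, uint8_t b)
{
    uint8_t new_state = (uint8_t)(((a & 1u) << 1) | (b & 1u));
    d->position += QUAD_TABLE[(d->prev_state << 2) | new_state];
    d->prev_state = new_state;
}
```

The catch is the one mentioned above: if the processor is busy and two edges arrive between calls, the table sees a two-bit jump and the step is silently dropped, which is why the maximum step rate matters so much for a software decoder.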
Newer ARM processors typically include a QDEC (quadrature decoder) input block that you can use as long as the step rate is < 10,000 Hz. This worked out to a max speed of ~1000 RPM at 512 steps per revolution (128-step coding disk in x4 mode) for the processor I am using (nRF52832).
If you have higher speed requirements (like a motorized application), then this can be too slow. The LS7366R-S is a great choice for these kinds of applications. In fact, the LS7366R-S is overkill for me since it will operate up to 78 kRPM at 512 steps/revolution. The downside to the LS7366R-S is that it is a single-sourced, special-purpose IC, and considering how difficult the supply chain has become, I am uncomfortable designing it into a commercial product that will be in production for > 10 years. The IC is not exactly cheap either at $7 each.
I found this app note from Renesas that explains how to implement a quadrature counter function with an SPI port using an inexpensive PLD (Programmable Logic Device). The PLD design runs at about 1/4 the speed of the dedicated LS7366R-S but costs less than $0.50 each. And the concept is easily implemented in a number of comparable parts from a bunch of IC makers.
https://www.renesas.com/br/en/document/apn/cm-277-quadrature-encoder-counter-spi-bus-interface
The app note design can operate above 20,000 RPM at 512 steps/revolution, which is a perfect fit for my application. You do need to program the PLD, but the same would be true for a processor-based solution. Programming PLDs is not something that most hobbyists are familiar with, but it is less complicated than writing embedded firmware.