Yup, that would do it.
Those are the safest settings it could choose, but definitely not the fastest.
multcount = 0 (off)
IO_support = 0 (default)
using_dma = 0 (off)
So, those are the important ones.
Right now, each 512-byte sector is read like this:
1. Send READ_SECTOR command to the ATA controller
2. Twiddle thumbs until IRQ fires
3. Issue 256 IO operations to read the data. (16 bits per read)
4. Done
Now, the "multcount" setting changes how many sectors can be transferred in one go. I.e., to transfer (for example) 4 sequential sectors, you only have to issue one command and wait for one IRQ. This produces a noticeable increase in performance.
You can adjust this value with the "-m" parameter to hdparm.
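To put rough numbers on that, here is a small back-of-the-envelope sketch. It only models the simplified description above (one command and one IRQ per group of sectors); the function name and the 64-sector example are made up for illustration, and this is not how the kernel driver is actually written.

```python
import math

def commands_and_irqs(sectors: int, multcount: int) -> int:
    """Commands issued (and IRQs waited for) to read a sequential run of sectors.

    Simplified model: each command moves up to `multcount` sectors.
    multcount = 0 (off) is treated as one sector per command.
    """
    per_command = max(multcount, 1)
    return math.ceil(sectors / per_command)

# Hypothetical example: a 64-sector sequential read at a few multcount values.
for m in (0, 4, 16):
    print(f"multcount={m:2d}: {commands_and_irqs(64, m)} commands/IRQs")
```

So in this model, going from multcount off to 16 cuts the command and IRQ count for that read from 64 down to 4.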
"IO_support" says if the IO operations are 16 or 32 bits (0 versus 1). Going 32 bits halves the number of IO operations needed to read out the data. This also produces a noticeable increase in performance.
You can adjust this value with the "-c" parameter to hdparm.
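The arithmetic behind the "256 IO operations" figure and the halving is simple enough to check. A quick sketch, assuming 512-byte sectors as above (the names here are just for illustration):

```python
SECTOR_BYTES = 512  # sector size used throughout this post

def io_reads_per_sector(bits_per_read: int) -> int:
    """Number of port reads needed to drain one sector at a given IO width."""
    bytes_per_read = bits_per_read // 8
    return SECTOR_BYTES // bytes_per_read

print(io_reads_per_sector(16))  # 16-bit IO (IO_support = 0)
print(io_reads_per_sector(32))  # 32-bit IO (IO_support = 1)
```

That gives 256 reads per sector at 16 bits and 128 at 32 bits.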
"using_dma" eliminates the IO operations altogether. The data is DMA'd into your RAM for you, and is ready to use by the time the IRQ fires.
You can adjust this value with the "-d" parameter to hdparm.
It seems your controller or disk doesn't like DMA, though.
The mismatch in sectors reported from various fdisks is an issue I've seen myself, even on 4.4.x kernels. Don't know what's up there..