Taking it really slow...
We have four 'things' competing for access to a shared resource - CPUs accessing memory, people accessing a shower, whatever.
Call them 'A', 'B', 'C', and 'D'. We want to give them all fair access.
So the following 'round robin' system is chosen:
If 'A' last had access then the priorities will be 'B','C','D' and lowest will be 'A'
If 'B' last had access then the priorities will be 'C','D','A' and lowest will be 'B'
If 'C' last had access then the priorities will be 'D','A','B' and lowest will be 'C'
If 'D' last had access then the priorities will be 'A','B','C' and lowest will be 'D'
To make life simpler, rather than remembering who had the resource last, we will remember who has the highest priority.
That is what figure 2.2 implements - the register in the top block holds who has priority to the resource, and the logic works out the new priorities.
You end up with these patterns. When there are competing requests, you grant the first requester in the list, and the highest priority then moves to the one after the requester just granted:
Highest priority => complete priority list, from highest to lowest.
A => ABCD
B => BCDA
C => CDAB
D => DABC
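
If it helps to see that table as something executable, here is a minimal Python sketch of the behaviour (a software model only, not the logic in figure 2.2). The names priority_order and arbitrate are made up for illustration, and leaving the priority unchanged when nobody is requesting is my assumption; the description above doesn't cover that case.

```python
NAMES = "ABCD"  # the four requesters from the text

def priority_order(head):
    """Full priority list, highest first, when `head` has top priority."""
    i = NAMES.index(head)
    return NAMES[i:] + NAMES[:i]

def arbitrate(head, requests):
    """Grant the first requester in priority order.

    Returns (granted, new_head). `requests` is the set of names currently
    requesting. With no requests the head is left unchanged (an assumption).
    """
    for name in priority_order(head):
        if name in requests:
            # Highest priority moves to the one after the requester granted.
            new_head = NAMES[(NAMES.index(name) + 1) % len(NAMES)]
            return name, new_head
    return None, head

for head in NAMES:
    print(head, "=>", priority_order(head))

# C has top priority, but only A and B are requesting: A wins, B is next.
print(arbitrate("C", {"A", "B"}))  # ('A', 'B')
```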
Hopefully you follow me this far.
The problem with that logic was the combinational loop caused by the priorities being reshuffled - the one that Verilator can't handle.
Here's how that is solved.
An N+(N-1) stage, fixed-order arbiter is created (7 stages where N = 4), with the first N-1 requests repeated at the end.
For our four requesters the priority will be a never-changing "ABCDABC". What does change is that the first three stages can be disabled depending on who currently has priority to the resource:
A => ABCDABC
B => -BCDABC
C => --CDABC
D => ---DABC
You end up with the same effective priority as the round robin, but you have fixed, unchanging priorities in the logic, and no loop.
To make it clearer, here it is with the second occurrence of each request removed (it can never win, because the first occurrence has higher priority and will always be granted instead):
A => ABCD---
B => -BCDA--
C => --CDAB-
D => ---DABC
That is why in figure 2.3 the "request" lines for the first N-1 inputs feed two blocks, and there are N-1 OR gates to merge the grant outputs.
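
To convince yourself the two schemes really are equivalent, here is a small Python model of the 2N-1 stage trick, checked against a plain rotating-priority model. This is a sketch of the idea, not a transcription of figure 2.3: the function names are hypothetical, and the requesters are numbered 0 to 3 rather than lettered A to D.

```python
N = 4  # number of requesters (A, B, C, D as indices 0..3)

def fixed_priority(req):
    """Plain fixed-order arbiter: grant only the lowest-index active request."""
    grant = [False] * len(req)
    for i, r in enumerate(req):
        if r:
            grant[i] = True
            break
    return grant

def round_robin_grant(head, req):
    """2N-1 stage arbiter. `head` is the index with highest priority."""
    # Stages 0..N-1: the real requests, with those below `head` disabled.
    # Stages N..2N-2: the first N-1 requests repeated, never disabled.
    stages = [req[i] and i >= head for i in range(N)] + list(req[:N - 1])
    g = fixed_priority(stages)
    # N-1 OR gates merge each repeated stage's grant with the original one.
    return [g[i] or (i < N - 1 and g[N + i]) for i in range(N)]

def rotating_reference(head, req):
    """The reshuffled-priority behaviour the fixed chain should match."""
    grant = [False] * N
    for k in range(N):
        i = (head + k) % N
        if req[i]:
            grant[i] = True
            break
    return grant

# Exhaustive check over every head and every request combination.
for head in range(N):
    for bits in range(1 << N):
        req = [bool((bits >> i) & 1) for i in range(N)]
        assert round_robin_grant(head, req) == rotating_reference(head, req)

# C has priority, A and B request: the repeated 'A' stage wins.
print(round_robin_grant(2, [True, True, False, False]))
# [True, False, False, False]
```

Note that nothing in round_robin_grant depends on its own output: the stage order is fixed and only the disable mask changes, which is the point of the rework.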