What do you think of section 8, where it describes pipelining?
I think it describes a shift register, not a pipeline. I don't think the text is interesting. Not because it's old, but because the author is talking about insignificant things and the text lacks substance.
A shift register is not really a pipeline. You can certainly look at a shift register as a rudimentary pipeline that has no logic. However, this is unlikely to help you understand what pipelining is.
Pipelining is where you have some sort of combinatorial logic and you split it into stages by inserting registers. All stages execute in parallel, each one working on a different set of data. Since each stage contains less logic than the whole thing, this lets you increase the clock speed and thereby increase the throughput. However, each stage takes a full clock cycle to execute. Thus you'll need several clock cycles to get a result, which increases latency.
For example, if you have
x <= a + b + c + d;
you can make it into a pipeline by inserting two intermediate registers:
x <= a_plus_b + c_plus_d;
a_plus_b <= a + b;
c_plus_d <= c + d;
This produces 2 stages. In the first stage, a_plus_b and c_plus_d are calculated. In the second stage, they're used to calculate x.
Or you can do it this way:
x <= a_plus_b_plus_c + d_plus;
a_plus_b_plus_c <= a + b + c;
d_plus <= d;
This creates an unbalanced pipeline: the first stage takes longer than the second one. Since the clock period is set by the slowest stage, this diminishes the benefits of pipelining.
But you cannot do it this way:
x <= a_plus_b_plus_c + d;
a_plus_b_plus_c <= a + b + c;
That will mess everything up completely. If you can explain why, you understand how pipelines work.
Edit: Disclaimer: All of the code above should be located inside a clocked process:
always @(posedge clk) ...
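For reference, here is a minimal sketch of the first (balanced) version wrapped in a complete module, with a tiny testbench to show the two-cycle latency. The module name, port names, the WIDTH parameter and the test values are my own assumptions for illustration, not anything from the text being discussed.

```verilog
// Two-stage pipelined adder: x holds a + b + c + d two clock edges
// after the inputs were presented.
// Module name, ports and WIDTH are illustrative assumptions.
module add4_pipelined #(
    parameter WIDTH = 8
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] a, b, c, d,
    output reg  [WIDTH+1:0] x
);
    // Stage 1 registers, one bit wider than the inputs to keep the carry.
    reg [WIDTH:0] a_plus_b;
    reg [WIDTH:0] c_plus_d;

    always @(posedge clk) begin
        // Stage 1: partial sums of the inputs presented this cycle.
        a_plus_b <= a + b;
        c_plus_d <= c + d;
        // Stage 2: combine the partial sums captured on the previous edge.
        x <= a_plus_b + c_plus_d;
    end
endmodule

// Hypothetical testbench: feeds two samples on consecutive cycles.
module add4_pipelined_tb;
    reg        clk = 0;
    reg  [7:0] a, b, c, d;
    wire [9:0] x;

    add4_pipelined #(.WIDTH(8)) dut (
        .clk(clk), .a(a), .b(b), .c(c), .d(d), .x(x)
    );

    always #5 clk = ~clk;

    initial begin
        a = 1;  b = 2;  c = 3;  d = 4;    // first sample
        @(posedge clk); #1;
        a = 10; b = 20; c = 30; d = 40;   // second sample, one cycle later
        @(posedge clk); #1;
        $display("x = %0d", x);           // sum of the first sample
        @(posedge clk); #1;
        $display("x = %0d", x);           // sum of the second sample
        $finish;
    end
endmodule
```

Note that a new sample goes in on every clock while the previous one is still in flight. That is the throughput gain. The price is that each result comes out two edges after its inputs went in.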