There are two ways to do this, and I think you're getting confused between the two.
-Way one-
The first way is to read in one byte on each pass through the main program loop (from start to end).
The micro runs through all the code and saves one byte into the array (or checks one header byte).
Then the main program loop starts again and it all happens a second time, saving the next byte into the array (or checking the next header byte).
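
Here is a rough sketch of way one, just to show the shape of it. So that it compiles on a PC, it uses a small made-up stand-in called FakeSerial in place of the real Arduino Serial object; the array name, its size of 8 and the function name loopOnce are also my own assumptions. On a real board you would drop FakeSerial and put the body of loopOnce inside loop().

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical desktop stand-in for the Arduino Serial object.
// It only mimics the two calls discussed here: available() and read().
struct FakeSerial {
    std::deque<uint8_t> buffer;  // bytes received but not yet read

    int available() const { return static_cast<int>(buffer.size()); }

    int read() {
        if (buffer.empty()) return -1;  // like Serial.read(), -1 means "nothing there"
        int b = buffer.front();         // oldest byte first
        buffer.pop_front();             // reading removes it from the buffer
        return b;
    }
};

FakeSerial Serial;

const std::size_t MESSAGE_SIZE = 8;  // assumed array size, pick your own
uint8_t message[MESSAGE_SIZE];
std::size_t messageIndex = 0;

// One pass of the main program loop: take at most ONE byte, then carry on.
void loopOnce() {
    if (Serial.available() > 0 && messageIndex < MESSAGE_SIZE) {
        message[messageIndex++] = static_cast<uint8_t>(Serial.read());
    }
    // ... the rest of the main loop code would run here ...
}
```

If three bytes are waiting in the buffer, it takes three passes of loopOnce to move them all into the array.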
-Way two-
This way is only possible because the Arduino has a 128-byte receive buffer built into the serial functions (Serial.available, Serial.read, etc.). The exact size depends on the board and core version; newer cores use a smaller buffer, typically 64 bytes.
This means at any time in your program the serial buffer might have nothing in it, or anything from 1 byte up to a full buffer.
Every time you call Serial.read, the first (oldest) byte in the buffer is returned and removed from the buffer. If the buffer is empty, Serial.read returns -1.
If you call Serial.available, it returns the number of bytes currently in the buffer.
So "if (Serial.available() > 0)" means "if the buffer contains more than 0 bytes".
This means you have the option of reading in multiple bytes from the buffer during one pass of your code. However, the chances are pretty high that your code is going to run so fast that the buffer won't have time to fill up past 1 byte before you read and clear it. For example, at 9600 baud a new byte arrives only about once every millisecond. So it's probably not worth worrying about unless you plan to have a lot of other code running in the main program loop that might take up lots of CPU time.
In essence both ways are the same, it's just that way "one" uses the main program loop to read bytes and way "two" uses a second loop inside the main program loop to do it.
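
For comparison, here is a rough sketch of way two. As before, FakeSerial is a made-up stand-in for the real Arduino Serial object so the code can be compiled on a PC, and the array name, its size and loopOnce are assumptions of mine. The only real difference from way one is that the "if" becomes a "while".

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical desktop stand-in for the Arduino Serial object.
struct FakeSerial {
    std::deque<uint8_t> buffer;  // bytes received but not yet read

    int available() const { return static_cast<int>(buffer.size()); }

    int read() {
        if (buffer.empty()) return -1;  // like Serial.read(), -1 means "nothing there"
        int b = buffer.front();
        buffer.pop_front();
        return b;
    }
};

FakeSerial Serial;

const std::size_t MESSAGE_SIZE = 8;  // assumed array size, pick your own
uint8_t message[MESSAGE_SIZE];
std::size_t messageIndex = 0;

// One pass of the main program loop: empty whatever is in the buffer right now.
void loopOnce() {
    // The inner loop ends as soon as the buffer is empty (or the array is full).
    // It never waits for more bytes to arrive.
    while (Serial.available() > 0 && messageIndex < MESSAGE_SIZE) {
        message[messageIndex++] = static_cast<uint8_t>(Serial.read());
    }
    // ... the rest of the main loop code would run here ...
}
```

With three bytes waiting, a single pass of loopOnce moves all three into the array. With an empty buffer it falls straight through.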
In order to write the code you first need to pick one.
Personally I like the first way, as it's quite simple and it keeps the main program loop executing faster (one loop per byte read).
Oh, another thing: you definitely want to avoid having a loop (inside your main loop) which waits for data to arrive in the buffer. This will lock up your main loop during these waiting times, and any other code you put in your main program loop later on will be affected by this lockup.
You can read out all bytes in the buffer and save them (way two) and that's fine, but you don't want to wait for new bytes to arrive. Let the main loop keep executing as fast as it can and process bytes as they appear in the buffer.
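
To make the "don't wait" point concrete, here is a rough sketch. The waiting loop to avoid is shown only as a comment. FakeSerial is again a made-up desktop stand-in for the Arduino Serial object, and otherWorkCount is just a hypothetical counter standing in for whatever else your main loop does.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical desktop stand-in for the Arduino Serial object.
struct FakeSerial {
    std::deque<uint8_t> buffer;  // bytes received but not yet read

    int available() const { return static_cast<int>(buffer.size()); }

    int read() {
        if (buffer.empty()) return -1;
        int b = buffer.front();
        buffer.pop_front();
        return b;
    }
};

FakeSerial Serial;

const std::size_t MESSAGE_SIZE = 8;  // assumed array size
uint8_t message[MESSAGE_SIZE];
std::size_t messageIndex = 0;
unsigned long otherWorkCount = 0;    // stands in for the other code in the main loop

void loopOnce() {
    // AVOID this kind of loop: it stalls everything until a byte turns up.
    //   while (Serial.available() == 0) { }

    // Instead, process only what is already there, then move on.
    while (Serial.available() > 0 && messageIndex < MESSAGE_SIZE) {
        message[messageIndex++] = static_cast<uint8_t>(Serial.read());
    }

    otherWorkCount++;  // the other code still runs on every pass, data or no data
}
```

Even when nothing arrives for several passes, otherWorkCount keeps going up, which is exactly what you want from the main loop.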