Author Topic: Bear Metal ARM. Double Jump table?  (Read 3996 times)

TheBrick

• Regular Contributor
• Posts: 58
Bear Metal ARM. Double Jump table?
« on: December 16, 2014, 07:59:58 pm »
Hi,

I've had a little time and have got a bit further with my quest to understand ARM bear metal programming, and it is going quite well. I have an understanding of the system now, and I've been working my way through a more complicated example listed here:

http://www.embedded.com/design/mcus-processors-and-socs/4026075/Building-Bare-Metal-ARM-Systems-with-GNU-Part-2

Code:
    /* Setup the exception vectors to RAM
     *
     * NOTE: the exception vectors must be in RAM *before* the remap
     * in order to guarantee that the ARM core is provided with valid vectors
     * during the remap operation.
     */
    /* setup the primary vector table in RAM */
    *(uint32_t volatile *)(&__ram_start + 0x00) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x04) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x08) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x0C) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x10) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x14) = MAGIC;
    *(uint32_t volatile *)(&__ram_start + 0x18) = LDR_PC_PC | 0x18;
    *(uint32_t volatile *)(&__ram_start + 0x1C) = LDR_PC_PC | 0x18;

    /* setup the secondary vector table in RAM */
    *(uint32_t volatile *)(&__ram_start + 0x20) = (uint32_t)reset_addr;
    *(uint32_t volatile *)(&__ram_start + 0x24) = 0x04U;
    *(uint32_t volatile *)(&__ram_start + 0x28) = 0x08U;
    *(uint32_t volatile *)(&__ram_start + 0x2C) = 0x0CU;
    *(uint32_t volatile *)(&__ram_start + 0x30) = 0x10U;
    *(uint32_t volatile *)(&__ram_start + 0x34) = 0x14U;
    *(uint32_t volatile *)(&__ram_start + 0x38) = 0x18U;
    *(uint32_t volatile *)(&__ram_start + 0x3C) = 0x1CU;

    /* check if the Memory Controller has been remapped already */
    if (MAGIC != (*(uint32_t volatile *)0x14)) {
        AT91C_BASE_MC->MC_RCR = 1;   /* perform Memory Controller remapping */
    }
Code:
LDR_PC_PC contains the opcode of the ARM instruction LDR pc,[pc,...].

In this series, when the vector table is set up in RAM, the table seems to be duplicated. First the vector table is moved from ROM to RAM. I understand this, and I understand why it is done: it is faster, more power-efficient, and the table can be altered programmatically. Then the address space is remapped using

Code:
    AT91C_BASE_MC->MC_RCR = 1;   /* perform Memory Controller remapping */
Which I understand. What I don't understand is: what is the point of the secondary jump table situated immediately after the first?

An interrupt occurs -> it heads to the vector table which points the pc in the direction of the function that is to handle the interrupt. This second jump just seems to add an extra step for no reason. What am I missing?

gxti

• Frequent Contributor
• Posts: 507
Re: Bear Metal ARM. Double Jump table?
« Reply #1 on: December 16, 2014, 09:00:58 pm »
Just guessing from what the article implies, but it sounds like AT91SAM7's ISR vector is code that the CPU jumps to, as opposed to e.g. STM32 where it's a vector of addresses that the CPU reads and jumps to. The author seems to have converted the former style to the latter by using his "secondary vector table", which is a vector of addresses like STM32. That makes it easier to point vectors to different places at runtime, otherwise you'd have to generate an instruction to jump to the address you want and write it into the ISR vector. It may not even be possible to fit opcodes that jump to arbitrary addresses into a fixed space like that, but I'm not very knowledgeable about ARM opcodes. The author seems to hint that this is the case in note #5 but doesn't explicitly spell it out.

As for why you'd want to move the ISR or ISR vector into RAM -- it's not always the case that it is faster. When an interrupt happens, the CPU has to push many registers onto the stack, which means writing to RAM. If it also has to concurrently read code from RAM because you just jumped there, there could be a bottleneck. Reading code from flash, on the other hand, could happen concurrently and often has its own separate cache. If you don't need the ability to modify or dynamically load ISRs at runtime, I would stick to ROM-only approaches to make things simpler. But I have not worked with AT91SAM7 before, only STM32, so there could be reasons why it is applicable to that target.

And as for Bear ARM:

TheBrick

• Regular Contributor
• Posts: 58
Re: Bear Metal ARM. Double Jump table?
« Reply #2 on: December 16, 2014, 09:28:34 pm »
Dam homophones!

Thanks, that makes perfect sense. Rereading this section with that in mind, it makes more sense given the opcode definition:
Code:
    LDR_PC_PC = 0xE59FF000U;
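
To make the link between the two tables concrete, here is a minimal sketch that works out which address each primary-table instruction reads. It takes the LDR_PC_PC value quoted above and relies on the ARM-state rule that the pc, when used as an operand, reads as the instruction's own address plus 8. The function name ldr_pc_literal_addr is made up for this illustration and is not from the article.

```c
#include <assert.h>
#include <stdint.h>

#define LDR_PC_PC 0xE59FF000U  /* opcode of LDR pc,[pc,#+imm12], as quoted in the thread */

/* Hypothetical helper: address read by "LDR pc,[pc,#imm12]" when the
 * instruction itself sits at slot_addr. */
uint32_t ldr_pc_literal_addr(uint32_t slot_addr, uint32_t opcode)
{
    /* Only meaningful for the LDR pc,[pc,#+imm12] form. */
    assert((opcode & 0xFFFFF000U) == LDR_PC_PC);

    /* The low 12 bits are the immediate offset. */
    uint32_t imm12 = opcode & 0x00000FFFU;

    /* In ARM state the pc operand reads as instruction address + 8. */
    return slot_addr + 8U + imm12;
}
```

With the offset 0x18 used in the listing, ldr_pc_literal_addr(0x00, LDR_PC_PC | 0x18) gives 0x20 and ldr_pc_literal_addr(0x18, LDR_PC_PC | 0x18) gives 0x38. Each slot in the primary table therefore loads the pc from the word exactly 0x20 bytes further on, which is the matching slot in the secondary table.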

Also, thanks for the tip on RAM / ROM vector table placement. Given what you have said about a RAM bottleneck, and as I understand it the pipeline is fetch, decode, execute (staggered, with another fetch happening during the decode, etc.), surely this possible RAM bottleneck is hit frequently? Many instructions push to or pop from the stack, and while one of these instructions executes, the instruction that will execute two steps later is being fetched from RAM at the same time.

westfw

• Super Contributor
• Posts: 3245
Re: Bear Metal ARM. Double Jump table?
« Reply #3 on: December 18, 2014, 06:58:21 am »
The 91SAM7 series pre-dates the Cortex-M series, so it doesn't have the standard NVIC defined by ARM.  In those days, the details of the interrupt controller were vendor specific, and you had to read the Atmel datasheet to see how that particular chip worked... (um.  The interrupt cause was stuffed into the PC before invoking the basic ARM interrupt, allowing some vectoring to be achieved by further dispatching off another table indexed by the PC value.  Or something like that.)

gmb42

• Regular Contributor
• Posts: 189
Re: Bear Metal ARM. Double Jump table?
« Reply #4 on: December 18, 2014, 11:28:44 am »
The double vector table exists because, for all original non-Cortex-M ARM cores, the vector table is actually a table of instructions, not vectors. When an exception occurs, the pc is set to the exception "vector" address and the core starts executing from there.

As each exception has only one instruction slot in the table, there is not enough room to store code that sets the pc (or branches) to anywhere in the 32-bit address space. So the instruction in the first table loads the pc from the relevant address in the second table, which is a true vector table, and that vector can therefore point anywhere in the 32-bit address space.  If you made sure your exception handlers fit within the limited immediate branch range, the instructions in the first table could be branches direct to the exception handlers and the second table could be dispensed with.

Cortex-M has a real vector table consisting of vectors that are loaded into the pc when the exception occurs, and they can thus point directly to anywhere in the 32-bit address space.
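
As a rough illustration of the "limited immediate branch range" mentioned above: the ARM-state B instruction carries a signed 24-bit word offset, so it reaches roughly +/-32 MB around the instruction. The sketch below tries to build such a branch for a vector slot and reports when the handler is too far away. The helper name encode_arm_branch is hypothetical and not from the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: try to encode an ARM-state "B target" for the
 * vector slot at slot_addr. Returns false if target is out of range. */
bool encode_arm_branch(uint32_t slot_addr, uint32_t target, uint32_t *opcode)
{
    assert(opcode != 0);

    /* The offset is relative to the pc, which reads as slot_addr + 8.
     * The subtraction wraps modulo 2^32; the cast relies on the usual
     * two's-complement conversion. */
    int32_t offset = (int32_t)(target - (slot_addr + 8U));

    /* B holds a signed 24-bit word offset: -2^25 .. 2^25 - 4 bytes. */
    if (offset % 4 != 0 || offset < -(1 << 25) || offset > (1 << 25) - 4) {
        return false;
    }

    /* 0xEA = condition "always" + branch; low 24 bits = offset / 4. */
    *opcode = 0xEA000000U | ((uint32_t)(offset / 4) & 0x00FFFFFFU);
    return true;
}
```

A handler at 0x00200000 is reachable from the IRQ slot at 0x18 (the opcode comes out as 0xEA07FFF8), while a handler at 0x40000000 is not, and that second case is where the LDR pc,[pc,...] plus address-table approach is needed.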

See the explanation in the note describing part (11) of the low level init code in the linked article.
« Last Edit: December 18, 2014, 11:31:21 am by gmb42 »

abyrvalg

• Frequent Contributor
• Posts: 395
Re: Bear Metal ARM. Double Jump table?
« Reply #5 on: December 18, 2014, 11:29:50 am »
SAM7 is a "classic" ARM7TDMI, so just two interrupt vectors: IRQ at +0x18, FIQ ("fast interrupt", a mode with separate bank of registers - no need to save regs) at +0x1C. SAM7 has Atmel-specific VIC which provides a jump address in some register in FFFFFFxx space, so it is possible to have a nice vectoring with a simple LDR PC, [PC, #-offset] (don't remember the correct number - a negative offset that rolls over to FFFFFFxx) fetching the address straight from that VIC register instead of RAM.
The rest are exception vectors - RESET, UNDEF (undefined instruction), SWI (software interrupt, SWI/SVC instruction), PABT (opcode prefetch abort - execution went wild), DABT (data abort - bad data pointer dereference/misaligned access).
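
The negative offset mentioned above can be worked out rather than remembered. The sketch below assumes the interrupt controller's vector register sits at 0xFFFFF100; that address is my recollection of the AT91SAM7 memory map and is not given in this thread, so check it against the datasheet. The helper name ldr_pc_neg_offset and the macro LDR_PC_PC_NEG are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* LDR pc,[pc,#-imm12]: same as LDR pc,[pc,#+imm12] with the U bit (bit 23) clear. */
#define LDR_PC_PC_NEG 0xE51FF000U

/* Hypothetical helper: imm12 needed so that "LDR pc,[pc,#-imm12]" placed
 * at slot_addr reads the word at reg_addr. */
uint32_t ldr_pc_neg_offset(uint32_t slot_addr, uint32_t reg_addr)
{
    /* The pc reads as slot_addr + 8; unsigned subtraction wraps modulo 2^32,
     * which is the "rolls over" behaviour described in the post. */
    uint32_t imm = (slot_addr + 8U) - reg_addr;

    /* The result must fit the 12-bit immediate field. */
    assert(imm <= 0xFFFU);
    return imm;
}
```

Under that assumption, ldr_pc_neg_offset(0x18, 0xFFFFF100) returns 0xF20, so the IRQ slot would hold LDR pc,[pc,#-0xF20], encoded as 0xE51FFF20.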

Jeroen3

• Super Contributor
• Posts: 3516
• Embedded Engineer
Re: Bear Metal ARM. Double Jump table?
« Reply #6 on: December 18, 2014, 08:52:28 pm »
Quote from: gxti
Reading code from flash, on the other hand, could happen concurrently and often has its own separate cache.

I'm not so sure. If an interrupt happens, you'll be way out of the flash prefetch/cache range, so you'll pay the complete set of wait cycles for flash to return the data. And since it's an interrupt, nothing else will happen in the meantime. This is usually why the vector table is copied and remapped to RAM, since RAM (usually) has no wait states. However, this increases the risk of a corrupted vector table. But on fast targets (>70MHz) you'll often find this recommendation.

Indeed, IRQ and FIQ are in the older designs. FIQ has its own banked copies of a few processor registers (r8 to r14), so you don't have to stack them away. It should be perfect for simple interrupts that just set an event flag.

Smf