Author Topic: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data  (Read 6360 times)

0 Members and 2 Guests are viewing this topic.

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
I've run into a bizarre problem with the SysTick timer interrupt on my CH32V003 project. Basically, what's happening is that a certain section of code being executed inside the SysTick ISR is somehow causing data transmitted on the UART to be corrupted. Yeah, I know, sounds crazy! But that is what I've painstakingly narrowed it down to. I am at a loss as to how this is occurring, and how to proceed in fixing it.

The problem also only occurs when my code is compiled with -Os, and not when compiled in debug mode with -g (and no -O flags).

The UART corruption takes the form of a character being corrupted roughly every N characters. For example, a string of "-----------------------------" might become "------------<-----------<----". The corrupted character is random (i.e. not a bit-flip of the correct character) but always the same at each instance. What's interesting - and this is what put me on to the trail of the SysTick ISR - is that I noticed that N is a value such that, given the baud rate the UART is operating at, it roughly corresponds to the regular period that the SysTick ISR is being called. In my case, baud is 115200, the bit period is 8.68 us, so each character takes 86.8 us (8N1, so 10 bits per character), and if N = 12 (as per example above), that's 1041 us - approximately matching the 1 ms period of the ISR!

This is my ISR's code:

Code: [Select]
extern void SysTick_Handler(void) __attribute__((naked));

void SysTick_Handler(void) {
const uint32_t ticks_orig = sys_timestamp.ticks;
if(++sys_timestamp.ticks < ticks_orig) sys_timestamp.epochs++;

/**** THIS BLOCK IS THE CULPRIT (when commented, problem disappears) ****/
for(size_t i = 0; i < interval_calls.count; i++) {
if(--interval_calls.calls[i].interval_remaining_ms == 0) {
interval_calls.calls[i].callback();
interval_calls.calls[i].interval_remaining_ms = interval_calls.calls[i].interval_ms;
}
}

SysTick->SR = 0;

interrupt_return(); /* macro for "mret" inline assembly instruction */
}

The SysTick timer is set up with this code:

Code: [Select]
void SysTick_Init(const uint32_t hclk_freq_hz) {
SysTick->CMP = (hclk_freq_hz / 1000) - 1;
SysTick->CNT = 0;
SysTick->SR = 0;
SysTick->CTLR = STK_STRE | STK_STCLK | STK_STIE | STK_STE;
interrupt_enable(SysTicK_IRQn); /* macro to enable IRQ in PFIC->IENR */
}

The ISR code is just some straightforward code that increments a timestamp, but also has a rudimentary 'callback' system that will call any registered functions at a given interval in milliseconds. I use it to sample a GPIO for button input every 20 ms.

What's stupid is that even when no interval callbacks are registered (i.e. interval_calls.count = 0, verified by dumping that value with GDB) and the loop body should not be executing, the problem still occurs, so it's not like it's inadvertently calling some random function pointer and jumping of into who-knows-where and clobbering the UART registers or something.

In trying to get some insight into this by debugging with GDB, there's also something wacky going on there concerning the ISR that I can't explain. According to the disassembly produced by objdump (from the compiled .elf), SysTick_Handler starts at address 0xf5c, yet GDB thinks SysTick_Handler starts at 0xf66 if I set a breakpoint by name. If I set a breakpoint by address at 0xf5c, it doesn't get hit! [False alarm, somehow I managed to set the breakpoint to a disabled state.] I don't know WTF is going on there, but I have a feeling it might be part of the problem.

Here's the disassembly of the ISR in case it helps:

Code: [Select]
00000f5c <SysTick_Handler>:
     f5c: 82818793          addi a5,gp,-2008 # 20000028 <sys_timestamp>
     f60: 43d4                lw a3,4(a5)
     f62: 43d8                lw a4,4(a5)
     f64: 0705                addi a4,a4,1 # 1001 <uart_init+0x53>
     f66: c3d8                sw a4,4(a5)
     f68: 00d77963          bgeu a4,a3,f7a <SysTick_Handler+0x1e>
     f6c: 0007c703          lbu a4,0(a5)
     f70: 0705                addi a4,a4,1
     f72: 0ff77713          zext.b a4,a4
     f76: 00e78023          sb a4,0(a5)
     f7a: 83c18413          addi s0,gp,-1988 # 2000003c <interval_calls>
     f7e: 4481                li s1,0
     f80: 8722                mv a4,s0
     f82: 431c                lw a5,0(a4)
     f84: 00f4e863          bltu s1,a5,f94 <SysTick_Handler+0x38>
     f88: e000f7b7          lui a5,0xe000f
     f8c: 0007a223          sw zero,4(a5) # e000f004 <__global_pointer$+0xc000e804>
     f90: 30200073          mret
     f94: 441c                lw a5,8(s0)
     f96: 17fd                addi a5,a5,-1
     f98: c41c                sw a5,8(s0)
     f9a: e799                bnez a5,fa8 <SysTick_Handler+0x4c>
     f9c: 445c                lw a5,12(s0)
     f9e: 9782                jalr a5
     fa0: 405c                lw a5,4(s0)
     fa2: 83c18713          addi a4,gp,-1988 # 2000003c <interval_calls>
     fa6: c41c                sw a5,8(s0)
     fa8: 0485                addi s1,s1,1
     faa: 0431                addi s0,s0,12
     fac: bfd9                j f82 <SysTick_Handler+0x26>

I don't know where to proceed with this. :( Any suggestions for avenues of attack?
« Last Edit: March 29, 2023, 03:07:09 am by HwAoRrDk »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #1 on: March 29, 2023, 03:44:01 am »
Presumably you have enabled their "fast interrupt" register stacking thing.

I worry a little about calling a lot of code from that state, but haven't really thought about it.

I absolutely worry about the timing implications of calling an arbitrary amount of code from an interrupt handler. I think you should increment the tick count there and nothing else. Put the callback scheduling stuff in non-interrupt code.
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #2 on: March 29, 2023, 05:09:21 am »
Presumably you have enabled their "fast interrupt" register stacking thing.

Yes, hardware prologue/epilogue (HPE) is being used, but not vector table-free (VTF).

I worry a little about calling a lot of code from that state, but haven't really thought about it.

I absolutely worry about the timing implications of calling an arbitrary amount of code from an interrupt handler. I think you should increment the tick count there and nothing else. Put the callback scheduling stuff in non-interrupt code.

The callback stuff is only really an abstraction to allow me to separate code into different files rather than pile stuff into the ISR. Technically it could be calling an arbitrary amount of code, but in practice it isn't, and I know not to use it like that. It's only a very small amount of code in the callbacks. I started out with two: one to toggle an LED every 500 ms (kind of a 'heartbeat'); a second to sample a GPIO for a button every 20 ms. Not exactly taxing. :) Got rid of the former and now I have just the latter.

As I mentioned before, everything works just fine when compiled without -Os. So something about the optimisation is screwing things up, or perhaps revealing an unseen issue.

One further thing I've noticed when attempting to investigate with GDB, is that the corrupted characters appear after the ISR returns. That is, if my code is printing a long string of characters, it happens when the ISR interrupts execution somewhere inside my putchar()-like function that writes a single character to the UART data register. But I couldn't get a handle on what exactly was being affected or how.

Aside: I am finding it quite hard to read RISC-V assembly, as the ordering of operands is not consistent. Sometimes the destination is listed first (e.g. "lw a4,4(a5)"), sometimes last (e.g. "sw a4,4(a5)"). Very confusing. Why is that?
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #3 on: March 29, 2023, 07:00:02 am »
As I mentioned before, everything works just fine when compiled without -Os. So something about the optimisation is screwing things up, or perhaps revealing an unseen issue.

I think the problem is probably in the code that gets interrupted.

This code looks fine, other than not being as optimised as I'd like. For example, loading from 4(a5) in both the 2nd and 3rd instructions, which could result in different values, which could break your code.

I'd expect the optimiser to combine those loads. If it hasn't then I imagine it is because
Code: [Select]
sys_timestamp.ticks is declared as volatile in code we don't see. In which case it is your responsibility to make sure you load it only once.

In fact in general that you don't factor out things such as
Code: [Select]
interval_calls.calls[i] in your code scares me.


Quote
Aside: I am finding it quite hard to read RISC-V assembly, as the ordering of operands is not consistent. Sometimes the destination is listed first (e.g. "lw a4,4(a5)"), sometimes last (e.g. "sw a4,4(a5)"). Very confusing. Why is that?

"store" is the *only* exception, I imagine for mental compatibility with every other ISA that has instructions called "load" and "store", not "move" or both load and store being "ld". ARM for example. Or MIPS.
« Last Edit: March 29, 2023, 07:01:39 am by brucehoult »
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #4 on: March 29, 2023, 07:53:58 am »
I think the problem is probably in the code that gets interrupted.

That's a thought I had when I realised the corruption happens at the point the ISR has returned back into UART code. But the code in my uart_putchar() is really simple, such that I don't see how it could go wrong.

Code: [Select]
int uart_putchar(int c) {
if(!uart_opts.binary && c == '\n') uart_putchar('\r');
while(!(USART1->STATR & USART_STATR_TXE));
USART1->DATAR = c & USART_DATAR_DR;
return c;
}

Yes, I know, a recursive call. But it's inherently bounded, so sue me. ;D

Maybe I need to figure out how to get GDB to break on every write to USART1->DATAR, and see what value is being written to the register. Perhaps that will shed some light on whether the corruption comes from the character to be transmitted being corrupted before being written to USART1->DATAR.

This code looks fine, other than not being as optimised as I'd like. For example, loading from 4(a5) in both the 2nd and 3rd instructions, which could result in different values, which could break your code.

I'd expect the optimiser to combine those loads. If it hasn't then I imagine it is because
Code: [Select]
sys_timestamp.ticks is declared as volatile in code we don't see. In which case it is your responsibility to make sure you load it only once.

Indeed, sys_timestamp is volatile. Not sure what you mean by "your responsibility to make sure you load it only once". You mean in case it changes in-between accesses? That won't happen, because the SysTick_Handler ISR is the only thing that ever writes to it - after all, that's its job, and no-where else's.

In fact in general that you don't factor out things such as
Code: [Select]
interval_calls.calls[i] in your code scares me.

Again, not sure what you mean by this. You mean "factor out" as in make a pointer instead?

Code: [Select]
for(size_t i = 0; i < interval_calls.count; i++) {
interval_call_t *c = &interval_calls.calls[i];
if(--c->interval_remaining_ms == 0) {
c->callback();
c->interval_remaining_ms = c->interval_ms;
}
}

How would that have a meaningful change to things? It's not like i is going to change mid-loop... :-//

"store" is the *only* exception, I imagine for mental compatibility with every other ISA that has instructions called "load" and "store", not "move" or both load and store being "ld". ARM for example. Or MIPS.

It is? Ah, well, I'm not familiar with ARM, or MIPS, or x86, etc. Still, it's a bit taxing mentally when every third instruction in RISC-V code seems to be a load or store of some kind. I suppose I'll get used to it eventually. :)
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #5 on: March 29, 2023, 09:34:14 am »
This code looks fine, other than not being as optimised as I'd like. For example, loading from 4(a5) in both the 2nd and 3rd instructions, which could result in different values, which could break your code.

I'd expect the optimiser to combine those loads. If it hasn't then I imagine it is because
Code: [Select]
sys_timestamp.ticks is declared as volatile in code we don't see. In which case it is your responsibility to make sure you load it only once.

Indeed, sys_timestamp is volatile. Not sure what you mean by "your responsibility to make sure you load it only once". You mean in case it changes in-between accesses? That won't happen, because the SysTick_Handler ISR is the only thing that ever writes to it - after all, that's its job, and no-where else's.

Yes. Because it is declared volatile the compiler has to assume it could change between accesses, so can't combine them the way it normally would (and the way it does for interval_calls.calls[i]. Loading it twice is just a waste of time.

The code is also loading interval_calls.count every time around the loop, which is presumably also completely unnecessary. It's even having to recalculate the address of interval_calls each time around the loop, after the possible call to callback(), because there are only two S registers in RV32E and they are used up by i and the address of interval_calls.calls\[i\].

Quote
In fact in general that you don't factor out things such as interval_calls.calls[i] in your code scares me.

Again, not sure what you mean by this. You mean "factor out" as in make a pointer instead?

Code: [Select]
for(size_t i = 0; i < interval_calls.count; i++) {
interval_call_t *c = &interval_calls.calls[i];
if(--c->interval_remaining_ms == 0) {
c->callback();
c->interval_remaining_ms = c->interval_ms;
}
}

How would that have a meaningful change to things? It's not like i is going to change mid-loop... :-//

Well no it's not. And the compiler does essentially that rewrite for you anyway.  I just find it a lot easier to write, read, and think about this way. In fact I'd go further.

Code: [Select]
        interval_call_t *c = interval_calls.calls;
        size_t limit = interval_calls.count;
for(size_t i = 0; i < limit; i++, c++) {
if(--c->interval_remaining_ms == 0) {
c->callback();
c->interval_remaining_ms = c->interval_ms;
}
}

I appreciate this is just a style thing, not an issue of the code being correct or not. The compiler will work hard for you and figure out that you're using the same element of interval_calls.calls every time.

But ... the compiler has to actually do some work to analyse the code to see that. And so does the human reader.
« Last Edit: March 29, 2023, 12:07:16 pm by brucehoult »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #6 on: March 29, 2023, 09:36:26 am »
Ok, I give up. How are you supposed to escape an array access using the variable i, outside of an actual code block, without this BBS thinking you want to use italics?
 

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3256
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #7 on: March 29, 2023, 11:37:46 am »
Ok, I give up. How are you supposed to escape an array access using the variable i, outside of an actual code block, without this BBS thinking you want to use italics?

You enclose the part you don't want the buletin board codes to be active in nobbc tags  [nobbc] your stuff  [/nobbc]

e.g.

interval_calls.calls[i].interval_remaining_ms = interval_calls.calls[i].interval_ms;

IMO The dodgiest part of the systick code is that there is no check for a null pointer.


Quote
The corrupted character is random (i.e. not a bit-flip of the correct character) but always the same at each instance

Does this mean the corruption always result in the same (incorrect) character being printed, or something else?  I read this a few times and it didn't immediately make this clear.

Checking the assembly listing of the uart_putchar() function may be instructive.  Disabling interrupts around each operation in this function, one at a time, may help to narrow down the exact part of the code causing issues.
« Last Edit: March 29, 2023, 11:50:43 am by mikerj »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #8 on: March 29, 2023, 12:10:59 pm »
Checking the assembly listing of the uart_putchar() function may be instructive.

It is certainly possible that while(!(USART1->STATR & USART_STATR_TXE)); might cause problems with optimisation turned on, if USART1->STATR is not declared volatile.
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 825
  • Country: es
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #9 on: March 29, 2023, 03:49:16 pm »
Try removing the “naked” attribute (and probably, disabling HPE) just to be sure it works as expected and doesn’t clobber some reg at the “right” moment.
 

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3256
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #10 on: March 29, 2023, 04:09:22 pm »
Try removing the “naked” attribute (and probably, disabling HPE) just to be sure it works as expected and doesn’t clobber some reg at the “right” moment.

Well spotted, I missed the function attribute.  Shouldn't the "interrupt" attribute be used here rather than "naked"?

https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 5975
  • Country: es
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #11 on: March 29, 2023, 04:21:30 pm »
Just why the recursive thing?
As stated, try enforcing volatile access.
Code: [Select]
int uart_putchar_(int c) {
        while(!((volatile)USART1->STATR & USART_STATR_TXE));
USART1->DATAR = c & USART_DATAR_DR;
return c;
}
int uart_putchar(int c) {
if(!uart_opts.binary && c == '\n')
            uart_putchar_('\r');
        return uart_putchar_(c);
}
Hantek DSO2x1x            Drive        FAQ          DON'T BUY HANTEK! (Aka HALF-MADE)
Stm32 Soldering FW      Forum      Github      Donate
 

Offline jnk0le

  • Regular Contributor
  • *
  • Posts: 51
  • Country: pl
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #12 on: March 29, 2023, 06:30:59 pm »
Quote
This is my ISR's code:

Code: [Select]

extern void SysTick_Handler(void) __attribute__((naked));

void SysTick_Handler(void) {
   //...
   interrupt_return(); /* macro for "mret" inline assembly instruction */
}

According to the v003 core datasheet [1], par 3.4; it stacks only the caller saved registers.

And because the `naked` attribute removes prologue/epilogue entirely, you get corruption due to use of callee saved registers.


[1] http://www.wch-ic.com/downloads/QingKeV2_Processor_Manual_PDF.html
 

Offline AVI-crak

  • Regular Contributor
  • *
  • Posts: 125
  • Country: ru
    • Rtos
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #13 on: March 29, 2023, 06:59:26 pm »
C-style "callback" for interrupt is a potential bug hazard. In your case, CH32V003 does not explicitly save the program context before calling the break, while sharing static variables (via "callback") within functions can generate errors.
About UART.
The algorithm becomes extremely simple when sending the entire string (with a trailing zero), and much more complicated when sending one character at a time.
You might be interested https://github.com/AVI-crak/Rtos_cortex/blob/master/sPrint.h
One macro printo() combines many functions, if something is not used - it will be removed automatically.
printo() uses the GCC _Generic((X) macro, so all preparation for printing is done at the compilation stage. No additional effort is required, up to 9 variables or constants in the parameter of each macro.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #14 on: March 29, 2023, 07:32:37 pm »
Try removing the “naked” attribute (and probably, disabling HPE) just to be sure it works as expected and doesn’t clobber some reg at the “right” moment.

Well spotted, I missed the function attribute.  Shouldn't the "interrupt" attribute be used here rather than "naked"?

https://gcc.gnu.org/onlinedocs/gcc/RISC-V-Function-Attributes.html

The "interrupt" attribute is for standard RISC-V cores. It causes the compiler to save all the registers it uses, not only any S registers it uses.

But the OP is using the custom "fast interrupt" mode, which saves registers in hardware before calling the interrupt function. WCH's supplied compiler has a different __attribute __ ((interrupt("WCH-Interrupt-fast"))) for this. If you're not using their compiler then using a naked function with an mret forced at the end *should* be equivalent.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #15 on: March 29, 2023, 07:40:53 pm »
According to the v003 core datasheet [1], par 3.4; it stacks only the caller saved registers.

Ohhh! That is not generally known. Certainly wasn't to me!

Most people who are using the fast interrupt feature are also using their compiler, which presumably does the right thing.

So, to use a standard toolchain, what you need to do with any interrupt function complex enough to need to use the S registers (the most obvious being if you call some other function from it), the way to do it with a generic RISC-V toolchain is:

Code: [Select]
void SysTick_Handler_inner(void){
    ... do all the stuff here
}

__attribute__((naked))
void SysTick_Handler(void){
    SysTick_Handler_inner();
    asm volatile ("mret");
}

Edit after thinking:

No, that's stupid.  Simply use a normal function (as on ARMv7-M), but force mret instead of ret.

Code: [Select]
void SysTick_Handler(void){
    ... do all the stuff here
    asm volatile ("mret");
    __builtin_unreachable(); // suppress the usual ret
}

No attributes needed at all. Any S registers used will be saved and restored in software, the others won't be (and don't need to be).

But turning off "fast interrupts" and using the standard RISC-V __attribute__((interrupt)) is probably not slower, and will for sure use less stack (for this particular handler).
« Last Edit: March 29, 2023, 10:03:01 pm by brucehoult »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #16 on: March 29, 2023, 08:23:18 pm »
Ok, so back after reading the manual for this chip.

The __attribute __ ((interrupt("WCH-Interrupt-fast"))) saves 10 registers to the stack. The normal stack in RAM. I'd assuming when hearing about the feature and that it supports 2-deep interrupt nesting that they'd put a 2nd and 3rd register set in the CPU and it would just be BOOM single cycle or something and you're done.

So, actually, saving to the stack in RAM won't be all that much faster than doing the saving in software, and if you have a small interrupt handler that only needs a couple of registers then using the standard RISC-V __attribute__((intterrupt)) that saves only what it uses will probably be faster.

The 10 registers saved:

x1: ra (only actually useful if your interrupt routine calls other functions)

x5-x7: t0-t2

x10-x15: a0-a5

So, the OP's disassembled code changes: ra, a3, a4, a5, s0, s1

The hardware stacking is needlessly saving and restoring: t0, t1, t2, a0, a1, a2

The hardware stacking is NOT saving: s0, s1

The OP would be just as well off turning off the "fast interrupt" feature and using standard RISC-V __attribute__((interrupt)).

NB: I'm assuming here that the compiler is using the RV32E ABI that is simply RV32I truncated to 16 registers. I haven't checked the hex opcodes against the disassembled instructions to check.

There is a proposed EABI that should give considerably better code in many cases, mainly by increasing the S registers to 5 instead of 2 (this code, as compiled, really wants to have 3 S registers, and has to re-create the contents of a4 because it can't use an s2 instead) by cutting down to 2 T registers and 4 A registers (the same as 32 bit ARM, so should be enough for ported software).

Standard ABI vs EABI for x5-x15:

t0, t1, t2, s0, s1, a0, a1, a2, a3, a4, a5
t0, s3, s4, s0, s1, a0, a1, a2, a3, s2, t1

Hardware stacking for the EABI would only need to stack x5, x10-x13, x15 i.e. 6 registers instead of 10.  Which just goes to show WCH are not using it.

https://github.com/riscv-non-isa/riscv-eabi-spec/blob/master/EABI.adoc
 

Offline jnk0le

  • Regular Contributor
  • *
  • Posts: 51
  • Country: pl
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #17 on: March 29, 2023, 09:19:57 pm »
Quote
No, that's stupid.  Simply use a normal function (as on ARMv7-M), but force mret instead of ret.

Code: [Select]

void SysTick_Handler(void){
    ... do all the stuff here
    asm volatile ("mret");
}


No attributes needed at all. Any S registers used will be saved and restored in software, the others won't be (and don't need to be).

You will end up with an unreachable ret after the mret but it's only 2 bytes so NDB.

That will break when anything is allocated on stack.
To get rid of epilogue after inline asm, you can use `__builtin_unreachable();`

Quote
The __attribute __ ((interrupt("WCH-Interrupt-fast"))) saves 10 registers to the stack. The normal stack in RAM. I'd assuming when hearing about the feature and that it supports 2-deep interrupt nesting that they'd put a 2nd and 3rd register set in the CPU and it would just be BOOM single cycle or something and you're done.
Thats in V3 and V4 of their core version.


Quote
Hardware stacking for the EABI would only need to stack x5, x10-x13, x15 i.e. 6 registers instead of 10.  Which just goes to show WCH are not using it.
There is EABIEN bit in INTSYSCR.
Not present on V4 so that is most probably pushing eabi register set. (less cycles)
Of course its functionality is not documented at all.
« Last Edit: March 29, 2023, 09:28:44 pm by jnk0le »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #18 on: March 29, 2023, 10:28:00 pm »
That will break when anything is allocated on stack.

Ah yeah, crud.  Even the saving of s0 and s1 will do that.

And you can't even call another function with the main guts in from a standard function, because saving the return address will also do it.

Code: [Select]
void bar(int *i);

__attribute__((noinline))
void my_handler_inner(){
    int xxx[20];
    for (int i=0; i<10; ++i)
        bar(&xxx[i]);
}

__attribute__((naked))
void my_handler(){
    my_handler_inner();
    asm volatile ("mret");
    //__builtin_unreachable();
}

This works on gcc. Clang complains about the call: "error: non-ASM statement in naked function is not supported". Using `asm volatile ("call my_handler_inner")` makes both gcc and clang happy.

https://godbolt.org/z/Kv7dhr7G8

The caller MUST be naked, otherwise it will allocate a stack frame and save ra but never deallocate the stack space.

The called function must NOT be inlined, otherwise any stack it uses (e.g. to save s0 or s1 or to allocate an array) will also never be deallocated.


OR just turn fast interrupts off (don't turn it on) and use the standard __attribute__((interrupt))

OR use WCH's recommended IDE and compiler and use their __attribute __ ((interrupt("WCH-Interrupt-fast")))
« Last Edit: March 29, 2023, 10:48:34 pm by brucehoult »
 

Offline jnk0le

  • Regular Contributor
  • *
  • Posts: 51
  • Country: pl
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #19 on: March 29, 2023, 11:20:09 pm »
well, the issues that were raised in this topic are the ones I have already proposed solutions for in various places.
You can have a look at 1.4.1 of the "total-embedded/Xteic" spec that I'm doing right now. It should be much cleaner than current approach leading to a lot of outdated (of course) toolchain builds for every new interrupt controller.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4064
  • Country: nz
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #20 on: March 29, 2023, 11:29:38 pm »
well, the issues that were raised in this topic are the ones I have already proposed solutions for in various places.
You can have a look at 1.4.1 of the "total-embedded/Xteic" spec that I'm doing right now. It should be much cleaner than current approach leading to a lot of outdated (of course) toolchain builds for every new interrupt controller.

Interesting document. I haven't previously seen it (or heard of you).

I've only taken a very quick glance at it so far, but I'm amused you quote me in it :-)
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #21 on: March 30, 2023, 01:29:02 am »
Does this mean the corruption always result in the same (incorrect) character being printed, or something else?  I read this a few times and it didn't immediately make this clear.

Yes. I could probably have worded that better. For example, when the corrupted character occurs, each instance will always be a '<', but whether it is '<' or '|' or whatever is random. And the corrupted character bears no apparent relation to the correct character.

Checking the assembly listing of the uart_putchar() function may be instructive.

Here it is:

Code: [Select]
00001034 <uart_putchar>:
    1034: 86cff2ef          jal t0,a0 <__riscv_save_0>
    1038: 1ffff797          auipc a5,0x1ffff
    103c: fc97c783          lbu a5,-55(a5) # 20000001 <uart_opts+0x1>
    1040: 842a                mv s0,a0
    1042: e791                bnez a5,104e <uart_putchar+0x1a>
    1044: 47a9                li a5,10
    1046: 00f51463          bne a0,a5,104e <uart_putchar+0x1a>
    104a: 4535                li a0,13
    104c: 37e5                jal 1034 <uart_putchar>
    104e: 40014737          lui a4,0x40014
    1052: 80075783          lhu a5,-2048(a4) # 40013800 <__global_pointer$+0x20013000>
    1056: 0807f793          andi a5,a5,128
    105a: dfe5                beqz a5,1052 <uart_putchar+0x1e>
    105c: 1ff47793          andi a5,s0,511
    1060: 80f71223          sh a5,-2044(a4)
    1064: 8522                mv a0,s0
    1066: 844ff06f          j aa <__riscv_restore_0>
...

Not sure what the "..." at the end represents. Perhaps it means there's a gap to the next function, which is at 0x106c. For alignment?
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #22 on: March 30, 2023, 01:40:15 am »
It is certainly possible that while(!(USART1->STATR & USART_STATR_TXE)); might cause problems with optimisation turned on, if USART1->STATR is not declared volatile.

I checked, all the members of USART1 struct are volatile.

I am using the WCH-supplied headers, so one should well hope the peripheral registers are all marked volatile, otherwise they've got big problems with their own HAL/SPL! ;D Not that they are completely free of mistakes, though. I have had to make several corrections - only 4 EXTI were defined for AFIO_EXTICR; RCC_CFGR0 had incorrect definitions for non-existent APB1 & 2 prescalers, but not for extended ADC prescaler.
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #23 on: March 30, 2023, 02:20:14 am »
So, actually, saving to the stack in RAM won't be all that much faster than doing the saving in software

I can't remember which thread it was or who did it, but there was someone here on the EEVblog forum that did some benchmarking of regular vs. HPE vs. VTF interrupts and found that HPE did make a significant different in ISR latency, but VTF added basically nothing.

The hardware stacking is NOT saving: s0, s1

Ohhhhhhhhhh... Yes, I think you're right! |O

And it explains why commenting out the for-loop dealing with the callbacks fixes the problem - it removes a line of code that sets register s0 to the base memory address of the interval_calls struct:

Code: [Select]
addi s0,gp,-1988 # 2000003c <interval_calls>

So even though interval_calls.count equals zero, and the loop body does not execute, it's too late, s0 has already been clobbered. And when the ISR returns back into uart_putchar(), guess what code is used to write the character to be transmitted to the UART data register?

Code: [Select]
andi a5,s0,511
sh a5,-2044(a4)

Yup, it takes the character value from s0 (previously copied there from the function argument in a0), masks it, then writes to USART1->DATAR. (It also later copies s0 back to a0 for the function return value, but that didn't affect anything as I'm not using the return value.)

So, if I am understanding correctly, to solve my problem my options here are as follows:

- Switch to WCH's version of compiler (IIRC, GCC 8.x-based) and use __attribute__((interrupt("WCH-Interrupt-fast"))) on my ISRs.
- Turn off HPE and use regular __attribute__((interrupt)) on my ISRs.

In order to do the latter, I think I need to modify this part of the startup assembly code:

Code: [Select]
li t0, 0x3
csrw 0x804, t0

Changing the immediate value to 0x2 on the first instruction, so that it doesn't set the HWSTKEN bit 0, leaving HPE disabled.

Do I still need to use mret to return from the ISR with __attribute__((interrupt))? (And perhaps use __builtin_unreachable() to get rid of the redundant regular ret, as suggested.) I see I do not, __attribute__((interrupt)) generates mret anyway.

I could switch to the WCH compiler. In their 'community' distribution you can just extract the RISC-V GCC toolchain on its own from the installer archive. That's what I did to get their custom OpenOCD. Might be worth doing, as it would also mean I can take advantage of their proprietary 'XW' extensions to the compact instruction set.
« Last Edit: March 30, 2023, 02:33:19 am by HwAoRrDk »
 

Offline HwAoRrDkTopic starter

  • Super Contributor
  • ***
  • Posts: 1506
  • Country: gb
Re: Bizarre problem on CH32V003 with SysTick ISR corrupting UART TX data
« Reply #24 on: March 30, 2023, 06:30:48 am »
Yup, it takes the character value from s0 (previously copied there from the function argument in a0), masks it, then writes to USART1->DATAR.

Y'know, I've just noticed that there is evidence that this is almost certainly what was happening, because there is a correlation between the address that was being stored in s0 in SysTick_Handler(), and the corrupted character being transmitted on the UART in uart_putchar().

What value was being written to s0? 0x2000003c. What character was showing up? The '<' character, which has an ASCII value of 0x3c. And what is the LSB of s0? Same!

The fact I was seeing a different corrupt character from time to time must have been because I was re-compiling the code with changes and making the interval_calls variable reside at different memory addresses.

With regard to what the WCH compiler must be doing with its custom __attribute__((interrupt("WCH-Interrupt-fast"))), am I right in thinking that it is basically just changing the behaviour of the regular 'interrupt' attribute to save only s0 & s1 on the stack (when necessary), rather than additionally all the rest that HPE instead does automatically (x1, x5-7, x10-15)? I might try the WCH compiler and compare the ISR code that is generated.

Anyway, I modified the startup assembly code as I previously mentioned, to disable HPE, and changed my code to use __attribute__((interrupt)) on the ISRs, and now everything is back working again. :-+

Thank you everyone for your help. I probably wouldn't have got to the bottom of this without it.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf