My 48MHz comment "closer" was a bit tongue in cheek. That being said, I have always preferred Harvard architecture from a x MHz = x instructions perspective.
I can live with 12MHz, would prefer higher, I'm not sure if I can enable a clock output for a set # of pulses, using PWM mode, that might work?
Back to the original problem, surely a 48MHz controller can do something useful at more than 1/50 of its specified clock speed when running regular C.
I'm guessing that whatever configuration this is operating on at the moment is much slower than 48MHz, perhaps 8MHz Is
Can anyone comment on how to correct the clock configuration, when using the internal oscillator?
conf_board.h:
// clock resonators
#define BOARD_FREQ_SLCK_XTAL (32768U)
#define BOARD_FREQ_SLCK_BYPASS (32768U)
#define BOARD_FREQ_MAINCK_XTAL 0
#define BOARD_FREQ_MAINCK_BYPASS 0
#define BOARD_MCK CHIP_FREQ_CPU_MAX
#define BOARD_OSC_STARTUP_US 15625
> My 48MHz comment "closer" was a bit tongue in cheek. That being said, I have always preferred Harvard architecture from a x MHz = x instructions perspective.
This is mostly true for Cortex-M, but you need to keep memory limitations in mind too. At 48 MHz, you are running with 1 flash wait state. But since the CPU does 32-bit fetches, and most instructions are 16-bit, it all balances out. In some degenerate cases, the code may still be slower because of that. In that case, it is possible to place parts of the program into RAM.
> I can live with 12MHz, would prefer higher, I'm not sure if I can enable a clock output for a set # of pulses, using PWM mode, that might work?
If you use PWM, then you can normally output up to Fper/2. So if you use a regular TC clocked from 48 MHz, then you can get up to 24 MHz. But if you use a TCC clocked from 96 MHz, then you can get the full 48 MHz. The problem is controlling the exact number of cycles. It all depends on your actual goal.
> Back to the original problem, surely a 48MHz controller can do something useful at more than 1/50 of its specified clock speed when running regular C.
It does a lot with this code. It takes your input parameters and calculates which register it should access to toggle the required pins. And I can see it taking 50 instructions to do that.
If you want more optimal code, write more optimal code. You are using a framework, and it is a compromise between convenience, speed and code size.
> I'm guessing that whatever configuration this is operating on at the moment is much slower than 48MHz, perhaps 8MHz Is
Not really, it looks like it is running at 48 MHz.
> Can anyone comment on how to correct the clock configuration, when using the internal oscillator?
It is hard to tell what you have at the moment. If it is clean ASF code, then post your conf_clocks.h file. Otherwise, post your complete project.
Ok, I pasted your main() code into AS7 and looked at the compiler output.
It seems that the ioport_*() functions are actually pretty well optimized in the current-ish ASF (v3.40.0) for SAMD21 (inlined, in fact.)
The loop in main becomes:
6b2: 001a movs r2, r3
6b4: 000b movs r3, r1
6b6: 6153 str r3, [r2, #20]
arch_ioport_pin_to_base(pin)->OUTSET.reg = arch_ioport_pin_to_mask(pin);
6b8: 6193 str r3, [r2, #24]
6ba: e7fc b.n 6b6 <main+0x22>
Which is about as good as you can do. Given that, I think you are correct that it should be running faster than 500kHz, if your clock rate were actually 48MHz.
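As a rough sanity check on the numbers in this thread, the loop rate can be turned into cycles per iteration for each candidate clock. This is only a sketch in plain C that runs on a PC, not on the target. The function name is made up for illustration, and 500 kHz is just the approximate figure mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many CPU cycles one loop iteration takes,
 * given the CPU clock and the observed loop rate (rounded to nearest). */
uint32_t cycles_per_iteration(uint32_t cpu_hz, uint32_t observed_hz)
{
    assert(observed_hz > 0);
    return (cpu_hz + observed_hz / 2u) / observed_hz;
}
```

With a loop rate of about 500 kHz this works out to 96 cycles per iteration at 48 MHz, but only 16 at 8 MHz. For a loop of two stores and a branch, the 8 MHz figure is much easier to believe.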
I can't help much with a corrected clock setup. ASF seems to only make the SAMD's confusing clock system even more confusing, so I gave up and wrote bare metal code (also, I never used the internal clocks.) :-(
However, I do have my "bare metal" code for running at 48MHz based on an 8MHz external clock here:
https://github.com/WestfW/SAMD10-experiments/blob/master/D10-LED_TOGGLE0/src/UserSource/led_toggle.c#L16
It's for a SAMD10, which theoretically has a "similar" clock setup. But beware subtle differences between SAMD families (I don't know if there are any that come into play here, but there could be.)
- To get 48MHz from internal clocks, you'll need either the DFLL or DPLL
- Those both need an input frequency of around 32kHz.
- If you want to use the 8MHz internal clock as input, you need to divide it down to 32kHz first.
- (so you might as well use the internal 32kHz clock)
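The arithmetic behind those points can be sketched on a PC. This assumes an integer divider in front of the reference and an integer multiplier after it. The names are hypothetical, the 32000 Hz nominal reference in the usage note is my choice, and real parts limit the divider, multiplier and reference ranges, so check the datasheet before using any of these numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: one divider/multiplier combination. */
typedef struct {
    uint32_t divider;    /* integer divider applied to the source clock */
    uint32_t ref_hz;     /* resulting reference frequency (rounded down) */
    uint32_t multiplier; /* integer multiplier nearest to target / ref */
    uint32_t out_hz;     /* ref_hz * multiplier */
} clock_plan_t;

/* Divide source_hz down to roughly nominal_ref_hz, then pick the integer
 * multiplier that lands closest to target_hz. */
clock_plan_t plan_clock(uint32_t source_hz, uint32_t nominal_ref_hz,
                        uint32_t target_hz)
{
    clock_plan_t p;
    assert(nominal_ref_hz > 0 && source_hz >= nominal_ref_hz);
    p.divider = (source_hz + nominal_ref_hz / 2u) / nominal_ref_hz;
    p.ref_hz = source_hz / p.divider;
    p.multiplier = (target_hz + p.ref_hz / 2u) / p.ref_hz;
    p.out_hz = p.ref_hz * p.multiplier;
    return p;
}
```

For an 8 MHz source and a 32000 Hz reference this gives divide by 250, multiply by 1500, exactly 48 MHz. Starting from 32768 Hz directly, the nearest integer multiplier is 1465, which lands at 48005120 Hz.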
Your system does run at 8 MHz as configured by this line "# define CONF_CLOCK_GCLK_0_CLOCK_SOURCE SYSTEM_CLOCK_SOURCE_OSC8M"
Probably the minimal changes that have to be made are these:
# define CONF_CLOCK_FLASH_WAIT_STATES 1
# define CONF_CLOCK_DPLL_ENABLE true
# define CONF_CLOCK_GCLK_1_ENABLE true
# define CONF_CLOCK_GCLK_1_CLOCK_SOURCE SYSTEM_CLOCK_SOURCE_OSCULP32K
# define CONF_CLOCK_GCLK_0_CLOCK_SOURCE SYSTEM_CLOCK_SOURCE_DPLL
I'm not 100% sure of all the names and stuff, but the idea is there.
#define CONF_CLOCK_FLASH_WAIT_STATES 1
/* SYSTEM_CLOCK_SOURCE_OSC32K configuration - Internal 32KHz oscillator */
# define CONF_CLOCK_OSC32K_ENABLE true
# define CONF_CLOCK_OSC32K_STARTUP_TIME SYSTEM_OSC32K_STARTUP_130
# define CONF_CLOCK_OSC32K_ENABLE_1KHZ_OUTPUT true
# define CONF_CLOCK_OSC32K_ENABLE_32KHZ_OUTPUT true
# define CONF_CLOCK_OSC32K_ON_DEMAND true
# define CONF_CLOCK_OSC32K_RUN_IN_STANDBY false
# define CONF_CLOCK_DPLL_ENABLE true
# define CONF_CLOCK_DPLL_ON_DEMAND true
# define CONF_CLOCK_DPLL_RUN_IN_STANDBY false
# define CONF_CLOCK_DPLL_LOCK_BYPASS false
# define CONF_CLOCK_DPLL_WAKE_UP_FAST false
# define CONF_CLOCK_DPLL_LOW_POWER_ENABLE false
# define CONF_CLOCK_DPLL_LOCK_TIME SYSTEM_CLOCK_SOURCE_DPLL_LOCK_TIME_DEFAULT
# define CONF_CLOCK_DPLL_REFERENCE_CLOCK SYSTEM_CLOCK_SOURCE_DPLL_REFERENCE_CLOCK_GCLK
# define CONF_CLOCK_DPLL_FILTER SYSTEM_CLOCK_SOURCE_DPLL_FILTER_DEFAULT
# define CONF_CLOCK_DPLL_REFERENCE_FREQUENCY 32768
# define CONF_CLOCK_DPLL_REFERENCE_DIVIDER 1
# define CONF_CLOCK_DPLL_OUTPUT_FREQUENCY 48000000
/* DPLL GCLK reference configuration */
# define CONF_CLOCK_DPLL_REFERENCE_GCLK_GENERATOR GCLK_GENERATOR_1
/* DPLL GCLK lock timer configuration */
# define CONF_CLOCK_DPLL_LOCK_GCLK_GENERATOR GCLK_GENERATOR_1
/* Configure GCLK generator 0 (Main Clock) */
# define CONF_CLOCK_GCLK_0_ENABLE true
# define CONF_CLOCK_GCLK_0_RUN_IN_STANDBY false
# define CONF_CLOCK_GCLK_0_CLOCK_SOURCE SYSTEM_CLOCK_SOURCE_DPLL
# define CONF_CLOCK_GCLK_0_PRESCALER 1
# define CONF_CLOCK_GCLK_0_OUTPUT_ENABLE false
/* Configure GCLK generator 1 */
# define CONF_CLOCK_GCLK_1_ENABLE true
# define CONF_CLOCK_GCLK_1_RUN_IN_STANDBY false
# define CONF_CLOCK_GCLK_1_CLOCK_SOURCE SYSTEM_CLOCK_SOURCE_OSC32K
# define CONF_CLOCK_GCLK_1_PRESCALER 1
# define CONF_CLOCK_GCLK_1_OUTPUT_ENABLE false
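To see what a 32768 Hz reference and a 48 MHz target mean for the DPLL ratio, here is a host-side sketch. It assumes the relation f_out = f_ref * (LDR + 1 + LDRFRAC/16) as I read it in the SAMD21 datasheet. The type and function names are hypothetical, and this is not necessarily how ASF computes the values, so verify against the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for the DPLL ratio fields. */
typedef struct {
    uint32_t ldr;     /* integer part of the ratio, minus one */
    uint32_t ldrfrac; /* fractional part in sixteenths (0..15) */
    uint32_t out_hz;  /* frequency actually produced by these values */
} dpll_ratio_t;

/* Pick the ratio, in steps of 1/16, closest to target_hz / ref_hz.
 * Assumes f_out = f_ref * (LDR + 1 + LDRFRAC / 16). */
dpll_ratio_t dpll_ratio(uint32_t ref_hz, uint32_t target_hz)
{
    dpll_ratio_t r;
    uint64_t ratio16; /* the ratio expressed in sixteenths */
    assert(ref_hz > 0 && target_hz >= ref_hz);
    ratio16 = ((uint64_t)target_hz * 16u + ref_hz / 2u) / ref_hz;
    r.ldr = (uint32_t)(ratio16 / 16u) - 1u;
    r.ldrfrac = (uint32_t)(ratio16 % 16u);
    r.out_hz = (uint32_t)(((uint64_t)ref_hz * ratio16) / 16u);
    return r;
}
```

For the values in this config the sketch gives LDR = 1463 and LDRFRAC = 14, for an output of 48001024 Hz, about 1 kHz above the nominal 48 MHz.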
You'll notice that I used the DPLL rather than the DFLL. That's because I decided I don't understand the DFLL.
It looks like the DFLL always has an output of 48MHz, and you have the OPTION of syncing it to a higher accuracy low-frequency source (the input clock frequency can be 8MHz, but the reference frequency is limited to 32kHz?) if configured properly. That's pretty boring. (I've asked in a couple of places why one would want to use the DFLL instead of the DPLL, given that the DPLL is so much more flexible. I don't think I ever got an answer.)
Increasing the number of flash wait states doesn't seem to change the speed of the loop. I guess such a short loop stays in the cache/flash-accelerator, or whatever...
I'm not a coder so I'm not well versed on searching GitHub etc. for bare metal code that could be the starting basis for my project.
One thing I dislike about these sam d's (I only have the D10), is the clock system seems needlessly complicated.
Clock controls everywhere, the need to sync (or not), and the terminology gets a little fuzzy when reading about it all.
The other thing I dislike, is with the 32bit address space available, it still is a big mix of 8/16/32bit registers. I guess copy/paste is used in chip design similar to documentation (I know nothing of chip design).
This leaves DPLL to do more interesting things
If you simply ignore syncs, then you get the same exact behavior [of blocking the bus]
the flexibility [of the SAMD clock system] is amazing.
There is no real caching in those parts.
- The NVM Controller cache reduces the device power consumption and improves system performance when wait states are required. Only the NVM main array address space is cached. It is a direct-mapped cache that implements 8 lines of 64 bits (i.e., 64 Bytes). NVM Controller cache can be enabled by writing a '0' to the Cache Disable bit in the Control B register (CTRLB.CACHEDIS).
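To illustrate why a three-instruction loop could be unaffected by wait states with a cache like this, here is a toy host-side model of a direct-mapped cache with 8 lines of 8 bytes. It is a simplification with hypothetical names: it looks up one address per instruction, whereas the real CPU does 32-bit fetches, and it says nothing about the actual NVMCTRL implementation. The addresses are the loop addresses from the disassembly earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINES 8u /* 8 lines ... */
#define LINE_BYTES  8u /* ... of 64 bits each */

/* Toy model only: tags and hit/miss counters, no data. */
typedef struct {
    bool     valid[CACHE_LINES];
    uint32_t tag[CACHE_LINES];
    uint32_t hits;
    uint32_t misses;
} cache_model_t;

void cache_init(cache_model_t *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Look up one address; returns true on a hit, fills the line on a miss. */
bool cache_access(cache_model_t *c, uint32_t addr)
{
    uint32_t line  = addr / LINE_BYTES;
    uint32_t index = line % CACHE_LINES; /* direct-mapped: one slot per line */
    uint32_t tag   = line / CACHE_LINES;

    if (c->valid[index] && c->tag[index] == tag) {
        c->hits++;
        return true;
    }
    c->valid[index] = true;
    c->tag[index] = tag;
    c->misses++;
    return false;
}

/* Replay the loop body (str at 0x6b6, str at 0x6b8, b.n at 0x6ba). */
void run_loop(cache_model_t *c, unsigned iterations)
{
    static const uint32_t loop_addrs[] = { 0x6b6u, 0x6b8u, 0x6bau };
    unsigned i, j;

    for (i = 0; i < iterations; i++)
        for (j = 0; j < 3u; j++)
            (void)cache_access(c, loop_addrs[j]);
}
```

In this model the loop touches only two cache lines, so after the first pass every fetch hits. That would be consistent with the wait-state setting making no visible difference for such a short loop.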
You may consider SYNCBUSY bit to be a flag that next write will block the bus. And in that case you may still go ahead and do the write. Or wait for the bit to clear, it is up to you.
I'm insisting as much as I can on turning everything into 32-bit registers.
Assuming that that's true, there could be very good reasons for looping and checking for sync manually when accessing slow peripherals (like the RTC).
What's the largest number of GCLKs you've ever used in a project? :-) I can't imagine ever needing 8, especially with most of the peripherals having their own prescaler.
I didn't think so either (instead I was expecting some vaguely described "accelerator", but I guess that's STM?).
But the datasheet says there is a 64-byte direct-mapped cache (enabled by default, but it can be disabled or configured in "deterministic" mode.)
I had thought that was the case, but in my (simple) rtc code if I didn't use the syncbusy in my rtc reset code (clear enable, set swrst) there didn't seem to be any blocking going on with the result being the rtc did not start. I'm sure there is a simple explanation, though.
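/* Variant 1: block until any pending synchronization completes, then clear. */
void wdt_reset(void)
{
    while (WDT->STATUS.bit.SYNCBUSY)
        ;
    WDT->CLEAR.reg = WDT_CLEAR_CLEAR_KEY;
}
/* Variant 2: never block; skip the clear if a synchronization is still pending. */
void wdt_reset(void)
{
    if (0 == WDT->STATUS.bit.SYNCBUSY)
        WDT->CLEAR.reg = WDT_CLEAR_CLEAR_KEY;
}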
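A possible middle ground between the two versions above is to poll the busy flag a bounded number of times and let the caller decide what to do if it never clears. The helper below is hypothetical and written so it can be tried on a PC. It polls a plain byte, whereas on the real part you would test WDT->STATUS.bit.SYNCBUSY directly, since you cannot take the address of a bit-field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: poll a busy flag at most max_polls times.
 * Returns true if the flag was seen clear, false if we gave up. */
bool wait_not_busy(const volatile uint8_t *busy_flag, uint32_t max_polls)
{
    assert(busy_flag != NULL);
    while (max_polls-- > 0u) {
        if (*busy_flag == 0u)
            return true;
    }
    return false;
}
```

A caller would then write the register only when the helper returns true, and could count or report the cases where it returns false instead of hanging or silently skipping the write.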
This is interesting. I thought the cache was much more recent addition, but I guess not. However it is implemented, it is transparent enough to not cause any problems, since I never had any