Author Topic: megaAVR 0-Series ATMEGA4809 and ATMEGA4808 makes me feel like I am starting over  (Read 8860 times)


Online westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Where did you see 3.49?  I see about $2.30 for the DIP, and $1.60 for Qfn or lqfp.
 But yeah, they’re just barely competitive with the low end cm0 chips...


I wonder why the stm32f0xx chips haven’t shown up in any popular hobbyist form, yet...

 

Offline techman-001

  • Frequent Contributor
  • **
  • !
  • Posts: 748
  • Country: au
  • Electronics technician for the last 50 years
    • Mecrisp Stellaris Unofficial UserDoc
Where did you see 3.49?  I see about $2.30 for the DIP, and $1.60 for Qfn or lqfp.
 But yeah, they’re just barely competitive with the low end cm0 chips...

I wonder why the stm32f0xx chips haven’t shown up in any popular hobbyist form, yet...

It was a Google search for Mouser and ATmega4809 in 40 pin DIP from Australia, and the price was taken from the Mouser web site. I think they add extra for OZ. I should have stated $AUD, apologies.

Perhaps the stm32f0xx chips were never popular with hobbyists because Arduino is an Atmel shop?

Plus, the free ST software development collection was a nightmare of broken complexity. I remember trying to create a Blinky with it in 2014 and failing utterly. I didn't remember C being that hard and complex after a 10 year hiatus from embedded. It was enough to drive a person to Forth!

So STM32F0xx is hard and Arduino is easy; it's a no-brainer to see where hobbyists were attracted and why.

Of course that bloody $2 "Blue Pill" had to get Arduinoized (to a degree), so hobbyists went straight to the old STM32F103 Cortex-M3, bypassing the much newer Cortex-M0 completely.

Loosely related, but interesting:
Now we have a new player that's going to cause quite a stir in the hobby market, I think. It's a RISC-V core with STM32F103 peripherals. It's fast (108 MHz with zero Flash wait states!), cheap and fully open. It's the Chinese GD32VF103C. See this link for datasheets: http://dl.sipeed.com/LONGAN/Nano/DOC/

How long before Arduino caters for the GD32VF103C, do you think? Tomorrow, a week, two weeks?

The Mecrisp-Stellaris creator, Matthias Koch, already has separate releases for RISC-V and the STM32F103, so it won't take long for him to include the GD32VF103C in his supported hardware list. He has some of the LONGAN Nano units in the post to him already.





 

 

Offline rx8pilotTopic starter

  • Super Contributor
  • ***
  • Posts: 3634
  • Country: us
  • If you want more money, be more valuable.
The ATMEGA4809 [UQFN-48] is only $1.35 USD and the ATMEGA4808 [VQFN-32] is only $1.11 USD in qty:100.


That is pretty good. I have been using AVR out of habit because I already have so much code and understanding built up, so it is very easy for me to push out a low-volume product without much challenge. I am a bit scared of shopping for lower prices or slightly better specs at the expense of a learning curve and new development tools.

Almost everything in my work can easily be accomplished with just about any MCU out there. I am rarely, if ever, on the fringes of the available performance, so I just go with the easy and predictable option.
Factory400 - the worlds smallest factory. https://www.youtube.com/c/Factory400
 

Online magic

  • Super Contributor
  • ***
  • Posts: 6733
  • Country: pl
Sadly ARM is not without its warts. I had a look at some of the micros mentioned here, a number of things jumped out.

For starters, complexity. More registers to set up, thicker user manuals to comb through.
You technically get 4x wider data bus, but it's interleaved with instruction fetches so only 2x more bandwidth in practice. Why? :wtf:
The above also affects instruction latency in all kinds of weird ways. I guess people rarely write cycle-accurate assembly these days, but I did it once on AVR and it was a piece of cake.
15 cycles interrupt latency? Are you kidding?

Maybe the 16 bit PICs would be a viable compromise with more bits, more MHz but still a simple Harvard architecture. Too bad it's single-vendor and probably ultimately destined to extinction. Or is it doing well?
 

Offline techman-001

  • Frequent Contributor
  • **
  • !
  • Posts: 748
  • Country: au
  • Electronics technician for the last 50 years
    • Mecrisp Stellaris Unofficial UserDoc
Sadly ARM is not without its warts. I had a look at some of the micros mentioned here, a number of things jumped out.

For starters, complexity. More registers to set up, thicker user manuals to comb through.
You technically get 4x wider data bus, but it's interleaved with instruction fetches so only 2x more bandwidth in practice. Why? :wtf:
The above also affects instruction latency in all kinds of weird ways. I guess people rarely write cycle-accurate assembly these days, but I did it once on AVR and it was a piece of cake.
15 cycles interrupt latency? Are you kidding?

Maybe the 16 bit PICs would be a viable compromise with more bits, more MHz but still a simple Harvard architecture. Too bad it's single-vendor and probably ultimately destined to extinction. Or is it doing well?

I'm not sure more capability and the resultant increase in technical manual size is a reason for criticism.  If that were true, the Motorola MC14500B one bit CPU would probably still be the most popular device around.

Perhaps today's highly accurate cycle timing (and parallel processing) by FPGAs clocked at high speeds like 500 MHz has surpassed the need for cycle counting in MCUs?
Some ARM Cortex-M parts have a DWT (Data Watchpoint and Trace) unit which can count execution cycles. On the other hand, the Cortex-M instruction prefetch unit (PFU) may make cycle counting by hand difficult, depending on instruction flow.

A 15 cycle interrupt latency may be a result of having up to 93 on-board peripherals with many interrupt trigger options for each peripheral. Then again, when the clock is 200+ MHz, those 15 cycles may not be quite the delay you expected?
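
To put the "15 cycles at 200+ MHz" remark in numbers, here is a minimal sketch that converts a latency in cycles into nanoseconds. The function name and the 8-bit comparison point (10 cycles at 20 MHz) are illustrative assumptions, not datasheet figures.

```python
def latency_ns(cycles: int, clock_hz: float) -> float:
    """Convert a latency expressed in clock cycles into nanoseconds."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return cycles * 1e9 / clock_hz


if __name__ == "__main__":
    # 15 cycles at 200 MHz: the figures mentioned in the post above.
    print(f"15 cycles @ 200 MHz: {latency_ns(15, 200e6):.1f} ns")
    # Hypothetical 8-bit comparison point (assumed, not from a datasheet).
    print(f"10 cycles @ 20 MHz:  {latency_ns(10, 20e6):.1f} ns")
```

With these assumed numbers the 15-cycle latency works out to 75 ns, against 500 ns for the slower part, so the cycle count alone says little about the actual delay.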

I like 16 bit PICs and MSP430s myself. I have plenty of stock and Forths for them as well; I just find myself using Cortex-M0 all the time for some reason, which I suspect is the low cost, 32 bits, speed, large flash size (64KB) and the huge number of on-board peripherals. If I needed low power for a simple battery powered application, I'd probably use an MSP430.
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21606
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
15 cycles interrupt latency? Are you kidding?

Curiously, interrupt latency is almost a constant.  Even the mighty PC endures whole microseconds of context switching and cache flushing.  (Pretty sure I'm exaggerating, but, in the end, overall IO latency still hasn't changed much over the last 3 decades; except for specially optimized architectures like Xcore, but that's not in PCs.)


Quote
Maybe the 16 bit PICs would be a viable compromise with more bits, more MHz but still a simple Harvard architecture. Too bad it's single-vendor and probably ultimately destined to extinction. Or is it doing well?

Hm, Microchip isn't going anywhere, in the immediate future at least.

If you need higher performance, there's always DSPs -- multiple buses, not just a separate instruction path but source and destination as well.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Online westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Quote
You technically get 4x wider data bus, but it's interleaved with instruction fetches so only 2x more bandwidth in practice.
No, that's not quite accurate.  Some of the ARM architectures are "modified Harvard" machines, which means separate buses for data and instructions, even though they share an address space.  Flash is frequently front-ended by some sort of "accelerator" even on very low-end parts; the simplest being flash memory that is wider than 32bits.  (and remember that most CM0 instructions are only 16bits wide.)  And the buses tend to run faster than memory accesses, and RAM (data) tends to be pretty zippy - on the order of 32bits per 20ns rather than 8bits per 50ns.
I think.  It's pretty tough wading through separate documents on the CPU architecture vs the chip implementation.
But then IO tends to be on the other side of some additional bus controller, which introduces more delay and strangeness.
Unless it's a chip with the optional "tightly coupled memory" and/or "single cycle IOBUS" port features.
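
The RAM figures quoted above ("32bits per 20ns rather than 8bits per 50ns") can be turned into a rough peak-bandwidth comparison. This is only a back-of-the-envelope sketch using those two ballpark numbers; the function name is made up, and real parts will differ depending on bus arbitration and wait states.

```python
def bandwidth_mb_per_s(bits_per_access: int, access_time_ns: float) -> float:
    """Peak transfer rate in MB/s (10**6 bytes/s) for back-to-back accesses."""
    if access_time_ns <= 0:
        raise ValueError("access_time_ns must be positive")
    bytes_per_access = bits_per_access / 8
    # Bytes per nanosecond is the same as GB/s; multiply by 1000 for MB/s.
    return bytes_per_access * 1000 / access_time_ns


if __name__ == "__main__":
    wide = bandwidth_mb_per_s(32, 20)    # 32 bits every 20 ns
    narrow = bandwidth_mb_per_s(8, 50)   # 8 bits every 50 ns
    print(f"32-bit @ 20 ns: {wide:.0f} MB/s")
    print(f" 8-bit @ 50 ns: {narrow:.0f} MB/s")
    print(f"ratio: {wide / narrow:.0f}x")
```

On those ballpark figures the wider, faster bus comes out around ten times ahead, well beyond the 4x you would expect from bus width alone.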

And yeah, writing cycle-accurate, deterministic code is quite difficult compared to the 8bit chips :-(

Quote
Almost everything in my work can easily be accomplished with just about any MCU out there. I am rarely, if ever, on the fringes of the available performance
Amen!!!

 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14297
  • Country: fr
15 cycles interrupt latency? Are you kidding?

Curiously, interrupt latency is almost a constant.  Even the mighty PC endures whole microseconds of context switching and cache flushing.  (Pretty sure I'm exaggerating, but, in the end, overall IO latency still hasn't changed much over the last 3 decades; except for specially optimized architectures like Xcore, but that's not in PCs.)

This is not all that curious, as pipelines in modern CPUs have tended to grow longer and longer, which in turn usually makes interrupt latency longer.
What could be seen as curious would be why we didn't care to work on that specifically, but since clock frequencies have also dramatically increased, interrupt latencies in cycles haven't mattered much except, as you said, in very specific domains, for which there are other solutions than general-purpose CPUs.

Talking about that, I'm actually not sure how we could design a CPU with, say, a mere 1-cycle interrupt latency, if said CPU is pipelined. Unless maybe the interrupts were served on a separate, specialized core, which would make that not really an "interrupt" per se then.
« Last Edit: September 17, 2019, 01:49:00 pm by SiliconWizard »
 

Online Kleinstein

  • Super Contributor
  • ***
  • Posts: 14073
  • Country: de
Even with the AVRs interrupt latency is at least 6 cycles + the longest instruction (another 6 cycles for an RET AFAIK). In addition there is usually a jump from the table.
AFAIK the ARMs have automatic saving of a few registers with the interrupt. So there will likely be quite some cycles for the interrupt to start with. The wider RAM can really help storing the data to a stack.

The flash of the AVR is already 16 bits wide, and ARM flash often needs some wait states if a fast clock is used. This makes cycle-accurate timing from code run time rather difficult. However, for this purpose there are timers and the event handling system, so critical timing is usually done by the HW and not by code speed.
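
As a rough illustration of why a fast clock forces flash wait states, the sketch below computes how many extra cycles are needed for one flash access to fit. The 40 ns flash access time is an assumed value for illustration only; real devices list the required wait states per clock range in their datasheets.

```python
def flash_wait_states(clock_mhz: int, flash_access_ns: int) -> int:
    """Smallest wait-state count so that (1 + ws) clock periods cover one access."""
    if clock_mhz <= 0 or flash_access_ns <= 0:
        raise ValueError("clock and access time must be positive")
    # Cycles needed = ceil(access_time / clock_period)
    #               = ceil(clock_mhz * flash_access_ns / 1000), done in integers.
    cycles_needed = -(-(clock_mhz * flash_access_ns) // 1000)
    return max(0, cycles_needed - 1)


if __name__ == "__main__":
    ASSUMED_FLASH_ACCESS_NS = 40  # hypothetical figure, not from any datasheet
    for mhz in (16, 24, 48, 72):
        ws = flash_wait_states(mhz, ASSUMED_FLASH_ACCESS_NS)
        print(f"{mhz:3d} MHz -> {ws} wait state(s)")
```

Once wait states appear, the time per instruction depends on whether the fetch hits a prefetch buffer or goes to flash, which is what makes timing by counting instructions unreliable.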
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14297
  • Country: fr
Even with the AVRs interrupt latency is at least 6 cycles + the longest instruction (another 6 cycles for an RET AFAIK). In addition there is usually a jump from the table.
AFAIK the ARMs have automatic saving of a few registers with the interrupt. So there will likely be quite some cycles for the interrupt to start with. The wider RAM can really help storing the data to a stack.

Yep. In simpler MCUs, registers used in the ISR must be saved by hand (or automatically depending on your programming language and specific options/attributes), which can add significant overhead, so merely using a typical "interrupt latency" figure is kind of pointless, or at least can be misleading. That's always a best case.
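
The point about hand-saved registers can be made concrete with a simple cycle budget. This is a rough sketch, not a datasheet calculation: the function and parameter names are made up for illustration, and the default cycle counts are placeholders loosely based on figures quoted in this thread (about 6 cycles of response, a pending instruction of up to 4 cycles, a 3-cycle jump from the vector table) plus an assumed 2 cycles per register push.

```python
def isr_entry_cycles(
    response: int = 6,             # hardware response (figure quoted in the thread)
    pending_instruction: int = 4,  # worst-case instruction still executing
    vector_jump: int = 3,          # jump from the vector table to the handler
    saved_registers: int = 0,      # registers pushed by the ISR prologue
    push_cycles: int = 2,          # assumed cost of one push
) -> int:
    """Cycles from interrupt request to the first useful ISR instruction."""
    return (response + pending_instruction + vector_jump
            + saved_registers * push_cycles)


if __name__ == "__main__":
    print("no registers saved:  ", isr_entry_cycles())
    print("five registers saved:", isr_entry_cycles(saved_registers=5))
```

With these placeholder values, saving five registers takes the budget from 13 to 23 cycles, which is why a bare "interrupt latency" figure is only a best case.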

The flash of the AVR is already 16 bits wide, and ARM flash often needs some wait states if a fast clock is used. This makes cycle-accurate timing from code run time rather difficult. However, for this purpose there are timers and the event handling system, so critical timing is usually done by the HW and not by code speed.

Yes and yes. And sure, in some cases you'd need to run the ARM MCU at a higher clock frequency than the AVR. But does that matter in practice? If power draw is your concern, think again. Some ULP ARM-based MCUs these days draw less power @80MHz than a typical AVR @16MHz for instance... As to performance, they are just light years ahead.

Simple 8-bit MCUs have their uses of course, and I agree that simplicity is a nice feature in itself. But the points on interrupt latency and execution time predictability are often not a good enough reason (except for simplicity for the latter) to choose them IMO. Simplicity and cost can be. And of course, legacy. Many AVR users keep using AVR's, and are eyeing the new "AVR" entrants, mainly because they are very familiar with them, and thus there is no extra cost/time spent to switch to something else.

 

Offline rx8pilotTopic starter

  • Super Contributor
  • ***
  • Posts: 3634
  • Country: us
  • If you want more money, be more valuable.
Many AVR users keep using AVR's, and are eyeing the new "AVR" entrants, mainly because they are very familiar with them, and thus there is no extra cost/time spent to switch to something else.

Every time a new project comes through that needs an MCU but not spectacular compute power or latency, I look at using new devices. After a little searching, I end up with an AVR design based entirely on familiarity. My last project used four distributed AVRs instead of a more powerful 32-bit option. It was actually kinda nice having the firmware and physical layout distributed by task, and it also made latency and compute performance easy to manage. The downside is that I have to manage 4 programming efforts.

Perhaps I am too scared of change so I keep using the same family of MCU's. Learning curves are expensive.
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14297
  • Country: fr
Perhaps I am too scared of change so I keep using the same family of MCU's. Learning curves are expensive.

They sure are. And it adds uncertainty, something you may not be comfortable with: until you're proficient enough with a new architecture, you won't know for sure how to implement a given functionality, or if it's even possible, whereas you perfectly know how to with the architecture you're familiar with.

So that's quite understandable.
 

Offline iMo

  • Super Contributor
  • ***
  • Posts: 4675
  • Country: nr
  • It's important to try new things..
Perhaps I am too scared of change so I keep using the same family of MCU's. Learning curves are expensive.
They sure are. And it adds uncertainty, something you may not be comfortable with..
Therefore we still use  :-DD
 

Online magic

  • Super Contributor
  • ***
  • Posts: 6733
  • Country: pl
It all depends on application, I suppose. If you build a standalone product which needs some kind of "SoC" for data processing, an ARM may be a good choice. To me, "a microcontroller" is primarily about interface glue, perhaps a bit of bit-banging and shoving data in/out of a computer, which does any actual computations.

You got me with the cost of CALL/RETURN on AVR, I wasn't aware it's so bad. Though actually it's only 4 cycles (and 5 on >128kB parts). If you forego procedure calls, all else is ≤3 cycles. And the only 3 cycle ones are JMP (not needed with ≤8kB flash), LPM and a few "conditional skip next instruction". Life isn't bad.
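
As a small illustration of why cycle counting on AVR is manageable, consider the classic three-instruction delay loop `ldi r16, N` / `dec r16` / `brne loop`. The sketch below tallies its cycles, assuming the usual AVR timings of 1 cycle each for LDI and DEC, and 2 cycles for a taken BRNE versus 1 when it falls through. The function name and the register choice are only illustrative.

```python
def delay_loop_cycles(iterations: int) -> int:
    """Total cycles for:  ldi r16, N / loop: dec r16 / brne loop."""
    if not 1 <= iterations <= 255:
        raise ValueError("iterations must fit an 8-bit counter (1..255)")
    ldi = 1                                 # load the counter once
    looping = (iterations - 1) * (1 + 2)    # dec + taken brne
    final_pass = 1 + 1                      # dec + brne falling through
    return ldi + looping + final_pass


if __name__ == "__main__":
    cycles = delay_loop_cycles(100)
    clock_mhz = 16  # example clock, chosen only for illustration
    print(f"{cycles} cycles = {cycles / clock_mhz:.2f} us at {clock_mhz} MHz")
```

Under those timing assumptions the loop costs exactly three cycles per iteration, so a given delay can be dialled in by hand without any measurement.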

Ironically, the more advanced ARM cores with 3-6 stage pipelines have more reasonable 12 cycle latency than M0/M0+. It's surprising, given that ARM supposedly doesn't even finish the pending instruction like AVR. Until further explanation is provided, I call it "the joys of using a stripped-down application processor for MCU" :P
Even pushing a few registers on stack shouldn't take so long.

You certainly cannot have 1 cycle latency in a pipelined core, unless you predict that an IRQ is going to hit you and prefetch it into the pipe in advance :D
 

