Topic: Looking for Cortex-M3 or M4 with DMA to GPIO capability

Offline poorchava (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1672
  • Country: pl
  • Troll Cave Electronics!
Looking for Cortex-M3 or M4 with DMA to GPIO capability
« on: April 24, 2013, 08:13:46 am »
Hello,

I have a project which requires an MCU to generate data and send it at high speed to an FPGA. An ARM Cortex core is a requirement.

The way this is supposed to work is:
- the processor generates data and places it into an array
- a timer interrupt initiates a transfer from the data array to GPIO
- the DMA, once set up, is expected to transfer one chunk of data of fixed size (16 bits) to GPIO and advance the DMA address register to the next cell in the array
- the DMA stops at the end of the array, and this has to be able to generate an interrupt so the processor knows that it has to repopulate the array

The problem is that GPIO<->DMA transfers seem to be largely omitted by chip manufacturers. I have found a mention that some LPC families allow such a transfer as a memory-to-memory transfer into the address space occupied by GPIO registers, but I couldn't find any code example or application note that would confirm that. On top of that, I'm kind of biased towards the STM32 family.

So the question is: do you know of any widely available Cortex-M3 or M4 MCU, friendly to open-source/free toolchains, that you know for sure can do this kind of thing? I would prefer something from the bigger players like ST, NXP, TI, Atmel.
I love the smell of FR4 in the morning!
 

Offline BravoV

  • Super Contributor
  • ***
  • Posts: 7547
  • Country: 00
  • +++ ATH1
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #1 on: April 24, 2013, 09:22:35 am »
I'm not claiming to be an expert; in fact the new TI ARM-M4F is my first ARM and I'm in the middle of drowning in their tons of documents  :-\ Coincidentally, I just got past the DMA section. These are quotes from the technical datasheet and the DMA API summary; hope this helps.

DMA specification from datasheet :
Quote
The <TI new Cortex™-M4F> microcontroller includes a Direct Memory Access (DMA) controller, known as micro-DMA (uDMA). The uDMA controller provides a way to offload data transfer tasks from the Cortex™-M4F processor, allowing for more efficient use of the processor and the available bus bandwidth. The uDMA controller can perform transfers between memory and peripherals. It has dedicated channels for each supported on-chip module and can be programmed to automatically perform transfers between peripherals and memory as the peripheral is ready to transfer more data.

The uDMA controller provides the following features:

* ARM® PrimeCell® 32-channel configurable µDMA controller
* Support for memory-to-memory, memory-to-peripheral, and peripheral-to-memory in multiple transfer modes
 – Basic for simple transfer scenarios
 – Ping-pong for continuous data flow
 – Scatter-gather for a programmable list of up to 256 arbitrary transfers initiated from a single request
* Highly flexible and configurable channel operation
 – Independently configured and operated channels
 – Dedicated channels for supported on-chip modules
 – Flexible channel assignments
 – One channel each for receive and transmit path for bidirectional modules
 – Dedicated channel for software-initiated transfers
 – Per-channel configurable priority scheme
 – Optional software-initiated requests for any channel
* Two levels of priority
* Design optimizations for improved bus access performance between µDMA controller and the processor core
 – µDMA controller access is subordinate to core access
 – RAM striping
 – Peripheral bus segmentation
* Data sizes of 8, 16, and 32 bits
* Transfer size is programmable in binary steps from 1 to 1024
* Source and destination address increment size of byte, half-word, word, or no increment


From the DMA API documentation :
Quote
The microDMA (uDMA) API provides functions to configure the Stellaris uDMA (Direct Memory Access) controller. The uDMA controller is designed to work with the ARM Cortex-M3 processor and provides an efficient and low-overhead means of transferring blocks of data in the system.

The uDMA controller has the following features:

* dedicated channels for supported peripherals
* one channel each for receive and transmit for devices with receive and transmit paths
* dedicated channel for software initiated data transfers
* channels can be independently configured and operated
* an arbitration scheme that is configurable per channel
* two levels of priority
* subordinate to Cortex-M3 processor bus usage
* data sizes of 8, 16, or 32 bits
* address increment of byte, half-word, word, or none
* maskable device requests
* optional software initiated transfers on any channel
* interrupt on transfer completion

The uDMA controller supports several different transfer modes, allowing for complex transfer schemes. The following transfer modes are provided:

* Basic mode performs a simple transfer when request is asserted by a device. This is appropriate to use with peripherals where the peripheral asserts the request line whenever data should be transferred. The transfer will stop if request is de-asserted, even if the transfer is not complete.

* Auto-request mode performs a simple transfer that is started by a request, but will always complete the entire transfer, even if request is de-asserted. This is appropriate to use with software initiated transfers.

* Ping-Pong mode is used to transfer data to or from two buffers, switching from one buffer to the other as each buffer fills. This mode is appropriate to use with peripherals as a way to ensure a continuous flow of data to or from the peripheral. However, it is more complex to set up and requires code to manage the ping-pong buffers in the interrupt handler.

* Memory scatter/gather mode is a complex mode that provides a way to set up a list of transfer “tasks” for the uDMA controller. Blocks of data can be transferred to and from arbitrary locations in memory.

* Peripheral scatter/gather mode is similar to memory scatter/gather mode except that it is controlled by a peripheral request.
« Last Edit: April 24, 2013, 09:28:50 am by BravoV »
 

Offline Memphis

  • Contributor
  • Posts: 27
  • Country: cz
  • In quantum theory, we are lost in space and time.
    • My personal YT channel
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #2 on: April 24, 2013, 09:51:59 am »
Why not use the FSMC in STM32 (F2, F3, F4)?  ??? It can be driven via DMA. Data can be output on 16 parallel pins plus some control signals such as address and control lines, which of course can be used for anything you want.
...sorry for my english :palm:
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #3 on: April 24, 2013, 07:43:07 pm »
I'd use the ethernet MAC. You simply dump data into it without any protocol.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline andyturk

  • Frequent Contributor
  • **
  • Posts: 895
  • Country: us
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #4 on: April 24, 2013, 10:24:18 pm »
Quote
Now the problem is, that GPIO<->DMA transfers seem to be largely omitted by chip manufacturers. [...] On top of that I'm kind of biased towards STM32 family.
I'm pretty sure the STM32's DMA will do exactly that. Just make sure to use a DMA channel that isn't being used by another peripheral. You should be able to configure it to do 16-bit writes and increment the source by 2 bytes each transfer. The destination address won't change (i.e., it's always the same GPIOx_ODR register).
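
The configuration described above (16-bit writes, source advancing by 2 bytes per transfer, destination fixed) can be modelled on a host PC to check the bookkeeping before touching real hardware. This is only a sketch: `dma_model_t`, `dma_model_start` and `dma_model_on_trigger` are hypothetical names invented for illustration, and the struct is not the STM32 register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of one DMA channel (NOT real STM32 registers). */
typedef struct {
    const uint16_t *src;      /* next half-word to send; advances 2 bytes per transfer */
    volatile uint16_t *dst;   /* fixed destination, standing in for GPIOx_ODR */
    size_t remaining;         /* transfers left in the current block */
    int done_flag;            /* set when the block completes (models the end-of-block interrupt) */
} dma_model_t;

/* Arm the channel for `count` half-word transfers from `buf` to `port`. */
void dma_model_start(dma_model_t *ch, const uint16_t *buf, size_t count,
                     volatile uint16_t *port)
{
    assert(ch != NULL && buf != NULL && port != NULL && count > 0);
    ch->src = buf;
    ch->dst = port;
    ch->remaining = count;
    ch->done_flag = 0;
}

/* One timer-triggered request. Returns 1 if a half-word was moved, 0 if idle. */
int dma_model_on_trigger(dma_model_t *ch)
{
    assert(ch != NULL);
    if (ch->remaining == 0)
        return 0;              /* block finished: nothing to do until re-armed */
    *ch->dst = *ch->src;       /* 16-bit write to the fixed destination */
    ch->src++;                 /* source pointer moves on by one half-word */
    ch->remaining--;
    if (ch->remaining == 0)
        ch->done_flag = 1;     /* the CPU would now refill the array */
    return 1;
}
```

In the model, the CPU would call `dma_model_start` again after refilling the array once `done_flag` is set. On a real part the same roles are played by the channel's memory address, peripheral address, transfer count and transfer-complete interrupt; the exact register names have to come from the reference manual.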
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 13745
  • Country: gb
    • Mike's Electric Stuff
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #5 on: April 24, 2013, 10:36:45 pm »
NXP LPC1800 will do DMA to peripherals - not sure offhand which peripherals but SDHC is one of them, which should give a fairly decent data rate & be easy to interface to an FPGA.

Another option may be to use an ARM that supports external memory, and have the FPGA pretend to be external RAM

Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 

Offline jmole

  • Regular Contributor
  • *
  • Posts: 211
  • Country: us
    • My Portfolio
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #6 on: April 24, 2013, 11:08:14 pm »
I'll toot my own horn here and recommend the Cypress PSoC 5 LP.

Rather than DMA, per se, you can write a simple Verilog component with a CPU-accessible data register, and then latch the register value to GPIO on any digital signal, internal or external.

The toolchain is proprietary, but completely free (with full functionality). It actually uses GCC underneath everything else for the C code compilation.

I make a dev kit for it called freeSoC, and Cypress recently announced a low-cost PSoC 4 board (same digital flexibility, but a slower proc) called PSoC Pioneer.
« Last Edit: April 24, 2013, 11:09:47 pm by jmole »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #7 on: April 25, 2013, 02:14:27 am »
Quote
I have found a mention, that some LPC families allow such transfer as memory-to-memory transfer into the address space occupied by GPIO registers, but I couldn't find any particular code example or application note that would confirm that.
Wouldn't that be the default assumption, for an architecture with a single address space?
 

Offline poorchava (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1672
  • Country: pl
  • Troll Cave Electronics!
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #8 on: April 25, 2013, 06:22:36 am »
I meant that when you set up the DMA, a transfer for e.g. SPI or UART is configured as peripheral-to-memory, whereas a transfer to GPIO is in most cases referred to as memory-to-memory.

As for TI Stellaris/TivaC, I think they have a very rich feature set, but they are a very young product (did they hit series production yet?).

Quote
Why do you not using FSMC in STM32(F2,F3,F4)?  ??? It can be driven via DMA. Data can be outputted with parallel 16bits pins + some control signals such as addressing and control lines, which of course can be used for anything you want.
That sounds like a viable idea. I will need to read a bit more on how FSMC works. I want to keep the protocol between the devices very simple, because I don't have the funds to buy an IP core for some fancy protocol (I'm not an expert on FPGA/PLD design either). The device used will most likely be a Lattice MachXO2-1200HC-6TG100I. The solution can be crude but has to be fast :)
 

Offline Memphis

  • Contributor
  • Posts: 27
  • Country: cz
  • In quantum theory, we are lost in space and time.
    • My personal YT channel
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #9 on: April 25, 2013, 07:38:53 am »
Quote
That sounds like a viable idea. I will need to read a bit more on how FSMC works. [...] The solution can be crude but has to be fast :)

The best thing about FSMC is that it has a linear address space, so outputting data is trivial and fast, and it can also be synchronized with a clock.
For example, I connected an SSD1963 (a TFT controller) through FSMC for speed reasons. The result is that I can play uncompressed images at 25-30 fps just by using DMA to copy data from the SD card to the FSMC, with some internal buffering.  8)
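
To illustrate what "linear address space" buys you: once an external device is mapped into the memory map, writing to it is an ordinary store through a pointer. The sketch below is host-runnable and uses a plain array in place of the external window; `window_write` is a hypothetical helper, and on a real STM32 the `window` pointer would be the base of the external bank (commonly given as 0x60000000 for the first NOR/SRAM bank, which is an assumption here and should be checked against the reference manual).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` 16-bit words into a memory-mapped window with plain stores.
 * `volatile` keeps the compiler from merging or dropping the writes, which
 * matters when the "memory" is really a device on an external bus. */
void window_write(volatile uint16_t *window, const uint16_t *data, size_t count)
{
    assert(window != NULL && data != NULL);
    for (size_t i = 0; i < count; i++)
        window[i] = data[i];   /* each store becomes one bus write cycle on target */
}
```

A DMA channel doing the same job would simply be given the window base as its destination instead of the CPU running this loop.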

 

Offline knik

  • Newbie
  • Posts: 6
  • Country: pl
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #10 on: April 27, 2013, 07:51:53 am »
I'm not sure about GPIO output, but I tested input GPIO->DMA (no FSMC) on an STM32F103 and it seems to work well.
With the DMA peripheral address set to GPIO->IDR it works; max speed is limited to about 6 Mtransfers/sec.
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #11 on: April 27, 2013, 12:01:00 pm »
Quote
I'm not sure about GPIO output but I tested input GPIO->DMA (no FSMC) on STM32f103 and it seems to work well.
DMA peripheral addr set to GPIO->IDR and it works, max speed is limited to about 6 Mtransfers/sec.
Does the speed change if you change the IO pin max output speed? I.e. does the peripheral limit the transfer speed to what the IO pins are actually capable of?

One advantage of using an external bus interface is that you get the strobe signals for free. The DMA peripheral may not be able to guarantee a consistent data rate (e.g. does the transfer stall every time the core reads or writes to memory or another peripheral?), so the receiving side should be prepared to handle that. If not all IO port bits are needed for data, one could be used as a clock signal.

Offline poorchava (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1672
  • Country: pl
  • Troll Cave Electronics!
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #12 on: April 27, 2013, 02:28:04 pm »
I've done some reading about FSMC and asynchronous SRAM and I think it should do the job. Since I'm communicating with an FPGA (or maybe a CPLD) I can easily implement an SRAM interface. That way I should be able (on top of high-speed transfers using DMA) to write to the FPGA as I would to a normal variable (FSMC maps the external SRAM into the normal data memory address space).

The first goal is to turn an FPGA into a variable-bitrate serializer that can achieve at least 40 Mbps. So with a 16-bit bus width, theoretically 2.5 Mtransfers/s should be enough.
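
The arithmetic behind that figure, as a small sketch (`transfers_per_second` is a hypothetical helper name; it rounds up so a partial word still counts as a transfer):

```c
#include <assert.h>
#include <stdint.h>

/* Bus transfers per second needed to sustain `bit_rate_bps` when each
 * transfer carries `bus_width_bits` bits. Rounded up. */
uint32_t transfers_per_second(uint32_t bit_rate_bps, uint32_t bus_width_bits)
{
    assert(bus_width_bits > 0);
    return (bit_rate_bps + bus_width_bits - 1u) / bus_width_bits;
}
```

With 40,000,000 bit/s and a 16-bit bus this gives 2,500,000 transfers/s. For comparison, the roughly 6 Mtransfers/s reported earlier in the thread was measured for GPIO input via DMA, so it would only imply a margin of about 2.4x if output through FSMC behaves similarly, which nothing here confirms.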

I think I will settle with STM32F103 with 1Mbit flash in 144pin QFP.
 

Offline free_electron

  • Super Contributor
  • ***
  • Posts: 8517
  • Country: us
    • SiliconValleyGarage
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #13 on: April 27, 2013, 03:23:35 pm »
Quote
NXP LPC1800 will do DMA to peripherals - not sure offhand which peripherals but SDHC is one of them, which should give a fairly decent data rate & be easy to interface to an FPGA.

Another option may be to use an ARM that supports external memory, and have the FPGA pretend to be external RAM

That's the way I do it. Get a CPU with the EMIF (external memory interface) exposed (typically those are the very high pin-count beasts).
Professional Electron Wrangler.
Any comments, or points of view expressed, are my own and not endorsed , induced or compensated by my employer(s).
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #14 on: April 27, 2013, 05:51:04 pm »
IMHO the Ethernet MAC is the best way to go. In NXP's ARM devices the Ethernet MAC has an SRAM buffer connected to a different memory bus, so DMA transfers do not collide with data and instruction fetches for executing code. Another advantage is that the MAC is clocked by the external device, so you can use the FPGA's internal clock (which doesn't need to be 50 MHz; I have even implemented bit-banged Ethernet PHYs). The MAC does all the clock domain crossing work for you.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #15 on: April 27, 2013, 10:58:57 pm »
Quote
the ethernet MAC is the best way to go.
Isn't the interface to the PHY usually only 4 bits (or less) wide?  I guess GMII has 8 bits for Gigabit Ethernet, but I can't say I've ever seen that on a microcontroller-like part.

Using the MAC as a general-purpose DMA'ed interface is an interesting idea, though.  Has anyone done video that way?
 

Offline knik

  • Newbie
  • Posts: 6
  • Country: pl
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #16 on: April 28, 2013, 02:55:32 pm »
Quote
Does the speed change if you change the IO pin max output speed? [...] The DMA peripheral may not be able to guarantee a consistent data rate [...]

To me it looks like a DMA speed limit: it seems to need a bit more than 10 cycles per transfer, and it just misses some transfers when the timer trigger is too fast.
I haven't tested different pin speeds, but I don't think it matters when you use the port as input.

If FSMC is used with DMA I wouldn't be surprised if it turned out as fast as DMA+GPIO.
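
The "bit more than 10 cycles per transfer" estimate can be checked with a one-line calculation. The core clock is not stated in the post; 72 MHz (the maximum for the STM32F103) is assumed below, and `cycles_per_transfer` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Whole core clock cycles available per transfer at a given transfer rate. */
uint32_t cycles_per_transfer(uint32_t core_clock_hz, uint32_t transfers_per_s)
{
    assert(transfers_per_s > 0);
    return core_clock_hz / transfers_per_s;   /* integer division rounds down */
}
```

Under that assumption, 72,000,000 / 6,000,000 gives 12 cycles per transfer, consistent with the estimate. At the 2.5 Mtransfers/s target mentioned earlier in the thread, the same clock leaves 28 whole cycles per transfer.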
 

Offline saigai

  • Contributor
  • Posts: 15
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #17 on: April 28, 2013, 08:49:51 pm »
Quote
Now the problem is, that GPIO<->DMA transfers seem to be largely omitted by chip manufacturers. I have found a mention, that some LPC families allow such transfer as memory-to-memory transfer into the address space occupied by GPIO registers, but I couldn't find any particular code example or application note that would confirm that. On top of that I'm kind of biased towards STM32 family.

I did GPIO<->DMA like this on an LPC1768, but the opposite way. Maybe the code would be helpful to you, though:

https://github.com/desaster/grabor/blob/master/src/dma.c

(currently this code is actually unused in the project)
 


Online 0xdeadbeef

  • Super Contributor
  • ***
  • Posts: 1576
  • Country: de
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #19 on: February 03, 2015, 10:45:14 pm »
I second the LPC1768. It has a pretty capable DMA.
Did some VGA demo stuff some years ago where more or less the complete VGA signal creation was DMA based.
Also my little engine signal generator makes heavy use of DMA.
Trying is the first step towards failure - Homer J. Simpson
 

Offline tymm

  • Contributor
  • Posts: 17
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #20 on: February 04, 2015, 04:16:42 am »
STM32 (and even down to ST's Cortex-M0 parts) will allow you to DMA to GPIO if you're willing to coax them a little.  I do plenty with GCC on the M0's & M3's.

Unfortunately their documentation on this (and in general how the DMA controllers work) is terrible and this hides lots of useful possibilities.

IIRC, yes, you just set it up as a memory-to-memory transfer, point the destination register at the GPIO's output data register (ODR) and disable auto-increment on the destination address (it's been a little while since I poked at this specific problem, but I'm almost positive there's not much more to it than that).  It's simplest if you're DMAing to a whole GPIO port at once; if you're picking and choosing pins it can get more complicated, though the "bit banding" on many M3's is one way to simplify that.
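
For reference, the Cortex-M3 bit-band mapping for the peripheral region is a fixed formula: each bit in 0x40000000-0x400FFFFF gets its own 32-bit word in the alias region starting at 0x42000000. A sketch of the address calculation follows (`bitband_alias` is a hypothetical helper name). Note that bit-banding is an optional core feature, and on many parts the remapping sits inside the core, so whether a DMA master can use the alias addresses needs to be checked per chip.

```c
#include <assert.h>
#include <stdint.h>

#define PERIPH_BB_BASE   0x40000000u  /* start of the peripheral bit-band region */
#define PERIPH_BB_END    0x400FFFFFu  /* last byte of that region */
#define PERIPH_BB_ALIAS  0x42000000u  /* start of the matching alias region */

/* Alias address for bit `bit` (0..31) of the word-aligned register at `reg_addr`. */
uint32_t bitband_alias(uint32_t reg_addr, uint32_t bit)
{
    assert(reg_addr >= PERIPH_BB_BASE && reg_addr <= PERIPH_BB_END);
    assert((reg_addr & 3u) == 0u);   /* expect a word-aligned register address */
    assert(bit < 32u);
    /* 32 alias bytes per source byte, 4 alias bytes per source bit */
    return PERIPH_BB_ALIAS + (reg_addr - PERIPH_BB_BASE) * 32u + bit * 4u;
}
```

As a usage example, taking 0x4001080C as the register address (believed to be GPIOA's ODR on an STM32F103, but treat that as an assumption to verify), bit 5 maps to alias address 0x42210194.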

 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #21 on: February 04, 2015, 08:48:45 am »
Quote
Simplest is if you're DMAing to a whole GPIO port at once; if you're picking and choosing the pins it can get more complicated - though the "bit banding" on many M3's is one way to simplify that.
Most, if not all, manufacturers have ways to set individual GPIO pins. E.g. ST have bit set/reset registers, and NXP have a very unusual method where the LSBs of the address used to access the port become a mask.
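
As a sketch of how ST's bit set/reset register helps when only some pins of a port carry data: in the BSRR, the low 16 bits set pins and the high 16 bits reset them, and zero bits leave pins untouched. Pre-computing one 32-bit word per sample lets a 32-bit DMA write update the selected pins without disturbing the rest. `bsrr_word` is a hypothetical helper name, not an ST library function.

```c
#include <assert.h>
#include <stdint.h>

/* Build a BSRR-style word that drives the pins selected by `mask` to the
 * levels given in `value`, leaving all other pins of the port unchanged. */
uint32_t bsrr_word(uint16_t value, uint16_t mask)
{
    uint32_t set_bits   = (uint32_t)(value & mask);             /* pins to drive high */
    uint32_t reset_bits = (uint32_t)((uint16_t)~value & mask);  /* pins to drive low */
    assert((set_bits & reset_bits) == 0u);  /* a pin is never both set and reset */
    return set_bits | (reset_bits << 16);
}
```

For example, `bsrr_word(0x00A5, 0x00FF)` yields 0x005A00A5: the low byte of the port takes the value 0xA5 and the upper eight pins are not touched. The cost is that the source array holds 32-bit words rather than 16-bit ones.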

Offline miceuz

  • Frequent Contributor
  • **
  • Posts: 387
  • Country: lt
    • chirp - a soil moisture meter / plant watering alarm
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #22 on: February 04, 2015, 08:51:10 am »
This project uses DMA on a Freescale Kinetis K20 to bit-bang data for WS2812 LEDs on an 8-bit port:
https://github.com/scanlime/fadecandy

Offline Yansi

  • Super Contributor
  • ***
  • Posts: 3893
  • Country: 00
  • STM32, STM8, AVR, 8051
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #23 on: February 04, 2015, 09:04:31 am »
Quote
Now the problem is, that GPIO<->DMA transfers seem to be largely omitted by chip manufacturers. [...]

Given that ARM Cortex-M has a single address space with memory-mapped peripherals, where the hell is the problem with telling the DMA controller to puke data into GPIO?

I don't know how well other ARM vendors perform at this, but on STM32 DMA-to-GPIO transfers work pretty well and, surprisingly, some people do use them.

The DMA controller is a universal block: you only tell it the address to start from and where to copy the data to. There aren't any restrictions known to me on what addresses it can be used with.

As has maybe been pointed out already, for an FPGA you need a synchronous or strobed data transfer rather than spewing data through GPIO. Use the external bus of the MCU (the FMC or FSMC peripheral on STM32). It can do strobed (asynchronous) or clocked (synchronous) transfers to peripherals, without software bit-banging.

« Last Edit: February 04, 2015, 09:08:18 am by Yansi »
 

Offline Scrts

  • Frequent Contributor
  • **
  • Posts: 797
  • Country: lt
Re: Looking for Cortex-M3 or M4 with DMA to GPIO capability
« Reply #24 on: February 04, 2015, 10:36:23 am »
I've tried the Atmel SAMA5 microprocessor's memory interface to an FPGA. It works well and it's highly customizable, which makes FPGA-side development really easy.
 

