Author Topic: Got bored, cooked up a board. (Read 5346 times)

technix · « **on:** November 06, 2018, 08:41:42 pm »

This board serves three purposes:
1. Escape from boredom
2. Proof-of-Concept for daisy-chained JTAG between MCU and CPLD
3. Proof-of-Concept for the Duplex and Triplex ideas

This board is an Arduino-style board with an STM32L432KCU6 MCU, a QSPI Flash chip, and a XC2C32A-2QFG32C CPLD. The two chips are bridged using 4-line SPI (no interrupt line unless you sacrifice chip select; I ran out of pins), they share a reset pin, and the CPLD is clocked from the clock output of the MCU. CPLD pin connections, including the MCU-CPLD bridge, are printed on the board for the sake of easier development.

This is also my first Arduino-style board with a built-in USB hub. Previously, for boards with two or more USB devices, I included one USB connector per device, but since that USB hub chip costs about the same as a USB micro-B connector I am starting to switch to this design, both as a cost-saving measure and for ease of use.

The JTAG chain is TDI - STM32L4 - XC2C32A - TDO. The connector follows CMSIS convention.
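A quick way to sanity-check a daisy chain like this is to put every TAP in BYPASS and count how many clocks a bit takes to travel from TDI to TDO. The following is only a toy model of that idea in Python, not the board's actual bring-up code; the `BypassTap` class and the function names are made up for illustration. As far as I know, STM32 parts present two TAPs in series internally (boundary scan plus the Cortex-M debug TAP), so a real scan of this board may well report three TAPs rather than two.

```python
"""Toy model of a JTAG daisy chain with every TAP in BYPASS (illustrative only)."""


class BypassTap:
    """A TAP whose selected data register is the 1-bit BYPASS register."""

    def __init__(self, name):
        self.name = name
        self.bit = 0  # current BYPASS register contents

    def shift(self, tdi):
        """Clock one bit in on TDI and return the bit that leaves on TDO."""
        tdo, self.bit = self.bit, tdi
        return tdo


def shift_chain(chain, tdi):
    """Shift one bit through the chain; chain[0] is nearest the connector's TDI."""
    for tap in chain:
        tdi = tap.shift(tdi)
    return tdi


def count_taps(chain, max_taps=16):
    """Flush the chain with zeros, then feed ones and count the delay to TDO."""
    for _ in range(max_taps):
        shift_chain(chain, 0)
    for delay in range(max_taps + 1):
        if shift_chain(chain, 1) == 1:
            return delay
    raise RuntimeError("no response on TDO; chain open or longer than max_taps")


# Assumed chain order from the post: TDI - STM32L4 - XC2C32A - TDO
chain = [BypassTap("STM32L4"), BypassTap("XC2C32A")]
print(count_taps(chain))  # 2
```

On real hardware the same count comes from the debug probe shifting the pattern through the data register path after loading all-ones into every instruction register.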

iMo · « **Reply #1 on:** November 07, 2018, 02:46:49 pm »

That CPLD is small.. Go with something bigger..

Warhawk · « **Reply #2 on:** November 07, 2018, 03:18:18 pm »

nice render, which tool did you use?

technix · « **Reply #3 on:** November 08, 2018, 02:03:15 am »

Quote from: imo on November 07, 2018, 02:46:49 pm

That CPLD is small.. Go with something bigger..

Something bigger means more pins which I have nowhere to bond out to.

Quote from: Warhawk on November 07, 2018, 03:18:18 pm

nice render, which tool did you use?

KiCad 5, ray trace mode. That render burns CPU.

hans · « **Reply #4 on:** November 08, 2018, 11:24:54 am »

But a 32-macrocell CPLD is basically only good for some glue logic.

If you implement a simple SPI receive/transmit statemachine, you probably burn half of the macrocells.
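To put a rough number on that estimate: a CoolRunner-II macrocell holds one flip-flop, so counting the registers a design needs gives a lower bound on macrocell use. The register list below is a hypothetical byte-wide SPI slave, not the design on this board, and it ignores macrocells spent on pure combinational logic.

```python
MACROCELLS = 32  # XC2C32A, per the thread

# Hypothetical register budget for a byte-wide SPI slave (an assumed design,
# not taken from the board's actual CPLD code).
spi_slave_registers = {
    "rx shift register": 8,
    "tx shift register": 8,
    "bit counter": 3,
}


def macrocell_usage(registers, total=MACROCELLS):
    """Return (used, fraction), assuming one flip-flop per macrocell."""
    used = sum(registers.values())
    return used, used / total


used, fraction = macrocell_usage(spi_slave_registers)
print(f"{used} of {MACROCELLS} macrocells ({fraction:.0%})")  # 19 of 32 macrocells (59%)
```

Sharing one shift register between receive and transmit would trim this, but any addressable output latches on top quickly eat the remainder.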

legacy · « **Reply #5 on:** November 08, 2018, 11:38:54 am »

I assume the board has no real purpose. Just a proof of concept for something.

BrianHG · « **Reply #6 on:** November 08, 2018, 12:16:24 pm »

The smallest Altera MAX 10 FPGA would give you 2000 logic elements/macrocells plus 10 kilobytes of dual-port memory with 27 IOs in a 36-pin BGA for $1 more, but it's a 1.25 V FPGA. The slightly larger ones are 3.3 V, but the price is double and they come in a larger 144-pin QFP.

technix · « **Reply #7 on:** November 08, 2018, 12:51:26 pm »

Quote from: hans on November 08, 2018, 11:24:54 am

But a 32-macrocell CPLD is basically only good for some glue logic.

If you implement a simple SPI receive/transmit statemachine, you probably burn half of the macrocells.

Yeah sadly. Is there any low pin count QFN or QFP CPLD or FPGA with more logic than 32 macrocells, hopefully from Xilinx or Intel? I am okay if the FPGA needs external configuration, as most FPGAs have an external configuration interface that can also be used for regular communication. In this case the QSPI Flash would be a sweet place for the FPGA bitstream, the SPI-based MCU-CPLD bridge could also be used to transfer that bitstream, and the MCU would upload it upon boot.

Quote from: legacy on November 08, 2018, 11:38:54 am

I assume the board has no real purpose. Just a proof of concept for something.

More POC than actual use. It is a STM32 QSPI POC, it is a JTAG daisy chain POC, it is also a POC for a subsystem I am using on another project (I never got to test due to some other problems in that project.)

Quote from: BrianHG on November 08, 2018, 12:16:24 pm

The smallest Altera Max 10 FPGA would give you 2000 logic elements/macrocells plus 10 kilobytes of dual port memory with 27 IOs, 36 pin bga for 1$ more, but, its a 1.25v FPGA. The slightly larger ones are 3.3v, but, the price is double and they are a larger 144pin QFP.

1.25V... I don't think there is any STM32 that can run at that low a voltage, at least not flashed when running at that voltage. And the pain of level shifting bus transceivers is something I am not willing to bear again.

legacy · « **Reply #8 on:** November 08, 2018, 12:55:37 pm »

Quote from: technix on November 08, 2018, 12:51:26 pm

More POC than actual use. It is a STM32 QSPI POC, it is a JTAG daisy chain POC, it is also a POC for a subsystem I am using on another project (I never got to test due to some other problems in that project.)

Why do you need a CPLD/FPGA?

technix · « **Reply #9 on:** November 08, 2018, 12:58:44 pm »

Quote from: legacy on November 08, 2018, 12:55:37 pm

Quote from: technix on November 08, 2018, 12:51:26 pm
More POC than actual use. It is a STM32 QSPI POC, it is a JTAG daisy chain POC, it is also a POC for a subsystem I am using on another project (I never got to test due to some other problems in that project.)

Why do you need a CPLD/FPGA?

On that target project there was a bit of glue logic with quite a few control points involved, the MCU there ran out of pins, and stepping up the MCU costs more than adding that CPLD. So that CPLD was brought in both as the replacement for the glue logic block and as an SPI I/O expansion.

asmi · « **Reply #10 on:** November 08, 2018, 03:54:51 pm »

Quote from: technix on November 08, 2018, 12:51:26 pm

1.25V... I don't think there is any STM32 that can run at that low a voltage, at least not flashed when running at that voltage. And the pain of level shifting bus transceivers is something I am not willing to bear again.

1.25V is just core voltage, IO voltage can be anything you want up to 3.3 V.

technix · « **Reply #11 on:** November 08, 2018, 03:59:27 pm »

Quote from: asmi on November 08, 2018, 03:54:51 pm

Quote from: technix on November 08, 2018, 12:51:26 pm
1.25V... I don't think there is any STM32 that can run at that low a voltage, at least not flashed when running at that voltage. And the pain of level shifting bus transceivers is something I am not willing to bear again.
1.25V is just core voltage, IO voltage can be anything you want up to 3.3 V.

With 3.3V VDDIO I do get to use it in a similar design next time, but is it available in QFP or QFN package? Vcore really isn't a big issue here as I can just put a local regulator right next to it (as I have done here with that LP2985-1.8 next to the CPLD).

What part number is it again? MAX 10 was surprisingly hard to find here in China though...

asmi · « **Reply #12 on:** November 08, 2018, 04:27:05 pm »

Quote from: technix on November 08, 2018, 03:59:27 pm

With 3.3V VDDIO I do get to use it in a similar design next time, but is it available in QFP or QFN package?

No, the only non-BGA option available is QFP144 with 0.5 mm pitch.

Quote from: technix on November 08, 2018, 03:59:27 pm

Vcore really isn't a big issue here as I can just put a local regulator right next to it (as I have done here with that LP2985-1.8 next to the CPLD).

Depending on the device, you will need to use switched DC-DC as FPGA can consume quite large current. For Xilinx 7 series devices this is all but required because core voltage is even lower (1.0 V, or even 0.95 V for some parts), and some larger Artix devices can consume about 10 amps of current on 1.0 V rail.

Quote from: technix on November 08, 2018, 03:59:27 pm

What part number is it again? MAX 10 was surprisingly hard to find here in China though...

Smallest MAX10 part in QFP package is 10M02SCE144C8G. The smallest package is WLCSP-36, but it's a 0.4 mm pitch part, so a PCB for it is going to be expensive unless you come up with some clever tricks to break it out.

technix · « **Reply #13 on:** November 08, 2018, 04:55:24 pm »

Quote from: asmi on November 08, 2018, 04:27:05 pm

Quote from: technix on November 08, 2018, 03:59:27 pm
With 3.3V VDDIO I do get to use it in a similar design next time, but is it available in QFP or QFN package?
No, the only non-BGA option available is QFP144 with 0.5 mm pitch.

Oof... That sounds to me like an overlap with the EP4CE6E22 Cyclone IV and XC6SLX9-2PQG144C Spartan-6 chips, which are much cheaper and much easier to find while having the same package size.

Quote from: asmi on November 08, 2018, 04:27:05 pm

Quote from: technix on November 08, 2018, 03:59:27 pm
Vcore really isn't a big issue here as I can just put a local regulator right next to it (as I have done here with that LP2985-1.8 next to the CPLD).
Depending on the device, you will need to use switched DC-DC as FPGA can consume quite large current. For Xilinx 7 series devices this is all but required because core voltage is even lower (1.0 V, or even 0.95 V for some parts), and some larger Artix devices can consume about 10 amps of current on 1.0 V rail.

If it is a 1A beast I do have a good chip for that: MP2212 5V input dual 2A switch mode regulator, one output for the 3.3V rail and the other for 1.2V. Going up it would be TPS563201 for 3A, I think I have a Chinese-made 5A chip somewhere. If it goes even higher I would break out the big gun known as STM32F334K8.

Quote from: asmi on November 08, 2018, 04:27:05 pm

Quote from: technix on November 08, 2018, 03:59:27 pm
What part number is it again? MAX 10 was surprisingly hard to find here in China though...
Smallest MAX10 part in QFP package is 10M02SCE144C8G. The smallest package is WLCSP-36, but it's a 0.4 mm pitch part, so a PCB for it is going to be expensive unless you come up with some clever tricks to break it out.

The main offenders here are C3, C4, D3 and D4 pins, the remaining pins should be routable within the same layer. C4 and D4 are VDDIO, which can just via down. C3 and D3 are actually JTAG so... tough cookies. It will be a challenge for 2-layer board to use for sure.

p.s. LCSC carries that chip.

BrianHG · « **Reply #14 on:** November 08, 2018, 05:27:08 pm »

The core voltage current on a MAX10 is less than 200 mA; it only has 2000 logic elements, 110592 bits of RAM, and an up to 450 MHz core. You also get an internal oscillator and PLL.

3.78$ 10M02DCV36C8G is the cheapest 36 pin bga
7.38$ 10M02SCE144C8G is the cheapest 144pin TQFP, with built in core regulator meaning all you need is 3.3v for this guy.

The 144 pin guy has migration upgrade-ability.

iMo · « **Reply #15 on:** November 08, 2018, 05:45:00 pm »

Quote from: technix on November 08, 2018, 12:51:26 pm

.. Is there any low pin count QFN or QFP CPLD or FPGA with more logic than 32 macrocells, hopefully from Xilinx or Intel? ..

For example I worked with:

iCE40LP384 is 32pin qfn, 384 LEs, 1.2V core, 3.3V IO, $1.5 at single quantity.

iCE40UP5k is 48pin qfn, 1.2V core, 3.3V IO, 5280 LEs, 128kB SRAM, 15kB BRAM, 8xDSP, PLL, 2xI2C and 2xSPI hardened on chip, $6 single quantity.

Free ICECUBE2, RADIANT, ICESTORM (open source) dev tools.

asmi · « **Reply #16 on:** November 08, 2018, 06:02:50 pm »

Quote from: imo on November 08, 2018, 05:45:00 pm

For example I worked with:

iCE40LP384 is 32pin qfn, 384 LEs, 1.2V core, 3.3V IO, $1.5 at single quantity.

iCE40UP5k is 48pin qfn, 1.2V core, 3.3V IO, 5280 LEs, 128kB SRAM, 15kB BRAM, 8xDSP, PLL, 2xI2C and 2xSPI hardened on chip, $6 single quantity.

Free ICECUBE2, RADIANT, ICESTORM (open source) dev tools.

The problem with these parts is that they are SRAM-based, meaning you will need some sort of external storage to boot these things up, while MAX10 is flash-based so it's (almost) instant-on and requires no external components. The good news is that ICE parts (like most other SRAM-based FPGAs) can be boot-strapped by the MCU if you happen to have one onboard, which allows you to consolidate storage (or simply use the MCU's flash memory to store the bitstream by incorporating it into the MCU's firmware). These parts do have OTP memory, but this is unsuitable for me as I want to be able to reprogram them.
These small MAX10 guys would be the perfect choice for me if Antel would for once use common sense when designing pinout such that VCC and GND balls would be in the middle of the device (like pretty much all other BGAs I've ever worked with, except perhaps DDR3, but that's a different story) allowing using small planelets with a single via in between balls.

iMo · « **Reply #17 on:** November 08, 2018, 06:10:14 pm »

The iCE40LP384 requires min 8kB bitstream flash (or reads the bitstream off MCU, or OTP).
The iCE40UP5k requires min 128kB bitstream flash (or reads the bitstream off MCU, or OTP).
The nice thing is the bitstream flash is accessible off the iCE40 (via the boot SPI pins) so the FPGA usercode can read/write its own data into the bitstream flash.

technix · « **Reply #18 on:** November 08, 2018, 06:27:53 pm »

Quote from: imo on November 08, 2018, 05:45:00 pm

For example I worked with:

iCE40LP384 is 32pin qfn, 384 LEs, 1.2V core, 3.3V IO, $1.5 at single quantity.

iCE40UP5k is 48pin qfn, 1.2V core, 3.3V IO, 5280 LEs, 128kB SRAM, 15kB BRAM, 8xDSP, PLL, 2xI2C and 2xSPI hardened on chip, $6 single quantity.

Free ICECUBE2, RADIANT, ICESTORM (open source) dev tools.

If I buy it here it is $2 for single. LCSC doesn't carry it themselves but I can buy it through them from Mouser or Digi-key.

As for the dev tools, is there any low cost or free interface hardware, or can Lattice tools borrow another vendor's debug pods? (I kind of wish CMSIS-DAP would take off as the universal USB to JTAG interface here. While it is designed by ARM it is fully open source and vendor independent, and can work on almost any USB-capable microcontroller.)

Quote from: asmi on November 08, 2018, 06:02:50 pm

The problem with these parts is that they are SRAM-based, meaning you will need some sort of external storage to boot these things up, while MAX10 is flash-based so it's (almost) instant-on and requires no external components. The good news is that ICE parts (like most other SRAM-based FPGAs) can be boot-strapped by the MCU if you happen to have one onboard, which allows you to consolidate storage (or simply use the MCU's flash memory to store the bitstream by incorporating it into the MCU's firmware). These parts do have OTP memory, but this is unsuitable for me as I want to be able to reprogram them.
These small MAX10 guys would be the perfect choice for me if Antel would for once use common sense when designing pinout such that VCC and GND balls would be in the middle of the device (like pretty much all other BGAs I've ever worked with, except perhaps DDR3, but that's a different story) allowing using small planelets with a single via in between balls.

I am okay with SRAM-based FPGA at least on this board and that specific project, if the FPGA can be configured through slave SPI interface and the same pins can be used after configuration is loaded.

For this board there is that 16MB QSPI chip attached to the micro. I doubt I would ever really have that much code or that many resources on a 32-pin micro, so it does seem natural to put the FPGA bitstream in QSPI. I can just set up QSPI in XIP mode and start a DMA memory-to-peripheral transaction to dump QSPI contents into the FPGA-facing SPI port, with no CPU involvement even while the CPU is starting up other hardware and waiting for USB enumeration.

For the other board the GigaDevice chip there has a peculiarity: the latter 64kB half of the Flash on the 128kB device is not SRAM shadowed (they use QSPI internally) so it is rarely used as code storage. I just need to figure out a way to convince the MCU to shadow the first 64kB fully, then start a similar DMA transaction dumping the latter half of the Flash into the FPGA.

Quote from: imo on November 08, 2018, 06:10:14 pm

The iCE40LP384 requires min 8kB bitstream flash (or reads the bitstream off MCU, or OTP).
The iCE40UP5k requires min 128kB bitstream flash (or reads the bitstream off MCU, or OTP).
The nice thing is the bitstream flash is accessible off the iCE40 (via the boot SPI pins) so the FPGA usercode can read/write its own data into the bitstream flash.

I believe none of those can fill up a 16MB QSPI chip here, so set up XIP and start DMA dumping.

However for my other project I do have a 64kB bitstream image size cap. Maybe I should just run libz on it and decompress it into the FPGA when MCU boots?
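A host-side sketch of that idea, using Python's `zlib` module to model both ends: compress the image at build time, check it against the 64kB cap, then inflate it in small chunks as the MCU would while feeding the FPGA. The `send` callback, the chunk size, and the synthetic "bitstream" are all assumptions for illustration; on the MCU itself a small inflate implementation would stand in for `zlib.decompressobj`.

```python
import zlib

FLASH_CAP = 64 * 1024  # image size cap mentioned above


def pack_bitstream(raw: bytes) -> bytes:
    """Compress on the host at build time and enforce the size cap."""
    packed = zlib.compress(raw, 9)
    if len(packed) > FLASH_CAP:
        raise ValueError(f"compressed image is {len(packed)} bytes, cap is {FLASH_CAP}")
    return packed


def stream_to_fpga(packed: bytes, send, chunk=256):
    """Inflate in small input chunks and pass each output piece to `send`.

    `send` is a hypothetical callback standing in for an SPI write.
    """
    inflater = zlib.decompressobj()
    for offset in range(0, len(packed), chunk):
        out = inflater.decompress(packed[offset:offset + chunk])
        if out:
            send(out)
    tail = inflater.flush()
    if tail:
        send(tail)


# Synthetic, mostly-zero stand-in for a sparse bitstream (NOT a real image).
raw = bytearray(100 * 1024)
raw[::64] = b"\xa5" * len(raw[::64])
packed = pack_bitstream(bytes(raw))

received = bytearray()
stream_to_fpga(packed, received.extend)
print(len(raw), len(packed), bytes(received) == bytes(raw))
```

Note that the output of each `decompress` call is not bounded here; a RAM-constrained target would cap it (Python exposes this as the `max_length` argument) and loop on the unconsumed input.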

Quote from: blueskull on November 08, 2018, 06:13:37 pm

I second iCE40LP384-SG32. I've used it in various projects, mostly for timing.
It requires 8kB of config data, easily fits in a modern ARM CM chip.
Since the config bits are highly sparse in nature, you can implement some sort of simple compression scheme to easily shrink file size by 20%~50%.
I've done booting from MSP430FR2433.

The bootloading protocol is very simple and is fully open to the public, as described in Lattice TN1248 document.

Maybe I should evaluate that chip some time then. How do I find the debug pods though?

iMo · « **Reply #19 on:** November 08, 2018, 06:37:08 pm »

Lattice (and IceStorm) programming tools support FT2232H. The FT232H works as well.
You need an FT2232H or an FT232H breakout board only. No need for special firmware, afaik. Lattice tools recognize both.
UPduino v2 uses FT232H on its board.
You may use any SPI flash programmer to flash the bitstream flash in-circuit (e.g. a modded $1.5 usbasp).

SiliconWizard · « **Reply #20 on:** November 08, 2018, 06:38:39 pm »

From boredom to boardom!

I was going to suggest the ice40UP but those don't have a JTAG interface as far as I know (unless I missed something). They are accessed via SPI. So that would exclude it from your JTAG scan chain?

Don't know for the other ice40 series.

technix · « **Reply #21 on:** November 08, 2018, 06:55:00 pm »

Quote from: blueskull on November 08, 2018, 06:34:45 pm

Quote from: technix on November 08, 2018, 06:27:53 pm
Maybe I should evaluate that chip some time then. How do I find the debug pods though?

There is a $150 debug cable from Lattice, but you can cook your own with their firmware (somewhere scattered on this forum, I posted it) and write it into an FT2232H.
DigiKey sells a $30 FT2232H breakout board.

Alternatively, you can get a starter kit from Lattice, running between $20 to $200. Both have onboard debugger.

I did not find JTAG interface on that chip... How is it debugged? Will JTAG and SPI pins clash? (If JTAG and SPI do share pins, having more details on the protocol might actually allow me to implement the debug pod protocol directly on the STM32 chip.)

Quote from: SiliconWizard on November 08, 2018, 06:38:39 pm

From boredom to boardom!

I was going to suggest the ice40UP but those don't have a JTAG interface as far as I know (unless I missed something). They are accessed via SPI. So that would exclude it from your JTAG scan chain?

Don't know for the other ice40 series.

One of the POC here is exactly putting the MCU and the CPLD in the same scan chain. If that Lattice chip doesn't have a JTAG interface the whole purpose is defeated.

However being accessed over SPI does give me ideas on a future STM32L433C + QSPI + iCE40LP384 board in Arduino Uno form factor. Once again that QSPI is code storage for FPGA. Analog pins, USB, one hardware serial port and I2C pins have to be implemented using the STM32, but I will expose as many FPGA pins as possible.

technix · « **Reply #22 on:** November 08, 2018, 07:05:19 pm »

Quote from: imo on November 08, 2018, 06:37:08 pm

Lattice (and IceStorm) programming tools support FT2232H. The FT232H works as well.
You need an FT2232H or an FT232H breakout board only. No need for special firmware, afaik. Lattice tools recognize both.
UPduino v2 uses FT232H on its board.
You may use any SPI flash programmer to flash the bitstream flash in-circuit (e.g. a modded $1.5 usbasp).

I wonder if FT232H can be similarly programmed as FT2232H.

I have MiniPro TL866II and SO-8 programming socket.

technix · « **Reply #23 on:** November 08, 2018, 07:07:14 pm »

I wonder how good of an idea it is to use STM32 MCO as the clock source for the CPLD/FPGA? This does save me an oscillator, and for FPGAs it does establish a booting sequence: MCU first, FPGA later.

iMo · « **Reply #24 on:** November 08, 2018, 07:09:30 pm »

iCE40 FPGAs do not need any clock to be booted. They have got their own on-chip clock (w/ selectable speeds) within their SPI interface.
PS: there is a reset pin called CRESET_B (CBR); when it is released the FPGA starts to boot, in slave, master, or OTP mode. After some time it signals on a pin called CDONE that the boot is done, and after some more time the SPI pins are enabled for the user (except the /CS pin).
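Seen from the MCU side, the slave-mode sequence described above (pulse the reset pin, which Lattice's documentation calls CRESET_B, send the image, wait for CDONE, then clock a little longer) might be sketched like this. It is a simplified model in Python, not firmware: the `io` object and its four methods are hypothetical, and the delays and dummy-clock counts are placeholders that should be taken from Lattice TN1248 rather than from here.

```python
import time


def configure_ice40(io, bitstream, settle_s=0.002, extra_clock_bytes=8):
    """Load `bitstream` into an iCE40 over slave SPI.

    `io` is a hypothetical board-support object providing set_cs(level),
    set_creset(level), spi_write(data) and read_cdone().
    """
    io.set_cs(0)          # chip select held low at reset release selects slave mode
    io.set_creset(0)      # assert reset
    time.sleep(settle_s)  # placeholder delay
    io.set_creset(1)      # release reset; the part clears its configuration RAM
    time.sleep(settle_s)  # placeholder delay, check the datasheet minimum
    io.spi_write(bitstream)
    io.spi_write(bytes(extra_clock_bytes))  # dummy clocks while CDONE rises
    if not io.read_cdone():
        raise RuntimeError("CDONE stayed low; configuration failed")
    io.spi_write(bytes(extra_clock_bytes))  # more clocks so the SPI pins are released
    io.set_cs(1)
```

Because the FPGA is only a passive SPI slave here, the MCU fully controls when it comes up, which gives the MCU-first, FPGA-later ordering without relying on the clock source.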

