Author Topic: Physical to Logical Mapping: Alternatives?  (Read 866 times)


Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
Physical to Logical Mapping: Alternatives?
« on: May 14, 2021, 10:45:30 am »


I'm using ATSAMC21J for a particular retro project which is 5V based (hence the ATSAMC21 fits rather nicely); the project is quite generic... basically it is meant to be a test circuit that you drop into a 40 pin DIP socket and then you can twiddle the values.  Because it is generic I cannot easily control the physical mapping of function to a pin (since it differs for different target devices)...

This leaves me with a dog's breakfast of addressing in the PORT registers PORTA and PORTB... each is 32 bits.

So... if I want to change the value of all the A's in green.... A0..A15... the target address bus.. I need to twiddle bits... each bit is going to take a few instructions until I get a value that I can write.. indeed the change can never be atomic since it is spread across 2 registers....

All this bit twiddling is going to take a lot of time (probably manageable but not great) and, for instance, guarantees that I would never be able to emulate the underlying device at device speeds (my clock is 48MHz; theirs is 3MHz).

Can anyone suggest a 5V part, 64 QFN or similar where I could map the physical pins into a more logical structure?  Or any other clever tricks for that matter!

Thanks in advance
 

Offline ajb

  • Super Contributor
  • ***
  • Posts: 2582
  • Country: us
Re: Physical to Logical Mapping: Alternatives?
« Reply #1 on: May 14, 2021, 03:11:56 pm »
There are software techniques that can be used to make bit remapping like this faster (versus one bit at a time anyway), but they depend on how convoluted the map is and are probably not fast enough to run a 3MHz bus (is it actually 3MHz, or is that the CPU clock and bus is slower?) from a 48MHz MCU.

I know you said the device is meant to be generic, would an adapter board that remaps the pins as needed per application be a problem?  What you're doing looks a lot like emulating an old micro bus, so you can probably cover a ton of target boards with just a handful of adapters for the most common parts if that's the case?

Ultimate flexibility would be to use an FPGA to provide the pin remapping, but not sure how many 5V options there are these days. 
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 21606
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Physical to Logical Mapping: Alternatives?
« Reply #2 on: May 15, 2021, 12:38:54 am »
Aside, I don't know why or how the fuck they make those nonsense pinouts... ::)

Given that that's the state of things, the intended solution is the PCB, that's what it's there for.  Perhaps you should consider a second riser board on top of the DIP40, to configure for different MCU pinouts?

You may also be interested in an MCU with an external bus interface (it's, uh, EBI in AVRs, not sure if it's the same in SAMs; see also FSMC in STM32s).  If the timing matches (or can be fitted with a few logic gates), you can end up with the peripheral bus mapped into physical memory, vastly improving throughput and simplifying access.

The upscale AVR XMEGAs (and probably newer MEGAs?) have EBI, generally have logical pinouts, and are available in QFNs, but maybe don't have the memory or CPU power you need here, I don't know.  They're also 3.3V devices.

3.3V isn't a hard stop: it's TTL compatible.  That may be retro-relevant.  For example I have an XMEGA directly wired to a GPIB bus, it works fine.  It's not a good idea if 5V CMOS devices are present, however.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6172
  • Country: fi
    • My home page and email address
Re: Physical to Logical Mapping: Alternatives?
« Reply #3 on: May 15, 2021, 07:25:32 am »
ATSAMC21 has PORTx.OUT, PORTx.OUTSET, PORTx.OUTCLR, and PORTx.OUTTGL registers to set the output pin states; PORTx.DIR, PORTx.DIRSET, PORTx.DIRCLR, and PORTx.DIRTGL to set the pin direction.  You only need bit operations when reading pin states.  SAMC21 is a Cortex-M0+ with a single-cycle multiplier, so we can do something pretty crafty:

Code: [Select]
#include <stdint.h>

/* TODO: Verify these addresses! */
#define  PORTA_DIRCLR  (*(volatile uint32_t *)0x60000004)
#define  PORTA_DIRSET  (*(volatile uint32_t *)0x60000008)
#define  PORTA_DIRTGL  (*(volatile uint32_t *)0x6000000C)
#define  PORTA_OUTCLR  (*(volatile uint32_t *)0x60000014)
#define  PORTA_OUTSET  (*(volatile uint32_t *)0x60000018)
#define  PORTA_OUTTGL  (*(volatile uint32_t *)0x6000001C)
#define  PORTA_IN      (*(volatile uint32_t *)0x60000020)
#define  PORTB_DIRCLR  (*(volatile uint32_t *)0x60000084)
#define  PORTB_DIRSET  (*(volatile uint32_t *)0x60000088)
#define  PORTB_DIRTGL  (*(volatile uint32_t *)0x6000008C)
#define  PORTB_OUTCLR  (*(volatile uint32_t *)0x60000094)
#define  PORTB_OUTSET  (*(volatile uint32_t *)0x60000098)
#define  PORTB_OUTTGL  (*(volatile uint32_t *)0x6000009C)
#define  PORTB_IN      (*(volatile uint32_t *)0x600000A0)

/* Pin configuration, 640 bytes. */
uint32_t  pin_mask[40][2];
uint32_t  pin_mult[40][2];

/* PA00 = 0, PA31 = 31, PB00 = 32, PB31 = 63. */
int pin_define(const int pin, const int num)
{
    /* Safety check */
    if (pin < 0 || pin >= 40 || num < 0 || num >= 64)
        return -1;

    if (num < 32) {
        pin_mask[pin][0] = ((uint32_t)1) << num;
        pin_mask[pin][1] = 0;
        pin_mult[pin][0] = ((uint32_t)1) << (31 - num);
        pin_mult[pin][1] = 0;
    } else {
        pin_mask[pin][0] = 0;
        pin_mask[pin][1] = ((uint32_t)1) << (num - 32);
        pin_mult[pin][0] = 0;
        pin_mult[pin][1] = ((uint32_t)1) << (63 - num);
    }

    return 0;
}

void pin_mode_in(const int pin)  { PORTA_DIRCLR = pin_mask[pin][0]; PORTB_DIRCLR = pin_mask[pin][1]; }
void pin_mode_out(const int pin) { PORTA_DIRSET = pin_mask[pin][0]; PORTB_DIRSET = pin_mask[pin][1]; }
void pin_mode_tgl(const int pin) { PORTA_DIRTGL = pin_mask[pin][0]; PORTB_DIRTGL = pin_mask[pin][1]; }
void pin_mode(const int pin, const int mode)
{
    if (mode & 1) {
        pin_mode_out(pin);
    } else {
        pin_mode_in(pin);
        /* TODO: pullups etc., per additional mode bits */
    }
}

void pin_out_set(const int pin)  { PORTA_OUTSET = pin_mask[pin][0]; PORTB_OUTSET = pin_mask[pin][1]; }
void pin_out_clr(const int pin)  { PORTA_OUTCLR = pin_mask[pin][0]; PORTB_OUTCLR = pin_mask[pin][1]; }
void pin_out_tgl(const int pin)  { PORTA_OUTTGL = pin_mask[pin][0]; PORTB_OUTTGL = pin_mask[pin][1]; }
void pin_out(const int pin, const int state)
{
    if (state) {
        pin_out_set(pin);
    } else {
        pin_out_clr(pin);
    }
}

uint32_t pin_in(const int pin)
{
    return !!(((PORTA_IN * pin_mult[pin][0]) | (PORTB_IN * pin_mult[pin][1])) & 0x80000000);
}

which using arm-gcc 5.4.1 (-Wall -Os -mcpu=cortex-m0plus -mthumb) generates
Code: [Select]
    .syntax unified
    .cpu cortex-m0plus
    .fpu softvfp
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 1
    .eabi_attribute 30, 4
    .eabi_attribute 34, 0
    .eabi_attribute 18, 4
    .thumb
    .syntax unified
    .file   "ops.c"
    .text

    .align  1
    .global pin_define
    .code   16
    .thumb_func
    .type   pin_define, %function
pin_define:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    push    {r4, r5, r6, r7, lr}
    cmp     r0, #39
    bhi     .L5
    cmp     r1, #63
    bhi     .L5
    ldr     r2, .L7
    lsls    r3, r0, #3
    ldr     r5, .L7+4
    cmp     r1, #31
    bgt     .L3
    movs    r4, #1
    movs    r0, r4
    lsls    r0, r0, r1
    str     r0, [r2, r3]
    movs    r0, #0
    adds    r2, r2, r3
    str     r0, [r2, #4]
    movs    r2, #31
    subs    r1, r2, r1
    lsls    r4, r4, r1
    str     r4, [r5, r3]
    adds    r3, r5, r3
    str     r0, [r3, #4]
    b       .L2
.L3:
    movs    r4, #1
    movs    r6, r1
    movs    r7, r4
    movs    r0, #0
    subs    r6, r6, #32
    lsls    r7, r7, r6
    str     r0, [r2, r3]
    adds    r2, r2, r3
    str     r7, [r2, #4]
    movs    r2, #63
    subs    r1, r2, r1
    lsls    r4, r4, r1
    str     r0, [r5, r3]
    adds    r3, r5, r3
    str     r4, [r3, #4]
    b       .L2
.L5:
    movs    r0, #1
    rsbs    r0, r0, #0
.L2:
    @ sp needed
    pop {r4, r5, r6, r7, pc}
.L8:
    .align  2
.L7:
    .word   pin_mask
    .word   pin_mult
    .size   pin_define, .-pin_define

    .align  1
    .global pin_mode_in
    .code   16
    .thumb_func
    .type   pin_mode_in, %function
pin_mode_in:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L10
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L10+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L10+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L11:
    .align  2
.L10:
    .word   pin_mask
    .word   1610612740
    .word   1610612868
    .size   pin_mode_in, .-pin_mode_in

    .align  1
    .global pin_mode_out
    .code   16
    .thumb_func
    .type   pin_mode_out, %function
pin_mode_out:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L13
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L13+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L13+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L14:
    .align  2
.L13:
    .word   pin_mask
    .word   1610612744
    .word   1610612872
    .size   pin_mode_out, .-pin_mode_out

    .align  1
    .global pin_mode_tgl
    .code   16
    .thumb_func
    .type   pin_mode_tgl, %function
pin_mode_tgl:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L16
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L16+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L16+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L17:
    .align  2
.L16:
    .word   pin_mask
    .word   1610612748
    .word   1610612876
    .size   pin_mode_tgl, .-pin_mode_tgl

    .align  1
    .global pin_mode
    .code   16
    .thumb_func
    .type   pin_mode, %function
pin_mode:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    push    {r4, lr}
    lsls    r3, r1, #31
    bpl     .L19
    bl      pin_mode_out
    b       .L18
.L19:
    bl      pin_mode_in
.L18:
    @ sp needed
    pop     {r4, pc}
    .size   pin_mode, .-pin_mode

    .align  1
    .global pin_out_set
    .code   16
    .thumb_func
    .type   pin_out_set, %function
pin_out_set:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L22
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L22+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L22+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L23:
    .align  2
.L22:
    .word   pin_mask
    .word   1610612760
    .word   1610612888
    .size   pin_out_set, .-pin_out_set

    .align  1
    .global pin_out_clr
    .code   16
    .thumb_func
    .type   pin_out_clr, %function
pin_out_clr:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L25
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L25+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L25+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L26:
    .align  2
.L25:
    .word   pin_mask
    .word   1610612756
    .word   1610612884
    .size   pin_out_clr, .-pin_out_clr

    .align  1
    .global pin_out_tgl
    .code   16
    .thumb_func
    .type   pin_out_tgl, %function
pin_out_tgl:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    ldr     r3, .L28
    lsls    r0, r0, #3
    ldr     r1, [r0, r3]
    ldr     r2, .L28+4
    adds    r0, r3, r0
    str     r1, [r2]
    ldr     r2, [r0, #4]
    ldr     r3, .L28+8
    @ sp needed
    str     r2, [r3]
    bx      lr
.L29:
    .align  2
.L28:
    .word   pin_mask
    .word   1610612764
    .word   1610612892
    .size   pin_out_tgl, .-pin_out_tgl
    .align  1

    .global pin_out
    .code   16
    .thumb_func
    .type   pin_out, %function
pin_out:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    push    {r4, lr}
    cmp     r1, #0
    beq     .L31
    bl      pin_out_set
    b       .L30
.L31:
    bl      pin_out_clr
.L30:
    @ sp needed
    pop     {r4, pc}
    .size   pin_out, .-pin_out

    .align  1
    .global pin_in
    .code   16
    .thumb_func
    .type   pin_in, %function
pin_in:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    movs    r2, r0
    ldr     r3, .L34
    push    {r4, lr}
    ldr     r4, [r3]
    ldr     r3, .L34+4
    ldr     r1, .L34+8
    ldr     r0, [r3]
    lsls    r3, r2, #3
    ldr     r2, [r3, r1]
    adds    r3, r1, r3
    ldr     r3, [r3, #4]
    muls    r2, r4
    muls    r0, r3
    orrs    r0, r2
    lsrs    r0, r0, #31
    @ sp needed
    pop     {r4, pc}
.L35:
    .align  2
.L34:
    .word   1610612768
    .word   1610612896
    .word   pin_mult
    .size   pin_in, .-pin_in

    .comm   pin_mult,320,4
    .comm   pin_mask,320,4
    .ident  "GCC: (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release) [ARM/embedded-5-branch revision 240496]"

(exact output, only whitespace modified for easier reading).

Essentially, pin_mask[logical][bank] contains the bit mask – only one bit set per logical pin! – used with clear/set/toggle registers (when setting pin direction or output state); and pin_mult[logical][bank] contains a multiplier that shifts the desired bit to the most significant position, or zero if the bank does not affect the logical pin state, for use when reading the pin states.  You have 40 logical pins, and the SAMC21 has GPIO pins in two logical banks, so these lookup tables do take 640 bytes of SRAM.
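
The multiplier trick can be checked on a PC before touching hardware.  The following is only a minimal host-side sketch, not code from the post: the helper names mult_for_bit and read_bit_mult are made up for illustration, and the two bank values are passed in as plain arguments instead of being read from the PORTx.IN registers.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier that moves bit `bit` (0..31) of a 32-bit word to bit 31.
   Multiplying by 2^(31-bit) is the same as shifting left by (31-bit). */
static uint32_t mult_for_bit(unsigned bit)
{
    assert(bit < 32);
    return (uint32_t)1 << (31u - bit);
}

/* Read one logical pin from two simulated 32-bit banks.
   A multiplier of zero makes that bank contribute nothing,
   so no branch on the bank number is needed. */
static uint32_t read_bit_mult(uint32_t in_a, uint32_t mult_a,
                              uint32_t in_b, uint32_t mult_b)
{
    /* Unsigned multiplication wraps modulo 2^32; only bit 31 is kept. */
    return ((in_a * mult_a) | (in_b * mult_b)) >> 31;
}
```

For a logical pin that lives on bit 5 of bank A, one would call read_bit_mult(in_a, mult_for_bit(5), in_b, 0); the state of bank B is then ignored entirely.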

Is this fast enough for you?  I doubt you can get 3 MHz with a 48 MHz MCU – that is 1:12 – but it is not that much work.

A similar approach (a bit mask per pin per bank with the CLR/SET/TGL registers, and a multiplier to "shift" the input bit to the highest bit position while still being able to clear the result to zero) works on many other ARMs as well.  Note that instead of a multiplier, you can use a shift count, but only if the shift instruction supports clearing the entire register (shift down by 32), or if you do not use the pin corresponding to bit 0 in any bank (so that you can add an explicit additional shift right).  But when you have a single-cycle 32×32 multiplication instruction, it makes sense to use it to one's advantage.
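
As an illustration of the shift-count variant, here is one possible reading of the "additional shift right" idea, written as a host-side sketch.  The names SHIFT_NONE, shift_for_bit and read_bit_shift are hypothetical.  In C, shifting a 32-bit value by 32 is undefined, so this sketch stores (bit - 1) as the shift count and always shifts by one more; a stored count of 31 then yields zero for a bank that does not take part.

```c
#include <assert.h>
#include <stdint.h>

/* Stored shift count meaning "this bank does not affect the pin":
   (x >> 31) >> 1 is always zero. */
#define SHIFT_NONE 31u

/* Stored shift count for a pin on bit `bit`.  Bit 0 cannot be
   represented, which is the restriction mentioned in the text. */
static uint32_t shift_for_bit(unsigned bit)
{
    assert(bit >= 1 && bit < 32);
    return bit - 1u;
}

/* Read one logical pin from two simulated banks using shifts only. */
static uint32_t read_bit_shift(uint32_t in_a, uint32_t shift_a,
                               uint32_t in_b, uint32_t shift_b)
{
    return (((in_a >> shift_a) >> 1) | ((in_b >> shift_b) >> 1)) & 1u;
}
```

Compared with the multiplier version, this costs one extra shift per bank and gives up bit 0 of every bank, which is why the multiplier is attractive when it is single-cycle.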

If I was doing this, I'd prototype it using Teensy 4.1 with a similar scheme (with four GPIO banks, not two).  It has an i.MX RT1062 running at up to 600 MHz (~960 MHz if overclocked and well cooled).  That'd at least tell oneself how much computing power is actually needed, even if one didn't end up using that particular processor.
« Last Edit: May 15, 2021, 07:27:48 am by Nominal Animal »
 

Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
Re: Physical to Logical Mapping: Alternatives?
« Reply #4 on: May 15, 2021, 08:21:43 am »
That looks interesting!
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6172
  • Country: fi
    • My home page and email address
Re: Physical to Logical Mapping: Alternatives?
« Reply #5 on: May 15, 2021, 03:33:09 pm »
Similarly, if you have sufficient SRAM for the lookup tables, you can create N-bit output "buses" by using a pre-populated lookup table,
    uint32_t  bus_map[MAXVAL+1][banks];
where MAXVAL is (uint32_t)((1 << N)-1), i.e. 2^N-1.  To write value v to the bus, you write bus_map[v & MAXVAL] to the OUTSET registers of each bank, and bus_map[(~v) & MAXVAL] to the OUTCLR registers.

If the bus pins happen to be in the same bank and the bus width is a compile time constant, this is just one lookup, a XOR, and two 32-bit writes.
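
A minimal sketch of that single-bank case, under stated assumptions: the bus is 4 bits wide, the pin numbers in bus_pin are invented, and the OUTSET/OUTCLR registers are replaced by a plain variable (sim_out) so that the code runs on a PC.  None of these names come from the post.

```c
#include <assert.h>
#include <stdint.h>

#define BUS_BITS 4u
#define BUS_MAX  ((1u << BUS_BITS) - 1u)

/* Simulated output latch; on hardware these two helpers would be
   single writes to PORTx.OUTSET and PORTx.OUTCLR. */
static uint32_t sim_out;
static void sim_outset(uint32_t mask) { sim_out |= mask; }
static void sim_outclr(uint32_t mask) { sim_out &= ~mask; }

/* Hypothetical wiring: bus bit i is on port bit bus_pin[i]. */
static const uint8_t bus_pin[BUS_BITS] = { 7, 2, 12, 5 };

static uint32_t bus_map1[BUS_MAX + 1u];  /* value -> pins to drive high */
static uint32_t bus_all;                 /* mask of all bus pins */

static void bus_init1(void)
{
    for (uint32_t v = 0; v <= BUS_MAX; v++) {
        uint32_t mask = 0;
        for (uint32_t b = 0; b < BUS_BITS; b++)
            if (v & (1u << b))
                mask |= (uint32_t)1 << bus_pin[b];
        bus_map1[v] = mask;
    }
    bus_all = bus_map1[BUS_MAX];
}

/* One lookup, one XOR, two writes. */
static void bus_write1(uint32_t v)
{
    assert(bus_all != 0);                /* bus_init1() must run first */
    const uint32_t set = bus_map1[v & BUS_MAX];
    sim_outset(set);
    sim_outclr(set ^ bus_all);           /* bus pins that are not set */
}
```

The XOR works because the pins to clear are exactly the bus pins that are not in the set mask, so a second table lookup with the inverted value is not needed.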

Technically, all the pins in the same bank do change at the same time, but either rising or falling slopes are always in the same order (depends on which write you do first); in different banks, there is that small latency, but still, the pin change order is always the same, and should suffice for a bus.

If N is large, I recommend splitting it into multiple lookup tables.  It does slow things down a bit (because v must be split into groups of bits), but it dramatically reduces the size of the lookup tables needed.
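
To put numbers on that trade-off, here is a small sketch that computes the table size for a given split.  The function name outmap_bytes and its parameters are invented for this illustration; it assumes one uint32_t mask per value, per group, per bank, as in the bus_map layout above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of lookup table needed for a bus of `bus_bits` bits that is
   split into groups of `group_bits` bits, with `banks` GPIO banks. */
static size_t outmap_bytes(unsigned bus_bits, unsigned group_bits,
                           unsigned banks)
{
    assert(bus_bits >= 1 && group_bits >= 1 && group_bits <= 16);
    const size_t groups  = (bus_bits + group_bits - 1u) / group_bits;
    const size_t entries = (size_t)1 << group_bits;
    return groups * entries * banks * sizeof(uint32_t);
}
```

With two banks, a 16-bit bus in a single table comes to outmap_bytes(16, 16, 2) = 524288 bytes, while two 8-bit groups need outmap_bytes(16, 8, 2) = 4096 bytes, and four 4-bit groups only 512 bytes at the cost of more lookups per write.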

For example, if you have a 16-bit bus on a SAMC21, you only need 4096 bytes for two lookup tables (2048 bytes per 8-bit group), and you can use any GPIO pins in whatever order you choose for the bus:
Code: [Select]
uint32_t  bus_outmap[256][2][2];    /* [value][group][bank] */

void bus_write16(uint32_t  value)
{
    const uint32_t  val0 = value & 255;
    const uint32_t  val1 = (value >> 8) & 255;
    const uint32_t  not0 = val0 ^ 255;
    const uint32_t  not1 = val1 ^ 255;

    /* Rising edges. */
    PORTA_OUTSET = bus_outmap[val0][0][0] | bus_outmap[val1][1][0];
    PORTB_OUTSET = bus_outmap[val0][0][1] | bus_outmap[val1][1][1];

    /* Falling edges. */
    PORTA_OUTCLR = bus_outmap[not0][0][0] | bus_outmap[not1][1][0];
    PORTB_OUTCLR = bus_outmap[not0][0][1] | bus_outmap[not1][1][1];
}

Unfortunately, reading from a bus requires a rather large lookup table.  The logic is the same, except inverted, and that now you have a bus width of 2×32=64 bits (on SAMC21; twice that on many other ARMs); note that this lookup table size does not depend on the bus width at all, only on the number of GPIO bits on the hardware.  For example, using 8192 bytes of lookup tables, regardless of the bus size:
Code: [Select]
uint32_t  bus_inmap[256][4][2];     /* [value][group][bank] */

uint32_t  bus_read(void)
{
    const uint32_t  val0 = PORTA_IN;
    const uint32_t  val1 = PORTB_IN;

    return bus_inmap[ val0        & 255][0][0] | bus_inmap[ val1        & 255][0][1]
         | bus_inmap[(val0 >>  8) & 255][1][0] | bus_inmap[(val1 >>  8) & 255][1][1]
         | bus_inmap[(val0 >> 16) & 255][2][0] | bus_inmap[(val1 >> 16) & 255][2][1]
         | bus_inmap[(val0 >> 24) & 255][3][0] | bus_inmap[(val1 >> 24) & 255][3][1];
}
Again, if the pins are all in the same bank, you can save half (three quarters on those ARMs that have four GPIO banks) on lookup table size.

Setting up the lookup tables is simple, but not very fast (there are two loops of (MAXVAL+1) per "bus" bit).  I would clear the lookup tables first, then set the lookup bits for each individual pin:
Code: [Select]
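#include <string.h>     /* memset() */

void bus_init(void)
{
    memset(bus_outmap, 0, sizeof bus_outmap);
    memset(bus_inmap, 0, sizeof bus_inmap);
}

/* addrbit: bus bit number, 0..15.
   pin: PA00=0, PA01=1, .., PA31=31, PB00=32, PB01=33, .., PB31=63.
*/
void bus_define(const uint32_t  addrbit, const uint32_t  pin)
{
    /* Reject values that would index outside the lookup tables. */
    if (addrbit >= 16 || pin >= 64)
        return;

    /* Output mapping. */
    {
        const uint32_t  value = (uint32_t)1 << (addrbit & 7);
        const uint32_t  group = addrbit >> 3;
        const uint32_t  pinmask = (uint32_t)1 << (pin & 31);
        const uint32_t  bank = pin / 32;

        /* Every table index with this bus bit set drives this pin.
           (Stepping by 'value' would also hit indexes such as 2*value,
           where the bit is clear, so test each index explicitly.) */
        for (uint32_t  val = 0; val < 256; val++)
            if (val & value)
                bus_outmap[val][group][bank] |= pinmask;
    }

    /* Input mapping. */
    {
        const uint32_t  value = (uint32_t)1 << (pin & 7);
        const uint32_t  group = (pin >> 3) & 3;
        const uint32_t  addrmask = (uint32_t)1 << addrbit;
        const uint32_t  bank = pin / 32;

        /* Every port byte value with this pin set yields this bus bit. */
        for (uint32_t  val = 0; val < 256; val++)
            if (val & value)
                bus_inmap[val][group][bank] |= addrmask;
    }

    /* TODO: Set the actual pin properties, like default direction,
             driving strength etc., perhaps based on an additional
             parameter to this function. */
}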
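
The input side can be exercised on a PC as well.  The sketch below is a self-contained stand-in, not the code above: inmap, inmap_clear, inmap_define and inmap_read are hypothetical names, and the two port values are passed in as arguments instead of being read from PORTA_IN and PORTB_IN.  Each table index is tested for the pin's bit explicitly, because stepping through multiples of the bit value would also touch indexes where that bit is clear (4 is a multiple of 2 but has bit 1 clear).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* [byte value][byte index within the 32-bit port][bank] */
static uint32_t inmap[256][4][2];

static void inmap_clear(void)
{
    memset(inmap, 0, sizeof inmap);
}

/* Declare that physical pin `pin` (0..31 bank A, 32..63 bank B)
   carries bus bit `busbit`. */
static void inmap_define(unsigned busbit, unsigned pin)
{
    assert(busbit < 32 && pin < 64);
    const uint32_t bit   = (uint32_t)1 << (pin & 7u);
    const unsigned group = (pin >> 3) & 3u;
    const unsigned bank  = pin >> 5;

    for (uint32_t val = 0; val < 256; val++)
        if (val & bit)
            inmap[val][group][bank] |= (uint32_t)1 << busbit;
}

/* Translate two raw port values into the logical bus value. */
static uint32_t inmap_read(uint32_t in_a, uint32_t in_b)
{
    uint32_t result = 0;
    for (unsigned g = 0; g < 4; g++) {
        result |= inmap[(in_a >> (8u * g)) & 255u][g][0];
        result |= inmap[(in_b >> (8u * g)) & 255u][g][1];
    }
    return result;
}
```

A quick check is to define a few scattered pins, feed inmap_read() port values with exactly those bits set or cleared, and confirm that the unrelated port bits never leak into the result.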



I recently discovered that on Teensy 4.0 I can use this to implement a pretty darn efficient 16-bit "bus" using GPIO1 pins (labeled 0, 1, 14-27 on Teensy), you see.  Note that while the pads labeled 24-33 look odd, one can solder an SMD 2×5 pin header with 0.1" spacing to them, so that a carrier or break-out board only needs pins in the standard 0.1" spacing.

(Teensy 4's i.MX RT1062 has four GPIO banks, each with a data set (DR_SET), clear (DR_CLEAR), and toggle (DR_TOGGLE) registers, so in that regard quite similar to SAMC21.)

I am still investigating exactly how I could use an 18-bit bus (3×6 bits for RGB data, and 8 bits for register stuff) efficiently for use with parallel display modules (ILI9341 and the like), without using the pins for a UART, SPI, or I2C.  The lookup tables take up to 3200 bytes, but the i.MX RT1062 on it has lots of RAM (a megabyte total).
Obviously, I cannot use DMA here, but since I just want this as a USB-controlled embedded Linux "framebuffer" with a few buttons/encoders, that should be okay; note that its native USB is HS (max. 480 Mbit/s), not FS/LS (12/1 Mbit/s).  I probably could even have a 32-bit true color framebuffer with automatic 8-bit (via lookup) overlays, combined during output updates.  Might be useful for error/status messages for an appliance, without disturbing the application-accessible true color framebuffer...  Obviously, an indexed color (paletted) framebuffer would be easy to support, since the "palette" can directly refer to the GPIO port bit masks.  ("Clear" is always the binary inverse of the "set", applied to the "bus" bits only.  The output toggle register is useful for the write strobe, too.)

Without using the six oddball SD card pads on the bottom, there are 34 GPIO pins on the Teensy 4.0: the abovementioned 16 pins in bank 1; 9 pins (labeled 6-13, 32) in bank 2; 3 pins (labeled 28, 30, 31) in bank 3; and 6 pins (labeled 2-5, 29, 33) in bank 4.  Bank 4 has five pins (labeled 2, 3, 4, 33, 5) in consecutive bits so no lookup table is needed for those.  Four bank 1 pins (labeled 19, 18, 14, 15) are also in consecutive bits, as are six pins labeled 17, 16, 22, 23, 20, 21.  Unfortunately, pins labeled 16-19 (bank 1) have two of the three I2C buses, and pins labeled 11-13 (and 10 for chip select, bank 2) have the easiest-to-access SPI bus.  There is another SPI bus in bank 1 using pins labeled 0, 1, 26, 27; and a third one using the oddball SD card pads labeled 34-39, but I don't like using the oddball pads as they are hard to break out.  (I haven't included the six oddball SD card pads in this list anywhere, except mentioning they are labeled 34 through 39.)
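
When the bus pins sit in consecutive bits of one bank, the lookup table reduces to a shift and a mask.  A rough host-side sketch follows; FIELD_SHIFT is a hypothetical bit position chosen for illustration, sim_dr is a plain variable standing in for the bank's data register, and the two marked lines stand in for writes to DR_SET and DR_CLEAR.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_BITS  5u
#define FIELD_SHIFT 4u   /* hypothetical position of the lowest bus bit */
#define FIELD_MASK  (((1u << FIELD_BITS) - 1u) << FIELD_SHIFT)

static uint32_t sim_dr;  /* simulated GPIO data register */

/* Write a 5-bit value to the consecutive pins, leaving others alone. */
static void field_write(uint32_t v)
{
    assert(v < (1u << FIELD_BITS));      /* value must fit the bus */
    const uint32_t set = v << FIELD_SHIFT;
    sim_dr |= set;                       /* stands in for DR_SET   */
    sim_dr &= ~(set ^ FIELD_MASK);       /* stands in for DR_CLEAR */
}

/* Read the 5-bit value back from the same pins. */
static uint32_t field_read(void)
{
    return (sim_dr & FIELD_MASK) >> FIELD_SHIFT;
}
```

The logical bit order then follows the physical bit order of the bank, which is why the pin labels in the list above are not in ascending order.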

Can you tell I've looked at various 32-bit microcontrollers, and their datasheets to see which ones have "nice" GPIO banks?  ;D

It is a darned pity the i.MX RT1062 is only available in MAPBGA.  If I wasn't just an uncle bumblefuck hobbyist, I'd do my own i.MX RT1062 board, with a different set of pins, optimized for interfacing parallely-bussy thingies to high-speed USB 2.0 (max. 480 Mbit/s).  Even a TQFP scares me a bit..

(Even with the slow tty layer in Linux when using USB Serial, I can get 20+ Mbits/s ping-pong and 200+ Mbits/s one-way with a Teensy 4.0, so it should have no issues whatever even with 60Hz full updates to a 32-bit 320×240 framebuffer (147,456,000 bits per second).)
« Last Edit: May 15, 2021, 03:37:30 pm by Nominal Animal »
 

