Author Topic: SAM C20/C21Pin to PORT mapping  (Read 2672 times)

Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
SAM C20/C21Pin to PORT mapping
« on: October 02, 2020, 05:43:16 pm »
I'm new to SAM chips... anyway...

Say I have 16 lines on an external address bus and, for the sake of routing, these lines end up jumbled across the actual uP pins...

Now perhaps A15 goes to PA01, A14 to PA03, .... A2 to PA06, A1 to PA02, A0 to PA21 etc

I can perform a 32-bit read of IN for Port Group 0, which should contain all the bits I need, but not in the right order, plus 16 spare bits.

Is there an efficient way to move the bits around?  or will it require at least 16 operations to pull each bit out of the 32bit value and stick them back together again?

Worst case, I might need some lines on different port groups... eg A7 is actually on PB04...

Again is there a clever way of mapping this?

or... do I really need to constrain my routing to at least get a few bits together?

Using PA00...PA15 doesn't seem practical since some pins are multiplexed to functions I need above and beyond GPIO.

Am I missing something?

PS
ATSAMC21J18

PPS
I don't have enough CPU cycles to afford to do this 1 bit at a time
« Last Edit: October 02, 2020, 05:48:32 pm by NivagSwerdna »
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11262
  • Country: us
    • Personal site
Re: SAM C20/C21Pin to PORT mapping
« Reply #1 on: October 02, 2020, 06:30:46 pm »
There is no automatic mapping that will do all the work. But you may not need to extract them bit by bit; more optimized code is possible if you have at least some of the pins in the right order.

Also, table lookup may be possible too. It will consume some RAM or flash, but will require way fewer cycles.

Can you give a complete mapping?
« Last Edit: October 02, 2020, 06:34:19 pm by ataradov »
Alex
 

Offline ajb

  • Super Contributor
  • ***
  • Posts: 2607
  • Country: us
Re: SAM C20/C21Pin to PORT mapping
« Reply #2 on: October 02, 2020, 06:38:23 pm »
Well, you could make the age-old tradeoff of storage vs processing and use a lookup table.  A full 128kB lookup table is kinda silly, but if you've got the storage space it would be about as fast as you could get, especially if you can get all of the lines into one 16-bit half of the IO bank.  More reasonably, if you can marshal the 8 low and 8 high lines each into one byte of a port, you can use two 256-byte lookup tables with no intermediate computations other than concatenating the results of the two lookups to form a 16-bit word.  There are intermediate solutions as well, which will depend on how much flexibility you have in terms of routing vs processing.
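
A minimal sketch of the two-table variant described above, assuming A0..A7 all land (in scrambled order) in one port byte and A8..A15 in another. The pin orders in `lo_map` and `hi_map` and the function names are hypothetical, not taken from the thread; the tables are filled at startup here for brevity, but could equally be generated offline and stored in flash.

```c
#include <stdint.h>

/* Hypothetical wiring: lo_map[i] is the address bit (0..7) carried by bit i
   of the "low" port byte; hi_map[i] is the same for the "high" port byte,
   where 0..7 stand for A8..A15. Replace with the real routing. */
static const uint8_t lo_map[8] = { 3, 0, 7, 1, 6, 2, 5, 4 };
static const uint8_t hi_map[8] = { 1, 2, 0, 3, 7, 6, 4, 5 };

static uint8_t lut_lo[256];
static uint8_t lut_hi[256];

/* Fill both 256-byte tables once at startup. */
void build_luts(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint8_t lo = 0, hi = 0;
        for (unsigned bit = 0; bit < 8; bit++) {
            if (v & (1u << bit)) {
                lo |= (uint8_t)(1u << lo_map[bit]);
                hi |= (uint8_t)(1u << hi_map[bit]);
            }
        }
        lut_lo[v] = lo;
        lut_hi[v] = hi;
    }
}

/* Two lookups and one concatenation per address. */
static inline uint16_t unscramble(uint8_t port_lo, uint8_t port_hi)
{
    return (uint16_t)(lut_lo[port_lo] | ((uint16_t)lut_hi[port_hi] << 8));
}
```

The per-address cost is two byte reads, two table loads, one shift and one OR, regardless of how badly the bits are jumbled within each byte.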
 

Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
Re: SAM C20/C21Pin to PORT mapping
« Reply #3 on: October 02, 2020, 06:41:47 pm »
Maybe I was being a bit hard on myself as I have quite a few contiguous bits...

... but still...

read for PORTA
read for PORTB

shift on A

then some ANDs, SHIFTS and ORs for the other non-contiguous sections

Guess I will have to burn cycles



 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11262
  • Country: us
    • Personal site
Re: SAM C20/C21Pin to PORT mapping
« Reply #4 on: October 02, 2020, 06:52:02 pm »
This is not too bad. PA is obvious. For PB, don't extract the bits and then reassemble. Move things to their final location. So to place PB2 and PB3 into their final location do this ((pb & 0xc) << 13).

With this mapping, table lookup will not be too good. If possible, even if you can't arrange the bits in the right order, have them located on nearby bits in the port. That will make table lookup a very easy task.

And writing this part in assembly may prove to be useful here too.
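
To illustrate "move things to their final location", here is a rough sketch with a made-up mapping (the actual pin table from this thread is not reproduced here, so the masks and shifts below are assumptions for illustration only). Each contiguous run of pins is masked and shifted straight into place in one step, rather than being extracted to bit 0 and reassembled.

```c
#include <stdint.h>

/* Hypothetical mapping, for illustration only:
     A0..A9   on PA04..PA13
     A10..A13 on PA20..PA23
     A14..A15 on PB02..PB03 */
static inline uint16_t assemble_address(uint32_t pa, uint32_t pb)
{
    uint32_t a;
    a  = (pa & 0x00003FF0u) >> 4;   /* PA04..PA13 -> bits 0..9   */
    a |= (pa & 0x00F00000u) >> 10;  /* PA20..PA23 -> bits 10..13 */
    a |= (pb & 0x0000000Cu) << 12;  /* PB02..PB03 -> bits 14..15 */
    return (uint16_t)a;
}
```

With this layout the whole address costs three AND/shift pairs and two ORs after the two port reads; every extra non-contiguous run adds one more mask-shift-OR step.
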
« Last Edit: October 02, 2020, 06:54:22 pm by ataradov »
Alex
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: SAM C20/C21Pin to PORT mapping
« Reply #5 on: October 03, 2020, 06:14:35 am »
One of the things that people tend to overlook with the SAMD chips (and probably other ARMs as well) is that the PORTs are also readable 8 or 16 bits at a time (the .h files are not very helpful for doing this :-( ).  I don't know if that will help your particular situation.  Lookup tables for 8-bit quantities are a lot more reasonable than for 32-bit quantities.
The Cortex-M3 and M4 have bitfield instructions that can help with this sort of thing, but NOT the M0 or M0+.
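
A small sketch of what a byte-wide read could look like in C, assuming a little-endian core and that the peripheral accepts 8-bit accesses as described above. The helper name `read_byte_lane` is hypothetical; on real hardware `reg` would be the address of the port's IN register taken from the vendor header.

```c
#include <stdint.h>

/* Read byte lane `lane` (0 = bits 7:0 ... 3 = bits 31:24) of a 32-bit
   register with a single 8-bit load. Assumes a little-endian core, which is
   how Cortex-M0+ parts are normally configured. */
static inline uint8_t read_byte_lane(const volatile uint32_t *reg, unsigned lane)
{
    const volatile uint8_t *bytes = (const volatile uint8_t *)reg;
    return bytes[lane & 3u];
}
```

The value returned can be used directly as an index into a 256-entry table, with no masking or shifting in between.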
 

Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
Re: SAM C20/C21Pin to PORT mapping
« Reply #6 on: October 03, 2020, 09:03:49 am »
One of the things that people tend to overlook with the SAMD chips (and probably other ARMs as well) is that the PORTs are also readable 8 or 16 bits at a time.
I did wonder about that but thought that since the registers are natively 32 bits it didn't really have any advantage using smaller reads?
 

Offline NivagSwerdnaTopic starter

  • Super Contributor
  • ***
  • Posts: 2495
  • Country: gb
Re: SAM C20/C21Pin to PORT mapping
« Reply #7 on: October 03, 2020, 09:09:32 am »
I am wondering if there are chips that allow you to map the physical pins into bits of the Port Groups?  That would obviously require quite a lot of logic but be very useful for optimising reads and writes for buses.
 

Online dietert1

  • Super Contributor
  • ***
  • Posts: 2073
  • Country: br
    • CADT Homepage
Re: SAM C20/C21Pin to PORT mapping
« Reply #8 on: October 03, 2020, 10:14:26 am »
I remember using a Kinetis Arm MCU some years ago with a graphics screen. That MCU had something called Flexbus, which was a port configuration including a traditional bus with parallel address and data transfer. Maybe something like that could streamline the given input task, using an external 16 bit to 8 bit Mux.

Regards, Dieter
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6264
  • Country: fi
    • My home page and email address
Re: SAM C20/C21Pin to PORT mapping
« Reply #9 on: October 03, 2020, 12:02:08 pm »
For input, you can use 8-bit port reads and small look-up tables: if the input pins are spread over P port bytes, the tables take 256P bytes in total for up to 8 parallel bits, 512P bytes for up to 16 parallel bits, and 1024P bytes for up to 32 parallel bits.

For example, if A, B and C correspond to some port data input bytes, with LA[], LB[], and LC[] the three 256-entry lookup tables in rom/flash, the actual value is
    v = LA[A] | LB[B] | LC[C]
In each of the lookup tables, a bit is set if the corresponding input bit is set, clear otherwise.  If v is 8-bit, then each table takes 256 bytes for a total of 768 bytes of lookup; if 16-bit, total 1536 bytes; if 32-bit, total 3072 bytes of lookup is needed.
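
As a rough sketch of the input side, including table generation from a pin mapping: the mapping arrays below are hypothetical (6 bus bits in byte A, 5 in B, 5 in C), and the tables are built in RAM at startup purely to keep the example short. In practice they could be generated offline and placed in rom/flash as const arrays.

```c
#include <stdint.h>

#define UNUSED (-1)

/* Hypothetical wiring: map_X[i] is the result bit that bit i of port byte X
   feeds, or UNUSED if that pin is not part of the bus. */
static const int8_t map_A[8] = { 5, UNUSED, 0, 1, UNUSED, 2, 3, 4 };
static const int8_t map_B[8] = { UNUSED, 10, 9, 8, 7, 6, UNUSED, UNUSED };
static const int8_t map_C[8] = { 15, 14, UNUSED, 13, 12, 11, UNUSED, UNUSED };

static uint16_t LA[256], LB[256], LC[256];

/* A table entry has a bit set if the corresponding input pin is set. */
static void build_table(uint16_t table[256], const int8_t map[8])
{
    for (unsigned v = 0; v < 256; v++) {
        uint16_t out = 0;
        for (unsigned bit = 0; bit < 8; bit++)
            if ((v & (1u << bit)) && map[bit] != UNUSED)
                out |= (uint16_t)(1u << map[bit]);
        table[v] = out;
    }
}

void build_input_tables(void)
{
    build_table(LA, map_A);
    build_table(LB, map_B);
    build_table(LC, map_C);
}

/* v = LA[A] | LB[B] | LC[C]; pins marked UNUSED never affect the result. */
static inline uint16_t decode_input(uint8_t A, uint8_t B, uint8_t C)
{
    return (uint16_t)(LA[A] | LB[B] | LC[C]);
}
```

Because unused pins map to nothing, no masking of the raw port bytes is needed before the lookup.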
 
For output, it gets slightly more complicated, but you can use a similar approach; you just need to shadow the PDOR (port data output register).
Essentially, you first split the output value v into bytes.  Arbitrarily choosing little-endian byte order:
    b0 = v & 255
    b1 = (v >> 8) & 255
    b2 = (v >> 16) & 255
Next, we do a lookup (which is the inverse of the above LA[] etc) for each, noting that the value here needs to be at least 24-bit, because the output is spread over 3 PDOR bytes:
    o = L0[b0] | L1[b1] | L2[b2]
Then, we split that output again into bytes, again I'm arbitrarily picking little-endian byte order,
    o0 = o & 255
    o1 = (o >> 8) & 255
    o2 = (o >> 16) & 255
Finally, we OR each output PDOR byte with the shadowed state – i.e., containing bits set for the unrelated output pins that are/need to be currently high:
    A = s0 | o0
    B = s1 | o1
    C = s2 | o2
Above, L0[], L1[], and L2[] are all 32-bit with 256 entries, 1024 bytes each, for a total of 3072 bytes of rom/flash needed for lookups.

I researched this when I was looking into 18-bit parallel output for ILI9341 etc. 240×320 pixel full-color display modules.
For example, if the output pins are spread over 5 PDOR register bytes (and each value is 18-bit, or 3 bytes), the output lookup tables need 5×3×256 = 3840 bytes.  (5 bytes can be efficiently split into two sub-tables, one with 32-bit values and the other 8-bit values.  Otherwise, you may need to expand to e.g. two 32-bit tables, taking up 6144 bytes, as you'll want to minimize the number of rom/flash lookups per output word, since they tend to be slower than RAM accesses on 32-bit ARMs.)

If anyone is interested, I can show some example C code, including generating the tables given pin mapping etc.  However, I suspect this should be obvious for many; it is by no means my own original invention.  Just re-discovered on my own.
« Last Edit: October 03, 2020, 12:11:30 pm by Nominal Animal »
 
The following users thanked this post: NivagSwerdna

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11262
  • Country: us
    • Personal site
Re: SAM C20/C21Pin to PORT mapping
« Reply #10 on: October 03, 2020, 04:57:38 pm »
I did wonder about that but thought that since the registers are natively 32 bits it didn't really have any advantage using smaller reads?
Let's say you can fit your 16 bits into the groups: 6 bits somewhere in PA[7:0], 5 bits in PA[23:16] and 5 bits in PB[15:8], then the whole implementation will be 3 register reads and 3 table lookups. Each table will be 256*2 bytes in size, so not a lot of space wasted.

If you place the code in SRAM, the tables in SRAM, and use the IOBUS version of the PORT, then you will get the fastest possible version of that mapping.
« Last Edit: October 03, 2020, 05:00:31 pm by ataradov »
Alex
 
The following users thanked this post: NivagSwerdna

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11262
  • Country: us
    • Personal site
Re: SAM C20/C21Pin to PORT mapping
« Reply #11 on: October 03, 2020, 04:58:43 pm »
I am wondering if there are chips that allow you to map the physical pins into bits of the Port Groups?  That would obviously require quite a lot of logic but be very useful for optimising reads and writes for buses.
The Xmega series has virtual PORT peripherals that let you map individual bits from physical registers, but I have not really worked with them, so I'm not sure what limitations they may have.
Alex
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6264
  • Country: fi
    • My home page and email address
Re: SAM C20/C21Pin to PORT mapping
« Reply #12 on: October 03, 2020, 05:19:52 pm »
I did wonder about that but thought that since the registers are natively 32 bits it didn't really have any advantage using smaller reads?
Let's say you can fit your 16 bits into the groups: 6 bits somewhere in PA[7:0], 5 bits in PA[23:16] and 5 bits in PB[15:8], then the whole implementation will be 3 register reads and 3 table lookups. Each table will be 256*2 bytes in size, so not a lot of space wasted.

If you place the code in SRAM, the tables in SRAM, and use the IOBUS version of the PORT, then you will get the fastest possible version of that mapping.
Very good point!  (One of the microcontrollers I use is a Teensy 4, which uses an i.MX RT1062 with 1024k of RAM, but the output pins are really spread out.  It has LOTS of RAM, enough so that even Teensyduino, the Arduino library for these, supports marking functions FASTRUN so that they're copied to RAM and executed from RAM.  It makes a difference for e.g. high-frequency interrupt functions, DMA completion stuff, et cetera.)

It really isn't a lot, and the lookups only cost the memory latency.  Plus, for outputs, for the cost of an extra XOR with a fixed mask, you can do guaranteed-complementary output pins (as long as they are in the same 8-bit chunk in the same port, that is).
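
One possible reading of the XOR trick, as a sketch: the table sets both pins of each pair to the same level, and a fixed XOR mask then inverts the complement pins. The pin layout (true pins 0, 2, 4 and complements on 1, 3, 5 of the same byte) and the names below are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical byte lane: value bits 0..2 drive pins 0, 2, 4 and their
   complements appear on pins 1, 3, 5 of the same port byte. */
#define COMPLEMENT_MASK 0x2Au  /* pins 1, 3, 5 */

static uint8_t pair_lut[8];

void build_pair_lut(void)
{
    for (unsigned v = 0; v < 8; v++) {
        uint8_t o = 0;
        for (unsigned bit = 0; bit < 3; bit++)
            if (v & (1u << bit))
                o |= (uint8_t)(3u << (2 * bit));  /* set both pins of the pair */
        pair_lut[v] = o;
    }
}

/* The XOR flips the complement pins, so each pair always ends up opposite. */
static inline uint8_t encode_pairs(uint8_t v)
{
    return (uint8_t)(pair_lut[v & 7u] ^ COMPLEMENT_MASK);
}
```

Since both pins of a pair are written in the same byte store, they change together, which is why the pairs have to share an 8-bit chunk.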
 
The following users thanked this post: NivagSwerdna

