Author Topic: Bit manipulation: 'Stretching a bitmask'  (Read 8158 times)

0 Members and 1 Guest are viewing this topic.

Offline Someone

  • Super Contributor
  • ***
  • Posts: 4510
  • Country: au
    • send complaints here
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #25 on: November 01, 2016, 10:54:05 pm »
It really depends on your instruction set if you want to optimise to this level, for instance with a short pipeline DSP (or AVR) processor you can use a shift through carry.
Code: [Select]
for length of source
  source = source << 1
  result = result << carry_flag
  result = result << 1
loop
result = result | (result >> 1)
And unroll the loop since its length is known at compile time. If branching is cheap or you can selectively apply operands or immediates, then it might be cheaper do something more like:
Code: [Select]
result = (result << 2) + (carry_flag) ? 3 : 0It all depends on the underlying architecture of the chip.
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #26 on: November 01, 2016, 11:13:27 pm »
here is the base line:

8-bit types -> 16-bit types, <9k ticks, for all 256 conversions;
16-bit types -> 32-bit types, 7 million ticks, for all 64K conversions.

they should be fairly easy to beat.
================================
https://dannyelectronics.wordpress.com/
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #27 on: November 02, 2016, 12:48:11 am »
I would keep is simple.

Code: [Select]
u8_t doubleup_bits(u8_t i) {
   u8_t o = 0;
   if(i&1) o += 3<<0;
   if(i&2) o += 3<<2;
   if(i&4) o += 3<<4;
   if(i&8) o += 3<<6;
  return o;
}

Or use a lookup 16 byte lookup table (which would be far more efficient:

Code: [Select]
  static const u8_t lookup_table[16] = {
      0x00, 0x03, 0x0C, 0x0F,
      0x30, 0x33, 0x3C, 0x3F,
      0xC0, 0xC3, 0xCC, 0xCF,
      0xF0, 0xF3, 0xFC, 0xFF},

  ...
      o = 0;
      if(i < 16)
        o = lookup_table[i];
  ...

But that is just me... simple....
« Last Edit: November 02, 2016, 12:55:37 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline spongman

  • Newbie
  • Posts: 7
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #28 on: November 02, 2016, 01:21:22 am »
don't ask me how, but

Code: [Select]
uint8_t double_bits(uint8_t i) { return ((i * 0x01010101 & 0x08040201) * 0x0204081 >> 21) * 3; }
does the trick.
 
The following users thanked this post: BravoV, BurnedResistor

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #29 on: November 02, 2016, 01:32:34 am »
Quote
That can be simplified to
[size=126%]?[/size][size=70.7%]i[/size][size=70.7%]?[/size][size=70.7%]n[/size]n[size=70.7%]i[/size]?3(2[size=70.7%]2[/size][size=70.7%]i[/size])That's very pretty, but it's a restatement of the problem rather than a solution. (1 bit-doubled is 3)

Ian's post pans out, I think:Quote
See if any of the ideas for interleaving bits work for you: http://graphics.stanford.edu/~seander/bithacks.html#InterleaveTableObvious
Here, I've tried to prune the "magic number" algorithm down to an 8bit->16bit version:Code: [Select]unsigned int stretch(int x)
{
    x = (x | (x << 4)) & 0x0F0F;
    x = (x | (x << 2)) & 0x3333;
    x = (x | (x << 1)) & 0x5555;
    return  x | (x<<1);
}
That works out to about 16 instructions (~48 bytes) on a CM3, with no looping or jumps.  That seems pretty good for 8bits; a loop would use half of that in jump instructions, and the "newValue |= (value & 0x1) ? 0b00000011 : 0" version is longer even for 4bits.
(the ?: version ends up being about 114 bytes.  A loop-based version is actually significantly shorter (~26 bytes), but of course it loops...)
Quote
how could I determine if this would be optimized?
Look at the assembly code produced.   I tried some examples with gcc, and it did NOT succeed in optimizing the constant case for the "?:" version until I explicitly declared the function as "static inline";  OTOH, the magic number version was optimized away even with no inlining hints at all.  Note that you'd have to include the function definition in a .h file to get constant-based optimization, and it's a bit long to get inlined for the non-constant cases (IMO.)

Whole "test program":
Code: [Select]#include <stdio.h>

// Magic number version
unsigned int stretch(int x)
{
    x = (x | (x << 4)) & 0x0F0F;
    x = (x | (x << 2)) & 0x3333;
    x = (x | (x << 1)) & 0x5555;
    return  x | (x<<1);
}


// ?: version
unsigned int stretch1(int value)
{
    int newValue = 0;

    newValue |= (value & 0x1) ? 0b00000011 : 0;
    newValue |= (value & 0x2) ? 0b00001100 : 0;
    newValue |= (value & 0x4) ? 0b00110000 : 0;
    newValue |= (value & 0x8) ? 0b11000000 : 0;
#if 1 // 4 bits or 8bits
    newValue |= (value & 0x10) ? 0b1100000000 : 0;
    newValue |= (value & 0x20) ? 0b110000000000 : 0;
    newValue |= (value & 0x40) ? 0b11000000000000 : 0;
    newValue |= (value & 0x80) ? 0b1100000000000000 : 0;
#endif
    return newValue;
}

// Loop based version
unsigned int stretch2(int value)
{
    unsigned int newValue = 0;
    int i;

    for (i=0;i < 8;i++) {
    newValue >>= 2;
    if (value & 1) {  // One bit out
        newValue |= 0xC000; // two bits in
    }
    value >>= 1;
    }
    return newValue;
}



volatile int*p = 0x20000000;

int main() {
    int c;
#ifdef __ARM__
    c = *p;
    *p = stretch(0x59);
    *p = stretch1(0xA);
    *p = stretch2(0x61);
#endif
    for (c=0; c <=255; c++) {
    volatile int s, s1, s2;
    s = stretch(c);
    s1 = stretch1(c);
    s2 = stretch2(c);
    if (s != s1 ||
        s != s2)  {
        printf("\nSomething wrong %x: %x %x %x", c, s, s1, s2);
    }
    }
}
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #30 on: November 02, 2016, 01:35:13 am »
don't ask me how, but

Code: [Select]
uint8_t double_bits(uint8_t i) { return ((i * 0x01010101 & 0x08040201) * 0x0204081 >> 21) * 3; }
does the trick.

Oh, that is awesome! - use the multiply to expand the bits out, then mask them off, then push them back together again.

However I like it best in VHDL:

Code: [Select]
   for i in 0 to in_bits'high
   loop
      out_bits(2*i)   <= in_bits(i);
      out_bits(2*i+1) <= in_bits(i);
   end loop;

Also runs a lot quicker - less than one clock cycle :D


Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #31 on: November 02, 2016, 01:50:51 am »
Quote
uint8_t double_bits(uint8_t i) { return ((i * 0x01010101 & 0x08040201) * 0x0204081 >> 21) * 3; }
Didn't work for me (2--> 0x60c) (and others)
I looked at some of those magic bit-reversal algorithms that use multiply/modulo, but they looked like they'd get messy fast above 4bits.  (the bit reversal algorithms actually derive 2 bits from each copy, so they only make 5 copies to do 8bits, and they STILL need 64bits...)
 

Offline John Coloccia

  • Super Contributor
  • ***
  • Posts: 1208
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #32 on: November 02, 2016, 07:06:23 am »
Where is the drive to optimize this code coming from? It's basic setup code and should be written as simply and clearly as possible. The standard way of doing stuff like this is with flags that are ORd together, like I showed.
 

Offline Kalvin

  • Super Contributor
  • ***
  • Posts: 2145
  • Country: fi
  • Embedded SW/HW.
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #33 on: November 02, 2016, 08:25:53 am »
Looks to me that a lookup-table with 16 entries containing the expanded 8-bit value is a simple to understand and possibly very efficient as it doesn't require any conditional jumps. You can expand it to 8-bit input value easily when you process the input in two 4-bit values:

Code: [Select]
uint16_t expand_bitmask(uint8_t bitmask)
{
    const uint8_t lookup [16] = {
    /*  0 */ b0000_0000,
    /*  1 */ b0000_0011,
    /*  2 */ b0000_1100,
    /*  3 */ b0000_1111,
    /*  4 */ b0011_0000,
    ...
    /* 15 */ b1111_1111 };

    uint16_t expanded_bitmask = (lookup[(bitmask >> 4) & 0x0f] << 8) | (lookup[bitmask & 0x0f]);
    return expanded_bitmask;
}
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #34 on: November 02, 2016, 09:13:00 am »
The 4-bit "magic number" version compiles to 20 byte, total:

Code: [Select]
// Magic number version
unsigned int stretch4(int x)
{
    x = (x | (x << 2)) & 0x33;
    x = (x | (x << 1)) & 0x55;
    return  x | (x<<1);
}
 

Offline rob42

  • Newbie
  • Posts: 9
  • Country: aq
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #35 on: November 02, 2016, 10:02:13 am »
Code: [Select]
// set group of bits
#define _bs(bits,n,m)  ((u32)(bits) << ((n) << ((m)-1)))
// clear group of bits
#define _nbs(bits,n,m) ((u32)~(_bs((bits),(n),(m))))

// clear/set 1-bit groups
#define _rm11(r,b,n) (r) = ((r) & _nbs(1,n,1)) | _bs(b,n,1)
#define _rm12(r, b, n1,n2) (r) = ((r) & (_nbs(1,n1,1) & _nbs(1,n2,1))) | (_bs(b,n1,1) | _bs(b,n2,1))
#define _rm13(r, b, n1,n2,n3) (r) = ((r) & (_nbs(1,n1,1) & _nbs(1,n2,1) & _nbs(1,n3,1))) | (_bs(b,n1,1) | _bs(b,n2,1) | _bs(b,n3,1))
#define _rm14(r, b, n1,n2,n3,n4) (r) = ((r) & (_nbs(1,n1,1) & _nbs(1,n2,1) & _nbs(1,n3,1) & _nbs(1,n4,1))) | (_bs(b,n1,1) | _bs(b,n2,1) | _bs(b,n3,1) | _bs(b,n4,1))
#define _rm15(r, b, n1,n2,n3,n4,n5) (r) = ((r) & (_nbs(1,n1,1) & _nbs(1,n2,1) & _nbs(1,n3,1) & _nbs(1,n4,1) & _nbs(1,n5,1))) | (_bs(b,n1,1) | _bs(b,n2,1) | _bs(b,n3,1) | _bs(b,n4,1) | _bs(b,n5,1))
#define _rm16(r, b, n1,n2,n3,n4,n5,n6) (r) = ((r) & (_nbs(1,n1,1) & _nbs(1,n2,1) & _nbs(1,n3,1) & _nbs(1,n4,1) & _nbs(1,n5,1) & _nbs(1,n6,1))) | (_bs(b,n1,1) | _bs(b,n2,1) | _bs(b,n3,1) | _bs(b,n4,1) | _bs(b,n5,1) | _bs(b,n6,1))

// clear/set 2-bit groups
#define _rm21(r,b,n) (r) = ((r) & _nbs(3,n,2)) | _bs(b,n,2)
#define _rm22(r, b, n1,n2) (r) = ((r) & (_nbs(3,n1,2) & _nbs(3,n2,2))) | (_bs(b,n1,2) | _bs(b,n2,2))
#define _rm23(r, b, n1,n2,n3) (r) = ((r) & (_nbs(3,n1,2) & _nbs(3,n2,2) & _nbs(3,n3,2))) | (_bs(b,n1,2) | _bs(b,n2,2) | _bs(b,n3,2))
#define _rm24(r, b, n1,n2,n3,n4) (r) = ((r) & (_nbs(3,n1,2) & _nbs(3,n2,2) & _nbs(3,n3,2) & _nbs(3,n4,2))) | (_bs(b,n1,2) | _bs(b,n2,2) | _bs(b,n3,2) | _bs(b,n4,2))
#define _rm25(r, b, n1,n2,n3,n4,n5) (r) = ((r) & (_nbs(3,n1,2) & _nbs(3,n2,2) & _nbs(3,n3,2) & _nbs(3,n4,2) & _nbs(3,n5,2))) | (_bs(b,n1,2) | _bs(b,n2,2) | _bs(b,n3,2) | _bs(b,n4,2) | _bs(b,n5,2))
#define _rm26(r, b, n1,n2,n3,n4,n5,n6) (r) = ((r) & (_nbs(3,n1,2) & _nbs(3,n2,2) & _nbs(3,n3,2) & _nbs(3,n4,2) & _nbs(3,n5,2) & _nbs(3,n6,2))) | (_bs(b,n1,2) | _bs(b,n2,2) | _bs(b,n3,2) | _bs(b,n4,2) | _bs(b,n5,2) | _bs(b,n6,2))

// configure mode of 1..6 pins
inline void gpio_mode1(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1){ _rm21(GPIOx->MODER, mode, pin1); }
inline void gpio_mode2(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1, u32 pin2){ _rm22(GPIOx->MODER, mode, pin1,pin2); }
inline void gpio_mode3(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1, u32 pin2, u32 pin3){ _rm23(GPIOx->MODER, mode, pin1,pin2,pin3); }
inline void gpio_mode4(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4){ _rm24(GPIOx->MODER, mode, pin1,pin2,pin3,pin4); }
inline void gpio_mode5(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5){ _rm25(GPIOx->MODER, mode, pin1,pin2,pin3,pin4,pin5); }
inline void gpio_mode6(GPIO_TypeDef* GPIOx, GPIOMode_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5, u32 pin6){ _rm26(GPIOx->MODER, mode, pin1,pin2,pin3,pin4,pin5,pin6); }

// configure output type of 1..6 pins
inline void gpio_otype1(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1){ _rm11(GPIOx->OTYPER, mode, pin1); }
inline void gpio_otype2(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1, u32 pin2){ _rm12(GPIOx->OTYPER, mode, pin1,pin2); }
inline void gpio_otype3(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1, u32 pin2, u32 pin3){ _rm13(GPIOx->OTYPER, mode, pin1,pin2,pin3); }
inline void gpio_otype4(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4){ _rm14(GPIOx->OTYPER, mode, pin1,pin2,pin3,pin4); }
inline void gpio_otype5(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5){ _rm15(GPIOx->OTYPER, mode, pin1,pin2,pin3,pin4,pin5); }
inline void gpio_otype6(GPIO_TypeDef* GPIOx, GPIOOType_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5, u32 pin6){ _rm16(GPIOx->OTYPER, mode, pin1,pin2,pin3,pin4,pin5,pin6); }

// configure speed of 1..6 pins
inline void gpio_ospeed1(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1){ _rm21(GPIOx->OSPEEDR, mode, pin1); }
inline void gpio_ospeed2(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1, u32 pin2){ _rm22(GPIOx->OSPEEDR, mode, pin1,pin2); }
inline void gpio_ospeed3(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1, u32 pin2, u32 pin3){ _rm23(GPIOx->OSPEEDR, mode, pin1,pin2,pin3); }
inline void gpio_ospeed4(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4){ _rm24(GPIOx->OSPEEDR, mode, pin1,pin2,pin3,pin4); }
inline void gpio_ospeed5(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5){ _rm25(GPIOx->OSPEEDR, mode, pin1,pin2,pin3,pin4,pin5); }
inline void gpio_ospeed6(GPIO_TypeDef* GPIOx, GPIOSpeed_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5, u32 pin6){ _rm26(GPIOx->OSPEEDR, mode, pin1,pin2,pin3,pin4,pin5,pin6); }

// configure pull mode of 1..6 pins
inline void gpio_pupd1(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1){ _rm21(GPIOx->PUPDR, mode, pin1); }
inline void gpio_pupd2(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1, u32 pin2){ _rm22(GPIOx->PUPDR, mode, pin1,pin2); }
inline void gpio_pupd3(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1, u32 pin2, u32 pin3){ _rm23(GPIOx->PUPDR, mode, pin1,pin2,pin3); }
inline void gpio_pupd4(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4){ _rm24(GPIOx->PUPDR, mode, pin1,pin2,pin3,pin4); }
inline void gpio_pupd5(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5){ _rm25(GPIOx->PUPDR, mode, pin1,pin2,pin3,pin4,pin5); }
inline void gpio_pupd6(GPIO_TypeDef* GPIOx, GPIOPuPd_TypeDef mode, u32 pin1, u32 pin2, u32 pin3, u32 pin4, u32 pin5, u32 pin6){ _rm26(GPIOx->PUPDR, mode, pin1,pin2,pin3,pin4,pin5,pin6); }
« Last Edit: November 02, 2016, 10:04:24 am by rob42 »
 

Offline @rt

  • Super Contributor
  • ***
  • Posts: 1051
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #36 on: November 02, 2016, 01:08:42 pm »
I can only gather I haven’t understood the problem properly.

Code: [Select]
byte b0 = b0000; // source
byte b1 = b00000000; // target

b1 = b0 * b0;
b1 += b1 << 1;


EDIT,, Ah I see, any of the first four bits could be set.
« Last Edit: November 02, 2016, 01:15:43 pm by @rt »
 

Offline obiwanjacobi

  • Frequent Contributor
  • **
  • Posts: 988
  • Country: nl
  • What's this yippee-yayoh pin you talk about!?
    • Marctronix Blog
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #37 on: November 02, 2016, 02:26:56 pm »
Where is the drive to optimize this code coming from? It's basic setup code and should be written as simply and clearly as possible. The standard way of doing stuff like this is with flags that are ORd together, like I showed.

+1

"Premature optimization is the root of all evil."   >:D
Arduino Template Library | Zalt Z80 Computer
Wrong code should not compile!
 

Offline spongman

  • Newbie
  • Posts: 7
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #38 on: November 02, 2016, 09:53:47 pm »
"Premature optimization is the root of all evil."   >:D
"about 97% of the time."
 

Offline spongman

  • Newbie
  • Posts: 7
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #39 on: November 02, 2016, 10:01:47 pm »
Quote
uint8_t double_bits(uint8_t i) { return ((i * 0x01010101 & 0x08040201) * 0x0204081 >> 21) * 3; }
Didn't work for me (2--> 0x60c) (and others)
I looked at some of those magic bit-reversal algorithms that use multiply/modulo, but they looked like they'd get messy fast above 4bits.  (the bit reversal algorithms actually derive 2 bits from each copy, so they only make 5 copies to do 8bits, and they STILL need 64bits...)

the uint8_t return type implies an "& 0xff" mask. if you just copied the expression and missed the mask then you'll see extra noise in the upper bytes.

2 --> (0x60c & 0xff) is correct.
 

Offline spongman

  • Newbie
  • Posts: 7
  • Country: us
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #40 on: November 02, 2016, 10:13:40 pm »

Oh, that is awesome! - use the multiply to expand the bits out, then mask them off, then push them back together again.


for more awesome: the ARM assembly that 'gcc -O3' generates:

Code: [Select]
double_bits:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        add     r0, r0, r0, asl #8
        ldr     r3, .L2
        add     r0, r0, r0, asl #16
        and     r3, r3, r0
        add     r3, r3, r3, asl #7
        add     r3, r3, r3, asl #14
        mov     r0, r3, asr #21
        add     r0, r0, r0, asl #1
        uxth    r0, r0
        bx      lr
.L2:
        .word   0x8040201
 

Offline magetoo

  • Frequent Contributor
  • **
  • Posts: 284
  • Country: se
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #41 on: November 02, 2016, 10:23:44 pm »
Where is the drive to optimize this code coming from? It's basic setup code and should be written as simply and clearly as possible.

+1
<<1

Unless you are in a tight inner loop, you have no business messing with magic constants and that sort of thing.  You should be able to glance at the code a year from now and instantly see what it does.

If you really need a function or macro, obiwanjacobi's "no loops" suggestion works and what it does is obvious just from its structure.
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #42 on: November 03, 2016, 11:00:02 pm »
Quote
It's basic setup code and should be written as simply and clearly as possible.

pros write robust and well understood code;

newbies write fancy code.
================================
https://dannyelectronics.wordpress.com/
 

Offline dgtl

  • Regular Contributor
  • *
  • Posts: 183
  • Country: ee
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #43 on: November 03, 2016, 11:47:20 pm »
I've had the same question myself and my solution is following...

1) You should not compute values in runtime that can be pre-computed compile-time
2) As the gpio pin definitions in my projects have been constants for now, the register masks and values can be pre-computed
3) Let the compiler do the bit manipulation work, the interface shall be left clean. As long as everything is optimized out, the actual function can be just a stupid block of copypaste. (See linux kernel source for examples of such code, ie ilog2).

So I wrote my gpio accesses as macros and let the compiler optimize it to actual register read-modify-writes without any branches or shifting. The following code i'm using on F2-s (part of my gpio.h file). Of course, it would be terribly inefficient if arguments weren't constant or the optimization otherwise failed.

Code: [Select]
////////////////////////////////////////////////////////////////////////////
// double bits: abcd -> aabbccdd
#define GPIO_BITMASK2(b) ( \
(((b) & (1<<0)) ? (3<<0):0 ) | \
(((b) & (1<<1)) ? (3<<2):0 ) | \
(((b) & (1<<2)) ? (3<<4):0 ) | \
(((b) & (1<<3)) ? (3<<6):0 ) | \
(((b) & (1<<4)) ? (3<<8):0 ) | \
(((b) & (1<<5)) ? (3<<10):0 ) | \
(((b) & (1<<6)) ? (3<<12):0 ) | \
(((b) & (1<<7)) ? (3<<14):0 ) | \
(((b) & (1<<8)) ? (3<<16):0 ) | \
(((b) & (1<<9)) ? (3<<18):0 ) | \
(((b) & (1<<10)) ? (3<<20):0 ) | \
(((b) & (1<<11)) ? (3<<22):0 ) | \
(((b) & (1<<12)) ? (3<<24):0 ) | \
(((b) & (1<<13)) ? (3<<26):0 ) | \
(((b) & (1<<14)) ? (3<<28):0 ) | \
(((b) & (1<<15)) ? (3<<30):0 ) \
)

// quadruple low bits, abcd ->aaaabbbbccccdddd
#define GPIO_BITMASK4L(b) ( \
(((b) & (1<<0)) ? (15<<0):0 ) | \
(((b) & (1<<1)) ? (15<<4):0 ) | \
(((b) & (1<<2)) ? (15<<8):0 ) | \
(((b) & (1<<3)) ? (15<<12):0 ) | \
(((b) & (1<<4)) ? (15<<16):0 ) | \
(((b) & (1<<5)) ? (15<<20):0 ) | \
(((b) & (1<<6)) ? (15<<24):0 ) | \
(((b) & (1<<7)) ? (15<<28):0 ) \
)

// quadruple high bits, abcd -> aaaabbbbccccdddd
#define GPIO_BITMASK4H(b) ( \
(((b) & (1<<8)) ? (15<<0):0 ) | \
(((b) & (1<<9)) ? (15<<4):0 ) | \
(((b) & (1<<10)) ? (15<<8):0 ) | \
(((b) & (1<<11)) ? (15<<12):0 ) | \
(((b) & (1<<12)) ? (15<<16):0 ) | \
(((b) & (1<<13)) ? (15<<20):0 ) | \
(((b) & (1<<14)) ? (15<<24):0 ) | \
(((b) & (1<<15)) ? (15<<28):0 ) \
)

// repeat 4bit value 8 times to full int length
#define GPIO_REPEAT4(b) ((b) * 0x11111111U)

// repeat 2bit value 16 times to full int length
#define GPIO_REPEAT2(b) ((b) * 0x55555555U)

// modify certain bits in register
#define GPIO_MODIFY_REG(reg, clr, set)  (reg)=((((reg) & (~(clr))) | ((set)&(clr))))

////////////////////////////////////////////////////////////////////////////
// set gpio pins in port to peripheral mode
#define GPIO_PERIPH(port, pinmask, periph) do { \
GPIO_MODIFY_REG((port)->MODER, GPIO_BITMASK2(pinmask), GPIO_REPEAT2(GPIO_MODER_PERIPH)); \
if ((pinmask) & 0x00FF) GPIO_MODIFY_REG(port->AFR[0], GPIO_BITMASK4L(pinmask), GPIO_REPEAT4(periph)); \
if ((pinmask) & 0xFF00) GPIO_MODIFY_REG(port->AFR[1], GPIO_BITMASK4H(pinmask), GPIO_REPEAT4(periph)); \
} while(0)
 

Offline Kalvin

  • Super Contributor
  • ***
  • Posts: 2145
  • Country: fi
  • Embedded SW/HW.
Re: Bit manipulation: 'Stretching a bitmask'
« Reply #44 on: November 04, 2016, 09:15:57 am »
If your compiler cannot optimize at the compilation time, but the mask is constant, you can precompute and cache the mask at the system boot time as a workaround:

Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

uint16_t gpio_stretch_bitmask(uint8_t bits)
{
    uint16_t temp = 0;
    uint16_t mask = 0x03;
    while (bits)
    {
        temp |= ((bits & 1) ? mask : 0);
        mask <<= 2;
        bits >>= 1;
    }
    return temp;
}

static const uint8_t GPIO_MASK = 0x55;
static uint16_t GPIO_WIDE_MASK = 0; // Will be initialized in the gpio_init()

static void gpio_init()
{
    GPIO_WIDE_MASK = gpio_stretch_bitmask(GPIO_MASK);     
}

int main(int argc, char** argv)
{
    gpio_init();
    printf("Mask = 0x%x, Wide Mask= 0x%x\n", GPIO_MASK, GPIO_WIDE_MASK);
    return (EXIT_SUCCESS);
}

Edit: Oops! Fixed an error.
« Last Edit: November 04, 2016, 09:48:13 am by Kalvin »
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf