Author Topic: Algorithm for calculating previous states of LFSR counters ?  (Read 5796 times)

0 Members and 1 Guest are viewing this topic.

Offline bsccaraTopic starter

  • Contributor
  • Posts: 23
  • Country: pt
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #25 on: July 03, 2019, 10:35:33 pm »
'duak', I've tested your idea prior to my reply using the following HDL:

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity Timer2 is
port
(
CLK: in std_logic;
COUNT: out std_logic_vector(35 downto 0)
);
end Timer2;
 
architecture RTL of Timer2 is
signal lfsr1: std_logic_vector(18 downto 1) := (others => '0');
signal lfsr2: std_logic_vector(18 downto 1) := (others => '0');
begin
process(CLK)
begin
if rising_edge(CLK) then
lfsr1(1) <= lfsr1(18);
lfsr1(11 downto 2) <= lfsr1(10 downto 1);
lfsr1(12) <= lfsr1(11) xnor lfsr1(18);
lfsr1(18 downto 13) <= lfsr1(17 downto 12);
end if;
end process;

process(CLK)
begin
if rising_edge(CLK) then
if lfsr1 = "000000000000000000" then
lfsr2(1) <= lfsr2(18);
lfsr2(11 downto 2) <= lfsr2(10 downto 1);
lfsr2(12) <= lfsr2(11) xnor lfsr2(18);
lfsr2(18 downto 13) <= lfsr2(17 downto 12);
end if;
end if;
end process;

COUNT <= lfsr2 & lfsr1;
end RTL;

and used the following for a single, larger LFSR:

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity Timer is
port
(
CLK: in std_logic;
COUNT: out std_logic_vector(34 downto 0)
);
end Timer;
 
architecture RTL of Timer is
signal lfsr: std_logic_vector(35 downto 1) := (others => '0');
begin
process(CLK)
begin
if rising_edge(CLK) then
lfsr(1) <= lfsr(35);
lfsr(33 downto 2) <= lfsr(32 downto 1);
lfsr(34) <= lfsr(33) xnor lfsr(35);
lfsr(35) <= lfsr(34);
end if;
end process;

COUNT <= lfsr;
end RTL;

I ran both through the Lattice Diamond toolchain, using LSE for synthesis, as top-level modules. And I got these results for Fmax for the CLK signal on a Mach XO2:

Stage2 x 18 bit LFSR1 x 35 bit LFSR
Synthesis136.799 MHz298.240 MHz
Map139.762 MHz400.000 MHz
Place & Route177.305 MHz358.938 MHz

Given these results I'd rather stick with my current LFSR design; the zero detection is not a (big) issue as it is indeed pipelined and the timer as a whole can run at well over 200 MHz.
 

Offline bsccaraTopic starter

  • Contributor
  • Posts: 23
  • Country: pt
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #26 on: July 03, 2019, 10:56:39 pm »
'hamster_nz', indeed you are correct on the polynomials. Also I'd like to thank your interest on my question as I'm struggling with your algorithm. Your code uses several 'magic' values that I don't quite get their origins but they do seem to be (or at least they are equal to):

Code: [Select]
0x0c == fourth state after 00000
0x19 == fourth state after 00001
0x07 == fourth state after 00010
0x0e == fourth state after 00100
0x09 == fourth state after 01000
0x06 == fourth state after 10000

If I'm correct your code is very efficient at computing the state at a fixed offset from any initial state, requiring only six 'magic' values. But I need to be able to compute the state at any offset from a fixed initial state (given the periodic nature of the LFSR, as you already mentioned, that initial state can be 0 and the offset can be reversed). From your code and explanation the first 'magic' value alone would require a table with one element for each possible offset, totally negating the purpose of the algorithm. So, can you clarify the origin of those 'magic' numbers ?
 

Online langwadt

  • Super Contributor
  • ***
  • Posts: 4857
  • Country: dk
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #27 on: July 03, 2019, 11:28:43 pm »
'duak', I've tested your idea prior to my reply using the following HDL:

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity Timer2 is
port
(
CLK: in std_logic;
COUNT: out std_logic_vector(35 downto 0)
);
end Timer2;
 
architecture RTL of Timer2 is
signal lfsr1: std_logic_vector(18 downto 1) := (others => '0');
signal lfsr2: std_logic_vector(18 downto 1) := (others => '0');
begin
process(CLK)
begin
if rising_edge(CLK) then
lfsr1(1) <= lfsr1(18);
lfsr1(11 downto 2) <= lfsr1(10 downto 1);
lfsr1(12) <= lfsr1(11) xnor lfsr1(18);
lfsr1(18 downto 13) <= lfsr1(17 downto 12);
end if;
end process;

process(CLK)
begin
if rising_edge(CLK) then
if lfsr1 = "000000000000000000" then
lfsr2(1) <= lfsr2(18);
lfsr2(11 downto 2) <= lfsr2(10 downto 1);
lfsr2(12) <= lfsr2(11) xnor lfsr2(18);
lfsr2(18 downto 13) <= lfsr2(17 downto 12);
end if;
end if;
end process;

COUNT <= lfsr2 & lfsr1;
end RTL;

and used the following for a single, larger LFSR:

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity Timer is
port
(
CLK: in std_logic;
COUNT: out std_logic_vector(34 downto 0)
);
end Timer;
 
architecture RTL of Timer is
signal lfsr: std_logic_vector(35 downto 1) := (others => '0');
begin
process(CLK)
begin
if rising_edge(CLK) then
lfsr(1) <= lfsr(35);
lfsr(33 downto 2) <= lfsr(32 downto 1);
lfsr(34) <= lfsr(33) xnor lfsr(35);
lfsr(35) <= lfsr(34);
end if;
end process;

COUNT <= lfsr;
end RTL;

I ran both through the Lattice Diamond toolchain, using LSE for synthesis, as top-level modules. And I got these results for Fmax for the CLK signal on a Mach XO2:

Stage2 x 18 bit LFSR1 x 35 bit LFSR
Synthesis136.799 MHz298.240 MHz
Map139.762 MHz400.000 MHz
Place & Route177.305 MHz358.938 MHz

Given these results I'd rather stick with my current LFSR design; the zero detection is not a (big) issue as it is indeed pipelined and the timer as a whole can run at well over 200 MHz.

how does that compare to a regular counter maybe with a bit or two of prescaler?
 
The following users thanked this post: Someone

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #28 on: July 03, 2019, 11:47:13 pm »
'hamster_nz', indeed you are correct on the polynomials. Also I'd like to thank your interest on my question as I'm struggling with your algorithm. Your code uses several 'magic' values that I don't quite get their origins but they do seem to be (or at least they are equal to):

Code: [Select]
0x0c == fourth state after 00000
0x19 == fourth state after 00001
0x07 == fourth state after 00010
0x0e == fourth state after 00100
0x09 == fourth state after 01000
0x06 == fourth state after 10000

If I'm correct your code is very efficient at computing the state at a fixed offset from any initial state, requiring only six 'magic' values. But I need to be able to compute the state at any offset from a fixed initial state (given the periodic nature of the LFSR, as you already mentioned, that initial state can be 0 and the offset can be reversed). From your code and explanation the first 'magic' value alone would require a table with one element for each possible offset, totally negating the purpose of the algorithm. So, can you clarify the origin of those 'magic' numbers ?

I did leave enough breadcrumbs  ;D

I got those numbers exactly as you suggested -  I put 00 as the initial state of the LFSR, and ran it forward four states, giving 0C.

I then put 01 ran it forward four states, giving 19. To isolate the change that bit makes, I have to remove the change that would occur even if this bit wasn't set, hence why it is XORed with 0C

I then did the same for 02, 04, 08 & 10, so I know what effect that having each bit set leads to four cycles into the future.

There isn't anything magical about the number four, I just wanted something I could count lines on the screen.

So what if I want to jump ahead 8 states? Easy - just run it through it twice.

Code: [Select]

          guess = 0x0c;
          if((state & 0x01) != 0) guess ^= 0x19 ^ 0x0C;
          if((state & 0x02) != 0) guess ^= 0x07 ^ 0x0C;
          if((state & 0x04) != 0) guess ^= 0x0e ^ 0x0C;
          if((state & 0x08) != 0) guess ^= 0x09 ^ 0x0C;
          if((state & 0x10) != 0) guess ^= 0x06 ^ 0x0C

          // advance the state four cycles into the future
          state = guess;
          guess = 0x0c;
          if((state & 0x01) != 0) guess ^= 0x19 ^ 0x0C;
          if((state & 0x02) != 0) guess ^= 0x07 ^ 0x0C;
          if((state & 0x04) != 0) guess ^= 0x0e ^ 0x0C;
          if((state & 0x08) != 0) guess ^= 0x09 ^ 0x0C;
          if((state & 0x10) != 0) guess ^= 0x06 ^ 0x0C

          state = guess;
          // statue is now 8 cycles into the future
         
Now If I run 00, 01, 02, 04, 08 and 10, through this "Advance 8 cycles" I will end up with 6 new magic numbers, that can be used to jump 8 states ahead in one operation.

You can probably see where this is going...

If you build a table "magic[5][6]" where:

* magic[0][] is what is used to jump ahead one state

* magic[1][] is what is used to jump ahead two states

* magic[2][] is what is used to jump ahead four states

* magic[3][] is what is used to jump ahead eight states

* magic[4][] is what is used to jump ahead sixteen states

You can calculate the value in magic[1][] by using the values in magic[0] twice. You can then calculate the value in magic[2][] by using calculate the value in magic[1][] twice, and so on. (For debugging the values for magic[2][] should be the numbers we already have)

So we now have a way to skip forward any of 1,2,4, 8 or 16 states.

If we want to skip back 10 states because the LFSR has 31 states is the same as skipping forward 21 states. You can use magic[0][], magic[2][] and magic[4][], to jump forward 1, then 4, then 16 states, for a total of 21, but only performing three sets of calculations:

Code: [Select]

So the end code looks something like:

for(index = 0; index < lfsr_bit_length; index++) {
   if((number_to_skip & (1<<index)) != 0) {
          guess = magic[index][0];
          if((state & 0x01) != 0) guess ^= magic[index][1] ^ magic[bit][0];
          if((state & 0x02) != 0) guess ^= magic[index][2] ^ magic[bit][0];
          if((state & 0x04) != 0) guess ^= magic[index][3] ^ magic[bit][0];
          if((state & 0x08) != 0) guess ^= magic[index][4] ^ magic[bit][0];
          if((state & 0x10) != 0) guess ^= magic[index][5] ^ magic[bit][0];
          state = guess;
    }
}

For a small LFSR like this it doesn't help much, but for a 36 bit FSM it is far quicker that rolling through states.

In BigO notation it changes the runtime from O(2^n) to O(n^2).
« Last Edit: July 03, 2019, 11:54:03 pm by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Someone

  • Super Contributor
  • ***
  • Posts: 5155
  • Country: au
    • send complaints here
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #29 on: July 03, 2019, 11:51:23 pm »
for a lower Fmax and increased propagation delays, making timing closure much harder.
You're talking about a loadable counter, have you looked at the critical timing paths? Using a binary counter with "hard" carry resources and the overflow as your end of count indicator is likely to be faster than an lfsr and the comparison logic.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #30 on: July 04, 2019, 12:27:11 am »
for a lower Fmax and increased propagation delays, making timing closure much harder.
You're talking about a loadable counter, have you looked at the critical timing paths? Using a binary counter with "hard" carry resources and the overflow as your end of count indicator is likely to be faster than an lfsr and the comparison logic.

I initially thought that too, but the LFSR with two taps soon becomes quicker that a long counter as bit length grows.

For example in Spartan 3 the carry chain is ~0.05 ns per bit. so a 36 bit counter needs 1.8ns for just carry chain alone, and then you need at least one level of logic in the loop as a MUX to load the preset values.

A long "LFSR with load" and two taps requires only one level of logic, making it at least 1.8ns faster than the a long counter (in Spartan 3).

The logic required to detect the zero state is another thing. That would need to be pipelined to run fast...

 
« Last Edit: July 04, 2019, 12:34:01 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Online langwadt

  • Super Contributor
  • ***
  • Posts: 4857
  • Country: dk
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #31 on: July 04, 2019, 12:36:28 am »
for a lower Fmax and increased propagation delays, making timing closure much harder.
You're talking about a loadable counter, have you looked at the critical timing paths? Using a binary counter with "hard" carry resources and the overflow as your end of count indicator is likely to be faster than an lfsr and the comparison logic.

I initially thought that too, but the LFSR with two taps soon becomes quicker that a long counter.

For example in Spartan 3 the carry chain is ~0.05 ns per bit. so a 36 bit counter needs 1.8ns for just carry chain alone, and then you need at least one level of logic in the loop as a MUX to load the preset values.

A long "LFSR with load" and two taps requires only one level of logic, making it at least 1.8ns faster than the a long counter (in Spartan 3).

The logic required to detect the zero state is another thing. That would need to be pipelined to run fast...

a single bit as prescaler/CE on a 35 bit counter and you have twice as much time ?
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #32 on: July 04, 2019, 05:45:03 am »
Curiosity got the better of me. Here is the implementation for the 9-bit LFSR, that dynmically builds the tables test it works for all values. The 36-bit one will involve:

- changing uint32_t to uint64_t
- changing BITS to 36
- changing the taps used in advance_lfsr()
- a slightly longer time to test for all values.  ;D

Code: [Select]

#include <stdio.h>
#include <stdint.h>

#define BITS (9)

static uint32_t magic_zero[BITS];
static uint32_t magic[BITS][BITS];

/*****************************************
 * The LFSR calculation
 ****************************************/
static uint32_t advance_lfsr(uint32_t state) {
  uint32_t new_bit;
  new_bit = 1^((state & 0x100) ? 1 : 0) ^ ((state & 0x010) ? 1 : 0);
  state = (state << 1 )|new_bit;
  state &= (1<<BITS)-1;
  return state;
}

/*****************************************
 * The LFSR speedup, that advances 2^bit_to_advance
 ****************************************/
static uint32_t advance_fast(uint32_t state, int bit_to_advance) {
   uint32_t guess, i;

   guess = magic_zero[bit_to_advance];
   for(i = 0; i < BITS; i++) {
      if((state & (1<<i)) != 0) guess ^= magic[bit_to_advance][i];
   }
   return guess;
}


/*****************************************
 * Calculate the magic values that are the
 * future contributions of 'zero' and than
 * the contribuions of each bit.
 ****************************************/
static void calculate_magic_values(void) {
   int i, bit;

   /* Calculate the first entry, using the traditional LFSR function */
   magic_zero[0] = advance_lfsr(0);
   for(i = 0; i < BITS; i++) {
     magic[0][i] = advance_lfsr(1<<i) ^ magic_zero[0];
   }

   /* Calculate all other entries, by applying the last set of values twice */
   for(bit=1; bit < BITS; bit++) {
     uint32_t state = 0;
     state = advance_fast(state,bit-1);
     state = advance_fast(state,bit-1);
     magic_zero[bit] = state;
     for(i = 0; i < BITS; i++) {
        state = 1<<i;
        state = advance_fast(state,bit-1);
        state = advance_fast(state,bit-1);
        magic[bit][i] = state ^ magic_zero[bit];
     }
   }

   /* Print out the table */
   for(i = 0; i < BITS; i++) {
       printf("%03x", magic_zero[i]);
       for(bit = 0; bit < BITS; bit++) {
          printf(" %03x", magic[i][bit]);
       }
       printf("\n");
   }
}

int main(int argc, char *argv[])
{
  uint32_t distance;

  calculate_magic_values();

  /* Verify that it works for all distances */
  for(distance = 0; distance < 1<<BITS; distance++) {
    uint32_t initial = 0;
    uint32_t state = initial;
    printf("\n");
    printf("distance = %i\n", distance);
    printf("State   Calculated  Guessed   Error?\n");
    do {
      uint32_t calced, guess, i;

      /* Calculate it the fast way */
      guess = state;
      for(i = 0; i < BITS; i++) {
         if((distance & (1<<i)) != 0) {
           guess = advance_fast(guess,i);
         }
      }

      /* Now calculate it the long way */
      calced = state;
      for(i = 0; i < distance; i++) {
         calced = advance_lfsr(calced);
      }

      /* Show results */
      printf("%03x      %03x          %03x        %s\n", state, calced, guess, calced ^ guess ? "yes" : "no");

      /* Move on to next state */
      state = advance_lfsr(state);
    } while(state != initial);
  }
  return 0;
}
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 
The following users thanked this post: bsccara

Offline Someone

  • Super Contributor
  • ***
  • Posts: 5155
  • Country: au
    • send complaints here
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #33 on: July 05, 2019, 01:11:31 am »
for a lower Fmax and increased propagation delays, making timing closure much harder.
You're talking about a loadable counter, have you looked at the critical timing paths? Using a binary counter with "hard" carry resources and the overflow as your end of count indicator is likely to be faster than an lfsr and the comparison logic.

I initially thought that too, but the LFSR with two taps soon becomes quicker that a long counter as bit length grows.

For example in Spartan 3 the carry chain is ~0.05 ns per bit. so a 36 bit counter needs 1.8ns for just carry chain alone, and then you need at least one level of logic in the loop as a MUX to load the preset values.

A long "LFSR with load" and two taps requires only one level of logic, making it at least 1.8ns faster than the a long counter (in Spartan 3).

The logic required to detect the zero state is another thing. That would need to be pipelined to run fast...
... so if you're able to accept pipelining the comparison, then why not pipeline a binary (and/or other types of) counter in the first place?
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #34 on: July 05, 2019, 01:55:29 am »
... so if you're able to accept pipelining the comparison, then why not pipeline a binary (and/or other types of) counter in the first place?

More a question for the OP, but I assume their view is it takes a problem you can solve with simple, fast, well understood hardware design and some not-too-hard software into one that needs tricky (and maybe unfeasible) hardware design with the goal of making it simple for software.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Someone

  • Super Contributor
  • ***
  • Posts: 5155
  • Country: au
    • send complaints here
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #35 on: July 05, 2019, 04:25:59 am »
... so if you're able to accept pipelining the comparison, then why not pipeline a binary (and/or other types of) counter in the first place?

More a question for the OP, but I assume their view is it takes a problem you can solve with simple, fast, well understood hardware design and some not-too-hard software into one that needs tricky (and maybe unfeasible) hardware design with the goal of making it simple for software.
Pipelining a counter is possibly unfeasible? I cannot see it being any more so than pipelining a match circuit. An LFSR that self resets to a loadable length is the oddball hardware design here. Seems like the classic case of pre-emptive optimisation gone wrong, having not seen any comparisons of fleshed out examples from you or the OP.

But do continue on your academic meander, its interesting if entirely missing the practicalities. Trying to suggest a binary counter as synthesised from high level code is somehow esoteric or complex in comparison has made my day.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #36 on: July 05, 2019, 05:29:08 am »
But do continue on your academic meander, its interesting if entirely missing the practicalities. Trying to suggest a binary counter as synthesised from high level code is somehow esoteric or complex in comparison has made my day.

To show the complexity:

if I were to race you to build either a 16-bit counter on a breadboard out of TTL Quad D-flipflops and two-input logic gates, I bet I could get my LFSR counter up and running quicker, and it would count faster, because it is less complex.

A full adder would need 5 gates per bit, so an adder-based binary counter would need 24 chips (80 for logic gates and and 16 FFs), but a LFSR would only need 5 (1 or two logic gates and 16 FFs). The binary counter might only need half that many logic gates as you can use half-adders... but that is still 14 chips vs 5.

If the OP is working past where a binary counter can meet timing, moving to an LFSR based timer is a workable solution.

LFSRs go from boring to interesting as you learn more about them, as they seem to pop up everywhere:

- Data integrity, such as CRC codes (https://en.wikipedia.org/wiki/Cyclic_redundancy_check)

- Scrambling (as used in PCIe, DisplayPort, SDI, networkng http://en.wikipedia.org/wiki/Scrambler)

- Communication such as GPS (eg.https://en.wikipedia.org/wiki/Direct-sequence_spread_spectrum)

- Forward Error correction (e.g. https://en.wikipedia.org/wiki/BCH_code, as used in HDMI)

- Pseudo-random binary sequences for such things as built in self tests and device characterization

- High speed counters

- Cryptography
« Last Edit: July 05, 2019, 05:32:31 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Someone

  • Super Contributor
  • ***
  • Posts: 5155
  • Country: au
    • send complaints here
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #37 on: July 05, 2019, 05:47:41 am »
if I were to race you to build either a 16-bit counter on a breadboard out of TTL Quad D-flipflops and two-input logic gates, I bet I could get my LFSR counter up and running quicker, and it would count faster, because it is less complex.

A full adder would need 5 gates per bit, so an adder-based binary counter would need 24 chips (80 for logic gates and and 16 FFs), but a LFSR would only need 5 (1 or two logic gates and 16 FFs). The binary counter might only need half that many logic gates as you can use half-adders... but that is still 14 chips vs 5.

If the OP is working past where a binary counter can meet timing, moving to an LFSR based timer is a workable solution.

LFSRs go from boring to interesting as you learn more about them, as they seem to pop up everywhere:
Except the target is an FPGA, not discrete gates, and is programmed in a high level language. I'm fully aware of the wide use of LFSRs and that as a counter they can be power efficient, but that isn't looking at the specific example here of a self resetting loadable counter.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2812
  • Country: nz
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #38 on: July 05, 2019, 08:10:25 am »
Two designs in the same wrapper.

Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity binary_counter is
    Port ( clk     : in   STD_LOGIC;
           load    : in   STD_LOGIC;
           initial : in   STD_LOGIC_VECTOR (35 downto 0);
           pulse   : out  STD_LOGIC);
end binary_counter;

architecture Behavioral of binary_counter is
signal   counter : unsigned(initial'length downto 0) := (others => '0');
constant zero    : unsigned(0 downto 0)              := (others => '0');
begin

process(clk)
begin
if rising_edge(clk) then
pulse <= std_logic(counter(counter'high));
if load = '1' then
counter <= zero & unsigned(initial);
else
counter <= (zero & counter(counter'high-1 downto 0)) + 1;
end if;
end if;
end process;
end Behavioral;


Code: [Select]
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity lfsr_counter is
    Port ( clk     : in   STD_LOGIC;
           load    : in   STD_LOGIC;
           initial : in   STD_LOGIC_VECTOR (35 downto 0);
           pulse   : out  STD_LOGIC);
end lfsr_counter;

architecture Behavioral of lfsr_counter is
signal state : STD_LOGIC_VECTOR(initial'high downto 0) := (others => '0');
begin

process(clk)
begin
if rising_edge(clk) then
if state = x"000000000" then
pulse <= '1';
else
   pulse <= '0';
end if;

if load = '1' then
state <= initial;
else
state <= state(state'high-1 downto 0) & (state(state'high) xor state(5) xor '1');
end if;
end if;
end process;
end Behavioral;

Although subtly different, both are constrained the same, to the time where both meet the same timing:
Code: [Select]
TIMESPEC TS_clk = PERIOD "clk" 3.51 ns HIGH 50%;

Want to guess which timing is the counter, and which the LFSR?

Code: [Select]
 Design statistics: 
    Minimum period:   3.509ns{1}   (Maximum frequency: 284.981MHz)

Code: [Select]
 Design statistics: 
    Minimum period:   2.586ns{1}   (Maximum frequency: 386.698MHz)
« Last Edit: July 05, 2019, 08:29:20 am by hamster_nz »
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline Someone

  • Super Contributor
  • ***
  • Posts: 5155
  • Country: au
    • send complaints here
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #39 on: July 05, 2019, 09:33:03 am »
Two designs in the same wrapper.

Want to guess which timing is the counter, and which the LFSR?

Code: [Select]
 Design statistics: 
    Minimum period:   3.509ns{1}   (Maximum frequency: 284.981MHz)

Code: [Select]
 Design statistics: 
    Minimum period:   2.586ns{1}   (Maximum frequency: 386.698MHz)
No idea what optimisations you're using in implementation but that binary counter is a full width, when we have repeatedly said it could be divided up or pipelined. If you're not making any effort to improve upon an inferred single cycle counter then its not surprising it implemented slower.

Just to settle this for my own knowledge I threw three versions into a few different devices and consistently had a pipelined counter slower than either an LFSR or partitioned counter. The partitioned counter was consistently the fastest, using a few extra LUTs and control sets compared to the LFSR but fewer flops. But they all surpassed the switching limits of the clock nets, so its past the point of worrying unless you want to get into clocks routed on the general fabric (at which point there are usually better ways to reframe the problem).
 

Offline bsccaraTopic starter

  • Contributor
  • Posts: 23
  • Country: pt
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #40 on: July 05, 2019, 10:51:05 am »
'someone', my main goal going with the LFSR is not absolute maximum Fmax. I need it to be fast enough to not impede timing closure but also as light as possible on resources as I have 16 of them on the design. As it is the FPGA is already almost maxed out, particularly on LUTs...
There's also some misunderstanding going on, as the LFSR counter does not self-reset; after being enabled from other logic it steps from a loadable initial state until zero, at which time that other logic will disable it. The loading is done while the timer is reset but I wouldn't call it self-reseting, as it doesn't continue on by itself.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7192
  • Country: fi
    • My home page and email address
Re: Algorithm for calculating previous states of LFSR counters ?
« Reply #41 on: July 05, 2019, 05:39:08 pm »
- Pseudo-random binary sequences
I don't know much about the hardware (just about zero on VHDL, Verilog, or FPGAs), but I do scientific work (computational materials physics; simulators and such), and the two PRNG families I use, Mersenne Twister and Xorshift, are both based on linear feedback shift registers.  Xorshift is pure XOR LFSR with optional output twizzling through multiplication that does not affect the state at all (it just provides better distribution in the output bits), and Mersenne Twister is a twisted generalized feedback shift register.

MT19937 in particular is used everywhere; it's the default PRNG in Python (random module), for example.



It might be important to note here that OP is actually interested in two different aspects of XNOR LFSRs:  One is hardware implementation for a fast counter, towards an all-zero state; and another is a generic C implementation run on a desktop-type computer, to calculate the initial state needed for the hardware implementation to reach the all-zero state in a specific number of steps.

A 9-bit XNOR LFSR can have 511 distinct states (excluding the all-ones halt state).  This is such a small number of states that the generic C implementation can tabulate all values according to the number of steps to the all-zero state.

A 35-bit XNOR LFSR can have 235-1 = 34,359,738,367 distinct states (again excluding the all-ones halt state).  It is not feasible to tabulate all of these in a desktop program; that'd take gigabytes of memory.

My example program above shows that for any XNOR LFSR implementation, you can easily create (and shows how) a complementary XNOR LFSR, that produces the same sequence, but in the reverse direction.  Since I wrote it for illustration only, it is also just about the slowest way to go about it, though; but to show much faster implementations, one should use a fixed characteristic polynomial (fixed taps and LFSR size).

My suggestion for the desktop program is to precalculate the reverse sequence once, saving suitable states and their distances from the zero state.  To find the desired start state, one picks the saved state furthest from the zero state not further than the desired distance, and continue from that point using the complementary XNOR LFSR.  A few tens of thousands of saved states means every distinct state can be reached with fewer than a million iterations, which should be plenty fast enough for a desktop configuration application.

My earlier post attempted to explain that if the desktop application needs to be even faster, one can find any distinct state based on their distance from the zero state, in logarithmic time.  It is just much more complicated to implement, as it usually involves functions that produce the same result in fixed time as iterating the XNOR LFSR a fixed number of times would.  In other words, you can implement the functionality to find a specific state for the hardware counter to reach all-zero state in a specific number of steps, even on a weak microcontroller; it just takes somewhat more mathy programming work, i.e. human programmer time and effort.

xhamster_nz above showed an alternate method, using precalculated look-up tables, to implement a fast state finder for a specific distance from the all-zero state.  I found them fascinating; I have never explored that approach before.  (Then again, I haven't explored skipping ahead in an LFSR before, either.)  Thanks!

The Xilinx XAPP 052 from 1996 states that for a 9-bit XNOR LFSR, taps 9 and 5 (corresponding to characteristic polynomial of x9+x5+1) produces a maximal-length sequence; and taps 35 and 33 (corresponding to x35+x33+1) for 35-bit XNOR LFSR.  Two taps is the minimum number of taps needed for a maximal-length XNOR LFSR, so these taps seem optimal to me for OP's use case.

Perhaps OP could confirm these are the XNOR LFSRs they're particularly interested in?  If so, we could easily whip up example programs/C functions to generate the desired states for the hardware to reach the all-zero state in the specified number of steps.  Otherwise, I am afraid this thread is at serious risk of derailing...
« Last Edit: July 08, 2019, 11:14:23 am by Nominal Animal »
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf