Author Topic: My first FPGA code  (Read 24218 times)


Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3238
  • Country: ca
Re: My first FPGA code
« Reply #175 on: December 01, 2019, 12:26:19 am »
They pretty much have and use all the same terminology and data timing points as Quartus' timing report, yet the one single figure they want you to calculate by hand with a calculator is the FMAX?  I just want an idea of how much work I need to do to stretch my design or its expandability at a glance before digging into the difficult numbers.

A lot. The tools stop optimizing as soon as the timing is met. If you want higher speed, the tools may be able to make it, but at the cost of extra optimization, which may take hours. There's no way to find out aside from actually trying.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #176 on: December 01, 2019, 12:50:08 am »
Please be more clear when you say, "you can create a block"?

Something like this? The calculation in question compares the magnitude of 2 bytes against an upper and a lower value. I just wanna make sure the same logic is re-used and the sampled clock is just de-selected and switched, not some whole new logic re-created. Let's be efficient here.


always @(posedge clk_1) begin
    if (enable == 1'b0) begin
        // calculation
        // ....
    end
end

always @(posedge clk_2) begin
    if (enable == 1'b1) begin
        // calculation
        // ....
    end
end


         
Running at the upper limits of the FPGA means you will not be able to use the same logic with 2 different clocks simultaneously.  We don't know what you are trying to do, but based on your post, I'm thinking of something closer to this:
----------------------------------------------------
always @(posedge clk_1) begin
    if (enable == 1'b0) begin
        if (input_data > low_limit && input_data < high_limit) input_data_in_range <= 1;
        else                                                   input_data_in_range <= 0;
        ....
    end
end
-------------------------------------------------------
The above test only takes 1 logic cell.  That 1 logic cell is the 'input_data_in_range' result.  The math inside the if's (...) becomes the gates preceding the 'D' input of the logic cell register 'input_data_in_range'.  Now, if your input data is 8 bits and the high_ and low_limits are also stored in registers of 8 bits each, this logic will be 25 logic cells total.

Now, if you add more tests on the 'input_data', those 8 bits are automatically reused.  Or, if you use the high_ and low_limits elsewhere but with different data, then those 16 logic cell registers are again automatically re-used, and just the alternate new 8-bit input data will add 8 additional registers to your design.  I.e., expanding the above to 1 upper and lower limit (16 logic cells), 2 different 8-bit input_data (another 16 logic cells), and 2 different data_in_range output regs #1 and #2 would bring the design to 34 logic cells.  You are using next to nothing of the bottom-end MAX10's 2000 logic cells here.  What else do you need to do?
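To make the register count concrete, here is a minimal sketch of the range test as a complete module. The module name, port names and the WIDTH parameter are made up for illustration, and the real logic cell count will depend on what the fitter does with it:

```verilog
// Illustrative sketch only: module and port names are assumptions,
// not taken from anyone's actual design.
module range_check #(
    parameter WIDTH = 8
) (
    input  wire             clk,
    input  wire             enable,              // test runs when low, as in the snippet above
    input  wire [WIDTH-1:0] data_in,
    input  wire [WIDTH-1:0] low_in,
    input  wire [WIDTH-1:0] high_in,
    output reg              input_data_in_range
);
    // 3 x 8 = 24 register bits for the stored data and limits
    reg [WIDTH-1:0] input_data;
    reg [WIDTH-1:0] low_limit;
    reg [WIDTH-1:0] high_limit;

    always @(posedge clk) begin
        input_data <= data_in;
        low_limit  <= low_in;
        high_limit <= high_in;

        if (enable == 1'b0) begin
            // 1 more register; the two comparators are plain gates
            // feeding its D input
            input_data_in_range <= (input_data > low_limit) &&
                                   (input_data < high_limit);
        end
    end
endmodule
```

With WIDTH = 8 that is 24 + 1 = 25 registers, which is where the 25 logic cell figure comes from.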

« Last Edit: December 01, 2019, 12:52:02 am by BrianHG »
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #177 on: December 01, 2019, 12:59:01 am »
Yeah something like that Brian.

the "calculation" , should this be defined as a function?  For the "compiler" to understand I wanna re-use this, if the upper and lower limits are the same for both.

Or is it gonna be smart and re-use what it can?

« Last Edit: December 01, 2019, 01:01:16 am by lawrence11 »
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #178 on: December 01, 2019, 01:06:37 am »
Well I did not know my prerequisites.

Max10 seemed like the best since it was 3.3 Volts.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #179 on: December 01, 2019, 01:19:29 am »
Yeah something like that Brian.

the "calculation" , should this be defined as a function?  For the "compiler" to understand I wanna re-use this, if the upper and lower limits are the same for both.

Or is it gonna be smart and re-use what it can?
Since I don't know the speed at which you need to run the 'function', I can't tell you.  If the function only needs to run 50 million times a second for each data input, i.e. a kind of DSP processor, then to save space the general practice would be to run the function at 100MHz or 150MHz, 1 clock, all positive edge.  What you then do is take the slower source data A & B coming in and multiplex them one after the other into 1 single pipe which feeds the function.  At the output of the function, you de-mux your results back into results A & B.  In fact, in this setup, if your source data is 50MHz and you run your DSP function at 150MHz, you can feed it 3 muxed data sources and de-mux the output into 3 parallel answers.  1 calculating DSP pipe running at 3x the speed of your source data means you can calculate 3 different products simultaneously.  This is how one port of the FPGA memory inside the 'FPGA VGA Controller for 8-bit computer' project was made into a 5 port RAM where the read addresses were running at 1/5th the system clock rate.

     Going the other way is a problem.  If you want to calculate 108MHz data in parallel with 150MHz data, you would need a 300MHz DSP core, or just 2 DSP cores running in parallel.  However, if you only need to do 1 data port at a time, 108MHz or 150MHz, you can just keep the DSP running at 150MHz and switch between source data.  Converting 8-bit 108MHz data to 150MHz with a 'data_ready' signal would only take 18 logic cells for reliable functionality, or 30 logic cells for guaranteed operation under obscene circumstances (such circumstances being an unstable, glitchy source clock).
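A rough sketch of the mux / shared pipe / de-mux idea for 2 sources. Everything here is an assumption for illustration: the names, the "+ 1" placeholder standing in for the real function, and the assumption that src_a and src_b are already synchronous to clk_fast and change at half its rate:

```verilog
// Illustrative sketch: one shared function serving 2 data sources.
// Assumes src_a/src_b are synchronous to clk_fast and update at clk_fast/2.
module tdm_pipe #(
    parameter WIDTH = 8
) (
    input  wire             clk_fast,
    input  wire [WIDTH-1:0] src_a,
    input  wire [WIDTH-1:0] src_b,
    output reg  [WIDTH-1:0] result_a,
    output reg  [WIDTH-1:0] result_b
);
    reg             sel    = 1'b0;   // 0 = source A, 1 = source B, toggles every clock
    reg             sel_d1 = 1'b0;   // sel delayed to follow the data through the pipe
    reg             sel_d2 = 1'b0;
    reg [WIDTH-1:0] pipe_in;
    reg [WIDTH-1:0] pipe_out;

    always @(posedge clk_fast) begin
        sel <= ~sel;

        // Stage 1: mux the two sources into one pipe
        pipe_in <= sel ? src_b : src_a;
        sel_d1  <= sel;

        // Stage 2: the shared function (placeholder calculation)
        pipe_out <= pipe_in + 1'b1;
        sel_d2   <= sel_d1;

        // Stage 3: de-mux the result back to the matching output
        if (sel_d2) result_b <= pipe_out;
        else        result_a <= pipe_out;
    end
endmodule
```

The select bit is delayed by the same number of clocks as the data, so each result lands in the output register belonging to the source it came from. For 3 sources at 3x the clock, the select would become a 0-1-2 counter instead of a toggle.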

 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #180 on: December 01, 2019, 01:23:38 am »
It's not simultaneous but mutually exclusive.

Worst case stress test would be 150 MHz.

All logic blocks do the exact same frikkin thing, it's just a different pin, and a bit slower.

You wanna know why this pin is there? Layout reasons ok! Things got...messy.

Both pins of the type clk#_p

My original question was regarding the syntax, and now I am just confused.

Why did you guys ever think I didn't have a crystal? I ain't that noob.
« Last Edit: December 01, 2019, 01:27:55 am by lawrence11 »
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #181 on: December 01, 2019, 01:25:01 am »
Without knowing your function, for all I know, it may only take 100 or 200 logic cells.  You may still be using next to nothing of the MAX10.  Heck, the 8-bit video card project is only at 600 logic cells now, and that's with an RS232 debugger com port, cursor lines, video generator, plus 8 kbytes of RAM for font and test memory.  It would only take around 30% of the logic cells in the MAX10 and it is almost finished.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: My first FPGA code
« Reply #182 on: December 01, 2019, 01:27:57 am »
My little LC3 project will run at 100 MHz on an Artix 7 100T but timing fails at 200 MHz.  I didn't iterate over values to find the highest possible frequency because the crystal oscillator is 100 MHz and that is as fast as I want to push the design.  In ISE they used to tell you the maximum possible frequency.  I haven't found that datapoint in Vivado.  I'm sure it's there somewhere but I haven't run across it.
They removed the feature in Vivado:
https://www.xilinx.com/support/answers/57304.html
I find it kind of funny that with a bloody computer, and soooooo much on-screen data and tables to show you everything, you have to read that thread and use the formula given there to calculate the FMAX on your own.  The formula is ' 1/(T-WNS) ', where T is your given clock period.  Though, on their timing reports video:
https://www.xilinx.com/video/hardware/timing-summary-report.html
They pretty much have and use all the same terminology and data timing points as Quartus' timing report, yet the one single figure they want you to calculate by hand with a calculator is the FMAX?  I just want an idea of how much work I need to do to stretch my design or its expandability at a glance before digging into the difficult numbers.

Thanks for the tip!  My little LC3 will only run about 104 MHz according to the calculation.  100 MHz is probably close enough because I don't want to get into lengthy optimizations.  I certainly don't want to pipeline the design for a toy project.
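For anyone else doing the sum by hand, here it is worked through.  The 10 ns period follows from the 100 MHz constraint; the WNS value of 0.4 ns is only an assumed example chosen to land near the 104 MHz figure, not a number copied from a report:

```latex
F_{\max} = \frac{1}{T - \mathrm{WNS}}
         = \frac{1}{10\,\mathrm{ns} - 0.4\,\mathrm{ns}}
         = \frac{1}{9.6\,\mathrm{ns}}
         \approx 104.2\,\mathrm{MHz}
```

When timing fails, WNS is negative, so the subtraction makes the denominator larger than T and the result drops below the constrained frequency.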
« Last Edit: December 01, 2019, 01:29:28 am by rstofer »
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #183 on: December 01, 2019, 01:34:54 am »
Sometimes it's impossible to communicate via this forum.

The time wasted today was humongous.

I need to talk to a tutor real time.
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #184 on: December 01, 2019, 01:41:17 am »
It's not simultaneous, it's mutually exclusive.

Worst case stress test would be 150 MHz.

All logic blocks do the exact same frikkin thing, it's just a different pin, and a bit slower.

Both pins of the type clk#_p

My original question was regarding the syntax, and now I was forced to explain a lot more than I wanted to.

Something like:
always @(posedge clk150) begin

    if (source_a_or_b == 1) begin
        dsp_input_data     <= source150_data;
        dsp_input_data_rdy <= source150_data_rdy;
    end else begin
        dsp_input_data     <= source108_data;
        dsp_input_data_rdy <= source108_data_rdy;
    end

    dsp_output     <= (dsp_input_data * 10 / 3 + 100 + offset_reg_setting) * contrast_reg_setting;
    dsp_output_rdy <= dsp_input_data_rdy;

end // always clk150


The first IF() selects what the reg 'dsp_input_data' will be equal to, based on the input 'source_a_or_b' being high or low.  It also makes the 'dsp_input_data_rdy' reg equal to the matching source's data ready flag.

After the IF() / ELSE, the next line is the DSP, and the line after that passes the data ready flag through the same number of clock steps required by the DSP.

You have now switch-selected between 2 alternate data sources for the DSP processor.  The only thing you are missing is the clock domain transition step between the 108MHz and 150MHz clocks.
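As a rough idea of what that clock domain transition step can look like, here is a toggle-flag handshake sketch.  It is NOT the small clock domain code referred to elsewhere in this thread; the names are invented, and it assumes each new source word is held for several clk_dst cycles before the next one arrives.  A stream that changes on every source clock would normally need a dual-clock FIFO instead:

```verilog
// Illustrative sketch: toggle-flag clock domain crossing.
// Assumption: src_valid pulses are spaced several clk_dst cycles apart,
// so hold_data is stable when the destination captures it.
module cdc_toggle #(
    parameter WIDTH = 8
) (
    input  wire             clk_src,
    input  wire             src_valid,   // 1-clock pulse in the source domain
    input  wire [WIDTH-1:0] src_data,
    input  wire             clk_dst,
    output reg  [WIDTH-1:0] dst_data,
    output reg              dst_rdy      // 1-clock pulse in the destination domain
);
    reg [WIDTH-1:0] hold_data;
    reg             toggle = 1'b0;

    // Source domain: hold the word and flip the flag once per new word
    always @(posedge clk_src) begin
        if (src_valid) begin
            hold_data <= src_data;
            toggle    <= ~toggle;
        end
    end

    // Destination domain: sync[0] and sync[1] synchronize the flag,
    // sync[2] is the delayed copy used to spot a change
    reg [2:0] sync = 3'b000;

    always @(posedge clk_dst) begin
        sync    <= {sync[1:0], toggle};
        dst_rdy <= sync[2] ^ sync[1];
        if (sync[2] ^ sync[1])
            dst_data <= hold_data;       // stable by now under the assumption above
    end
endmodule
```

Only the single toggle bit goes through the synchronizer; the data bus is sampled once the flag has safely arrived, which is why the source has to hold it still for a few destination clocks.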
 
The following users thanked this post: lawrence11

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #185 on: December 01, 2019, 01:45:01 am »
So these calculations are considered DSP and not flip flop or gates?

Sweet deal!
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #186 on: December 01, 2019, 01:56:46 am »
So these calculations are considered DSP and not flip flop or gates?

Sweet deal!

The 'dsp_input_data' would count as 8 logic cells as it is a register.  Likewise, 'dsp_input_data_rdy' would count as 1 since it is 1 single signal wire.

If your 'dsp_output' is 16 bits, then that counts as 16 logic cells, and 'dsp_output_rdy' still counts as 1 since it is a 1-wire signal cell.

This here ' (dsp_input_data * 10 / 3 + 100 + offset_reg_setting) * contrast_reg_setting; ' is a hefty calculation which is built out of gates and a dedicated multiplier or 2, but the final result will be the 16 main logic cells.  However, this may be too hefty for 150MHz operation.  You may need to break it into 2 or 3 steps.  With a 16-bit result, each step will eat another 16 logic cells, unless the compiler can find ways to achieve the same results while removing unnecessary logic cells, which happens all the time.  The numbers I am giving you are the worst-case scenario.

So, if your function is only 10 times more complicated than what I created above, your DSP will eat only around 170 logic cells, plus maybe a few dedicated 9x9 or 18x18 multipliers.  You still haven't scratched 10% of the smallest MAX10, and you have used 0% of its embedded memory.  If you exceed the dedicated multipliers of the MAX10, then those additional parallel multiplications will be achieved in gates.  Addition and subtraction are usually done in gates, unless tied to 1 multiplication where the addition may be folded into the dedicated multiplier logic, known as a MAD or multiply-add, since these are standard DSP functions used in digital filters.
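Here is one way the "break it into 2 or 3 steps" idea could look, as a sketch.  The module name, the 8-bit input widths and the intermediate register widths are all assumptions for illustration, sized so that nothing overflows with 8-bit inputs:

```verilog
// Illustrative sketch: the one-line calculation split into a 3-step pipe.
// Bit widths are assumptions chosen for 8-bit inputs.
module dsp_pipe (
    input  wire        clk150,
    input  wire [7:0]  dsp_input_data,
    input  wire        dsp_input_data_rdy,
    input  wire [7:0]  offset_reg_setting,
    input  wire [7:0]  contrast_reg_setting,
    output reg  [23:0] dsp_output,
    output reg         dsp_output_rdy
);
    reg [11:0] step1;        // data * 10, at most 255 * 10 = 2550
    reg [11:0] step2;        // step1 / 3 + 100 + offset, at most 850 + 100 + 255 = 1205
    reg        rdy1, rdy2;   // ready flag delayed alongside the data

    always @(posedge clk150) begin
        // Step 1: multiply by the constant
        step1 <= dsp_input_data * 4'd10;
        rdy1  <= dsp_input_data_rdy;

        // Step 2: constant divide and the two additions
        step2 <= step1 / 2'd3 + 7'd100 + offset_reg_setting;
        rdy2  <= rdy1;

        // Step 3: final multiply, at most 1205 * 255 = 307275, fits in 24 bits
        dsp_output     <= step2 * contrast_reg_setting;
        dsp_output_rdy <= rdy2;
    end
endmodule
```

The result is the same as the single-line version (integer division truncates in the same place), but each clock now only has to get through one multiply or one divide-and-add, at the price of 2 extra clocks of latency and the extra step registers.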

« Last Edit: December 01, 2019, 02:02:30 am by BrianHG »
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #187 on: December 01, 2019, 02:01:31 am »
It's not 16 bit, it's 2 bytes, separate bytes.

Should be good @ 150 MHz
« Last Edit: December 01, 2019, 02:49:57 am by lawrence11 »
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #188 on: December 01, 2019, 02:23:18 am »
The bottom-end MAX10 has 16 18x18-bit multipliers.  So at 18 bits, if you do:

p3           <= p1  * p2 ;
p6           <= p4  * p5 ;
p9           <= p7  * p8 ;
p12          <= p10 * p11 ;
p13          <= p3  * p6 ;
p14          <= p9  * p12 ;
final_result <= p13 * p14 ;

You would eat 7 multipliers and 126 logic cells, and the clock delay from inputs p1,p2,p4,p5,p7,p8,p10,p11 to the output final_result would be a 3 clock pipe.  This code would be doing 1 billion multiplies a second with a 150MHz source clock.

I am not counting the source logic cells where p1,p2,p4,p5,p7,p8,p10,p11 are stored.  If they are all 8 bit, that would be 64 logic cells.

To think that a sub-$10 chip can do anything a billion times a second and still be relatively empty, with room to do quite a bit more, is kind of amazing, yet there is so much larger and faster out there.
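For completeness, a sketch of the multiplier tree above wrapped into a module.  The module name and the 8-bit input ports are assumptions, and every product is simply truncated to 18 bits so that the register count stays at the 7 x 18 = 126 quoted above:

```verilog
// Illustrative sketch: 7 multiplies arranged as a 3-clock pipeline.
// Products are truncated to 18 bits to keep every stage at 18 registers.
module mult_tree (
    input  wire        clk,
    input  wire [7:0]  p1, p2, p4, p5, p7, p8, p10, p11,
    output reg  [17:0] final_result
);
    reg [17:0] p3, p6, p9, p12;   // clock 1: four products
    reg [17:0] p13, p14;          // clock 2: two products

    always @(posedge clk) begin
        p3  <= p1  * p2;
        p6  <= p4  * p5;
        p9  <= p7  * p8;
        p12 <= p10 * p11;

        p13 <= p3 * p6;
        p14 <= p9 * p12;

        final_result <= p13 * p14;   // clock 3: last product
    end
endmodule
```

All 7 multiplies happen on every clock, each working on a different set of inputs moving through the pipe, so 7 x 150 million is where the roughly 1 billion multiplies a second comes from.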
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2778
  • Country: ca
Re: My first FPGA code
« Reply #189 on: December 01, 2019, 03:16:12 am »
A lot. The tools stop optimizing as soon as the timing is met. If you want higher speed, the tools may be able to make it, but at the cost of extra optimization, which may take hours. There's no way to find out aside from actually trying.
And to me that behavior makes perfect sense, especially if you consider that they have to support parts with an insane amount of resources, so finding the absolute best solution can take a ridiculously long time, and in most cases when you have a well-constrained design, you don't really need the absolute best.

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #190 on: December 01, 2019, 03:30:18 am »
A lot. The tools stop optimizing as soon as the timing is met. If you want higher speed, the tools may be able to make it, but at the cost of extra optimization, which may take hours. There's no way to find out aside from actually trying.
And to me that behavior makes perfect sense, especially if you consider that they have to support parts with an insane amount of resources, so finding the absolute best solution can take a ridiculously long time, and in most cases when you have a well-constrained design, you don't really need the absolute best.
Are you sure that's true?  I have an old design which takes 90% of an EP3C40, and to meet my designated FMAX it took 7 hours to 'fit' (compile was only around 10 minutes).  Now, compiling that design into an EP3C80, at 45% full, took around 30 minutes to 'fit' while reaching my designated FMAX.  When an FPGA is almost empty, the fitter can always take the most optimum route, including duplicating registers at different locations to achieve the final FMAX, i.e. a straight line.  There is no thinking to be done.  Insane resources means easy routing in a straight line.  On the other hand, when the FPGA is full and lacks resources, you don't have the luxury of routing in these straight lines.  Your fitter must work 100-fold harder to both achieve a fit and push things around that maze of FPGA fabric to get the timing as good as possible.
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3238
  • Country: ca
Re: My first FPGA code
« Reply #191 on: December 01, 2019, 03:43:06 am »
All logic blocks do the exact same frikkin thing, its just a different pin, and a bit slower.

Re-using processing logic only makes sense if it's massive. Otherwise you risk spending more resources on re-using than you gain.

Look at the docs for your FPGA and see what it can offer for clock muxing. If it has clock muxes, see if you can use them.

If you cannot mux, there are things such as run-time reconfiguration, where you reconfigure parts of the FPGA on the fly, but these techniques are complex and slow to switch.

Most likely you can mux clocks with regular LUTs, but this will introduce really big jitter, and may also introduce glitches. It may work at 150 MHz though. Just try it and see if it meets timing.

You can also detect the transitions of your clocks using higher frequency sampling. However, for 150 MHz, you will need really high frequency. For example, Xilinx 7-series have oversample mode - you can sample at x4 the clock, which is about 2.4 GHz for its maximum 600 MHz clock frequency. This is enough to detect transitions in your 150 MHz clock.

The straightforward way is to simply sample the signals separately, then cross the clock domain to your global clock (at around 200 MHz) and then use mux to select the sample you want. This way you can re-use your processing chain.
« Last Edit: December 01, 2019, 04:14:09 am by NorthGuy »
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3238
  • Country: ca
Re: My first FPGA code
« Reply #192 on: December 01, 2019, 04:12:29 am »
Are you sure that's true?

P&R is a very complex task. The lack of resources is one of the factors. The speed is another.

For 50 MHz you can place your elements practically anywhere. If you want to run it at 500 MHz, you must place all destinations within a few connection boxes of their sources, which creates placement problems and local congestion. If you have a big design, it's much easier to find an acceptable solution than the solution which can run at the highest speed. So, why bother with high speed if you don't need it anyway? That's what Vivado does and that's why it doesn't tell you the highest achievable speed.

Sometimes slack numbers will give you the idea of the speed you can achieve. Other times they will be way off.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: My first FPGA code
« Reply #193 on: December 01, 2019, 03:16:36 pm »
Maybe I have missed it in Vivado but ISE used to point out the longest logic chain.  You would swear the logic wasn't very complex yet there were 19 levels of logic between clocks.  It was embarrassing!
If you drilled down far enough, you could see the levels in the RTL Schematic.

For my toy projects timing hasn't been an issue.  But this may be another number that is worth tracking.
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #194 on: December 03, 2019, 08:34:45 pm »
So, how is a main system clock supposed to be generated, am I doing this right?

To sample a 150 MHz clock I need something 3x, so 450 MHz

I was just wondering what the instantiation was for, I thought instantiations were for testbenches?

ALTCLKCTRL gave me 2 files that I can play with, the rest to me is all to be ignored and chinese.

So anything that has a relation with the switchover should be found in ALTPLL, not ALTCLKCTRL, correct?



« Last Edit: December 03, 2019, 08:46:41 pm by lawrence11 »
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #195 on: December 03, 2019, 09:47:12 pm »
So, how is a main system clock supposed to be generated, am I doing this right?

To sample a 150 MHz clock I need something 3x, so 450 MHz

I was just wondering what the instantiation was for, I thought instantiations were for testbenches?

ALTCLKCTRL gave me 2 files that I can play with, the rest to me is all to be ignored and chinese.

So anything that has a relation with the switchover should be found in ALTPLL, not ALTCLKCTRL, correct?

To sample 150MHz, you need a 150MHz clock.  Unless there is something special about your sampler's data bus.

Is the MAX10 'PLL' providing a clock for your sampler?
Or does the sampler generate its own clock?
How many bits is the bus?

 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9929
  • Country: us
Re: My first FPGA code
« Reply #196 on: December 03, 2019, 10:07:37 pm »
Earlier I mentioned that you can't find an edge by using if pos_edge(mysignal) inside an always @ (posedge clk) because the clk edge won't coincide with the transition on mysignal.  What you do is keep a delayed copy of mysignal (what it was before the current clk event) and use a bit of logic to see that there is a difference between the current value and the previous value.

Like this:

https://www.chipverify.com/verilog/verilog-positive-edge-detector
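A minimal sketch of that delayed-copy idea (my own illustration, not the code from the linked page; the module and port names are made up).  It adds two synchronizing registers in front because mysignal is assumed to be asynchronous to clk, and it assumes clk is comfortably faster than the signal being watched:

```verilog
// Illustrative sketch: edge detection by comparing a signal with its
// delayed copy, with 2 synchronizer registers on the async input.
module edge_detect (
    input  wire clk,
    input  wire mysignal,   // assumed asynchronous to clk
    output reg  rise,       // 1-clock pulse after a low-to-high transition
    output reg  fall        // 1-clock pulse after a high-to-low transition
);
    // sync[0], sync[1]: synchronizer; sync[2]: previous value
    reg [2:0] sync = 3'b000;

    always @(posedge clk) begin
        sync <= {sync[1:0], mysignal};
        rise <=  sync[1] & ~sync[2];   // now high, was low
        fall <= ~sync[1] &  sync[2];   // now low, was high
    end
endmodule
```

Everything is clocked by clk only; mysignal is treated as plain data, never as a clock.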
« Last Edit: December 03, 2019, 10:14:50 pm by rstofer »
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #197 on: December 03, 2019, 10:25:11 pm »
Ok then, I will pick that route, 150 MHz sample @ 150 MHz clock, source synchronous mode.

This not being the normal mode, but source synchronous mode.

Which will have more delay but be more power efficient.

I will have to get familiar with a clock switchover, now that I am more familiar with this GUI.

Still, it lacks videos that show these file organizations, like STM32, nice videos everywhere.

They should show: here's how a sim/synthesis workflow works with whatever library GUI we offer.



« Last Edit: December 03, 2019, 10:28:45 pm by lawrence11 »
 

Offline BrianHG

  • Super Contributor
  • ***
  • Posts: 8027
  • Country: ca
Re: My first FPGA code
« Reply #198 on: December 03, 2019, 10:27:44 pm »
Earlier I mentioned that you can't find an edge by using if pos_edge(mysignal) inside an always @ (posedge clk) because the clk edge won't coincide with the transition on mysignal.  What you do is keep a delayed copy of mysignal (what it was before the current clk event) and use a bit of logic to see that there is a difference between the current value and the previous value.

Like this:

https://www.chipverify.com/verilog/verilog-positive-edge-detector
This won't work, for example, if his core clock is 150MHz and the sampler's clock is 148.5MHz.  I have a nice small clock domain code which works great for slight clock inconsistencies.

For sampling his SPI clock, the 1-delay system you mentioned is exactly the way he can extract the SPI SCLK transition and perform a bit transaction at every low-high and high-low transition.

It is exactly how my RS232 Sync UART example detects the start bit and aligns its internal period counter.
 

Offline lawrence11Topic starter

  • Frequent Contributor
  • **
  • !
  • Posts: 322
  • Country: ca
Re: My first FPGA code
« Reply #199 on: December 03, 2019, 10:37:35 pm »
It's like the Nandland video.

You choose your gears.

The MAX10 has a max gear of 450 MHz, which is the result of a PLL driven by a precision crystal.

But you can set up the stuff to work with other, slower gears. Which I will do, I see now...= determinism, less power, connect the chain to the chosen clock.

But if you want the absolute fastest propagation delay, you can run as fast as possible, 450 MHz. Connect the chain to the PLLed system clock.

At the highest speed, the FPGA will sample once the reference is high, process @ 450 MHz.

Do I get it guys?
« Last Edit: December 03, 2019, 10:42:53 pm by lawrence11 »
 

