How many of you feel this way?
Or, do you go to FPGA just because you have no other choice and would prefer a simple MCU only solution?
I use Verilog to make my life soooo much easier, especially if you use a simple single synchronous clock for everything, nothing asynchronous. Coding this way makes for very portable designs across all FPGAs and PLDs. As for the above VHDL example, it twists my head; I avoid it at all costs and won't ever use it.
You must never forget that you are describing hardware.
<= means 'is connected to', not 'becomes equal to'.
if rising_edge(clk) then
  a <= b;
end if;

i_counter: counter port map (
  clk => sys_clk,
  count => cycle_count);
my_logic_gate: d_type PORT MAP (
  d_in => my_data,
  q_out => my_output,
  clk => master_clock
);

PROCESS (clk)
-- exchange the values of a and b on every clock edge
BEGIN
  IF clk'event AND clk = '1' THEN
    a <= b;
    b <= a;
  END IF;
END PROCESS;

PROCESS (a)
BEGIN
  b <= NOT a;
END PROCESS;
What does 'signal' mean in this context? Why is there no mode? What does ':=' mean? Why do we need 'begin'? nq is less than or equal to nq0?
I hate this, when tutorials are written by people a little too familiar with the subject matter, and they begin with material that should have been on about page 5, leaving out the important introduction to the subject (definitions, context, general explanation of what the heck is going on) which should have filled pages 1 to 4.
A "signal" is ...
I don't know what you mean by "mode" in this context.
":=" is a ...
"<=" does indeed mean ....
"Begin" just means "by this point, we've declared all the signals we're going to use... now here's the logic which defines their behaviour". It's just semantics. Some things must go before the 'begin', and some after. Don't read too much into it, just copy an example and structure your code the same way.
PROCESS (clk)
-- exchange the values of a and b on every clock edge
BEGIN
  IF clk'event AND clk = '1' THEN -- A signal which changes in this block is going to be a flip-flop clocked by clk
    a <= b; -- connect the output of flip-flop b to the input of flip-flop a
    b <= a; -- connect the output of flip-flop a to the input of flip-flop b
  END IF;
END PROCESS;

PROCESS (a)
BEGIN
  b <= NOT a; -- build an inverter. Connect its input to a and output to b.
END PROCESS;
Now I feel like everything I've learned about HDL is wrong.
What is wrong with seeing <= and := as assignment operators? Just like in C the = assigns the value from what is on the right to what is on the left. In VHDL <= and := assign what is on the right to what is on the left so there really isn't any difference.
Verilog has the '=' symbol for 'blocking' assignment and '<=' for 'non-blocking' assignments (whatever that may mean).
I think Bruce's questions were meant to be rhetorical.
A 'blocking' assignment blocks anything else from happening (simultaneously) in the same code block while the assignment is happening, a 'non-blocking' one doesn't.
So, if we start off with three registers and their initial values A=1, B=2 and C=3.
If we execute the following sequence of blocking assignments:
begin
B = A;
C = B;
end
we get the result A=1, B=1, C=1. That is, the first statement executed in its entirety before the second; each blocking assignment is 'executed' in sequence. Now let's do the same thing with non-blocking assignments, and the same initial values as before:
begin
B <= A;
C <= B;
end
This time the result is A=1, B=1, C=2. The values for the right hand sides were taken as we 'passed' 'begin', all the assignments happened simultaneously, and they all finished at the same time, just as we reached 'end'.
That's slightly simplistic and probably wouldn't satisfy a language lawyer, but it gives the essential flavour of what's going on.
The blocking assignment is useful in writing test beds and the like but dangerous, and usually wrong, in writing code that you actually expect to be implemented in hardware. You can fake up quite a complex signal for a test bed by combining blocking assignments with delays, but that kind of usage is not synthesizable and so will never make it to real hardware.
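For anyone who finds this easier to read as ordinary software, here is a rough Python model of the two behaviours described above. It is only a sketch of the idea, not how a Verilog simulator actually schedules events, and the function names run_blocking and run_nonblocking are made up for illustration.

```python
def run_blocking(regs):
    """Model 'B = A; C = B;': each assignment completes before the next starts."""
    regs = dict(regs)
    regs["B"] = regs["A"]
    regs["C"] = regs["B"]      # sees the NEW value of B
    return regs


def run_nonblocking(regs):
    """Model 'B <= A; C <= B;': sample every right-hand side, then update."""
    old = dict(regs)           # values as they were when we 'passed' begin
    new = dict(regs)
    new["B"] = old["A"]
    new["C"] = old["B"]        # sees the OLD value of B
    return new


start = {"A": 1, "B": 2, "C": 3}
print(run_blocking(start))     # {'A': 1, 'B': 1, 'C': 1}
print(run_nonblocking(start))  # {'A': 1, 'B': 1, 'C': 2}
```

The only difference between the two functions is whether the right-hand sides read from the live values or from a snapshot taken at the start of the block.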
ARCHITECTURE ... OF ... IS
  SIGNAL a, b, c : STD_LOGIC;
BEGIN
  a <= b AND c;
END ARCHITECTURE;

ARCHITECTURE ... OF ... IS
  SIGNAL a, b, c : STD_LOGIC;
BEGIN
  PROCESS(clk)
  BEGIN
    IF rising_edge(clk) THEN
      a <= b AND c;
    END IF;
  END PROCESS;
END ARCHITECTURE;
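To make the difference between those two architectures concrete, here is a small Python behavioural model. It is a sketch under assumptions: the class names are invented, signals are modelled as plain 0/1 integers, and the register is assumed to start at 0 (real STD_LOGIC would start at 'U').

```python
class CombinationalAnd:
    """Models 'a <= b AND c;' outside a process: a follows b and c at once."""

    def output(self, b, c):
        return b & c


class RegisteredAnd:
    """Models the clocked process: a only changes on a rising edge of clk."""

    def __init__(self):
        self.a = 0              # assumed initial value, see note above
        self._prev_clk = 0

    def step(self, clk, b, c):
        if self._prev_clk == 0 and clk == 1:   # rising_edge(clk)
            self.a = b & c
        self._prev_clk = clk
        return self.a


comb = CombinationalAnd()
reg = RegisteredAnd()
print(comb.output(1, 1))   # 1: changes as soon as the inputs do
print(reg.step(0, 1, 1))   # 0: no clock edge yet, a keeps its old value
print(reg.step(1, 1, 1))   # 1: the rising edge captures b AND c
print(reg.step(1, 0, 1))   # 1: inputs changed, but there was no new edge
```

In hardware terms the first is just an AND gate, and the second is the same AND gate feeding the D input of a flip-flop.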
For all the Python haters, yes... you can design hardware with Python - http://www.myhdl.org
As if VHDL and Verilog weren't confusing enough, now we have another HDL to learn.
What advantages does MyHDL have over the other two?
All these High Level Synthesis (HLS) HDLs seem to have common threads to address these (and other) problems:
Actually you must forget about the hardware otherwise you'll be writing way too much code.
// copy them all across
a_out <= a_in; b_out <= b_in; c_out <= c_in; d_out <= d_in;
// bubble sort them
if(a_out > b_out) swap(a_out, b_out);
if(b_out > c_out) swap(b_out, c_out);
if(c_out > d_out) swap(c_out, d_out);
if(a_out > b_out) swap(a_out, b_out);
if(b_out > c_out) swap(b_out, c_out);
if(a_out > b_out) swap(a_out, b_out);
// Should now be in order
(I might have 20% more code / cycles than needed)
Why should you suddenly optimise all facets of an FPGA design if you have lots of gates and lots of speed?
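On the "20% more code" remark: a quick way to check is to model the compare/swap sequence in software and try it on every ordering of four distinct values. The Python sketch below does that for the six bubble-sort steps listed above and for a commonly cited five-step sorting network for four inputs. The names BUBBLE, NETWORK5 and sort4 are made up for this example, and it says nothing about how either version would synthesise.

```python
from itertools import permutations

# Compare/swap pairs from the bubble-sort version above (6 steps).
BUBBLE = [(0, 1), (1, 2), (2, 3), (0, 1), (1, 2), (0, 1)]
# A well-known 5-step sorting network for four inputs.
NETWORK5 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def sort4(values, steps):
    """Apply a fixed sequence of compare/swap steps to four values."""
    v = list(values)
    for i, j in steps:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v


for steps in (BUBBLE, NETWORK5):
    ok = all(sort4(p, steps) == [0, 1, 2, 3] for p in permutations(range(4)))
    print(len(steps), "compare/swaps, sorts every ordering:", ok)
```

Both sequences sort all 24 orderings, so six steps against five does come out at 20% more.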
Thought experiment - A design requires a module that takes a clk signal and four 8-bit numbers (a_in, b_in, c_in, d_in) and sorts them low to high, to generate four outputs (a_out, b_out, c_out and d_out). Design it with a software mindset, and then a H/W mindset.
I don't think you gain a lot in terms of efficiency, but I would argue it is easier to design with a hardware mindset. You can synthesise your "software mindset" design and see how many resources it uses. You can then compare that to what you would get with a "hardware mindset".
Assuming Xilinx 7-series 6-input LUTs, you would need:
- 6 modules to do 6 comparisons - 4 LUTs each = 24 LUTs. It'll take 2 layers of combinatorial logic. You'll get 6 outputs from this representing the results of the comparisons
- For each 8-bit output - a 6 x 2 table which converts the 6 outputs from the previous layer into a 2-bit index. The 2-bit index will select which input you want to multiplex to the given output. 2 LUTs each = 8 LUTs. One layer of combinatorial logic.
- For each bit of the outputs (32 bits total) a mux which uses the 2-bit index from the previous layer to select one of the 4 inputs. 1 LUT each = 32 LUTs. One layer of combinatorial logic.
Bottom line:
24 + 8 + 32 = 64 LUTs = 16 slices.
2 + 1 + 1 = 4 layers of combinatorial logic, roughly 0.7 ns each (including intra-layer routing) = 2.8 ns. I'd expect it would run fine with a 4 ns clock period - 250 MHz.
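The three layers above can be sketched in Python to check that six comparisons really are enough to steer the muxes. This is only a behavioural model with invented names. It computes, for each input, which output it belongs to, which is the inverse of the per-output index described above and should carry the same information. Ties are broken in favour of the lower-numbered input, which is an assumption the description above does not spell out.

```python
def sort4_rank_mux(x):
    """Sort four values using 6 comparisons, a rank lookup and 4-way muxes."""
    n = 4
    # Layer 1: the six pairwise comparisons, gt[(i, j)] = x[i] > x[j], i < j.
    gt = {(i, j): x[i] > x[j] for i in range(n) for j in range(i + 1, n)}

    # Layer 2: from the comparison bits alone, work out where each input goes.
    # rank[i] counts the inputs that must appear before input i.
    rank = []
    for i in range(n):
        before = sum(not gt[(j, i)] for j in range(i))       # earlier inputs <= x[i]
        before += sum(gt[(i, j)] for j in range(i + 1, n))   # later inputs < x[i]
        rank.append(before)

    # Layer 3: route each input to the output selected by its 2-bit rank.
    out = [None] * n
    for i in range(n):
        out[rank[i]] = x[i]
    return out


print(sort4_rank_mux([200, 3, 77, 3]))   # [3, 3, 77, 200]
```

Nothing in layers 2 and 3 looks at the data values themselves, only at the six comparison bits, which is why those layers stay so small in LUT terms.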
array items = [a_in, b_in, c_in, d_in];
sort(items);
a_out = items[0];
b_out = items[1];
c_out = items[2];
d_out = items[3];
I asked a software friend how they would do it. First reply was to put an "ORDERED BY" clause on the SQL query used to get the items.
That's scary on so many levels.
In an FPGA, I'd do it one of two ways depending on the required clock speed and latency.
To do it in a single cycle, I'd make use of VHDL variables, and translate your 'software mindset' example more or less directly.
If that method ended up too slow to meet the required fmax, then it would need to be pipelined. On the first clock, perform three of the compare/swap operations, store the intermediate results in internal registers, and set a flag. Then, on the second, perform the other three compare/swaps, assign the final result to the outputs, and clear the flag again.
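A behavioural sketch of that two-clock version, in Python rather than VHDL so it can be run anywhere. The class name TwoCycleSorter, the attribute names, and the choice of which three compare/swaps go in each half are assumptions for illustration; the split simply follows the bubble-sort order used earlier in the thread.

```python
def compare_swap(v, i, j):
    """Swap v[i] and v[j] in place if they are out of order."""
    if v[i] > v[j]:
        v[i], v[j] = v[j], v[i]


class TwoCycleSorter:
    def __init__(self):
        self.stage = [0, 0, 0, 0]   # internal registers, half-sorted values
        self.busy = False           # the flag: True while the second half is pending
        self.out = [0, 0, 0, 0]     # output registers

    def clock(self, inputs):
        """One rising clock edge. Inputs are only sampled when the flag is clear."""
        if not self.busy:
            v = list(inputs)
            for i, j in ((0, 1), (1, 2), (2, 3)):    # first three compare/swaps
                compare_swap(v, i, j)
            self.stage = v
            self.busy = True
        else:
            v = list(self.stage)
            for i, j in ((0, 1), (1, 2), (0, 1)):    # remaining three
                compare_swap(v, i, j)
            self.out = v
            self.busy = False
        return self.out


s = TwoCycleSorter()
s.clock([9, 4, 7, 1])          # first clock: half sorted, flag set
print(s.clock([9, 4, 7, 1]))   # second clock: [1, 4, 7, 9]
```

As written this accepts a new set of inputs only every other clock. A fully pipelined version would drop the flag and run both halves on every edge, at the cost of a second set of registers that is always in use.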