Why does my macrocell count increase?

rea5245:
Hi,

I'm trying to cram a 16 bit adder/subtracter into an ATF1508 CPLD. I'm doing something that I expected would reduce the complexity of my design, but it ends up increasing it.

I'm using Quartus II 13.0.1, set to the compatible EPM7128 device. This is the first time I've ever used Verilog or a CPLD, so there's a good chance I'm messing up something.

I have the following code:

--- Code: ---// 16 bit adder/subtracter
// Opcode 0 produces A + B
// Opcode 1 produces A - B
// Opcode 2 produces A + 1
// Opcode 3 produces -B

module ALU
#(parameter WIDTH = 16)
(output [WIDTH - 1:0] result, output c, output v, output n,
input [3 : 0] opcode, input [WIDTH - 1:0] A, input [WIDTH - 1:0] B);

// Calculate -B
wire [WIDTH - 1 : 0] negb = ~B + 16'd1;

// Choose what we'll feed into the adder. The first argument will always be A.
// The second argument could be -B, 1, or B
wire isSub = (opcode == 1);
wire isIncr = (opcode == 2);
wire [WIDTH - 1 : 0] addend2 = isSub ? negb : (isIncr ? 16'd1 : B);

// If the opcode is 3, produce -B. Otherwise, produce A + whatever we selected above
wire carry;
wire [WIDTH - 1 : 0] iresult;
assign {carry, iresult} = (opcode == 3) ?  negb : A + addend2;

// Set the outputs
assign result = iresult;

assign v = ~((A[WIDTH - 1] ^ addend2[WIDTH - 1])) & (A[WIDTH - 1] ^ iresult[WIDTH - 1]);
assign c = carry;
assign n = result[WIDTH - 1];

endmodule

--- End code ---

When I compile it, it uses 101 macrocells (out of 128 available).

I hoped that if I reduced the opcode from 4 bits to 2, I might save some macrocells - enough to add a little more functionality (like calculating a Z flag). So I changed "input [3 : 0] opcode" to "input [1 : 0] opcode" and recompiled. I was rewarded with 148 "Can't place node" errors.

Why would that happen?

Thank you,
Bob

BrianHG:
Did you move around the IOs?

A design like this should use a very small number of macrocells, though I do not know the product-term size of the EPM7128, as it's been decades since I used them.

Have you tried clocking the design?  Remember to use the global clock input pin...

Yes, your code can be written in a cleaner way and clocking the design does improve output switching timing and stability.

Maybe cleaning your code might help Quartus determine the minimum LC usage design.

ale500:
What if you use combined add/sub logic per bit, à la full adder/subtractor? To get negated B you are using one 16-bit full adder...
85 Macrocells using combined full adder/subtractors :).
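
A rough sketch of one common way to share a single adder between add and subtract is shown here. It is not necessarily the design that produced the 85-macrocell figure above; the module name addsub and the signals sub and b_mux are hypothetical, chosen only for illustration.

```verilog
// Sketch only: 'addsub', 'sub' and 'b_mux' are hypothetical names, not from the thread.
module addsub
#(parameter WIDTH = 16)
(output [WIDTH - 1:0] result, output c,
input sub, input [WIDTH - 1:0] A, input [WIDTH - 1:0] B);

// XOR every bit of B with 'sub': B passes through when sub = 0 and is inverted when sub = 1.
wire [WIDTH - 1 : 0] b_mux = B ^ {WIDTH{sub}};

// Using 'sub' as the carry-in supplies the "+ 1" of the two's complement,
// so A - B is computed as A + ~B + 1 on the same adder that computes A + B.
assign {c, result} = A + b_mux + sub;

endmodule
```

With this arrangement, c is the raw carry-out of the adder, so for a subtraction it is the inverse of a borrow. Whether the fitter actually uses fewer macrocells for this form would need to be checked by compiling it.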

rea5245:

--- Quote from: BrianHG on September 28, 2022, 03:29:36 am ---Have you tried clocking the design?  Remember to use the global clock input pin...

--- End quote ---

I don't understand the benefit of clocking it.

The classic 74181 ALU is not clocked.

If I wanted to store the result in a latch, I'd certainly need a clock. But since I can't even fit the adder/subtractor in the CPLD, I figured I would have a separate latch chip. Of course, that would be clocked.

In what way could my design be cleaner? As I said, this is my first attempt at Verilog.

--- Quote from: BrianHG on September 28, 2022, 03:29:36 am ---A design like this should use a very small amount of macrocells

--- End quote ---

Should it? Remember, it's 16 bits wide.

BrianHG:

--- Quote from: rea5245 on September 28, 2022, 12:55:12 pm ---In what way could my design be cleaner? As I said, this is my first attempt at Verilog.

--- End quote ---

This is a characteristic of CPLDs and FPGAs, not of Verilog programming.
Clocking the design means you feed, for example, a 50MHz clock into it. At the most basic level, this means that once you modify your inputs, all the output bits change to their new results at the next clock edge, usually all together in a crisp, parallel manner.

In a non-clocked design, the signals from the combination of logic gates that generate your function ripple through the CPLD's internal fuse wiring until a stable result is reached, just like an ALU wired up from 74LS/HC gates. However, since you only want to replicate a 74LS IC, the CPLD should still run circles around it.
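
To make the difference concrete, here is a minimal sketch of a registered adder. The module name and the clk, sum, result_q and c_q signals are hypothetical and only illustrate the idea; it is not the poster's full ALU.

```verilog
// Sketch only: 'ALU_registered', 'clk', 'sum', 'result_q' and 'c_q' are hypothetical names.
module ALU_registered
#(parameter WIDTH = 16)
(output reg [WIDTH - 1:0] result_q, output reg c_q,
input clk, input [WIDTH - 1:0] A, input [WIDTH - 1:0] B);

// Combinational sum. This part still ripples internally until it settles.
wire [WIDTH : 0] sum = A + B;

// The outputs are captured on the rising clock edge, so all bits change together.
always @(posedge clk) begin
result_q <= sum[WIDTH - 1:0];
c_q      <= sum[WIDTH];
end

endmodule
```

The inputs have to be stable long enough before each clock edge for the sum to settle, otherwise the registers capture a partly rippled value.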

Ok, for cleaning your design, let's take a look at this:

Beginning:

--- Code: ---module ALU
// Opcode 0 produces A + B
// Opcode 1 produces A - B
// Opcode 2 produces A + 1
// Opcode 3 produces -B

#(parameter WIDTH = 16)
(output reg [WIDTH - 1:0] result, output reg c, output reg v, output reg n, // I had to add the 'reg' so that the always @(*) begin will function and also allow clocking in the future if you so desire.
input [3 : 0] opcode, input [WIDTH - 1:0] A, input [WIDTH - 1:0] B);

--- End code ---

Good enough.

Next, let's create wires containing the results of your 4 opcodes:

--- Code: ---wire [WIDTH:0] i_add  = A + B ;  // We are adding the extra 1 bit at the top to keep track of the carry/borrow flag.
wire [WIDTH:0] i_sub  = A - B ;
wire [WIDTH:0] i_inc  = A + 1 ;
wire [WIDTH:0] i_negb = 0 - B ;  // I wired it like this to hopefully help direct the compiler to a simplification of the inputs.

--- End code ---

Now, let's set the outputs:

--- Code: ---always @(*) begin  // If we ever want to clock this design, the ' @(*) ' here would change to ' @(posedge clk_in) '

case (opcode[1:0]) // You only wanted to use the 2 bottom bits of the opcode.

2'd0 : begin // A+B
result  <= i_add [WIDTH - 1:0] ; // Trim the i_add integer bits down to the 'WIDTH' to match 'result's width
c       <= i_add [WIDTH] ; // I'm assuming ' c ' is the carry / borrow flag.
v       <= 1'b0 ; // What does 'v' equal?
n       <= 1'b0 ; // is n a negative flag?  Are you using signed numbers?  This would change things.
end

2'd1 : begin // A-B
result  <= i_sub [WIDTH - 1:0] ; // Trim the i_sub integer bits down to the 'WIDTH' to match 'result's width
c       <= i_sub [WIDTH] ; // I'm assuming ' c ' is the carry / borrow flag.
v       <= 1'b0    ; // What does 'v' equal?
n       <= (B > A) ; // is n a negative flag?  Are you using signed numbers?
end

2'd2 : begin // A+1
result  <= i_inc [WIDTH - 1:0] ; // Trim the i_inc integer bits down to the 'WIDTH' to match 'result's width
c       <= i_inc [WIDTH] ; // I'm assuming ' c ' is the carry / borrow flag.
v       <= 1'b0 ; // What does 'v' equal?
n       <= 1'b0 ; // is n a negative flag?  Are you using signed numbers?
end

2'd3 : begin // 0-B
result  <= i_negb[WIDTH - 1:0] ; // Trim the i_negb integer bits down to the 'WIDTH' to match 'result's width
c       <= i_negb[WIDTH]       ; // I'm assuming ' c ' is the carry / borrow flag.
v       <= 1'b0 ; // What does 'v' equal?
n       <= 1'b1 ; // is n a negative flag?  Are you using signed numbers?
end
endcase

end // always @(*)
endmodule

--- End code ---

This should make a little more sense and allow you to add opcodes if you wish.
Though, what does the 'v' stand for?  I have not computed the 'v' in my code.  It is just always set to 0.
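
Judging from the expression in the original post, 'v' looks like a signed (two's complement) overflow flag. If that is the intent, a minimal sketch could look like the following. The module name overflow_flag and the is_sub input are hypothetical and are not part of the attached project.

```verilog
// Sketch only: 'overflow_flag', 'is_sub', 'sa', 'sb' and 'sr' are hypothetical names.
// Assumes 'result' is A + B when is_sub = 0 and A - B when is_sub = 1.
module overflow_flag
#(parameter WIDTH = 16)
(output v,
input is_sub, input [WIDTH - 1:0] A, input [WIDTH - 1:0] B, input [WIDTH - 1:0] result);

wire sa = A[WIDTH - 1];            // sign of the first operand
wire sb = B[WIDTH - 1] ^ is_sub;   // effective sign of the second operand (inverted for A - B)
wire sr = result[WIDTH - 1];       // sign of the result

// Overflow: both operands have the same effective sign, but the result's sign differs from it.
assign v = ~(sa ^ sb) & (sa ^ sr);

endmodule
```

For example, with WIDTH = 16, adding 16'h7FFF and 16'h0001 gives 16'h8000: both operands are positive but the result's sign bit is set, so v is 1.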

This code took too many macrocells with a 'speed' optimized compile.
Setting the compiler optimization to 'Area' allowed this code to fit with 104 of 128 macrocells.

I've attached a Quartus 13.0sp1 full project using an EPM7128 which seems to compile and fit this code.