Author Topic: Help! Maybe an interesting problem for Boolean logic fans (Read 2178 times)

hamster_nz · « **on:** January 07, 2019, 08:03:21 am »

Help!

I'm doing a little bit of Meta-programming - in this case it is a program that generates the HDL code for a RISC-V ISA instruction decoder.

The input file is the bit-level encoding of the ISA. The bit pattern for the instruction is specified, and then followed with the list of output flags that should be set:

Code: [Select]

-----------------000-----0000011 = instr_valid, type_load, flag_load_byte                   # Load byte unsigned
-----------------001-----0000011 = instr_valid, type_load, flag_load_half                   # Load half-word unsigned
-----------------010-----0000011 = instr_valid, type_load, flag_load_word                   # Load word
-----------------100-----0000011 = instr_valid, type_load, flag_load_byte, flag_sign_extend # Load byte signed
-----------------101-----0000011 = instr_valid, type_load, flag_load_half, flag_sign_extend # Load half-word signed

The code then generates the logic requried for each flag ( the '-' are don't care, and the '+' are logical OR operations):

Code: [Select]

  type_load            ------------------0------0000011
                     + -----------------010-----0000011

  flag_load_byte       ------------------00-----0000011

  flag_load_half       ------------------01-----0000011

  flag_load_word       -----------------010-----0000011

  flag_sign_extend     -----------------10------0000011

Which it then also expresses as HDL. e.g:

Code: [Select]

        flag_load_half <= '0';
        IF ((instruction(13 downto 12) = "01") AND (instruction(6 downto 0) = "0000011")) THEN
            flag_load_half <= '1';
        END IF;

The flags that control the details of the load operation will only only get acted on when the "type_load" is asserted, so they can be futher simplified, saving resources.

What I can't work out how to do is see how to get to this:

Code: [Select]

  type_load            ------------------0------0000011
                     + -----------------010-----0000011

  flag_load_byte       ------------------00------------

  flag_load_half       ------------------01------------

  flag_load_word       -----------------0--------------

  flag_sign_extend     -----------------1--------------

or the logically equivalent:

Code: [Select]

  type_load            ------------------0------0000011
                     + -----------------010-----0000011

  flag_load_byte       ------------------00------------

  flag_load_half       ------------------01------------

  flag_load_word       ------------------1-------------

  flag_sign_extend     -----------------1--------------

I can see how identifying the common, constant bits in type_load and removing them from the flags gets me most of way there:

Code: [Select]

  type_load            ------------------0------0000011
                     + -----------------010-----0000011

  flag_load_byte       ------------------00------------

  flag_load_half       ------------------01------------

  flag_load_word       -----------------010------------

  flag_sign_extend     -----------------10-------------

But I can't see how I programmatically figure out that the "0" in flag_sign_extend isn't needed, nor is the "10" bits (or maybe both '0's) in flag_load_word.

Please note that this is an extremely limited sample of the total ISA and problem. Some of the generated expressions have quite a few terms with the most complex of all being the "instruction_valid" flag:

Code: [Select]

   instr_valid          -----------------11------11-0011
                      + ------------------01-----11-0011
                      + -----------------01------1110011
                      + -------------------------1101111
                      + -----------------000-----1100-11
                      + -----------------100-----1100011
                      + 0000000-----------1------0110011
                      + 0000000-----------00-----0110011
                      + 0100000----------000-----0110011
                      + -----------------11------0010011
                      + -----------------100-----00-0011
                      + -----------------011-----0010011
                      + -----------------0-0-----00-0011
                      + 0100000----------101-----0-10011
                      + 0000000-----------01-----0-10011
                      + -----------------101-----0000011
                      + -----------------001-----0-00011
                      + -----------------0-0-----0100011
                      + 00000000000-00000000000001110011
                      + 00000000000000000001000000001111
                      + 0000--------00000000000000001111
                      + -------------------------0-10111

(yep, the generated HDL for that expression really sucks, but the synthesis tool should simplify it).

Any help or pointer to an algorithm would be most appreciated. Even a few keyword crumbs I can throw into Google would be great...

amyk · « **Reply #1 on:** January 07, 2019, 12:25:59 pm »

Start from here: https://en.wikipedia.org/wiki/Karnaugh_map

emece67 · « **Reply #2 on:** January 07, 2019, 01:01:40 pm »

hamster_nz · « **Reply #3 on:** January 08, 2019, 08:07:26 am »

Quote from: emece67 on January 07, 2019, 01:01:40 pm

If I understand your problem well, you are trying to minimize a set of boolean functions.

Karnaugh is feasible for single functions up to 4-5 variables, you have here more functions of more variables, so it may be not applicable. The other well know logic minimization algorithm, that of Quine-McCluskey, can be used to minimize a set of functions.

But the problem with the problem of minimization of boolean functions is that it is know to be, at least, NP-complete and conjectured to be EXPTIME, so don't rely on any algorithm to work well on it for a large set of functions on a large set of variables.

So maybe you are alone with some heuristics and relying on the synthesizer.

Good luck.

Found a workable solution. I attempt to remove each bit from the pattern one by one, and test if the end result is the same..

Input is now reduced nicely:

Code: [Select]

 Definitions:
  -----------------000-----0000011 = valid, type_load, flag_load_byte
  -----------------001-----0000011 = valid, type_load, flag_load_half
  -----------------010-----0000011 = valid, type_load, flag_load_word
  -----------------100-----0000011 = valid, type_load, flag_load_byte, flag_load_extend
  -----------------101-----0000011 = valid, type_load, flag_load_half, flag_load_extend

 Resulting expressions:
   flag_load_byte       ------------------00------------

   flag_load_extend     -----------------1--------------

   flag_load_half       -------------------1------------

   flag_load_word       ------------------1-------------

   type_load            ------------------0------0000011
                      + -----------------010-----0000011

   valid                ------------------0------0000011
                      + -----------------010-----0000011

And the generated VHDL:

Code: [Select]

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY decoder IS
    PORT (
        inst                 : in  std_logic_vector(31 downto 0);
        flag_load_byte       : out std_logic := '0';
        flag_load_extend     : out std_logic := '0';
        flag_load_half       : out std_logic := '0';
        flag_load_word       : out std_logic := '0';
        type_load            : out std_logic := '0';
        valid                : out std_logic := '0');
END ENTITY;

ARCHITECTURE auto_generated_code OF decoder IS
BEGIN

flags_decode: PROCESS(inst)
    BEGIN
        flag_load_byte <= '0';
        IF ((inst(13 DOWNTO 12) = "00")) THEN
            flag_load_byte <= '1';
        END IF;

        flag_load_extend <= '0';
        IF ((inst(14 DOWNTO 14) = "1")) THEN
            flag_load_extend <= '1';
        END IF;

        flag_load_half <= '0';
        IF ((inst(12 DOWNTO 12) = "1")) THEN
            flag_load_half <= '1';
        END IF;

        flag_load_word <= '0';
        IF ((inst(13 DOWNTO 13) = "1")) THEN
            flag_load_word <= '1';
        END IF;

        type_load <= '0';
        IF ((inst(13 DOWNTO 13) = "0") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            type_load <= '1';
        END IF;
        IF ((inst(14 DOWNTO 12) = "010") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            type_load <= '1';
        END IF;

        valid <= '0';
        IF ((inst(13 DOWNTO 13) = "0") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            valid <= '1';
        END IF;
        IF ((inst(14 DOWNTO 12) = "010") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            valid <= '1';
        END IF;

    END PROCESS;
END auto_generated_code;

If anybody is interested, code can be found at: https://github.com/hamsternz/risc-v_decode_generator - seems to save about about 20% of the logic resources.

But for me the big benefit is that if I want to decode the ISA into slightly different flags (e.g. perhaps I want to split out 'Shift' operations from the general 'ALU' operations, which looks to be a good idea, BTW) I rewrite the entire instruction decoder in a few seconds.

dgtl · « **Reply #4 on:** January 08, 2019, 08:17:17 am »

There are tools for logic minimization, ie espresso (https://en.wikipedia.org/wiki/Espresso_heuristic_logic_minimizer). But unless you intend to pipeline the stuff or otherwise break optimizations, the hdl tools will take care of minimizing logic between triggers.

Ian.M · « **Reply #5 on:** January 08, 2019, 08:45:49 am »

Next time you want to quickly simplify some combinatorial logic, +1 for throwing it at Espresso!

If you've got a windows PC, you may find Logic Friday useful. Its a GUI frontend for the Espresso logic minimizer The original website's gone to domain parking, but it can still be accessed, including downloading the installer, [here] at the Internet Archive.

It took about two minutes to enter the truthtable for the four flags that matter, with don't cares for all flags in the rows not present in the set of instruction words shown above (I hope I counted the bit positions correctly):

Code: [Select]

b14	b13	b12		Byte	Half	Word	SignExt
0	0	0		1	0	0	0
0	0	1		0	1	0	0
0	1	0		0	0	1	0
0	1	1		X	X	X	X
1	0	0		1	0	0	1
1	0	1		0	1	0	1
1	1	0		X	X	X	X
1	1	1		X	X	X	X

then hit Enter to submit to get the raw equations:

Code: [Select]

Entered by truthtable:
Byte = b14' b13' b12' + b14 b13' b12';
Half = b14' b13' b12 + b14 b13' b12;
Word = b14' b13 b12';
SignExt = b14 b13' b12' + b14 b13' b12;

then click the Minimize icon (options exact, jointly) to get:

Code: [Select]

Minimized:
Byte = b13' b12';
Half = b12;
Word = b13 ;
SignExt = b14 ;

If you are working with actual logic gates, it can even draw you a schematic using minimum gates, or assuming 'classic' logic, minimum packages.

Edit: I got the bit positions backwards. Now fixed

legacy · « **Reply #6 on:** January 08, 2019, 11:52:48 am »

is there any C source of the Berkley original program (for DOS)?
I'd like to write an overlay for Linux/UNIX

dgtl · « **Reply #7 on:** January 08, 2019, 11:56:35 am »

At the bottom of the wikipedia article, there's a link https://ptolemy.berkeley.edu/projects/embedded/pubs/downloads/espresso/index.htm

legacy · « **Reply #8 on:** January 08, 2019, 02:17:22 pm »

Found this, which looks a POSIX compliant version of the espresso logic minimization tool

To be tested

AndyC_772 · « **Reply #9 on:** January 08, 2019, 02:29:54 pm »

Quote from: dgtl on January 08, 2019, 08:17:17 am

But unless you intend to pipeline the stuff or otherwise break optimizations, the hdl tools will take care of minimizing logic between triggers.

^^ This!

If the end result you're looking to achieve is to generate a file which can then be synthesized and fitted into an FPGA, then you're wasting your time trying to make that file compact and simple.

During synthesis, the FPGA tool will parse your code and compile a set of look-up tables, describing the state of each output in terms of the inputs. How those tables are populated makes no difference; any code which is correct will result in the same LUT contents at this stage.

The contents of these LUTs are then mapped to the physical logic inside whatever FPGA you're using. How they were originally assigned the values they now contain has, by this time, been lost.

IMHO it would be entirely valid for your generated code to consist of a single large CASE structure, or something else which is verbose but easily read, understood, and observed to be correct. A little bit of intelligence might be needed to keep the file size sane, but that's the only reason to start including cleverness at this stage. It won't result in fewer FPGA resources being needed at the end of the process.

If you're intending to pipeline your logic, then that's a different issue, because you then need to determine what the intermediate steps should be. Even then, the logic to actually describe those steps can be as verbose as you like - it doesn't cost anything.

legacy · « **Reply #10 on:** January 08, 2019, 04:19:11 pm »

Quote from: AndyC_772 on January 08, 2019, 02:29:54 pm

If you're intending to pipeline your logic, then that's a different issue, because you then need to determine what the intermediate steps should be. Even then, the logic to actually describe those steps can be as verbose as you like - it doesn't cost anything.

I don't understand this "pipelining". What does exactly mean?

In a CPU, you have latched stage. A stage takes its inputs @ clock's edge, and produce its output, which is sampled by the next stage @ clock's edge.

I haven't understood the point

legacy · « **Reply #11 on:** January 08, 2019, 04:24:00 pm »

Code: [Select]

c3600 espresso-ab-1.0 $ ./src/espresso -h
UC Berkeley, Espresso Version #2.3, Release date 01/31/88

SYNOPSIS: espresso [options] [file]

  -d        Enable debugging
  -e[opt]   Select espresso option:
                fast, ness, nirr, nunwrap, onset, pos, strong,
                eat, eatdots, kiss, random
  -o[type]  Select output format:
                f, fd, fr, fdr, pleasure, eqntott, kiss, cons
  -rn-m     Select range for subcommands:
                d1merge: first and last variables (0 ... m-1)
                minterms: first and last variables (0 ... m-1)
                opoall: first and last outputs (0 ... m-1)
  -s        Provide short execution summary
  -t        Provide longer execution trace
  -x        Suppress printing of solution
  -v[type]  Verbose debugging detail (-v '' for all)
  -D[cmd]   Execute subcommand 'cmd':
                ESPRESSO, many, exact, qm, single_output, so, so_both,
                simplify, echo, signature, opo, opoall, pair, pairall,
                check, stats, verify, PLAverify, equiv, map, mapdc, fsm,
                contain, d1merge, d1merge_in, disjoint, dsharp, intersect,
                minterms, primes, separate, sharp, union, xor, essen,
                expand, gasp, irred, make_sparse, reduce, taut, super_gasp,
                lexsort, test
  -Sn       Select strategy for subcommands:
                opo: bit2=exact bit1=repeated bit0=skip sparse
                opoall: 0=minimize, 1=exact
                pair: 0=algebraic, 1=strongd, 2=espresso, 3=exact
                pairall: 0=minimize, 1=exact, 2=opo
                so_espresso: 0=minimize, 1=exact
                so_both: 0=minimize, 1=exact

back to Espresso

source Version #2.3 is POSIX, but ... it's really a mess

Full of "register" here and there, and ugly things that make the c-compiler abort.

I had to restyle a bit, not it compiles fine, now I have to test the behavior, and prepare a new patch for Gentoo as Overlay since the current support is abandoned.

legacy · « **Reply #12 on:** January 08, 2019, 04:25:52 pm »

(in theory, this app should need a complete code restyling, well ... I have never seen a Quine-McCluskey implementation so I might look deeper at the source)

AndyC_772 · « **Reply #13 on:** January 08, 2019, 05:13:35 pm »

Quote from: legacy on January 08, 2019, 04:19:11 pm

Quote from: AndyC_772 on January 08, 2019, 02:29:54 pm
If you're intending to pipeline your logic, then that's a different issue, because you then need to determine what the intermediate steps should be. Even then, the logic to actually describe those steps can be as verbose as you like - it doesn't cost anything.

I don't understand this "pipelining". What does exactly mean?

Suppose you have a complex logical function that you want to execute; for example, say you have 100 bits you want to XOR together.

One way to do it is to build logic that can do it all in one go. You might quite reasonably build ten 10-input XOR gates, then join all their outputs to the inputs of an 11th gate. The output of that 11th gate is the final output you wanted, but it's taken two gates' worth of propagation delay to get there. Since the maximum clock speed of any domain is limited by the slowest device connected to it, your system clock can't have a period any shorter than 2x your gate delay.

But: suppose you latch the outputs of the first gates into a set of D-types, then feed the outputs of those D-types into the inputs of the second gate. The final output comes from that second gate, and is valid after two clocks.

It still takes just as long (probably longer, in fact) to get the final output, but now your master clock can run twice as fast. The 100-input (first) stage can process a new block of data at the same time as the 10-input (second) stage is processing the intermediate results from the previous clock. This splitting up of logic into blocks, with memory bits used to store intermediate results, is pipelining.

With pipelining, although the latency through the system may be longer than without it, the throughput can be greatly increased.

In a simple CPU, the pipeline may consist of steps like: fetch instruction from SRAM, decode instruction into flags which control things like the ALU or stack, execute the instruction, and store results back to RAM. While each instruction is being executed, the next one is being fetched, and the results of the previous one are being written.

The trick in this case is to try and work out what useful intermediate results to produce from the original instruction word. That's something the synthesizer can't do because it's a behavioural change (ie. make the logic do thing A vs thing B), rather than just finding an efficient way to state a defined mapping between inputs and outputs.

NorthGuy · « **Reply #14 on:** January 08, 2019, 06:20:18 pm »

This approach may be interesting from the theoretical viewpoint, but may not be practical in this case.

RICS-V has a very simple ISA which is created by humans - bits are located with care to simplify implementation. Putting everything together into a big table loses all the structure which has been put there by humans. Then heuristics algorithms are used to re-create the structure. This is a lot of work, and the result won't be as good as if you would simply use the existing structure while writing the HDL code.

For example, flag_load_extend. You already know it is in bit 14. You simply map the signal to the bit. The humans who created RISC-V made sure it maps to bit 14 in all commands. There's no reason for creating huge tables and analyzing them with sophisticated methods to figure this out. The same is true for the rest of the ISA - it is well thought out and easy to implement.

Look, for example at this:

Code: [Select]

type_load <= '0';
        IF ((inst(13 DOWNTO 13) = "0") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            type_load <= '1';
        END IF;
        IF ((inst(14 DOWNTO 12) = "010") AND (inst(6 DOWNTO 0) = "0000011")) THEN
            type_load <= '1';
        END IF;

What are you going to do with type_load? Use it a la "if type_load = '1'"? This doesn't provide much benefits compared to the "case" statement based on "inst(6 downto 2)". The validity logic which got embedded into the type_load code by your algorithm, doesn't have to be there. And if you need to add a new instruction, adding it to the "case" statement is extremely easy and doesn't require re-generating the whole code.

hamster_nz · « **Reply #15 on:** January 08, 2019, 06:40:17 pm »

Quote from: Ian.M on January 08, 2019, 08:45:49 am

Next time you want to quickly simplify some combinatorial logic, +1 for throwing it at Espresso!

If you've got a windows PC, you may find Logic Friday useful. Its a GUI frontend for the Espresso logic minimizer The original website's gone to domain parking, but it can still be accessed, including downloading the installer, [here] at the Internet Archive.

It took about two minutes to enter the truthtable for the four flags that matter, with don't cares for all flags in the rows not present in the set of instruction words shown above (I hope I counted the bit positions correctly):
Code: [Select]
b14 b13 b12 Byte Half Word SignExt 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 1 1 X X X X 1 0 0 1 0 0 1 1 0 1 0 1 0 1 1 1 0 X X X X 1 1 1 X X X Xthen hit Enter to submit to get the raw equations:
Code: [Select]
Entered by truthtable: Byte = b14' b13' b12' + b14 b13' b12'; Half = b14' b13' b12 + b14 b13' b12; Word = b14' b13 b12'; SignExt = b14 b13' b12' + b14 b13' b12;then click the Minimize icon (options exact, jointly) to get:
Code: [Select]
Minimized: Byte = b13' b12'; Half = b12; Word = b13 ; SignExt = b14 ;
If you are working with actual logic gates, it can even draw you a schematic using minimum gates, or assuming 'classic' logic, minimum packages.

Edit: I got the bit positions backwards. Now fixed

I don't think that I have a logic minimization problem, but I have a 'factoring problem' - or maybe I don't really understand my problem at all.

I think it looks like this:

(Boolean Expression A) AND (Boolean Expression B) = (Boolean Expression C)

I know what A and C are, but want to find B.

(Decoding for a all "LOAD" instructions) AND ( Simple, unknown expression for SIGN_EXTEND_FLAG) = (When the SIGN_EXTEND_FLAG is asserted during a LOAD).

It is easy, but wasteful on logic to say that "Unknown expression for SIGN_EXTEND_FLAG" is equal to "When the SIGN_EXTEND_FLAG is asserted during a LOAD". It also increases fan-out in the decoder.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

EEVblog Electronics Community Forum

Author Topic: Help! Maybe an interesting problem for Boolean logic fans (Read 2178 times)

hamster_nz

Help! Maybe an interesting problem for Boolean logic fans

amyk

Re: Help! Maybe an interesting problem for Boolean logic fans

emece67

Re: Help! Maybe an interesting problem for Boolean logic fans

hamster_nz

Re: Help! Maybe an interesting problem for Boolean logic fans

dgtl

Re: Help! Maybe an interesting problem for Boolean logic fans

Ian.M

Re: Help! Maybe an interesting problem for Boolean logic fans

legacy

Re: Help! Maybe an interesting problem for Boolean logic fans

dgtl

Re: Help! Maybe an interesting problem for Boolean logic fans

legacy

Re: Help! Maybe an interesting problem for Boolean logic fans

AndyC_772

Re: Help! Maybe an interesting problem for Boolean logic fans

legacy

Re: Help! Maybe an interesting problem for Boolean logic fans

legacy

Re: Help! Maybe an interesting problem for Boolean logic fans

legacy

Re: Help! Maybe an interesting problem for Boolean logic fans

AndyC_772

Re: Help! Maybe an interesting problem for Boolean logic fans

NorthGuy

Re: Help! Maybe an interesting problem for Boolean logic fans

hamster_nz

Re: Help! Maybe an interesting problem for Boolean logic fans

Share me