Author Topic: Up to date and not ancient VHDL tutorial! (Read 7920 times)

Bassman59 · « **Reply #25 on:** August 16, 2018, 06:13:30 pm »

Quote from: rstofer on August 16, 2018, 05:59:41 pm

Quote from: james_s on August 16, 2018, 04:56:53 pm
Quote from: rstofer on August 14, 2018, 09:48:36 pm
I only know about Xilinx tools but one of the things I can do is view the final logic diagram (called RTL Schematic). It's a really nice way to see how the building blocks are implemented. It will also show the block layout for large projects and allow you to drill down through modules.

Altera Quartus has a very similar tool and it's quite useful. My observation is that code that produces nice tidy looking results on the RTL schematic usually works well too.

I was staggered to see that a +1 binary counter I coded turned into an adder and a register. I don't know what I expected but that wasn't it!

I'm not sure what else you'd get! An incrementer is no more than an adder with one of the inputs set to a constant 1. The other input comes from the adder output.

But -- the adder part is purely combinatorial. If you don't register its output, then you've got a combinatorial loop between the adder input and output, and that's bad.

Some FPGA families have specialist adder resources. Others do not. For the former, the synthesizer will preferably use the adder (until it runs out of them). The register may be part of the adder resource. For the latter, it will build the adder out of combinatorial logic, and stick registers on the output.
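To make the adder-plus-register picture concrete, here is a minimal behavioural sketch in Python (a model, not synthesizable HDL). The names `adder`, `Register` and `run_counter`, and the 4-bit width, are assumptions made up for this illustration. The adder only computes the next value; the register is what holds the count and breaks the feedback loop.

```python
WIDTH = 4  # hypothetical counter width chosen for this sketch


def adder(a, b, width=WIDTH):
    """Purely combinational: the output depends only on the current inputs."""
    return (a + b) & ((1 << width) - 1)


class Register:
    """Clocked storage: q changes only when clock_edge() is called."""

    def __init__(self):
        self.q = 0

    def clock_edge(self, d):
        self.q = d


def run_counter(cycles):
    """Return the register value seen at the start of each clock cycle."""
    reg = Register()
    seen = []
    for _ in range(cycles):
        seen.append(reg.q)
        next_value = adder(reg.q, 1)  # one adder input tied to constant 1
        reg.clock_edge(next_value)    # the register breaks the feedback loop
    return seen


print(run_counter(18))
```

With a 4-bit width the sequence counts 0 to 15 and then wraps back to 0, which is all a "+1 binary counter" is.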

Quote

As the design gets larger, the utility diminishes. There's just too much information. Still, it is useful to drill down into the modules, clear down to gates and flops. Just to see...

Yes, you can't look at the whole design without getting lost. But you can drill down into a specific module and see what the synth did with your code. I do this all the time.

Bassman59 · « **Reply #26 on:** August 16, 2018, 06:14:33 pm »

Quote from: james_s on August 16, 2018, 04:56:53 pm

Quote from: rstofer on August 14, 2018, 09:48:36 pm
I only know about Xilinx tools but one of the things I can do is view the final logic diagram (called RTL Schematic). It's a really nice way to see how the building blocks are implemented. It will also show the block layout for large projects and allow you to drill down through modules.

Altera Quartus has a very similar tool and it's quite useful. My observation is that code that produces nice tidy looking results on the RTL schematic usually works well too.

Look at the "technology" view, which shows how the tool mapped your nice RTL into the resources available in the chip!

Sal Ammoniac · « **Reply #27 on:** August 16, 2018, 07:36:38 pm »

Someone mentioned Free Range VHDL. I'll second that as I found it to be a very good tutorial when I was learning HDLs. Make sure you get the latest version, as there are many older versions floating around on the 'net.

http://freerangefactory.org/pdf/df344hdh4h8kjfh3500ft2/free_range_vhdl.pdf

I learned VHDL first and then Verilog, but overall I prefer Verilog. It's really a personal choice, however, as you can do the same thing in either language.

rstofer · « **Reply #28 on:** August 16, 2018, 08:16:46 pm »

Quote from: Bassman59 on August 16, 2018, 06:13:30 pm

Quote from: rstofer on August 16, 2018, 05:59:41 pm
Quote from: james_s on August 16, 2018, 04:56:53 pm
Quote from: rstofer on August 14, 2018, 09:48:36 pm
I only know about Xilinx tools but one of the things I can do is view the final logic diagram (called RTL Schematic). It's a really nice way to see how the building blocks are implemented. It will also show the block layout for large projects and allow you to drill down through modules.

Altera Quartus has a very similar tool and it's quite useful. My observation is that code that produces nice tidy looking results on the RTL schematic usually works well too.

I was staggered to see that a +1 binary counter I coded turned into an adder and a register. I don't know what I expected but that wasn't it!

I'm not sure what else you'd get! An incrementer is no more than an adder with one of the inputs set to a constant 1. The other input comes from the adder output.

I had thought it would be a ripple counter with a bunch of D-flops strung together with some logic, like the 74161 but with D-flops. On reflection, I realized that the adder is a much better solution since the carry logic is built in to the LUT and the LUT doesn't really have J-K flops.

If I were building such a counter out of TTL, I certainly wouldn't be using a 74181 and 74182 to implement a binary counter and that's where I was coming from.

As I said, on reflection, I see exactly why it is done the way it is.

Bassman59 · « **Reply #29 on:** August 16, 2018, 10:23:34 pm »

Quote from: rstofer on August 16, 2018, 08:16:46 pm

Quote from: Bassman59 on August 16, 2018, 06:13:30 pm
Quote from: rstofer on August 16, 2018, 05:59:41 pm
Quote from: james_s on August 16, 2018, 04:56:53 pm
Quote from: rstofer on August 14, 2018, 09:48:36 pm
I only know about Xilinx tools but one of the things I can do is view the final logic diagram (called RTL Schematic). It's a really nice way to see how the building blocks are implemented. It will also show the block layout for large projects and allow you to drill down through modules.

Altera Quartus has a very similar tool and it's quite useful. My observation is that code that produces nice tidy looking results on the RTL schematic usually works well too.

I was staggered to see that a +1 binary counter I coded turned into an adder and a register. I don't know what I expected but that wasn't it!

I'm not sure what else you'd get! An incrementer is no more than an adder with one of the inputs set to a constant 1. The other input comes from the adder output.

I had thought it would be a ripple counter with a bunch of D-flops strung together with some logic, like the 74161 but with D-flops. On reflection, I realized that the adder is a much better solution since the carry logic is built in to the LUT and the LUT doesn't really have J-K flops.

You can't use a ripple counter in an FPGA, as it's an asynchronous counter and the Q output of stage n clocks the flip-flop for stage n+1. Static timing analysis doesn't work on that kind of circuit, and in many FPGA fabrics (notably recent Xilinx) the only access to the flip-flop's clock input is through a global net, not from general logic.
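A rough Python model of the difference, assuming each ripple stage toggles one flip-flop delay after the previous stage falls from 1 to 0 (the function names and the 4-bit width are invented for this sketch). It lists every value the outputs pass through after a single input clock edge.

```python
def ripple_step(state, width=4):
    """Return every value the outputs pass through after one input clock edge.

    Stage 0 toggles on the clock; stage n+1 toggles only when stage n
    falls from 1 to 0, one flip-flop delay later.
    """
    values = []
    for stage in range(width):
        was_one = (state >> stage) & 1
        state ^= 1 << stage      # this stage toggles
        values.append(state)
        if not was_one:          # rose 0 -> 1: no falling edge to pass on
            break
    return values


def synchronous_step(state, width=4):
    """All flip-flops share one clock, so there is a single new value."""
    return [(state + 1) & ((1 << width) - 1)]


for label, step in (("ripple", ripple_step), ("synchronous", synchronous_step)):
    print(label, [f"{s:04b}" for s in step(0b0111)])
```

Going from 0111 to 1000, the ripple model passes through 0110, 0100 and 0000 before it settles, while the synchronous model goes straight to 1000. Those intermediate values, and the fact that each stage has its own derived clock, are what make the ripple version awkward for timing analysis.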

Quote

As I said, on reflection, I see exactly why it is done the way it is.

Always worth thinking on these things.

chris_leyson · « **Reply #30 on:** August 16, 2018, 10:24:07 pm »

If you want to learn about how to use logic resources in an FPGA then try a simple 8-bit micro like the Picoblaze from Xilinx. It's small and it's fast, 96 slices on Spartan3 and 26 slices on Spartan 6. Xilinx UG129 gives a good overview. https://www.xilinx.com/support/documentation/ip_documentation/ug129.pdf

Picoblaze will teach you some basic VHDL syntax by building an 8-bit microprocessor from simple logic elements, and it will also give you tools to test and analyze your hardware. Test and analysis are two of the most powerful aspects of VHDL and are often overlooked. In fact, that's why VHDL was conceived in the first place: it's a hardware description language, not a hardware synthesis language.

emece67 · « **Reply #31 on:** August 16, 2018, 10:48:52 pm »

rstofer · « **Reply #32 on:** August 17, 2018, 12:55:52 am »

That book sells for $105 for paperback on Alibris, the hardbound version is a touch over $200.

https://www.alibris.com/Circuit-Synthesis-with-VHDL-Roland-Airiau/book/1108885

NorthGuy · « **Reply #33 on:** August 17, 2018, 03:53:48 am »

Quote from: rstofer on August 16, 2018, 08:16:46 pm

As I said, on reflection, I see exactly why it is done the way it is.

The counter is actually much simpler than an adder, and can be made much faster in general logic than with carry elements. An FPGA doesn't have gates, but rather uses LUTs. A 6-input LUT can calculate any bit of a 6-bit counter at once. This is faster than carry chains. If you use faster clocks you can often see your adders replaced with general logic. Also, some of Xilinx's LUTs may be configured as 32-bit shift registers, which can cascade and replace short counters.
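The "one LUT per bit" point can be sketched in Python. This is a model only: the helper names are invented here, and treating the current count as the bit index into a 64-bit table is an assumption of the sketch, not something taken from tool output. Each next-state bit of a 6-bit counter is a function of just the six current bits, so a single 64-entry truth table covers it.

```python
WIDTH = 6  # six counter bits, so each next-state bit fits one 6-input LUT


def lut_init_for_bit(k):
    """Build the 64-entry truth table (as an integer) for next-state bit k."""
    init = 0
    for q in range(1 << WIDTH):
        nxt = (q + 1) & ((1 << WIDTH) - 1)
        if (nxt >> k) & 1:
            init |= 1 << q  # table entry q holds bit k of q + 1
    return init


def lut_eval(init, q):
    """A LUT is a lookup: the inputs select one bit of the table."""
    return (init >> q) & 1


INITS = [lut_init_for_bit(k) for k in range(WIDTH)]


def next_state(q):
    """Evaluate all six LUTs in parallel; no carry chain is involved."""
    return sum(lut_eval(INITS[k], q) << k for k in range(WIDTH))


for k, init in enumerate(INITS):
    print(f"bit {k}: {init:016X}")
```

Every bit is one table lookup deep, regardless of how far a carry would have had to travel. Past six bits a single LUT no longer sees all the inputs, so the trade-off against carry chains changes.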

Xilinx allows two different levels of schematic. One is a simple schematic generated from the VHDL, called the "elaborated design". Here you will see adders, muxes, gates etc. This is an abstraction of the VHDL and has nothing to do with the real FPGA.

Another level is the post-synthesis schematic, where you will see the elements of the real FPGA: LUTs, DSPs etc.

These two are completely different schematics. It's important to know which one you're looking at.

Also you can look at the real FPGA map where you get the post-implementation view - all the elements of the FPGA are there, you can see which are used by the design and how they're connected together.

legacy · « **Reply #34 on:** August 17, 2018, 11:49:38 am »

In my first university course at the computer science campus (when I was younger, oh ... Erasmus in the South East region of England, the best experience ever) we used this book as the reference, and it really gave us good points to discuss in the classroom and put into practice in the labs.

It's about Verilog, and the technology it covers didn't include modern block RAM and the other resources offered by modern Xilinx FPGAs. It is therefore more conservative in its design approach, but this is good, as it makes good points about methodology, with a critical look at the differences between FPGA design and ASIC design.

Bassman59 · « **Reply #35 on:** August 17, 2018, 06:43:50 pm »

Quote from: chris_leyson on August 16, 2018, 10:24:07 pm

If you want to learn about how to use logic resources in an FPGA then try a simple 8-bit micro like the Picoblaze from Xilinx. It's small and it's fast, 96 slices on Spartan3 and 26 slices on Spartan 6. Xilinx UG129 gives a good overview. https://www.xilinx.com/support/documentation/ip_documentation/ug129.pdf

Picoblaze will teach you some basic VHDL syntax by building an 8-bit microprocessor from simple logic elements

Actually, PicoBlaze is an awful example of how to do a micro in an FPGA. Rather than using behavioral code to describe the logic, Ken Chapman's code instantiates FPGA primitives directly. It's as if he took an old schematic (remember when we did schematic entry for FPGAs?) of the design, which used those primitives, and had it export to a VHDL netlist. Good luck decoding the LUT4 instances with their INIT values to figure out what the hell it's doing.
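For anyone who does want to decode a LUT4 instance, here is a small Python sketch of what an INIT value means. It assumes the usual Xilinx convention that the inputs form a 4-bit address with I0 as the least significant bit, and that the address selects one bit of the 16-bit INIT. The INIT values below are made-up examples, not taken from the PicoBlaze source.

```python
def lut4(init, i3, i2, i1, i0):
    """Output of a 4-input LUT: the inputs select one bit of the 16-bit INIT."""
    address = (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (init >> address) & 1


def truth_table(init):
    """All 16 rows as (i3, i2, i1, i0, out) tuples."""
    rows = []
    for address in range(16):
        bits = [(address >> n) & 1 for n in (3, 2, 1, 0)]
        rows.append((*bits, lut4(init, *bits)))
    return rows


def describe(init):
    """Recognise a few common functions; anything else needs the full table."""
    outs = [row[4] for row in truth_table(init)]
    known = {
        "AND of all inputs": [int(a == 15) for a in range(16)],
        "OR of all inputs": [int(a != 0) for a in range(16)],
        "XOR of all inputs": [bin(a).count("1") & 1 for a in range(16)],
    }
    for name, pattern in known.items():
        if outs == pattern:
            return name
    return "unrecognised, read the truth table"


# Hypothetical INIT values chosen for illustration only.
for init in (0x8000, 0xFFFE, 0x6996, 0x00CA):
    print(f"INIT={init:04X}: {describe(init)}")
```

The first three decode to AND, OR and XOR. The last one does not match any simple pattern, and that is the typical case in a hand-mapped design: you end up reading truth tables row by row.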

The claim was that such direct instantiation of primitives was done for performance reasons, but everybody understands it was done to make porting to other FPGA vendors' devices impossible.

(That said, PicoBlaze works as advertised and I've used it many times.)

james_s · « **Reply #36 on:** August 17, 2018, 06:57:42 pm »

I hate FPGA code that directly instantiates primitives, it's an absolutely stupid practice that has no tangible benefits. Even amongst FPGAs from the same vendor, code with direct instantiation often will not synthesize for a different generation of FPGA from the same vendor without modifications. Write agnostic code and let the software figure out how to synthesize it.

hamster_nz · « **Reply #37 on:** August 18, 2018, 07:32:31 pm »

Quote from: james_s on August 17, 2018, 06:57:42 pm

I hate FPGA code that directly instantiates primitives, it's an absolutely stupid practice that has no tangible benefits. Even amongst FPGAs from the same vendor, code with direct instantiation often will not synthesize for a different generation of FPGA from the same vendor without modifications. Write agnostic code and let the software figure out how to synthesize it.

I don't know Ken, but I am sure he is pretty smart when it comes to FPGA stuff. I would not be surprised at all if his requirements for PicoBlaze made this style of coding the right way to do things.

The initial release of PicoBlaze was about 14 years ago, when FPGAs were much smaller, and tools were much simpler. The audience of potential users is large, and minimizing support tickets due to implementation issues is an important consideration. It makes sense to pay careful attention and use very descriptive, structural HDL - even to the point of doing most of the synthesis before the HDL is even written.

Bassman59 · « **Reply #38 on:** August 18, 2018, 09:35:34 pm »

Quote from: hamster_nz on August 18, 2018, 07:32:31 pm

Quote from: james_s on August 17, 2018, 06:57:42 pm
I hate FPGA code that directly instantiates primitives, it's an absolutely stupid practice that has no tangible benefits. Even amongst FPGAs from the same vendor, code with direct instantiation often will not synthesize for a different generation of FPGA from the same vendor without modifications. Write agnostic code and let the software figure out how to synthesize it.

I don't know Ken, but I am sure he is pretty smart when it comes to FPGA stuff. I would not be surprised at all if his requirements for PicoBlaze made this style of coding the right way to do things.

Again, the first requirement was: Xilinx only.

asmi · « **Reply #39 on:** August 18, 2018, 10:31:38 pm »

Quote from: Bassman59 on August 18, 2018, 09:35:34 pm

Again, the first requirement was: Xilinx only.

Unless you demonstrate the evidence of this, I call on this one. If they really wanted to make it Xilinx-only, they would've delivered the core as encrypted IP.

NorthGuy · « **Reply #40 on:** August 18, 2018, 11:19:18 pm »

Quote from: Bassman59 on August 18, 2018, 09:35:34 pm

Quote from: hamster_nz on August 18, 2018, 07:32:31 pm
I don't know Ken, but I am sure he is pretty smart when it comes to FPGA stuff. I would not be surprised at all if his requirements for PicoBlaze made this style of coding the right way to do things.

Again, the first requirement was: Xilinx only.

I think he's just one of the old-school guys who likes doing things efficiently (and I bet Xilinx wanted PicoBlaze to be very efficient). When you assemble everything from LUTs you immediately see where the bottleneck is.

james_s · « **Reply #41 on:** August 19, 2018, 06:42:49 am »

Quote from: asmi on August 18, 2018, 10:31:38 pm

Unless you demonstrate the evidence of this, I call on this one. If they really wanted to make it Xilinx-only, they would've delivered the core as encrypted IP.

It's blatantly obvious to anyone who has worked with this stuff. You can instantiate primitives directly, or you can write agnostic code and let the fitter instantiate the same primitives as appropriate for the specific part being used. The end result is exactly the same; the difference is that agnostic code can be built for any suitable device, while directly instantiating primitives locks it to that specific family unless you modify it.

nctnico · « **Reply #42 on:** August 19, 2018, 08:16:07 am »

Quote from: asmi on August 18, 2018, 10:31:38 pm

Quote from: Bassman59 on August 18, 2018, 09:35:34 pm
Again, the first requirement was: Xilinx only.
Unless you demonstrate the evidence of this, I call on this one. If they really wanted to make it Xilinx-only, they would've delivered the core as encrypted IP.

You could have looked up the source yourself. Picoblaze instantiates Xilinx primitives directly (Spartan2 / Virtex2 IIRC). I doubt it can even synthesize on modern Xilinx devices due to architectural differences. I've used Picoblaze myself in a couple of designs. Mr Chapman created Picoblaze to show off how you can utilize a 'small' FPGA to the max but it turned out the end result was very useful as well.

chris_leyson · « **Reply #43 on:** August 19, 2018, 08:27:44 am »

Many years ago and just for fun I wrote a decimal version of the KCPSM3. I added a decimal flag to the status register and two additional instructions, SED and CLD, to set or clear the flag bit. In decimal mode you just do BCD add and subtract using a little bit of extra carry logic. I also added two shift instructions to do binary/BCD conversion, and that neatly filled up a small gap in the instruction set. I emailed Ken Chapman and asked if it was OK to publish the design, and he said it was OK as long as I stated that it is for Xilinx devices only. I also asked why the KCPSM3 was written with primitives, and he said it was so you got exactly the same logic no matter what synthesis tool was used. Anyway, I never finished the design, as there was a fair bit of additional logic to get the status register carry flags to work correctly and it just added too much delay. I've still got the VHDL somewhere, so one day I might get around to finishing it.
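As a rough illustration of what "BCD add with a little extra carry logic" means, here is a generic packed-BCD byte add in Python. It is not the unfinished KCPSM3 variant described above, which is not shown in this thread. The function name is invented, and valid BCD inputs (each nibble 0 to 9) are assumed.

```python
def bcd_add(a, b, carry_in=0):
    """Add two packed-BCD bytes; return (result_byte, carry_out).

    Assumes both inputs are valid BCD (each nibble in 0..9).
    """
    low = (a & 0x0F) + (b & 0x0F) + carry_in
    half_carry = 0
    if low > 9:            # low digit overflowed past 9: correct and carry
        low -= 10
        half_carry = 1
    high = (a >> 4) + (b >> 4) + half_carry
    carry_out = 0
    if high > 9:           # same correction for the high digit
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


result, carry = bcd_add(0x19, 0x28)
print(f"{result:02X} carry={carry}")
```

The two "greater than 9" checks are the extra carry logic. In hardware they sit in series with the normal binary add, which is one plausible place for the kind of added delay mentioned above.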

legacy · « **Reply #44 on:** August 19, 2018, 11:23:29 am »

Quote from: nctnico on August 19, 2018, 08:16:07 am

I doubt it can even synthesize on modern Xilinx devices due to architectural differences.

yup, there are some examples of this even with projects at OpenCores.

legacy · « **Reply #45 on:** August 19, 2018, 11:38:53 am »

RAMB4_S8 from Xilinx WebPACK ISE 4.2.03i is a clue that it's "Xilinx specific".

NorthGuy · « **Reply #46 on:** August 19, 2018, 03:19:33 pm »

Quote from: james_s on August 19, 2018, 06:42:49 am

It's blatantly obvious to anyone who has worked with this stuff. You can instantiate primitives directly, or you can write agnostic code and let the fitter instantiate these same primitives appropriate for the specific part being used. The end result is exactly the same ...

Not exactly the same.

It's as if you would say that you can write code in assembler, or you can write agnostic code in C and let the compiler create the same assembler commands.

Even though the end result is the same when running at lower clock speeds, the well-written hand-crafted code with manual instantiations is likely to run at much faster clock speeds.

james_s · « **Reply #47 on:** August 19, 2018, 05:48:57 pm »

Well, look at the RTL viewer: in most cases at least, the fitter infers exactly the same primitives whether you specify them directly or create a generic element. How many ways can you create a RAM or ROM, or PLL? I suppose there are cases where you could get different results, but overall the software seems very good at coming up with an optimal solution provided the code is well written. At any rate, I have yet to experience any issues whatsoever from changing directly instantiated primitives to agnostic HDL in code that I've ported to other platforms. If someone can find a specific example where directly instantiated primitives produce a demonstrably superior result, then I'd like to see it. Otherwise I remain of the opinion that it is done primarily as a method of locking a design in to a specific vendor/product line, with some likely doing it out of a habit formed in some earlier time when the software was less sophisticated and required more human optimization.

legacy · « **Reply #48 on:** August 19, 2018, 07:27:03 pm »

Quote from: NorthGuy on August 19, 2018, 03:19:33 pm

Even though the end result is the same when running at lower clock speeds, the well-written hand-crafted code with manual instantiations is likely to run at much faster clock speeds.

Reasons why the kernel of RiscOS v1-v2-3 was written in assembly? (ARM610, StrongArm)
Reasons why the kernel of the first gen of AmigaOS was written in assembly? (m68020, m68030)

ain't it?

james_s · « **Reply #49 on:** August 19, 2018, 07:56:20 pm »

Earlier times, less sophisticated compilers. It's kind of an apples to oranges comparison between HDL and high level programming languages. Hand optimized assembly language is something that is quite rare these days and getting rarer all the time. The hardware is more complex and powerful and the compilers are more sophisticated, causing a much lower return on investment for a human to get in there and tweak things by hand. There are still cases where it might make sense, but these are going to be special cases where it's worth putting in big effort to squeeze out that last 1% performance boost.

