Author Topic: Learning Verilog, Have Weird Bug, Don't Know What to Try  (Read 8025 times)

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Learning Verilog, Have Weird Bug, Don't Know What to Try
« on: November 22, 2014, 04:43:20 am »
I have a background in C and Java but recently started playing with Verilog and a Nexys2 (Xilinx Spartan 3E) dev board. I wrote a couple of modules: one module listens to a ps2 keyboard and outputs the scan code, the other module takes that scan code and shifts it out the DB9 port formatted as 9600 8N1. The keyboard module outputs to an array of 8 LEDs on the dev board. The rs232 module takes those 8 IOs as its input.

With my code, shown below, it works fine. The 8 LEDs and the TX line of the DB9 correctly reflect the scan code from the keyboard. If you simply change the name of the ps2keyboard module instantiation from "keyboard" to "kbd" then LED #7 (the "MSB") is always lit. Even weirder is that the output from rs232 is always correct -- bit 7 always correctly reflects the scan code value. Only the LED is "stuck."

I can't figure it out. I can't even see why changing the name of an instantiated module (which isn't referenced anywhere else) would have any effect at all. Timing issues seem to be common with FPGAs, so maybe that's it? I don't know how to proceed. With a microcontroller I would use my debugger and start stepping through code, but with an FPGA...?

Video clip showing the problem:



Code:

///////// top.v file /////////

`timescale 1ns / 1ps
module top(ps2clock, ps2data, leds, clock, tx);

   input clock;         // main FPGA clock, 50MHz
   input ps2clock;      // clock pin on the ps2 connector, ~15kHz
   input ps2data;         // data pin on the ps2 connector
   output [7:0] leds;   // 8 LEDs on the Nexys2 dev board
   output tx;            // TXd pin of the RS232 connector on the Nexys2 dev board

   ps2keyboard keyboard (clock, ps2clock, ps2data, leds);
   rs232_transmitter rs232 (clock, leds, tx);

endmodule




///////// ps2keyboard.v file /////////
`timescale 1ns / 1ps
// this module currently just outputs the scancode, ignoring key-up codes.
module ps2keyboard(clock, ps2clock, ps2data, scanCode);

   input clock;
   input ps2clock;
   input ps2data;
   output reg [7:0] scanCode;

   reg [7:0] buffer;
   reg [3:0] currentBit;
   reg previousClockState;
   reg currentClockState;

   always @(posedge clock)
   begin
      previousClockState <= currentClockState;
      currentClockState <= ps2clock;
      
      // wait for a falling edge on the ps2 clock line
      if(previousClockState == 1 && currentClockState == 0)
      begin
         // shift scan code (bits 1-8) into the buffer
         if(currentBit >= 1 && currentBit <= 8)
            buffer[currentBit - 1] = ps2data;
         
         // wrap around to bit 0 when reaching the end of the 11-bit word and output the scan code byte
         currentBit = currentBit + 1;
         if(currentBit == 11)
         begin
            currentBit = 0;
            if(buffer != 8'hF0 && buffer != 8'hE0) // don't display key-up F0 events, or E0 prefixes
               scanCode <= buffer;
         end
      end
   end

endmodule




///////// rs232_transmitter.v file /////////
`timescale 1ns / 1ps
// this module isn't a proper implementation of rs232, but simply shifts out bits in the rs232 9600 8N1 format.
// the input is continuously shifted out. there is no buffer, and no signal to start or stop transmission.
module rs232_transmitter(clock, data, tx);

input clock;
input [7:0] data;
output reg tx;

reg [12:0] counter;
reg [5:0] currentBit;
wire [39:0] rs232word;

assign rs232word = {30'b111111111111111111111111111111, 1'b1, data[7:0], 1'b0}; // idle state padding, 1 stop bit, data, 1 start bit

always @(posedge clock)
begin
   counter <= counter + 1;
   if(counter == 5208) // 50MHz/5208 = 9600.6144 baud rate
   begin
      counter <= 0;
      tx <= rs232word[currentBit];
      if(currentBit + 1 == 40)
         currentBit <= 0;
      else
         currentBit <= currentBit + 1;
   end
end

endmodule




///////// ucf.icf file /////////
NET "clock"   LOC = "B8";

NET "leds<0>" LOC = "J14";
NET "leds<1>" LOC = "J15";
NET "leds<2>" LOC = "K15";
NET "leds<3>" LOC = "K14";
NET "leds<4>" LOC = "E17";
NET "leds<5>" LOC = "P15";
NET "leds<6>" LOC = "F4";
NET "leds<7>" LOC = "R4";

NET "ps2clock" LOC = "R12";
NET "ps2data"  LOC = "P11";

NET "tx"       LOC = "P9";





For what it's worth, the problem even happens if nothing is plugged into the ps2 port. The bug is consistent. Power cycling the board and reloading the bitstream does not change the behavior. Any help is appreciated.

-Farrell
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #1 on: November 22, 2014, 04:55:27 am »
Didn't look too much into the code but in the ps2keyboard module, currentBit range is from 0-7 and you are comparing it to 1-8

Not sure why it works when the instance is named keyboard.

« Last Edit: November 22, 2014, 04:57:38 am by miguelvp »
 

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #2 on: November 22, 2014, 05:45:00 am »
currentBit is a 4bit register, so it can be 0 - 15, right? I compare against 1 through 8 because bit0 from the keyboard is a start bit which I want to ignore, and bits 1 through 8 are the payload byte.

-Farrell
 

Offline marshallh

  • Supporter
  • ****
  • Posts: 1462
  • Country: us
    • retroactive
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #3 on: November 22, 2014, 06:26:15 am »
Well the first problem I see is that you are using unsynchronized inputs. Whenever you start seeing weird, unexplainable behavior, that's the place to start looking.

Here I have synchronized them and converted your blocking assignments to non-blocking. It introduces at least 2 FF stages to prevent metastability from infecting the design.

Code:
///////// ps2keyboard.v file /////////
`timescale 1ns / 1ps
// this module currently just outputs the scancode, ignoring key-up codes.
module ps2keyboard(clock, ps2clock, ps2data, scanCode);

input clock;
input ps2clock;
input ps2data;

reg ps2clock_1, ps2clock_2, ps2clock_3;
reg ps2data_1, ps2data_2, ps2data_3;

output reg [7:0] scanCode;

reg [7:0] buffer;
reg [3:0] currentBit;

always @(posedge clock)
begin
   {ps2clock_3, ps2clock_2, ps2clock_1} <= {ps2clock_2, ps2clock_1, ps2clock};
   {ps2data_3, ps2data_2, ps2data_1} <= {ps2data_2, ps2data_1, ps2data};

   // wait for a falling edge on the ps2 clock line
   if(ps2clock_3 & ~ps2clock_2)
   begin
      // shift scan code (bits 1-8) into the buffer
      if(currentBit >= 1 && currentBit <= 8)
         buffer[currentBit - 1] <= ps2data_3;

      // wrap around to bit 0 when reaching the end of the 11-bit word and output the scan code byte
      currentBit <= currentBit + 1;
      if(currentBit == 11)
      begin
         currentBit <= 0;
         if(buffer != 8'hF0 && buffer != 8'hE0) // don't display key-up F0 events, or E0 prefixes
            scanCode <= buffer;
      end
   end
end

endmodule

As I explained to you before, ISE is using the checksum of your verilog source files to determine a fitter seed. So basically the randomness of the fitting process (determined exactly by the seed) is affecting the way the metastability manifests in your design.
I had a similar bug go unnoticed for 3 years when I clocked both ports of a BRAM with the same clock, but the signals on one port were in another clock domain. It worked most of the time except for some unrelated changes causing differences in fitting so that it broke due to marginal internal setup/hold times.
Verilog tips
BGA soldering intro

11:37 <@ktemkin> c4757p: marshall has transcended communications media
11:37 <@ktemkin> He speaks protocols directly.
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #4 on: November 22, 2014, 06:41:05 am »
Quote from: marshallh
I had a similar bug go unnoticed for 3 years when I clocked both ports of a BRAM with the same clock, but the signals on one port were in another clock domain. It worked most of the time except for some unrelated changes causing differences in fitting so that it broke due to marginal internal setup/hold times.

Nasty. How did you manage to debug that one?
 

Offline marshallh

  • Supporter
  • ****
  • Posts: 1462
  • Country: us
    • retroactive
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #5 on: November 22, 2014, 06:52:57 am »
Just by hours of inspecting the code. What's stupid is that this sort of clock crossing analysis could easily be done by Timequest or any STA, they have all the info necessary, launch/latch clocks for each net. Yet they can't warn you about this sort of thing.
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #6 on: November 22, 2014, 07:17:56 am »
No warning despite you having put constraints on those paths? That would be really bad. Or you didn't put constraints on it?
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #7 on: November 22, 2014, 07:34:26 am »
Quote from: farrell
currentBit is a 4bit register, so it can be 0 - 15, right? I compare against 1 through 8 because bit0 from the keyboard is a start bit which I want to ignore, and bits 1 through 8 are the payload byte.

-Farrell

You are right, long week and apparently my brain is exhausted, but listen to marshall :)
 

Offline photon

  • Regular Contributor
  • *
  • Posts: 234
  • Country: us
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #8 on: November 22, 2014, 07:42:23 am »
Quote from: marshallh
Just by hours of inspecting the code. What's stupid is that this sort of clock crossing analysis could easily be done by Timequest or any STA, they have all the info necessary, launch/latch clocks for each net. Yet they can't warn you about this sort of thing.

Don't think it's easy and the fact that it doesn't appear in any STA is because it's not easy.
 

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #9 on: November 22, 2014, 08:15:35 pm »
Marshallh: Thank you. I tried your version of the ps2keyboard module, but the problem persists. I had to change "if(currentBit == 11)" to "if(currentBit + 1 == 11)" since you used a non-blocking assignment for currentBit. Now the two most-significant LEDs are stuck on :) And like before, the RS232 output is always correct, regardless of the LEDs.
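
For anyone following along, the reason for that change can be shown in isolation. The following is a minimal sketch, not code from the design: the module name wrap_demo and its tick input are made up for illustration. Both counters count 0 through 10 and then wrap, but the non-blocking version has to test the value the counter is about to take, because the comparison still sees the value from before the clock edge.

```verilog
// Illustrative only: two counters that both wrap after 11 counts (0..10).
// wrap_demo, tick, a and b are hypothetical names, not part of the design.
module wrap_demo(input clock, input tick, output reg [3:0] a, output reg [3:0] b);

   initial begin
      a = 0;
      b = 0;
   end

   // Blocking style, as in the original post: the increment takes effect
   // immediately, so the comparison that follows sees the new value.
   always @(posedge clock)
      if (tick) begin
         a = a + 1;
         if (a == 11)
            a = 0;
      end

   // Non-blocking style: the update is deferred to the end of the time step,
   // so the comparison sees the old value and must test the next value.
   always @(posedge clock)
      if (tick) begin
         if (b + 1 == 11)
            b <= 0;
         else
            b <= b + 1;
      end

endmodule
```

With "if (b == 11)" in the second block, the counter would run 0 through 11 before wrapping, which is one count too many for an 11-bit frame.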

-Farrell
 

Offline marshallh

  • Supporter
  • ****
  • Posts: 1462
  • Country: us
    • retroactive
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #10 on: November 22, 2014, 09:42:37 pm »
Maybe something specific to the xilinx constraints? I will point a xilinx friend here and see what he says.
 

Offline Artlav

  • Frequent Contributor
  • **
  • Posts: 750
  • Country: mon
    • Orbital Designs
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #11 on: November 22, 2014, 10:00:54 pm »
Hm.
What are your initial values?

Try adding
initial begin
 scanCode=0;
 currentBit=0;
 currentClockState=0;
 previousClockState=0;
end

to the keyboard module.

Even better, add a reset line.
I figure the problem is that you start with garbage in your registers since they are not explicitly set anywhere.
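
A reset line along those lines might look roughly like the sketch below. It is only an illustration under assumptions: the module name ps2keyboard_rst and the active-high synchronous reset input are hypothetical, and the logic is the original poster's, rewritten with non-blocking assignments throughout.

```verilog
`timescale 1ns / 1ps
// Hypothetical variant of ps2keyboard with a synchronous, active-high reset.
module ps2keyboard_rst(clock, reset, ps2clock, ps2data, scanCode);

   input clock;
   input reset;          // assumed: active-high, synchronous to clock
   input ps2clock;
   input ps2data;
   output reg [7:0] scanCode;

   reg [7:0] buffer;
   reg [3:0] currentBit;
   reg previousClockState;
   reg currentClockState;

   always @(posedge clock)
   begin
      if (reset)
      begin
         scanCode           <= 0;
         buffer             <= 0;
         currentBit         <= 0;
         previousClockState <= 1;   // the PS/2 clock line idles high
         currentClockState  <= 1;
      end
      else
      begin
         previousClockState <= currentClockState;
         currentClockState  <= ps2clock;

         // falling edge on the ps2 clock line
         if (previousClockState == 1 && currentClockState == 0)
         begin
            // bits 1-8 of the frame are the scan code
            if (currentBit >= 1 && currentBit <= 8)
               buffer[currentBit - 1] <= ps2data;

            if (currentBit == 10)   // stop bit, last of the 11 bits
            begin
               currentBit <= 0;
               if (buffer != 8'hF0 && buffer != 8'hE0)
                  scanCode <= buffer;
            end
            else
               currentBit <= currentBit + 1;
         end
      end
   end

endmodule
```

The reset input would still need to be driven from somewhere, for example a push-button on the board; that wiring is not shown here.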

Another possible issue - how sure are you that the RS232 output does not have the last bit set?
As far as i can see on the video, it's indistinguishable from the continuing high line.
 

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #12 on: November 22, 2014, 10:49:07 pm »
I thought I read somewhere that Xilinx tools automatically init all values to zero? I added the initial block, but it did not help. In any case, I'm not sure that would have been the problem anyway: the first keypress would overwrite any corrupt data, and if currentBit started out non-zero, the bit order would have been offset, but it was never offset. (The lower LEDs (bits) were always correct when the upper LEDs were stuck high.)

As for the RS232 output, I'm counting the individual bits on the scope. The MSB is low (+7V due to inverted logic of RS232) as it should be. If the MSB was high (-7V) the waveform would look different.



-Farrell
 

Offline Bassman59

  • Super Contributor
  • ***
  • Posts: 2501
  • Country: us
  • Yes, I do this for a living
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #13 on: November 23, 2014, 03:10:53 am »
Did you simulate the design?
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #14 on: November 23, 2014, 05:15:27 am »
Quote from: farrell
I thought I read somewhere that Xilinx tools automatically init all values to zero?

You are right, but it's a good idea to explicitly initialize those registers anyway. Otherwise you will get simulation mismatches (result on real hardware will be different than what you get during simulation).
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #15 on: November 23, 2014, 09:07:54 am »
Quote from: marshallh
Well the first problem I see is that you are using unsynchronized inputs. Whenever you start seeing weird, unexplainable behavior, that's the place to start looking.

Here I have synchronized them and converted your blocking assignments to non-blocking. It introduces at least 2 FF stages to prevent metastability from infecting the design.

I do agree that if this wasn't just a play design then a two-stage synchronizer would be needed on PS2clock, but nothing is really needed on the PS2data signal, and it is very, very unlikely that any errors will be seen during testing. The PS/2 protocol guarantees at least 5 microseconds of setup time for the data signal before the falling clock, so that signal won't be an issue, and ps2clock is registered into currentClockState before it is used, so the only issue to worry about is that register going metastable due to a setup/hold violation.

Research (that Google gives as a horrid link so I won't paste it) indicates that on a Spartan 3 there is a window of about 0.65ps for inducing a delay of 0.150ns or longer when latching data into a register (either IOB or CLB), and that once in a metastable state, 95% of these will 'snap out' for every 0.1ns of slack in the shortest path to downstream registers. The upshot is that if farrell's design has 0.25ns of slack, the chance of a metastability-induced error is one in 307,680 clock transitions, or about once every 30,000 keyboard scan codes.

In all likelihood he will have at least 1ns of slack on that path. After all, it is a simple design, and it won't be running hot or at the lowest specified voltages. This conservative guess gives a modeled MTBF of once every 4 billion transitions. That would be enough to be an issue if this were a comms link or transferring bulk data, but it is very unlikely to be the source of a repeatable error seen when pressing keys on a keyboard.

Oh, and the design does need a "PERIOD=10ns" constraint on the 'clock' signal.
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #16 on: November 23, 2014, 09:37:16 am »
Quote from: farrell
With my code, shown below, it works fine. The 8 LEDs and the TX line of the DB9 correctly reflect the scan code from the keyboard. If you simply change the name of the ps2keyboard module instantiation from "keyboard" to "kbd" then LED #7 (the "MSB") is always lit. Even weirder is that the output from rs232 is always correct -- bit 7 always correctly reflects the scan code value. Only the LED is "stuck."

Couple of things... When you get strange behavior like that, it cannot hurt to do a clean project, and then rerun synthesis + P&R. With older versions of ISE I sometimes ran into behavior similar to what you mention.

Another thing is that for the -1200 version of the Nexys2 board the MSB 4 leds are in different locations. Probably not the problem since you say you get the expected result for some cases. But just to be sure:

UCF for -500:
Code:
NET "Led<4>"  LOC = "E17"; # Bank = 1, Pin name = IO, Type = I/O, Sch name = LD4? s3e500 only
NET "Led<5>"  LOC = "P15"; # Bank = 1, Pin name = IO, Type = I/O, Sch name = LD5? s3e500 only
NET "Led<6>"  LOC = "F4";  # Bank = 3, Pin name = IO, Type = I/O, Sch name = LD6? s3e500 only
NET "Led<7>"  LOC = "R4";  # Bank = 3, Pin name = IO/VREF_3, Type = VREF, Sch name = LD7? s3e500 only

UCF for -1200:
Code:
NET "Led<4>" LOC = "E16"; # Bank = 1, Pin name = N.C., Type = N.C., Sch name = LD4? other than s3e500
NET "Led<5>" LOC = "P16"; # Bank = 1, Pin name = N.C., Type = N.C., Sch name = LD5? other than s3e500
NET "Led<6>" LOC = "E4";  # Bank = 3, Pin name = N.C., Type = N.C., Sch name = LD6? other than s3e500
NET "Led<7>" LOC = "P4";  # Bank = 3, Pin name = N.C., Type = N.C., Sch name = LD7? other than s3e500

At any rate, I think a testbench is probably in order. Plus what also helps is to inspect the synthesis results, and check if you can spot any differences in how LED[7] is being driven for both your cases.
 

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #17 on: November 23, 2014, 10:51:54 am »
Bassman59: I had not written a testbench, but just wrote a trivial one. Considering that this stuck LED problem even occurs when nothing is plugged into the PS2 port, I wrote this sorry excuse for a testbench:

Code:
forever
begin
   #1 clock = ~clock;
   #1 clock = ~clock;
   #1 clock = ~clock;
   #1 clock = ~clock;
   #1 clock = ~clock;
   ps2clock = ~ps2clock;
end

And all 8 LEDs are logic 0 in the sim as expected.
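
A testbench that drives a complete PS/2 frame would exercise more of the design than a free-running clock. The following is a rough sketch under assumptions: tb_top, send_bit and send_byte are made-up names, the timing values are illustrative rather than taken from the PS/2 specification, and it assumes the registers in ps2keyboard start at known values (for example via the initial block suggested earlier in the thread), since otherwise currentBit stays unknown in simulation.

```verilog
`timescale 1ns / 1ps
// Hypothetical testbench sketch for the top module posted at the start of the thread.
module tb_top;

   reg clock = 0;
   reg ps2clock = 1;   // both PS/2 lines idle high
   reg ps2data = 1;
   wire [7:0] leds;
   wire tx;

   // port order follows the posted top module: ps2clock, ps2data, leds, clock, tx
   top dut (ps2clock, ps2data, leds, clock, tx);

   always #10 clock = ~clock;   // 20 ns period = 50 MHz

   // Present one bit, then pulse the PS/2 clock low. The data line is held
   // stable around the falling edge, which is where the design samples it.
   task send_bit(input b);
      begin
         ps2data = b;
         #20000 ps2clock = 0;   // 20 us of setup before the falling edge
         #40000 ps2clock = 1;   // 60 us per bit in total (illustrative)
      end
   endtask

   // One 11-bit frame: start bit, 8 data bits LSB first, odd parity, stop bit.
   task send_byte(input [7:0] value);
      integer i;
      begin
         send_bit(1'b0);
         for (i = 0; i < 8; i = i + 1)
            send_bit(value[i]);
         send_bit(~^value);     // parity bit makes the count of ones odd
         send_bit(1'b1);
      end
   endtask

   initial begin
      #100000;
      send_byte(8'h1C);         // example scan code
      #100000;
      $display("leds = %h", leds);
      $finish;
   end

endmodule
```

If the design behaves in simulation, leds should end up holding the byte that was sent, which at least separates logic errors from problems further down the tool chain.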


hamster_nz: I added NET "clock" PERIOD=10ns; to my UCF, but the problem persists.


mrflibble: I have tried Project > Cleanup Project Files, and redone the synthesize/implement/generate file steps, but the problem persists. I have the -500 version of the Nexys2, and the LEDs work fine in other projects. How would I go about inspecting the synthesis results?

-Farrell
 

Offline mrflibble

  • Super Contributor
  • ***
  • Posts: 2051
  • Country: nl
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #18 on: November 23, 2014, 11:05:32 am »
Quote from: farrell
mrflibble: I have tried Project > Cleanup Project Files, and redone the synthesize/implement/generate file steps, but the problem persists. I have the -500 version of the Nexys2, and the LEDs work fine in other projects. How would I go about inspecting the synthesis results?

In the synthesis menu, you can do "View RTL synthesis somethingsomething", I forgot the exact label.

If you attach a zip with the ISE project files I can take a look at it if you want.
 

Offline Artlav

  • Frequent Contributor
  • **
  • Posts: 750
  • Country: mon
    • Orbital Designs
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #19 on: November 23, 2014, 11:23:40 am »
When the answer is elusive, never rule out ninjas assumptions.

Try reassigning the LEDs backwards: in the Verilog, in the pin assignments, and in both (that is, LED0 is 7, and so on).
I.e. there can be some sort of an internal short to an unused pin with undefined value, or something unexpected like that.

Look carefully at LED 7 - is it the same brightness as the others?
If not, it might be flickering very fast for some reason.

Try feeding the PS2 clock from a signal generator, with 0 or 1 at the data pin.

Try various changes in names, to see how often the issue arises - is it a case of lucky seed with kbd, or does it happen on every odd name (would help trying other things by knowing the chances of it happening)?

I tried simulating your code and it appears to work fine, so the chances are it's something one step down below actual verilog - Xilinx toolset peculiarities, FPGA peculiarities or hardware issues.
 

Offline farrell (Topic starter)

  • Contributor
  • Posts: 19
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #20 on: November 23, 2014, 09:50:53 pm »
I think I got it fixed! Finally!

First, I tried a bunch of things that did NOT fix it:

1. Changing the UCF assignments didn't help. The bits were reassigned to different LEDs, but the left-most LED that was stuck stayed stuck... it did not follow the new pin assignment. The LEDs that weren't stuck correctly reflected their new assignment.
2. Even commenting out the LEDs in the UCF and just having the ps2keyboard module output wired to the rs232_transmitter module didn't work. The stuck LED was STILL LIT. I'm raging pretty hard now...
3. I installed the Xilinx and Digilent software about two years ago (but only recently did anything past HelloWorld stuff) so I decided to try the latest software. Out of frustration I took the nuclear option: set up a whole new VM, installed a fresh copy of Windows 7, installed all Windows Updates as of last night (that took forever!) and the latest version of ISE WebPack and Digilent Adept. Tried it out, and the LED was still stuck. Words cannot describe my anger.

Finally, success:

I had been loading my bitstream directly into the FPGA during all of this development and debugging. I decided to try loading the bitstream in the platform flash to see if anything would change. So I changed the FPGA Start Up Clock setting from "JTAG Clock" to "CCLK", regenerated the bitstream and flashed the board. FINALLY. It works properly now. I did not change anything else. I did not touch any PCB jumpers, etc.

Why might this have fixed the problem? Was the Adept software not properly "erasing" (not sure if that's the right word) when loading the bitstream into the FPGA instead of into the platform flash?

-Farrell
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2803
  • Country: nz
Re: Learning Verilog, Have Weird Bug, Don't Know What to Try
« Reply #21 on: November 23, 2014, 09:53:55 pm »
It is LED7 that is giving you issues (pin R4)?

That pin is also used as a voltage reference for some obscure I/O standards, so perhaps it is an I/O standards issue. Check your pinout report.

HSTL and SSTL inputs use the Reference Voltage (VREF) to bias the input-switching threshold. Once a configuration data file is loaded into the FPGA that calls for the I/Os of a given bank to use HSTL/SSTL, a few specifically reserved I/O pins on the same bank automatically convert to VREF inputs. For banks that do not contain HSTL or SSTL, VREF pins remain available for user I/Os or input pins.

I too have a Nexys2-500 at home, so if you want to email me your design files to hamster@snap.net.nz I can run it up for you.

EDIT: Oh - I see you've fixed it. Great!
« Last Edit: November 23, 2014, 09:56:20 pm by hamster_nz »
 

