Author Topic: Series defect on agilent 167xx boards?  (Read 40227 times)

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Series defect on agilent 167xx boards?
« on: November 18, 2017, 08:53:24 pm »
Hi there,

I recently acquired some used 16750A boards for my 16902A logic analyzer. Curiously, some of them seem to have similar problems in the selftests:
Zoom Acquisition Chip Select Test Failed!
Chip 0: Master Clock from Chip 0 Test Failed!
Comparator Test Failed! (only one Pod connected to one Chip fails, not always the same one)

When inspecting the boards for damage I couldn't find anything odd except some small green nodules forming next to the plastic parts on the bottom. When I took them off I noticed that the adhesive material was hard as rock (it should be soft, I think) and some of the copper traces underneath are eroded (really eroded, not ripped off  ;)

I think the failing selftests are related to the damage to the traces, and more specifically to the clock traces (differential pair on the bottom left).
Did anybody repair a problem like this?

When looking at working cards they seem to develop the same problem. Right around the edges there are small green nodules. But the traces underneath have not been damaged (yet).

After reading
https://community.keysight.com/message/57746
is it just me or is this a common problem?
 

Offline simmconn

  • Regular Contributor
  • *
  • Posts: 55
Re: Series defect on agilent 167xx boards?
« Reply #1 on: November 19, 2017, 02:08:41 am »
The way I see it, it is a common problem, and you've found the cause. Either some chemical in the adhesive/gasket is corrosive, or it absorbs moisture and causes corrosion. I don't see a good way to fix it, as in many cases the residue of the adhesive/gasket is very difficult to remove, and repairing broken traces/vias in a multi-layer board is no easy task. I may be pessimistic, but I think all 16500 series and 16700 series plugins with the plastic stiffener will eventually die because of this. It's just a matter of time.

Working cards may have the same corrosion, only it has not yet eaten away enough copper to cause a problem. I usually clean the corrosion with IPA, but it comes back after a while.

The 16900 series plugins use a different stiffener (shorter and thinner) and a different adhesive (transparent instead of foam/felt-like). Maybe Agilent knew something and moved to a new design.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #2 on: November 19, 2017, 08:37:21 pm »
simmconn is right.  This is a known problem with the adhesive on the plastic runners.  I have 20 or so of various 165xx and 167xx cards and they all have this corrosion to some degree.  It seems to be particularly pervasive on the 1675x cards.

The only fix is to remove the runners and, I also agree, clean up the board with IPA.  Do an end-to-end continuity check on all traces running under or near the runners.  Don't trust them just because they look "ok".  I've had some where the traces looked fine and I later found out they were corroded and severed *under* the solder mask.  I've seen some with corrosion down the vias, which could render the board unrepairable.

I've also seen boards where people have not been careful about inserting them into a chassis and there are cracked/broken components and PCB traces on the underside.  Do a thorough survey under a microscope.  One of my boards had an invisible fracture in a ferrite bead which cut the power to a chip, which in turn killed the acquisition clock.

I have a 16702B chassis which runs HPUX.  On it there's a command line utility, /usr/sprockets/bin/pv, which is the equivalent of the self-test GUI tool.  It can be used to turn on more verbose output from the self-test routines (set debug, mode, and/or result levels to something other than 0).  Some of these values can also be set from the GUI under the "Options" menu on the self-test.  I have no experience with the 16902A, but I would poke around for something similar.

Regrettably, there's no documentation for any of the verbose self-test levels or the meaning of the output.  You have to just play with it and guess as to the meaning.  However, I found that some of the output, at least under HPUX, will actually point you to specific chips or signal lines that aren't behaving as expected.  It's better than nothing.

That all being said, it's not guaranteed you can fix one of these boards.  I still have a bunch of dead ones even after fixing all the severed traces I could find.  If you bought it as "Used", I would consider returning it since it's supposed to be functional.  It can turn into a huge time sink.

EDIT: For the curious: Added photos of a 16755A with a severed trace (fixed now), and a picture of the corrosion.
« Last Edit: November 19, 2017, 08:45:53 pm by MarkL »
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #3 on: November 20, 2017, 08:54:06 am »
Great!

Well, not great, but I thought I'd been cheated on those boards.

I wonder if Agilent did that on purpose to limit the lifetime of their boards?

@MarkL: how did you repair them? Did you just bridge the missing part of the trace or reroute the connection entirely with wire?

I've attached a selftest log for posterity.
And I agree: without actually knowing what part is tested where, it's hard to understand.
That and the fact that most technical information is in the Logic Analyzer Software Help File and not in the card's datasheet ;)

The only pattern I see is clock. Except for the last 16752, which probably also has a defective memory chip, I guess.
We'll see when I have time to try to repair them.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #4 on: November 20, 2017, 02:41:50 pm »
Quote
I wonder if Agilent did that on purpose to limit the lifetime of their boards?
I doubt it.  It took many years to manifest itself before they even knew they had a problem.

Quote
@MarkL: how did you repair them? Did you just bridge the missing part of the trace or reroute the connection entirely with wire?
The damage was too long to be bridged with solder.

I scraped the soldermask off both sides of the break until I found good solid copper instead of crumbly powder.  I then bridged the path using a single strand extracted from one conductor in a ribbon cable (around 0.12mm, roughly 36 AWG).
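
For anyone checking the gauge: the standard AWG definition gives the diameter as 0.127 mm × 92^((36 − n)/39), so 36 AWG works out to 0.127 mm, which is close to the roughly 0.12 mm strand mentioned here. A minimal Python sketch of that conversion (the function name `awg_to_mm` is purely illustrative):

```python
def awg_to_mm(gauge: int) -> float:
    """Diameter in mm of a solid wire, using the standard AWG formula."""
    return 0.127 * 92 ** ((36 - gauge) / 39)


# 36 AWG is the reference point of the formula, so this prints 0.127
print(round(awg_to_mm(36), 3))
```

Thinner strands have higher gauge numbers, so anything much thicker than about 0.13 mm would be a lower gauge than 36.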

Hint: Tin the tip of the wire, and then use extra flux on the bare trace and the wire before tacking it in place.

Below is a photo where I repaired 15 paths in one area.  The white outline is where the runner sat, and it totally destroyed 11 or 12 of the traces directly under it.  I patched them all to be sure.  There were several other areas with breaks, but unfortunately repairing them all did not bring this particular card back to life.
 

Offline Bashstreet

  • Frequent Contributor
  • **
  • Posts: 298
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #5 on: November 20, 2017, 04:25:49 pm »
Beautiful work. Clearly not your first repair  :-DD Nice job :-+ Shame the card had other issues  :horse:
« Last Edit: November 20, 2017, 04:27:40 pm by Bashstreet »
 

Offline NickAmes

  • Contributor
  • Posts: 31
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #6 on: November 21, 2017, 06:32:44 am »
I'm having a similar issue with a 16760A card. It fails some self-tests and there were green nodules of corrosion around the bumpers. None of the traces underneath are broken, but some vias are corroded. There seem to be two types of vias: small tented ones and large tinned ones (probably used as test points). The tinned ones are the only type corroded. Backlighting the board shows these are plane connections or blind vias. Anyone have advice on fixing these?

Here's the output from pv:
Code:
Mod A  : (0x36) 16760A Logic Analyzer (Master)
Summary   Test Name                      #Tests #Fails
------------------------------------------------------
passed    cpldRegTest                         1      0
passed    testLoadFPGA                        1      0
passed    fpgaRegTest                         1      0
passed    dataBusTest                         1      0
passed    addrBusTest                         1      0
passed    hwMemoryCellTest                    1      0
passed    unloadTest                          1      0
passed    dmaTest                             1      0
passed    sleepTest                           1      0
untested  searchTest                          0      0
passed    chipRegTest                         1      0
passed    anlyBusTest                         1      0
passed    clksTest                            1      0
passed    measAnlyBusTiming                   1      0
passed    bpClkTest                           1      0
passed    icrTest                             1      0
passed    flagTest                            1      0
passed    armTest                             1      0
passed    eepromTest                          1      0
passed    adcTest                             1      0
passed    probeIdTest                         1      0
FAILED    gaProgTest                          1      1
passed    cmpProgTest                         1      0
FAILED    dataPassThruTest                    1      1
FAILED    dataDemuxTest                       1      1
FAILED    vOffsetTest                         1      1
passed    cmpDelayTest                        1      0
FAILED    comparatorCalTest                   1      1
passed    laxCalTest                          1      0
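
With longer logs it can help to pull out just the failing and untested entries. The following is a rough Python sketch, assuming the four-column layout shown above (status, test name, #Tests, #Fails). The function name `summarize` and the regular expression are illustrative and not part of pv:

```python
import re

# Matches rows like "FAILED    gaProgTest    1      1" in the pv summary.
ROW = re.compile(r"^(passed|FAILED|untested)\s+(\w+)\s+(\d+)\s+(\d+)\s*$")


def summarize(pv_output: str) -> dict:
    """Group test names by their summary status; other lines are ignored."""
    groups = {"passed": [], "FAILED": [], "untested": []}
    for line in pv_output.splitlines():
        m = ROW.match(line.strip())
        if m:
            groups[m.group(1)].append(m.group(2))
    return groups


sample = """\
passed    probeIdTest                         1      0
FAILED    gaProgTest                          1      1
passed    cmpProgTest                         1      0
FAILED    dataPassThruTest                    1      1
untested  searchTest                          0      0
"""
result = summarize(sample)
print(result["FAILED"])    # ['gaProgTest', 'dataPassThruTest']
print(result["untested"])  # ['searchTest']
```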

I haven't been able to finish the searchTest. Either it's causing pv to hang or it just takes a really long time. I'm currently letting it grind away to see if it completes.

Edit: The search test still hasn't completed after two hours. pv prints the following at the start then waits:
Code:
pv> d d=9 r=9 
debugLevel=9, mode=0, resultLevel=9
pv> x searchTest

  -- Testing chip 9 slot A ...
  -- Re-Loading Acquisition Memory for Walking 1/0 tests
  -- Starting Walking 1/0 Tests
    -- Walking Zeroes test
    startRow.port:0x0.0, endRow.port:0x7e6.8
    maskSize:12, relative:1, mode:0, type:0
    search master = Chip9, FPGA0
    chip9: level:0x1,25520140 care:0x1,a55aa55a
    chip8: level:0x0,00000000 care:0x0,00000000
« Last Edit: November 21, 2017, 08:12:34 am by NickAmes »
 

Offline Bashstreet

  • Frequent Contributor
  • **
  • Posts: 298
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #7 on: November 21, 2017, 10:37:17 am »
If you haven't already, I would first clean up the area at 520 (right side) as much as possible with a stiff toothbrush (an anti-static one if you have it) and some isopropyl alcohol (generally well tolerated, but you can test first in some safe spot).

At least this should prevent any further corrosion forming.

That said, I do not think the problem is caused by this particular area (of course, as part of an organized diagnosis it must be ruled out).

I am somewhat suspicious of the solder mask in other areas, as it seems there is corrosion forming under it in several places.
I would do a very close inspection of the whole board with a magnifying glass (or a suitable microscope if you have one), especially at the nodules that have formed.
This is difficult because the solder mask can hide many sins: a via might be corroded or damaged underneath and still look fine outwardly...

In any case, I recommend looking for the service manual. It can help you get the ballpark area of the problem, and you can then concentrate your efforts there.

Also, please take high-res images of the whole board for us to look at.  :-+ Also do topside pictures and inspection if not done.
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #8 on: November 21, 2017, 12:47:04 pm »
Yeah, it seems the selftest is nice but doesn't really pinpoint the area or traces one needs to be concerned about.

It does seem the 16750s have a JTAG connector right next to one of the FPGAs close to the backplane connector.
I measured the connections and it seems they are indeed JTAG (I haven't verified this yet, I need datasheets), but I couldn't find any further connections for a JTAG chain. Maybe that's done differently.

The mainframe must be able to reprogram the cards, and the source files are in the Logic Analyzer software directory (ttf / rbt / bit / xsvf files for various cards: Fatcat (16910) / Fastcat (?) / Wildcat (16950, also likely 16753/54/55) and 1696x, as far as I could identify).
There are others: Yari and Daytona that also come with a suffix Pv (= Performance Verification)

Daytona maybe 16760A but there are two revisions
Yari = 16750A/51A/52A
Yari Rev 2 probably = 16740A/41A/42A/50B/51B/52B (could be verified by modifying the resistor field accordingly 4xA <-> 5xB ) See https://www.eevblog.com/forum/testgear/hpagilent-1675x-logic-analyzer-card-memory-up-hack/
agYariMemFpga.rbt is most likely for the 4 memory controllers in the middle of the card. The files are even humanreadable (well semi: 0s and 1s ;)
agYariIfaceFpga.ttf is most likely for the Flex 10K10 (15 KB * 4 for plaintext encoding,only sram (=no persistence), there are no configuration devices on the board)
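
To compare these files between software versions or cards, they can be turned back into raw bytes. The sketch below is only that, a sketch, and rests on two assumptions not verified against these particular files: that the .rbt file is a few text header lines followed by lines made up solely of 0 and 1 characters, and that the .ttf file is a list of comma-separated decimal byte values (which would also explain the "15 KB * 4" size, at up to three digits plus a comma per byte). The function names are illustrative:

```python
import re


def rbt_to_bytes(text: str) -> bytes:
    """Pack the 0/1 lines of an .rbt-style file into bytes, MSB first.

    Assumption: header lines contain characters other than 0/1 and are skipped.
    """
    bits = "".join(
        line.strip()
        for line in text.splitlines()
        if re.fullmatch(r"[01]+", line.strip())
    )
    usable = len(bits) - len(bits) % 8  # drop a trailing partial byte, if any
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


def ttf_to_bytes(text: str) -> bytes:
    """Parse comma-separated decimal byte values (assumed .ttf layout)."""
    return bytes(int(tok) for tok in re.findall(r"\d+", text))


rbt_sample = "Bits: 16\n1111111100000001\n"
ttf_sample = "255,0,18,\n52"
print(rbt_to_bytes(rbt_sample).hex())  # ff01
print(ttf_to_bytes(ttf_sample).hex())  # ff001234
```

The resulting byte strings can then be hashed or diffed to see whether two cards really share the same FPGA image.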

What about the rest?
I couldn't find a file for the EPM7256A so that is probably programmed via the JTAG connector in the factory. This most likely determines the basic identity of the card (Family / ID Code), the resistors determine the exact model/options.
It is probable that the EPM7256A then routes information from the backplane to program the Flex 10K10 (most likely interface logic to read out the memory) and the FPGAs on board (memory controllers, the 4 silver chips in the middle for 16750A). The two chips with heatsinks are probably ASICs, as there are no files to program them. Then there are ASICs for FISO, maybe A/D, maybe Zoom (5 on the 16750), and last but not least programmable comparators (4 top / 4 bottom).
There is an additional TTL to BTL transceiver TI FB2040 close to the backplane connector (for what?)
There is a PCF8584 close to the JTAG connector, but not connected to it -> Probe identification?

Has anybody ever tried JTAG and got it working?
Then a boundary scan would be possible making life easier.

P.S. Sorry if this post looks weird, work in progress. Help is welcome, maybe you can answer one of the questions?
« Last Edit: November 22, 2017, 07:43:56 pm by DocBen »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #9 on: November 21, 2017, 05:06:06 pm »
...
Edit: The search test still hasn't completed after two hours. pv prints the following at the start then waits:
...
Although there is some corrosion, your board doesn't look too terrible.

It would be nice to know what a working pv self-test looks like for a 16760A.  I don't have one to be able to help you with that, but maybe someone with the card and a 1670xA/B can run the test.

With the 1675x cards, some of the debug printout references to "Chip x" mean "Ux".  If true here, there could be an issue related to U8.  All 0's looks strange.  Perhaps focus on that area and all the connecting traces to and from that area.
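
To spot such all-zero chips in a longer debug dump, a small script can do the scanning. This is a rough Python sketch that assumes the line format seen in the searchTest output earlier in the thread (`chipN: level:0x..,.. care:0x..,..`). Whether all zeros really indicates a fault is a guess, since the output is undocumented, and the function name is illustrative:

```python
import re

# Matches lines like "chip9: level:0x1,25520140 care:0x1,a55aa55a"
LINE = re.compile(
    r"chip(\d+):\s+level:0x([0-9a-fA-F]+),([0-9a-fA-F]+)"
    r"\s+care:0x([0-9a-fA-F]+),([0-9a-fA-F]+)"
)


def all_zero_chips(debug_output: str) -> list:
    """Return chip numbers whose level and care fields are entirely zero."""
    suspects = []
    for m in LINE.finditer(debug_output):
        chip = int(m.group(1))
        fields = m.groups()[1:]
        if all(int(f, 16) == 0 for f in fields):
            suspects.append(chip)
    return suspects


sample = """\
    chip9: level:0x1,25520140 care:0x1,a55aa55a
    chip8: level:0x0,00000000 care:0x0,00000000
"""
print(all_zero_chips(sample))  # [8]
```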

And yes, a high-res photo of both sides could help.  Occasionally someone can spot something that was overlooked.

If you haven't read it yet, go through the Service Guide Self Test and Theory sections.  Although there are no schematics (much grumbling here), there are sometimes useful clues:

  http://literature.cdn.keysight.com/litweb/pdf/16760-97013.pdf

I think you're right that some of the "vias" are actually test points.  I haven't attempted a repair of a via, mainly because there are no schematics to verify where the signal is supposed to go.  If you have a working board to compare against, it's possible to figure it out, but be prepared to invest a large amount of time.

Why was "searchTest" showing up as untested?  Did you run the tests manually one by one?  Perhaps play with different values for debug, result, or mode.  Maybe it will print out more detail of what it's doing for searchTest.  Also, if you set debug mode 1 and then type help, does anything extra show up for the 16760A card?


EDIT: Typo on card model.
« Last Edit: November 21, 2017, 05:12:40 pm by MarkL »
 

Offline NickAmes

  • Contributor
  • Posts: 31
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #10 on: November 22, 2017, 04:02:06 am »
The photos won't fit within the forum size limits, so I made an imgur album: https://imgur.com/a/4dNcX (If you want, you can download the whole album using the ... menu at the bottom of the page.)

After scraping away the hardened foam, I cleaned the bumper areas with acetone using a brush and q-tips. Isopropyl alcohol didn't seem to touch it. The soldermask scratches across the traces are from chipping away the foam (there are also small particles of foam stuck to the board).

I've tested trace continuity in the areas marked OK and the center one. There's a few more corroded vias in the upper-left area. (When looking at the whole-board bottom view.) There's also some corrosion on the lids of the memory controller FPGAs, which is a bit alarming. However, the solder around them doesn't seem to be corroded.

With the 1675x cards, some of the debug printout references to "Chip x" mean "Ux".  If true here, there could be an issue related to U8.  All 0's looks strange.  Perhaps focus on that area and all the connecting traces to and from that area.

If you haven't read it yet, go through the Service Guide Self Test and Theory sections.  Although there's no schematics (much grumbling here), there's sometimes useful clues:

Why was "searchTest" showing up as untested?  Did you run the tests manually one by one?  Perhaps play with different values for debug, result, or mode.  Maybe it will print out more detail of what it's doing for searchTest.  Also, if you set debug mode 1 and then type help, does anything extra show up for the 16760A card?


"Chip 8" meaning U38 makes a lot of sense. I wish the problems were confined to chip 8. If so, it might be possible to use half the logic analyzer channels. (The service manual implies that they're mostly independent.) Unfortunately, the failing self tests talk about chips 8 and 9:
Code:
pv> x dataPassThruTest
Slot A: Walking Zeros Test ...
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ .......B  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ......B.  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ......BB  Data Levels
    Slot A, Chip 8: . ........ ........  B BB...... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . .B...... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . B....... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  B ........ ........  Data Levels
Slot A: Walking Ones  Test ...
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ .......B  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ......B.  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ......BB  Data Levels
    Slot A, Chip 8: . ........ ........  B BB...... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . .B...... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . B....... ........  Data Levels
    Slot A, Chip 9: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 8: . ........ ........  . ........ ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  B ........ ........  Data Levels
> Slot A: Data Path Pass-Thru Test Failed!
Mod   A: TEST FAILED       # "dataPassThruTest" (2, 2, -1)
pv> x dataDemuxTest
Slot A: Data Set Low ...
    Slot A, Chip 9: . ........ ......BB  . ........ ......BB  No Activity
    Slot A, Chip 8: B BB...... ........  B BB...... ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 8: . ........ ........  . ........ ........  Data Levels
    Slot A, Chip 9: . ........ ......BB  . ........ ......BB  No Activity
    Slot A, Chip 8: B BB...... ........  B BB...... ........  No Activity
    Slot A, Chip 9: . ........ ......BB  . ........ ........  Data Levels
    Slot A, Chip 8: B ........ ........  B ........ ........  Data Levels
Slot A: Data Set High ...
    Slot A, Chip 9: . ........ ......BB  . ........ ......BB  No Activity
    Slot A, Chip 8: B BB...... ........  B BB...... ........  No Activity
    Slot A, Chip 9: . ........ ......BB  . ........ ......BB  Data Levels
    Slot A, Chip 8: B BB...... ........  B BB...... ........  Data Levels
    Slot A, Chip 9: . ........ ......BB  . ........ ......BB  No Activity
    Slot A, Chip 8: B BB...... ........  B BB...... ........  No Activity
    Slot A, Chip 9: . ........ ........  . ........ ......BB  Data Levels
    Slot A, Chip 8: B ........ ........  B BB...... ........  Data Levels
> Slot A: Data Path Demux Test Failed!
Mod   A: TEST FAILED       # "dataDemuxTest" (1, 1, -1)
pv> x comparatorCalTest
  Mod A: Pod 1, vDac1:4.700V, vAdc1:4.699V
                vDac2:0.300V, vAdc2:0.300V
                vDiff:0.001V, rProbe:0.000 KOhms, tolerance:5.0%
  Mod A: Pod 2, vDac1:4.700V, vAdc1:4.697V
                vDac2:0.300V, vAdc2:0.300V
                vDiff:0.003V, rProbe:0.000 KOhms, tolerance:5.0%
Slot A: OS Null cal failure, no 0, comp=3, chan=0
Slot A: OS Null cal failure, no 0, comp=3, chan=5
Slot A: OS Null cal failure, no 0, comp=3, chan=6
> Slot A: Pod 1: Offset Null Cal Failed
Slot A: RC comp cal failure, threshold error, comp=3, chan=0
> Slot A: Pod 1: RC Compensation Cal Failed
Slot A: Deskew cal failure, no 1, comp=3, chan=5
Slot A: Deskew cal failure, no 1, comp=3, chan=6
> Slot A: Pod 1: Data Deskew Cal Failed
CALIBRATION 1
chip 9: numPeriodPos = 0, numPeriodNeg = 1, psPerTapFine = 88
chip 9: lastPosTapFine = 16, firstPosTapFine = 6, fineTapTime = 880
chip 9: lastNegTapFine = 11, firstNegTapFine = 5, fineTapTime = 528
chip 9: numPeriodPos = 8, numPeriodNeg = 7, psPerTapCoarse = 1388
chip 9: fineToCoarseRatio = 16
chip 8: numPeriodPos = 0, numPeriodNeg = 1, psPerTapFine = 97
chip 8: lastPosTapFine = 12, firstPosTapFine = 7, fineTapTime = 485
chip 8: lastNegTapFine = 4, firstNegTapFine = 8, fineTapTime = -388
chip 8: numPeriodPos = 8, numPeriodNeg = 9, psPerTapCoarse = 1558
chip 8: fineToCoarseRatio = 16
Slot A: Tap Delay cal failure, no signal, comp=3
Slot A: Tap Delay cal failure, no signal, comp=3
> Slot A: Pod 1: Measure Tap Delay Cal Failed
Slot A: GA Delay reset at doneCount=3, compDelay=412, comp=2, ch=2
Slot A: GA Delay cal failure, initial out not 0, comp=3, ch=5
Slot A: GA Delay cal failure, initial out not 0, comp=3, ch=6
Slot A: comp:3 chan:5 cmpDelay=411 gaDelay[0]=12
Slot A: comp:3 chan:5 cmpDelay=411 gaDelay[1]=18
Slot A: comp:3 chan:5 cmpDelay=411 gaDelay[2]=26
Slot A: comp:3 chan:5 cmpDelay=411 gaDelay[3]=30
Slot A: comp:3 chan:6 cmpDelay=395 gaDelay[0]=12
Slot A: comp:3 chan:6 cmpDelay=395 gaDelay[1]=18
Slot A: comp:3 chan:6 cmpDelay=395 gaDelay[2]=26
Slot A: comp:3 chan:6 cmpDelay=395 gaDelay[3]=30
Slot A: comp:3 chan:7 cmpDelay=394 gaDelay[0]=11
Slot A: comp:3 chan:7 cmpDelay=394 gaDelay[1]=17
Slot A: comp:3 chan:7 cmpDelay=394 gaDelay[2]=24
Slot A: comp:3 chan:7 cmpDelay=394 gaDelay[3]=28
> Slot A: Pod 1: Gate Array Delay Cal Failed
> Slot A: Pod 1: Comparator Calibration Failed
Slot A: OS Null cal failure, no 0, comp=3, chan=0
Slot A: OS Null cal failure, no 0, comp=3, chan=1
Slot A: OS Null cal failure, no 0, comp=3, chan=2
Slot A: OS Null cal failure, no 0, comp=3, chan=3
> Slot A: Pod 2: Offset Null Cal Failed
Slot A: RC comp cal failure, threshold error, comp=3, chan=0
> Slot A: Pod 2: RC Compensation Cal Failed
Slot A: Deskew cal failure, no 1, comp=3, chan=1
Slot A: Deskew cal failure, no 1, comp=3, chan=2
Slot A: Deskew cal failure, no 1, comp=3, chan=3
> Slot A: Pod 2: Data Deskew Cal Failed
CALIBRATION 2
chip 9: numPeriodPos = 0, numPeriodNeg = 1, psPerTapFine = 90
chip 9: lastPosTapFine = 15, firstPosTapFine = 7, fineTapTime = 720
chip 9: lastNegTapFine = 11, firstNegTapFine = 6, fineTapTime = 450
chip 9: numPeriodPos = 8, numPeriodNeg = 7, psPerTapCoarse = 1390
chip 9: fineToCoarseRatio = 15
chip 8: numPeriodPos = 0, numPeriodNeg = 1, psPerTapFine = 97
chip 8: lastPosTapFine = 11, firstPosTapFine = 8, fineTapTime = 291
chip 8: lastNegTapFine = 4, firstNegTapFine = 9, fineTapTime = -485
chip 8: numPeriodPos = 8, numPeriodNeg = 9, psPerTapCoarse = 1561
chip 8: fineToCoarseRatio = 16
Slot A: Tap Delay cal failure, no signal, comp=3
Slot A: Tap Delay cal failure, no signal, comp=3
> Slot A: Pod 2: Measure Tap Delay Cal Failed
Slot A: GA Delay cal failure, initial out not 0, comp=3, ch=1
Slot A: GA Delay cal failure, initial out not 0, comp=3, ch=2
Slot A: GA Delay cal failure, initial out not 0, comp=3, ch=3
Slot A: comp:3 chan:1 cmpDelay=59936 gaDelay[0]=12
Slot A: comp:3 chan:1 cmpDelay=59936 gaDelay[1]=18
Slot A: comp:3 chan:1 cmpDelay=59936 gaDelay[2]=26
Slot A: comp:3 chan:1 cmpDelay=59936 gaDelay[3]=30
Slot A: comp:3 chan:2 cmpDelay=17200 gaDelay[0]=12
Slot A: comp:3 chan:2 cmpDelay=17200 gaDelay[1]=18
Slot A: comp:3 chan:2 cmpDelay=17200 gaDelay[2]=26
Slot A: comp:3 chan:2 cmpDelay=17200 gaDelay[3]=30
Slot A: comp:3 chan:3 cmpDelay=0 gaDelay[0]=12
Slot A: comp:3 chan:3 cmpDelay=0 gaDelay[1]=18
Slot A: comp:3 chan:3 cmpDelay=0 gaDelay[2]=26
Slot A: comp:3 chan:3 cmpDelay=0 gaDelay[3]=30
Slot A: GA Delay reset at doneCount=1, compDelay=399, comp=4, ch=7
> Slot A: Pod 2: Gate Array Delay Cal Failed
> Slot A: Pod 2: Comparator Calibration Failed
Mod   A: TEST FAILED       # "comparatorCalTest" (1, 1, -1)
pv> x gaProgTest
  Slot A: GateArray[3] failed on 6/13 bits
             exp: 0x1555  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 10/13 bits
             exp: 0x1210  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 13/13 bits
             exp: 0x0000  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 13/13 bits
             exp: 0x0000  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 6/13 bits
             exp: 0x1555  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 13/13 bits
             exp: 0x0000  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
  Slot A: GateArray[3] failed on 13/13 bits
             exp: 0x0000  act: 0x1fff
> Slot A: Gate Array Load Test Failed!
Mod   A: TEST FAILED       # "gaProgTest" (1, 1, -1)
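
One possible reading of the gaProgTest lines above: the failing-bit count looks like the number of bit positions in which the expected and actual 13-bit words differ. The sketch below checks that reading against the three distinct exp/act pairs from the log. The helper name `failed_bits` is made up for illustration, and this interpretation of the output is an assumption rather than anything documented by Agilent.

```python
# Hypothetical helper: count differing bits between the expected and actual
# 13-bit gate array readback words (values copied from the log above).
WIDTH = 13
MASK = (1 << WIDTH) - 1  # 0x1fff

def failed_bits(exp: int, act: int) -> int:
    """Number of bit positions where exp and act disagree."""
    return bin((exp ^ act) & MASK).count("1")

pairs = [(0x1555, 0x1FFF), (0x1210, 0x1FFF), (0x0000, 0x1FFF)]
counts = [failed_bits(e, a) for e, a in pairs]

for (exp, act), n in zip(pairs, counts):
    print(f"exp=0x{exp:04x} act=0x{act:04x} -> {n}/{WIDTH} bits differ")
```

The counts agree with the 6/13, 10/13 and 13/13 figures in the log. Since the actual value is 0x1fff every time, GateArray[3] appears to read back all ones regardless of what was written.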

"searchTest" is untested because I haven't been able to get it to complete. Running it in mode 1 doesn't change the output:
Code: [Select]
pv> d m=1 r=9 d=9
debugLevel=9, mode=1, resultLevel=9
pv> x searchTest

  -- Testing chip 9 slot A ...
  -- Re-Loading Acquisition Memory for Walking 1/0 tests
  -- Starting Walking 1/0 Tests
    -- Walking Zeroes test
    startRow.port:0x0.0, endRow.port:0x7e6.8
    maskSize:12, relative:1, mode:0, type:0
    search master = Chip9, FPGA0
    chip9: level:0x1,25520140 care:0x1,a55aa55a
    chip8: level:0x0,00000000 care:0x0,00000000
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #11 on: November 22, 2017, 09:08:15 am »
The power regulator(?) next to them looks fishy. That side also looks like it has significantly more dirt/corrosion (on the underside of the board). Maybe you could clean that area some more to get a better picture.

I've used the plexiglas part of the spacers to remove the adhesive; it doesn't leave scratch marks on the board.
« Last Edit: November 22, 2017, 07:25:42 pm by DocBen »
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #12 on: November 22, 2017, 03:54:50 pm »
What's throwing me off here is that the naming convention for the chips is so different.
I wonder if that's because of the card or the analyzer.

At first I thought Chip 8/9 might refer to the 10 chips closer to the input, but reading it more thoroughly, they seem to correspond to Chip 0/1 on the 16750 with the 16902a (which don't have any correlation with the markings on the board).

If so, there might be something wrong with the chip itself and not the traces, because there also seems to be trouble programming it.
Also, there is something wrong with Comparator 3, maybe severed lines, but then again: which one of them is Comparator 3? ;)
Other than that, all signals that don't fail the selftest could still work.
Maybe test that with a function generator?
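
The tap-delay numbers in the comparatorCalTest output are at least internally consistent, which may help when comparing the two chips. Judging from the log alone, fineTapTime matches (lastTapFine - firstTapFine) * psPerTapFine, and fineToCoarseRatio matches psPerTapCoarse / psPerTapFine rounded to the nearest integer. Both relationships are inferred from the printed values, not taken from any documentation, and the function names below are invented for this sketch:

```python
# Values copied from the "CALIBRATION 1" section of the log.
# Tuple layout: (chip, label, last_tap, first_tap, ps_per_tap_fine, logged_time)
rows = [
    (9, "Pos", 16, 6, 88, 880),
    (9, "Neg", 11, 5, 88, 528),
    (8, "Pos", 12, 7, 97, 485),
    (8, "Neg", 4, 8, 97, -388),
]

def fine_tap_time(last_tap, first_tap, ps_per_tap):
    # Assumed relationship, inferred from the numbers in the log.
    return (last_tap - first_tap) * ps_per_tap

def fine_to_coarse_ratio(ps_coarse, ps_fine):
    # Assumed: coarse tap size divided by fine tap size, rounded.
    return round(ps_coarse / ps_fine)

computed = [fine_tap_time(l, f, ps) for _, _, l, f, ps, _ in rows]
ratios = [fine_to_coarse_ratio(1388, 88), fine_to_coarse_ratio(1558, 97)]

for (chip, label, l, f, ps, logged), calc in zip(rows, computed):
    flag = "  <-- last tap before first tap" if l < f else ""
    print(f"chip {chip} {label}: computed {calc} ps, logged {logged} ps{flag}")
print("fineToCoarseRatio (chip 9, chip 8):", ratios)
```

The only row where the last tap comes before the first is the "Neg" row on chip 8, which produces the negative fineTapTime in the log. Whether that is a symptom of the fault or normal behaviour is not clear from the output alone.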
« Last Edit: November 22, 2017, 04:08:00 pm by DocBen »
 

Offline NickAmes

  • Contributor
  • Posts: 31
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #13 on: November 23, 2017, 11:28:55 pm »
Other than that, all signals that don't fail the selftest could still work.
Maybe test that with a function generator?

Unfortunately the software hangs when trying to capture from the device. I think it's related to the searchTest failure (or maybe the gate array programming failure).
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #14 on: November 25, 2017, 02:35:34 pm »
@NickAmes: Unfortunately that might really mean you can't use the card at all, because Chip 8/9 are ASICs, so if they don't work that's probably it, since they share the workload in all configurations AFAICT.
Unless you can find another card and replace the defective chip.

But that is going to be hard: I have a 16960a that I'm hoping to restore, and I looked at the ASIC. Not only is it a BGA, but they also put some sort of epoxy around its base, so it is essentially glued to the board with little chance of removing it.
(I know that it isn't defective though; the analyzer just bricked it while trying to do a firmware update of the onboard interface FPGA, and I think I can reprogram that, or the configuration device to be more precise.)
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #15 on: November 26, 2017, 08:48:08 pm »
Here's something that may or may not help...

On the 16700 HPUX-running chassis, you can turn on debugging when invoking the top level GUI, /usr/sprockets/bin/vp:

  usage: vp [-debugqfxa]
       [-debug level]
       [-all] (load all shared library groups)
       [-q] (don't show intro screen, don't prebuild instrument menus)
       [-f cfgfile] (load this config file at powerup)
       [-x] (exit after powerup, typically used w/ -f option)
       [-a address]
           address < 0:     don't load instruments
           address > 5:     search all scsi addresses (the default)
           address = 0-5:   search only this scsi address

Setting -debug to something greater than 1 (well, I tried 1, 2 and 255; try others if you like) makes a "Debug" option appear under the main "Select" menu for some models of analyzer cards.  I tried this on 16752A and 16756A cards, and I think it should work on a 16760A since it appears to share some of the same code.  See screen captures below.

Perhaps you can use this to characterize the problem further by reading registers, or maybe by verifying chip select lines are working.  The "Chip Dump" button causes a file to be created in whatever directory you started "vp" in.  The file created contains a series of register dumps.

Setting "vp -debug 255" causes more stuff to be printed, but it seems mostly related to the various widget settings for the GUI.  There are also some interesting commands if you run /usr/sprockets/bin/starhw.  I think it's the CLI equivalent to what's visible with vp -debug.

The above usage output from vp also implies the cards are accessed through some kind of emulated SCSI interface?  I really haven't looked at it in detail.

As is par for the course in more recent Agilent/Keysight equipment, there's no documentation for any of this.  Good luck, and if you discover anything useful, like for other values of -debug, please share.

There's lots of opportunity for "infinite monkey" hacking...
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #16 on: November 26, 2017, 10:41:26 pm »
Wow, that SCSI bit is really interesting. In the documentation they write something like "proprietary multiplexed 16-bit bus".
The LA connector has 72 pins, SCSI 50 or 68 if I remember correctly, but definitely close enough.

It would really be quite ingenious to do it that way, because all the IP for the FPGAs would already have been there and tested.

Also quite impressed with the debug options. I haven't had time to see if the 16902a has something like that as well; it would be really helpful.
 

Offline NickAmes

  • Contributor
  • Posts: 31
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #17 on: November 27, 2017, 01:07:16 am »
On the 16700 HPUX-running chassis, you can turn on debugging when invoking the top level GUI, /usr/sprockets/bin/vp:

  usage: vp [-debugqfxa]
       [-debug level]
       [-all] (load all shared library groups)
       [-q] (don't show intro screen, don't prebuild instrument menus)
       [-f cfgfile] (load this config file at powerup)
       [-x] (exit after powerup, typically used w/ -f option)
       [-a address]
           address < 0:     don't load instruments
           address > 5:     search all scsi addresses (the default)
           address = 0-5:   search only this scsi address


That's really interesting. Using
Code: [Select]
vp -debug 255
yields a debug menu on the 16760A. (Screenshots attached. The comparator window has a few more options than the 16756A.) I was incorrect earlier when I wrote about the app locking up when capturing from the card. It runs just fine, but locks up when trying to view the data. When trying to capture and view pod 1, the following message is printed to the console:
Code: [Select]
KatanaHardware::ramSearchStart()
    startSample:33554026, stopSample:33554824
    startRow.port:0x142550.a, endRow.port:0x1425b4.8
    maskSize:6, relative:1, mode:1, type:0
    search master = Chip9, FPGA0
    chip9: level:0x0,00009838 care:0x0,ffff0045

For pod 2:
Code: [Select]
chip 8: fineToCoarseRatio = 16
  KatanaHardware::ramSearchStart()
    startSample:33554018, stopSample:33554816
    startRow.port:0x3fffce.2, endRow.port:0x400032.0
    maskSize:6, relative:1, mode:1, type:0
    search master = Chip9, FPGA0
    chip8: level:0x0,00009838 care:0x0,ffff5a44

Also, some of the bits are oscillating with no cables attached. (Third screenshot.) They look like they might be physically adjacent on the card.

I can't mess with the threshold settings since I don't have a probe to go on the end of the cable. According to the service manual, the LA communicates with the probe over I2C to identify which type it is. Does anyone have information on what the LA expects to find there? (Perhaps a small EEPROM?) This would also be helpful for people who can't afford the 90-pin probe prices.
« Last Edit: November 27, 2017, 01:44:20 am by NickAmes »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #18 on: November 27, 2017, 02:10:14 am »
Wow, that SCSI bit is really interesting. In the documentation they write something like "proprietary multiplexed 16-bit bus".
The LA connector has 72 pins, SCSI 50 or 68 if I remember correctly, but definitely close enough.

It would really be quite ingenious to do it that way, because all the IP for the FPGAs would already have been there and tested.

Also quite impressed with the debug options. I haven't had time to see if the 16902a has something like that as well; it would be really helpful.
I think now my guess about using SCSI is probably wrong.  I wasn't able to make the "vp -a" option do anything different, plus I also found some utilities that look like they're using PCI to talk to the cards.  "pciinfo", for one:

  # pciinfo
 
  Starship PCI Device Configuration Header
     0x00: 1650103c   0x10: 0000ff01   0x20: f0f00000   0x30: 00000000
     0x04: 00000007   0x14: f0dfe000   0x24: 00000000   0x34: 00000000
     0x08: 06800000   0x18: f0dff000   0x28: 00000000   0x38: 00000000
     0x0c: 0000f800   0x1c: f0e00000   0x2c: 00000000   0x3c: 00ff0104
 
  Starship PCI Device Non-Volatile Serial ROM Contents
     0x00-0x0f: 3c 10 50 16 00 e0 00 00 00 00 80 06 00 f8 00 00
     0x10-0x1f: c1 ff e8 10 00 f0 ff 7f 00 f0 ff 7f 00 00 f0 7f
     0x20-0x2f: 00 00 f0 bf 00 00 00 00 00 00 00 00 00 00 00 00
     0x30-0x3f: 00 00 00 00 00 00 00 00 00 00 00 00 ff 01 ff 00
 
  Starship PCI Base Addresses
  Base  Start     Size    Mapping     Width    Description
  ----  -----     ----    -------     -----    -----------
     0  0000ff01  40      Kernel Map  32 bits  AMCC Internal Registers
     1  f0dfe000  1000    User Map     8 bits  PassThru0-Bus FPGA
     2  f0dff000  1000    User Map     8 bits  PassThru1-Bus Registers
     3  f0e00000  100000  User Map     8 bits  PassThru2-8 Bit Memory-Mapped Bus
     4  f0f00000  100000  User Map    16 bits  PassThru3-16 Bit Memory-Mapped Bus
 
  Starship PCI AMCC Register Contents
     00=00000000 Outgoing Mailbox 1
     04=00000000 Outgoing Mailbox 2
     08=00000000 Outgoing Mailbox 3
     0c=00000000 Outgoing Mailbox 4
     10=00000000 Incoming Mailbox 1
     14=00000000 Incoming Mailbox 2
     18=00000000 Incoming Mailbox 3
     1c=00000000 Incoming Mailbox 4
     20=00000000 FIFO port
     24=00000000 Master Write Address
     28=00000000 Master Write Transfer Count
     2c=00000000 Master Read Address
     30=00000000 Master Read Transfer Count
     34=00000000 Mailbox Empty/Full Status
     38=00000000 Interrupt Control/Status Register
     3c=000000e6 Bus Master Control/Status Register
 
  Backplane State
    connectCount=1, fpgaLoaded=TRUE, fpgaVersion=1
    PCIInterruptMask=0x0000 (0000000000000000)
 
  Interrupt Setup
    Interrupts are not being used
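
The first few dwords of the configuration header can be decoded with the standard PCI layout: offset 0x00 holds the device ID in the upper 16 bits and the vendor ID in the lower 16, offset 0x08 holds the class code and revision, and bit 0 of each base address register distinguishes I/O space from memory space. A minimal sketch using the values printed above (the function names are hypothetical):

```python
# Decode a few fields from the pciinfo dump using the standard
# PCI configuration header layout. Values are copied from the output above.
def split_id(dword0: int):
    """Offset 0x00: device ID in bits 31..16, vendor ID in bits 15..0."""
    return dword0 & 0xFFFF, (dword0 >> 16) & 0xFFFF

def split_class(dword2: int):
    """Offset 0x08: class code in bits 31..8, revision ID in bits 7..0."""
    return (dword2 >> 8) & 0xFFFFFF, dword2 & 0xFF

def decode_bar(bar: int):
    """Bit 0 set means I/O space; otherwise memory space."""
    if bar & 1:
        return "io", bar & ~0x3
    return "mem", bar & ~0xF

vendor, device = split_id(0x1650103C)
class_code, revision = split_class(0x06800000)
bars = [decode_bar(b) for b in (0x0000FF01, 0xF0DFE000, 0xF0DFF000,
                                0xF0E00000, 0xF0F00000)]

print(f"vendor=0x{vendor:04x} device=0x{device:04x}")
print(f"class=0x{class_code:06x} revision=0x{revision:02x}")
for i, (kind, base) in enumerate(bars):
    print(f"BAR{i}: {kind} base=0x{base:08x}")
```

Vendor 0x103c is Hewlett-Packard's PCI vendor ID, and class 0x0680 is "bridge device, other". The first four bytes of the serial ROM dump (3c 10 50 16) are the same ID dword in little-endian byte order.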


And "pcipeek":

  # pcipeek
  Startouch Backplane Found! (frameID=1)
    PCI addresses:  PAL=f0dfe000
                    Bus control registers=f0dff000
                    Backplane 8 bit bus=f0e00000
                    Backplane 16 bit bus=f0f00000
  rockyII> ?
  RockyII peekpoke Commands: all numbers are hex
  t - toggle between 8 and 16 bit backplane bus r / w access
  r XX - read 8/16 bit backplane address XX
  w XX YY - write 8/16 bit backplane address XX with value YY
  R XXXX - read FPGA/PAL address XXXX
  W XXXX YY - write FPGA/PAL address XXXX with value YY
  T - toggle between FPGA and PAL for R / W access
  a X - set access size to 1, 2, or 4 bytes
  c NNNN - set repeat count for read/write operations
  d filename - download raw bits (.rbt or .tek) file to PCI FPGA
  m filename - download Altera (.ttf) file to Marinade FPGA
  s n - select slot n (a..j, 1..4) in cardcage.  0 ==> no slot
  D msec - delay time between I/O accesses
  v n - set verbosity level to n
  i - show PCI configuration info
  l - load backplane FPGA and list cage contents
  h, ? - this help screen
  q - quit
  rockyII> l
     A: id=0x38  Unknown hardware ID
     B: id=0x38  Unknown hardware ID
     C: id=0x35  Unknown hardware ID
     D: id=0x34  Unknown hardware ID
     E: id=0x1b  Unknown hardware ID
     F: id=0xff  Empty Slot
     G: id=0xff  Empty Slot
     H: id=0xff  Empty Slot
     I: id=0xff  Empty Slot
     J: id=0xff  Empty Slot
     1: id=0xff  Empty Slot
     2: id=0xff  Empty Slot
     3: id=0xff  Empty Slot
     4: id=0xff  Empty Slot
  rockyII>


Those IDs above (printed in hex) map back to the card product IDs in pv and other utilities.  Contents of /usr/sprockets/tools/instrument/16700/productMap (IDs are in decimal):
Code: [Select]
# This is a table of product ID codes and their corresponding HP
# product numbers. When we invent a new product (LAX for example), we'll
# need to add it to this lookup table. Until we have time to spend
# truly eliminating the code dependence cycle, this will have to do.
#
# NOTE:  the directory name and lib name must be the same, since they're
#        pulled from this file as a single name.
#

 4 16517    # Roadrunner
 5 -1       # Roadrunner expander
13 16532    # Epitaph
14 16534    # Franz
15 16535    # Talons
21 -1       # Old Pattern Generator
22 -1       # Old Pattern Generator Expander
24 -1       # Phaser expander
25 16522    # Phaser
26 -1       # Deep Phaser expander
27 16720    # Deep Phaser
31 -1       # Elan
32 16550    # Socrates
33 -1       # Socrates Expander
34 16555    # Marianas
35 -1       # Marianas Expander
36 marinade # Marinade
37 16710    # Kazon
38 -1       # Kazon Expander
40 -1       # Omega
41 -1       # Omega Expander
42 -1       # Chronicle
44 16715    # LAX
45 -1       # LAX Expander
46 ont      # ONT
47 -1       # ONT Expander
48 sfo      # SFO
49 -1       # SFO Expander
50 yakitori # YAKITORI
51 kabob    # KABOB
52 16718    # Katana
53 -1       # Katana Expander
226 multi   # Multiframe
54 16760    # Daytona
55 -1       # Daytona Expander
56 16754    # Wildcat
57 -1       # Wildcat Expander
60 16718    # Yari/Supercharger Rev. 2
61 -1       # Yari/Supercharger Rev. 2 Expander
logic#
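
Since pcipeek prints the slot IDs in hex and productMap lists them in decimal, a quick conversion shows which entries the cards in the listing above correspond to. The sketch below hard-codes only the productMap lines needed for this cage; the dictionary and variable names are invented for illustration:

```python
# Subset of productMap (decimal ID -> (product/lib name, code name)),
# copied from the file contents above.
PRODUCT_MAP = {
    27: ("16720", "Deep Phaser"),
    52: ("16718", "Katana"),
    53: ("-1", "Katana Expander"),
    56: ("16754", "Wildcat"),
}

# Slot IDs as printed by the pcipeek "l" command (hex strings).
SLOTS = {"A": "0x38", "B": "0x38", "C": "0x35", "D": "0x34", "E": "0x1b"}

resolved = {slot: PRODUCT_MAP.get(int(hex_id, 16), ("?", "unknown"))
            for slot, hex_id in SLOTS.items()}

for slot, hex_id in SLOTS.items():
    name, code_name = resolved[slot]
    print(f"slot {slot}: id={hex_id} = {int(hex_id, 16):3d} -> {name} ({code_name})")
```

So slots A and B map to the Wildcat entry, C and D to the Katana expander and Katana entries, and E to Deep Phaser.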


More stuff to play with!

Sorry I can't be more helpful with your 16902A, but I don't think Agilent would delete all this debugging when porting the code.  I found clues for most of it by running "strings" on the executables and shared libraries, and then started trying things.  Perhaps a similar scan through the Windows bits would be enlightening.

EDIT: Fixed couple of minor typos.
« Last Edit: November 27, 2017, 02:37:53 am by MarkL »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #19 on: November 27, 2017, 02:23:26 am »
...
Also, some of the bits are oscillating with no cables attached. (Third screenshot.) They look like they might be physically adjacent on the card.

I can't mess with the threshold settings since I don't have a probe to go on the end of the cable. According to the service manual, the LA communicates with the probe over I2C to identify which type it is. Does anyone have information on what the LA expects to find there? (Perhaps a small EEPROM?) This would also be helpful for people who can't afford the 90-pin probe prices.
One thing you can do is buy E5378A adapters (about $20 on eBay) and some Samtec ASP-65067-01 (also on eBay or Digikey) and make your own single-ended probes.  Or, with a fair amount of trouble, differential probes.  The E5378A has a resistor for the probe ID.

Unless you need the speed, extra memory, or differential probing, it might be easier in the end to go with a 16750A/16751A/16752A card with the much cheaper and ubiquitous single ended probes.  As per my other post, you can turn any of those into the maxed out 16752A.
 

Offline DocBenTopic starter

  • Regular Contributor
  • *
  • Posts: 111
  • Country: de
Re: Series defect on agilent 167xx boards?
« Reply #20 on: November 27, 2017, 04:52:37 pm »
@MarkL: Great stuff, definitely keep it coming!  :popcorn:

I'm a little short on time right now, but I did have a look at the Logic Analyzer Software (which you can download and play with even without an analyzer) and there doesn't seem to be anything quite like the tools you have. Maybe that's only activated in the main application on the actual analyzer when the module is in place. I'll have to test that sometime. I do wish they had Linux and a CLI on this thing though.

Also, I think the 16902a has two buses: SVY and RIO (and I think they are both named after bus stops in California, Scotts Valley and Rio Dell).
SVY seems to be the "old" 16700-type bus and RIO is a high-speed add-on, I guess (something like LVDS with an FB2040 BTL driver), so memory transfer from the cards is likely significantly faster. But it isn't mentioned anywhere. I can only infer it from a bus tuner application for RIO (which sets the voltage swing) and the PCI-to-SVY/RIO drivers the application installs.

Right now the first thing I want to do is to replace the electrolytic capacitors on the Mainboard and then try to get the 16740a to be a 16752b.

 

Offline DoricLoon

  • Contributor
  • Posts: 38
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #21 on: October 01, 2019, 08:44:43 pm »
I wonder if anyone could shed some light on an issue I have with a 16750A card?
I purchased 2x 16741A and 2x 16750A cards from eBay, advertised as not working. The price was pretty low and the cards came with cables.
I was hoping I'd be able to repair at least one of them.

The two 16741As are pretty bad: a large number of self-test failures, and on one of them it looks like the whole bank of RAM may have seen high voltage. (They also show many vias with corrosion.)
The 16750As are a bit more promising, one in particular. When I ran the self test it failed with a comparator error and a calibration error. I couldn't see much sign of corrosion, but took the stiffeners off anyway and had a good look (it looked healthy).

I then connected a power supply linked to each of the bits of a pod and checked the thresholds based on the service manual. Pod 4 low order bits were considerably out.
Since I had plenty of spares I thought I'd steal a comparator out of one of the 16741A's. At least I think I have changed the comparator? The signals enter the card and pass a couple of resistors before hitting a 44-pin QFP which I assume is the comparator? (Agilent numbering so I can't search for a data sheet to verify) I checked continuity through from pod 4 bit 0 to find the correct chip. After changing the said device I still have the same outcome.
Buzzed out the traces crossing under the runners without any breaks. I did however find a small section of corrosion on one of the traces about 5mm away from one of the runners. The trace was open under the corrosion. Soldered a wire bridge across it and touched up the soldering on a resistor network which looked a bit ropey.
This time when I carry out the self test the card passes with flying colours. Unfortunately the thresholds on pod 4 low order are still out of tolerance.
With the threshold set to -5 V and 5 V the error is around 700 mV (-4.3 V and 4.3 V); the error reduces linearly heading toward zero. With the threshold set to zero the error is only 35 mV (within the specified 65 mV).
I have tested all the resistors and capacitors in the vicinity of the comparator. Have cross referenced with pod 2 as well as another card and values all seem very close.
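The two measurements above fit a simple offset-plus-gain picture, which points toward something scaling the threshold rather than a fixed offset. A minimal sketch of that reading, assuming the error really is linear in the magnitude of the setting and symmetric about zero (both taken from the description above, not measured point by point); the function names are made up for illustration:

```python
# Two measured points from the post: error at a 0 V setting and at a +/-5 V setting.
ERR_AT_0V = 0.035   # 35 mV error with the threshold set to 0 V
ERR_AT_5V = 0.700   # about 700 mV error with the threshold set to +/-5 V
SPEC_LIMIT = 0.065  # 65 mV tolerance quoted from the service manual


def gain_error_per_volt():
    """Slope of the error, in volts of error per volt of threshold setting."""
    return (ERR_AT_5V - ERR_AT_0V) / 5.0


def predicted_error(setting_v):
    """Error expected at a given setting under the assumed linear model."""
    return ERR_AT_0V + gain_error_per_volt() * abs(setting_v)


def max_setting_within_spec():
    """Largest setting magnitude that would still meet the 65 mV limit."""
    return (SPEC_LIMIT - ERR_AT_0V) / gain_error_per_volt()


if __name__ == "__main__":
    print(f"gain error: {gain_error_per_volt() * 1000:.0f} mV per volt")
    print(f"predicted error at 2.5 V: {predicted_error(2.5) * 1000:.1f} mV")
    print(f"in spec up to about +/-{max_setting_within_spec():.2f} V")
```

If the real error follows this model, the pod would only be in tolerance for settings very close to zero, which is worth checking at a couple of intermediate settings before trusting the model.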

Does anyone know which devices are responsible for setting the threshold voltages? Service manual states:

 "The threshold circuit includes a precision octal DAC and precision op amp drivers. Each of the eight channels of the DAC is individually programmable which allows the user to set the thresholds of the
individual pods. The 16 data channels and the clock/data channel of each pod are all set to the same threshold voltage."

If I could identify the DAC and op amp it would be a start?

The card is probably usable enough as it is. Still 56 channels working perfectly and, thanks to MarkL and the removal of one resistor, I have 32M memory depth. :-+
« Last Edit: October 02, 2019, 12:02:37 am by DoricLoon »
 
The following users thanked this post: tonykara

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #22 on: October 02, 2019, 01:48:48 pm »
...
Does anyone know which devices are responsible for setting the threshold voltages?
...

I haven't done any debugging on the comparator or threshold circuits, but I would start with identifying the DAC (U39) outputs for the 8 pods.  Then compare the out-of-spec output with some of the working ones all the way to the comparators (U29 U38 U46 U53 U72 U79 U85 U88).  The DAC (U39) is an AD7841AS in a MQFP-44 and is near the backplane connector.

When you change the threshold voltage in the "Format" tab, the voltages on the DAC should change immediately, so it should be fairly easy to compare what's happening.

With a quick continuity check, it looks like the DAC output enters the comparator on pin 39, after going through a resistor divider of 7500R on top and 536R on the bottom.  Make sure the arriving DAC output and the output of the voltage divider into the comparator make sense.  Compare with the circuitry on a good pod.

I don't know what's going on inside the comparator, but keep in mind that the incoming voltages from the tip of the probe pods are arranged as a 10x probe (90k + 10k), so I would expect the threshold input of the comparator to also be in this range (just a guess - I didn't verify it).

It's difficult to probe these boards in the chassis, and you can do some checking with continuity.  But for real verification, you'll need to solder in some jumpers and run them out of the chassis to probe the voltages I'm suggesting while it's running.

If the bad pod comparator is on the bottom, you can also move the card to the bottom chassis slot, take the bottom off the chassis, and relocate the mouse/keyboard interface board (just let it hang in the air but insulate it so it doesn't short against anything).  That clears the way to probe the bottom of the board.

Please let us know what you discover!
 
The following users thanked this post: DoricLoon

Offline DoricLoon

  • Contributor
  • Posts: 38
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #23 on: October 02, 2019, 02:30:49 pm »

I haven't done any debugging on the comparator or threshold circuits, but I would start with identifying the DAC (U39) outputs for the 8 pods.  Then compare the out-of-spec output with some of the working ones all the way to the comparators (U29 U38 U46 U53 U72 U79 U85 U88).  The DAC (U39) is an AD7841AS in a MQFP-44 and is near the backplane connector.


I did notice an Analog Devices chip near the backplane last night but dismissed it thinking it would be nearer the comparators. Thanks for the info. I have downloaded the datasheet for the DAC. If the reference was off then more than one 8-bit section would be affected. There are three reference connections available on the DAC but a minimum of two outputs using the same ref. I guess the reference will be common to all of them?

So carry out continuity checks between the DAC outputs and the resistor divider you mention. Check resistor values and connections to comparators.

Thank you Mark, will get a look at it in the evening and will update.

« Last Edit: October 02, 2019, 02:34:22 pm by DoricLoon »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #24 on: October 02, 2019, 06:57:16 pm »
...
If the reference was off then more than one 8-bit section would be affected. There are three reference connections available on the DAC but a minimum of two outputs using the same ref. I guess the reference will be common to all of them?
Yes, the DAC reference inputs are all tied together, so your assumption is correct.

Quote
So carry out continuity checks between the DAC outputs and the resistor divider you mention. Check resistor values and connections to comparators.
Right.  It could be a partially corroded trace coming from the DAC that is causing a high impedance connection to the comparator.  So it works, kind-of.

Of course it could also be the DAC itself.  Or it could also be the comparator, but you've changed that.  It's time to measure the actual voltages and compare with a known good half-pod.

Here's the mapping for the DAC outputs, which you might find useful:

  DAC (pin)    Comparator   Pod/Bits (Clock)
  ----------   ----------   ----------------
  VoutA (2)        U85       1/7:0
  VoutB (44)       U88       1/15:8 (J)
  VoutC (43)       U46       2/7:0
  VoutD (41)       U53       2/15:8 (K)
  VoutE (37)       U72       3/7:0
  VoutF (35)       U79       3/15:8 (L)
  VoutG (34)       U29       4/7:0
  VoutH (32)       U38       4/15:8 (M)

Since you're seeing problems with pod 4/7:0, I would focus your search on the path from DAC output VoutG, through the resistor divider, to U29.
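For anyone tracing a different half-pod, the table above can be kept as a small lookup. This is only a convenience sketch of the mapping as posted; the dictionary layout and the `threshold_path` helper are made up for illustration, with "low" meaning bits 7:0 and "high" meaning bits 15:8.

```python
# (pod, half) -> (DAC output name, DAC pin, comparator reference designator)
# Transcribed from the mapping table above; DAC is U39 (AD7841AS).
THRESHOLD_MAP = {
    (1, "low"): ("VoutA", 2, "U85"),
    (1, "high"): ("VoutB", 44, "U88"),
    (2, "low"): ("VoutC", 43, "U46"),
    (2, "high"): ("VoutD", 41, "U53"),
    (3, "low"): ("VoutE", 37, "U72"),
    (3, "high"): ("VoutF", 35, "U79"),
    (4, "low"): ("VoutG", 34, "U29"),
    (4, "high"): ("VoutH", 32, "U38"),
}


def threshold_path(pod, half):
    """Return the DAC output, DAC pin and comparator for one half-pod."""
    try:
        return THRESHOLD_MAP[(pod, half)]
    except KeyError:
        raise ValueError(f"unknown half-pod: pod {pod}, half {half!r}") from None


if __name__ == "__main__":
    dac_out, dac_pin, comparator = threshold_path(4, "low")
    print(f"Pod 4 bits 7:0: U39 {dac_out} (pin {dac_pin}) -> {comparator}")
```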

Photo below of the comparator area.  They're all the same as far as the threshold input is concerned.

I'm not sure why they assigned two DAC channels per pod since the software will only set the entire pod to the same voltage.  It might be they wanted to have the option to split the pod into two different voltages in the future, or maybe there's some small differences in the comparators which are calibrated out during boot.  Dunno.
 
The following users thanked this post: DoricLoon

Offline DoricLoon

  • Contributor
  • Posts: 38
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #25 on: October 02, 2019, 08:10:22 pm »
Thank you very much for the help Mark.

With the information you passed on earlier I discovered that the 7k5 resistor in the divider for U29 was in fact measuring 8k. Swapped the resistor and I have a fully functional 16752A.

Beauty.

Came on to post my results and I see your further post which has the information I had just noted down about the DAC to comparator pin out.

So for my low order bits on pod 4: comparator U29 (pin 39) to junction of (560R - 7k5) - DAC U39 VoutG (pin 34); the trace goes pretty much the full length of the card.
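A rough sketch of why a drifted top resistor shows up as a gain-type threshold error rather than a fixed offset. It assumes the bottom resistor returns to ground and that the comparator input does not load the divider, neither of which has been verified in this thread. The 7500R and 536R values are from MarkL's continuity check (the bottom resistor is written as 560R just above, so it is left as a parameter), and 8k is the measured value of the bad part.

```python
def divider_ratio(r_top, r_bottom):
    """Fraction of the DAC output that reaches the comparator pin.

    Assumes an unloaded two-resistor divider with the bottom leg at ground.
    """
    return r_bottom / (r_top + r_bottom)


def relative_change(r_top_nominal, r_top_actual, r_bottom):
    """Fractional change in divider ratio caused by a drifted top resistor."""
    nominal = divider_ratio(r_top_nominal, r_bottom)
    actual = divider_ratio(r_top_actual, r_bottom)
    return actual / nominal - 1.0


if __name__ == "__main__":
    nominal = divider_ratio(7500.0, 536.0)
    faulty = divider_ratio(8000.0, 536.0)
    change = relative_change(7500.0, 8000.0, 536.0)
    print(f"nominal ratio: {nominal:.5f}")
    print(f"ratio with 8k on top: {faulty:.5f}")
    print(f"change: {change * 100:.1f} %")
```

The change is a fixed percentage of whatever the DAC outputs, so the resulting error grows with the magnitude of the setting and nearly vanishes near zero, which is the same shape as the error described earlier. This simple model is not expected to reproduce the measured 700 mV figure exactly; the in-circuit reading and the real termination of the divider are both uncertain.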

Now to clean up the areas where the runners were and stick them on with new adhesive. I purchased a roll of 3M VHB clear foam tape 6mm wide 1mm deep. Hopefully it will play nicely with the PCB?

Thanks again  :-DMM

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #26 on: October 02, 2019, 08:35:34 pm »
Great - glad you found the problem!

That's what I was going to try too - the thin 3M VHB tape, but I have some of the gray variety sitting around.

For now, after cleaning the boards, I left the runners off and I'm being careful with them.  I'm probably going to regret it after I rake several components off the bottom, and THEN I'll put the runners back on.
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #27 on: November 15, 2019, 06:12:39 pm »
Thanks so much for this dialog --- it's nice to see such detailed posts and information!

I'm looking for an appropriate replacement adhesive for the runners.

How's that 3M stuff working out? Have you found any adhesive that claims it is specifically FR-4/PCB safe?

Keith

EDIT: Sorry I know the thread is about a month old.....not too too stale, I hope. :)
 

Offline DoricLoon

  • Contributor
  • Posts: 38
  • Country: gb
Re: Series defect on agilent 167xx boards?
« Reply #28 on: November 15, 2019, 07:16:11 pm »
I've still not got around to sticking the runners back on. Roll of tape at the ready but that's as far as I've got. When I read the data sheet it sounded like the 3M tape should be safe? I guess that any issues with it would take a long time to transpire anyway. I did wonder if it would be a good idea to put on conformal coating first, just in case the adhesive doesn't play nicely?
« Last Edit: November 15, 2019, 07:18:33 pm by DoricLoon »
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #29 on: November 15, 2019, 08:33:40 pm »
Yeah conformal coating down first sounds like a good idea.

My question was whether the pro-active approach to runner removal is a good idea or not. In boards where there are no visible signs of corrosion that are otherwise self-testing fine?

It occurred to me while I was removing them that you could possibly damage a board that worked fine.

Do you guys have an opinion on that?

I've been using a heatgun setting on low and trying to evenly heat just the runner. While the heat helps the adhesive release, I worry about these super tiny 0201's (or whatever) accidentally re-flowing. I'm guessing the surface tension would keep it in place.....but who knows?

My technique has been to use a plastic pry tool to get under an edge, and then wand the heatgun back and forth over the runner, all the while applying a lifting force. The runner does sometimes "pop" violently off the board.

I have noticed that the adhesive tape is both still sticky and flexible.....not hard.... and that I've been able to, in 95% of the cases lift the adhesive tape AND the plastic runner off fully intact.

Am I doing this right?

Thanks
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #30 on: November 15, 2019, 09:32:18 pm »
...
My question was whether the pro-active approach to runner removal is a good idea or not. In boards where there are no visible signs of corrosion that are otherwise self-testing fine?

It occurred to me while I was removing them that you could possibly damage a board that worked fine.

Do you guys have an opinion on that?
In general, I'm a strong believer in "if it ain't broke, don't fix it", which includes things like re-capping for no demonstrated reason.  I've seen too many people damage perfectly working equipment.

However, in this case my opinion is to remove the runners proactively.  I have somewhere around 25 cards, and ALL of them had corrosion on them, some severe and some only in a few spots.  For several years I was in the "not proactive" camp until two cards I had received working and had been using were suddenly dead one day.  I took the runners off and several traces had been eaten clean through.  Fortunately, repairing the traces brought them both back.  Not wanting to repeat this fate, I removed all the runners on all my cards and scrubbed the areas with isopropyl.

No additional failures yet, but this was fairly recent within the last year or so.

I was careful but not overly paranoid about scraping off SMD components.  I have at least two of each card type so I could measure the value on the other card for replacement.  (And, yes, there were a couple of accidents involving MLCCs.)

Quote
I've been using a heatgun setting on low and trying to evenly heat just the runner. While the heat helps the adhesive release, I worry about these super tiny 0201's (or whatever) accidentally re-flowing. I'm guessing the surface tension would keep it in place.....but who knows?

My technique has been to use a plastic pry tool to get under an edge, and then wand the heatgun back and forth over the runner, all the while applying a lifting force. The runner does sometimes "pop" violently off the board.

I have noticed that the adhesive tape is both still sticky and flexible.....not hard.... and that I've been able to, in 95% of the cases lift the adhesive tape AND the plastic runner off fully intact.

Am I doing this right?

Sounds like your cards are newer if the foam tape was still pliable.  Most of mine were hard as a rock and crumbled when I tried to remove them.

I think if you're heating them enough to reflow the solder, you're applying too much heat!  The plastic runners would probably melt first.

A soft nylon tool works well if you have to do scraping.  I read that some people cut up an old credit card.

 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #31 on: November 15, 2019, 10:13:53 pm »
Mark,

Of your 25 cards, what cards and years of manufacture are you dealing with?

I have about 10-12 cards, and I've inspected about half of them. The serial numbers are US39xxxx, US40xxxx, and US42xxxx indicating 1999-2002. 16715As, 16717As, 16752A.

I also have MY45's -- Malaysia 2005, 16951s and 16910As.
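The prefixes quoted above can be decoded mechanically. A small sketch, assuming the two digits after the country letters are the year counted from 1960; that offset is an assumption that happens to match every example in this post (US39 as 1999, US42 as 2002, MY45 as 2005) and is not confirmed anywhere in this thread. The `decode_serial_prefix` helper is made up for illustration.

```python
import re

# Country letters seen in this thread; anything else is passed through unchanged.
COUNTRY_NAMES = {"US": "United States", "MY": "Malaysia"}

# Assumed year offset: prefix digits counted from 1960 (matches the examples above).
YEAR_OFFSET = 1960


def decode_serial_prefix(serial):
    """Return (country, year) from a serial such as 'US39xxxx' or 'MY45xxxx'."""
    match = re.match(r"^([A-Za-z]{2})(\d{2})", serial.strip())
    if match is None:
        raise ValueError(f"unrecognised serial format: {serial!r}")
    code = match.group(1).upper()
    year = YEAR_OFFSET + int(match.group(2))
    return COUNTRY_NAMES.get(code, code), year


if __name__ == "__main__":
    for serial in ("US39xxxx", "US40xxxx", "US42xxxx", "MY45xxxx"):
        country, year = decode_serial_prefix(serial)
        print(f"{serial}: {country}, {year}")
```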

Is the corrosion you're talking about very visible? I've seen your images posted, and the god-awful ones from Alexandre, but the majority of the boards I've seen with real problems look really dirty and dingy, so I'm not surprised that there's corrosion on those boards.

But mine look nothing like that. They are clean. Now I use my 3M/SDS 497AJN ESD-safe vacuum on things, and IPA to keep them clean. I use a grounded wrist-strap at all times.

I'm not judging or humblebragging about the state of the boards here, I'm truly worried that I'm missing some obvious signs. Now I did see a tiny bit of corrosion on my stacking connector pins which I used deoxit and IPA to clean up.

I'm using 10x magnification to look at the traces underneath the runners and I'm looking very carefully. I just don't see anything.

So I'm trying to figure out why you're seeing it about 100% of the time, and I'm seeing it, well, very little. The most I've seen (and I'm not discounting this) is perhaps a light green surface-level tinge on the vias. And maybe that's worse than the traces!

Thanks,
Keith
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #32 on: November 18, 2019, 02:48:17 pm »
Hi Keith,

The vast majority of my cards are US made.

16740A, 16741A, 16750A, 16751A cards are from 2000 to 2002, all US.

16534A cards are from 1995 to 1999, all US.

16753A, 16754A, 16755A are all from 2002, all Malaysia.

I recently acquired a 16712A card from 1999 (US made) and it is extremely clean, apart from a few dots of green corrosion on some vias.  It tested good and, to be honest, I'm not rushing to remove the runners on this one.

Generally speaking, my cards are all around 20 years old and 100% of them have (or had) corrosion to some level.  The rate of whatever is leaching out of the adhesive is probably related to their environment.  The boards I had go bad on me were in a clean, cool, dry area and were not powered on continuously.  The dust had been cleaned off, but runners not removed at that point.

The only card I have without corrosion is a 16720A pattern generator.  It doesn't have runners on the bottom because it doesn't have any components on the bottom.

I don't have any 169xx cards to compare.

I know that the corrosion has wicked into some of the vias on some boards because it has shown up on the other side in a couple of cases.  And those boards don't work, no surprise.  With no schematics, they are likely scrap at this point.  It's a shame that a 10 cent strip of foam tape is systematically destroying this impressive technology.
« Last Edit: November 18, 2019, 06:13:54 pm by MarkL »
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #33 on: November 19, 2019, 03:11:45 am »
Thank you for the detailed reply.

Quote
It's a shame that a 10 cent strip of foam tape is systematically destroying this impressive technology.

I couldn't agree more. It is a shame. Of course it's the weakest link in the fence idea. Even if they did a superb job in PCB layout, most of the component choices, good SQA on the software, etc etc --- After 20 years+, which is well past the original window of support, anything is likely to break down.

I keep checking out my various modules. The 16715A which was misbehaving as a master in some cases, I removed the runners. All the areas underneath the runners were clear.

HOWEVER, I think I finally spotted my first piece of corrosion.

Is this it? Note this wasn't under a runner but at the bottom right edge (looking at the module right-side up).

I've made a couple passes with deoxit and IPA. I admit having a hard time capturing this image, depth of field, light, etc.

Thanks,
Keith
 

Offline NickAmes

  • Contributor
  • Posts: 31
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #34 on: November 19, 2019, 03:27:59 am »
That looks more like delamination on an internal layer rather than corrosion.
 
The following users thanked this post: keitheevblog

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #35 on: November 19, 2019, 03:55:30 pm »
Hi Keith,

I think Nick is right that it's not corrosion.  The corrosion is crusty green on the surface, and when it gets underneath the soldermask it turns the copper grayish brown.

Below are a few examples.

The first two are the practically untouched 16712A that I mentioned.  I had to look hard to find anything, but it's there.

The last two are a 16534A card that I recently received.  This one is in terrible condition and fails self-test, and I was going to use it for a thread on how to repair these cards.  There is severe corrosion to traces, and on this one the metal back plate for the hybrid ADC on top has also been thoroughly attacked (it's not just copper that it likes).

EDIT: Hmmm.... seems the photos came out in the opposite order I posted them.  I'm sure you get the idea.
 
The following users thanked this post: keitheevblog

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #36 on: November 20, 2019, 01:42:43 am »
Thanks much for the sample images.

Yeah, I've been pretty lucky --- nothing even close. It's nice to see images of the exact type here.

The key feature looks like almost neon green crusty deposits.....

I'm going to pull my 16900 modules and check those too. My 167xx modules are all clean the best I can tell.

Thanks
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #37 on: November 20, 2019, 08:20:09 pm »
Ok no photos yet because I was too tired to remember to take them.

However.

The 16951A I pulled last night has a considerably differently designed runner. First, the runners are easily 1/3 to 1/4 the overall width --- very narrow in comparison. So instead of 6.5mm wide, they are something like 2mm wide.

Next, instead of being about 5" long, they are closer to 3-4" long.

First impression is that they sit a touch more proud....that is, a little taller. And more skinny. And they are less long. They take up MUCH less surface area as a whole. It also seemed to be that their placement was better --- given that they are smaller, are located in more "don't care" areas crossing fewer traces.

It does appear that they might have known of an issue and/or might have just been trying to conserve board space....

This module is from 2006 and looked pretty clean to me. Some surface dust. No visible corrosion that I can see.

Just another data point to put in our caps.

Thanks
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #38 on: November 21, 2019, 05:54:52 pm »
The 16753A/54A/55A cards that I have all have the skinny runners too.  They're 3mm wide, and as you point out on your 16951A, are less in number.

They have less corrosion than my other cards, but it's still there.

While fewer in number, the runners cross several areas of dense escape vias underneath a couple of the BGAs.  Maybe it's the same on the 16951A.  When I took the runners off, many of the annular rings on the vias were partially or completely eaten away and corrosion was down some of the holes.  I couldn't believe the card still worked.  Year 2002 serial number.

Out of the 5 16753A/54A/55A cards I have, only 2 of them work.  The picture I posted earlier in this thread:

  https://www.eevblog.com/forum/repair/series-defect-on-agilent-167xx-boards/msg1353955/#msg1353955

was a repair attempt on a 16755A card.  Also from 2002.

With any luck maybe they figured it out by 2006.
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #39 on: November 27, 2019, 02:46:50 am »
ok fellas, I've got some new information. Been on a warpath for days trying to find the right replacement adhesive tape. I think I've got a good start:

Don't try to use VHB acrylic adhesive tapes on silicone conformal coating. 3M application engineers told me specifically that it won't bond. 3M 830 rubber adhesive 442KW should work here, though. It has a nice feature too: "Provides clean removability from many surfaces"

While 3M says you should always test the tape for suitability for your specific application, they think there's a good chance that the 5952 series, specifically the 5915, should work. You cannot, however, easily remove this tape. It's designed to be permanent, although they publish a PDF with removal instructions.

While anything is possible, the application engineer said that it likely was not the adhesive that leached or had some bad interaction with the substrate/adherend. They said they've got "decades of the various formulations in hundreds of thousands of applications" and they thought it would be common knowledge (amongst his peers) and/or covered specifically as a warning if it was a problem. Take that for what it's worth.

He also said extensive tests with this tape in electronic applications have NOT been done.

I'm inclined to think that there was something corrosive on the board that then got trapped underneath the adhesive at the time of assembly. It's also possible that poor environmental storage conditions exacerbated the problem -- perhaps the adhesive/runner attracting or trapping some bit of water on a board stored on its side.

I think this answer is good enough for me. I will say that shearing off a component is definitely going to happen, so I'm going to replace the runners soon. I'll use just enough tape strategically placed to avoid high density trace/via areas.
 
The following users thanked this post: MarkL

Offline benoar

  • Newbie
  • Posts: 1
  • Country: fr
Re: Series defect on agilent 167xx boards?
« Reply #40 on: March 03, 2020, 12:43:15 pm »
Hello everyone, new 167xx owner here and new participant to the forum.

Just got a broken 16750A that I am trying to fix: spotted a broken delay line before buying, did not care much about losing 1/4 of the inputs, but when it arrived it was also missing a ferrite, so no testing possible right now. I also looked at the corrosion and… it is quite bad. For example, attached is a photo of a broken trace by corrosion (tested, no continuity), and another one of the PCB after removing a runner, which peeled the solder mask on some areas. You can also see R519 badly corroded: it came off while removing the runner, too.

I cleaned it with IPA and then tried to tin the traces to protect them, but had more mishaps while at it: I ripped off a pad (a test point), presumably because I was too quick and maybe forgot to let the IPA dry, or my iron was too hot (370°C; I went to 280°C and it still works fine with SnPb). It “popped”, but I continued a bit and tore it off. (I was thinking it was maybe because of some discharge, because my shitty wallwart-supplied hobbyist Weller leaks current like hell; my ESD wristband tingles when it heats up sometimes…) I also accidentally cut another trace somewhere else with my knife when removing adhesive. Then I injured myself with the knife and stopped for the moment. Another photo shows the result of peeling the runner with a knife: the solder mask is cut off in a lot of small places, not very nice. Maybe I should go the heat-gun way.

I will report if I get anywhere with that.
 

Offline TK

  • Super Contributor
  • ***
  • Posts: 1722
  • Country: us
  • I am a Systems Analyst who plays with Electronics
Re: Series defect on agilent 167xx boards?
« Reply #41 on: March 03, 2020, 01:55:10 pm »
There are lots of vias that are corroded and the internal traces could be damaged, making it extremely difficult to repair these boards
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #42 on: June 03, 2020, 01:02:05 pm »
While I know this doesn't help any of you that already have corrosion on your boards, I've recently done some runner replacements and wanted to share an update.

  • I'm silicone-conformally coating any potentially problem area underneath the removed runners, after cleaning, for future protection.
  • I decided on 3M 442KW adhesive tape. I really like this tape, working beautifully. Don't order from digi-key, their stock is old. If you're in US, order from ULINE.
  • I do think that being proactive with these is the prudent approach, but there is a risk during runner removal. Take your time.

My updated blog post is https://www.techtravels.org/2019/12/corrosion-near-underneath-hp-logic-analyzer-module-plastic-runners/.

Hope this helps!

 
The following users thanked this post: MarkL

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #43 on: December 10, 2020, 03:13:56 pm »
Hi!

I seem to have the same problem. I've recently got a faulty 16750A, that when self-tested returned the following:



Unfortunately I don't know how to make any more sophisticated tests. If anyone could point me to a simplish tutorial about that or give some other info I would be very glad :).

And I have noticed the green globs of corrosion sticking from underneath the plastic sliders. I have removed some of them, and the sticky tape is totally dry.

So, my question is - is it possible that the self-test failures are due to the corroded tracks? Or is it obvious that the reason is totally different? I'm asking this before I spend a few hours cleaning this crap off the board with IPA and/or acetone  ::).
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #44 on: December 10, 2020, 03:33:13 pm »
Hey there,

Most of the tests that you're running via that GUI interface are the same tests. What you need to do is telnet into the box (username: root, pw "uh,uhuh"), access the shell, and run the pv command. You can hit ? for help, and most of it is self explanatory. You either select a specific module (or have it execute against all of them), and then pick which tests (or all of them) to run.

There are a couple ways to increase the verbosity of the debugging output.

Are you far enough along to know how to do those steps above?

EDIT: And yes, there's a VERY good chance that the test failures are directly related to the corrosion.

Keith
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #45 on: December 11, 2020, 10:29:40 pm »
Hey there,

Most of the tests that you're running via that GUI interface are the same tests. What you need to do is telnet into the box (username: root, pw "uh,uhuh"), access the shell, and run the pv command. You can hit ? for help, and most of it is self explanatory. You either select a specific module (or have it execute against all of them), and then pick which tests (or all of them) to run.

There are a couple ways to increase the verbosity of the debugging output.

Are you far enough along to know how to do those steps above?

EDIT: And yes, there's a VERY good chance that the test failures are directly related to the corrosion.

Keith

Hi Keith!

Thank you for your quick reply. I've connected the device to my network and connected to it via telnet. I've run pv without a problem. I will post the test results here as soon as I finish the tests  :-BROKE.

What I'm more curious about is whether it is possible to pinpoint the broken trace, or at least find the region where it might be, based on the failed test or info from pv. For example, my card failed the comparators test (as shown in the photo), and while cleaning the black goo off the board I found that two traces from the DAC to the comparators are broken (and I hope that this is the cause).
 

Offline keitheevblog

  • Regular Contributor
  • *
  • Posts: 59
Re: Series defect on agilent 167xx boards?
« Reply #46 on: December 12, 2020, 03:37:21 pm »
Hey,

Yeah it's certainly possible, but I don't think a guide exists to help you identify the traces or regions easily.

I'd suggest looking at the service guides and manuals that describe the tests. There's about a paragraph of text per test that describes what the test does. There's also a general theory of operation section that describes roughly how the LAs work. You can use those, in combination with looking at the board, to help you figure it out.

Some tests are more useful than others. For some of the memory tests, for instance, they will identify which chip number is failing a test. That chip number isn't specific as a PCB silkscreen Unit Designator (or whatever the Uxx's are called) --- so you have to count chips to figure it out.

I can't easily look this up at the moment, but turning on enhanced debugging can help. There's switches like "d=9,p=9,r=9" that adjust the verbosity of the debug output. I don't think we've found a guide that describes what they do, but I think we assume 0-9, and I think I've done 255 before because why not? Look back through the previous threads here where they are discussed. You can see them at the command line help.

Sometimes, you can see a particular problem that matches to a trace. For instance if bit6 is always 0, when it should be a 1, then that trace could be cut. If things are intermittent, it's less clear. Now which trace is bit6? It's mostly trial and error. With these custom chips, finding datasheets isn't always possible. But things are logically laid out on the PCB, usually. So if it's an 8-bit bus, and you know it's bit6, then count 6 from the left, and 6 from the right.....it might be one of those!
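
To make the "bit6 is always 0" idea concrete: XORing the expected and actual words leaves a 1 in every bit position that disagrees. Below is a minimal sketch in shell, assuming the values are hex words without a 0x prefix and a shell with 64-bit arithmetic (current bash, for example). The function names are made up for illustration and are not part of pv.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pv): compare an expected and an actual
# hex word and report which bit positions differ.
# Assumes 64-bit shell arithmetic so that 32-bit words never go negative.

# diff_mask EXP ACT -> prints the XOR of the two words as 8 hex digits.
diff_mask() {
  printf '%08X\n' $(( 0x$1 ^ 0x$2 ))
}

# diff_bits EXP ACT -> prints the differing bit numbers (bit 0 = LSB),
# space separated, lowest first.
diff_bits() {
  mask=$(( 0x$1 ^ 0x$2 ))
  bit=0
  out=""
  while [ "$mask" -ne 0 ]; do
    if [ $(( mask & 1 )) -eq 1 ]; then
      out="$out${out:+ }$bit"
    fi
    mask=$(( mask >> 1 ))
    bit=$(( bit + 1 ))
  done
  echo "$out"
}

# Example: expected 0xFF, but bit 6 reads back as 0.
diff_mask 000000FF 000000BF   # prints 00000040
diff_bits 000000FF 000000BF   # prints 6
```

This only tells you the logical bit number. Which physical trace carries that bit still has to be worked out on the board.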

You need to be able to probe the boards live, while they are plugged in, in order to confirm the right voltages/signals when the board is being tested/used. MarkL on these forums is a super smart guy; he has techniques like installing the board in the bottom slot, flipping the chassis over, taking off the bottom cover, then probing the board.

For full disclosure, while I know the approximate technique on how to fix this stuff, I've never actually done it. There are physical challenges (access to the board while it's powered up), which also can require small pigtails to be soldered to the chips.... This stuff being SMD can be challenging to avoid shorts. Or you need to use microprobes (really small fine clips for legs of chips) which can be expensive.

I attempted repairing a couple bad boards, but it was pretty obvious it's beyond me -- partially in terms of lack of experience and skill, but mostly in terms of patience and willingness. You might be in a different place with all of that, obviously.

I hope this helps
Keith
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #47 on: December 12, 2020, 08:27:23 pm »
Ok, so I've done some more detailed tests over telnet, and my board failed the following tests:

  • anlyBusTest
  • clksTest
  • measAnlyBusTiming
  • cmpTest
  • zoomChipSelTest

Where can I find any info about these tests? I have the service manual from the Keysight site, but it says nothing valuable about them; it just states that a particular test exists. What I would like to know is whether, based on the failed tests above, you think the board is worth trying to repair. I mean, it's fun and all ;) but if this means e.g. a 100% dead important IC, then wasting a lot of time on it is no fun any more.

So, any ideas about these failed tests?  :-//
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #48 on: December 12, 2020, 08:40:08 pm »
Keith definitely has all the magic incantations for the extra debugging tricks.

Despite my obsession with trying to fix these cards, my success rate is only about 1 out of 3.  It's so unfortunate that "pv" and its capabilities are completely undocumented even now, when these products haven't been sold for 20 years.

If you haven't found it yet, there's another thread that may be of some help where the DAC and comparators are discussed:

  https://www.eevblog.com/forum/testgear/agilent-16717a-comparator-and-zoomchipseltest-failures/msg3091674/#msg3091674

Even though it's not specifically your card model, it still uses the same DAC and comparators.  Agilent tried to re-use as much of the front-end design (including timing zoom) between the various modules as possible.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #49 on: December 12, 2020, 08:45:14 pm »
On your latest post, try turning on some of the more detailed debugging output, as Keith suggested, and just run that one test from pv.

Some of the tests will print out bit patterns and other things that they're unhappy about, which may be able to guide you to tracks that are damaged.  Most of the test routines name the specific chips where the test is failing, but that doesn't mean the chip itself is at fault.

I'm not saying it can't be a bad chip, but of the boards I've repaired, none ever required a chip swap (unless you count the chips I blew up by accident myself, but those don't count).
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #50 on: December 12, 2020, 08:48:14 pm »
Also, it's worth noting that when several tests fail, I've found that you should concentrate on fixing the first one reported.  Subsequent failures are usually a consequence of the first failure reported.
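
If you save the pv summary to a file, picking out that first failure can be scripted. This is only a sketch: it assumes a summary table with the verdict ("passed" or "FAILED") in the first column and the test name in the second, and the function name and sample rows are made up for illustration.

```shell
#!/bin/sh
# first_failed: read a pv summary table on stdin and print the name of the
# first test whose first column says FAILED (prints nothing if none failed).
# Hypothetical helper; assumes "verdict  testName  #Tests  #Fails" rows.
first_failed() {
  awk '$1 == "FAILED" { print $2; exit }'
}

# Example with a few sample rows in that layout.
first_failed <<'EOF'
passed    chipRegTest                         1      0
FAILED    anlyBusTest                         1      1
FAILED    clksTest                            1      1
passed    bpClkTest                           1      0
EOF
```

With the sample rows above it prints anlyBusTest, which would be the test to chase first.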
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #51 on: December 16, 2020, 02:51:44 pm »
OK, so I wanted to update the topic. I've cleaned the whole board and tested it again, this time using pv. It showed the same failed tests. I've found two obviously broken traces that, judging from their location, look like they connect the DAC to the comparators.

After fixing the traces I've got the following result from pv:

Code: [Select]
Mod A  : (0x34) 16750A Logic Analyzer (Master)
Summary   Test Name                      #Tests #Fails
------------------------------------------------------
passed    cpldRegTest                         2      0
passed    testLoadFPGA                        2      0
passed    fpgaRegTest                         2      0
passed    dataBusTest                         2      0
passed    addrBusTest                         2      0
passed    hwMemoryCellTest                    2      0
passed    unloadTest                          2      0
passed    dmaTest                             2      0
passed    sleepTest                           2      0
passed    searchTest                          2      0
passed    chipRegTest                         1      0
FAILED    anlyBusTest                         1      1
FAILED    clksTest                            1      1
FAILED    measAnlyBusTiming                   1      1
passed    bpClkTest                           1      0
passed    cmpTest                             3      0
passed    icrTest                             1      0
passed    flagTest                            1      0
passed    armTest                             1      0
passed    calTest                             1      0
passed    zoomDataTest                        1      0
passed    zoomMasterTest                      1      0
passed    fisoRedundancyTest                  1      0
FAILED    zoomAcqTest                         17     3
passed    zoomChipSelTest                     5      0

So it seems to have fixed cmpTest and zoomChipSelTest, but now zoomAcqTest fails from time to time. After resetting pv I reran the test 20 times and it was fine  :-// It also passed with flying colors when running the self-test on the device itself.

So now I only have to worry about anlyBusTest, clksTest and measAnlyBusTiming. Does anyone have any more detailed information about those?
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #52 on: December 16, 2020, 07:09:06 pm »
Please read Keith's and my post above about turning on debugging output for pv.  Here's another thread that talks about pv:

  https://www.eevblog.com/forum/repair/logic-analyzer-boards-repair/msg2235390/#msg2235390

There is a service guide that describes each of the tests performed.  These descriptions can provide additional clues on where to look for problems when combined with the debugging output:

  http://literature.cdn.keysight.com/litweb/pdf/16750-97003.pdf

You should also read the whole Theory of Operation chapter.  There's very little service information out there about these cards, so you need to fully absorb everything that you can.
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #53 on: January 19, 2021, 08:33:08 pm »
OK, I've done some debugging and serious thinking  :-/O It seems that one of the memory chips is broken(ish). My module is failing the following tests:

  • anlyBusTest - Analyzer Chip Memory Bus Test
  • clksTest - System Clocks (Master/Slave/Psync) Test
  • measAnlyBusTiming - Analyzer Memory Bus SU/H Measure
All of them use chip memory! anlyBusTest checks it directly using a walking 1 pattern and reports:

Code: [Select]
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2
for all the bits (0xFFFFFFFF) of chip 9, port 2 of the memory. The module has 34 memory chips (48LC4M16A2, 1Mx16x4 SDRAMs) organized as 32-bit words. Am I right to assume that this means one of the memory chips is damaged or has lost its connection to the Virtex FPGAs? Do you have any idea how to check which one?  :-DMM
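
For reference, a "walking 1" test drives a single 1 bit across the bus, one position at a time, so a line that is open or stuck shows up as a mismatch on exactly the pattern that exercises it. Here is a sketch of the pattern generator only. How pv actually sequences the test is not documented, and the function name is hypothetical.

```shell
#!/bin/sh
# walking_ones WIDTH: print WIDTH patterns, each with a single 1 bit,
# starting at bit 0 and moving up, as 8-digit hex words.
# Hypothetical illustration; assumes 64-bit shell arithmetic.
walking_ones() {
  width=$1
  i=0
  while [ "$i" -lt "$width" ]; do
    printf '%08X\n' $(( 1 << i ))
    i=$(( i + 1 ))
  done
}

# First four patterns of a 32-bit walk: 00000001 00000002 00000004 00000008
walking_ones 32 | head -n 4
```

If every one of the 32 patterns fails, as reported here, no single data line explains it, which is why a shared signal is worth suspecting.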

Regards :) and thank you for your help.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #54 on: January 20, 2021, 10:22:12 pm »
In the pv output "Chip 8" and "Chip 9" refer to the acquisition ASICs.

Please post (attach) your output from pv with "d r=9" turned on so we can have some context.  At the moment, the output from anlyBusTest and clksTest will do.
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #55 on: January 21, 2021, 09:55:37 pm »
Quote
In the pv output "Chip 8" and "Chip 9" refer to the acquisition ASICs.

Please post (attach) your output from pv with "d r=9" turned on so we can have some context.  At the moment, the output from anlyBusTest and clksTest will do.

I will provide it as soon as possible. For now I only have a screenshot of anlyBusTest with d=256, r=100:



and part of the clksTest:



For TEST 1 (Master Clock from CHIP 9), every clock reads as shown:
Code: [Select]
Stage 1 is 0x0,000,0000 0x0,000,0000

For TEST 1 I've also found this:

 

The same thing goes for TEST 2, 3, 4 and 5... the rest seems fine.

U50 and U59 are a pair of SDRAMs on the board that together make up a 32-bit word. They are both correctly connected to VDD, VDDQ and VSS, and have their control signals and addresses tied together (I've measured that). But I have no idea where those lines should go after that. I'm guessing that one of the SDRAM control lines might be severed, and because of that the whole bank is dead. Correct me if I'm wrong  :palm:.

Are the acquisition ICs U22 and U45 (the big ones under the heatsinks)? And are the Virtex FPGAs next to them the glue logic for the memory?
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #56 on: January 22, 2021, 03:25:37 am »
U45 is one of the acquisition ASICs and is known in pv as "Chip 9".

I think you're correct that the Virtex FPGAs are glue logic for the memory.

I think you're also probably right that there's a common signal for U50 and U59 that is dead and is causing them both to read as all 0's.  If so, it's probably going to be between U50/U59 and U52, or between U52 and U45.
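
The reasoning can be put as a quick check. Each SDRAM is 16 bits wide and a pair makes up one 32-bit word, so a failing-bit mask that touches only one 16-bit half points at one chip or its data lines, while a mask covering both halves (like 0xFFFFFFFF) points at something the pair shares. A sketch, with a made-up function name and assuming the mask is given as hex without a 0x prefix:

```shell
#!/bin/sh
# classify_mask MASK: say which 16-bit half(s) of a 32-bit failing-bit mask
# contain failures. Hypothetical helper, not part of pv. It does not know
# which Uxx chip holds which half; that has to come from the pv output.
classify_mask() {
  mask=$(( 0x$1 ))
  hi=$(( (mask >> 16) & 0xFFFF ))
  lo=$(( mask & 0xFFFF ))
  if [ "$hi" -ne 0 ] && [ "$lo" -ne 0 ]; then
    echo "both halves"
  elif [ "$hi" -ne 0 ]; then
    echo "high word only"
  elif [ "$lo" -ne 0 ]; then
    echo "low word only"
  else
    echo "no failing bits"
  fi
}

classify_mask FFFFFFFF   # prints: both halves
classify_mask 40000000   # prints: high word only
```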

I would examine ALL the traces on the bottom for breaks between U45 and U52.  If nothing is obvious, I would double check each trace with a continuity tester from VIA TO VIA.  You need two very fine and sharp probes to poke sideways at about a 45 degree angle into each via.  Start where you removed a runner (there's a label "AF2" there) and work your way towards the edge of the card.  You'll probably need a microscope (I do).  It's easy to lose your place, but you can mark your progress with a fine Sharpie or small removable labels.

The corrosion especially likes to attack a tiny ring of exposed copper around the small solder pads.  It's impossible to see which is why it's important to test end to end by probing the vias.

Another approach would be to turn the chassis upside down and access the card from the bottom while it's running.  You can remove the mouse/keyboard card and put it aside.  All signals, even those on top, can be probed from the bottom on the small solder pads (at least I haven't found any exceptions to this yet).  You would need to figure out which ones are CS#, WE#, CLK, RAS#, CAS#, etc. for U50/U59 with a continuity tester.

Then use a script to repeatedly run anlyBusTest.  Compare the signals on U50 and U59 with another pair of memory chips that aren't having an error.  You might be able to identify which signal(s) are missing and trace it backwards that way.

Example script that runs anlyBusTest on slot E once a second:

Code: [Select]
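#!/usr/local/bin/bash

# Optional: uncomment to raise pv's result/debug verbosity.
# export PVRESULTLEVEL=9
# export PVDEBUGLEVEL=9

# Everything echoed inside the subshell is piped to pv as if typed at its prompt.
(
  # Wait a few seconds (presumably so pv is up) before sending commands.
  sleep 5

  while true; do
    echo "s e"            # select slot E
    echo "x anlyBusTest"  # execute the test

    sleep 1               # repeat once a second
  done

) | pv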

I have bash loaded on my system.  I'd highly recommend it, but if you don't use it you can adapt the script for the default system shell.
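
For anyone adapting it, here is a sketch of the same loop written for a plain POSIX shell, with the slot, test name, and repeat count as parameters. It only uses the two pv commands that appear in the script above ("s" and "x"). The function name and its parameters are invented for illustration, and it has not been tried on an analyzer.

```shell
#!/bin/sh
# pv_loop SLOT TEST COUNT [DELAY]
# Emits the pv commands to select SLOT and execute TEST, COUNT times,
# pausing DELAY seconds (default 1) after each run.
# Hypothetical wrapper around the commands used in the original script.
pv_loop() {
  slot=$1 test_name=$2 count=$3 delay=${4:-1}
  n=0
  while [ "$n" -lt "$count" ]; do
    echo "s $slot"
    echo "x $test_name"
    sleep "$delay"
    n=$(( n + 1 ))
  done
}

# On the analyzer the output would be piped into pv, for example:
#   ( sleep 5; pv_loop e anlyBusTest 20 ) | pv
```

A bounded count makes it easier to capture a fixed number of runs to a file and compare them afterwards.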
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #57 on: January 23, 2021, 04:00:33 pm »
Another thing you might try...

You'll notice that there are numerous 33R resistor networks between U52 and the memory.  They are for series termination for the control and data signals leading to the memory.  I've had several cases of open single resistors in other areas, so it might be worth checking these resistor networks.  The control signals appear to be mostly on the resistor networks directly under U52 (mounted horizontally).  The data signals appear to be mostly on the ones mounted vertically on the top and bottom.  But I would probably check them all anyway near U52 since it's very easy to do.
« Last Edit: January 23, 2021, 04:03:56 pm by MarkL »
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #58 on: January 24, 2021, 01:03:42 am »
Quote
Another thing you might try...

You'll notice that there are numerous 33R resistor networks between U52 and the memory.  They are for series termination for the control and data signals leading to the memory.  I've had several cases of open single resistors in other areas, so it might be worth checking these resistor networks.  The control signals appear to be mostly on the resistor networks directly under U52 (mounted horizontally).  The data signals appear to be mostly on the ones mounted vertically on the top and bottom.  But I would probably check them all anyway near U52 since it's very easy to do.

I had found that the resistors terminate all(?) the signals to the memory ICs, but I didn't know that they might be faulty! I will check that first, as it is ultra-simple  :-DMM
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #59 on: January 24, 2021, 07:16:37 pm »
The terminating resistors seem to be fine. I've also checked the RAS, CAS, CS and WE lines, and they are fine. Still hunting for the CLK, CKE, DQMH and DQML lines; they have to be somewhere on the resistor packs :-DMM.

I'm thinking about lifting the RAMs and/or swapping them with some of the other RAMs; there are plenty of them.

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #60 on: January 25, 2021, 12:37:23 am »
CLK, CKE, and CS# are all connected to one side of various 10k resistors on the top, and also to one side of the 33R horizontal resistor packs on the bottom.

It seems DQMH and DQML are connected together on all the chips in the group (U60, U59, U47, U51, U60, U90, U89, U87, U86), and go to one side of a blue 38R resistor on top.

It's your board, but I would not start swapping chips until I have a reason to believe it's the chip.  Plus, I'm more likely to believe it's one chip that has gone bad and not both.

I would try the live probing first to see if U50/U59 are getting the same anlyBusTest test signals that another working pair, like U23/U30, is seeing.  Note that the two halves of the board are practically identical copies of each other, at least for the data acquisition piece, and you can use this to your advantage for comparison purposes when you have something failing only on one half.

EDIT:  A further thought on this...  You should note that the memory chips all passed previous read/write tests without error.  The problem appears to be when the acquisition chips are in charge of writing to the memory.  The previous tests, I believe, are done through the bus interface (Altera) FPGAs.
« Last Edit: January 25, 2021, 04:13:26 am by MarkL »
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #61 on: January 25, 2021, 04:29:22 pm »
Quote
A further thought on this...  You should note that the memory chips all passed previous read/write tests without error.  The problem appears to be when the acquisition chips are in charge of writing to the memory.  The previous tests, I believe, are done through the bus interface (Altera) FPGAs.

From what I understand, this is the first test involving memory. All the later ones also fail.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #62 on: January 25, 2021, 10:02:57 pm »
Quote
A further thought on this...  You should note that the memory chips all passed previous read/write tests without error.  The problem appears to be when the acquisition chips are in charge of writing to the memory.  The previous tests, I believe, are done through the bus interface (Altera) FPGAs.

From what I understand, this is the first test involving memory. All the later ones also fail.
No.  The following tests also involve the acquisition memory and are performed before the anlyBusTest:

  dataBusTest
  addrBusTest
  hwMemoryCellTest
  unloadTest
  dmaTest
  sleepTest

You previously reported that all of these passed.
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #63 on: January 26, 2021, 05:59:54 pm »
Indeed you are right. Is there any more documentation showing what exactly is checked by which test? On a tad lower level, so I can compare it with the hardware.

The memory ICs seem to be connected only to the Virtex FPGAs, and the lines seem to be fine. So now it's time to check what is going on between the ASICs and the big FPGA that controls the offending memory chip...  :-DMM
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #64 on: January 26, 2021, 08:06:27 pm »
Quote
Indeed you are right. Is there any more documentation showing what exactly is checked by which test? On a tad lower level, so I can compare it with the hardware.
Other than what's stated in the service guide, I'm not aware of anything.  It's clearly stated that the acquisition memory is thoroughly tested via those tests.

If you're ever in doubt, you could always lift a leg on a working memory chip and observe which tests detect it.

Quote
The memory ICs seem to be connected only to the Virtex FPGAs, and the lines seem to be fine. So now it's time to check what is going on between the ASICs and the big FPGA that controls the offending memory chip...  :-DMM
I have a 16750A with one FPGA and one ASIC removed, and that is correct: the memory lines only go to the FPGA.

The 16715A/16716A/16717A cards (and maybe others) have a very similar, if not identical, acquisition ASIC footprint and PCB layout as the 16750A.  And in those cards, the ASIC drives the memory directly without intervening FPGAs.  Another thing is that even though the 16750A cards have more total memory bits, the number of physical memory chips is the same (34).

My suspicion is that Agilent did not want to re-spin (or maybe make very minimal changes to) the acquisition ASIC for cost reasons and instead opted to create an adaption layer to take the memory control signals from the existing ASIC and control a larger pool of memory.

So, it may be that there is a direct correlation for the control lines to U50/U59 to one or more signals coming out of the acquisition ASIC, U45.

The acquisition ASICs in the 16715A/16A/17A do have a different part number than the 16750A, but it's also a different manufacturer ("LSI L2B1075 HP 1821-3936" vs. "Agilent L2A1509").  They may have some differences internally, but I'm looking at the footprint and how the PCB traces are arranged.


EDIT:  After looking up "L2A1509", it appears that's also made by LSI.  So the manufacturer is the same.  Minor point.
« Last Edit: January 26, 2021, 08:49:19 pm by MarkL »
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #65 on: January 27, 2021, 10:19:15 pm »
I have read the guide etc but I have no idea how you know what is chip 9 etc.

Quote
If you're ever in doubt, you could always lift a leg on a working memory chip and observe which tests detect it.

This is one of the ideas that I had... But not yet  >:D. Tomorrow I plan to poke around some more between the ASIC and the FPGA responsible for U50/U59.

Regards!
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #66 on: January 28, 2021, 12:32:00 am »
Quote
I have read the guide etc but I have no idea how you know what is chip 9 etc.
I fixed a lot of these boards and have decoded some of the meaning printed by pv in the process, that's all.  My failed-to-fix pile is still bigger than the fixed pile, so there's always more to learn.  Having a pool of dead cards to experiment on and (sometimes destructively) examine is also helpful.

You might want to take a look at the 16753A/54A/55A/56A service guide.  Although the board is quite different, it does use a memory controller FPGA architecture and has a slightly better explanation for some of the tests in common with the 16750A.

Quote
If you're ever in doubt, you could always lift a leg on a working memory chip and observe which tests detect it.

This is one of the ideas that I had... But not yet  >:D. Tomorrow I plan to poke around some more between the ASIC and the FPGA responsible for U50/U59.

Regards!
I already tried lifting a leg on U59 a while ago.  Attached are my notes.  Note that "Chip 9" pops up, as well as some unexpected failures since those use the acquisition memory to store test results.  The first reported problem is usually the one to concentrate on, but examining subsequent failures can sometimes provide additional information or reinforce what was already reported.

Code: [Select]
Lifted DQ1 (pin 4) of U59 (memory MT48LC4M16A2) as a test on a working
board.  Memory glue FPGA (Virtex) for that column is U52, acquisition
ASIC for that half of the board is U45.

pv> x modtests
Mod   C: TEST passed       # "cpldRegTest" (1, 0, 1)
Mod   C: TEST passed       # "testLoadFPGA" (1, 0, 1)
Mod   C: TEST passed       # "fpgaRegTest" (1, 0, 1)
Mod   C: TEST FAILED       # "dataBusTest" (1, 1, -1)
Mod   C: TEST FAILED       # "addrBusTest" (1, 1, -1)
Mod   C: TEST FAILED       # "hwMemoryCellTest" (1, 1, -1)
Mod   C: TEST FAILED       # "unloadTest" (1, 1, -1)
Mod   C: TEST FAILED       # "dmaTest" (1, 1, -1)
Mod   C: TEST FAILED       # "sleepTest" (1, 1, -1)
Mod   C: TEST passed       # "searchTest" (1, 0, 1)
Mod   C: TEST passed       # "chipRegTest" (1, 0, 1)
Mod   C: TEST FAILED       # "anlyBusTest" (1, 1, -1)
Mod   C: TEST FAILED       # "clksTest" (1, 1, -1)
Mod   C: TEST FAILED       # "measAnlyBusTiming" (1, 1, -1)
Mod   C: TEST passed       # "bpClkTest" (1, 0, 1)
Mod   C: TEST passed       # "cmpTest" (1, 0, 1)
Mod   C: TEST passed       # "icrTest" (1, 0, 1)
Mod   C: TEST passed       # "flagTest" (1, 0, 1)
Mod   C: TEST passed       # "armTest" (1, 0, 1)
Mod   C: TEST passed       # "calTest" (1, 0, 1)
Mod   C: TEST passed       # "zoomDataTest" (1, 0, 1)
Mod   C: TEST passed       # "zoomMasterTest" (1, 0, 1)
Mod   C: TEST passed       # "fisoRedundancyTest" (1, 0, 1)
Mod   C: TEST passed       # "zoomAcqTest" (1, 0, 1)
Mod   C: TEST passed       # "zoomChipSelTest" (1, 0, 1)

pv> d r=9
debugLevel=0, mode=0, resultLevel=9

pv> x dataBusTest
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=40000000, act=00000000, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFFE, act=BFFFFFFE, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFFD, act=BFFFFFFD, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFFB, act=BFFFFFFB, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFF7, act=BFFFFFF7, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFEF, act=BFFFFFEF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFDF, act=BFFFFFDF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFFBF, act=BFFFFFBF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFF7F, act=BFFFFF7F, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFEFF, act=BFFFFEFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFDFF, act=BFFFFDFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFFBFF, act=BFFFFBFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFF7FF, act=BFFFF7FF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFEFFF, act=BFFFEFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFDFFF, act=BFFFDFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFFBFFF, act=BFFFBFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFF7FFF, act=BFFF7FFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFEFFFF, act=BFFEFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFDFFFF, act=BFFDFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFFBFFFF, act=BFFBFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFF7FFFF, act=BFF7FFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFEFFFFF, act=BFEFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFDFFFFF, act=BFDFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FFBFFFFF, act=BFBFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FF7FFFFF, act=BF7FFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FEFFFFFF, act=BEFFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FDFFFFFF, act=BDFFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=FBFFFFFF, act=BBFFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=F7FFFFFF, act=B7FFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=EFFFFFFF, act=AFFFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=DFFFFFFF, act=9FFFFFFF, Mems  U59
  Slot C, Data Failures, chip 9, bank 0, port 2, exp=7FFFFFFF, act=3FFFFFFF, Mems  U59
  Slot C, Data Error, chip 9, bank 0, port 2, bits=40000000, Mems  U59
Mod   C: TEST FAILED       # "dataBusTest" (2, 2, -1)

pv> x addrBusTest
  Slot C, chip 9, port 2, bank 0, addr=00000001, exp=FFFEFFFE, act=BFFEFFFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000002, exp=FFFCFFFC, act=BFFCFFFC, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000004, exp=FFFAFFFA, act=BFFAFFFA, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000008, exp=FFF6FFF6, act=BFF6FFF6, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000010, exp=FFEEFFEE, act=BFEEFFEE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000020, exp=FFDEFFDE, act=BFDEFFDE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000040, exp=FFBEFFBE, act=BFBEFFBE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000080, exp=FF7EFF7E, act=BF7EFF7E, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000100, exp=FEFEFEFE, act=BEFEFEFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000200, exp=FDFEFDFE, act=BDFEFDFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000400, exp=FBFEFBFE, act=BBFEFBFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00000800, exp=F7FEF7FE, act=B7FEF7FE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00001000, exp=EFFEEFFE, act=AFFEEFFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00002000, exp=DFFEDFFE, act=9FFEDFFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00008000, exp=7FFE7FFE, act=3FFE7FFE, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00010000, exp=FFFCFFFC, act=BFFCFFFC, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00020000, exp=FFFDFFFD, act=BFFDFFFD, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00040000, exp=FFF9FFF9, act=BFF9FFF9, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00080000, exp=FFF5FFF5, act=BFF5FFF5, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00100000, exp=FFEDFFED, act=BFEDFFED, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=00200000, exp=FFDDFFDD, act=BFDDFFDD, Mems U59
  Slot C, chip 9, port 2, bank 0, addr=003FBFFF, exp=40014001, act=00014001, Mems U59
 Address bits stuck:
  Slot C, Failed address test, Memory U59
  Slot C, chip 9, port=2, bank 0, AddrBitsHigh=0004000, Mems U59 U50
Mod   C: TEST FAILED       # "addrBusTest" (2, 2, -1)

pv> x hwMemoryCellTest
  Slot C, SDRAM Fpga 0 done. Num trys = 50.
  Slot C, SDRAM Fpga 2 done. Num trys = 50.
  Slot C, SDRAM Fpga 1 done. Num trys = 74.
  Slot C, fpga U52, port 0, Failed:    Top Bank, Hi Word,  Mem U59.
  Slot C, fpga U52, port 0, Failed: Bottom Bank, Hi Word,  Mem U89.
  Slot C, SDRAM Fpga 3 done. Num trys = 74.
Mod   C: TEST FAILED       # "hwMemoryCellTest" (2, 2, -1)

pv> x unloadTest
  Slot C, Chip 9: SDRAM Data Only Unload Test ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xfffb actual:0xbffb
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xffbf actual:0xbfbf
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xfbff actual:0xbbff
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xfffe actual:0xbffe
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xffef actual:0xbfef
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0xfeff actual:0xbeff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xefff actual:0xafff
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9: SDRAM Data Only Unload Test Failed!
  Slot C, Chip 9: SDRAM Count Only Unload Test ...
  Slot C, Chip 9: Interleaved Data & Count Unload Test ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0x552a actual:0x152a
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x00001d, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x00001e, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x000020, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9: Interleaved Data & Count Unload Test Failed!
  Slot C, Chip 9: SDRAM Data Unload Modes Test Failed!
  Slot C, Chip 8: SDRAM Data Only Unload Test ...
  Slot C, Chip 8: SDRAM Count Only Unload Test ...
  Slot C, Chip 8: Interleaved Data & Count Unload Test ...
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
Mod   C: TEST FAILED       # "unloadTest" (2, 2, -1)

pv> x dmaTest
  Slot C, Chip 9: DMA Data Only Unload Test ...
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe807, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe818, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe829, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe83a, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe84b, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe85c, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe86d, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe87e, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe88f, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8a0, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8b1, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8c2, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8d3, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8e4, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8f5, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe906, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe917, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9: DMA Data Only Test Failed!
  Slot C, Chip 9: DMA Count Only Unload Test ...
  Slot C, Chip 9: DMA Interleaved Data & Count Test ...
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe800, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe801, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe802, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80b, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80c, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80d, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80e, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80f, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe810, bank 0, port 2 MSW, exp:0x552a actual:0x152a
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe811, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe812, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe813, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81c, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81d, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81e, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81f, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe820, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9: DMA Interleaved Data & Count Test Failed!
> Slot C, Chip 9: DMA Unload Modes Test Failed!
  Slot C, Chip 8: DMA Data Only Unload Test ...
  Slot C, Chip 8: DMA Count Only Unload Test ...
  Slot C, Chip 8: DMA Interleaved Data & Count Test ...
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
Mod   C: TEST FAILED       # "dmaTest" (2, 2, -1)

pv> x sleepTest
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xfffb actual:0xbffb
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xffbf actual:0xbfbf
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xfbff actual:0xbbff
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xfffe actual:0xbffe
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xffef actual:0xbfef
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0xfeff actual:0xbeff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xefff actual:0xafff
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9: Top bank check failed before Sleep mode.
Mod   C: TEST FAILED       # "sleepTest" (2, 2, -1)

pv> x anlyBusTest
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=40000000, act=00000000
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFE, act=BFFFFFFE
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFD, act=BFFFFFFD
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFB, act=BFFFFFFB
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFF7, act=BFFFFFF7
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFEF, act=BFFFFFEF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFDF, act=BFFFFFDF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFBF, act=BFFFFFBF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFF7F, act=BFFFFF7F
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFEFF, act=BFFFFEFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFDFF, act=BFFFFDFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFBFF, act=BFFFFBFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFF7FF, act=BFFFF7FF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFEFFF, act=BFFFEFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFDFFF, act=BFFFDFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFBFFF, act=BFFFBFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFF7FFF, act=BFFF7FFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFEFFFF, act=BFFEFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFDFFFF, act=BFFDFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFBFFFF, act=BFFBFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFF7FFFF, act=BFF7FFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFEFFFFF, act=BFEFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFDFFFFF, act=BFDFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFBFFFFF, act=BFBFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FF7FFFFF, act=BF7FFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FEFFFFFF, act=BEFFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FDFFFFFF, act=BDFFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FBFFFFFF, act=BBFFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=F7FFFFFF, act=B7FFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=EFFFFFFF, act=AFFFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=DFFFFFFF, act=9FFFFFFF
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=7FFFFFFF, act=3FFFFFFF
  Slot C, Analysis Chip Data Bit Error, chip 9, port 2, bits=40000000
Mod   C: TEST FAILED       # "anlyBusTest" (2, 2, -1)
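
A quick way to see which data line the anlyBusTest output points at is to XOR each expected/actual pair. The snippet below is only a sketch for pasted log text: `failing_mask` and the regular expression are my own hypothetical helpers, not part of the pv tool, and it assumes that "MSW" means the upper 16 bits of the 32-bit port word. Only three of the lines above are used as sample input.

```python
import re

# Hypothetical helper (not part of the pv tool): matches the
# "exp=XXXXXXXX, act=XXXXXXXX" pairs printed by anlyBusTest.
ANLY_RE = re.compile(r"exp=([0-9A-Fa-f]{8}), act=([0-9A-Fa-f]{8})")

def failing_mask(log_text):
    """OR together (exp XOR act) over every matching log line."""
    mask = 0
    for exp, act in ANLY_RE.findall(log_text):
        mask |= int(exp, 16) ^ int(act, 16)
    return mask

# Three lines copied from the anlyBusTest output above.
sample = """\
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=40000000, act=00000000
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFE, act=BFFFFFFE
  Slot C, Analysis Chip Data Bit Failed, chip 9, port 2, exp=7FFFFFFF, act=3FFFFFFF
"""

mask = failing_mask(sample)
print(f"failing bits in port word: 0x{mask:08X}")
# Assumption: MSW = upper 16 bits of the 32-bit port word.
print(f"same bits seen in the MSW: 0x{mask >> 16:04X}")
```

For these three lines the mask comes out as 0x40000000, the same value the test reports in its "bits=" summary. Shifted down by 16 it is 0x4000, the "Bad Data" value reported for U59 in the SDRAM tests.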

pv> x clksTest
  TEST  1: Master Clock from CHIP 9 ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x4992 actual:0x0992
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x000019, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x4992 actual:0x0992
  Slot C, Chip 9, SDRAM U59: MAC:0x000023, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000026, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000029, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x00002a, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00002d, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
> Slot C, Chip 9: Master Clock from Chip 9 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  2: Slave  Clock from CHIP 9 ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xe733 actual:0xa733
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0x7339 actual:0x3339
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xcce7 actual:0x8ce7
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0xe733 actual:0xa733
  Slot C, Chip 9, SDRAM U59: MAC:0x000017, bank 0, port 2 MSW, exp:0x7339 actual:0x3339
  Slot C, Chip 9, SDRAM U59: MAC:0x00001b, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
> Slot C, Chip 9: Slave  Clock from Chip 9 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  3: Master Clock from CHIP 8 thru Chip 9 ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xb6db actual:0xf6db
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0x6d9b actual:0x2d9b
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xd9b6 actual:0x99b6
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xdb66 actual:0x9b66
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0x66db actual:0x26db
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000014, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0x6d9b actual:0x2d9b
  Slot C, Chip 9, SDRAM U59: MAC:0x000017, bank 0, port 2 MSW, exp:0xd9b6 actual:0x99b6
  Slot C, Chip 9, SDRAM U59: MAC:0x00001a, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00001b, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
> Slot C, Chip 9: Master Clock from Chip 8 thru Chip 9 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  4: Slave  Clock from CHIP 8 thru Chip 9 ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0x18c6 actual:0x58c6
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x8c63 actual:0xcc63
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0xcc63 actual:0x8c63
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xc633 actual:0x8633
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x6331 actual:0x2331
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000014, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0xcc63 actual:0x8c63
  Slot C, Chip 9, SDRAM U59: MAC:0x000019, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x00001a, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x00001e, bank 0, port 2 MSW, exp:0xc633 actual:0x8633
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x6331 actual:0x2331
  Slot C, Chip 9, SDRAM U59: MAC:0x000024, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
> Slot C, Chip 9: Slave  Clock from Chip 8 thru Chip 9 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  5: Psync  Clock from CHIP 8 thru CHIP 9 ...
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xb6db actual:0xf6db
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0x6db7 actual:0x2db7
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xdb76 actual:0x9b76
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0x76db actual:0x36db
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x6ddb actual:0x2ddb
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xddb6 actual:0x9db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0x6db7 actual:0x2db7
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0xdb76 actual:0x9b76
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0x76db actual:0x36db
> Slot C, Chip 9: Slave  Clock from Chip 9 thru Chip 8 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  6: Master Clock from CHIP 8 ...
  TEST  7: Slave  Clock from CHIP 8 ...
  TEST  8: Master Clock from CHIP 9 thru CHIP 8 ...
  TEST  9: Slave  Clock from CHIP 9 thru CHIP 8 ...
  TEST 10: Psync  Clock from CHIP 9 thru CHIP 8 ...
  TEST 11: Poststore Counter Test ...
Mod   C: TEST FAILED       # "clksTest" (2, 2, -1)

pv> x measAnlyBusTiming
  Chip 9: Maximum Time Range = 6300
  Slot Chip FineT Port SuTime  WinSize SuErrBits  HldErrBits STap HTap
  ==== ==== ===== ==== ======= ======= ========== ========== ==== ====
    C    9   100   0    5600   > 700    04080002   FFFFFFFF    7    0
    C    9   100   1    5300   >1000    00000240   FFFF6DBB   10    0
    C    9   100   2       0   >6300    40000000   443A6FFF   63    0
FAILED: Slot:C Chip:9 Port:2  Setup:   0 (Limit > 3000), Setup Tap = 63 (Limit < 20), Bits:40000000
    C    9   100   3    5600   > 700    00412000   FFFFFFFF    7    0
    C    9   100   B      -1   >   0        0000       FFFF  999  999
  Chip 8: Maximum Time Range = 5985
  Slot Chip FineT Port SuTime  WinSize SuErrBits  HldErrBits STap HTap
  ==== ==== ===== ==== ======= ======= ========== ========== ==== ====
    C    8    95   0    5225   > 760    00080000   FFFFFFFF    8    0
    C    8    95   1    5035   > 950    00000240   FFFFEDBB   10    0
    C    8    95   2    4940   >1045    88000000   043A6FFF   11    0
    C    8    95   3    5320   > 665    00412000   FFFFFFFF    7    0
    C    8    95   B      -1   >   0        0000       FFFF  999  999
Mod   C: TEST FAILED       # "measAnlyBusTiming" (3, 3, -1)
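
The SDRAM error lines are not all the same kind of error. In most of them the bad bit reads 0 where 1 was expected, but in clksTest tests 3 to 5 a few lines show it reading 1 where 0 was expected (for example exp:0xb6db actual:0xf6db). The sketch below tallies both directions from pasted log text. `classify` and the regular expression are hypothetical names of mine, not part of the pv tool, and the sample uses only three lines copied from the output above.

```python
import re
from collections import Counter

# Hypothetical helper (not part of the pv tool): matches the
# "exp:0xXXXX actual:0xXXXX" pairs printed by the SDRAM tests.
SDRAM_RE = re.compile(r"exp:0x([0-9A-Fa-f]{4}) actual:0x([0-9A-Fa-f]{4})")

def classify(log_text):
    """Count, per bit mask, how often bits dropped (1->0) or rose (0->1)."""
    counts = Counter()
    for exp_s, act_s in SDRAM_RE.findall(log_text):
        exp, act = int(exp_s, 16), int(act_s, 16)
        diff = exp ^ act
        dropped = diff & exp   # expected 1, read back as 0
        raised = diff & act    # expected 0, read back as 1
        if dropped:
            counts[("1->0", dropped)] += 1
        if raised:
            counts[("0->1", raised)] += 1
    return counts

# Three lines copied from the unloadTest and clksTest output above.
sample = """\
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xfffb actual:0xbffb
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xb6db actual:0xf6db
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
"""

counts = classify(sample)
for (direction, bits), n in sorted(counts.items()):
    print(f"{direction}  bits 0x{bits:04x}  seen {n} time(s)")
```

If both directions show up for the same bit, as they do here for 0x4000, a plain stuck-at-0 line seems less likely than a marginal or floating one. That would fit the measAnlyBusTiming failure on bits 40000000 of port 2, though it is only a guess from the logs.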

------------------------------------------------------------

Some of these tests print more detail with resultLevel set to 10:

pv> d r=10
debugLevel=0, mode=1, resultLevel=10

pv> x unloadTest
  Slot C, Chip 9: SDRAM Data Only Unload Test ...
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xfffb actual:0xbffb
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xffbf actual:0xbfbf
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xfbff actual:0xbbff
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xfffe actual:0xbffe
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xffef actual:0xbfef
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0xfeff actual:0xbeff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xefff actual:0xafff
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  -- Reading Bonus Data --
  Slot C, Chip 9: SDRAM Data Only Unload Test Failed!
  Slot C, Chip 9: SDRAM Count Only Unload Test ...
  -- Writing Count Data --
  -- Reading Count Data --
  Slot C, Chip 9: Interleaved Data & Count Unload Test ...
  -- Writing Interleaved Data & Tags --
  -- Writing Interleaved Bonus Data --
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0x552a actual:0x152a
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x00001d, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x00001e, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x000020, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  -- Reading Bonus Data --
  -- Reading Count Data --
  Slot C, Chip 9: Interleaved Data & Count Unload Test Failed!
  Slot C, Chip 9: SDRAM Data Unload Modes Test Failed!
  Slot C, Chip 8: SDRAM Data Only Unload Test ...
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  Slot C, Chip 8: SDRAM Count Only Unload Test ...
  -- Writing Count Data --
  -- Reading Count Data --
  Slot C, Chip 8: Interleaved Data & Count Unload Test ...
  -- Writing Interleaved Data & Tags --
  -- Writing Interleaved Bonus Data --
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  -- Reading Count Data --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
Mod   C: TEST FAILED       # "unloadTest" (6, 6, -1)

pv> x dmaTest
  Slot C, Chip 9: DMA Data Only Unload Test ...
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Reading Normal Data Using DMA --
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe807, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe818, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe829, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe83a, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe84b, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe85c, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe86d, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe87e, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe88f, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8a0, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8b1, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8c2, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8d3, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8e4, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe8f5, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe906, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe917, bank 0, port 2 MSW, exp:0x4000 actual:0x0000
  -- Reading Bonus Data Using DMA --
  Slot C, Chip 9: DMA Data Only Test Failed!
  Slot C, Chip 9: DMA Count Only Unload Test ...
  -- Writing Count Data --
  -- Reading Count Data Using DMA --
  Slot C, Chip 9: DMA Interleaved Data & Count Test ...
  -- Writing Interleaved Data & Tags --
  -- Writing Interleaved Bonus Data --
  -- Reading Normal Data Using DMA --
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe800, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe801, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe802, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80b, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80c, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80d, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80e, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe80f, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe810, bank 0, port 2 MSW, exp:0x552a actual:0x152a
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe811, bank 0, port 2 MSW, exp:0x54aa actual:0x14aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe812, bank 0, port 2 MSW, exp:0x52aa actual:0x12aa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe813, bank 0, port 2 MSW, exp:0x4aaa actual:0x0aaa
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81c, bank 0, port 2 MSW, exp:0xd555 actual:0x9555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81d, bank 0, port 2 MSW, exp:0x5555 actual:0x1555
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81e, bank 0, port 2 MSW, exp:0x5554 actual:0x1554
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe81f, bank 0, port 2 MSW, exp:0x5552 actual:0x1552
  Slot C, Chip 9, SDRAM U59: MAC:0x3fe820, bank 0, port 2 MSW, exp:0x554a actual:0x154a
  -- Reading Bonus Data Using DMA --
  -- Reading Count Data Using DMA --
  Slot C, Chip 9: DMA Interleaved Data & Count Test Failed!
> Slot C, Chip 9: DMA Unload Modes Test Failed!
  Slot C, Chip 8: DMA Data Only Unload Test ...
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Reading Normal Data Using DMA --
  -- Reading Bonus Data Using DMA --
  Slot C, Chip 8: DMA Count Only Unload Test ...
  -- Writing Count Data --
  -- Reading Count Data Using DMA --
  Slot C, Chip 8: DMA Interleaved Data & Count Test ...
  -- Writing Interleaved Data & Tags --
  -- Writing Interleaved Bonus Data --
  -- Reading Normal Data Using DMA --
  -- Reading Bonus Data Using DMA --
  -- Reading Count Data Using DMA --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
Mod   C: TEST FAILED       # "dmaTest" (2, 2, -1)

pv> x sleepTest
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xfffb actual:0xbffb
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xffbf actual:0xbfbf
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xfbff actual:0xbbff
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xfffe actual:0xbffe
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xffef actual:0xbfef
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0xfeff actual:0xbeff
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xefff actual:0xafff
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xffff actual:0xbfff
  -- Reading Bonus Data --
  Slot C, Chip 9: Top bank check failed before Sleep mode.
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
Mod   C: TEST FAILED       # "sleepTest" (1, 1, -1)

pv> x clksTest
  -- Writing Full Channel Normal Data --
  -- Writing Normal Bonus Data --
  TEST  1: Master Clock from CHIP 9 ...
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x4992 actual:0x0992
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x000019, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x4992 actual:0x0992
  Slot C, Chip 9, SDRAM U59: MAC:0x000023, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x000026, bank 0, port 2 MSW, exp:0x4926 actual:0x0926
  Slot C, Chip 9, SDRAM U59: MAC:0x000029, bank 0, port 2 MSW, exp:0x6492 actual:0x2492
  Slot C, Chip 9, SDRAM U59: MAC:0x00002a, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  Slot C, Chip 9, SDRAM U59: MAC:0x00002d, bank 0, port 2 MSW, exp:0x4924 actual:0x0924
  -- Reading Bonus Data --
> Slot C, Chip 9: Master Clock from Chip 9 Test Failed!
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  2: Slave  Clock from CHIP 9 ...
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xe733 actual:0xa733
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0x7339 actual:0x3339
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xcce7 actual:0x8ce7
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x739c actual:0x339c
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0xe733 actual:0xa733
  Slot C, Chip 9, SDRAM U59: MAC:0x000017, bank 0, port 2 MSW, exp:0x7339 actual:0x3339
  Slot C, Chip 9, SDRAM U59: MAC:0x00001b, bank 0, port 2 MSW, exp:0xce73 actual:0x8e73
  Slot C, Chip 9, SDRAM U59: MAC:0x00001c, bank 0, port 2 MSW, exp:0xe739 actual:0xa739
  -- Reading Bonus Data --
> Slot C, Chip 9: Slave  Clock from Chip 9 Test Failed!
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  3: Master Clock from CHIP 8 thru Chip 9 ...
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000000, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0x6d9b actual:0x2d9b
  Slot C, Chip 9, SDRAM U59: MAC:0x000006, bank 0, port 2 MSW, exp:0xd9b6 actual:0x99b6
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000a, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xdb66 actual:0x9b66
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0x66db actual:0x26db
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000011, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000014, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0x6d9b actual:0x2d9b
  Slot C, Chip 9, SDRAM U59: MAC:0x000017, bank 0, port 2 MSW, exp:0xd9b6 actual:0x99b6
  Slot C, Chip 9, SDRAM U59: MAC:0x00001a, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  -- Reading Bonus Data --
> Slot C, Chip 9: Master Clock from Chip 8 thru Chip 9 Test Failed!
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  4: Slave  Clock from CHIP 8 thru Chip 9 ...
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000003, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0xcc63 actual:0x8c63
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x00000d, bank 0, port 2 MSW, exp:0xc633 actual:0x8633
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x6331 actual:0x2331
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000014, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0xcc63 actual:0x8c63
  Slot C, Chip 9, SDRAM U59: MAC:0x000019, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x00001a, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x00001e, bank 0, port 2 MSW, exp:0xc633 actual:0x8633
  Slot C, Chip 9, SDRAM U59: MAC:0x00001f, bank 0, port 2 MSW, exp:0x6331 actual:0x2331
  Slot C, Chip 9, SDRAM U59: MAC:0x000024, bank 0, port 2 MSW, exp:0xc631 actual:0x8631
  Slot C, Chip 9, SDRAM U59: MAC:0x000025, bank 0, port 2 MSW, exp:0x6318 actual:0x2318
  Slot C, Chip 9, SDRAM U59: MAC:0x000029, bank 0, port 2 MSW, exp:0xcc63 actual:0x8c63
  -- Reading Bonus Data --
> Slot C, Chip 9: Slave  Clock from Chip 8 thru Chip 9 Test Failed!
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  5: Psync  Clock from CHIP 8 thru CHIP 9 ...
  -- Reading Full Channel Normal Data --
  Slot C, Chip 9, SDRAM U59: MAC:0x000001, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000002, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000004, bank 0, port 2 MSW, exp:0x6db7 actual:0x2db7
  Slot C, Chip 9, SDRAM U59: MAC:0x000005, bank 0, port 2 MSW, exp:0xdb76 actual:0x9b76
  Slot C, Chip 9, SDRAM U59: MAC:0x000007, bank 0, port 2 MSW, exp:0x76db actual:0x36db
  Slot C, Chip 9, SDRAM U59: MAC:0x000008, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000009, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000b, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x00000c, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x00000e, bank 0, port 2 MSW, exp:0x6ddb actual:0x2ddb
  Slot C, Chip 9, SDRAM U59: MAC:0x00000f, bank 0, port 2 MSW, exp:0xddb6 actual:0x9db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000010, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000012, bank 0, port 2 MSW, exp:0x6db6 actual:0x2db6
  Slot C, Chip 9, SDRAM U59: MAC:0x000013, bank 0, port 2 MSW, exp:0xdb6d actual:0x9b6d
  Slot C, Chip 9, SDRAM U59: MAC:0x000015, bank 0, port 2 MSW, exp:0x6db7 actual:0x2db7
  Slot C, Chip 9, SDRAM U59: MAC:0x000016, bank 0, port 2 MSW, exp:0xdb76 actual:0x9b76
  Slot C, Chip 9, SDRAM U59: MAC:0x000018, bank 0, port 2 MSW, exp:0x76db actual:0x36db
  -- Reading Bonus Data --
> Slot C, Chip 9: Slave  Clock from Chip 9 thru Chip 8 Test Failed!
  Slot C, Chip 9: Bad SDRAMs: U59 Bad Data: 0x4000
  TEST  6: Master Clock from CHIP 8 ...
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  TEST  7: Slave  Clock from CHIP 8 ...
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  TEST  8: Master Clock from CHIP 9 thru CHIP 8 ...
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  TEST  9: Slave  Clock from CHIP 9 thru CHIP 8 ...
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  TEST 10: Psync  Clock from CHIP 9 thru CHIP 8 ...
  -- Reading Full Channel Normal Data --
  -- Reading Bonus Data --
  TEST 11: Poststore Counter Test ...
Mod   C: TEST FAILED       # "clksTest" (1, 1, -1)
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #67 on: January 29, 2021, 09:51:03 pm »
Which leg did you lift? One of the data lines?

I'll try to poke around the board tomorrow to check for broken traces (again). Wish me luck.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #68 on: January 29, 2021, 10:13:43 pm »
Which leg did you lift? One of the data lines?
Says it at the top of the notes.
 

Offline nikodem

  • Regular Contributor
  • *
  • Posts: 60
  • Country: pl
Re: Series defect on agilent 167xx boards?
« Reply #69 on: February 10, 2021, 06:05:53 pm »
I have located a single trace that was broken between the acquisition ASIC and the FPGA glue logic  :clap: That damn f...failed trace was hidden UNDER THE OVERLAY, ugh. It looked kinda clocky, as it was length matched. Now I see the following result of anlyBusTest:

Code: [Select]
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=00000002, act=00000000
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFE, act=FFFFFFFC
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFB, act=FFFFFFF9
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFF7, act=FFFFFFF5
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFEF, act=FFFFFFED
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFDF, act=FFFFFFDD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFBF, act=FFFFFFBD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFF7F, act=FFFFFF7D
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFEFF, act=FFFFFEFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFDFF, act=FFFFFDFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFBFF, act=FFFFFBFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFF7FF, act=FFFFF7FD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFEFFF, act=FFFFEFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFDFFF, act=FFFFDFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFBFFF, act=FFFFBFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFF7FFF, act=FFFF7FFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFEFFFF, act=FFFEFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFDFFFF, act=FFFDFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFBFFFF, act=FFFBFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFF7FFFF, act=FFF7FFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFEFFFFF, act=FFEFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFDFFFFF, act=FFDFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFBFFFFF, act=FFBFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FF7FFFFF, act=FF7FFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FEFFFFFF, act=FEFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FDFFFFFF, act=FDFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FBFFFFFF, act=FBFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=F7FFFFFF, act=F7FFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=EFFFFFFF, act=EFFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=DFFFFFFF, act=DFFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=BFFFFFFF, act=BFFFFFFD
  Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=7FFFFFFF, act=7FFFFFFD
  Slot A, Analysis Chip Data Bit Error, chip 9, port 2, bits=00000002
Mod   A: TEST FAILED       # "anlyBusTest" (3, 3, -1)

Which is much much better, because the RAM now is working and only a single bit (DQ1 of the lower chip I presume) is faulty. The clksTest seems to confirm that:



I will try to investigate U50, especially the stuff that is connected to DQ1 (4th pin). Or maybe the failure is somewhere else? Because if DQ1 were, e.g., not connected, then as you have shown I should get many more errors. Right?
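For anyone following along with the anlyBusTest output above: XOR-ing each exp/act pair shows which bit differs and in which direction. The snippet below is only an illustrative sketch in Python, not part of pv or any Agilent tool. The helper name `failing_bits` is made up for this example, and only a handful of the log lines are pasted in.

```python
import re

# A few lines copied from the anlyBusTest output above (not the full log).
LOG = """\
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=00000002, act=00000000
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFE, act=FFFFFFFC
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFFFFFB, act=FFFFFFF9
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=FFFEFFFF, act=FFFEFFFD
Slot A, Analysis Chip Data Bit Failed, chip 9, port 2, exp=7FFFFFFF, act=7FFFFFFD
"""

PAIR = re.compile(r"exp=([0-9A-Fa-f]{8}), act=([0-9A-Fa-f]{8})")


def failing_bits(log):
    """Hypothetical helper: OR together exp^act for every failing line.

    Returns (mask, only_low). only_low is True when every differing bit
    was expected to be 1 but was read back as 0.
    """
    mask = 0
    only_low = True
    for exp_s, act_s in PAIR.findall(log):
        exp, act = int(exp_s, 16), int(act_s, 16)
        diff = exp ^ act          # bits that do not match
        mask |= diff
        if diff & act:            # a differing bit that reads 1 is not "stuck low"
            only_low = False
    return mask, only_low


mask, only_low = failing_bits(LOG)
print(f"bad bits: 0x{mask:08X}, always read low: {only_low}")
```

On these lines the only differing bit is 0x00000002, and it always reads 0 where 1 was expected, which agrees with the `bits=00000002` summary printed by the test itself.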

Thanks all for the support, and keep your fingers crossed!

// EDIT:

Found another trace between the ASIC and the FPGA that had been corroded in an invisible way... It wasn't even under those sliders with the corrosive glue (only next to one). Some serious scraping was needed to find where it was broken (under the solder mask). After soldering it together, the board passed all the tests!  :-+

I need to run some more tests now (the tests were just triggered from the GUI, I didn't have time for pv), but I hope that it is repaired!

Thank you all for your invaluable help and assistance.
« Last Edit: February 11, 2021, 08:50:56 am by nikodem »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #70 on: February 11, 2021, 10:19:38 pm »
I have located a single trace that was broken between the acquisition ASIC and the FPGA glue logic  :clap: That damn f...failed trace was hidden UNDER THE OVERLAY, ugh. It looked kinda clocky, as it was length matched.
This is why I cautioned you to methodically test end-to-end by probing the vias.  Corrosion damage is sometimes difficult to see.

Quote
Now I see the following result of anlyBusTest:
...
See attached photo for the trace that carries anlyBus bit 00000002.  I have indicated the two endpoint vias and the probe pad in the middle.  Again, a common spot for corrosion breaks is around the probe pads, but can occur anywhere.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #71 on: February 23, 2021, 01:34:48 am »
Glad you got it working, and you're welcome!

I didn't see your EDIT to your post that you got it working until after I posted.  (Actually, not until today.)

Which trace did you fix?  Was it the one I showed in the photo or something else?  Did your additional tests succeed also?
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #72 on: March 29, 2021, 03:51:22 am »
Seems I've got the runner failure dilemma..

i have 3 dead 16751A
i have 2 dead 16750A
i have 1 dead 16534A

working: 2 16534A
working: 2 16715A
working: 1 16716A

I have a 1680A now, so I'm probably not going to spend too much on these.. probably going to scrap them.

I will keep the 715/716A (after removing the runners and cleaning). It's funny how they were completely caked in dust.. seems the dust saved them. On all the cards with issues, the large Xilinx chips with silver tops look like they have started to "corrode/rust". Seems these boards all have a limited lifespan.

Arcade Board Repair Guru.  [ twitch: HammysHangout , youTube: Hammy Builds ]
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #73 on: March 29, 2021, 04:05:19 pm »
I've had corrosion/rust on the Xilinx tops too.  No metal on these boards seems to be safe.

The 16534A is the most repairable of the bunch.  I wouldn't scrap that.  Out of 6 or 7 broken 16534A cards that I had, I was able to fix all of them.  Most had corroded traces and visibly damaged components which were easily repaired.
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #74 on: March 30, 2021, 04:13:52 am »
Yep, Fixed the 16534.

I was also able to partly fix a 16751A that was failing just about every test. It is now down to 3 failures:

Analyzer Chip Memory Bus Test
System Clocks ( Master/Slave/Psync) Test 
Analyzer Memory Bus SU/H Measure

Any suggestions on the best method to "safely" remove the plastic runners without damaging traces?  I pulled up two pads removing one.. but they were just pogo pin test pads, so I was able to repair them without issue.



 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #75 on: March 30, 2021, 03:25:38 pm »
Yep, Fixed the 16534.
Ah, great!
Quote
...
Any suggestions on the best method to "safely" remove the plastic runners without damaging traces?  I pulled up two pads removing one.. but they were just pogo pin test pads, so I was able to repair them without issue.
Heat.

Heating the runners and the board around them before trying to pull them off works best for me.  However, it usually leaves behind half of the adhesive strip.  So once I get all the runners off, I go back and heat the leftover adhesive and carefully scrape it while it is hot with a piece of plastic with a sharp edge.

For scraping, I've read some people use an old credit card cut down to size.  I've found I need to be a little more aggressive and use the edge on a length of 3/8" square PVC.  When all 4 edges get dull and don't dig into the adhesive, I cut the end off for a fresh four edges.  Keeping a sharp edge is key, because when the edge gets dull it will slip and head directly for some innocent passives.

I then clean up any remaining residue by repeatedly sticking Gorilla tape to the residue, and then finally wipe the runner areas with IPA.  If it becomes obvious that IPA will take forever with really stubborn residue, I will resort to pro-strength Goof Off.  Goof Off never fails, but it's nasty stuff.

If the corrosion is widespread, I will also scrub the whole bottom of the board with IPA and a soft brush.
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #76 on: March 30, 2021, 04:07:31 pm »
Cool, I've been using a blue "cell phone case separator" but will add some heat..

I have one board that I removed the two runners from in the area I had damage on (the one for the memory bus errors), but I see no damage, so I may try out some WD-40, as it's known to remove adhesive as well.

I had a 16534 (working) that didn't use the black adhesive and shows zero signs of corrosion, so it's really just the ugly black adhesive.
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #77 on: May 24, 2021, 04:09:41 pm »
Anybody have any ideas on this? 16911A PSync fail. The card still works, but 1 channel seems to flop in the wind (same half of the card as U42). U42 has been replaced and still the same result.

Here is the error. I am pretty sure it's a bad via. Anybody know what the "PSync" signal is? Where it's located? It would be helpful if we had some form of schematics or signal identification for these cards, if they are no longer supported and nobody is repairing them..

16911A Logic Analyzer(A) running...
  System Clocks (J/K/L/M/Psync) Test running...
      TEST  1: JCLK routed to all chips...
      TEST  2: KCLK routed to all chips...
      TEST  3: LCLK routed to all chips...
      TEST  4: MCLK routed to all chips...
      TEST  5: PSYNC A routed to all chips...
      Slot A, Chip 0, RAM U42: MAC:0x000000, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000000 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000001, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000004 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000003, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x0000000c word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000005, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000014 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000007, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x0000001c word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000008, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000020 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000a, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x00000028 word=msw exp=0x9264 act=0x8264 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000b, nPort 0 MSW, exp:0x2649 nActual:0x3649
    FAIL: mode=PIO addr=0x6208 sample=0x0000002c word=msw exp=0x2649 act=0x3649 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000e, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000038 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000f, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x0000003c word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000011, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000044 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000012, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000048 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000014, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x00000050 word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000016, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000058 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000018, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000060 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000019, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000064 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00001b, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x0000006c word=msw exp=0x9264 act=0x8264 buf=0x00000000
    > Slot A, Chip 0: PSYNC with Chip 9 master  Test Failed!
      Slot A, Chip 0: Bad RAMs:  U42Bad Data: 0x1000
      TEST  6: PSYNC B routed to all chips...
      Slot A, Chip 0, RAM U42: MAC:0x000000, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000000 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000001, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000004 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000003, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x0000000c word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000005, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000014 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000007, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x0000001c word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000008, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000020 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000a, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x00000028 word=msw exp=0x9264 act=0x8264 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000b, nPort 0 MSW, exp:0x2649 nActual:0x3649
    FAIL: mode=PIO addr=0x6208 sample=0x0000002c word=msw exp=0x2649 act=0x3649 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000e, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000038 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000f, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x0000003c word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000011, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000044 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000012, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000048 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000014, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x00000050 word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000016, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000058 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000018, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000060 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000019, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000064 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00001b, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x0000006c word=msw exp=0x9264 act=0x8264 buf=0x00000000
    > Slot A, Chip 0: PSYNC with Chip 8 master Test Failed!
      Slot A, Chip 0: Bad RAMs:  U42Bad Data: 0x1000
      TEST  7: Each PSYNC routed to half the chips...
      Slot A, Chip 0, RAM U42: MAC:0x000000, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000000 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000001, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000004 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000003, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x0000000c word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000005, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000014 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000007, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x0000001c word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000008, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000020 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000a, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x00000028 word=msw exp=0x9264 act=0x8264 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000b, nPort 0 MSW, exp:0x2649 nActual:0x3649
    FAIL: mode=PIO addr=0x6208 sample=0x0000002c word=msw exp=0x2649 act=0x3649 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000e, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000038 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00000f, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x0000003c word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000011, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000044 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000012, nPort 0 MSW, exp:0x2499 nActual:0x3499
    FAIL: mode=PIO addr=0x6208 sample=0x00000048 word=msw exp=0x2499 act=0x3499 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000014, nPort 0 MSW, exp:0x9924 nActual:0x8924
    FAIL: mode=PIO addr=0x6208 sample=0x00000050 word=msw exp=0x9924 act=0x8924 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000016, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000058 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000018, nPort 0 MSW, exp:0x9249 nActual:0x8249
    FAIL: mode=PIO addr=0x6208 sample=0x00000060 word=msw exp=0x9249 act=0x8249 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x000019, nPort 0 MSW, exp:0x2492 nActual:0x3492
    FAIL: mode=PIO addr=0x6208 sample=0x00000064 word=msw exp=0x2492 act=0x3492 buf=0x00000000
      Slot A, Chip 0, RAM U42: MAC:0x00001b, nPort 0 MSW, exp:0x9264 nActual:0x8264
    FAIL: mode=PIO addr=0x6208 sample=0x0000006c word=msw exp=0x9264 act=0x8264 buf=0x00000000
    > Slot A, Chip 0: PSYNC with both masters Test Failed!
      Slot A, Chip 0: Bad RAMs:  U42Bad Data: 0x1000
  ...System Clocks (J/K/L/M/Psync) Test ended. Result: Failed***
...16911A Logic Analyzer(A) ended. Result: Failed***


____ ALL OTHER TESTS PASS ____
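One thing that can be pulled out of the log above with a few lines of Python is how the bad bit (0x1000, i.e. bit 12) actually fails. This is only a sketch under my own assumptions: the function name `classify` is made up, just one copy of each distinct failing pattern is pasted in, and since the test prints only failing samples, nothing here can be checked against the samples that passed.

```python
import re

# One copy of each distinct failing line from the PSYNC log above.
LOG = """\
Slot A, Chip 0, RAM U42: MAC:0x000000, nPort 0 MSW, exp:0x9249 nActual:0x8249
Slot A, Chip 0, RAM U42: MAC:0x000001, nPort 0 MSW, exp:0x2499 nActual:0x3499
Slot A, Chip 0, RAM U42: MAC:0x000003, nPort 0 MSW, exp:0x9924 nActual:0x8924
Slot A, Chip 0, RAM U42: MAC:0x000005, nPort 0 MSW, exp:0x2492 nActual:0x3492
Slot A, Chip 0, RAM U42: MAC:0x00000a, nPort 0 MSW, exp:0x9264 nActual:0x8264
Slot A, Chip 0, RAM U42: MAC:0x00000b, nPort 0 MSW, exp:0x2649 nActual:0x3649
"""

PAIR = re.compile(r"exp:0x([0-9a-f]{4}) nActual:0x([0-9a-f]{4})")


def classify(log, bit):
    """Hypothetical helper: describe how one data bit fails.

    Returns (ones_read_as_zero, zeros_read_as_one, follows_upper), where
    follows_upper is True if the value read on `bit` always equals the
    value that was expected on the next higher bit.
    """
    ones_read_as_zero = 0
    zeros_read_as_one = 0
    follows_upper = True
    for exp_s, act_s in PAIR.findall(log):
        exp, act = int(exp_s, 16), int(act_s, 16)
        want = (exp >> bit) & 1
        got = (act >> bit) & 1
        if want == 1 and got == 0:
            ones_read_as_zero += 1
        elif want == 0 and got == 1:
            zeros_read_as_one += 1
        if got != (exp >> (bit + 1)) & 1:
            follows_upper = False
    return ones_read_as_zero, zeros_read_as_one, follows_upper


print(classify(LOG, 12))
```

For these six lines, bit 12 is wrong in both directions (three 1s read as 0, three 0s read as 1), so it does not look like a simple stuck-high or stuck-low line, and the value read on bit 12 matches what was expected on bit 13 every time. That would be consistent with bit 12 picking up its neighbour or being sampled at the wrong moment, but it is a guess from a small sample and not a diagnosis.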
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #78 on: February 06, 2022, 12:47:35 pm »
This seems to be the place to discuss corrosion troubleshooting/repair for these boards, so here goes mine -

I've got a 16750B that had some corrosion (I removed the runners and cleaned everything up), and was failing quite a few self tests (Analyzer Chip Memory Bus Test, System Clocks Test, Analyzer Memory Bus SU/H Measure, Comparators Test, Zoom Acquisition Test, and Zoom Chip Select Test).  I repaired about a dozen corroded traces, and have everything passing except the Zoom Acquisition Test, and Zoom Chip Select Test.

Does anyone have thoughts on Zoom failures?  Comparing the non-Zoom 16715 to the boards with Zoom, I think the Zoom circuitry is the 5x 1NB4-5040 chips, and a couple other chips nearby.  I've inspected the areas around there very closely for corrosion under the microscope, as well as checked continuity anywhere I've seen anything suspicious looking... but haven't found anything.

Below is the output from pv (d=9, r=9).  It looks similar to the failure from this thread: https://www.eevblog.com/forum/testgear/agilent-16717a-comparator-and-zoomchipseltest-failures/ , though I'm not sure that was ever resolved.  From reading the log, it looks like maybe FISO #0 is working, but #1-#4 aren't (there's at least one FISO failed message for each, except #0).  The 0x8088, 0x888, and 0x808 regardless of expected data sorta gives me the feeling of a floating bus, like the chip isn't getting selected (especially since it's failing the chip select test).
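The "same value regardless of expected data" idea can be checked mechanically. Here is a rough Python sketch, not anything from pv itself: the names `readback_by_fiso` and `ignores_expected` are made up for illustration, and the sample text is a heavily shortened excerpt in the format of the zoomChipSelTest output in this post.

```python
import re
from collections import defaultdict

# Shortened excerpt in the format of the zoomChipSelTest output (repeats removed).
LOG = """\
  Slot A: Filling FISOs for Pods #1 with zeroes...
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
  Slot A: Filling FISOs for Pods #2 with zeroes...
    Slot A: Checking FISO #4...
      Actual = 0x888, Expected = 0xffff
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xdaff
"""

FISO = re.compile(r"Checking FISO #(\d+)")
READ = re.compile(r"Actual = 0x([0-9a-f]+), Expected = 0x([0-9a-f]+)")


def readback_by_fiso(log):
    """Collect the distinct actual and expected values seen for each FISO."""
    seen = defaultdict(lambda: {"actual": set(), "expected": set()})
    fiso = None
    for line in log.splitlines():
        m = FISO.search(line)
        if m:
            fiso = int(m.group(1))   # following readback lines belong to this FISO
            continue
        m = READ.search(line)
        if m and fiso is not None:
            seen[fiso]["actual"].add(int(m.group(1), 16))
            seen[fiso]["expected"].add(int(m.group(2), 16))
    return seen


def ignores_expected(entry):
    """True if the readback stayed constant while the expected data changed."""
    return len(entry["actual"]) == 1 and len(entry["expected"]) > 1


for fiso, entry in sorted(readback_by_fiso(LOG).items()):
    print(f"FISO #{fiso}: constant readback despite changing data: {ignores_expected(entry)}")
```

In this excerpt FISO #2 reads back 0x8808 for both 0xadff and 0xdaff, which fits the floating-bus or not-selected theory. FISO #4 has only one expected value here, so the check says nothing either way for it.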

I'm guessing nobody knows which pin is the chip select, or where it comes from, right?

I guess if everything except Zoom works on this board, that'd still be usable... but of course I'd rather fix it if I can. :)

Code: [Select]
pv> x zoomAcqTest
  Slot A: FISO 4 - Data Bits Stuck HIGH: 0xf7
> Slot A: Zoom Acquisition Data Lines Test Failed!
  Slot A: FISO 3 - Data Bits Stuck LOW:  0xf3
  Slot A: FISO 3 - Data Bits Stuck HIGH: 0xf7
  Slot A: FISO 2 - Data Bits Stuck HIGH: 0xf7
  Slot A: FISO 1 - Data Bits Stuck LOW:  0xf7
  Slot A: Chip9: edgeCount=1623, exp=811
> Slot A: Chip9: Zoom Acquisition Data Frequency Test Failed!
  Slot A: Chip8: edgeCount=0, exp=811
> Slot A: Chip8: Zoom Acquisition Data Frequency Test Failed!
Mod   A: TEST FAILED       # "zoomAcqTest" (4, 4, -1)


Code: [Select]
pv> x zoomChipSelTest
  Slot A: Filling FISOs for Pods #1 with zeroes...
    Slot A: Checking FISO #4...
    Slot A: Checking FISO #3...
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
      Actual = 0x8808, Expected = 0xadff
     Slot A: FISO #2 failed.
    Slot A: Checking FISO #1...
    Slot A: Checking FISO #0...
  Slot A: Filling FISOs for Pods #2 with zeroes...
    Slot A: Checking FISO #4...
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
      Actual = 0x888, Expected = 0xffff
     Slot A: FISO #4 failed.
    Slot A: Checking FISO #3...
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
      Actual = 0x8808, Expected = 0xdaff
     Slot A: FISO #2 failed.
    Slot A: Checking FISO #1...
    Slot A: Checking FISO #0...
  Slot A: Filling FISOs for Pods #3 with zeroes...
    Slot A: Checking FISO #4...
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
      Actual = 0x888, Expected = 0x5aad
     Slot A: FISO #4 failed.
    Slot A: Checking FISO #3...
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
      Actual = 0x8808, Expected = 0xffad
     Slot A: FISO #2 failed.
    Slot A: Checking FISO #1...
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
     Slot A: FISO #1 failed.
    Slot A: Checking FISO #0...
  Slot A: Filling FISOs for Pods #4 with zeroes...
    Slot A: Checking FISO #4...
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
      Actual = 0x888, Expected = 0xadda
     Slot A: FISO #4 failed.
    Slot A: Checking FISO #3...
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
      Actual = 0x808, Expected = 0xad5a
     Slot A: FISO #3 failed.
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
      Actual = 0x8808, Expected = 0xff5a
     Slot A: FISO #2 failed.
    Slot A: Checking FISO #1...
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
      Actual = 0x808, Expected = 0xffff
     Slot A: FISO #1 failed.
    Slot A: Checking FISO #0...
> Slot A: Zoom Acquisition Chip Select Test Failed!
Mod   A: TEST FAILED       # "zoomChipSelTest" (1, 1, -1)
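
Since the log is long and repetitive, here is a rough Python sketch that boils it down to the distinct "Actual" values seen per FISO.  It's only a helper for reading saved pv output; the function names are made up and the regexes assume the exact line format shown above.  On the log above, every Actual value only ever has bits 3, 7, 11, and 15 set (they are all subsets of 0x8888), no matter what was expected.

```python
import re
from collections import defaultdict

CHECK_RE = re.compile(r"Checking FISO #(\d+)")
ACTUAL_RE = re.compile(r"Actual = (0x[0-9a-fA-F]+), Expected = (0x[0-9a-fA-F]+)")

def summarize(log_text):
    """Map each FISO number to the set of 'Actual' values logged while checking it."""
    seen = defaultdict(set)
    current = None
    for line in log_text.splitlines():
        m = CHECK_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = ACTUAL_RE.search(line)
        if m and current is not None:
            seen[current].add(int(m.group(1), 16))
    return dict(seen)

def union_mask(summary):
    """OR together every Actual value to see which bits are ever set."""
    mask = 0
    for values in summary.values():
        for value in values:
            mask |= value
    return mask

# A few lines copied from the log above, one per distinct pattern.
sample = """\
    Slot A: Checking FISO #2...
      Actual = 0x8808, Expected = 0xadff
    Slot A: Checking FISO #4...
      Actual = 0x888, Expected = 0xffff
    Slot A: Checking FISO #1...
      Actual = 0x808, Expected = 0xffff
    Slot A: Checking FISO #0...
"""

summary = summarize(sample)
for fiso in sorted(summary):
    print(f"FISO #{fiso}: {sorted(hex(v) for v in summary[fiso])}")
print("bits ever set:", hex(union_mask(summary)))
```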



I also have a 16533A that sometimes passes some or all tests, and sometimes fails some or all tests.  I pulled the runners, though didn't see any corrosion that looked like it needed repair.  I haven't dug into this one at all yet (plan to double-check for corrosion), but if anyone's got tips to check for this one, I'd appreciate it as well.

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #79 on: February 06, 2022, 10:55:18 pm »
Quote
...
I'm guessing nobody knows which pin is the chip select, or where it comes from, right?
...
Well, I have a good guess.

From some troubleshooting I did on these boards a couple of years ago, it appears that pin 43 is the chip select on the 1NB4-5040 zoom chips (U27 U34 U42 U49 U55).  They go to the Altera FLEX FPGA (U26) near the backplane connector:

  Pin 43 (CS)     U26 pin
  -----------     -------
     U27            138
     U34            136
     U42            137
     U49             10
     U55             95

Since FISO 0 is the only one passing the test, perhaps its CS is floating and it's not getting off the bus while the other chips are being accessed.  I don't know which of the zoom chips is FISO 0, though.

If it's of any use, the data bus pins on the 1NB4-5040 appear to be 46, 48, 51, 53, 55, 57, 60, and 62.  But there are no other clues on that bus to be able to guess at the D7:D0 labels.
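
If it helps to have the pin notes above in a form you can tick off at the bench, here is a small Python sketch that turns them into a continuity checklist.  The pin numbers are just the ones reported in this post (best guesses, not official documentation), and the variable and function names are made up.

```python
# Pin numbers as reported in this post; none of this comes from HP/Agilent docs.
ZOOM_CS_PIN = 43
ZOOM_CS_TO_U26 = {"U27": 138, "U34": 136, "U42": 137, "U49": 10, "U55": 95}
ZOOM_DATA_PINS = [46, 48, 51, 53, 55, 57, 60, 62]  # D7:D0 order unknown

def cs_checklist():
    """One continuity check per zoom chip: CS pin to its reported U26 pin."""
    return [
        f"{chip} pin {ZOOM_CS_PIN} (CS) <-> U26 pin {fpga_pin}"
        for chip, fpga_pin in ZOOM_CS_TO_U26.items()
    ]

for line in cs_checklist():
    print("[ ]", line)
print("Data pins to scope on each zoom chip:", ", ".join(map(str, ZOOM_DATA_PINS)))
```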

Another thing you could do is set up a script to loop the zoomChipSelTest in pv and watch the various pins on the zoom chips with a scope and compare between chips, or if you have it, with another card that's working (or at least passes that test).

While looping tests, another trick is to perturb various signals with a 50R or 22R resistor to ground (use whatever value clearly causes a logic low).  This technique can help sort out what's working and can be used to identify data bit positions by watching the effect on the detailed debug output.

I would also check that the two regulators near the pod connectors, U20 and U21, are putting out the right voltages.  They are not present on 16715A cards, which do not have the zoom feature, so they have something to do with the zoom chips that get populated.

To save you from unsoldering the heatsinks, U20 is an LM2991S and U21 is an LT1086CM.  U20 should have -1.8V on its output, and U21 should have +3.3V on its output.

While I'm at it, the other mystery chip associated with the zoom feature, 1821-4731 (near U27, I don't see a U designation), appears to be a differential clock distribution driver.  When you gang these boards together, this chip on the master board becomes the master clock for all the zoom chips on the other boards.

A hint on probing: I believe (but can't prove) all signals are accessible somewhere on the bottom of the board on round pads domed with solder.  You'll see a lot of vias from the top that go underneath just to connect to a probe pad.  Nice touch by HP/Agilent.  It makes it easier to turn the whole chassis over, take the bottom off, and probe these cards from the bottom.  Maybe they were for automated testing.

If you probe from the top, all the TPxx hooks soldered into the board are GND for convenience.

Quote
I also have a 16533A that sometimes passes some or all tests, and sometimes fails some or all tests.  I pulled the runners, though didn't see any corrosion that looked like it needed repair.  I haven't dug into this one at all yet (plan to double-check for corrosion), but if anyone's got tips to check for this one, I'd appreciate it as well.
Like the above, I would check ALL the regulator outputs.  The output voltages are labeled near the regulators.  I've had more than one 16533A/16534A with way out of spec voltages because of resistors in the regulator's feedback network that had gone bad.  In a couple of cases it caused erratic behavior (like would pass calibration only some of the time).
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #80 on: February 07, 2022, 12:49:18 am »
Thanks a TON for that info.  I'll dig into this more tonight... hopefully I can track this down.

Quote
I've had more than one 16533A/16534A with way out of spec voltages because of resistors in the regulator's feedback network that had gone bad.  In a couple of cases it caused erratic behavior (like would pass calibration only some of the time).
I decided to take another look at the scope card... today, it seems to be consistently passing the FISO test, but failing everything else.  What kind of tolerance on these voltages is OK?  I pulled the bottom off the chassis and checked the voltages... at the regulator outputs I'm getting:
+6V = +6.11V
-6V = -6.19V
+7.3V = +7.31V
-2V = -2.12V
+2V = +1.84V

So... not TOO far off, but not dead-on, if that matters.
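
To put numbers on "not too far off", here is a quick Python sketch that computes the error of each reading relative to the voltage labeled on the board.  It doesn't know what tolerance the card actually needs (that isn't stated anywhere in this thread); it only does the arithmetic.

```python
# (labeled voltage, measured voltage) pairs from the list above.
READINGS = [(6.0, 6.11), (-6.0, -6.19), (7.3, 7.31), (-2.0, -2.12), (2.0, 1.84)]

def percent_error(nominal, measured):
    """Signed error in percent of nominal.

    Positive means the magnitude is larger than nominal, negative means smaller.
    """
    return (measured - nominal) / nominal * 100.0

for nominal, measured in READINGS:
    print(f"{nominal:+.1f}V rail: measured {measured:+.2f}V, "
          f"{percent_error(nominal, measured):+.1f}%")
```

The +2V rail is the furthest out (about 8% low), followed by the -2V rail (about 6% high in magnitude).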

I did find several vias that looked pretty bad.  The two in the first picture look like they go to GND, and I still get continuity from the top to GND, so I'm not too worried about those.  I'm not sure about the other two... the one goes from the outer layer to an inner layer, and the other goes from an inner layer to another inner layer.  So I don't know where they're supposed to go, nor whether they're actually getting there.

Is there a way to get better debugging info from the oscilloscope self-test?  Even with d=9 r=9, I only get verbose output from testTrigger.  I assume the third number is the return code... are those meanings defined anywhere?:
Code: [Select]
pv> x modtests
Mod   C: TEST passed       # "testFISO" (1, 0, 1)
Mod   C: TEST NOT EXECUTED # "testADC" (0, 0, 0)
Mod   C: TEST FAILED       # "testDAC" (1, 1, -3)
Fail Qual
Fail Qual
Fail Qual
Fail Qual
Fail Qual
Fail Qual
Fail Qual
Fail Holdoff
Fail Holdoff
Mod   C: TEST FAILED       # "testTrigger" (1, 1, -2047)
Mod   C: TEST FAILED       # "testTimebase" (1, 1, -3)
Mod   C: TEST FAILED       # "testIMB" (1, 1, -1)
pv> x testADC

1.  Disconnect all stimulus from Channels 1 and 2.
2.  Press OK to continue. [OK]

Mod   C: TEST FAILED       # "testADC" (1, 1, -1)
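
In case it's useful for comparing runs, here is a small Python sketch that parses the summary lines above into a table.  The meaning of the three numbers isn't documented anywhere in this thread, so the sketch just keeps them as a tuple; the first two might be run and fail counts, but that's a guess.  The regex assumes the exact line format shown above.

```python
import re

RESULT_RE = re.compile(
    r'Mod\s+(\w+): TEST (passed|FAILED|NOT EXECUTED)\s+# "(\w+)" '
    r'\((-?\d+), (-?\d+), (-?\d+)\)'
)

def parse_results(log_text):
    """Return one dict per pv summary line found in the text."""
    rows = []
    for m in RESULT_RE.finditer(log_text):
        rows.append({
            "module": m.group(1),
            "status": m.group(2),
            "test": m.group(3),
            # Meaning unknown; kept verbatim in the order pv prints them.
            "numbers": tuple(int(m.group(i)) for i in (4, 5, 6)),
        })
    return rows

# Lines copied from the output above.
sample = '''\
Mod   C: TEST passed       # "testFISO" (1, 0, 1)
Mod   C: TEST NOT EXECUTED # "testADC" (0, 0, 0)
Mod   C: TEST FAILED       # "testDAC" (1, 1, -3)
Mod   C: TEST FAILED       # "testTrigger" (1, 1, -2047)
'''

for row in parse_results(sample):
    print(f'{row["test"]:<12} {row["status"]:<12} {row["numbers"]}')
```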

I saw your post about the DebugScope file, but that didn't seem to give any additional troubleshooting info.


BTW, did you ever make your right angle adapter board concept from this thread: https://www.eevblog.com/forum/testgear/agilent-16717a-comparator-and-zoomchipseltest-failures/msg3094078/#msg3094078 ?

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #81 on: February 07, 2022, 11:01:13 am »
I looked into the 16750B card some more... I confirmed continuity of the chip select pins you mentioned.  I inspected the traces more, checking continuity across all traces going across/through the runner sections, and looked for any signs of corroded vias (didn't see any).  I also checked the voltages at the two regulators... they're 3.31V and -1.74V (seems like they should be close enough).

I also noticed that I get this error in pv, but it only shows up the first time I run the tests (right before zoomAcqTest fails).
Code: [Select]
Mod E: Unable to Detect WRAP Flag.  Possible Board Fault.

Do you know anything about the WRAP flag?

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #82 on: February 07, 2022, 08:57:40 pm »
Ok, we've got two things going here...  First the 16533A.

But pre-first, a few general comments:

- If you haven't done so already, I'd strongly recommend reading the Theory section in the Service Guide for the cards you're trying to fix.  I've found they're very accurate and must have been written by the hardware and software developers.  It's the only service info us mere mortals have, so every little detail can be (and usually is) important.

- And along with that, read the Self-Tests Description.  Like the Theory section, every word can be valuable when trying to decipher the debug output.

- When running pv, I've found that it's almost always the case that the first thing that fails is the thing that has to be fixed first.  pv doesn't understand failure dependencies, and it will try to use a failed sub-system to do further validation, which in all likelihood will cause those subsequent tests to also fail.  I'm not saying to ignore the subsequent failure output, but take it with a grain of salt and scan the output for further clues regarding the first failure.

Ok, now back to the 16533A.  I would focus first on the DAC.  Those regulator voltages look ok to me.

I agree that for the scope cards the pv output and the DebugScope output are fairly worthless.  There are some tantalizing strings to be found in the 16533A/16534A pv driver (different than the operational driver) that imply there's more debugging to be had.  I can't figure out how to access it.  (Anyone up for some assembly-level reverse engineering?  DDE is provided on the system.)

If the scope card will run enough to give you a trace or at least let you onto the configuration screen, you can start to experiment with the DAC.  Move the trigger up and down and the trace offset up and down (even if the trace is off the screen).  Watch the trigger and offset DAC outputs to see if it's responding.

The DAC is U200, HP part 1SJ2-0102, a 24-pin narrow DIP.  I have figured out this much:

     Pin        16533A/16534A Function
  ---------     ----------------------
   1 CH13       Ch2 Trigger Level
   2 CH14       Ch1 Trigger Level
   3 CH15
   4 DIN
   5 DCLK
   6 DGND
   7 DVDD       5V
   8 DAC_CLK    19.66080MHz
   9 DL_EN
  10 CH0        Ch1 Offset
  11 CH1        Ch2 Offset
  12 CH2

  13 CH3
  14 CH4
  15 CH5        "Startable Oscillator"?
  16 CH6
  17 CH7
  18 AVDD       +5.000V
  19 AGND
  20 CH8        DC Cal
  21 CH9        ? (is either 0V or 5V)
  22 CH10
  23 CH11
  24 CH12

It's a PWM DAC, so you won't see a nice analog voltage on the output pins unless you have a DMM with a good filter.  The 3478A is in that category.  If a DMM doesn't work for you, you can look with a scope, or you can reverse-engineer the PWM filter on the output and look at the voltage after the filter.
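
To illustrate why a filtered reading works, here is a rough Python sketch of a generic PWM wave going through a first-order low-pass.  It is not a model of the 1SJ2-0102 itself: the 0 to 5V swing (taken from the AVDD value in the table), the step counts, and the time constant are all assumptions.  The point is only that the filtered value settles near duty cycle times supply.

```python
def filtered_pwm(duty, vdd=5.0, steps_per_period=100, periods=200, tau_steps=2000.0):
    """Run an ideal 0..vdd PWM wave through a first-order low-pass and
    return the output averaged over the final PWM period."""
    alpha = 1.0 / tau_steps                      # per-step smoothing factor
    high_steps = round(duty * steps_per_period)  # steps per period spent high
    total = steps_per_period * periods
    v = 0.0
    tail = []
    for n in range(total):
        level = vdd if (n % steps_per_period) < high_steps else 0.0
        v += alpha * (level - v)                 # discrete RC filter update
        if n >= total - steps_per_period:
            tail.append(v)
    return sum(tail) / len(tail)

for duty in (0.25, 0.5, 0.75):
    print(f"duty {duty:.2f}: filtered output about {filtered_pwm(duty):.3f} V")
```

A meter without enough filtering sees the raw pulses instead, which is why the reading can jump around or look wrong.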

pv moves around the DAC output voltages to force triggers, so if the DAC is failing, this is a good example of why subsequent tests will also fail.

There's more than anyone will want to know about this DAC in the Feb 1992 HP Journal, page 48:

  http://hparchive.com/hp_journals

There's also a CLIP for the 546xxA digital scopes which use the DAC.  The schematic could be helpful.  This particular site looks a bit dubious, but you can search around for the file name for other locations:

  http://bee.mif.pg.gda.pl/ciasteczkowypotwor/HP/546XXA_CLIP_Package.zip

I've encountered bad resistors in the DAC area.  Alexandre Souza did a nice blog on one particular resistor that has been reported as bad 4 times:

  https://tabajara-labs.blogspot.com/2019/05/repair-of-hp16533a-and-for-that-matter.html

Did you try a calibration?  It will fail, but it goes through different steps and puts a lot more info in the DebugScope directory if you have it enabled.  Maybe there are some clues lurking there in what fails or what it puts in the debug files.

I never got around to making the angle adapter.  I kept finding ways to troubleshoot cards from the underside.  I thought I could find a 0.1" male angle header with long enough pins but I couldn't.  It's going to need a short PCB and I got lazy.

It looks like you still have copper on those vias.  They're probably ok but we can get back to them.  The first is in the timebase area and the other next to an ADC.  They shouldn't affect the operation of the DAC.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #83 on: February 07, 2022, 09:15:15 pm »
Quote
I looked into the 16750B card some more... I confirmed continuity of the chip select pins you mentioned.  I inspected the traces more, checking continuity across all traces going across/through the runner sections, and looked for any signs of corroded vias (didn't see any).  I also checked the voltages at the two regulators... they're 3.31V and -1.74V (seems like they should be close enough).

I also noticed that I get this error in pv, but it only shows up the first time I run the tests (right before zoomAcqTest fails).
Code: [Select]
Mod E: Unable to Detect WRAP Flag.  Possible Board Fault.

Do you know anything about the WRAP flag?

Thanks,
DogP
I agree those voltages are fine.

I haven't heard of the WRAP flag before.

I suggest the next step on the 16750B is to start looking at the data pins and the chip select on the zoom chips with a loop test running in pv.  Make sure they look like good TTL levels.  I'm sure there's also a R/W pin there too, but I haven't looked for it.  It can probably be identified by looking on that side of the chip as a signal that's in sync with CS or the data lines and gets set up before CS goes true.

Other possibilities are that one of the zoom chips is misbehaving or that some of the pins on the FPGA are not working.  I have seen both, and both times they manifested themselves as bogus logic levels on the bus.  But I think we're far from concluding it's a dead chip at the moment.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #84 on: February 08, 2022, 12:17:28 pm »
Once again, thanks for all that info!  I didn't have time to look at the DAC today, though I did try the calibration, which passed ADC, Gain, and Ext Trigger Skew, but failed everything else.

Not sure if it helps any... but it does seem that the scope is at least somewhat working.  When I press run, it sits at "Waiting for Prestore", but if I press stop, it does show the waveform.  And the waveform of both channels seems to be correct (I tested with a roughly 10kHz 2Vp-p sine wave, and it looks correct).

Attached is a zip of the debug output directory.  The one that jumps out at me is logictrig.file:
Code: [Select]
freq = 0, startable osc DAC = f800
Max Delay: No Trigger found
Min Delay, Rising Edge: No Trigger found
Trouble! Speeding up the startable oscillator to 1.08108e+08 Hz.
freq = 0, startable osc DAC = 8000
Error! Could not speed up the startable oscillator. Setting it to its maximum.
Min Delay, Higher Frequency : No Trigger found

Based on the frequency mentioned, I assume this is the 100 MHz oscillator referred to in the theory of operation?  I see your DAC notes say CH5 goes to the "Startable Oscillator", so I think you're right that I need to start by looking at the DAC.  It sounds like the sample clock comes from the 100 MHz oscillator, so I'd expect that the oscillator itself is working.
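
For comparing against other cards' debug files, here is a small Python sketch that pulls the measured frequency and DAC code out of the "startable osc" lines.  The regex assumes the line format shown in the excerpt above, and the function name is made up.  In this file both readings report freq = 0, even after the DAC code changes from f800 to 8000.

```python
import re

# Matches lines like: "freq = 0, startable osc DAC = f800"
OSC_RE = re.compile(r"freq = ([^,\s]+), startable osc DAC = ([0-9a-fA-F]+)")

def osc_readings(text):
    """Return a list of (frequency_hz, dac_code) tuples in file order."""
    return [(float(m.group(1)), int(m.group(2), 16)) for m in OSC_RE.finditer(text)]

# Lines copied from logictrig.file above.
sample = """\
freq = 0, startable osc DAC = f800
Max Delay: No Trigger found
freq = 0, startable osc DAC = 8000
"""

for freq, code in osc_readings(sample):
    state = "no frequency measured" if freq == 0 else f"{freq / 1e6:.3f} MHz"
    print(f"DAC code 0x{code:04x}: {state}")
```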

relay.file also had a couple error lines:
Code: [Select]
  Relay Errors Channel[0] =    112

  Relay Errors Channel[1] =    113

I had seen that post about the bad 100K resistor, and it was one of the first things I checked.  I desoldered it, it measured 100K out of circuit, so I put it back.

So, I plan to look at the DAC next, but please let me know if you have any other thoughts.

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #85 on: February 08, 2022, 07:06:01 pm »
Ah, it's good that you at least have traces on the screen.  The waveform capture and the signal path to the ADCs is working.  That side of things is a real bear to troubleshoot.

Does the offset control move the traces around?  The fact that the traces are somewhat near the middle of the screen would imply the offset might be working.

My initial guess is that there's a problem with the trigger level.

Will it trigger on the other channel?  Probably not, based on the calibration output.  If not, we're probably looking at something in the DAC area that's common to both channels.

Can you get it to trigger by giving it a very large signal that's way off the screen?  Can you get it to trigger by moving the trigger control to the + and - extremes?

One thing you can do is look at the trigger level input to the analog trigger chip/section.  There are at least 3 versions of these boards.  Some have the analog trigger implemented with discrete ECL.  Can you post a picture of the top of your board?

Although it's better to probe the trigger level at the analog trigger section, you can also probe the trigger level in the DAC area.  Attached is a photo of the probe locations.  Do the trigger voltage levels change when you move the trigger control knob/menu?

Here are my notes from one particular card.  Your values are going to be a little different and depend on the stored calibration values.  It's probably obvious, but the scope must be running (acquiring) for it to update the DAC with any new trigger and offset values as you are changing them.

  Trigger position outputs from DAC area
    Condition Ch1/Ch2: Offset = 0.000V, 1.0V/div
 
      Trigger Level               Ch1 (V)  Ch2 (V)
      -----------------------     -------  -------
      Top graticule (+4.0V)       +0.297   +0.265
      Center graticule (0.0V)     +0.055   +0.027
      Bottom graticule (-4.0V)    -0.187   -0.211
 
  Offset position outputs from DAC area
    Condition Ch1/Ch2: Trigger = 0.000V, 1.0V/div
 
      Offset Level                Ch1 (V)  Ch2 (V)
      -----------------------     -------  -------
      Top graticule (-4.0V)       -0.177   -0.173
      Center graticule (0.0V)     -0.032   -0.026
      Bottom graticule (+4.0V)    +0.114   +0.121
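
If you want a rough idea of what to expect at settings in between, here is a Python sketch that draws a straight line through the end points of the Ch1 trigger column above and checks it against the center reading.  It only describes this one card's numbers; the function names are made up, and other cards will differ with their stored calibration.

```python
def fit_line(points):
    """Two-point line through the first and last (setting_V, measured_V) pairs."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    slope = (y1 - y0) / (x1 - x0)
    intercept = y0 - slope * x0
    return slope, intercept

def predict(points, setting):
    """Expected DAC-area voltage for a given on-screen trigger setting."""
    slope, intercept = fit_line(points)
    return slope * setting + intercept

# Ch1 trigger level column from the table above: (trigger setting, DAC-area volts).
CH1_TRIGGER = [(-4.0, -0.187), (0.0, 0.055), (4.0, 0.297)]

slope, intercept = fit_line(CH1_TRIGGER)
print(f"Ch1 trigger: {slope * 1000:.1f} mV per volt of setting, "
      f"{intercept * 1000:.0f} mV at center")
print(f"Predicted at 0.0V: {predict(CH1_TRIGGER, 0.0):+.3f} V (measured +0.055 V)")
```

On a suspect card, the thing to look for is whether the voltage moves at all and roughly in proportion as the setting changes, rather than matching these exact values.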


Here's a good calibration from logictrig.file, if it helps:

  freq = 9.99985e+07, startable osc DAC = 7000
  Max Delay: No Trigger found
  Min Delay, Rising Edge: Yes Trigger found
    DurationCount = 1  DurationClock = 1
    DurationCount = 1  DurationClock = 0
    DurationCount = 0  DurationClock = 1
    DurationCount = 0  DurationClock = 0
    DurationCount = -1  DurationClock = 1
    Delay Adjust = 31
    Delay Adjust = 15
    Delay Adjust = 23
    Delay Adjust = 19
    Delay Adjust = 21
    Delay Adjust = 22
  DelayAdj = 22  DurationCount = -1  DurationClock = 1
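
As an aside, the Delay Adjust sequence above (31, 15, 23, 19, 21, 22) is what you'd get from a search that halves its step each time (16, 8, 4, 2, 1).  That's an inference from the numbers, not something documented, and the Python sketch below is only an illustration with made-up names.

```python
def halving_search(too_high, start=31, first_step=16):
    """Step down when too_high(value) is true, up otherwise, halving the step
    each time.  Returns every value tried, in order."""
    value = start
    step = first_step
    tried = [value]
    while step >= 1:
        value = value - step if too_high(value) else value + step
        tried.append(value)
        step //= 2
    return tried

# Pretend the ideal delay setting is 22: anything above it counts as too high.
sequence = halving_search(lambda v: v > 22)
print(sequence)
```

If that reading is right, a failing card that never finds a trigger would have nothing for the search to converge on, which fits the "No Trigger found" lines in the bad file.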

I've attached all the files from a good calibration from two cards.  I'll take a look through your files and post again if I see anything that might help.

There's a lot of opamps in the DAC section.  One quick troubleshooting method is to get the pinout for each opamp and compare the inverting and non-inverting inputs.  The difference should be 0V.  This is a good first pass that can catch failing feedback networks, assuming none of them are being used as comparators which I haven't found to be the case (at least yet).
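
If you log the readings as you go, a few lines of Python can flag the suspects.  This is just a sketch: the designators and voltages below are made-up placeholders, and the 10 mV limit is an arbitrary assumption, not a spec.

```python
def suspect_opamps(readings, limit_v=0.010):
    """Return the names of opamps whose inputs differ by more than limit_v.

    readings maps a designator to (non_inverting_V, inverting_V).
    """
    return [
        name
        for name, (v_plus, v_minus) in readings.items()
        if abs(v_plus - v_minus) > limit_v
    ]

# Hypothetical bench readings, for illustration only.
example = {
    "opamp A": (1.250, 1.251),   # inputs track each other: loop looks closed
    "opamp B": (0.500, 2.300),   # large difference: check its feedback network
}

print(suspect_opamps(example))
```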

Another thing to check is the DC Cal output.  You can set the BNC to any voltage from 0 to +5.000V.  Try some different voltages and make sure it works.

Quote
...
Based on the frequency mentioned, I assume this is the 100 MHz oscillator referred to in the theory of operation?  I see your DAC notes say CH5 goes to the "Startable Oscillator", so I think you're right that I need to start by looking at the DAC.  It sounds like the sample clock comes from the 100 MHz oscillator, so I'd expect that the oscillator itself is working.
To this day I'm still not sure what's meant by "Startable Oscillator", but you can certainly check the sample clock by setting up scope debug mode and then in the Calibration window new choices appear for the BNC output.  One of them is the 100MHz sample clock.  But given that you see waveforms, I think it's ok.

I think the first priority is to verify the trigger levels from the DAC.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #86 on: February 09, 2022, 03:17:37 am »
Thanks!  I just did some quick testing, and here are some quick answers... I'll dig deeper later tonight.

>Does the offset control move the traces around?
Yes (though I assumed that was just where the software draws the trace on the screen).

>Will it trigger on the other channel?
No, neither channel will trigger.

>Can you get it to trigger by giving it a very large signal that's way off the screen?
It doesn't seem like it... I put a large sine wave going in, which looked like a square wave on the screen, but no trigger.

>Can you get it to trigger by moving the trigger control to the + and - extremes?
Again, doesn't seem to.  I tried min, max, and 0 at min and max scale on both channels, but doesn't seem to trigger.  I also tried trigger immediate, which I would have expected to just work (doesn't depend on a trigger level or anything), but it didn't.  Also tried both "All" and "Partial" Acquisition Memory to Display.

>Can you post a picture of the top of your board?
Attached.

>You can set the BNC to any voltage from 0 to +5.000V.
Yes, this does work.  The manual says this is one output from the 16-channel DAC... does that mean the DAC is probably OK?

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #87 on: February 09, 2022, 04:28:59 am »
>It looks like you still have copper on those vias.  They're probably ok but we can get back to them.  The first is in the timebase area and the other next to an ADC.  They shouldn't affect the operation of the DAC.
I think one or both of these vias might be my problem.  The picture kinda sucks (slightly better ones attached), but there's definitely no copper pad left on the bottom of the one, and the other is missing the copper pad on both the top and bottom.  With no copper pad on both the top and bottom, I assumed there's little or no via plating left either.

On the one with a pad still on the top, I put a small dot of solder, and let it flow down into the via a bit.  On the one with no pads, obviously solder wouldn't stick to anything... so I took a piece of 30 AWG wire and poked it into the hole, and put some solder on it.  It wouldn't go all the way through, so I pushed a wire in both sides, hoping maybe the solder would flow to any exposed copper left in the hole.

Anyway, I had little hope, but I popped it in, and IT WORKS.

So, any chance you can trace where those two vias connect to, so I can do a better permanent repair?  Theoretically, I can trace them right now... but I have little confidence in this hack reliably making contact while I'm pressing my DMM probe against it.

>To this day I'm still not sure what's meant by "Startable Oscillator"
I guess my assumption was that it's an oscillator with an enable line, but leave it to HP to complicate such a simple concept. ;)
https://patents.google.com/patent/US3921095A/en
Also discussed in the August 1978 HP Journal (and a few other HP Journals as well - I don't see any references outside of HP though)

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #88 on: February 09, 2022, 11:46:50 pm »
That's great you got it working!

Perhaps step #0 should be: "Always fix everything that appears it might be bad, even if it looks like it has nothing to do with the issue you're chasing."  That's one from the software side, actually.

I buzzed out those two vias.  I called them A and B.  Photos attached.

I don't know their function, but maybe I'll set up the chassis again and see what signal is on them.  They certainly both go to the timebase area.

If you manage to break it again in the process of fixing the vias, would you mind seeing if the external trigger input and the external trigger output are working?  I think we would have found the trigger levels from the DAC were (and are) working properly, and this would have been the next step.  It could provide a little insight for next time someone has a trigger issue.  Thanks!
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #89 on: February 10, 2022, 02:34:56 am »
>I buzzed out those two vias.  I called them A and B.  Photos attached.
Thanks!  Though the damaged vias don't visibly connect to anything, so there should be at least two endpoints for A and B, right?  Any idea where the other end(s) connect to?  Hmm... maybe the hybrid(s) since they're suspiciously close by?


>If you manage to break it again in the process of fixing the vias, would you mind seeing if the external trigger input and the external trigger output are working?
The odds seem pretty good that I'll break it again in the process, so yep, I can try that (especially likely if the one w/ my wire through it is the cause).  Though I've never used external triggering on these scope cards... what's the best way to test?

It triggers on ECL levels, right?  I don't have much ECL stuff... can I drive it w/ a function generator, and what are safe input levels?  The manual seems to just discuss using those for daisy-chaining cards.  Is that an SMB connector?


Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #90 on: February 10, 2022, 03:47:17 pm »
>I buzzed out those two vias.  I called them A and B.  Photos attached.
Thanks!  Though the damaged vias don't visibly connect to anything, so there should be at least two endpoints for A and B, right?  Any idea where the other end(s) connect to?  Hmm... maybe the hybrid(s) since they're suspiciously close by?
...
Yes, sorry for not engaging brain before posting.

Below is the other end of A, and one other place where B shows up.

I believe B is actually the -2V (-2.1V) supply.  It has a very low resistance to the output leg on the regulator.  If you look at the two locations where I found B in the bad area, it's connected to a decoupling cap to ground and a 68.1R resistor to a signal trace.  So B appears to be providing power to some ECL terminations.  Since the -2V supply is everywhere, it's really hard to say where that specific via goes on a working board.  The closest connection is what I've shown, which is my best guess.  It's not on any pins under the hybrid.
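To put rough numbers on why those terminations matter, here is a small Python sketch of the DC current each 68.1R terminator carries. The -0.9V and -1.75V logic levels are assumed nominal ECL values, not measurements from this board, and -2.0V is used as a round figure for the termination supply.

```python
def termination_current_ma(v_signal, v_tt=-2.0, r_term=68.1):
    """Current in mA flowing from the signal trace through the terminator into V_TT."""
    return (v_signal - v_tt) / r_term * 1000.0


V_HIGH = -0.9   # assumed nominal ECL high level (not measured on this board)
V_LOW = -1.75   # assumed nominal ECL low level (not measured on this board)

for name, v in (("high", V_HIGH), ("low", V_LOW)):
    print(f"{name}: {termination_current_ma(v):.1f} mA")
```

If the via feeding V_TT goes open, that pull-down path disappears for every terminator downstream of the break, which fits with B being a supply net rather than a signal.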

If your continuity on the via gets broken while you're fixing it, you'd be in a better position to verify the actual connectivity.

I would consider it a last resort, but it is possible to excavate the via if you can't get a reliable connection from the top or bottom.

EDIT: Oh, and one further note.  I did take a look at A and B with a scope.  Unsurprisingly, B doesn't move.  And A appears to be TTL and changes state while calibrating the logic trigger.  Not very exciting.
« Last Edit: February 10, 2022, 03:51:07 pm by MarkL »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #91 on: February 11, 2022, 04:49:16 am »
Awesome... thanks for checking these!  Hopefully I can make a permanent repair to this tonight.

I haven't made any useful progress on the 16750B... I don't have any cables for these boards, so I can't test with any real signals at the moment.  I ordered a 16716A (parts board) w/ cable set, so I plan to put this repair on hold until the cables arrive.  I'm hoping if I can drive some signals into it, I'll be able to see what's actually working/not working better.

Of course, then I'll have a (likely dead) 16716A, which I'll want to try repairing too. :-P

I do plan to dig through some binaries and see if I can figure anything out on the WRAP flag.  I'm wondering if it's specific to the zoom chips, is it connected to the FLEX FPGA or another chip, etc.  Then maybe I can see how many of the zoom chip pins are shared, and which go back to the FLEX, and compare between the A and B boards that I have.

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #92 on: February 11, 2022, 01:11:29 pm »
Any chance you could double-check the measurements?  In my "currently working" state, I measured to confirm mine was actually making a good connection, even in the poor hacked state.

I get 0 ohms to your first point 'A', but on the 2nd point 'A' I'd briefly get a beep and then it showed about 230 ohms steady state.  If you can confirm this is 0 ohms on yours, I'll just add a jumper wire.

On 'B', I was getting about 9 ohms.  I probed around a bit, and found this ferrite near U502 that actually looked like 0 ohms.  Though I didn't find any other places this went, so I wonder if my repair is only half connected.  Can you see whether you get similar measurements with that ferrite, and if there's anywhere else you find it connected?

You might be right about excavating the via.  Since I've already soldered to mine, it's a bit messy and hard to see... if you shine a bright light behind your vias, can you see a trace leaving the sites, and if so, in which direction?  With the ground and power planes nearby, it's probably tough to see.

Thanks again!
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #93 on: February 11, 2022, 03:13:37 pm »
On the two "A" points I'm getting 1.2R, which is not too unreasonable but perhaps a bit high.

On B I'm getting 0.133R.  There are a lot of beads in this section which are going to be low resistance and it's hard to tell which side we're on.  I used a high current across the two B points (1.5A limited to 0.4V) when I was searching and I couldn't find any components that had voltage drop across them, and an IR camera did not light up on anything.  It seems likely that the B via near the edge would have another leg or two coming off it given where they put it.
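As a sanity check on those supply settings, here is a short Python sketch of where an idealized bench supply lands with a 1.5A current limit and a 0.4V voltage limit. Lead and probe resistance are ignored, and the function name is just for illustration.

```python
def supply_operating_point(r_load, i_limit=1.5, v_limit=0.4):
    """Return (volts, amps, mode) for an ideal CV/CC bench supply driving r_load ohms."""
    if i_limit * r_load <= v_limit:
        # The load is low enough that the current limit is reached first.
        return i_limit * r_load, i_limit, "CC"
    # Otherwise the voltage limit is reached first and the current falls off.
    return v_limit, v_limit / r_load, "CV"


# 0.133R is the good-board reading above; 9R is the reading reported on the damaged board.
for r in (0.133, 9.0):
    volts, amps, mode = supply_operating_point(r)
    print(f"{r} ohm: {volts:.3f} V, {amps:.3f} A, {mode}")
```

With the 0.4V cap, a damaged higher-resistance path only ever sees tens of milliamps, which is the point of limiting the voltage while hunting for drops.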

Do you know a friendly dentist or vet who would xray these areas for you?

If you want to hold on making any permanent repairs, I can do this later today:

- I will try to find the exact card type you have and double check the continuity.  I have a pile of these cards, and all of them are (or at least were) working.   I've been testing with a card that is the same as yours in these sections, but who knows.

- I will try a different tracing method.  I have a PCB track current probe (I-prober 520) and I will inject a signal between the seemingly connected points to follow the exact path of the buried traces.  I'll try to find and trace all the end points with this method too in case the signal splits into multiple paths at the damaged vias.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #94 on: February 11, 2022, 05:40:10 pm »
If you don't mind doing further tests, I can certainly wait... but I appreciate the confirmation!  I just wanted to make sure it wasn't a case of quickly probing w/ a DMM continuity checker without noting the actual R value.  Of course I wonder why mine is higher resistance, but maybe it's just part of my board's defect(s).

>Do you know a friendly dentist or vet who would xray these areas for you?
Unfortunately, no... I tried to convince my boss to buy one a few years ago, but couldn't justify it.  I personally have a scintillator, but no xray source (nor safe way to operate one!).  Someday... ;)

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #95 on: February 11, 2022, 09:49:53 pm »
Is your board a 16533-66504 on the stickers in the lower right?  I have one, but before I dig in I wanted to confirm because it's not clear in your photo.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #96 on: February 11, 2022, 11:27:31 pm »
>Is your board a 16533-66504 on the stickers in the lower right?
Yes, that's correct.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #97 on: February 12, 2022, 12:43:22 am »
Ok, I've done the verification with my 16534-66504 board.  It's the same as your 16533-66504 except for the model setting resistors, as far as anyone knows.

I can confirm the resistances are the same for both the A and B vias that I posted before.

After tracing with the I-prober...

The A endpoints are correct as previously shown.  I was unable to find that A goes anywhere else.

The B endpoints are numerous.  In most locations where you'll find B, you'll also find a 68.1R termination resistor and an MLCC.  That's no surprise.  There's a ton of them on the back near the center of the board.

The internal trace at the broken B via goes horizontally in both directions parallel to the edge of the board.  I was not able to find any trace that went vertically coming out of the broken via.  If you look at the photo, there's another B via to the right.  I'm guessing the purpose of the broken B via was to change layers, and then probably went back to the original layer with the B via on the right.  From that via, the B trace continues on towards the front of the card, turns up until it gets to the external trigger output, and then dives back towards the center of the board where it appears to power most, if not all, of the ECL terminations.  I can imagine lots of things would break if these terminations did not have power.  Most likely this was your trigger failure.

My suggestion would be to jump the left B via/traces and the right B via shown in the photo near the bottom edge of the board.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #98 on: February 12, 2022, 01:39:11 am »
Awesome... thanks SO much for all this info!  Hopefully I'll get this fixed up tonight.  And hopefully the seller on ebay gets around to shipping the LA card/cables that I ordered sometime soon, so I can get that other board repaired, and start putting this machine to use. :-P

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #99 on: February 12, 2022, 10:36:03 am »
Got this board repair wrapped up... added the jumper wires as you recommended, tacked it down with some nail polish, and everything seems to be 100%.

BTW, it looks like the 9 ohm measurement on my board was the connection from the left 'B' to the middle 'B'.  I had 0 ohms from the middle 'B' to the right 'B'.  Weird...

Thanks again for all the measurements, pics, etc!  I'll post if I make any progress on the 16750B, WRAP flag, etc.

DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #100 on: February 12, 2022, 03:40:59 pm »
Great!  And you're welcome!  It's challenging and fun troubleshooting this old technology.

I assume you saw some of the posts to upgrade your 16533A to a 16534A?

...
>BTW, it looks like the 9 ohm measurement on my board was the connection from the left 'B' to the middle 'B'.  I had 0 ohms from the middle 'B' to the right 'B'.  Weird...
Yeah, that's strange.  My probing showed it was a direct shot from the left B to the middle B.  Maybe the connectivity to the left trace inside the via is/was damaged by the corrosion on your board.

I'll look forward to your posts on the 16750B, and maybe the 16716A when you get to it.  The 16750B is a much more capable module, and can be upgraded to a 16752B, so you might not miss having a 16716A.  But then there's the nagging, "Well, I might be able to fix this..."
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #101 on: February 12, 2022, 09:16:44 pm »
>I assume you saw some of the posts to upgrade your 16533A to a 16534A?
>can be upgraded to a 16752B, so you might not miss having a 16716A.
Yep, once I tested the cards unmodified (to make sure I knew the state they were in when I started), the first thing I did was "upgrade" all of them.

>But then there's the nagging, "Well, I might be able to fix this..."
Heh, that's exactly what I'm expecting will happen.  Though I don't even know the state of it right now... it could be completely trashed, or it might even be working.  The seller put "for parts", which could mean anything.


BTW, I did a quick search through some of the binaries for the wrap flag stuff, and it appears to be part of the Zoom FISOs.  There are the strings "FISO wrap initially TRUE", "FISO Wrap Flag never went TRUE", and "Unable to Detect WRAP Flag" (which is what I'm getting)... all in the same vicinity of other Zoom strings.  So I think I'm sticking to the plan of mapping those pins as much as possible, and maybe trying to run my finger across the FPGA pins to see if I can cause one of the other FISO wrap flag errors (if it's floating, hopefully my finger will change its state).
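For reference, a search like that can be roughed out in Python along the following lines. This is only a sketch of a `strings`-style scan, not the tool actually used here, and the sample bytes are a stand-in for a real binary (in practice you would read the file with `open(path, "rb").read()`).

```python
import re


def find_strings(data: bytes, keyword: bytes, min_len: int = 4):
    """Yield (offset, text) for printable-ASCII runs that contain keyword, ignoring case."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        if keyword.lower() in match.group().lower():
            yield match.start(), match.group().decode("ascii")


# Stand-in for the contents of a real binary.
sample = (
    b"\x00\x01FISO wrap initially TRUE\x00\xff"
    b"Unable to Detect WRAP Flag\x00zoomAcqTest\x00"
)

for offset, text in find_strings(sample, b"wrap"):
    print(hex(offset), text)
```

Printing the offsets makes it easy to see which other strings sit nearby, which is how the messages were tied to the Zoom code above.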

DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #102 on: February 13, 2022, 11:16:09 am »
Just figured I'd post this real quick... I ended up making an extension for these cards using some IDC connectors and a ribbon cable.  Pretty janky, but it seems to work.  I wanted to test it before laying out a PCB and waiting for it to arrive... then finding out it just doesn't work. :-P

So, if I get a few free minutes this week I plan to make an adapter to use two IDE cables, since they're readily available, and in lots of different lengths.  No idea how long of a cable will work, but it doesn't sound like the backplane has any really high speed signals, so hopefully reasonably long.  I just want enough that I can lay a board either side up on my bench.  If a long cable works, I guess you might be able to come all the way out the back.


Edit: I did some testing with it on my WRAP flag board... by putting my finger on/near some of the zoom chip pins, I was able to consistently get the WRAP flag message several times in a single zoom chip select test.  So, maybe the WRAP flag itself isn't the problem, but instead pins are floating when they shouldn't be (chip select, R/W pins, etc?).  I'll have to probe those chips some more.  And to confirm that was a valid test, I took my good board and put my finger on the same pins, and had no failures.

DogP
« Last Edit: February 13, 2022, 12:34:28 pm by DogP »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #103 on: February 15, 2022, 08:23:14 am »
I put this together this morning, and have it on order... the same board can be used for both the backplane and card sides, just populate the connectors appropriately.  And from the 1:1 printout, it actually looks like it'll fit perfectly between slots, so I guess maybe you could use multiple to debug a master/slave configuration on the bench or something.

I'll post the gerbers once I receive and test the boards (next week hopefully?).

DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #104 on: February 18, 2022, 04:07:10 am »
They FINALLY got around to shipping the 16716A, and it arrived today.  Overall, the card is pretty clean, and seems like it might be fairly new, since all of the adhesive strips peeled off in one piece (I used heat, of course).

From an initial look, there's not much corrosion, though the little bit that I do see seems to be at a few vias, particularly under the BGAs, which of course sucks.  But even those look pretty minor compared to the other boards I had corrosion on.  The nastier part is some goop that's stuck to the top of the board and on the cables... doesn't look corrosive or anything, but kinda sticky, like someone's kid got too close to it with a piece of butterscotch candy (probably more likely that it's something like a dried up glop of paste flux that someone spilled on it on their workbench).  So, I've got some cleanup to do, as well as taking a very close look at traces/vias/etc. under the microscope.

Anyway, I plugged it in, and it shows up in pv as "(0x2f) Unrecognized module".  Any ideas on what causes that?  Part of me wants to suspect bad solder joints on the 208-pin QFP near the backplane connector... past experience has been that large QFPs on large boards that flex easily is a recipe for solder joints to pop free.  And I'm guessing that's the chip that communicates to the host, so it seems plausible.  I didn't see any pins that looked to be lifted, but I plan to look more closely at that later as well.

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #105 on: February 18, 2022, 10:49:50 am »
I looked more closely at the 16716A... I definitely don't see any corroded traces, and the few vias that had signs of corrosion look to be very minor (after cleaning up the little bit of green, there's plenty of gold still showing).  I also took care of the glops of goo, which cleaned up well w/ alcohol.

The QFP pins on U64 seem to be OK... I checked them under the microscope, and poked at each pin w/ tips of tweezers, and they all seem solid.

I checked the LDO voltages, and I think they're OK based on calculating from the resistor values:
U63: 1.53V
U62: 2.14V
U2: -1.70V
U7: 3.33V
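The calculation can be sketched in Python as below. It assumes a feedback-pin style adjustable regulator where the FB node sits at a fixed reference; the 1.25V reference and the 200R/121R divider are hypothetical values, not read off this board, so substitute the datasheet reference and the actual resistors.

```python
def adj_regulator_vout(v_ref, r_top, r_bottom):
    """Ideal output of an adjustable regulator whose FB pin is held at v_ref by the divider."""
    return v_ref * (1.0 + r_top / r_bottom)


def within_pct(measured, expected, pct):
    """True if measured is within pct percent of expected."""
    return abs(measured - expected) <= abs(expected) * pct / 100.0


# Hypothetical divider and reference; only the 3.33 V reading comes from the post (U7).
expected = adj_regulator_vout(v_ref=1.25, r_top=200.0, r_bottom=121.0)
print(f"expected {expected:.3f} V, measured 3.33 V, ok={within_pct(3.33, expected, 3.0)}")
```

The 3% window is an arbitrary choice for the sketch; resistor tolerance and reference accuracy decide what is reasonable for a given part.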

So... I think I've checked the simple stuff that comes to mind.  Any thoughts?


Also, since I got the cables (why I bought that card in the first place), I was able to test the 16750B card w/ the Timing Zoom problem... it appears to only be an issue communicating with the Timing Zoom chips.  I tested every channel in the regular logic analyzer mode, and they're all fine, but when I enable Timing Zoom, it'll sometimes throw errors about the WRAP flag, and looking at the waveform data, it's sometimes correct, sometimes not, and never aligned (e.g. I trigger the analyzer on the rising edge and see it at t=0, but the Timing Zoom waveform sometimes shows 0, sometimes shows 1, and never shows the edge).  Interestingly, even with a constant '1' on the line, the timing zoom still sometimes shows '0'... though I've never seen a '0' show as a '1'.

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #106 on: February 18, 2022, 03:43:32 pm »
Cool adapter!  Please post some photos when you get it built and try it out.

On the 16716A, there are some traces that run along the edge of the board that I've found are related to the board identification.  (On other models too).

The gray cable of course has to be installed.  It's also involved in board identification.

Attached is a photo of some severely damaged traces from a board that was not recognized.  This board had other problems, but once these traces were repaired I could at least run pv on it.  There are similar traces on the top side, but they were ok on this board.

I would suggest finding the endpoints of these traces and check continuity from end to end and look closely around the gray cable area.  Sometimes the corrosion gets under the soldermask and it's really hard to tell if a trace is good.  A very sharp set of probes is useful to pierce the soldermask along the way, or use homemade probes from sewing needles.  I have a set of Pomona 6275 probes with the stainless steel tips that I've sharpened to a very fine point with a hone.

Yes, that 208-pin QFP is the host interface.  It acts as an interface between the backplane bus and several on-board buses.  Board ID is done through one of the on-board buses.

On the 16750B, I think your finger poking test is a huge clue.  I would get a scope on those zoom chip pins and see what they look like with the chip select test looping.

Could you see any pattern to the zoom screen output?  Were any of the 68 inputs working at all?
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #107 on: February 18, 2022, 05:24:28 pm »
>The gray cable of course has to be installed.  It's also involved in board identification.
LOL, thanks for pointing out the obvious... this board is missing the cable, and I was too busy looking for the corrosion that I didn't notice! :palm: Just swapped the cable from my other board and popped it in really quickly before leaving the house this AM, and it enumerated, and passed all self-tests!  So, it looks like I just need to track down (or build) one of those cables. :)

>I have a set of Pomona 6275 probes with the stainless steel tips that I've sharpened to a very fine point with a hone.
Yep, those are the exact probes I have as well, and what I've been using to scrape the soldermask (as well as poking through the soldermask to check continuity).

>Could you see any pattern to the zoom screen output?  Were any of the 68 inputs working at all?
The several I tested seemed to be partially working, but not fully working... though I need to test more tonight.  It would sometimes correctly show that the pin was '1' when the pin was tied high, and would never show '1' when not, so it seemed to be somewhat connected to reality.  But sometimes it'd show '0' when it was tied high as well, so it seems like more than just a time shift in the buffer.  I plan to hook up the sig gen tonight and put in a square wave to see if I can catch any edges in Timing Zoom.

Thanks,
DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #108 on: February 18, 2022, 09:35:56 pm »
A further thought...

You could get more precise with a finger poke by using, say, a 1k resistor connected to a fine probe to pull up or pull down various pins on the zoom chips to see exactly which pin(s) is being sensitive.  The zoom chip appears to be a 3.3V part, or at least the bus interface is, so a pullup to 3.3V would be appropriate.

U91 (all sections) and U92 (pins 1, 2, 3) are 2 input ANDs, and one input of each section is connected to pin 29 on the U29 Altera FLEX.  The output of the 5 ANDs are each connected to one of the 5 zoom chips.  In addition, one databus trace is connected to the other side of each of the ANDs.

So, the selection mechanism might be some kind of latched select depending on which databus trace is TRUE at the time.  Perhaps it's valid to select multiple zoom chips at the same time for some operations.  I didn't analyze it on a running system.
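A tiny Python model of just the combinational part described above may help when comparing scope captures. It covers only the five ANDs; any latching inside the zoom chips is guesswork and is left out, and the argument names are mine rather than anything from a schematic.

```python
def zoom_chip_selects(strobe, databus_bits):
    """Model five 2-input ANDs: output n is the common strobe AND databus bit n."""
    return [bool(strobe) and bool(bit) for bit in databus_bits]


# One databus bit set while the common line (FLEX pin 29) is high selects one chip.
print(zoom_chip_selects(True, [1, 0, 0, 0, 0]))
# Several bits set would select several chips at once.
print(zoom_chip_selects(True, [1, 1, 0, 0, 1]))
# With the common line low, nothing is selected regardless of the databus.
print(zoom_chip_selects(False, [1, 1, 1, 1, 1]))
```

A gate output that disagrees with this model during the chip select test would point at the gate or its input traces rather than at the zoom chip.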

Without getting too crazy tracing everything at the moment, I would add to my suggestion to also poke the 1k at U91 and U92.  Sections [4,5,6] [10,9,8] and [13,12,11] of U92 appear to be unused and are left floating (which design-wise is not a good idea).

One added word of caution... I've managed to kill chips on these boards with sloppy probing by dragging a probe from one pin to the next, essentially shorting two adjacent pins together.  A couple of comparators met their demise because of this, but it could happen to any chip.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #109 on: February 19, 2022, 02:59:08 am »
>You could get more precise with a finger poke by using, say, a 1k resistor connected to a fine probe to pull up or pull down various pins on the zoom chips to see exactly which pin(s) is being sensitive.  The zoom chip appears to be a 3.3V part, or at least the bus interface is, so a pullup to 3.3V would be appropriate.

>U91 (all sections) and U92 (pins 1, 2, 3) are 2 input ANDs, and one input of each section is connected to pin 29 on the U29 Altera FLEX.  The output of the 5 ANDs are each connected to one of the 5 zoom chips.  In addition, one databus trace is connected to the other side of each of the ANDs.

>So, the selection mechanism might be some kind of latched select depending on which databus trace is TRUE at the time.  Perhaps it's valid to select multiple zoom chips at the same time for some operations.  I didn't analyze it on a running system.

>Without getting too crazy tracing everything at the moment, I would add to my suggestion to also poke the 1k at U91 and U92.  Sections [4,5,6] [10,9,8] and [13,12,11] of U92 appear to be unused and are left floating (which design-wise is not a good idea).
Thanks for the info... yes, I definitely plan to do a more precise "touch".  Though the interesting thing I guess I didn't mention is that I don't have to be physically touching the pins to induce the failure.  Just having my finger nearby, or even touching the top of the plastic chip can cause it.  Just a guess, but the chip might have a VCO running inside it to get the 2 GHz timing, and my finger nearby is coupling, inducing a signal on whatever pin(s) are floating... or maybe just added capacitance... or magic? ;)

>One added word of caution... I've managed to kill chips on these boards with sloppy probing by dragging a probe from one pin to the next, essentially shorting two adjacent pins together.  A couple of comparators met their demise because of this, but it could happen to any chip.
Yep, that's one reason I haven't done much probing yet... those are really fine pitch QFPs, not conducive to probing.  I plan to test as much as possible at the SO resistor networks, test points on the bottom of the board, vias, etc.  But I've got some mapping to do...

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #110 on: February 19, 2022, 02:36:40 pm »
Quick update... I fed a roughly 2 MHz clock into several inputs on the 16750B, and looked at the difference between the regular output and timing zoom output.  Attached are a couple screenshots.  Notice that only some of the same signals are toggling... and sometimes it's a different set of those pins that toggle (and sometimes none at all).  But, the signals that are actually toggling are the only ones that ever do toggle in timing zoom.

The other thing to notice is that most of the changing signals toggle every sample (if I zoom into the toggling section, they toggle every 500ps).  #16 is the one oddball, that seems to have some sort of lower frequency component, but it doesn't match the frequency, nor time alignment with the actual 2 MHz signal.
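To put numbers on how far off that is, here is a quick Python sketch of the arithmetic, taking the 500ps sample period from the zoomed view and treating the input as exactly 2 MHz (it was only roughly that).

```python
SAMPLE_PERIOD_S = 500e-12  # timing zoom sample spacing seen when zoomed in


def samples_per_half_period(freq_hz, sample_period_s=SAMPLE_PERIOD_S):
    """Number of samples between edges for a square wave of freq_hz."""
    return 1.0 / (2.0 * freq_hz * sample_period_s)


def apparent_freq_hz(samples_between_toggles, sample_period_s=SAMPLE_PERIOD_S):
    """Frequency implied by a signal that toggles every N samples."""
    return 1.0 / (2.0 * samples_between_toggles * sample_period_s)


print(samples_per_half_period(2e6))  # samples expected between edges of the real input
print(apparent_freq_hz(1))           # frequency implied by toggling on every sample
```

A real 2 MHz input should hold each level for about 500 samples, so a trace that flips on every sample looks more like garbage or a stuck test pattern than a time-shifted copy of the input.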

Other than that, I moved the LA to my workbench closer to the rest of my test equipment, so I could probe it easier.  I used the ribbon extender so I could probe both sides of the board, and comparing back and forth between my working 16750A and this 16750B, I haven't found anything that looks out of the ordinary.  The signals at the ANDs look fine, and for the most part, everything looks very uniform while running the zoom chip select and acq tests.

One nice thing is that they didn't tent the vias on the top side of the board, so they make good test points for the scope probe.

I plan to retry the real signal test while probing to see if anything noticeably changes (like actual signal shows up, etc).  Also, I'll probably make the 1K probe like you're talking about and test that while running as well.

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #111 on: February 19, 2022, 10:28:25 pm »
Interesting update - I tried probing a bunch of pins with my Z0 probe (simply because its input is 1K to GND).  I didn't find much interesting, EXCEPT that probing pin 135 on the FLEX chip causes things to sorta work (either side of the resistor).  See attached screenshot.  I say "sorta", because the frequency is still wrong, but I consistently see signals on all the pins that have signals, and I never get the error about the WRAP flag.

Also, when probing that pin, self-test actually passes the zoomChipSelTest... though fails the zoomAcqTest.  But, the only failure it gets on the zoomAcqTest is the edgeCount (no errors about FISO stuck bits), which maybe makes sense, since the output frequency I was seeing didn't match the actual input frequency.

Code: [Select]
pv> x zoomChipSelTest
  Slot E: Filling FISOs for Pods #1 with zeroes...
    Slot E: Checking FISO #4...
    Slot E: Checking FISO #3...
    Slot E: Checking FISO #2...
    Slot E: Checking FISO #1...
    Slot E: Checking FISO #0...
  Slot E: Filling FISOs for Pods #2 with zeroes...
    Slot E: Checking FISO #4...
    Slot E: Checking FISO #3...
    Slot E: Checking FISO #2...
    Slot E: Checking FISO #1...
    Slot E: Checking FISO #0...
  Slot E: Filling FISOs for Pods #3 with zeroes...
    Slot E: Checking FISO #4...
    Slot E: Checking FISO #3...
    Slot E: Checking FISO #2...
    Slot E: Checking FISO #1...
    Slot E: Checking FISO #0...
  Slot E: Filling FISOs for Pods #4 with zeroes...
    Slot E: Checking FISO #4...
    Slot E: Checking FISO #3...
    Slot E: Checking FISO #2...
    Slot E: Checking FISO #1...
    Slot E: Checking FISO #0...
Mod   E: TEST passed       # "zoomChipSelTest" (13, 5, 1)

Code: [Select]
pv> x zoomAcqTest
  Slot E: Chip9: edgeCount=2570, exp=811
> Slot E: Chip9: Zoom Acquisition Data Frequency Test Failed!
  Slot E: Chip8: edgeCount=2570, exp=811
> Slot E: Chip8: Zoom Acquisition Data Frequency Test Failed!
Mod   E: TEST FAILED       # "zoomAcqTest" (7, 7, -1)

Do you happen to know anything about this pin?  It looks like it goes to an inner layer, and I haven't had a chance to track down where it ends up yet.  But suspiciously, it's right next to pins 136-138, which earlier you said are some Zoom chip select pins.

BUT, something else that's interesting... I did the exact same test on my working 16750A, and got the exact same result (same pins showed activity at the faster than actual rate).  So, hopefully this pin provides some clues, but maybe it actually leads nowhere. :-/

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #112 on: February 20, 2022, 01:11:35 am »
OK, not quite sure what that pin is doing, but I tracked it to U16 pin 5, through a 316 ohm resistor.  U16 is a SY89421V PLL, and pin 5 is the F1 input, which is the loop filter.  So... I'm not sure if this is intended to give the FPGA some control over the frequency, or something else.  But maybe it's a clue that clocks are a problem... so I'll take a look at the clocks in that area and confirm everything looks correct.

DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #113 on: February 20, 2022, 03:09:37 am »
Interesting that it could be a clock issue.

I did some poking around and found the clock input on the zoom chips.  Here are my updated notes with what I think is on that side towards the pod connectors (any corrections/additions welcome):

  1NB4-5040_zoom_capture
 
  144-pin QFP
 
  ----
  43      nCS
  46      Data bus
  48      Data bus
  51      Data bus
  53      Data bus
  55      Data bus
  57      Data bus
  60      Data bus
  62      Data bus
  ----
  109     Sample data in
  111     Sample data in
  114     Sample data in
  116     Sample data in
  117     Sample data in
  119     Sample data in
  121     Sample data in
  123     Sample data in
  125     CLKA+ from zoom clock chip
  126     CLKA- from zoom clock chip
  127     CLKB+ from zoom clock chip
  128     CLKB- from zoom clock chip
  129     Sample data in
  131     Sample data in
  133     Sample data in
  136     Sample data in
  137     Sample data in
  138     Sample data in
  141     Sample data in
  143     Sample data in
  ----
 
  Notes:
 
  Clocks run when capturing, not constant freq
  CLKA and CLKB are 90 degrees out of phase most of the time
  Clocks also seem to run when moving data out of the chip
 
  Zoom clock distribution chip 1821-4731
    Generates 5 pairs of clocks that are distributed to each board
    Receives clock and clock- from clock chip (itself) through gray cable
    Fans out clocks (CLKA+, CLKA-, CLKB+, CLKB-) for the 5 zoom chips
    Each zoom chip gets its own set of 4 clocks


Attached are some scope traces of what I found on the clock inputs.  nCS makes a good scope trigger input, and the clocks can be found on a tiny resistor pack on the bottom (see photo).  The clocks when running are 156kHz.  Might be worth a look if you suspect a clocking problem.

(Ignore the spikiness in the waveforms.  I'm using a long ground wire clipped onto the chassis.  Good enough for what we need to see here.)

EDIT: Forgot to mention this is all happening while zoomChipSelTest is running.

Here is the script:
Code: [Select]
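#!/bin/sh

# Debug/result verbosity for pv, set via its environment variables
export PVRESULTLEVEL=10
export PVDEBUGLEVEL=9

(
  # Wait before sending commands (presumably to let pv finish starting up)
  sleep 8

  # Presumably selects the module in slot E
  echo "s e"

  # Re-run the test forever so the scope can trigger on each pass
  while true; do
    echo "x zoomChipSelTest"

    sleep 3
  done

) | pv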
« Last Edit: February 20, 2022, 03:12:31 am by MarkL »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #114 on: February 20, 2022, 03:23:40 am »
Aha!  Smoking gun...

I started checking the SY89421V pins, and saw the 100 MHz reference at RIN, and checked the S pins, which showed N=10.  So, the VCO frequency (output on HFOUT) should be 1 GHz, and FOUT (which is the feedback at FIN) should be 100 MHz.

Unfortunately, probing FOUT and FIN showed no visible signal.  HFOUT is too high to check on my oscilloscope, so I grabbed the spectrum analyzer.  From there, I saw the freq. was actually ~1.37 GHz, which is outside the specified range of the VCO, so it's probably slammed to the upper rail, likely because the feedback frequency is missing (it's saying "go faster, go faster!").

But, when I pull the loop filter down with the Z0 probe, it looks like it slams to the lower rail at around 320 MHz.  Apparently, at 320 MHz, the rest of the circuitry can function, though obviously not as expected (but better than when it's way overclocked).

As a check, I grabbed my good card, and saw HFOUT at 1 GHz as expected, and also had 100 MHz at FOUT and FIN.  So... I guess it seems very likely that U16 is bad.  I'll swap it from my 16716A later tonight and confirm that's the (only) issue.
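The lock arithmetic above can be sketched in a few lines. This is only a rough sanity check, not anything taken from the datasheet: the function names are made up, and the second function assumes zoomAcqTest captures a fixed number of samples, so that a slower sample clock stretches the capture window and raises the edge count proportionally.

```python
# Values quoted in the posts above; function names are hypothetical.
REF_HZ = 100e6      # reference seen at RIN
N_DIVIDER = 10      # N read from the S pins


def expected_vco_hz(ref_hz, n):
    """In lock, the feedback (VCO / N) equals the reference, so VCO = ref * N."""
    return ref_hz * n


def implied_sample_clock_hz(nominal_hz, edges_expected, edges_seen):
    """ASSUMPTION: fixed-length capture, so edge count scales as 1 / clock."""
    return nominal_hz * edges_expected / edges_seen


vco = expected_vco_hz(REF_HZ, N_DIVIDER)
print(f"expected HFOUT: {vco / 1e6:.0f} MHz")
print(f"measured 1.37 GHz is {1.37e9 / vco:.2f}x nominal")

# edgeCount=2570 vs exp=811 from the zoomAcqTest output with the probe attached
implied = implied_sample_clock_hz(vco, 811, 2570)
print(f"clock implied by the edge counts: {implied / 1e6:.0f} MHz")
```

With the numbers from the posts, the edge counts imply a clock of roughly 316 MHz, which is in the same ballpark as the ~320 MHz seen on the spectrum analyzer with the loop filter pulled low. If the fixed-sample-count assumption is wrong, that agreement would just be a coincidence.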

DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #115 on: February 20, 2022, 04:16:58 am »
Awesome sleuthing!

I can confirm it's 1GHz.

And you found a signal that does NOT appear on the bottom.  So now I have to say "almost all signals appear on the bottom".  I can understand why they didn't want any extra vias on that pair!
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #116 on: February 20, 2022, 04:30:56 am »
Success!  While I probably have other things I should be doing tonight, I don't have anything I'd rather be doing. ;)  I swapped the SY89421V from my 16716A, booted it up, and it passes all self-tests, and the Timing Zoom signals show up correctly with the regular waveform. :)

As expected, it looks like the SY89421V is way obsolete.  Any chance you have a "parts" board you'd be willing to sell one off of?  Or, know where I could find one?  Digikey/Mouser don't have them anymore, and I didn't see any for sale on ebay or aliexpress.  I could probably hack together a replacement, but probably not worth the cost/effort of building a small carrier to do that.

Thanks for digging into those clock signals on the Zoom chips... I guess luckily for me I didn't need to trace them, but hopefully it'll help someone in the future.  And overall, thanks for sticking with me while I worked my way through these boards!  I'm really excited to start getting to use the machine.  Though I'm also excited to mess around with the pattern generator a bit, as well as writing some code for the TDK.

Thanks,
DogP
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #117 on: February 20, 2022, 04:33:35 am »
I can confirm it's 1GHz.
Haha... gotta rub it in with a screenshot from your awesome scope! ;)

DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #118 on: February 20, 2022, 04:44:07 am »
Excellent!

It was fun following along and it gave me an excuse to do more tracing and post it.  Thanks for sharing the details of what you were doing.

I took a quick look, and from what I can tell they used the SY89421VZC on all the zoom-capable 167xx boards, including the higher end 16753A/54A/55A/56A.  I have a bunch of dead boards and I can send you a PLL, no problem.  Send me a PM and we can work out the details.  Maybe you can email me a pre-paid USPS label and I can drop the chip (properly protected) in a flat rate envelope, or something like that.
 
The following users thanked this post: jemotrain

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #119 on: February 20, 2022, 04:48:14 am »
I can confirm it's 1GHz.
Haha... gotta rub it in with a screenshot from your awesome scope! ;)

DogP
Of course I do.  But it's my limit.  I've resorted to the spectrum analyzer approach a few times too.  It makes a nice tunable voltmeter.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #120 on: February 22, 2022, 02:25:51 am »
So with your discovery of the 1GHz clock, what I was seeing on the CLK inputs to the 1NB4-5040 zoom capture chips was not making sense.  I couldn't figure out why they would go to the trouble of creating a 1GHz clock just to divide it down.  I assumed, wrongly, that each chip had its own PLL that somehow synced up to a slower incoming clock.  Not so.

My error was that I wasn't looking for a higher speed clock, so I didn't have the scope set up right to see it, and certainly not using appropriate probes.  The clocks run to read out the data, and that was not a surprise, but they also run at a much higher speed when capturing the samples.  And, as I found out, in zoomChipSelTest, the entire capture interval is much shorter than normal mode, on the order of 2.5us, and very easy to miss if you weren't looking for it in the test cycle of over a second.

So, instead of trying to observe the clocks with a looping zoomChipSelTest in pv, I set the module to repeat a data capture under normal operation.  And this time I used Tek P6245 1.5GHz active probes.  The 1GHz clock is at the limit of the scope and close to that of the probes, so everything looks like a sine anyway, and a wobbly one at that due to the sin(x)/x interpolation.  A much faster scope system is really needed to probe the CLK properly.  (And before anyone says it, yes, I should also use shorter grounds.)

Here are a couple of screen shots of the zoom clock inputs while capturing data.  As before, CLKA and CLKB are each a differential pair (zoom_clka+_clka-.png).  When capturing, CLKA and CLKB are duplicates with no phase offsets (zoom_clka+_clkb+.png).  The frequency scales inversely with the sample period.  A sample period of 0.5ns (the fastest) runs the CLK at 1GHz, so the chip must be capturing samples on the rising and falling edges.
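As a quick illustration of that relationship, here is a minimal sketch assuming two samples per clock cycle (one per edge). Only the 0.5ns / 1GHz point is actually measured above; the other periods in the loop are hypothetical settings, and the function name is made up.

```python
def zoom_clock_hz(sample_period_s, samples_per_clock=2):
    """Clock frequency if the chip samples on both edges (2 samples per cycle)."""
    return 1.0 / (sample_period_s * samples_per_clock)


# 0.5 ns is the measured case; 1 ns and 2 ns are hypothetical examples
for period_ns in (0.5, 1.0, 2.0):
    clk = zoom_clock_hz(period_ns * 1e-9)
    print(f"sample period {period_ns} ns -> CLK about {clk / 1e6:.0f} MHz")
```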

This is a guess, but maybe CLKA is for one half of the chip and CLKB is the other half. There appear to be two groups of 8 sample inputs and maybe they wanted the chip to be able to capture at different rates with 8-bit granularity in other instruments.

The last screen shot (zoom_capture_readout.png) is one cycle of capture and data download in normal operation mode.  The above 1GHz captures are taken during the first waveform interval in the beginning at the trigger marker.

Just puttin' it out there for comparison if anyone needs zoom clock info in the future.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #121 on: February 22, 2022, 06:40:54 pm »
Since I had the probing set up, I went back and took a closer look at the operation of "zoomChipSelTest".

Here is the test being performed in pv with "d d=9 r=9":

    Slot E: Filling FISOs for Pods #1 with zeroes...
      Slot E: Checking FISO #4...
      Slot E: Checking FISO #3...
      Slot E: Checking FISO #2...
      Slot E: Checking FISO #1...
      Slot E: Checking FISO #0...
    Slot E: Filling FISOs for Pods #2 with zeroes...
      Slot E: Checking FISO #4...
      Slot E: Checking FISO #3...
      Slot E: Checking FISO #2...
      Slot E: Checking FISO #1...
      Slot E: Checking FISO #0...
    Slot E: Filling FISOs for Pods #3 with zeroes...
      Slot E: Checking FISO #4...
      Slot E: Checking FISO #3...
      Slot E: Checking FISO #2...
      Slot E: Checking FISO #1...
      Slot E: Checking FISO #0...
    Slot E: Filling FISOs for Pods #4 with zeroes...
      Slot E: Checking FISO #4...
      Slot E: Checking FISO #3...
      Slot E: Checking FISO #2...
      Slot E: Checking FISO #1...
      Slot E: Checking FISO #0...
  Mod   E: TEST passed       # "zoomChipSelTest" (380, 0, 1)

Below is the scope capture.  The top of the screen shows a zoomed out view of the first segment of the above test for "Slot E: Filling FISOs for Pods #1 with zeroes" through the beginning of "Slot E: Checking FISO #4".  The scope trigger was set on CLKB+ pulse width <10ns.

The bottom is zoomed into the first part of the segment.  What's shown is a short burst of about 156kHz, maybe doing some set up of registers in the zoom chip.  This is followed by 33.6us of the 1GHz clock being turned on, which would be the "filling with zeroes" part.  After the 1GHz clock burst, the clock resumes at 156kHz to extract the data from the zoom chip.  This is repeated 4 times, once for each of the 4 pods in the test.  What I think is the nCS pin is only active for most of the data extraction part, so there's probably more to the zoom chip select and read/write that's not understood yet.  Maybe it's just a read select.

The sample data inputs change across all the chips before each of the above 4 steps.  I didn't do an exhaustive check of all 68 inputs, but in a few spot checks it appears pv sets up the comparators to generate zeroes for the current pod and the rest are set to ones.  It then does the capture, gets the captured data from all 5 zoom chips, and then presumably looks to make sure there are only zeroes on that one pod.
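To make that guess concrete, here is a small sketch of what the check might look like. It is purely illustrative: the data layout, the function names, and the pass criterion are all assumptions based on the spot checks described above, not anything extracted from pv.

```python
PODS = (1, 2, 3, 4)


def expected_levels(pod_under_test):
    """Guessed comparator setup: zeroes on the tested pod, ones elsewhere."""
    return {pod: 0 if pod == pod_under_test else 1 for pod in PODS}


def pod_passes(captured, pod_under_test):
    """captured maps pod number -> list of sampled bits (hypothetical layout)."""
    return all(bit == 0 for bit in captured[pod_under_test])


# Simulated captures for pod 2: one clean, one with a single bit stuck high
clean = {pod: [level] * 8 for pod, level in expected_levels(2).items()}
stuck = {**clean, 2: [0, 0, 1, 0, 0, 0, 0, 0]}

print("clean capture passes:", pod_passes(clean, 2))
print("stuck-bit capture passes:", pod_passes(stuck, 2))
```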

There is also a short 2.5us burst of 1GHz on the clock that precedes each test cycle (not shown) that was mentioned in the previous post, but I don't have a good guess what's happening there.

While probing this area, I was also able to see that the FISO numbering starts with #0 at U55 near the edge.  On a 16715A/16A it's in the same order starting with #0 at U8.  So, if a problem is reported on one zoom chip, you know which one it is now.

Again, just posting the info here for future debugging.  It's all guesswork.  Any additions/corrections/verification is welcome.


And a side note for all pv debugging: I found out somewhat by accident if you set "d r=10" for some operations, you get even more detail.  The help says "r=9" is the highest level needed, but this apparently is not true.  Makes me suspicious if there's anything more for even higher debug levels.  I didn't see any differences past 10, but who knows.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #122 on: February 26, 2022, 04:23:32 am »
Very cool... great info as always!

BTW, my extension boards arrived, and they work! :)  Attached are some pics.

I tested with a pair of 17" long IDE cables, which are long enough to stick out the back of the chassis.  I used a 2-drive cable (cables I had nearby) - apparently the extra connector isn't hurting signal quality too much - it gives you a convenient place to probe for reverse engineering the backplane too. ;)  I don't have any longer cables here, so I'm not sure what the limit is.

Somewhat surprisingly, it all fits as planned... the connectors tuck between backplane connectors, and don't block an adjacent slot.  And the cables plugged in at the card don't block access to components near the connector.  All of the connectors fit snugly, so they don't feel like they'll fall out, though I put a small screw hole in case you'd like to attach it.  Particularly, I was worried if you pull on the card when working on it, it might pop out of the socket.  So, I might 3D print a bracket to clamp over the rear connector on the card.

Maybe you could modify a filler panel and leave the cables hanging out the back for when you need to test a board outside the chassis... then you wouldn't need to remove the bottom, or even pull the unit out at all.  A 24" cable (or longer) might be helpful for that.  Or, maybe add a trap door in the bottom so you can pull the cable out the front, since presumably the front will be facing your workbench, and is probably the most convenient place for debugging a board.

Anyway, I'll post the gerbers, assembly notes, etc. to Github later tonight.  But if anyone (US shipping) wants a set, I've got some extras (I ordered 20 boards, since it was only a couple bucks more than 5)... just pay a few bucks for shipping.

DogP
« Last Edit: February 26, 2022, 04:38:42 am by DogP »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #123 on: February 26, 2022, 12:21:58 pm »
Uploaded extension files to Github: https://github.com/pdaderko/16702B/tree/main/card_extension ...

DogP
 
The following users thanked this post: MarkL, alm

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #124 on: April 13, 2022, 07:38:40 pm »
Repair log for two 1675X Cards...

Card # 1 Zoom/Comp Errors:

  Comparators Test running...
  ...Comparators Test ended. Result: Failed***

  ZoomChipSel Test running...
  ...ZoomChipSel Test ended. Result: Failed***

Image shows area of repair: 1675X_COMP_ZOOM_ERROR

Card #2 Errors:

  Memory Data Bus Test running...
      Slot B, Data Failures, chip 1, bank 1, port 3, Mems U90, U87
   < Truncated >
  ...Memory Data Bus Test ended. Result: Failed***

  Dma Test running...
   < Truncated >
      Slot B, Chip 1: RAM Count Only Unload Test ...
      Slot B, Chip 1: Interleaved Data & Count Unload Test ...

      Slot B, Chip 1, SDRAM U71:
   < Truncated >
      Slot B, Chip 1: Interleaved Data & Count Unload Test Failed!
    > Slot B, Chip 1: DMA Unload Modes Test Failed!
      Slot B, Chip 1: Bad RAMs:  U71 Bad Data: 0x3FFF
  ...Dma Test ended. Result: Failed***

  Memory Sleep Mode Test running...
      Slot B, Chip 1, SDRAM U74:
      Slot B, Chip 1, SDRAM U71:
      Slot B, Chip 1: Bottom bank check failed before Sleep mode.
  ...Memory Sleep Mode Test ended. Result: Failed***

Image of areas of repair: 1675X_MEMORY_ERROR

--- Card # 2 had prior repairs with rails removed, the card had developed additional failures after sitting. ( Note, there were "Chips" in error, but the repairs were not even in the area or memory controller )

« Last Edit: April 13, 2022, 07:40:58 pm by Hamster »
Arcade Board Repair Guru.  [ twitch: HammysHangout , youTube: Hammy Builds ]
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #125 on: April 13, 2022, 07:47:00 pm »
@markL Check for broken traces on your chipsel error, see my post above, I have had several cards give this error and it was a trace at the back side of the board ...

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #126 on: April 13, 2022, 10:20:17 pm »
@markL Check for broken traces on your chipsel error, see my post above, I have had several cards give this error and it was a trace at the back side of the board ...
Thanks, but it was actually DogP's card that had the ChipSel errors.  I was just playing along and posting probe results from a working card trying to help.

I agree that bad traces are the #1 problem on these cards.  But in this case DogP figured out it was a bad clock PLL chip.  Read back a few posts for the details on the symptoms and resolution.

Thanks for posting your symptoms and the areas you fixed.  Hopefully they'll be of help to others.  I'm not surprised on the "COMP_ZOOM_ERROR" fix.  The traces you fixed go to the voltage reference input on some of the comparators.
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #127 on: April 13, 2022, 11:36:19 pm »
Ahh..

DogP did you ever source a SY89421V  ? I have a couple of 1675X boards that are "beyond" repair that can be scavenged off.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #128 on: April 14, 2022, 08:49:36 am »
DogP did you ever source a SY89421V  ? I have a couple of 1675X boards that are "beyond" repair that can be scavenged off.
Yes I did... MarkL was kind enough to send one my way, which got it back up and running.

A couple side notes -
I was able to track down another set of cables, so now I've got both 1675x cards running (master/slave) with a full set of cables.
I also upgraded the HDD to CF by modifying one of the cheap ACARD AEC-7722 SCSI to IDE adapters.  That brought the boot time down from 3:25 to 2:55, and got rid of the loud OEM HDD... enabling fast boot knocked another 12 seconds off that.  I still need to permanently mount the adapter inside the chassis though.

DogP
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #129 on: October 06, 2022, 08:56:54 pm »
Hi all,

I need to reopen this topic because I have the same problem with my cards.
I have a HP 16702A, with the following cards:
2x 16534A - both working fine.
2x 16555D - primary is good, secondary is bad
1x 16720A - bad

The 16720A's surface looks good, with no corrosion on the PCB.
When I run the pv utility, the following tests fail:
loopback
clock_test
wait_test
instint_test

With the 16555D, the following tests fail:
pld_dpath_stest
vram_serial_stest
encoder_stest

Is it possible to find a schematic for these cards? I found a "service manual", but it is not very detailed.

MateKrisz

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #130 on: October 07, 2022, 02:58:48 pm »
On the 16555D failure, I would re-configure both cards to be stand-alone (not as a master and expander), and re-run the self test on each.  This will at least give you an idea if the problem exists with the inter-card communication, or is isolated to one of the cards.

The 16720A doesn't have runners on the bottom, so no surprise it's completely clean.  Is there a little gray ribbon cable installed, and in the correct position for single card operation?

On both failing cards it might be useful to post the results of running pv on the first reported failure (modtest xxxx) with the debug options turned on.  If you haven't used pv yet, use the forum Advanced Search and look for "pv" from user "MarkL".

Glad to hear both your 16534A are working.  I can't say I've had the same luck.  I would recommend removing the plastic runners and adhesive from these and all your cards before you get corrosion problems.  Heat is your friend, and you can search the forum for lots of ways people have done this removal.

I'm not aware of any schematics or detailed troubleshooting for anything from the 165xx/167xx series.  The only thing available is the Theory of Operation section in the Service Guide.  I'd suggest you read those sections and absorb every word.  Although only a summary, they seem to have been written by people very familiar with the internal operation of the cards (maybe even by the developers) and are very accurate in my experience.  Combining the information from the Block-Level Theory section and Self-Tests Description section can help make sense of what pv is reporting, and can sometimes lead you directly to the failing area.
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #131 on: October 07, 2022, 06:52:33 pm »
Yes, I tested the 16555D cards separately. When I tested them as a master-slave combination, the test reported FAIL for both cards. Later I configured each one as a primary card and tested them individually: one works and one is bad.

On the 16720A the gray ribbon cable is in the correct position. J8 and J9 are connected; I checked this against the manual.
I took some pictures of the pv utility output. They show the loopback test output.

I found a little information about the self-tests in the manual:
Internal Loopback Test.
The internal loopback test verifies the operation of the module backplane interface IC. A walking ones pattern is written into module memory at a specific memory location, read, and compared with known values. Passing the internal loopback test implies the module backplane interface IC is functioning and the system is able to write to module memory.
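For anyone unfamiliar with the term, a walking-ones loopback can be sketched as follows. This is a generic illustration of the technique the manual describes, not the actual pv implementation: the `write`/`read` callables, the address, and the simulated stuck bit are all hypothetical.

```python
def walking_ones(width=8):
    """Patterns with a single 1 bit: 0x01, 0x02, 0x04, ..."""
    return [1 << bit for bit in range(width)]


def loopback_failures(write, read, address=0, width=8):
    """Write each pattern, read it back, and collect (expected, got) mismatches."""
    failures = []
    for pattern in walking_ones(width):
        write(address, pattern)
        got = read(address)
        if got != pattern:
            failures.append((pattern, got))
    return failures


# Fake memory with data bit 3 stuck low, standing in for a faulty card
memory = {}


def write(addr, value):
    memory[addr] = value & ~0x08


def read(addr):
    return memory[addr]


for expected, got in loopback_failures(write, read):
    print(f"expected 0x{expected:02X} got 0x{got:02X}")
```

A single stuck data line shows up as exactly one failing pattern, which is why the bit position in the error output is a useful clue to which trace or pin to inspect.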

The clock test and the wait test fail as well. I think the problem is with the module backplane interface IC on the board. Is this chip the big Altera one (U52) near the 50M clock (U72)?
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #132 on: October 11, 2022, 05:48:37 pm »
On the 16555D card, I wouldn't bother trying to get it to work in a pair with the good card until you figure out what's wrong with it in the stand-alone configuration.  Please post the pv results for the bad card by itself, both the summary of all tests, and then the first test that failed with debug options turned on.  There might be clues there.


On the 16720A, I don't have much experience troubleshooting that card since mine has always worked.  I agree that it could be the FPGA (the Altera MAX), but there's plenty of other possibilities surrounding it that could be preventing the loopback from working.  Given that some of the other tests are actually passing, I wouldn't be inclined to suspect it right away.

I think I would try to figure out exactly what the system is doing during the loopback test.  The manual says it's trying to write a pattern to memory and read it back.  The first set of debug output implies it is doing byte read/writes.  The only memory I see on the board is the Micron SDRAM (48LC4M16A2) and there are 8 of them.

I would try to find which SDRAM chip(s) it's trying to read/write to by looking at the write enable pins on each.  If you have enough probing capability, you could even try to figure out the data pattern that's being attempted (looks like it might be 0x00, 0x55, 0xff, 0xaa from your diagnostic output).

The manual says 6 of these chips are used for data, so I'm guessing the other 2 might be used for sequence control, and probably it's the two nearer the clock circuitry which would be U121 and U117.  It's only a guess, but you could start your probing with those 2 since the errors being reported are only byte sized.

Another technique to try to localize where the errors are coming from is to lift legs on chips, particularly the memory chips, and correlate it to what new errors start appearing.  Useful signals for this test would be chip select, write enable, and a data I/O line or two (not all at the same time).  This might be a good option, considering how long the loopback test runs (at least a couple of minutes on my system).

A similar technique is to hold down various signals with a 50ohm or smaller resistor.  The size resistor can be determined by watching on a scope to make sure you've ruined the switching levels enough to cause an error.

Also note the large number of LVTH245 chips.  My guess is that they are probably controlling access for each memory chip to a common bus, which is how the memory is read and written by the system.  Perhaps lifting a leg or two on these could induce interesting errors that might correlate to the loopback errors.  Maybe one of these is sick, but it depends on how they've laid out the memory addressing.


It's not easy debugging these cards with no schematics and no documentation for the debugging output, especially if you don't have a working card to compare with.  Sometimes it's just easier to buy another card.
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #133 on: October 11, 2022, 09:20:56 pm »
Hi MarkL,

Thank you for your answer and your tips. I checked my 16720A board again under a magnifier and found a lot of hidden dirt around the FPGA. It looks like I first need to wash the PCB with IPA. Some days ago I found a topic where someone had a problem with the oscillator (U72). After reading it I quickly ordered one; I was lucky to find one at a low price, and the seller lives relatively close to me. Once I have cleaned the board and replaced the oscillator, I will re-run the pv tests to see whether anything changes, and I will post the pv output here. The oscillator should arrive in a few days.

Tomorrow I will send you the 16555D pv outputs. First I need to remove the plastic parts from the back side and replace the four 3300uF capacitors. Some of them may have a bad ESR value.

MateKrisz

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #134 on: October 12, 2022, 02:32:02 pm »
Did you test U72 and determine it's bad?

And test the 3300uF capacitors?

If not, I wouldn't start replacing components without having determined they're definitely bad, or at least you've exhausted all other reasonable possibilities.  Swapping out components can often make the situation worse by inadvertently introducing additional problems.

Can you provide a link to the post about the oscillator?  There are at least two more clock chips on the board that I found (PLLs) if you're looking at the failed clock test.

On the analysis cards, the pv tests are (for the most part) in order of dependency, meaning that a failed test could make later tests also fail.  I don't know for sure, but I would suspect the 16720A pv test routines are structured in a similar way.  That is why I would focus on the loopback failure first.

If you found a post that makes you suspect U72, then sure test it.  But if ok, don't pull it off the board, and return to looking at the loopback failures (in my opinion).
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #135 on: October 12, 2022, 10:29:34 pm »
I read this "solution" on the groups.io mailing list some weeks ago, if I remember correctly. While I was waiting for my LA to arrive, I read some articles about my cards. I bought this on eBay: eBay auction: #254489501566. It will arrive from England. I checked today and could not find the topic in my browser history (it is total chaos, with tons of opened sites), but I will search for it again for you.

On the 16555D card I checked the capacitors with my HP 4274A and the ESR values were higher than usual, so I replaced them. Afterwards I re-ran the pv utility, but the same tests failed. I ran it with debug on; the output is at the bottom of this post.

Back to the 16720A. Localizing which memory chips are bad is a good idea. I would like to start with U121 and U117, and later the LVTH245s. Which legs need to be lifted together? I found the chip datasheet here: https://www.micron.com/-/media/client/global/documents/products/data-sheet/dram/64mb_x4x8x16_sdram.pdf
So when I lift the "chip select" leg, e.g. pin 19, do I need to lift other legs as well, or just that one? Sorry, it's not clear to me.
« Last Edit: October 12, 2022, 10:38:00 pm by MateKrisz »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #136 on: October 23, 2022, 09:02:38 am »
Like MarkL, my 16720A has always worked, so I don't have any first-hand tips for troubleshooting it.

Regarding your 16555D - it looks like maybe Chip 1's bit 1 is floating.  All of the errors shown seem to be related to bit 1 (e.g. expected: FFFF actual: FFFD, expected: 0000 actual: 0002, etc.), but not always stuck low or high.
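A quick way to confirm that every mismatch lands on the same bit is to XOR each expected/actual pair. This is only a sketch using the two pairs quoted above; the helper name `differing_bits` is made up for illustration and is not part of pv.

```python
def differing_bits(expected: int, actual: int) -> list[int]:
    """Return the bit positions (0 = LSB) where the two words differ."""
    diff = expected ^ actual
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Expected/actual pairs quoted from the 16555D debug output above.
pairs = [(0xFFFF, 0xFFFD), (0x0000, 0x0002)]
for expected, actual in pairs:
    bits = differing_bits(expected, actual)
    print(f"expected {expected:04X} actual {actual:04X}: differing bits {bits}")
```

Both pairs differ only in bit 1, once reading low when it should be high and once the other way round, which is what you would expect from a floating line rather than a stuck one.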

I don't have a 16555D to get any better information for you, but looking at closeups from an ebay auction, it looks like the two main chips are U25 and U26.  So logically, my best guess would be Chip 0=U25 and Chip 1=U26.  The board appears to have the plastic runners, so definitely remove those, and look for corrosion.  It looks like a runner goes right over the traces between U26 and its RAM, so that's where I'd inspect closely first.

If you have the unit apart enough that you can access the cards during test (or have a card extender like discussed earlier in this thread), I'd try running the test while touching the pins of U26 with your finger (and maybe U25 and the RAM as well), and see if you get any noticeably different errors.  That may help pinpoint the trouble area.

DogP
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #137 on: October 24, 2022, 03:05:58 pm »
(I haven't vanished; just preoccupied with a large number of non-electronics things...)

I had a chance to lift and hold high the DQMH and DQML legs on both U117 and U121 DRAM chips (4 "do_loopback" tests).  Results attached in the .zip below.  Each test failed within a few seconds.

The "Expected 0x55 got 0x00.." errors are exactly the same in each test even though I was messing with a different signal each time.  I was not expecting this.

The second set of errors (walking ones) was different each time and seems to represent bits in the DRAM, which makes sense although they are not strictly in order.

And as suspected, the "do_clock_test" also fails when DQMx is held high (test not shown), which says pv may be depending on a successful memory loopback test to verify the clock.

Not sure what all this means yet since your walking ones look ok, but thought I'd share.


And on the 16555D, I agree with DogP you're definitely looking at a single-bit error.  Since the expected level is not consistent, it's likely open (flapping).  Once you get the runners off and cleaned up, inspect for corrosion and broken traces.  Use sharp probes to pierce the soldermask and test continuity for all traces passing under and near the runners end-via to end-via.

If DogP's finger test doesn't show anything interesting, you can try holding down some of the signals with a resistor as described above.  Put the test that's failing into a loop and observe what fails.  You can try this looping script here:

  https://www.eevblog.com/forum/testgear/agilent-16717a-comparator-and-zoomchipseltest-failures/msg4434136/#msg4434136

The signal traces on the 1675x cards are generally in order.  I would expect the 16555D to be the same, so you will know you're getting close when the bits failing from the resistor hold-down get closer to the bad one.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #138 on: October 26, 2022, 03:32:49 pm »
I've made an interesting discovery:

If the CLK pod cable is not plugged into the module (header J7), my card fails with the same do_loopback and other errors as yours.  With the CLK cable plugged in, everything passes.

On further inspection, all the pod cables (data and CLK) have pin 18 shorted to 20.  This may be providing some kind of enable to the rest of the circuitry when the cable is plugged in, but it only seems to matter for J7.  So, perhaps this is some kind of enable specifically for the clock.

All tests also pass if a jumper is installed on J7 between pins 18 and 20.

Do you have the CLK cable plugged in when you get the errors?  If so, try a jumper on J7 instead.  If you still get failures, we may need to look closer at the circuitry associated with J7 pins 18 and 20.

Code: [Select]
--- unplugged clk pod cable J7 ---

pv> x do_loopback
Enter CTRL-C to stop
0x0000000c: Expected 0x55 got 0x00
0x0000000d: Expected 0xff got 0x00
0x0000000e: Expected 0xaa got 0x00
0x0000001c: Expected 0x55 got 0x00
0x0000001d: Expected 0xff got 0x00
0x0000001e: Expected 0xaa got 0x00
0x0000002c: Expected 0x55 got 0x00
0x0000002d: Expected 0xff got 0x00
0x0000002e: Expected 0xaa got 0x00
0x0000003c: Expected 0x55 got 0x00
0x0000003d: Expected 0xff got 0x00
0x0000003e: Expected 0xaa got 0x00
0x0000004c: Expected 0x55 got 0x00
0x0000004d: Expected 0xff got 0x00
0x0000004e: Expected 0xaa got 0x00
0x0000005c: Expected 0x55 got 0x00
0x0000005d: Expected 0xff got 0x00
0x0000005e: Expected 0xaa got 0x00
0x0000006c: Expected 0x55 got 0x00
0x0000006d: Expected 0xff got 0x00
0x0000006e: Expected 0xaa got 0x00
0x0000007c: Expected 0x55 got 0x00
0x0000007d: Expected 0xff got 0x00
0x0000007e: Expected 0xaa got 0x00
0x0000008c: Expected 0x55 got 0x00
0x0000008d: Expected 0xff got 0x00
0x0000008e: Expected 0xaa got 0x00
0x0000009c: Expected 0x55 got 0x00
0x0000009d: Expected 0xff got 0x00
0x0000009e: Expected 0xaa got 0x00
0x000000ac: Expected 0x55 got 0x00
0x000000ad: Expected 0xff got 0x00
Card 0
00000000: 0101 0101 0101 0101 0101 0101 0101
00000001: 0202 0202 0202 0202 0202 0202 0202
00000002: 0404 0404 0404 0404 0404 0404 0404
00000003: 0808 0808 0808 0808 0808 0808 0808
00000004: 1010 1010 1010 1010 1010 1010 1010
00000005: 2020 2020 2020 2020 2020 2020 2020
00000006: 4040 4040 4040 4040 4040 4040 4040
00000007: 8080 8080 8080 8080 8080 8080 8080
00000008: 0101 0101 0101 0101 0101 0101 0101
00000009: 0202 0202 0202 0202 0202 0202 0202
0000000A: 0404 0404 0404 0404 0404 0404 0404
0000000B: 0808 0808 0808 0808 0808 0808 0808
0000000C: 1010 1010 1010 1010 1010 1010 1010
0000000D: 2020 2020 2020 2020 2020 2020 2020
0000000E: 4040 4040 4040 4040 4040 4040 4040
0000000F: 8080 8080 8080 8080 8080 8080 8080
00000010: 0101 0101 0101 0101 0101 0101 0101
00000011: 0202 0202 0202 0202 0202 0202 0202
00000012: 0404 0404 0404 0404 0404 0404 0404
00000013: 0808 0808 0808 0808 0808 0808 0808
00000014: 1010 1010 1010 1010 1010 1010 1010
00000015: 2020 2020 2020 2020 2020 2020 2020
00000016: 4040 4040 4040 4040 4040 4040 4040
00000017: 8080 8080 8080 8080 8080 8080 8080
00000018: 0101 0101 0101 0101 0101 0101 0101
00000019: 0202 0202 0202 0202 0202 0202 0202
0000001A: 0404 0404 0404 0404 0404 0404 0404
0000001B: 0808 0808 0808 0808 0808 0808 0808
0000001C: 1010 1010 1010 1010 1010 1010 1010
0000001D: 2020 2020 2020 2020 2020 2020 2020
0000001E: 4040 4040 4040 4040 4040 4040 4040
0000001F: 8080 8080 8080 8080 8080 8080 8080

Total of 33 errors
Mod   B: TEST FAILED       # "do_loopback" (1, 1, -1)
pv>

--- plugged in clk pod cable J7 ---

pv> x do_loopback
Mod   B: TEST passed       # "do_loopback" (2, 1, 1)
pv>

--- jumper between pins 18 and 20 on J7 ---

pv> x do_loopback
Mod   B: TEST passed       # "do_loopback" (3, 1, 1)
pv>
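
If you want to compare logs between cards, the byte-error lines are easy to summarize with a few lines of Python. This is a rough sketch that assumes only the line format shown in the output above ("0xADDR: Expected 0xNN got 0xNN"); the function name is hypothetical and the sample is the first four lines of the unplugged-cable log.

```python
import re
from collections import Counter

# Matches lines such as "0x0000000c: Expected 0x55 got 0x00".
LINE_RE = re.compile(
    r"^(0x[0-9a-fA-F]+): Expected (0x[0-9a-fA-F]+) got (0x[0-9a-fA-F]+)$"
)


def parse_errors(log: str):
    """Yield (address, expected, got) for each byte-error line in the log."""
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            yield tuple(int(group, 16) for group in match.groups())


sample = """\
0x0000000c: Expected 0x55 got 0x00
0x0000000d: Expected 0xff got 0x00
0x0000000e: Expected 0xaa got 0x00
0x0000001c: Expected 0x55 got 0x00
"""

errors = list(parse_errors(sample))
# Offset of each failing address within its 16-byte block.
offsets = Counter(addr % 16 for addr, _, _ in errors)
got_values = {got for _, _, got in errors}

print("failing offsets within each 16-byte block:", dict(offsets))
print("distinct values read back:", [hex(value) for value in got_values])
```

In this sample every failing read comes back as 0x00 and the failures sit at offsets 0xc to 0xe of each 16-byte block. What those addresses physically correspond to is not known.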
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #139 on: October 26, 2022, 04:12:05 pm »
A small follow-up on the J7 jumper thing...

If the jumper on J7 pin 18 to 20 is in place, and a failure is induced by holding U121.39 DQMH high, the failures reported for both sections of do_loopback begin to make more sense.

Also note do_clock_test succeeds in this case.

I'm guessing that maybe to perform byte reads and writes to SDRAM (as opposed to the full 16-bit width for the SDRAM), the clock needs to be enabled.  The first test in the do_loopback output is byte oriented, and the second set is the full memory width.  Without the clock enabled, the byte oriented output had no relation to what was actually happening with the SDRAM.  Now, with the clock enabled, SDRAM failures do have an effect for both byte and word error sections.

Code: [Select]
U121.39 DQMH held high, J7 18 to 20 jumpered

pv> x do_loopback
Enter CTRL-C to stop
0x00000001: Expected 0x00 got 0x80
0x00000002: Expected 0x00 got 0x80
0x00000003: Expected 0x00 got 0x80
0x00000004: Expected 0x00 got 0x80
0x00000005: Expected 0x00 got 0x80
0x00000006: Expected 0x00 got 0x80
0x00000007: Expected 0x00 got 0x80
0x00000008: Expected 0x00 got 0x80
0x00000009: Expected 0x00 got 0x80
0x0000000a: Expected 0x00 got 0x80
0x0000000b: Expected 0x00 got 0x80
0x0000000c: Expected 0x55 got 0xd5
0x0000000f: Expected 0x00 got 0x80
0x00000010: Expected 0x00 got 0x80
0x00000011: Expected 0x00 got 0x80
0x00000012: Expected 0x00 got 0x80
0x00000013: Expected 0x00 got 0x80
0x00000014: Expected 0x00 got 0x80
0x00000015: Expected 0x00 got 0x80
0x00000016: Expected 0x00 got 0x80
0x00000017: Expected 0x00 got 0x80
0x00000018: Expected 0x00 got 0x80
0x00000019: Expected 0x00 got 0x80
0x0000001a: Expected 0x00 got 0x80
0x0000001b: Expected 0x00 got 0x80
0x0000001c: Expected 0x55 got 0xd5
0x0000001f: Expected 0x00 got 0x80
0x00000020: Expected 0x00 got 0x80
0x00000021: Expected 0x00 got 0x80
0x00000022: Expected 0x00 got 0x80
0x00000023: Expected 0x00 got 0x80
0x00000024: Expected 0x00 got 0x80
Card 0
00000000: 0101 0101 0101 0101 0101 0101 8101
00000001: 0202 0202 0202 0202 0202 0202 8202
00000002: 0404 0404 0404 0404 0404 0404 8404
00000003: 0808 0808 0808 0808 0808 0808 8808
00000004: 1010 1010 1010 1010 1010 1010 8010
00000005: 2020 2020 2020 2020 2020 2020 8020
00000006: 4040 4040 4040 4040 4040 4040 8040
00000007: 8080 8080 8080 8080 8080 8080 8080
00000008: 0101 0101 0101 0101 0101 0101 8101
00000009: 0202 0202 0202 0202 0202 0202 8202
0000000A: 0404 0404 0404 0404 0404 0404 8404
0000000B: 0808 0808 0808 0808 0808 0808 8808
0000000C: 1010 1010 1010 1010 1010 1010 8010
0000000D: 2020 2020 2020 2020 2020 2020 8020
0000000E: 4040 4040 4040 4040 4040 4040 8040
0000000F: 8080 8080 8080 8080 8080 8080 8080
00000010: 0101 0101 0101 0101 0101 0101 8101
00000011: 0202 0202 0202 0202 0202 0202 8202
00000012: 0404 0404 0404 0404 0404 0404 8404
00000013: 0808 0808 0808 0808 0808 0808 8808
00000014: 1010 1010 1010 1010 1010 1010 8010
00000015: 2020 2020 2020 2020 2020 2020 8020
00000016: 4040 4040 4040 4040 4040 4040 8040
00000017: 8080 8080 8080 8080 8080 8080 8080
00000018: 0101 0101 0101 0101 0101 0101 8101
00000019: 0202 0202 0202 0202 0202 0202 8202
0000001A: 0404 0404 0404 0404 0404 0404 8404
0000001B: 0808 0808 0808 0808 0808 0808 8808
0000001C: 1010 1010 1010 1010 1010 1010 8010
0000001D: 2020 2020 2020 2020 2020 2020 8020
0000001E: 4040 4040 4040 4040 4040 4040 8040
0000001F: 8080 8080 8080 8080 8080 8080 8080

Total of 33 errors
Mod   B: TEST FAILED       # "do_loopback" (1, 1, -1)

pv> x do_clock_test
Mod   B: TEST passed       # "do_clock_test" (1, 0, 1)
pv>
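
The word dump can be checked the same way. The sketch below assumes the expected pattern is a walking one repeated in both bytes of each word (row n holds the byte 1 << (n % 8)), which is inferred from the passing rows of the dumps and not from any pv documentation. The function names are hypothetical.

```python
def expected_word(row: int) -> int:
    """Walking-ones pattern inferred from the dump: same byte in both halves."""
    byte = 1 << (row % 8)
    return (byte << 8) | byte


def check_dump(dump: str):
    """Return (row, column, xor_mask) for every word that deviates."""
    bad = []
    for line in dump.splitlines():
        if ":" not in line:
            continue
        addr, _, words = line.partition(":")
        row = int(addr, 16)
        for col, word in enumerate(words.split()):
            mask = int(word, 16) ^ expected_word(row)
            if mask:
                bad.append((row, col, mask))
    return bad


# Three rows copied from the DQMH-held-high dump above.
sample = """\
00000000: 0101 0101 0101 0101 0101 0101 8101
00000004: 1010 1010 1010 1010 1010 1010 8010
00000007: 8080 8080 8080 8080 8080 8080 8080
"""

for row, col, mask in check_dump(sample):
    print(f"row {row:#04x} column {col}: differs in bits {mask:#06x}")
```

On these rows only the last column deviates, and only in its high byte, so the held-high pin now shows up in one specific place in the word section rather than everywhere.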
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #140 on: November 11, 2022, 07:41:00 am »
Great troubleshooting information as always!

Quick question - has anyone opened up one of the riveted pod housings (drilling out the rivets I assume), and have any tips on getting it closed back up and looking nice and keeping it usable?  Any idea where to buy small replacement rivets like that?  I'd prefer that over replacing with a screw and nut, though I guess that's an option.

Basically, I noticed I had a few dead channels on one pod... I built a simple pattern generator to logic analyzer pod adapter to test all my pods, and also make it convenient to test various pattern generator sequences I was using.  After a little bit of testing, I found the channels are all working on the cards, but my A1 cable seems to have a few breaks in it near the 40-pin connector.  Wiggling the cable back and forth makes channels come and go.  I figure there are probably a few broken wires or solder joints that I could fix, rather than replacing the whole cable.

Thanks,
DogP
 

Offline TK

  • Super Contributor
  • ***
  • Posts: 1722
  • Country: us
  • I am a Systems Analyst who plays with Electronics
Re: Series defect on agilent 167xx boards?
« Reply #141 on: November 12, 2022, 03:57:42 am »
Check the wires for damage before opening the pod.
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #142 on: November 13, 2022, 11:50:07 am »
Yeah, it appears that the cable is broken internally close to the pod (if I flex the cable near the end, the channels come and go).

I popped open the pod, and each wire is a very small coax cable terminated inside the pod, and covered in a conformal coating.  So, I guess I can cut the cable back an inch or two until I get past all the breaks, and then re-terminate the cable... though it looks like it'll be a pretty tedious job.  For <$100 shipped I can get another L/A card with two more sets of cables (4 pods), which might just be worth it.  And I have 7 working pods (112 channels), so I guess it isn't that critical to get this last pod up and running immediately.

To directly answer my previous question on rivets though... it looks like they're 1/16" diameter semi-tubular steel rivets with an oval head.  I think they should be 3/8" long (hole of pod assembly is ~0.330" deep, plus 0.040" clinch allowance), though this is the closest I found for sale, keeping the other specs correct: https://www.rivetsonline.com/steel-plated-semi-tubular-rivets-116/t060s00406o .
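
For anyone sizing rivets for a different housing, the length arithmetic above can be redone like this. It is a sketch only: the two measurements are the ones from this post, and the assumption that stock lengths come in 1/32" steps is mine, not something taken from a rivet catalog.

```python
from fractions import Fraction

hole_depth = Fraction(330, 1000)       # ~0.330" hole depth of the pod assembly
clinch_allowance = Fraction(40, 1000)  # 0.040" clinch allowance
required = hole_depth + clinch_allowance

# Candidate stock lengths from 1/4" to 1/2" in 1/32" steps (assumed step size).
candidates = [Fraction(n, 32) for n in range(8, 17)]
best = min(candidates, key=lambda length: abs(length - required))

print(f"required {float(required):.3f} in, nearest stock length {best} in")
```

With these numbers the required length is 0.370" and the nearest candidate is 3/8" (0.375").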

DogP
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #143 on: November 13, 2022, 05:55:46 pm »
I'd look on AliExpress for the rivets.  They'll be cheap. Maybe not exact but do you really care?
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #144 on: November 16, 2022, 09:57:07 pm »
MarkL:
I can only reply now because I was away from my LA and couldn't check your findings. Now back to work :)
It's good news that you found this. I have always tested the board without cables. I have the cable set, but I don't have the pods. I will test the card with your solution. Before I read your messages, I was thinking about the easiest way to test the memory chips on the board. These chips were used on ordinary PC100 RAM modules around the P1-P2 era. I bought a 32MB (4x8 MB) module and a 64MB module. My idea is to move one chip onto a RAM module and test it with plain Memtest x86. But first I will try your idea; I hope some tests will pass.

DogP:
The 16555 card restoration is delayed because I also got a 16557 in untested condition. If it passes, I will use it alongside the 16555, both as standalone cards. Anyway, which is better: 2x 16555 as master-slave, or 1x 16555 + 1x 16557 as standalone?
« Last Edit: November 16, 2022, 10:03:23 pm by MateKrisz »
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #145 on: November 16, 2022, 11:21:56 pm »
Hi MarkL,

I followed your instructions. Here are my results:
With a jumper on J7 pins 18-20, only do_ram_test and do_dram_test pass. The others fail, except the 100, 180, 300 and ext tests, which are untested.
Then I plugged in the cables. Only the cables, because I don't have the pods. The results are the same as in the previous test.
I'm surprised that my LA ignored the J7 jumper.

« Last Edit: November 16, 2022, 11:48:40 pm by MateKrisz »
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #146 on: November 16, 2022, 11:42:50 pm »
I ran x modtest on the 16557D board. Everything passed except do_comparator_stest, which failed. I will check the traces on this board; maybe I will find a broken trace.
 

Offline hackwell

  • Contributor
  • Posts: 29
  • Country: fr
Re: Series defect on agilent 167xx boards?
« Reply #147 on: November 20, 2022, 05:47:00 pm »
Hi everybody

I ran into the same issue yesterday with my 3 16718A modules. I didn't notice the failures until I tried to use pods 2 and 3 which failed on the 3 boards.
This thread helped me a lot, and I was able to fix 2 out of the 3 boards. Sure enough, a bunch of tracks had been sent to the 9th dimension and I had to do some heavy sewing, but things went quite well.
So now I'm left with one module with an error on U2 U13 and U64 U67. This seems not to be related to corrosion and may have occurred a long time ago. Maybe a faulty DRAM or a bad Xilinx chip. I guess I'll have to build an extension cable and use my Tek LA to troubleshoot that one.

But I still have an issue with the allegedly good boards: they both pass all the tests individually (with pv and the GUI), but my 16702B throws an error when using pods 3 and 4 of the master card when it's used in a multicard configuration. The other one works just fine on all 4 pods.

The thing is that when pv is run against a multicard setup, it relies on the master and tests the expander through it.
Weirdly enough, the failing pods 2 and 3 are located on the master card, and it passes all the self tests...

I'm a bit lost here
« Last Edit: November 21, 2022, 05:13:05 pm by hackwell »
 

Offline hackwell

  • Contributor
  • Posts: 29
  • Country: fr
Re: Series defect on agilent 167xx boards?
« Reply #148 on: November 22, 2022, 05:47:36 pm »
A quick follow up to my previous message : I found another open track which was responsible for the memory error of my third card.
It was located between the 2 FPGAs. It seems that this interconnection is used to access the whole memory space from the backplane bus, if that makes sense.
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #149 on: November 27, 2022, 05:33:14 pm »
Hi MarkL,

I am waiting a little because I ordered an HDMI 51 MP microscope and I would like to check the traces once it arrives.
I tested the PC100 RAM modules. I have two modules, and both tested 100% OK with Memtest x86. So if I need spare memory ICs, I have plenty.

 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #150 on: December 01, 2022, 08:55:42 pm »
I'm back again. My digital microscope has arrived and I checked the 16720A PCB first. I found physical damage on the board: one trace is broken. I attached some pictures of it. I think this trace connects U87 pin 11 to U72 (??). The clock hides where this trace connects. I think I need to bridge this trace with a wire soldered directly on the PCB.
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #151 on: February 13, 2023, 10:45:48 am »
I have a couple 16752a cards I'm trying to fix as I have another project that requires a logic analyzer. I fixed one by removing the plastic runners and doing some careful track repair. That one passes all self tests. But the second one is being very difficult.

On the second card, it fails the Memory Data Bus Test. It's the same single bits on 2 banks
Chip 0 Bank 0 Port 1 Bits 0x00000002 U2
Chip 0 Bank 1 Port 1 Bits 0x00000002 U64
Chip 1 Bank 0 Port 3 Bits 0x10000000 U60
Chip 1 Bank 1 Port 3 Bits 0x10000000 U90

U2 and U64 are on opposite sides of the board in the same location. Same with U60 and U90. I verified that their data bits are indeed connected together (in a way that's convenient for the layout, not necessarily D0 on one connects to D0 on the other).

I verified continuity between all data bits on U60 and U90 to the 33 ohm resistor packs, and verified that on the other side of that, I do indeed measure about 35+ ohms. From there, the signals go to the Virtex FPGA, and then the other side of the FPGA looks like it connects to the actual logic chip.

What I'm a little confused about is how can it be bit 28 on U60 and U90 when they're only 16 bits wide each?? Or are they setup in a 32 bit arrangement with their companion chips (U89 and U59), and if the failure is in the upper word, it calls out U60 and U90, but if the failure was in the lower word, it would call out U89 and U59??
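
To make the question concrete: if the 32-bit mask really is two 16-bit chips side by side (a guess at this point, not something confirmed by any documentation), the reported mask can be split into a word index and a bit within that word. A minimal sketch, with a made-up helper name:

```python
def locate_bit(mask: int, word_bits: int = 16) -> tuple[int, int]:
    """Split a single-bit mask into (word index, bit within that word)."""
    if mask == 0 or mask & (mask - 1):
        raise ValueError("expected exactly one bit set")
    position = mask.bit_length() - 1
    return divmod(position, word_bits)


for mask in (0x10000000, 0x00000002):
    word, bit = locate_bit(mask)
    print(f"mask {mask:#010x}: word {word} (0 = low word), bit {bit}")
```

Under that assumption 0x10000000 is bit 12 of the upper word and 0x00000002 is bit 1 of the lower word.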

How does the Chip / Bank / Port nomenclature work?
Chip 0 / 1 I get - the 2 main logic analyzer asics
Is Bank which FPGA memory controller it's talking to?
Is Port which set of DRAMs the FPGA is talking to?
Does U60 tell me the same information as Chip 1 Bank 0 Port 3?

Because it's a single bit error, and I've traced the signal from the DRAM through the 33 ohm resistor pack to the other side of that, and that goes directly to the FPGA, does that point at the BGA ball under the FPGA? I find it hard to believe that one ball is broken on each of the outermost FPGAs, but it could happen I suppose.
Could this also be an interconnect issue between the FPGA and the main logic analyzer chip?

Or is the U60 / U90 thing a complete red herring, and the problem is somewhere else entirely? If I run one of the later tests, I get the same bits failing (bit 1 and bit 28), but it calls out entirely different chip identifiers??? HUH??? I'd have to go stick the card back in the analyzer and run the tests again to check which test and which chips it was calling out, but the wrong bits are in the same positions, but it was on completely different chips (U37 seems to ring a bell). I also kind of read somewhere that if there are multiple self test failures, to basically ignore all the tests after the first one that failed, as they're all dependent on each other. Is that true?

I've traced out many of the lines, including through vias that were under or close to the runners and double sided sticky pads between the main logic analyzer chip and the FPGA that controls U60 / U90, and I can't find anything that looks or measures broken.

I'm a bit stumped on this one. The one card was relatively "easy" to fix (I guess if you count scraping solder mask with a pin under a microscope and repairing traces with a single strand of wire from a 22gauge wire 'easy'), and this one is the exact opposite.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #152 on: February 13, 2023, 07:38:05 pm »
The two-connector (4 pod) analyzer boards are pretty much two independent acquisition engines.  It's split down the center and there is no crossover for acquisition data flow (although I think there is some muxing between the pods before  acquisition occurs in ASICs U22 and U45).

The data R/W access from the backplane to all the memory chips, however, is common between the two sides.  I think this is what it's complaining about, or at least is where I would look first.

As you've noted, the bit assignments are done based on what's best for the layout, and not how the chip manufacturers label the pins.  It's possible you're looking at a single problem, but I think probably not since the chips are on opposite edges of the board.

The nomenclature is a bit confusing.  To be honest I have not seen errors reported with "Chip 0" and "Chip 1" before, but I think it is referring to the Virtex FPGAs as "0" and "1" within each side.  The big acquisition ASICs under the heatsink are numbered like this:

  16750A/51A/52A: U45 (pods 1+2 "Chip 9"), U22 (pods 3+4 "Chip 8")

And then I'm guessing on the pod 1+2 side:

  Chip 0 is U41
  Chip 1 is U52

And then on the pod 3+4 side:

  Chip 0 is U10
  Chip 1 is U25

Repeating... This is only a guess.  Was there any Chip8/9 information printed with these errors?

(EDIT: So this guess was wrong.  The discrepancy is from units running HP-UX vs. units running windows.  Chip 8/9 in HP-UX is Chip 0/1 in windows.  Keep reading...)


I would closely examine the large clump of traces going down the center of the board and heading towards the two FPGAs near the backplane connector, mostly on the bottom since they are all directly under one of the runners.  I think the Altera MAX (U18) is responsible for backplane access to the acquisition memory, and on the 16750/1/2 boards that access is done through the Virtex.

Do you have more detailed output from the failing pv test with debug turned on (d r=10 d=9) ?  Are there any other errors being reported on the Memory Data Bus Test, or are any other pv tests failing?  In general, the first test to fail is the one to focus on fixing, since other tests very often fail as a result of the first failure.  But it's not to say to disregard clues that might exist in later failed tests, so it's at least worth running all the tests once to be aware of what else pops up.

If unable to find a break anywhere, one troubleshooting method is to put the failing test into a loop and then start perturbing operation of the various signal lines with a low resistance to ground.  Try a 33R to start, but you might need to go as low as 10R.  The idea is to see if you can generate the same error report on other bits and then try to zero in on which physical data line is having trouble.  The data lines are usually (but not always) in bit order next to each other, so you can usually tell when you're getting close to the culprit.

This method will also reveal a lot about the nomenclature as various errors are reported.  Perturbing data, address, and control signals on various memory chips will help create an understanding of Chip/Bank/Port and bit ordering.

You can also do this on your working card to try to recreate the error message you're seeing on the bad card.

It's frustrating not having any documentation for this level of diagnostics.  The exact point of failure could be sitting right in front of us and we'd never know it.
« Last Edit: February 14, 2023, 03:37:29 pm by MarkL »
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #153 on: February 14, 2023, 09:47:15 am »
So I did some fault injection of my own. This is what I discovered.

When testing a 16752A in a 16903A chassis (the 3 slot 16900 series Windows XP chassis), the DRAM chip identifiers given by the self test are COMPLETELY WRONG!

Chip 0 = the main logic analysis chip for pod 1/2
Chip 1 = the main logic analysis chip for pod 3/4

So, it was telling me there was a fault on bit 0x10000000 of U90/U60. At first glance, this makes sense - U90 and U60 are on opposite sides of the board from each other, and their data lines are connected together - U90's D15 connects to U60's D0 and so on.

Ok, I'll inject another fault on a different pin on U90 - let's inject a fault on D15... run the self test

new failure on U31 / U74 at bit 0x80000000. Ok, so the data bits on the bottom chip align with the numbering they're using here obviously, but the identifiers are completely wrong.

I was also very confused as to whether this was talking about the data bus between the FPGAs and the DRAMs, or between the FPGAs and the acquisition ICs, so I injected a fault there as well and ran some tests.

No change to the "Memory Data Bus Test", but a new fault on the "Analyzer chip memory bus test"

Ok, so "Memory Data Bus" = between the FPGAs and the DRAMS including all of the 33 ohm resistor packs (which were a huge problem on my card - I took most of them off, cleaned the pads, had to repair a couple pads as they were eaten away right where the pad transitions to the trace at the boundary of the opening in the solder mask, and soldered them all back - the corrosion on the solder joints of those on my card was pretty bad)

and "Analyzer chip memory bus" = between the acquisition ASICs and the FPGAs
Chip identifiers completely unreliable

Ok, now we're getting somewhere.

On the 16752A in 16903A Memory Data Bus test, it uses the nomenclature
Chip => bank => port

Chip = 0 / 1 - which acquisition ASIC or that general side of the board - pod 1/2 = chip 0, pod 3/4 = chip 1
Bank = 0 / 1 = Top / bottom side memories - not exactly sure which bank is which side of the board as the chips' data pins are wired together
Port = 0 to 3 => seems like each FPGA has 2 ports - and there's 2 FPGAs per ASIC. Each "port" is 4 chips (2 on each side of the board). At least for Chip 0, with the bottom of the PCB facing up, and the pod connectors towards you, the "ports" go from 0 in the middle of the board to 3 on the left side of the board. Port 0 = U76 U77 on the bottom and U36 and U37 on the top. Port 3 = U89 and U90 on the bottom and U59 and U60 on the top. The port numbering and the byte order in the ports follows no logical order, and is all over the place. Port 1 - the chips right by the central bus of traces that runs to the top section of the board - aka right next to where a runner with the double sided adhesive was. I finally found the right chip!

There's also a "BONUS" port on each Chip which seems to be the one extra DRAM that doesn't have a partner that's only on the top side.

On the Analyzer Chip Memory Bus test, it uses the same nomenclature, but drops "bank" and only talks about chip and port

So seeing as my failure on chip 0 is on port 1 bit 0x00000002, that would be U82 / U36, not U90 / U60 as the incorrect self test says.
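
For anyone collecting these reports, here is a rough sketch of pulling the Chip / Bank / Port fields out of a Memory Data Bus failure line. It assumes only the line format quoted earlier in this thread and the chip-to-pod mapping worked out above for a 16752A in a 16903A; the class and function names are made up, and the Uxx field is kept only as "what the self test printed" since it turned out to be unreliable.

```python
import re
from typing import NamedTuple


class MemFault(NamedTuple):
    chip: int
    bank: int
    port: int
    bits: int
    reported_ref: str  # Uxx as printed by the self test; do not trust it


# Matches lines such as "Chip 1 Bank 0 Port 3 Bits 0x10000000 U60".
FAULT_RE = re.compile(
    r"Chip (\d+) Bank (\d+) Port (\d+) Bits (0x[0-9a-fA-F]+) (U\d+)"
)

# Mapping found by fault injection in the post above (16752A in a 16903A).
POD_PAIR = {0: "pods 1/2", 1: "pods 3/4"}


def parse_fault(line: str) -> MemFault:
    """Parse one failure line into its fields."""
    match = FAULT_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    chip, bank, port, bits, ref = match.groups()
    return MemFault(int(chip), int(bank), int(port), int(bits, 16), ref)


fault = parse_fault("Chip 1 Bank 0 Port 3 Bits 0x10000000 U60")
bit = fault.bits.bit_length() - 1
print(f"{POD_PAIR[fault.chip]} side, bank {fault.bank}, port {fault.port}, bit {bit}")
```

The port-to-chip mapping is deliberately left out, since only a couple of ports have been pinned down by fault injection so far.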

Time to go poke around with the continuity tester now that I know where I'm actually looking for a fault!

I wonder how they managed to screw that up!
« Last Edit: February 14, 2023, 10:53:25 am by ahakman »
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #154 on: February 14, 2023, 11:35:48 am »
Here's the issue - hard to tell in the photo, but that's a nodule of corrosion and obviously the track is completely eaten away between the pad and the trace.


Don't mind the resistor pack being crooked - I re-flowed them all with hot air - obviously I need to remove them and clean and inspect all those pads properly too, not just reflow with some flux.

What a mess
« Last Edit: February 14, 2023, 11:37:32 am by ahakman »
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #155 on: February 14, 2023, 11:51:45 am »
Here's some context, if it can help someone else. I labeled the couple chips I know for sure by fault injection and running the self test (again, I stress what the self test reports on a 16903A 3ch Windows XP mainframe - I have some 16702B's I could try the test in as well in HP-UX - the 16752A cards are the only cards I have though that are new enough to work in the newer mainframe, which boots faster, has a better screen, and the hard drive doesn't sound like a jet engine running).

C0 P1 L = Chip 0 Port 1 Low Word

« Last Edit: February 14, 2023, 11:58:38 am by ahakman »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #156 on: February 14, 2023, 03:04:42 pm »
Wow, I can't see any problem with that joint on the resistor pack.  A good reason to always do end-to-end continuity checks in areas with corrosion or on traces that transit corroded areas.  Another common location for hard-to-spot breaks is the solder blobs that serve as probe points.  The traces leading up to them are exposed for a very short distance after they come out from underneath the soldermask.

So, after reflowing does the card work now?

Thanks for the update on the nomenclature.  All my information is using a 16702B, and clearly they changed the chip 8/9 designation to chip 0/1 for the windows OS models.  And from your investigation, it sounds like they got much of the Uxx identifier reporting just plain wrong in the windows version.  Thanks, Agilent.

Each pod pair has 32 bits of data that are acquired and stored.  Plus, there are two extra bits for the clock/qualifier inputs that are also acquired and stored.  These are the Bonus bits, and I think they are stored in U19/U47 (the DRAM chips that don't have a partner on the underside), or in whatever logical grouping includes those two chips.

Some people have installed SCSI2SD adapters to get rid of the "jet engine" hard drive in the 167xx units.  But the drive noise is nothing compared to the chassis fans, IMO.
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #157 on: February 14, 2023, 06:51:21 pm »
I didn't scrape the corrosion off, repair the trace and try it yet - it was about 3AM when I finally found the right chip by moving my induced fault around. Hopefully tonight I will get that fixed along with the other memory error on the other chip (after figuring out which port is the one that's failing) and look into what's causing the error with the Bonus bank on Chip 1, then we'll see how many self tests still fail.

Yes, this card is very tricky compared with the first one I fixed. The first one had very obvious corroded traces that looked green under the solder mask. Scrape the mask off until you get to good copper on both sides, and solder in a jumper - problems solved.

This one is all about the ends of the traces where they meet pads being corroded, which are MUCH harder to spot visually. And the corrosion is further away from where the runners were. This card almost looks like it was above another card that was off gassing or something, and the corrosion is much more widespread. As I said earlier, ALL of those 33 ohm resistor packs looked just downright awful, and I can see now that some of them still need some attention. They didn't look the best on the other card that's working either, but better than this one. Long term, I probably need to remove, clean / rehab the pads, and re-solder ALL of them on the other card too - but that can wait until later.

I do want to get this up so I can actually use it for the project I have in mind. I should probably just stick the pair of 16550A cards that don't have runners and thus don't have any corrosion into a 16702B chassis and use that. For some reason I initially thought those weren't supported in a 16702B chassis, but looking at the compatibility matrix again, it looks like they are - they're one of the 165xx series cards that work in the 167xx series mainframes (in the same way that the 16752A is one of the 167xx series cards that work in the 169xx series mainframes).

The next issue is going to be cables - I see now that the 167xx series cards use a wider plug (not to mention that spacer built into the back of the card) than the 165xx cards do. I have some cables for the 165xx cards, but none for the 167xx series cards :(

 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #158 on: February 26, 2023, 02:41:46 am »
I was dragging my feet working on this again as I was waiting for my new soldering microscope to arrive.

I fixed all of the connections on all of the 33 ohm resistor packs, and now all my memory errors are gone! Now I only have 2 self test failures left (down from about 10 self tests failing before):
Comparators and ZoomChipSel

These are obviously in a different area of the board - back to the microscope to do some detailed inspection...

Edited to add: I found the comparator problem - I think it was a trace I repaired previously, but there wasn't enough solder on my bodge wire and it wasn't making a good connection (or it could've been one of the vias I reflowed and got a bunch of weird looking junk out of). Sweet, both of my 16752A cards are passing self test now!

« Last Edit: February 26, 2023, 04:08:46 am by ahakman »
 
The following users thanked this post: alm

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #159 on: February 26, 2023, 07:55:10 pm »
Great!  Glad you found the problem.

The comparator control lines seem to be a common victim of corrosion.  It's likely the comparator failure(s) caused the zoom failures since the system uses the comparators to set up test data patterns for the zoom chips.

For reference for future readers, the comparators are 1NB4-5036 next to the pod connectors (top and bottom), and the zoom chips are 1NB4-5040 next to them (top only).
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #160 on: February 26, 2023, 10:24:50 pm »
Yes, the comparators are right down by the pod cable connectors on the external side of the card, but I think the reference voltages come from a DAC that's way up close to the backplane connector. They run right under where one of the plastic runners was, and the trace was broken there. I had already bridged that trace with a wire, but it was completely disconnected from one side, either from cleaning the board with q-tips and acetone, or it just didn't solder well the first time.

For others reading this thread, just because it says "comparator failure" and the comparators are down on the external connector side of the board doesn't necessarily mean that the problem is there. Always focus on the areas with the plastic runners and the areas around where they were.

And if you have memory data bus errors, focus on all of the 33 ohm resistor packs. Especially focus on any signs of corrosion where the pads turn into traces right at the edge of the solder mask opening for the pads.

These are the kind of self test numbers I like to see:
[Attached image: 1725992-0]
« Last Edit: February 27, 2023, 12:21:54 am by ahakman »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #161 on: February 27, 2023, 02:41:27 am »
Yes, the comparators are right down by the pod cable connectors on the external side of the card, but I think the reference voltages come from a DAC that's way up close to the backplane connector. They run right under where one of the plastic runners was, and the trace was broken there. I had already bridged that trace with a wire, but it was completely disconnected from one side, either from cleaning the board with q-tips and acetone, or it just didn't solder well the first time.

For others reading this thread, just because it says "comparator failure" and the comparators are down on the external connector side of the board doesn't necessarily mean that the problem is there. Always focus on the areas with the plastic runners and the areas around where they were.
...
100% agree.  It's the path between the DAC and the comparators that can get severed which causes these problems.  It's much more rare that it's an actual chip failure (although it has happened).

On the 16752A, the DAC is U39 (AD7841ASZ), and as you point out is near the backplane connector on the top.  The traces to the comparators run on the bottom of the card on the very outer edge on the POD1/2 connector side.  And as you say, they pass right under one of the runners.  The crossing is adjacent to U90, and is a favorite corrosion spot.

There is a similar looking set of traces on the opposite edge, also on the bottom, and if corroded these can create board ID errors.

There's a mapping of DAC output pins to comparator inputs earlier in this thread for anyone needing to do the end-to-end continuity check:

  https://www.eevblog.com/forum/repair/series-defect-on-agilent-167xx-boards/msg2720304/#msg2720304
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #162 on: March 22, 2023, 10:30:10 am »
Bit of an odd request.

I've been hunting on ebay for a while for a dead 167xx or similar board - anything with the modern low density probe connector. Most of them are in the US and the shipping costs a fortune.

I am producing a new high end FPGA board for retro gaming, and it has a daughterboard slot. I'm thinking about using the HP front end comparator and circuit around it, connected to the FPGA.
I should be able to get speeds of @1.6Gb per channel.

I see the pinout of the comparator is quite well understood but has anybody drawn a complete schematic yet?
Does it vary much with the highest speed boards, say the 2GHz 16751a?

I could do with the cable as well, I have some probes to play with..
I'm in Sweden and happy to pay for parts - although I do need at least one intact front end with all the parts present.

One reason is to make a modern logic analyser with huge depth, but the other is for real time debug/emulation of chips in arcade boards.


Thanks for reading.

Mike.
www.fpgaarcade.com
mike@fpgaarcade.com
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #163 on: March 22, 2023, 03:25:34 pm »
Bit of an odd request.

I've been hunting on ebay for a while for a dead 167xx or similar board - anything with the modern low density probe connector. Most of them are in the US and the shipping costs a fortune.

I am producing a new high end FPGA board for retro gaming, and it has a daughterboard slot. I'm thinking about using the HP front end comparator and circuit around it, connected to the FPGA.
I should be able to get speeds of @1.6Gb per channel.

I see the pinout of the comparator is quite well understood but has anybody drawn a complete schematic yet?
Does it vary much with the highest speed boards, say the 2GHz 16751a?

I could do with the cable as well, I have some probes to play with..
I'm in Sweden and happy to pay for parts - although I do need at least one intact front end with all the parts present.

One reason is to make a modern logic analyser with huge depth, but the other is for real time debug/emulation of chips in arcade boards.
...
Not an odd request at all; an interesting idea!  The probing is not a trivial piece to get right.  Why not take advantage of a probing system that's already out there and readily available.

The 16715A, 16716A, 16717A, 16718A, 16719A, 16740A, 16741A, 16742A, 16750A/B, 16751A/B, and 16752A/B all use the same front end design, including the DAC.  In fact, many of the cards are identical and only differ by model setting resistors.  The 16753A, 16754A, 16755A, 16756A, 16760A use a different front-end design and comparator.

The former all support 2GHz Timing Zoom (except the 16715A which has unpopulated areas for it), but that's just the sample clock rate.  If you're shooting for 1.6Gbps, you should take note that the max state capture is 400MHz in the 16750/1/2 cards, and channel-to-channel skew is only specified as <1.0ns.  Besides the acquisition ASICs, the front-end could be contributing to those limits.  When you get a board, you might want to measure the actual switching characteristics of the comparators in their natural habitat before proceeding with a design.
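To put those numbers side by side, here is a small Python sketch of the arithmetic. It uses only the figures quoted above (1.6 Gbps target, 2 GHz Timing Zoom sampling, 400 MHz max state clock, channel-to-channel skew specified as under 1.0 ns); the variable names are just for illustration.

```python
def period_ps(rate_hz: float) -> float:
    """Convert a rate in Hz (or bits per second) to a period in picoseconds."""
    return 1e12 / rate_hz


target_bit_rate = 1.6e9    # proposed per-channel data rate
zoom_sample_rate = 2e9     # Timing Zoom sample clock
max_state_clock = 400e6    # max state capture rate on the 16750/1/2
skew_spec_ps = 1000.0      # channel-to-channel skew bound (<1.0 ns)

unit_interval = period_ps(target_bit_rate)
zoom_period = period_ps(zoom_sample_rate)
state_period = period_ps(max_state_clock)
skew_in_ui = skew_spec_ps / unit_interval

print(f"bit period at 1.6 Gbps : {unit_interval:.0f} ps")
print(f"Timing Zoom sample step: {zoom_period:.0f} ps")
print(f"state clock period     : {state_period:.0f} ps")
print(f"skew bound             : {skew_in_ui:.1f} bit periods")
```

The point is that the specified skew bound is wider than one bit period at 1.6 Gbps, so per-channel deskew would likely be needed unless the real front end turns out to be much better than its spec.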

Unfortunately schematics don't exist.  Some of the passives connected to the incoming data lines are unlabeled and would need to be measured with appropriate high-frequency gear.  The easiest approach would probably be to duplicate their layout exactly and lift all the front-end components from the board.  Length-matched traces may include some post-comparator delay compensation, so you may need to tweak lengths in your final design.

I'm in the US, so I'm unfortunately in the category of "costs a fortune" shipping.
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #164 on: March 22, 2023, 09:52:37 pm »
Thanks for the detailed response.
I can get hold of a working 1680A for a bit which looks to use the same front end? and I can probe around that. I've got access to a decent 'scope at work.

The test mode feature of the comparator is interesting, I should be able to use that to compensate for delays between the front end and the FPGA.
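As a rough illustration of that idea, here is a Python sketch of the deskew bookkeeping. It assumes the test mode lets you measure an arrival delay per channel and that the FPGA inputs have some programmable delay with a fixed step; the 5 ps step, the channel names and the delay values are all made-up placeholders, not measurements.

```python
def deskew_taps(arrival_ps: dict[str, float], tap_ps: float) -> dict[str, int]:
    """Return the number of delay taps to add per channel.

    Every channel is delayed to line up with the slowest one, so the
    slowest channel gets zero taps. Rounds to the nearest whole tap.
    """
    if tap_ps <= 0:
        raise ValueError("tap_ps must be positive")
    slowest = max(arrival_ps.values())
    return {
        channel: int((slowest - delay) / tap_ps + 0.5)
        for channel, delay in arrival_ps.items()
    }


# Hypothetical arrival times measured with the comparator test pattern.
measured = {"ch0": 120.0, "ch1": 95.0, "ch2": 140.0}
print(deskew_taps(measured, tap_ps=5.0))
```

Aligning to the slowest channel keeps every correction non-negative, which matters because input delay elements can only add delay.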

I doubt I'll get the layout quite as good as the original, but hopefully sufficient.

The MPSoC Xilinx device I am using is quite a beast, with Ethernet, USB and a couple of built in ARMs. It will be able to stream the captured data direct to the connected DDR4 memory.

 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #165 on: March 23, 2023, 01:13:32 am »
Thanks for the detailed response.
I can get hold of a working 1680A for a bit which looks to use the same front end? and I can probe around that. I've got access to a decent 'scope at work.
...
I don't have a 1680A, but it uses the same 40-pin probing system as the cards mentioned previously.

I was able to find a few teardown photos which show the 1680A acquisition card.  The photos weren't good enough to read the number off the comparator, but it has the right number of pins, and the layout and passives look the same on the input side.  The layout of the test clock area also looks the same.

You'll know for sure when you get it open and can verify if it's using the 1NB4-5036 comparator.

Please post your findings if you can - thanks!
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #166 on: March 23, 2023, 11:27:28 am »
I will!
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #167 on: March 24, 2023, 08:59:25 pm »
Well that's interesting.

A long time ago I bought my first 16xxxx, a 16700b.  One of the installed cards was a 16534a and it passed all the self-tests and showed valid calibration.  It seemed to work based on some minimal usage but I'd never re-run the calibration.

A year ago I bought a  mint 16702b from a local seller on Craigslist.  He assured me it passed self-tests but it turned out he thought that meant it powered on  :-/O   All three of the 16752a cards failed test (I've got two working since).   The 16720a passed self test (and failed spectacularly a week later, now every test fails).  The 16534a passed self test.

When I got home I happened to notice that C107 on the 16702b/16534a card had imploded (pic).   Given the cap type and location I doubted it was critical but not exactly encouraging.

This week I fixed the damaged pcb/cap and got around to running calibration on both the 16534a cards using a T-cable setup from AliExpress.  I expected one card to pass fine and it did.  The other immediately threw "PROBLEM: either the cables are not connected properly or there is a serious problem with this module".  Continuing on, only 2 of the 7 tests pass: hysteresis and trigger level.

What was unexpected was that it's the original module from the 16700b that's failing.  The one with the repaired cap is working fine.   Never assume LOL.

Question: So on the working one I'd previously removed the old runners and bought some of the recommended 3M tape.   Before I apply conformal coating, reinstall the runners and bless it good, are there any other tests I should run?   I see a bunch of other tests in the service manual (hooking up to a multimeter and signal generator).

[I'll read through this thread for tips, I recall some from MarkL, on fixing the non-working card]


Thanks!
« Last Edit: March 24, 2023, 09:03:22 pm by dorkshoei »
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #168 on: March 25, 2023, 06:40:43 am »
So I've managed to get 2 more 16752A cards working, of 3 more that I bought. After my first round of repair attempts on this batch of 3, I had one working (which I've stacked with my original 2 I repaired before, filling my 16903A chassis - but I have a line on a 16902B so it would still be nice to get all 5 cards working), one failing the ZoomAcq test, and one failing the Memory Unload Modes Test.

The one failing the ZoomAcq test seemed very suspiciously similar to the failing PLL chip problem reported earlier in this thread, so I just swapped the PLLs on the 2 boards that had problems, and now I have one card that works entirely, and presumably one that's failing both memory unloads and the ZoomAcq tests now.

Does anyone still have any "beyond repair" cards they could scavenge a PLL chip from? Or that maybe are not quite as "beyond repair" as they thought?

Does anyone have experience with where the fault would be for the Memory Unload Modes Test? Reading the service manual, that test tests reading the memory data off the card to the backplane (so presumably to the CPLD chips close to the backplane connector). The card that's failing that test had by far the worst corrosion on it. I suspect one of the vias around the middle runner (with the huge parallel bus of tracks that runs up the middle of the card), or the next 2 runners towards the "POD 1/2" side of the card - there was some very nasty corrosion there, but I just can't see anything that looks broken after I cleaned it all up. I tried probing a few of the most suspect vias on a known good card to see if I could find where they went, so I could test which one was broken on the broken card, but I wasn't able to trace some of them (inner layer traces to BGA pads I'm thinking??)
« Last Edit: March 25, 2023, 06:53:06 am by ahakman »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #169 on: March 25, 2023, 07:33:34 pm »
...
This week I fixed the damaged pcb/cap and got around to running calibration on both the 16534a cards using a T-cable setup from AliExpress.  I expected one card to pass fine and it did.  The other immediately threw "PROBLEM: either the cables are not connected properly or there is a serious problem with this module".  Continuing on, only 2 of the 7 tests pass: hysteresis and trigger level.

What was unexpected was that it's the original module from the 16700b that's failing.  The one with the repaired cap is working fine.   Never assume LOL.
Just to be clear, is the 16534A that's failing its cal passing all the self-tests?  If you run it anyway, can you see any traces with a signal applied to one or both channels?

Quote
Question: So on the working one I'd previously removed the old runners and bought some of the recommended 3M tape.   Before I apply conformal coating, reinstall the runners and bless it good, are there any other tests I should run?   I see a bunch of other tests in the service manual (hooking up to a multimeter and signal generator).
I've never worked through the performance section.  It seems like it's a fair amount of work and wouldn't gain you that much unless you were using the card to produce verifiable test results.

The only thing I think would be useful to check is that the 50R termination is working on the specified attenuator ranges.  I don't think the 50R terminator is checked in either the self-tests or cal.  I've had attenuators where the 50R termination relay was flaky, and one where the resistor itself was blown.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #170 on: March 25, 2023, 08:08:24 pm »
...
The one failing the ZoomAcq test seemed very suspiciously similar to the failing PLL chip problem reported earlier in this thread, so I just swapped the PLLs on the 2 boards that had problems, and now I have one card that works entirely, and presumably one that's failing both memory unloads and the ZoomAcq tests now.

Does anyone still have any "beyond repair" cards they could scavenge a PLL chip from? Or that maybe are not quite as "beyond repair" as they thought?
Wow, another dead PLL.  Can you see any clock output from the dead one?

The PLL can be had on any of the 167xx cards with Timing Zoom except the 16715A.  It might help to know your country for anyone who can scrounge up a PLL.

Quote
Does anyone have experience with where the fault would be for the Memory Unload Modes Test? Reading the service manual, that test tests reading the memory data off the card to the backplane (so presumably to the CPLD chips close to the backplane connector). The card that's failing that test had by far the worst corrosion on it. I suspect one of the vias around the middle runner (with the huge parallel bus of tracks that runs up the middle of the card), or the next 2 runners towards the "POD 1/2" side of the card - there was some very nasty corrosion there, but I just can't see anything that looks broken after I cleaned it all up. I tried probing a few of the most suspect vias on a known good card to see if I could find where they went, so I could test which one was broken on the broken card, but I wasn't able to trace some of them (inner layer traces to BGA pads I'm thinking??)
It's really better to check continuity via to via on all the traces running through or near corroded areas.  Extremely sharp probes pushed into the via holes at an angle work well.  On multiple occasions I've had traces with no visible breaks and it turned out the corrosion had gotten under the soldermask.  It can take some time to do the testing.  But you're right, it could also have eaten away the via hole plating, and that's happened to me too.

The HP-UX based analyzers are able to turn on detailed debugging output when running any of the verification tests (pv).  Is there any more detail from the windows version on which bit(s) and/or chips are failing during "Memory Unload Modes Test"?

On the 1675x cards I think the acquisition memory access path from the backplane is through one of the FPGAs near the backplane connector (I think it's the Altera MAX), up to the Virtex FPGAs on top, and then back down to the actual DRAM chips.  On the 1671x cards, it goes direct from the backplane controller FPGA to the memory chips (there are no Virtex FPGAs acting as a memory controller layer).
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #171 on: March 25, 2023, 08:35:44 pm »
Just to be clear, is the 16534A that's failing its cal passing all the self-tests?  If you run it anyway, can you see any traces with a signal applied to one or both channels?
The card had been working (at least for my usage) the previous time I tried.  Now it passes self-test but fails most of the cal.  I'd have to try again to see if it's still showing traces.

Quote
I've never worked through the performance section.  It seems like it's a fair amount of work and wouldn't gain you that much unless you were using the card to produce verifiable test results.
I'm not.  I just don't want to glue down the new runners only to find there is a fault :-)
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #172 on: March 25, 2023, 08:36:50 pm »
One further thought on the failing 16534A scope card...

It's worth the time to check the on-board regulator outputs before heading down other troubleshooting rabbit holes.  There are five regulators and their output voltages are all labeled on the top of the board.

I had one card that self-tested fine, but consistently failed calibration because one of the output setting resistors had gone bad.

I've also had bad output setting resistors on logic analysis cards too, so it's never a bad idea to verify regulator outputs on these cards also.  I remember one card that had a very out of spec ECL termination voltage, which caused a number of self-tests to fail.

Bad resistors occur more often on cards that have corrosion.  Maybe the corrosion is getting into the film on the resistor, but I've never been able to see any damage under a microscope.  It's just a correlation at the moment.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #173 on: March 25, 2023, 08:47:47 pm »
...  I just don't want to glue down the new runners only to find there is a fault :-)
Understood.  I left my runners off, and I dislike conformal coatings passionately.  Time will tell if I'm wrong.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #174 on: March 26, 2023, 06:00:17 pm »
On the 16534A card again...

Besides the previous question on the self-test, are the cal errors associated with one channel or both?

If faulty on only one channel, a useful troubleshooting technique is to compare signals on equivalent nodes between the working and broken channel.

Similarly, if you have a fully working card, you can use that as your comparison.  A working card is a really useful resource, given the lack of any detailed documentation for these units.

You can probe two cards at the same time by using a 16701B expansion chassis, or a home-made card extender.  User DogP has some gerbers available for an extender that works well:

  https://www.eevblog.com/forum/repair/series-defect-on-agilent-167xx-boards/msg4031926/#msg4031926

There was another user who had done something similar (or was working on it), and I think it was a full-length extender card.  Can't find the post at the moment.
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #175 on: March 26, 2023, 09:18:52 pm »
Understood.  I left my runners off, and I dislike conformal coatings passionately.  Time will tell if I'm wrong.

I'm fairly accident prone by nature and just pulling the cards out over the last couple of days I had to be very careful as they *WILL* clash against each other (with the runners removed). 

My logic is that (if you read the webpage on replacing the runners) it was speculation by the 3M app engineer that this tape would work best.    I figure conformal coatings (I used silicone) are well proven wrt traces so I doubt very much they will cause an issue and maybe if there is a long term issue with the replacement 3M tape it will help prevent damage. 

Of course it took 20+ years for the issue to occur with the original OEM tape so I suspect I'll be well into senility before anything happens but ......

On the 16534A card again...

Besides the previous question on the self-test, are the cal errors associated with one channel or both?

Both.    I just removed the runners from the failing board.

It still works (both channels) connected to my Siglent function generator.  I tried a variety of waveforms and pulsing.   All worked.   Maybe slightly noisier than the fully working one.

I just re-ran calibration and didn't get the previously mentioned 'whoa this board has issues are you sure you want to continue' message and this time Trigger Delay passed but ADC, Gain and Offset still fail on both channels.

Same calibration T cable works fine on the other card.

Quote
You can probe two cards at the same time by using a 16701B expansion chassis, or a home-made card extender.  User DogP has some gerbers available for an extender that works well:
I'd seen this before.   I assume by "two cards at same time" one is via removing the unit cover.

I'd like a 16701B.  One of my units came with the cable to connect.  They're stupid expensive.   Max I'd pay is $50 and it would need to be local.

I'm parting out a couple of 16700Bs.    I think one has opt-003.   I've had zero luck selling them for $100.   Just mentioning it in case anyone wants any parts (for spares).   I'll probably keep the boards from one for spares for my 16702B.      I also have a 16702B that will be for sale (two logic cards (low end) with original runners replaced,  all cables/pods, optional pattern gen card) if anyone is interested.
« Last Edit: March 26, 2023, 09:25:02 pm by dorkshoei »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #176 on: March 26, 2023, 10:09:10 pm »
...
Besides the previous question on the self-test, are the cal errors associated with one channel or both?

Both.    I just removed the runners from the failing board.

It still works (both channels) connected to my Siglent function generator.  I tried a variety of waveforms and pulsing.   All worked.   Maybe slightly noisier than the fully working one.

I just re-ran calibration and didn't get the previously mentioned 'whoa this board has issues are you sure you want to continue' message and this time Trigger Delay passed but ADC, Gain and Offset still fail on both channels.

Same calibration T cable works fine on the other card.

It's good that you're getting a waveform showing that the sampling and horizontal stuff is working.  That can be really difficult to troubleshoot.

Does the offset work?  Can you make it trigger on different vertical points on the waveform edge?  And if either does work, are they displaying a sane voltage?

I'd check out the regulator outputs.  There are also some reference voltages common to both channels that are generated by the opamps near the DAC (U200).  You can compare with the working card.

Here are a few probe points that might help:

  https://www.eevblog.com/forum/repair/series-defect-on-agilent-167xx-boards/msg3994748/#msg3994748

Problems with the DAC, trigger, offset, and reference circuitry would show up there.  If it's noisy, maybe you have another bad cap on one of the voltage rails?

Quote
Quote
You can probe two cards at the same time by using a 16701B expansion chassis, or a home-made card extender.  User DogP has some gerbers available for an extender that works well:
I'd seen this before.   I assume by "two cards at same time" one is via removing the unit cover.

I'd like a 16701B.  One of my units came with the cable to connect.  They're stupid expensive.   Max I'd pay is $50 and it would need to be local.

I'm parting out a couple of 16700Bs.    I think one has opt-003.   I've had zero luck selling them for $100.   Just mentioning it in case anyone wants any parts (for spares).   I'll probably keep the boards from one for spares for my 16702B.      I also have a 16702B that will be for sale (two logic cards (low end) with original runners replaced,  all cables/pods, optional pattern gen card) if anyone is interested.
Well, the trick is to get two cards powered up at the same time, and depending on the problem, configuring them the same or start them looping on the same test.  Sounds like you have enough chassis to do it.

I've only had DogP's extender recently.  I've done most of my debugging from the underside by removing the bottom cover and mouse/keyboard card, which exposes the bottom of the card in Slot E.  Almost all the signals on the logic analysis cards appear on one of those solder blob probe points on the bottom, either because the signal transits there on a trace or because it's brought there on purpose for probing.  But it's not so on the scope cards, and sometimes jumpers or access to the top are needed.
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #177 on: March 26, 2023, 11:02:30 pm »

Well, the trick is to get two cards powered up at the same time, and depending on the problem, configuring them the same or start them looping on the same test.  Sounds like you have enough chassis to do it.


I'd prefer two extension setups so both cards could be side by side on the bench outside of the unit.

Ideally there would be an extender card that slides into the chassis and then you could connect a ribbon cable to that and then the card-under-test to the ribbon cable.   Avoid having to open the cover (or grope inside to make any connections).     Of course a card this size is mostly empty space, would be $$$ to fab.
 

Offline ahakman

  • Regular Contributor
  • *
  • Posts: 87
Re: Series defect on agilent 167xx boards?
« Reply #178 on: March 27, 2023, 09:33:59 am »
Wow, another dead PLL.  Can you see any clock output from the dead one?
The PLL can be had on any of the 167xx cards with Timing Zoom except the 16715A.  It might help to know your country for anyone who can scrounge up a PLL.

I was going to check the PLL output with a spectrum analyzer, but I thought first I would do a quick experiment of swapping the chips, so I didn't actually measure the bad PLL yet. I suspect it's the same as the previous failed one in this thread - the data on the self test looked right, but clocked wrong, so I'm guessing it was running too fast / lost loop control just like the previous failed one did.

I'm in the US for anyone that might have a donor PLL board.

It's really better to check continuity via to via on all the traces running through or near corroded areas.  Extremely sharp probes pushed into the via holes at an angle works well.  On multiple occasions I've had traces with no visible breaks and it turned out the corrosion had gotten under the soldermask.  It can take some time to do the testing.  But you're right, it could also have eaten away the via hole plating, and that's happened to me too.

Yes, I probably need to get some better sharp probes for my DMM. I have fluke probes, but they have really crappy (worn) tips. I also have the push-on finer probe tip adapters that go over regular fluke probes - they're not exactly the sharpest, but they do poke through the solder mask on the vias with some prodding - having some nice sharp probes would be a nice upgrade though.

The HP-UX based analyzers are able to turn on detailed debugging output when running any of the verification tests (pv).  Is there any more detail from the windows version on which bit(s) and/or chips are failing during "Memory Unload Modes Test"?

Yes, on the windows analyzer, you can turn up the verbosity of the self tests to 9, which I think is equivalent to the max verbosity you can set in pv as well. I don't remember exactly now what the error is, but it's either a consistent byte missing, or a consistent word missing. I also seem to recall getting different results running the tests in order from the beginning to the unload test, vs running some tests after the unload test and then going back to the unload test. It still fails in the same way, but it seems to fail for a lot more data values if I come back to the unload mode test after running other tests further down - maybe it's just because of what data has been left in the memory from the other tests.

It is interesting to me that the memory address and data bus tests pass, so all of the paths from the memory controller FPGAs to the DRAMs are good, but a consistent failing byte or 2 bytes in the unload test sounds like a control signal problem between the memory controller FPGAs and the back plane CPLD/FPGA (whatever it is)

On the 1675x cards I think the acquisition memory access path from the backplane is through one of the FPGAs near the backplane connector (I think it's the Altera MAX), up to the Virtex FPGAs on top, and then back down to the actual DRAM chips.  On the 1671x cards, it goes direct from the backplane controller FPGA to the memory chips (there are no Virtex FPGAs acting as a memory controller layer).
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #179 on: March 27, 2023, 01:52:45 pm »
I'm in the US for anyone that might have a donor PLL board.
I have some hopeless cards and can send you a PLL if you want to email me a self-addressed shipping label.  Send me a PM and we can work out the details.

Quote
Yes, I probably need to get some better sharp probes for my DMM. I have fluke probes, but they have really crappy (worn) tips. I also have the push-on finer probe tip adapters that go over regular fluke probes - they're not exactly the sharpest, but they do poke through the solder mask on the vias with some prodding - having some nice sharp probes would be a nice upgrade though.
I have a pair of Pomona 6275 with the SS tips.  You can also get them with pogo tips, but the SS tips are not spring loaded and you can really jam them into the soldermask if needed.  Replacement tips are available.  They're great when they're working, but one down-side is that they tend to go bad after too much flexing where the wire enters the probe body.  I'm on my third pair.  Some people like the ProbeMaster 8152/8153 and I might try them next.

For probe sharpening, I found this (thanks to your previous pointer):

  https://northridgefix.com/product/grinding-stone-to-straighten-and-sharpen-tweezers/

It's a sharpening block with a slot in it which makes sharpening the SS tips fast and easy.

Quote
Yes, on the windows analyzer, you can turn up the verbosity of the self tests to 9, which I think is equivalent to the max verbosity you can set in pv as well. I don't remember exactly now what the error is, but it's either a consistent byte missing, or a consistent word missing. I also seem to recall getting different results running the tests in order from the beginning to the unload test, vs running some tests after the unload test and then going back to the unload test. It still fails in the same way, but it seems to fail for a lot more data values if I come back to the unload mode test after running other tests further down - maybe it's just because of what data has been left in the memory from the other tests.

It is interesting to me that the memory address and data bus tests pass, so all of the paths from the memory controller FPGAs to the DRAMs are good, but a consistent failing byte or 2 bytes in the unload test sounds like a control signal problem between the memory controller FPGAs and the back plane CPLD/FPGA (whatever it is)
Hmmm...  I could be wrong about the testing path, or maybe only some of the signals are routed through the Virtex.  I'll have to take a closer look at that.

Because the same test fails in different ways it implies something could be floating because it's severed.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #180 on: March 27, 2023, 09:21:15 pm »
...
Hmmm...  I could be wrong about the testing path, or maybe only some of the signals are routed through the Virtex.  I'll have to take a closer look at that.
I'm remembering now where I got the idea the memory access goes through the Virtex FPGA chips on the 1674x and 1675x cards.

I looked again on a 1675x card that has a Virtex FPGA (U52) and an acquisition ASIC (U45) removed.  Along with a continuity tester, I can tell that the memory address, data, and control lines go to the Virtex by way of the (many) 33R resistor packs in the vicinity of the Virtexes.  They are series termination resistors for the memory signals.

A data/control bus from the Altera MAX runs up the top of the board in the center, ducks under to inner layer(s), and goes to the Virtex pads (different pads than the memory).  I believe this is the path for memory access from the backplane.  There appear to be a couple of traces that are also part of this bus that run on the bottom center with another bus.  They are on the outside edges of the other bus.

The "other" data/control bus on the bottom services the acquisition ASICs.  It appears to be a 16-bit bus.  It also services the DAC AD7841AS (U39).  Since the DAC has a known pinout, it's possible to figure out the specific bit assignments for the bottom bus, if it was useful to know.

FYI.
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #181 on: April 18, 2023, 08:34:23 am »
I managed to get hold of a couple of very duff cards and I'll start to reverse engineer the front end.
I've been chatting to Keith who took these excellent pictures :
https://www.techtravels.org/2021/02/hp-agilent-5382a-tear-down-with-photos/

Here is a picture of the termination network inside the flying wire "blob"
The documentation says it's a 250R to the signal, and then a 90.9K in parallel with an 8.2pF cap.

I removed the cap, and the resistor values measure as expected.

What's also interesting is the wire to the tip is just a normal unscreened wire, and the longer one is two core, not coax.

I don't destroy these cables lightly, they were a bit damaged... 

Looking at replacement cables, the long run is lossy coax, measuring 178R for 135 cm, so about 130R/meter.  Given the 90K on the tip, lossless coax probably isn't going to make too much difference to the levels.
(Samtec can't provide lossy)
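
As a rough check of the numbers above, here is a minimal sketch of the arithmetic. It assumes the 250R, the 90.9K, and the cable resistance all sit in series at DC; the termination at the analyzer end is not known, so it is left out. The function names are only for illustration.

```python
# Figures quoted in the post above.
CABLE_OHM = 178.0        # measured end-to-end resistance of the lossy coax
CABLE_LENGTH_M = 1.35    # 135 cm
TIP_SERIES_OHM = 250.0   # series resistor in the tip network
TIP_LARGE_OHM = 90_900.0 # 90.9K resistor in the tip network


def ohms_per_metre(resistance_ohm, length_m):
    """Resistance per unit length of the cable."""
    return resistance_ohm / length_m


def cable_share_of_dc_path(cable_ohm, tip_series_ohm, tip_large_ohm):
    """Fraction of the DC series resistance contributed by the cable.

    Assumption: the three resistances are simply in series at DC.
    """
    return cable_ohm / (cable_ohm + tip_series_ohm + tip_large_ohm)


per_metre = ohms_per_metre(CABLE_OHM, CABLE_LENGTH_M)
cable_share = cable_share_of_dc_path(CABLE_OHM, TIP_SERIES_OHM, TIP_LARGE_OHM)

print(f"cable resistance: {per_metre:.0f} ohm/m")
print(f"cable share of DC series resistance: {cable_share:.2%}")
```

Under that assumption the cable is a fraction of a percent of the DC path, which is why dropping it should barely move the DC levels. It says nothing about the AC behaviour.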
« Last Edit: April 18, 2023, 09:14:11 am by fpgaarcade »
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #182 on: April 18, 2023, 09:26:58 am »
And a nearly in focus picture of the front end.
6 components in the rx termination network. I'll remove and figure out how to measure.
 

Online alm

  • Super Contributor
  • ***
  • Posts: 2881
  • Country: 00
Re: Series defect on agilent 167xx boards?
« Reply #183 on: April 18, 2023, 10:02:44 pm »
Looking at replacement cables, the long run is lossy coax, measuring 178R for 135 cm, so about 130R/meter.  Given the 90K on the tip, lossless coax probably isn't going to make too much difference to the levels.
(Samtec can't provide lossy)
It's not for the levels, it's to improve the flatness of the frequency response, or in other words to dampen any ringing due to impedance mismatches between the coax and both ends. Using regular Z0 coax would distort the edges with reflections bouncing up and down the long coax. See the 1969 Tektronix book Oscilloscope Probe Circuit Concepts, starting at page 14.
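
To put a number on the mismatch, a minimal sketch using the standard reflection coefficient formula, gamma = (Z_load - Z0) / (Z_load + Z0). The load value below is only the DC resistance of the tip network quoted earlier in the thread (250R + 90.9K); the real high-frequency impedance is lower because of the 8.2pF cap, so treat the result as illustrative rather than a measurement.

```python
def reflection_coefficient(z_load_ohm, z0_ohm):
    """Voltage reflection coefficient at the end of a line of impedance z0_ohm."""
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


Z0_OHM = 50.0                      # ordinary 50R coax, as proposed in the thread
TIP_DC_OHM = 250.0 + 90_900.0      # assumption: DC value of the tip network only

gamma_matched = reflection_coefficient(50.0, Z0_OHM)
gamma_tip = reflection_coefficient(TIP_DC_OHM, Z0_OHM)

print(f"matched load: gamma = {gamma_matched:.3f}")
print(f"tip network (DC value): gamma = {gamma_tip:.3f}")
```

A coefficient near 1 means almost the whole edge is reflected back down the cable, which is the ringing that the distributed loss in the original coax is there to damp.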

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #184 on: April 19, 2023, 06:05:36 am »
Hi,
I'm aware of the reasoning behind the lossy coax, it's discussed somewhat in the patent https://patents.google.com/patent/US4777326

My view, and this is yet to be proved in practice, is a standard 50R cable such as the samtec EQCD will work "good enough" - certainly better than other hobbyist probing solutions.
We transport GHz signals over these at work, but they are correctly terminated.

I think there is enough resistance in the existing termination and matching networks that it will be sufficiently damped. I was worried that 180R less cable resistance would throw off the comparator levels, but the difference is tiny overall.

As I can't source lossy coax ribbon, the only solution is to add a series R ~50Ohm at either end of the cable. Any other ideas are most welcome.

I plan to run some simulations when I have measured the front end component values, but really testing with a pulse generator and high bandwidth 'scope measurement at the comparator input will be needed.

/Mike

btw that Tek book is a useful reference, thanks.
« Last Edit: April 19, 2023, 06:13:55 am by fpgaarcade »
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #185 on: April 19, 2023, 06:43:41 am »
Does anybody know the characteristic impedance of the agilent cables? I can attempt to measure it when I'm back at work.

For reference, the samtec 50R cable I'm considering having a play with: https://suddendocs.samtec.com/notesandwhitepapers/tcf-3650f-xx-txx_datasheet.pdf

Once we know more about the cable used I can ask around some of my China contacts for lossy micro coax - but full custom gets expensive quite quickly.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #186 on: April 19, 2023, 02:48:47 pm »
If this is just a one-off project...

You could salvage the lossy coax from 40-pin probe cables (ribbon, not woven).  The ribbons peel apart easily if you want to get down to individual or smaller groups of coax.

If you want regular coax (not lossy), the newer 90-pin probe cables could be salvaged and also peel apart easily.  I've measured the coax impedance to be 93R (+/- an ohm or two).
 

Offline fpgaarcade

  • Contributor
  • Posts: 18
  • Country: se
Re: Series defect on agilent 167xx boards?
« Reply #187 on: April 19, 2023, 03:14:55 pm »
Thanks for the impedance info, that's higher than I expected.

I have sourced some cables now for personal use.  I'm concerned if I make an FPGA board available for others, the demand for these rare cables would increase even more.
If we could find a reasonably priced modern solution which worked (nearly) as well it would help.

I've taken this thread off topic, sorry. I'll start a new one when I have more info.
Cheers,
Mike.
 

Offline fisafisa

  • Regular Contributor
  • *
  • Posts: 105
  • Country: es
Re: Series defect on agilent 167xx boards?
« Reply #188 on: May 09, 2023, 07:34:23 am »
hi.
Was trying to fix a 16720a for a long time.
it came without cables. Was planning to make my own.

When i saw that a jumper could make the card working, I couldn't believe it.
I had been reverse engineering the card recently and spent many hours  producing a kicad schematic and testing the card on a bus extension.
The main issue I was seeing is that no read clock was arriving at the serial RAM.

I tried the fix and immediately it was clear that things were different.
some of the failing tests now simply hung, some passed.

I then tested the card functionally and it worked!

Something is still wrong, as the self test does not pass, but functionally I could not find a problem yet.

 Many thanks
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #189 on: May 09, 2023, 02:35:59 pm »
...
I had been reverse engineering the card recently and spent many hours  producing a kicad schematic and testing the card on a bus extension.
Can you post your schematics?  There's so little documentation on these cards that every little bit helps.  Thanks!
 

Offline MateKrisz

  • Regular Contributor
  • *
  • Posts: 97
  • Country: hu
Re: Series defect on agilent 167xx boards?
« Reply #190 on: June 17, 2023, 08:54:30 pm »
Hi,
I have a problem with the 16720A. I tested it without cables, with a jumper on the clock pod.
The following tests failed:
do_loopback
do_clock_test
do_wait_test
do_instint_test

Passed tests:
do_ram_test
do_dram_test

Can you share the schematic? I will try to fix my card.
Thank you!
 

Offline aeg

  • Regular Contributor
  • *
  • Posts: 82
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #191 on: October 08, 2023, 06:16:09 am »
A few pictures for fans of corrosion
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #192 on: October 08, 2023, 01:24:58 pm »
A few pictures for fans of corrosion

Mmmm... impressive.

Looks like a 16710A/11A/12A?

Are you going to attempt repair?
 

Offline aeg

  • Regular Contributor
  • *
  • Posts: 82
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #193 on: October 09, 2023, 10:09:58 am »
Looks like a 16710A/11A/12A?

Bingo! 16711A.

Are you going to attempt repair?

there i fixed it





Code:
Mod   A: TEST passed       # "testEPLDpath" (1, 0, 1)
Mod   A: TEST passed       # "testLoadFPGA" (1, 0, 1)
Mod   A: TEST passed       # "testCableDetect" (1, 0, 1)
Mod   A: TEST passed       # "testFPGARegs" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip0" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip1" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip2" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip3" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip4" (1, 0, 1)
Mod   A: TEST passed       # "testChipRegsChip5" (1, 0, 1)
Mod   A: TEST passed       # "testZCal" (1, 0, 1)
Mod   A: TEST passed       # "testAdCntrs" (1, 0, 1)
Mod   A: TEST passed       # "testAdCntrRecords" (1, 0, 1)
Mod   A: TEST passed       # "testChipStateClocks" (1, 0, 1)
Mod   A: TEST passed       # "testChipTimingClocks" (1, 0, 1)
Mod   A: TEST passed       # "testChipCal" (1, 0, 1)
Mod   A: TEST passed       # "testMemsDataLines" (1, 0, 1)
Mod   A: TEST passed       # "testMemsWalkingOnes" (1, 0, 1)
Mod   A: TEST passed       # "testMemsAdrsLines" (1, 0, 1)
Mod   A: TEST passed       # "testMemsFullMeas" (1, 0, 1)
Mod   A: TEST passed       # "testRecordFlags" (1, 0, 1)
Mod   A: TEST passed       # "testOscillator" (1, 0, 1)
Mod   A: TEST passed       # "testComparators" (1, 0, 1)
Mod   A: TEST passed       # "testI2Csimple" (1, 0, 1)
Mod   A: TEST passed       # "testResources" (1, 0, 1)
Mod   A: TEST passed       # "testOtherPsyncs" (1, 0, 1)
Mod   A: TEST passed       # "testArmsTrigs" (1, 0, 1)
Mod   A: TEST passed       # "testEncoders" (1, 0, 1)
Mod   A: TEST passed       # "testHiSpeedMACs" (1, 0, 1)
Mod   A: TEST passed       # "testMemoryCal" (1, 0, 1)
« Last Edit: October 09, 2023, 10:12:07 am by aeg »
 
The following users thanked this post: jemotrain

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #194 on: October 15, 2023, 05:46:31 pm »
i have a bunch of these older 167** cards that have failed, i gave up on them, i have since moved onto 169xx cards

gg's on fixing the card, good job!
Arcade Board Repair Guru.  [ twitch: HammysHangout , youTube: Hammy Builds ]
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #195 on: October 16, 2023, 01:57:39 pm »
Congrats, aeg, on the repair!  I would have bet that one was a goner.


i have a bunch of these older 167** cards that have failed, i gave up on them, i have since moved onto 169xx cards
...
Have you found that the 169xx cards do not have the corrosion problem?
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #196 on: October 16, 2023, 03:55:01 pm »
i have not found any with issues yet, however, i am guessing it's because they have not had heavy use?

I did notice on the 1691x cards the level of tracing running under the runners is substantially lower.
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #197 on: October 16, 2023, 03:56:57 pm »
i have 1 16911 that gives me issues, but there is zero trace damage under it.. so it honestly became a parts board and testing things with..
 

Offline aeg

  • Regular Contributor
  • *
  • Posts: 82
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #198 on: October 23, 2023, 10:13:26 am »
i have a bunch of these older 167** cards that have failed, i gave up on them, i have since moved onto 169xx cards

Any chance you're looking to part with one of the dead 167** cards? I have a 16751B with a bad comparator IC and bad pod cables, waiting for a card to pull parts from.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #199 on: October 25, 2023, 10:57:23 pm »
i have a bunch of these older 167** cards that have failed, i gave up on them, i have since moved onto 169xx cards

Any chance you're looking to part with one of the dead 167** cards? I have a 16751B with a bad comparator IC and bad pod cables, waiting for a card to pull parts from.
Don't know if you worked out a deal with Hamster, but if not I can give you some comparators for postage.  PM me if interested. 

Unfortunately I don't have any extra pod cables, but sometimes they can be repaired depending on the location of the break (assuming you're talking about the coax ribbon cable).  If you mean the flying leads, those are individually replaceable.
 

Offline aeg

  • Regular Contributor
  • *
  • Posts: 82
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #200 on: October 26, 2023, 04:52:26 am »
Unfortunately I don't have any extra pod cables, but sometimes they can be repaired depending on the location of the break (assuming you're talking about the coax ribbon cable).

Yeah, the coax ribbon cable. What's the repair technique?
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #201 on: February 12, 2024, 06:03:08 am »
Hi Everyone. I have studied this thread in detail as I'm going through all my 16717A and 16754A LA boards to fix the rail corrosion. The 16717As were quite easy to fix as it was mostly the usual trace corrosion. One had a via break, so it took some time.

On one of my 16754As, I have however found a new failure scenario, which is that the SDRAM leads next to the plastic runners are corroding to the point where there are memory errors during the pv 'dataBusTest' test. I discovered this by touching up the solder on the leads/pads to get rid of some brown "crust" and corrosion around the pins. This made the pv 'dataBusTest' fail although the board previously had tested fine all the way up to the timing zoom calibration tests.

When removing the SDRAMs mentioned in the pv dataBusTest (the ones closest to the bottom runner) I noticed that multiple leads were extremely fragile and bent and fell off very easily during cleaning. My theory is that they were almost corroded off and that the thermal shock from the soldering iron caused some pins to crack. I checked the two SDRAMs on the opposite side on the bottom side (next to another runner) and two leads broke off there too.

I have replacement ICs on order and will see if that makes the dataBusTest and subsequent data-related issues go away.
https://www.mouser.com/ProductDetail/Alliance-Memory/MT46V16M16TG-5BMTR

Note: The replacement memory  is capable of higher performance but will behave the same when clocked at the same speed and CL.

Replacement: speed grade 5B: 133MHz to 200 MHz.
Original: speed grade 75: 100 MHz to 167MHz.

Thanks,
/John.

Pocket-Sized USB 2.0 LS/FS/HS Protocol Analyzer Model 1480A with OTG decoding.
Pocket-sized PCI Express 1.1 Protocol Analyzer Model 2500A. 2.5 Gbps with x1, x2 and x4 lane widths.
https://www.internationaltestinstruments.com
 
The following users thanked this post: oPossum, MarkL

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #202 on: February 16, 2024, 04:34:42 am »
Replacement of the corroded SDRAMs fixed the pv SDRAM related test failures.

I have found yet another failure scenario: shorted decoupling capacitors under the SDRAM controller IC. Per the attached image, there should be 51 ohm across the resistor. If it is 3-6 ohm or so, then check the nearby decoupling capacitors for shorts. Check resistance from the pads of the resistor that measures a few ohms to the nearby decoupling capacitor pads. If there is a dead short on both sides (resistor to decap), you know it's on the same power rail. Then remove one capacitor at a time until the resistor short goes away. Confirm by measuring the resistance of the decoupling capacitor. I had one at 3 ohm and one at 6 ohm. I replaced four on two separate boards and they were all 100 nF 0603's.

Note: If all the SDRAMs on the same side of the board fail 'x dataBusTest', then suspect shorted power rails like above.
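
The remove-one-at-a-time procedure above can be written down as a small loop. This is only a sketch of the bookkeeping: the 10 ohm "suspect" threshold is an assumption (the post only says 51 ohm is good and 3-6 ohm is bad), the capacitor names are hypothetical, and `FakeBoard` stands in for the real measurements with a DMM.

```python
SUSPECT_BELOW_OHM = 10.0  # assumption: anything well under the nominal 51 ohm


def locate_shorted_caps(caps, measure_resistor_ohm, remove_cap,
                        suspect_below=SUSPECT_BELOW_OHM):
    """Remove capacitors one at a time until the resistor reads normally.

    Returns the list of capacitors that were removed. Each one should then
    be measured on its own to confirm which are actually shorted.
    """
    removed = []
    for cap in caps:
        if measure_resistor_ohm() >= suspect_below:
            break  # reading has recovered, stop pulling parts
        remove_cap(cap)
        removed.append(cap)
    return removed


class FakeBoard:
    """Stand-in for the bench: reads a few ohms while a shorted cap is fitted."""

    def __init__(self, fitted, shorted):
        self.fitted = set(fitted)
        self.shorted = set(shorted)

    def measure_resistor_ohm(self):
        return 4.0 if self.fitted & self.shorted else 51.0

    def remove_cap(self, cap):
        self.fitted.discard(cap)


caps_near_resistor = ["C1", "C2", "C3", "C4"]  # hypothetical designators
board = FakeBoard(caps_near_resistor, shorted={"C2"})
removed = locate_shorted_caps(caps_near_resistor,
                              board.measure_resistor_ohm,
                              board.remove_cap)
print("removed:", removed)
```

Good capacitors pulled along the way (C1 in the fake example) can go back on once the bad one is found.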
« Last Edit: February 16, 2024, 04:36:35 am by John_ITIC »
 
The following users thanked this post: oPossum, MarkL, jemotrain

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #203 on: February 16, 2024, 05:22:20 pm »
That's an interesting failure on the capacitors.  Thanks for posting the troubleshooting procedure.

How did you figure it out?

Also interesting that there are a lot fewer of those capacitors under the other ASIC (U126).  Almost all of the channel acquisition circuitry on these boards is duplicated with respect to the left and right, including the resistor you're pointing out.  Just wondering why one side wouldn't need the capacitors.
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #204 on: February 17, 2024, 05:29:20 am »
Regarding the bad capacitors: I found some bad ones that were not near any plastic runner, so it's likely they went bad (a few ohms series resistance) over time for other reasons. I essentially checked every resistor and capacitor for reasonable values and also compared with a good 16754A board.

I am now dealing with a failing anlyBusTest. The verbose pv test log is attached.

The 16754A service manual says:

"Analyzer Chip Memory Bus Test.
The purpose of this test is to check the Analysis chip memory busses that go between the Analysis chips and the SDRAM controller FPGAs."


So, I assume this test is utilizing the SDRAM controller to read data from the SDRAMs. As the dataBusTest, addrBusTest and other SDRAM related tests have already succeeded, we can rule out bad SDRAMs. The anlyBusTest.txt file shows that the expected data mostly appears some 16 cycles after it is expected. In synchronous logic, this can only be explained by one piece of logic running at a lower clock frequency than intended.
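
A quick way to confirm a fixed lag like this is to slide the expected pattern along the observed data and look for the shift where they line up. A minimal sketch, assuming the expected and observed words have already been pulled out of the verbose log into two lists of integers by hand (the log format itself is not assumed here, and the data below is made up):

```python
def find_delay(expected, observed, max_delay=64):
    """Return the smallest shift at which `expected` appears in `observed`.

    Returns None if no shift up to max_delay gives an exact match.
    """
    for delay in range(max_delay + 1):
        window = observed[delay:delay + len(expected)]
        if len(window) == len(expected) and window == expected:
            return delay
    return None


# Made-up example data: the expected words arrive 16 positions late.
expected_words = list(range(100, 140))
observed_words = [0] * 16 + expected_words

delay = find_delay(expected_words, observed_words)
print("delay in cycles:", delay)
```

If the real log only mostly matches, the exact comparison would need to be relaxed to a match count, but a constant delay across all failing lines would still point at a clocking problem rather than bad data bits.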

As the clksTest is also failing, I'm suspecting that some board clock is running at a slower clock rate than expected. This could explain why the data appears later than needed. But I don't yet understand how the clock distribution works on these boards. All I know is from the service guide:

"System Clocks (J/K/L/M/Psync) Test.
The purpose of this test is to verify that the four clocks (J/K/L/M) are functional between the master board and all Analysis chips, and that the two Psync lines (A/B) are functional between the master board's Analysis chips and all Analysis chips in the module. This test verifies that the four clock lines (J/K/L/M) are driven from the master board and can be received by all Analysis chips, and that the Psync lines can be driven by each master chip on the master board and received by all other Analysis chips in the module."


I know, from reading earlier posts, that there had been some prior faults with the clock distribution or PLL chips on these boards. Perhaps that is what is going on. I will re-read to refresh the topic. But, if someone has had a similar issue, I would appreciate receiving some more bread crumbs to follow.

Thanks,
/John



« Last Edit: February 17, 2024, 05:34:25 am by John_ITIC »
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #205 on: February 18, 2024, 08:40:16 pm »
I agree it looks like a clock failure.  However, I don't think the dead PLL symptoms posted previously resulted in "Analysis Chip Data Bit" errors.  I think in your instance it looks like a data strobe failure between the FPGA (closer to the memory) and the analysis ASIC (closer to the pod connectors).

There's a hefty amount of traces between each FPGA/ASIC pair on the bottom, and I'm assuming you already checked the integrity of those traces.

Note that all the anlyBusTest failures are happening on "Chip 8", which on the 16753A/54A/55A/56A cards is the analysis ASIC associated with Pods 3+4 (U77).  There are no anlyBusTest errors happening on Chip 9, which is the other corresponding ASIC for Pods 1+2 (U76).

So, using this to your advantage, one way to tackle the issue is to compare component values with the corresponding component on the other half of the board.  I've found a number of bad (open) resistors using this method, and I guess capacitors are on the menu now thanks to your latest find.  At first pass, even if you're measuring in-circuit, all you would need to look for is anything that's wildly different from one side to the other.  The actual component value is not so important.
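
If you log the readings as you go, the side-to-side comparison is easy to automate. A minimal sketch: the labels and values below are made up, and the 3x ratio limit is an arbitrary assumption for "wildly different" with in-circuit readings. Use `math.inf` for an open reading.

```python
import math


def flag_mismatches(side_a, side_b, ratio_limit=3.0):
    """Compare in-circuit readings (ohms) for matching parts on both halves.

    Returns (label, reading_a, reading_b) for every pair that differs by
    more than ratio_limit, or where one side reads open or shorted.
    """
    flagged = []
    for label in sorted(side_a.keys() & side_b.keys()):
        a, b = side_a[label], side_b[label]
        if a == b:
            continue
        low, high = min(a, b), max(a, b)
        if low <= 0 or math.isinf(high) or high / low > ratio_limit:
            flagged.append((label, a, b))
    return flagged


# Made-up readings; labels are hypothetical, not real designators.
pods_1_2 = {"R_term": 51.0, "C_dec": 1.0e6, "R_pull": 10_000.0}
pods_3_4 = {"R_term": 4.0, "C_dec": 1.1e6, "R_pull": math.inf}

suspects = flag_mismatches(pods_1_2, pods_3_4)
for label, good, bad in suspects:
    print(f"{label}: {good} vs {bad}")
```

Only the ratio matters here, which matches the point above that the actual component value is not so important on a first pass.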

You could also set up a looping anlyBusTest and with a scope try to locate anything that looks like a clock on the working Pod 1+2 side, and compare that to the corresponding test point on the broken Pod 3+4 side.  The clock might only be active when moving data.

Your observation that it sometimes eventually gets the test pattern right after a delay may be the result of a clock that's only partially working.  The clock could be running at the wrong logic level due to a bad series or shunt termination resistor, or maybe a bad trace.  Maybe the clock is a differential pair and one leg is dead.

You could also try to force the same error to occur on the working Pod 1+2 side by grounding various traces on that side through a 10R or 20R resistor, while watching the test in a loop with debug on.  I don't know which trace(s) are clock, but I've used this method to home in on non-working individual bits that are reported in the debug output.

I would focus on the "Analyzer Chip Memory Bus Test" failures and not worry about the "System Clocks (J/K/L/M/Psync) Test" for the moment.  The latter test is probably failing as a result of the corrupted path between the FPGA and ASIC (just never say never).
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #206 on: February 19, 2024, 05:40:02 am »
Thanks Mark. Much appreciated.

I re-checked all available traces and components around U77 but could not find anything wrong. I have been in touch with DogP and it looks hopeful that I can get my hands on a few of his extender cable adapter boards so I can do further bench testing with my scope on various live signal paths.

I have four of these 16754A boards that fail the pv comparatorCalTest and dappAddrDataTest too. And I know that these are related to the timing zoom, as the GUI calibration also fails. And I get timing zoom errors when sampling with timing zoom turned on. It will be tricky to resolve with the boards inside the 16700 case so I prefer to get the proper extension cable setup.

I can confirm that the anlyBusTest failures only occur on pods 3 and 4. As an experiment, I fed 16-bit wide ripple counter data from an Altera dev kit (with 0.1" header LVCMOS outputs) to the pods via a flying leads probe. This test shows that I can actually consistently configure and sample data from all pod channels, so I know that all data paths are intact. But channels 3/4 show some odd behavior in approximately 50% of the acquisitions:

On Ch 3/4, I'm seeing lots of toggling of the bit state before it settles to the correct state. See attached images:

p515: "Bouncy" transitions on channels 3/4
p516: The same capture but zoomed in such that the "bounces" can be seen more clearly.
p517: A proper capture of channels 1/2. Half of the time, channels 3/4 look this way too.

I have to think some more about what could cause this behavior.

Thanks,
/John.
« Last Edit: February 19, 2024, 05:42:44 am by John_ITIC »
Pocket-Sized USB 2.0 LS/FS/HS Protocol Analyzer Model 1480A with OTG decoding.
Pocket-sized PCI Express 1.1 Protocol Analyzer Model 2500A. 2.5 Gbps with x1, x2 and x4 lane widths.
https://www.internationaltestinstruments.com
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #207 on: February 19, 2024, 09:43:37 pm »
I have a few more key pieces of information regarding the "bouncy" transitions:

1) The bouncy phenomenon starts when the test vectors change from all 1's to all 0's. See attached p518.
2) The trigger does not catch this when triggering on D0 positive edge to D0 positive edge less than 80ns (80ns is the D0 period).
3) The fact that all exhaustive SDRAM pv tests pass fine suggests that this is an issue with the U77 analysis IC, not with the SDRAMs or SDRAM controller.

This tells me that this is an issue with the U77 analysis IC retrieving the data from the FPGA controller IC after capture.

This looks suspiciously like ground bounce, suggesting an issue with the U77 power integrity. Perhaps bad decoupling or bulk capacitors.
https://en.wikipedia.org/wiki/Ground_bounce
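
As a quick sanity check on why the all-1's to all-0's step stands out, the sketch below counts how many bits of a 16-bit binary count rise and fall on each increment. It is only an illustration, assuming the test vectors are a plain binary count; the function name edge_counts is made up for this example and is not part of any Agilent tool.

```python
# Count rising and falling bits when a binary counter steps from count to count + 1.
# Assumption: the test vectors are a plain 16-bit binary count that wraps around.

def edge_counts(count, width=16):
    """Return (rising, falling) bit counts for the step count -> count + 1."""
    mask = (1 << width) - 1
    nxt = (count + 1) & mask
    rising = bin(~count & nxt & mask).count("1")    # bits going 0 -> 1
    falling = bin(count & ~nxt & mask).count("1")   # bits going 1 -> 0
    return rising, falling

if __name__ == "__main__":
    for value in (0x0000, 0x0001, 0x00FF, 0x7FFF, 0xFFFF):
        rising, falling = edge_counts(value)
        print(f"{value:#06x}: {rising} rising, {falling} falling")
```

The rollover 0xFFFF -> 0x0000 is the only step where all 16 bits fall together, although 0x7FFF -> 0x8000 comes close with 15 falling and 1 rising, so it may be worth checking whether that step shows the same bounce.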

I'm investigating further...

/John.

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #208 on: February 21, 2024, 08:22:28 pm »
...
I re-checked all available traces and components around U77 but could not find anything wrong. I have been in touch with DogP and it looks hopeful that I can get my hands on a few of his extender cable adapter boards so I can do further bench testing with my scope on various live signal paths.
...
I also have DogP's extenders and they work well, but before that I did the vast majority of my troubleshooting by putting the target card in slot E, turning the chassis upside down, removing the bottom cover, and safely relocating the mouse/keyboard interface card.  You can get to nearly the entire bottom of the card.  You can get a lot of left/right comparisons done this way, and if you're looking for a bad trace, it's most likely going to be on the bottom anyway.

And if you're using remote control, you can completely disconnect the mouse/keyboard card.  The OS doesn't care if it's there or not.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #209 on: February 22, 2024, 04:40:28 am »
Quote
I also have DogP's extenders and they work well, but before that I did the vast majority of my troubleshooting by putting the target card in slot E, turn the chassis upside down, remove the bottom cover, and safely relocate the mouse/keyboard interface card.  You can get to nearly the entire bottom of the card.  You can get a lot left/right comparisons done this way, and if you're looking for a bad trace, it's most likely going to be on the bottom anyway.

And if you're using remote control, you can completely disconnect the mouse/keyboard card.  The OS doesn't care if it's there or not.

Thanks Mark. I considered this after reading about your initial setup but since DogP had available extender boards (on the way to me) I'll go for the "proper" bench setup. I'm in no rush to get this system back up and running so will take my time to make my job easier (even if the initial setup takes a little longer) by having the whole board on the bench.

Otherwise, no update on this issue today.

/John.

 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #210 on: March 04, 2024, 04:42:13 am »
The update for this weekend is that I assembled DogP's adapter cards and managed to burn up both one of my 16754A cards and my 16700B mainframe! It was my fault, as I missed DogP's comment not to use 80-wire cables, which short together some 6-7 pins on each cable. So I'm putting this in here as a warning to others: Only use 40-pin connectors and 40-wire flat cables.

https://github.com/pdaderko/16702B/blob/main/card_extension/README.md

I looked into the 16700B issue and found that it does not detect any (known good) boards. The backplane does not have any sensitive electronics, so I assume it's the main interface board that somehow got damaged. I did find a replacement main interface board on eBay, so I picked that up as that would be a quick and easy fix if it resolves the issue.

Now, having learned my lesson, I tested the 16700 extension boards in another 16702A unit with 40 wire flat cables and I can report that they work fine with 21" cables. They are a little too short for my work bench arrangement so I put in an order for very long 52" cables. I think that will work well for the pv self tests as no major data is transferred over the cables. I would assume that actual data capture and upload will stress the cable interface more - I will try that out once the long cables arrive.

Finally, I attached a couple of pictures for the curious...

Thanks.
/John.
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #211 on: March 04, 2024, 05:01:48 am »
I did find a replacement main interface board on Ebay so I picked that up as that would be a quick and easy fix if it resolves the issue.

Thanks for the order though I hate giving 13% to those eBay bloodsuckers.  I have several other boards from the same 16700b for sale.  All working. PM me here.
« Last Edit: March 04, 2024, 05:14:19 am by dorkshoei »
 

Offline DogP

  • Regular Contributor
  • *
  • Posts: 95
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #212 on: March 04, 2024, 07:16:18 am »
Ouch!  I guess that's good to confirm my suspicion that damage would occur from the wrong cable... would have preferred to have kept that unconfirmed though. :-/

To track down what likely got damaged... those 80-pin cables internally short pins 2,19,22,24,26,30,40.  Pins 19 and 22 are unused on the extension, leaving pins 2,24,26,30,40.  That would cause backplane pins 2, 20, 22, 26, 36 to short together from one cable, and pins 38, 56, 58, 62, 72 to short together from the other cable.

I'm not aware of the backplane pinout being published anywhere, but simply using a voltmeter may find the obvious culprit (e.g. finding +12V, -5V, etc. which would have likely damaged a data or control signal).  Or maybe a voltage rail got shorted to another, possibly popping a fuse.  No idea if they buffered anything, or if the lines went directly to FPGAs/ASICs/etc. though.

I'd definitely be interested to hear how the long cables work out... another option is popping the bottom cover off the unit like MarkL mentioned, putting the unit on its side, and then using the extension out the opening.  That's how I tested cards on my workbench, shown in this pic: https://github.com/pdaderko/16702B/blob/main/card_extension/in_use.jpg .

DogP
 

Online MarkL

Re: Series defect on agilent 167xx boards?
« Reply #213 on: March 04, 2024, 03:04:33 pm »
...
To track down what likely got damaged... those 80-pin cables internally short pins 2,19,22,24,26,30,40.  Pins 19 and 22 are unused on the extension, leaving pins 2,24,26,30,40.  That would cause backplane pins 2, 20, 22, 26, 36 to short together from one cable, and pins 38, 56, 58, 62, 72 to short together from the other cable.

I'm not aware of the backplane pinout being published anywhere, but simply using a voltmeter may find the obvious culprit (e.g. finding +12V, -5V, etc. which would have likely damaged a data or control signal).  Or maybe a voltage rail got shorted to another, possibly popping a fuse.  No idea if they buffered anything, or if the lines went directly to FPGAs/ASICs/etc. though.
...
I've done a little poking at the backplane about a year ago, and was able to take some reasonable guesses at some of the signals by direct measurements and looking at a broken 16531A scope module.  The 16531A is thru-hole, so it was a lot easier to see where various traces were going.

Here's what I have so far.  Any pin where I didn't have a good guess is not listed.  I would think shorting the -5.2V supply did the most damage, since it's also responsible for supplying the power-hungry ECL on the modules.



From service guide, power supply has:
  +5V  -5.2V  -12V  +3.3V  +12V  -12V  -3.3V

Connector: AMP 1-534204-4 (pin 1 has a square pad)

Pin(s)  Signal
------  -----------------------
1-5     +5V
6-8     +3.3V
9-10    +12V
11-12   -12V
13-14   GND
15      D7 bi-directional (D7:D0 based on 16531A DAC)
16      D6 bi-directional
17      D5 bi-directional
18      D4 bi-directional
19      D3 bi-directional
20      D2 bi-directional
21      D1 bi-directional
22      D0 bi-directional
23-24   GND
26      from backplane
27      from backplane
28      from backplane, A1 on DAC
29      from backplane, A0 on DAC
30      from backplane, 138 both decode select C
31      from backplane, 138 both decode select B
32      from backplane, 138 both decode select A
33-34   GND
35      from backplane, R/nW, select which 138 decoder
37      from backplane, 138 both G2A not enable
43-44   GND
53-54   GND
55      100MHz system clock
56      100MHz system not clock
62      from backplane, 138 decode G1
63-67   -3.3V
68-72   -5.2V
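
For reference, the "138" entries above presumably refer to 74x138-style 3-to-8 decoders. Below is a minimal sketch of how such a decoder behaves, assuming standard 74x138 logic (outputs active low, enabled only when G1 is high and both G2 inputs are low). The function name and the idea that G2B is tied low are assumptions, not something measured on the backplane.

```python
# Behavioral model of a 74x138-style 3-to-8 decoder (generic logic, not module-specific).
# Assumption: G2B is tied low unless stated otherwise.

def decode_138(a, b, c, g1, g2a_n, g2b_n=0):
    """Return the eight active-low outputs [Y0..Y7] as a list of 0/1 values."""
    enabled = g1 == 1 and g2a_n == 0 and g2b_n == 0
    selected = (c << 2) | (b << 1) | a   # C is the most significant select bit
    return [0 if enabled and i == selected else 1 for i in range(8)]

if __name__ == "__main__":
    # Select output 5 (C=1, B=0, A=1) with the decoder enabled.
    print(decode_138(a=1, b=0, c=1, g1=1, g2a_n=0))
    # With G1 low the decoder is disabled and every output stays high.
    print(decode_138(a=1, b=0, c=1, g1=0, g2a_n=0))
```

If pin 62 (G1) really was pulled to -5.2V by the cable short, a decoder wired like this would have been held disabled at best, which may be one reason no cards were detected.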
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #214 on: March 05, 2024, 02:35:32 am »
Quote
Thanks for the order though I hate giving 13% to those eBay bloodsuckers.  I have several other boards from the same 16700b for sale.  All working. PM me here.

PM sent.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #215 on: March 05, 2024, 02:52:04 am »
Quote
That would cause backplane pins 2, 20, 22, 26, 36 to short together from one cable, and pins 38, 56, 58, 62, 72 to short together from the other cable.

Quote
I've done a little poking at the backplane about a year ago, and was able to take some reasonable guesses at some of the signals by direct measurements and looking at a broken 16531A scope module.  The 16531A is thru-hole, so it was a lot easier to see where various traces were going.

Here's what I have so far.  Any pin where I didn't have a good guess is not listed.  I would think shorting the -5.2V supply did the most damage, since it's also responsible for supplying the power-hungry ECL on the modules.

Thank you both. I will eventually make an attempt to fix the shorted 16754A. But first, I'll take a stab at my three other 16754A boards that all fail the RC comp and Tap delay calibration. I have one good 16754A board that I want to compare them with live on the bench.

So here's a summary of your information:

Pin(s)  Signal
------  -----------------------
1-5     +5V
6-8     +3.3V
9-10    +12V
11-12   -12V
13-14   GND
15      D7 bi-directional (D7:D0 based on 16531A DAC)
16      D6 bi-directional
17      D5 bi-directional
18      D4 bi-directional
19      D3 bi-directional
20      D2 bi-directional   <= shorted to +5V (pin 2)
21      D1 bi-directional
22      D0 bi-directional   <= shorted to +5V (pin 2)
23-24   GND
26      from backplane      <= shorted to +5V (pin 2)
27      from backplane
28      from backplane, A1 on DAC
29      from backplane, A0 on DAC
30      from backplane, 138 both decode select C
31      from backplane, 138 both decode select B
32      from backplane, 138 both decode select A
33-34   GND
35      from backplane, R/nW, select which 138 decoder
36      UNKNOWN            <= shorted to +5V (pin 2)
37      from backplane, 138 both G2A not enable
38      UNKNOWN            <= shorted to -5.2V (pin 72)
43-44   GND
53-54   GND
55      100MHz system clock
56      100MHz system not clock   <= shorted to -5.2V (pin 72)
58      UNKNOWN          <= shorted to -5.2V (pin 72)
62      from backplane, 138 decode G1   <= shorted to -5.2V (pin 72)
63-67   -3.3V
68-72   -5.2V
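
The cross-reference above can be reproduced with a few lines of Python. This is only a sketch: the pinout is MarkL's partial best guess, the shorted groups are the backplane pins DogP listed, and the helper names are invented for this example.

```python
# Cross-reference shorted backplane pins against the partial (guessed) pinout.
# Each entry is (first_pin, last_pin, signal).
PINOUT = [
    (1, 5, "+5V"), (6, 8, "+3.3V"), (9, 10, "+12V"), (11, 12, "-12V"),
    (13, 14, "GND"), (23, 24, "GND"), (33, 34, "GND"), (43, 44, "GND"),
    (53, 54, "GND"), (55, 55, "100MHz system clock"),
    (56, 56, "100MHz system not clock"), (62, 62, "138 decode G1"),
    (63, 67, "-3.3V"), (68, 72, "-5.2V"),
]
# D7..D0 sit on pins 15..22.
PINOUT += [(15 + i, 15 + i, f"D{7 - i}") for i in range(8)]
# Control lines driven from the backplane (function only partly known).
PINOUT += [(p, p, "from backplane") for p in (26, 27, 28, 29, 30, 31, 32, 35, 37)]

# Backplane pins shorted together by each 80-wire cable.
SHORTED_GROUPS = [[2, 20, 22, 26, 36], [38, 56, 58, 62, 72]]

def signal_at(pin):
    """Look up the guessed signal name for a pin, or UNKNOWN if not listed."""
    for first, last, name in PINOUT:
        if first <= pin <= last:
            return name
    return "UNKNOWN"

def is_rail(name):
    """True for supply names such as +5V or -5.2V."""
    return name[0] in "+-" and name.endswith("V")

def shorted_to_rail(group):
    """Return (rail_pin, rail_name, [(pin, signal), ...]) for one shorted group."""
    rail_pins = [p for p in group if is_rail(signal_at(p))]
    if not rail_pins:
        raise ValueError("no supply rail in this group")
    victims = [(p, signal_at(p)) for p in group if p not in rail_pins]
    return rail_pins[0], signal_at(rail_pins[0]), victims

if __name__ == "__main__":
    for group in SHORTED_GROUPS:
        rail_pin, rail, victims = shorted_to_rail(group)
        for pin, name in victims:
            print(f"pin {pin:2d} ({name}) shorted to {rail} (pin {rail_pin})")
```

As more pins get identified, adding them to PINOUT updates the report without touching the rest.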

I also found another two 16754A boards sitting in a stored away 16702A so I will need to fix the rail corrosion on those too.

This will all take a lot of time...

Thanks,
/John.
 

Offline dorkshoei

Re: Series defect on agilent 167xx boards?
« Reply #216 on: March 05, 2024, 02:52:57 am »
Quote
Thanks for the order though I hate giving 13% to those eBay bloodsuckers.  I have several other boards from the same 16700b for sale.  All working. PM me here.

PM sent.

If anyone else wants parts from the 16700B that I put on eBay (the one John_ITIC bought the interface PCB from), PM me.   I'll happily offer a 10% discount for direct sales.

 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #217 on: March 09, 2024, 03:31:50 am »
John_ITIC bought the interface pcb

I can report that the replacement main interface board made my burnt out 16700B work again. Thank you!

I also have made another interesting observation:

Having moved 16754A boards between various systems (16700B and 16702A), I noticed that some 16754A boards fail more pv self tests in some 16700 systems than in other 16700 systems.

I also noticed that some 16754A boards could not handle even 17" long extension cables without starting to fail the dataBusTest, hwMemoryCellTest and some others. However, other 16754A boards worked fine with 40" extension cables. I have not managed to get any 16754A board to pass the self tests with 52" extension cables, however.

I suspect that poor power integrity on some board/16700 combinations is causing the memory test failures. The search goes on.

/John.
« Last Edit: March 09, 2024, 05:47:31 am by John_ITIC »
 

Offline dorkshoei

Re: Series defect on agilent 167xx boards?
« Reply #218 on: March 09, 2024, 04:20:59 am »
John_ITIC bought the interface pcb

I can report that the replacement main interface board made my burnt out 16700B work again. Thank you!

Great.

Please post if you get the old board repaired
 

Online MarkL

Re: Series defect on agilent 167xx boards?
« Reply #219 on: March 09, 2024, 05:00:37 am »
FWIW, I've been using DogP's extenders with a 16" cable without any issues.  So far I've only used it with a 16752A card.

I added some connectors in the middle of the extender cables to probe bus power and signals more easily.  The red marks are positive supplies, blue are negative, and black are grounds.  Unused wires in the ribbon are covered with red tape.

The ribbon is also slit in one area, with the wires being grouped into each supply rail.  This allows me to measure the current on each rail by passing the group through a current probe.

Being paranoid, I put the bus numbering everywhere.  And given the number of fans in the chassis, I felt that a fan blowing over the heatsinks while the card is away from home would make it feel less homesick.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #220 on: March 11, 2024, 05:16:30 am »
The updates for this weekend:

1) I have managed to get the 52" cables to work. I found that there is too much resistance (or inductance) in the few GND leads in the long flat cables. The trick with using longer cables is to use additional lab cables to create additional GND paths between the 16700 system and the board under test on the bench. I used four additional regular lab cables between the 16700 chassis and the 16754A and managed to pass all self-tests of my 16754A board. Note that these boards are designed to get additional grounding via the rear panel. While working on the boards, I have removed this panel so very little ground return path remains via the few pins in the backplane connector.

2) A 16754A board that previously passed all memory tests suddenly failed them again.  See attached p554. pv told me that the following SDRAM ICs were bad: U91, U93, U95, U97 and U92, U94, U96, U98 as well as U89. These are all the SDRAMs on one of the two SDRAM controller FPGAs/ASICs. I traced this down to a shorted power rail on that SDRAM controller. As on another board, one of the 100nF decoupling capacitors had developed an internal short (< 4 ohm), which collapsed that power rail. See attached p560b and p560c. The short was in the bottom-most capacitor in image p560, although that cap looked good. The ones above it looked crusty and corroded, so I swapped them out too, although they measured okay.

3) I now get an intermittent cmpTest (and vOffsetTest) error on this board. See p561, p563, p564. I'm not sure why this pops up now as I never have seen that before. I will have to review under microscope again to see if I can spot anything unusual. And, I have to look into where the DACs are located...
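
To put rough numbers on item 1), here is a back-of-the-envelope estimate of the DC drop in the ground return of a flat cable. Everything numeric is an assumption: 28 AWG ribbon at about 0.213 ohm per metre, a hypothetical 10 ground wires shared across the cables, and a hypothetical 5 A return current. Inductance is ignored, so this only covers the resistive part.

```python
# Rough resistive drop in the ground return of a ribbon cable.
# All numbers are illustrative assumptions, not measurements from the 16700.
OHMS_PER_METRE_28AWG = 0.213   # assumption: typical 28 AWG copper conductor
INCH_TO_METRE = 0.0254

def ground_return_drop(length_in, n_ground_wires, current_a,
                       ohms_per_metre=OHMS_PER_METRE_28AWG):
    """Voltage drop (V) across n parallel ground wires of the given length."""
    per_wire_ohms = ohms_per_metre * length_in * INCH_TO_METRE
    return current_a * per_wire_ohms / n_ground_wires

if __name__ == "__main__":
    for length in (17, 21, 40, 52):
        drop_mv = 1000 * ground_return_drop(length, n_ground_wires=10, current_a=5.0)
        print(f'{length}" cable: about {drop_mv:.0f} mV of ground shift')
```

The drop scales linearly with cable length and inversely with the number of parallel ground paths, which is at least consistent with the extra lab cables helping.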

So, some issues have been resolved but new issues have arrived. Never a dull moment...

Thanks,
/John.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #221 on: March 12, 2024, 03:08:53 am »
It seems I have a bad comparator. The (attached) pv cmpTest and vOffsetTest outputs told me that there is an unexpected voltage on the 2nd comparator on pod 1. I measured across the input shunt resistors and found one that measured 1.5KOhm across while all the others were 4.38K in-circuit. I removed two of the resistors (both measured 20K out-of-circuit) and I found that the input resistance to ground on the bad channel was 435R rather than 4K like on the neighboring good channel.
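
The in-circuit readings can be turned into a rough estimate of what sits in parallel with each 20K resistor. This is a small sketch, assuming the meter reading behaves like a simple parallel combination (which is only approximately true with active parts in circuit); the function name is made up for this example.

```python
# Back out the resistance that must be in parallel with a known resistor
# to produce a given in-circuit reading: R_meas = R_known || R_hidden.

def hidden_parallel_resistance(r_known, r_measured):
    """Return R_hidden such that r_known in parallel with R_hidden equals r_measured."""
    if not 0 < r_measured < r_known:
        raise ValueError("in-circuit reading must be below the known resistor value")
    return r_known * r_measured / (r_known - r_measured)

if __name__ == "__main__":
    good = hidden_parallel_resistance(20_000, 4_380)   # good channels
    bad = hidden_parallel_resistance(20_000, 1_500)    # suspect channel
    print(f"good channel: about {good:.0f} ohm in parallel")
    print(f"bad channel:  about {bad:.0f} ohm in parallel")
```

With the figures above, this works out to roughly 5.6K for a good channel and roughly 1.6K for the bad one, so the extra loading is clearly on the comparator side and not in the resistor itself.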

I recall that others have reworked the comparators before but I don't think it was posted how to get the heat sink off. Or, what to use to put it back once the board has been reworked.

I may be able to heat it sufficiently with the heat sink attached as I have a rather hefty rework machine, with a bottom pre-heater:
https://www.zeph.com/bgarework_stations_systems_qfn_smd_hot_air_repair.htm

The ZT-7 rework machine has a shroud that will completely enclose the heat sink, so the heat will not be able to escape when desoldering.

I have my burnt out 16754A that I could borrow the comparator from. Or, I could order a new comparator as I believe it was mentioned in this thread that they are still available for purchase. I will go back and re-read to see if I can find that post.

Thanks,
/John.

« Last Edit: March 12, 2024, 03:16:09 am by John_ITIC »
 

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #222 on: March 12, 2024, 04:05:37 am »
I thought the comparators were the smaller Agilent chips by the pod inputs without heatsinks?  I posted some 16700/16751 cards for sale if anything is useful.
Arcade Board Repair Guru.  [ twitch: HammysHangout , youTube: Hammy Builds ]
 

Online MarkL

Re: Series defect on agilent 167xx boards?
« Reply #223 on: March 12, 2024, 02:45:00 pm »
...
I recall that others have reworked the comparators before but I don't think it was posted how to get the heat sink off. Or, what to use to put it back once the board has been reworked.

I may be able to heat it sufficiently with the heat sink attached as I have a rather hefty rework machine, with a bottom pre-heater:
https://www.zeph.com/bgarework_stations_systems_qfn_smd_hot_air_repair.htm
...
I've done rework on the comparators and have never been able to get the heatsink off.  I just moved the whole thing.  I used a pre-heater, as you are suggesting.  Below is a photo of what's underneath if it helps.

I thought the comparators were the smaller Agilent chips by the pod inputs without heatsinks?  I posted some 16700/16751 cards for sale if anything is useful.
That's on other cards (40-pin cards).  The 16753A/54A/55A/56A and 16760A (90-pin cards) have a very different front end with differential input comparators.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #224 on: March 12, 2024, 06:50:10 pm »
Thank you all for the help. I would be interested in purchasing a parts 16754A so I can get a replacement comparator as well as a replacement U128 Xilinx XC2V100 FPGA, which is used to implement the bus interface. I have confirmed that mine got damaged when using those 80-lead extension flat cables. U128 gets very hot when the damaged board is powered up. All power regulation otherwise appears fine on that board.

I have done some BGA removal, reballing and re-assembly in the past, so there is a good chance I can get the board up and running by swapping out the FPGA. There is no programming involved, as the FPGA's configuration image is stored in the PLCC XC18V00 PROM, which is next to the FPGA.

Based on this thread, there should be quite a few piles of 16754A boards where the repair attempts have been given up. Please PM me if anyone is interested in selling me a parts board.

Thanks,
/John.
« Last Edit: March 12, 2024, 10:19:18 pm by John_ITIC »
 

Offline Hamster

Re: Series defect on agilent 167xx boards?
« Reply #225 on: March 12, 2024, 08:30:33 pm »
Any of these would work - 16753A, 16754A, 16755A, 16756A

 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #226 on: March 13, 2024, 03:03:00 am »
I successfully transplanted a comparator from my burnt 16754A and I now have one more (almost) good board. The below failures still persist.

Per the attached image, I have three 16754As (four with the burnt board) that all fail the final 'comparatorCalTest' and 'dappAddrDataTest' tests.
'comparatorCalTest' is the same test that is made in the GUI 'calibration' tab and it fails 'RC compensation' and 'tap delay' in various combinations for the different boards.

The service manual self-test description says:

Comparator Calibrations Test.
The purpose of this test is to verify that each of the comparator one-time calibrations can successfully be performed. This verifies that all of the calibration circuitry and components are within the tolerance limits required for proper calibration. This test is executed only if all probes are detached.

Timing Zoom Memory Addr/Data Test.
This test verifies connectivity of components within the analysis chip. It verifies that the address, data, and clock lines of the timing zoom circuitry is correct.

I have one good 16754A board that passes these tests too. This is what I wanted to use DogP's extender boards for. I now have a way to do comparative measurements. But, I'm not sure where to start to look. I have a hunch that these remaining errors are not related to trace corrosion.

Thanks,
/John.

 

Online MarkL

Re: Series defect on agilent 167xx boards?
« Reply #227 on: March 13, 2024, 03:38:51 pm »
I have several cards with the RC failure and have not been able to figure it out.  It's been a while since I looked at it, but as I recall, swapping comparators was not the answer.

Since the test says that the pods must be disconnected, perhaps start looking on the pod side of the comparator.  On the pod side, underneath is an OP184 and a mystery SOT23-5 for each comparator.  The SOT23-5 is marked AAAG, and by way of how it's connected, one guess is that it might be a MAX4516 analog SPST switch.  They're all ganged together, perhaps turned on when in test or calibration mode.

Anyway, some comparator to comparator comparisons on pod side components might turn up something interesting.  I don't think I got that far.  It can be frustrating trying to fix these cards and I had to put them aside to get other things done.
 

Offline John_ITIC

Re: Series defect on agilent 167xx boards?
« Reply #228 on: March 20, 2024, 03:12:06 am »
I have done some more research on the "RC Comp" and "Tap Delay" failures.

The 16754A service manual is a bit fuzzy on the details:

"Comparators.
The comparators are differential input/differential output devices that interpret incoming data and clock signals as either high or low. A threshold voltage provided by an internal digital-to-analog-converter (DAC) is coupled to the negative side of the differential signal through a precision resistor. Alternatively, this voltage can be provided to the data channels by a user supplied threshold line in the probe cables. There are separate internal DAC driven thresholds for the data and clock in each pod.

In order to achieve performance, an extensive calibration is performed on each comparator when the board is manufactured and the results of this calibration are stored as Calibration Constants in non-volatile memory on the logic analyzer board. These constants are loaded into the comparators at power on."

The 16760A service manual is phrasing this differently:

"Comparators.
The comparators are differential input/differential output devices that interpret the incoming data and clock signals as either high or low. Threshold voltage, programmed by the user through the user interface, is set by a digital-to-analog converter (DAC) coupled to the negative side of the differential signal through a precision resistor. There are separate DAC-driven threshold voltages for the data and for the clock. In addition, the comparator contains a diode in which the junction temperature is monitored to ensure the module is being properly cooled.

Much of the performance optimization for the module is accomplished by the comparators, including channel delay setting (EyeFinder), programming of input resistance, and frequency compensation adjustment.

Module operation such as state clock modes and configuration are also done by the comparators. A digital-to-analog convertor (DAC) provides the module threshold voltage for single ended operation. The voltage at the DAC outputs are buffered to prove sufficient line drive. An analog switch is used to channel either the module threshold voltage from the DAC or the threshold voltage input from the system under test to the comparators."

So, it looks like the comparators themselves handle the adjustments to the "Input R", "OS Null", "RC Comp" and "Tap Delay". Could it be that the comparators are "aging" and that this makes the SW unable to bring them back into spec via the available soft adjustments?

I played around with the 'vp' debug GUI (started via './vp -debug 255' from the shell) and I was able to "fudge" the various settings until the "RC Comp" and "Tap Delay" calibration passed. See attached p579.

I am, however, not certain that such a "fudged" calibration will actually work correctly. Perhaps this just makes the test pass but the H/W may still be out of spec. I know that the 16754A service manual talks about timing zoom performance validation via external pulse generator. I will study this topic some more.

I will also make an attempt to replace the FPGA on my burnt board (the one that was connected via 80-lead flat cables). I found what I believe is a direct replacement on eBay. It is one speed grade faster, but that should not matter, as it is usually okay to go from a slower device to a faster one, but not the other way around.

https://www.ebay.com/itm/145127678615

If the FPGA swap works, then I will have to find a replacement comparator. I may take it from my other 16754A that has "bouncy waveforms" after FFFF to 0000 transitions, as at this point I'm not too hopeful of fixing that issue...

I will post a couple of IR images in the next post as the web site does not allow more images to be attached...

Thanks,
/John.



« Last Edit: March 20, 2024, 03:15:08 am by John_ITIC »
Pocket-Sized USB 2.0 LS/FS/HS Protocol Analyzer Model 1480A with OTG decoding.
Pocket-sized PCI Express 1.1 Protocol Analyzer Model 2500A. 2.5 Gbps with x1, x2 and x4 lane widths.
https://www.internationaltestinstruments.com
 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #229 on: March 20, 2024, 03:14:09 am »
I also attached a couple of images from my thermal imaging camera, in which the FPGA can be seen to be very hot after having been shorted out. I could not find any other parts on the board that were unusually hot when comparing with a good board. If the FPGA replacement works then I will have to put back the comparator that I borrowed for another board.

Step by step...

/John.

 

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #230 on: March 26, 2024, 07:05:55 pm »
My burnt 16754A came back from the dead after I replaced the FPGA. So, that is the solution if anyone makes the same mistake I did (connect via 80 pin flat cables).

I have noticed that the board-board interconnect connectors are also corroded on the bottom side. See attached images. After scraping off the green corrosion, I can see that the gold flashing is gone and a brown discolored surface is underneath. I suspect this will not make good contact so, in order to do a proper refurbishment, it would be nice to replace these connectors.

Does anyone happen to know the part number of these connectors?

The flex connector cable that goes between the boards has HP/Agilent part number '16754-61601' (picture attached). The Part # 16754-60002 is the set of two.

Thanks,
/John.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #231 on: March 26, 2024, 07:38:34 pm »
I believe it is 3M Pak 8 Plug Connectors and Pak 8 Socket Connectors (previously Robinson Nugent, acquired by 3M).  Board side here:

  https://www.3m.com/3M/en_US/p/d/b30000132/

The part number looks like it should be P08-100-SLxx-A-G, where the xx depends on packaging and vacuum pickup options (see datasheet).  It's not in the datasheet as a contact quantity option, but it does show up as an obsolete part.

I've used a fine SS brush to get most of the corrosion off, and sometimes had to pick off the remaining pieces with a needle.  But that unfortunately took a fair amount of gold plating with it, as you also experienced.  However, the connectors still seem to work ok.  Longevity may be an issue.

They can also be stolen from the top side of dead boards where they are usually corrosion-free.  The heating cycle needs to be carefully controlled as they are easy to deform (see datasheet for process rating).  I would use Chip Quik or other low melting point alloy to get them off.
 
The following users thanked this post: John_ITIC

Offline Hamster

  • Regular Contributor
  • *
  • Posts: 115
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #232 on: March 27, 2024, 12:59:34 am »
Tarn-X may help remove the corrosion as well.
Arcade Board Repair Guru.  [ twitch: HammysHangout , youTube: Hammy Builds ]
 
The following users thanked this post: John_ITIC

Offline John_ITIC

  • Frequent Contributor
  • **
  • Posts: 514
  • Country: us
  • ITIC Protocol Analyzers
    • International Test Instruments Corporation
Re: Series defect on agilent 167xx boards?
« Reply #233 on: April 04, 2024, 11:05:15 pm »
MarkL was kind enough to send me a few spare comparators for my "burnt" 16754A board. I soldered one in but that board still fails the adcTest (pv shows no response from the ADC [U54, bottom side, TLC2543C]). I put the scope on it and there is no activity from the host while running the adcTest. I also swapped U54 with another board and the error stays with my "burnt" board. As a test, I also swapped the U144 (LVT16245A), as I figured this line driver might be involved in the host communication. Still no improvement.

So, there is a good possibility something else got damaged on this board when I used those 80 lead flat cables. Perhaps some internal trace got burnt off? Hard to tell.

So, at this point, I have seven working 16754A's, one working 16756A and two bad 16754As. I will upgrade the 16754A's to 16756A's via removal of resistor B3 (image attached).

Of the seven working 16754A boards, most are failing the RC comp and tap delay tests (timing zoom). I will have to continue that research at a later time as I'm sort of stuck at that issue.

At least I can be sure that no further deterioration will happen due to rail corrosion so I can push further investigation into the future. And I have more boards than I need, really.

I now have multiple TDS5xx, TDS6xx and TDS7xx scopes I need to attend to: the usual recapping and saving of battery-backed SRAM, plus an upgrade to new SRAM modules. I'd better get to that before it is too late.

Thanks,
/John.
 

Online FrodeM

  • Newbie
  • Posts: 7
  • Country: no
Re: Series defect on agilent 167xx boards?
« Reply #234 on: April 18, 2024, 07:16:31 pm »
I just thought this might be worthwhile to mention..

Got a 16534A today, with crusty rails. Fortunately, no matter how bad it looked, the card passed the GUI selftests and after removing the gunk it seems like all traces are indeed intact (although small discoloration on two and ). However, one thing I noticed is that there was a decent amount of corrosion on the underside bracket for the heatsink mounts.

Reading through various threads, there are some suggestions that the corrosion is caused by moisture collecting. Given my observation, and the nature of the corrosion, I would rather think that the cause here is fumes released by the rubbery tape as it decomposes, in combination with moisture in the air. That would also explain why the corrosion in some worst-case examples is able to penetrate vias and even cause corrosion across a significant area of the top side.

On that note, it would at some point be great to figure out what precisely causes this corrosion, and possibly whether it can be properly neutralized as part of the repair process (some claim it may reappear after some time if just washed with IPA). I know battery-leak corrosion on old motherboards is often handled with vinegar, but the chemistry for that is pretty well known. I don't want to try anything like that here unless I know it works and is safe for the boards.
« Last Edit: April 18, 2024, 07:34:26 pm by FrodeM »
 

Offline dorkshoei

  • Frequent Contributor
  • **
  • Posts: 499
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #235 on: April 18, 2024, 08:37:20 pm »
I just thought this might be worthwhile to mention..

Got a 16534A today, with crusty rails. Fortunately, no matter how bad it looked, the card passed the GUI selftests

You need to see if it will pass calibration using the T cable described in the manual. I have one that passes self-test but totally fails the calibration procedure with a fatal error.
 

Online MarkL

  • Supporter
  • ****
  • Posts: 2131
  • Country: us
Re: Series defect on agilent 167xx boards?
« Reply #236 on: April 19, 2024, 12:24:57 am »
I have a bunch of 16534A cards, and almost all of them had corrosion on the mounting bracket for the ADC.  I removed the bracket, removed the corrosion with a wire brush, and sealed the bracket with clear insulating varnish (Sprayon EL600).  And of course removed the plastic runners and their evil adhesive from the board.

It would be great to discover the actual chemical reaction that's happening so it can be neutralized properly.

And 100% agree with dorkshoei.  I've had self-test pass but calibration fail on a number of cards.  A couple of the boards had out of spec rail voltages from the local voltage regulators because of bad output voltage setting resistors.  Replacing the resistors fixed the rail voltages and allowed them to pass calibration.
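To illustrate why a drifted setting resistor moves a rail: on a typical adjustable linear regulator the output is set by a two-resistor divider, roughly Vout = Vref * (1 + R2/R1). The sketch below is only an illustration under assumptions. The regulator type, the 1.25 V reference and the resistor values are all hypothetical and not taken from the 16534A schematic, and the small adjust-pin current is ignored.

```python
# Hypothetical example: adjustable regulator with a two-resistor divider.
# None of these values come from the 16534A; they are placeholders.

def regulator_vout(r1_ohms, r2_ohms, vref=1.25):
    """Approximate output voltage: Vout = Vref * (1 + R2/R1)."""
    return vref * (1.0 + r2_ohms / r1_ohms)


def percent_error(actual, target):
    """Signed deviation of actual from target, in percent."""
    return 100.0 * (actual - target) / target


if __name__ == "__main__":
    r1 = 240.0            # assumed value
    r2_good = 720.0       # assumed value, gives a 5.0 V rail
    r2_drifted = 792.0    # same resistor after a hypothetical +10% drift

    v_good = regulator_vout(r1, r2_good)
    v_bad = regulator_vout(r1, r2_drifted)
    print(f"healthy rail: {v_good:.3f} V")
    print(f"drifted rail: {v_bad:.3f} V "
          f"({percent_error(v_bad, v_good):+.1f}%)")
```

With these made-up numbers a 10% drift in one resistor shifts the rail by several percent, which may be enough to fail a calibration while the self-test still passes. Measuring the actual rails against the service manual values is the real check.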
 

Online FrodeM

  • Newbie
  • Posts: 7
  • Country: no
Re: Series defect on agilent 167xx boards?
« Reply #237 on: April 19, 2024, 08:36:07 pm »
I spoke too soon: after removing the rails it no longer passes the selftest.

Today I went over it with flux, a soldering iron and a microscope. Three of the thin lines that just looked discolored at one spot ended up disintegrating, so I had to fix that. I also reflowed most components with visible corrosion on the solder pads. Two vias had corrosion in them, but one was connected to one planar layer and seems to make connection. For the other I found fresh metal before the in-board junction.

I will test it tomorrow, since I need to fix the power supply of the analyzer again. Although the grid voltage should be 230V according to the power company, at night it can at times be as high as 240V. Either that is what blows the regulators, or the PSU has an inherent design defect that triggers if you turn the analyzer off and then on again only a few minutes later.
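As a rough sanity check on those mains figures, here is a small Python sketch that works out how far 240V sits above a 230V nominal. The +/-10% tolerance band used here is an assumption (it is the figure commonly quoted for 230V European mains), not something stated by the power company or in the analyzer's PSU specification.

```python
# Rough check of a measured mains voltage against nominal.
NOMINAL_V = 230.0
TOLERANCE = 0.10  # assumed +/-10% band, not taken from the PSU spec


def mains_deviation(measured_v, nominal_v=NOMINAL_V):
    """Fractional deviation of the measured voltage from nominal."""
    return (measured_v - nominal_v) / nominal_v


def within_tolerance(measured_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """True if the measured voltage lies inside the assumed band."""
    return abs(mains_deviation(measured_v, nominal_v)) <= tolerance


if __name__ == "__main__":
    for volts in (230.0, 240.0):
        dev = mains_deviation(volts) * 100.0
        status = "inside" if within_tolerance(volts) else "outside"
        print(f"{volts:.0f} V: {dev:+.1f}% from nominal, {status} assumed band")
```

Under that assumption 240V is only about 4% high and still inside the band, so the overvoltage theory would need the PSU to have very little margin. The input range printed on the PSU itself is the number that actually matters.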
« Last Edit: April 21, 2024, 05:19:05 pm by FrodeM »
 

Online FrodeM

  • Newbie
  • Posts: 7
  • Country: no
Re: Series defect on agilent 167xx boards?
« Reply #238 on: April 22, 2024, 06:53:50 pm »
Ok, put in the spare PSU. Good news is that the card passes self-test again after the repairs. Crossing my fingers for the calibration.

*Edit*
I reset to default and ran the calibration again after the screenshot, and it seems to pass fine, yay!
« Last Edit: April 22, 2024, 08:25:41 pm by FrodeM »
 

