Series defect on agilent 167xx boards?

MateKrisz:
I read this "solution" on the groups.io mailing list some weeks ago, if I remember correctly. While my LA had not yet arrived, I read some articles about my cards. I bought this one on eBay: eBay auction: #254489501566. It will arrive from England. I checked today and could not find the topic in my browser history (total chaos here, with tons of opened sites in my history); I will search for it again for you.

On the 16555D card I checked the capacitors with my HP 4274A and the ESR values were higher than usual, so I replaced them. Afterwards I re-ran the pv util, but the same tests failed. I ran it with debug; you can see the output at the bottom of the post.

Back to the 16720A. Good idea to localize which memory chips are bad on the 16720A. I would like to start with U121 and U117, and later the LVTH245. Which legs need to be lifted as a pair? I found the chip datasheet here: https://www.micron.com/-/media/client/global/documents/products/data-sheet/dram/64mb_x4x8x16_sdram.pdf
So when I lift the "chip select" leg, e.g. pin 19, do I need to lift any other legs, or just this one? Sorry, it's not clear to me.

DogP:
Like MarkL, my 16720A has always worked, so I don't have any first-hand tips for troubleshooting it.

Regarding your 16555D - it looks like maybe Chip 1's bit 1 is floating.  All of the errors shown seem to be related to bit 1 (e.g. expected: FFFF actual: FFFD, expected: 0000 actual: 0002, etc.), but not always stuck low or high.
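The single-bit reading can be checked by XORing each expected/actual pair: any bit set in the result is a bit that read back wrong. This is only a sketch using the two pairs quoted above; `differing_bits` is a made-up helper name, not part of pv.

```python
def differing_bits(expected: int, actual: int) -> list[int]:
    """Return the positions (0 = LSB) of the bits that differ between two values."""
    diff = expected ^ actual
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# The two expected/actual pairs quoted in the post above.
samples = [(0xFFFF, 0xFFFD), (0x0000, 0x0002)]

for expected, actual in samples:
    bits = differing_bits(expected, actual)
    print(f"expected {expected:04X} actual {actual:04X} -> differing bits {bits}")
```

Both pairs differ only in bit 1, once reading low when it should be high and once the other way around, which fits a floating line better than one stuck at a fixed level.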

I don't have a 16555D to get any better information for you, but looking at closeups from an ebay auction, it looks like the two main chips are U25 and U26.  So logically, my best guess would be Chip 0=U25 and Chip 1=U26.  The board appears to have the plastic runners, so definitely remove those, and look for corrosion.  It looks like a runner goes right over the traces between U26 and its RAM, so that's where I'd inspect closely first.

If you have the unit apart enough that you can access the cards during test (or have a card extender like discussed earlier in this thread), I'd try running the test while touching the pins of U26 with your finger (and maybe U25 and the RAM as well), and see if you get any noticeably different errors.  That may help pinpoint the trouble area.

DogP

MarkL:
(I haven't vanished; just preoccupied with a large number of non-electronics things...)

I had a chance to lift and hold high the DQMH and DQML legs on both U117 and U121 DRAM chips (4 "do_loopback" tests).  Results attached in the .zip below.  Each test failed within a few seconds.

The "Expected 0x55 got 0x00.." errors are exactly the same in each test even though I was messing with a different signal each time.  I was not expecting this.

The second set of errors (walking ones) were different each time, and seem to represent bits in the DRAM, which makes sense although they are not strictly in order.
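For reference, here is a small sketch of the walking-ones pattern as it appears in the do_loopback dumps posted in this thread: one set bit per byte, the same in both bytes of the word, repeating every 8 rows. This is inferred from the dumps only, not from pv itself, and `walking_ones_word` is a hypothetical name.

```python
def walking_ones_word(row: int) -> int:
    """Return the 16-bit word for a dump row: one set bit per byte, repeating every 8 rows."""
    byte = 1 << (row % 8)
    return (byte << 8) | byte


# Print the first rows in the same "address: word" layout as the dumps.
for row in range(10):
    print(f"{row:08X}: {walking_ones_word(row):04X}")
```

Comparing a dump column against this pattern gives a quick way to see which rows and bits are off.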

And as suspected, the "do_clock_test" also fails when DQMx is held high (test not shown), which says pv may be depending on a successful memory loopback test to verify the clock.

Not sure what all this means yet since your walking ones look ok, but thought I'd share.


And on the 16555D, I agree with DogP you're definitely looking at a single-bit error.  Since the expected level is not consistent, it's likely open (flapping).  Once you get the runners off and cleaned up, inspect for corrosion and broken traces.  Use sharp probes to pierce the soldermask and test continuity for all traces passing under and near the runners end-via to end-via.

If DogP's finger test doesn't show anything interesting, you can try holding down some of the signals with a resistor as described above.  Put the test that's failing into a loop and observe what fails.  You can try this looping script here:

  https://www.eevblog.com/forum/testgear/agilent-16717a-comparator-and-zoomchipseltest-failures/msg4434136/#msg4434136

The signal traces on the 1675x cards are generally in order.  I would expect the 16555D to be the same, so you will know you're getting close when the bits that fail from the resistor hold-down get closer to the bad one.

MarkL:
I've made an interesting discovery:

If the CLK pod cable is not plugged into the module (header J7), my card fails with the same do_loopback and other errors as yours.  With the CLK cable plugged in, everything passes.

On further inspection, all the pod cables (data and CLK) have pin 18 shorted to 20.  This may be providing some kind of enable to the rest of the circuitry when the cable is plugged in, but it only seems to matter for J7.  So, perhaps this is some kind of enable specifically for the clock.

All tests also pass if a jumper is installed on J7 between pins 18 and 20.

Do you have the CLK cable plugged in when you get the errors?  If so, try a jumper on J7 instead.  If you still get failures, we may need to look closer at the circuitry associated with J7 pins 18 and 20.


--- Code: ---

--- unplugged clk pod cable J7 ---

pv> x do_loopback
Enter CTRL-C to stop
0x0000000c: Expected 0x55 got 0x00
0x0000000d: Expected 0xff got 0x00
0x0000000e: Expected 0xaa got 0x00
0x0000001c: Expected 0x55 got 0x00
0x0000001d: Expected 0xff got 0x00
0x0000001e: Expected 0xaa got 0x00
0x0000002c: Expected 0x55 got 0x00
0x0000002d: Expected 0xff got 0x00
0x0000002e: Expected 0xaa got 0x00
0x0000003c: Expected 0x55 got 0x00
0x0000003d: Expected 0xff got 0x00
0x0000003e: Expected 0xaa got 0x00
0x0000004c: Expected 0x55 got 0x00
0x0000004d: Expected 0xff got 0x00
0x0000004e: Expected 0xaa got 0x00
0x0000005c: Expected 0x55 got 0x00
0x0000005d: Expected 0xff got 0x00
0x0000005e: Expected 0xaa got 0x00
0x0000006c: Expected 0x55 got 0x00
0x0000006d: Expected 0xff got 0x00
0x0000006e: Expected 0xaa got 0x00
0x0000007c: Expected 0x55 got 0x00
0x0000007d: Expected 0xff got 0x00
0x0000007e: Expected 0xaa got 0x00
0x0000008c: Expected 0x55 got 0x00
0x0000008d: Expected 0xff got 0x00
0x0000008e: Expected 0xaa got 0x00
0x0000009c: Expected 0x55 got 0x00
0x0000009d: Expected 0xff got 0x00
0x0000009e: Expected 0xaa got 0x00
0x000000ac: Expected 0x55 got 0x00
0x000000ad: Expected 0xff got 0x00
Card 0
00000000: 0101 0101 0101 0101 0101 0101 0101
00000001: 0202 0202 0202 0202 0202 0202 0202
00000002: 0404 0404 0404 0404 0404 0404 0404
00000003: 0808 0808 0808 0808 0808 0808 0808
00000004: 1010 1010 1010 1010 1010 1010 1010
00000005: 2020 2020 2020 2020 2020 2020 2020
00000006: 4040 4040 4040 4040 4040 4040 4040
00000007: 8080 8080 8080 8080 8080 8080 8080
00000008: 0101 0101 0101 0101 0101 0101 0101
00000009: 0202 0202 0202 0202 0202 0202 0202
0000000A: 0404 0404 0404 0404 0404 0404 0404
0000000B: 0808 0808 0808 0808 0808 0808 0808
0000000C: 1010 1010 1010 1010 1010 1010 1010
0000000D: 2020 2020 2020 2020 2020 2020 2020
0000000E: 4040 4040 4040 4040 4040 4040 4040
0000000F: 8080 8080 8080 8080 8080 8080 8080
00000010: 0101 0101 0101 0101 0101 0101 0101
00000011: 0202 0202 0202 0202 0202 0202 0202
00000012: 0404 0404 0404 0404 0404 0404 0404
00000013: 0808 0808 0808 0808 0808 0808 0808
00000014: 1010 1010 1010 1010 1010 1010 1010
00000015: 2020 2020 2020 2020 2020 2020 2020
00000016: 4040 4040 4040 4040 4040 4040 4040
00000017: 8080 8080 8080 8080 8080 8080 8080
00000018: 0101 0101 0101 0101 0101 0101 0101
00000019: 0202 0202 0202 0202 0202 0202 0202
0000001A: 0404 0404 0404 0404 0404 0404 0404
0000001B: 0808 0808 0808 0808 0808 0808 0808
0000001C: 1010 1010 1010 1010 1010 1010 1010
0000001D: 2020 2020 2020 2020 2020 2020 2020
0000001E: 4040 4040 4040 4040 4040 4040 4040
0000001F: 8080 8080 8080 8080 8080 8080 8080

Total of 33 errors
Mod   B: TEST FAILED       # "do_loopback" (1, 1, -1)
pv>

--- plugged in clk pod cable J7 ---

pv> x do_loopback
Mod   B: TEST passed       # "do_loopback" (2, 1, 1)
pv>

--- jumper between pins 18 and 20 on J7 ---

pv> x do_loopback
Mod   B: TEST passed       # "do_loopback" (3, 1, 1)
pv>

--- End code ---
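In the unplugged case the byte errors fall at the same three offsets (0xc, 0xd, 0xe) of every 16-address block and always read back 0x00. A rough sketch for tallying that from a saved log follows. The regex is based only on the line format shown above, and `tally_byte_errors` is a hypothetical helper, not a pv feature.

```python
import re
from collections import Counter

# Matches lines such as "0x0000000c: Expected 0x55 got 0x00".
ERROR_LINE = re.compile(
    r"0x([0-9a-fA-F]{8}): Expected 0x([0-9a-fA-F]{2}) got 0x([0-9a-fA-F]{2})"
)


def tally_byte_errors(log: str) -> tuple[Counter, Counter]:
    """Count errors per offset within a 16-address block, and per value read back."""
    offsets: Counter = Counter()
    values_read: Counter = Counter()
    for addr, _expected, actual in ERROR_LINE.findall(log):
        offsets[int(addr, 16) % 16] += 1
        values_read[int(actual, 16)] += 1
    return offsets, values_read


# First six error lines from the unplugged-cable run above.
sample_log = """
0x0000000c: Expected 0x55 got 0x00
0x0000000d: Expected 0xff got 0x00
0x0000000e: Expected 0xaa got 0x00
0x0000001c: Expected 0x55 got 0x00
0x0000001d: Expected 0xff got 0x00
0x0000001e: Expected 0xaa got 0x00
"""

offsets, values_read = tally_byte_errors(sample_log)
print("errors per block offset:", {hex(k): v for k, v in sorted(offsets.items())})
print("values read back:", {hex(k): v for k, v in sorted(values_read.items())})
```

Running the same tally on logs captured while holding different signals makes it easy to see whether the byte section changes at all between runs.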

MarkL:
A small follow-up on the J7 jumper thing...

If the jumper on J7 pin 18 to 20 is in place, and a failure is induced by holding U121.39 DQMH high, the failures reported for both sections of do_loopback begin to make more sense.

Also note do_clock_test succeeds in this case.

I'm guessing that maybe to perform byte reads and writes to SDRAM (as opposed to the full 16-bit width for the SDRAM), the clock needs to be enabled.  The first test in the do_loopback output is byte oriented, and the second set is the full memory width.  Without the clock enabled, the byte oriented output had no relation to what was actually happening with the SDRAM.  Now, with the clock enabled, SDRAM failures do have an effect for both byte and word error sections.


--- Code: ---

--- U121.39 DQMH held high, J7 18 to 20 jumpered ---

pv> x do_loopback
Enter CTRL-C to stop
0x00000001: Expected 0x00 got 0x80
0x00000002: Expected 0x00 got 0x80
0x00000003: Expected 0x00 got 0x80
0x00000004: Expected 0x00 got 0x80
0x00000005: Expected 0x00 got 0x80
0x00000006: Expected 0x00 got 0x80
0x00000007: Expected 0x00 got 0x80
0x00000008: Expected 0x00 got 0x80
0x00000009: Expected 0x00 got 0x80
0x0000000a: Expected 0x00 got 0x80
0x0000000b: Expected 0x00 got 0x80
0x0000000c: Expected 0x55 got 0xd5
0x0000000f: Expected 0x00 got 0x80
0x00000010: Expected 0x00 got 0x80
0x00000011: Expected 0x00 got 0x80
0x00000012: Expected 0x00 got 0x80
0x00000013: Expected 0x00 got 0x80
0x00000014: Expected 0x00 got 0x80
0x00000015: Expected 0x00 got 0x80
0x00000016: Expected 0x00 got 0x80
0x00000017: Expected 0x00 got 0x80
0x00000018: Expected 0x00 got 0x80
0x00000019: Expected 0x00 got 0x80
0x0000001a: Expected 0x00 got 0x80
0x0000001b: Expected 0x00 got 0x80
0x0000001c: Expected 0x55 got 0xd5
0x0000001f: Expected 0x00 got 0x80
0x00000020: Expected 0x00 got 0x80
0x00000021: Expected 0x00 got 0x80
0x00000022: Expected 0x00 got 0x80
0x00000023: Expected 0x00 got 0x80
0x00000024: Expected 0x00 got 0x80
Card 0
00000000: 0101 0101 0101 0101 0101 0101 8101
00000001: 0202 0202 0202 0202 0202 0202 8202
00000002: 0404 0404 0404 0404 0404 0404 8404
00000003: 0808 0808 0808 0808 0808 0808 8808
00000004: 1010 1010 1010 1010 1010 1010 8010
00000005: 2020 2020 2020 2020 2020 2020 8020
00000006: 4040 4040 4040 4040 4040 4040 8040
00000007: 8080 8080 8080 8080 8080 8080 8080
00000008: 0101 0101 0101 0101 0101 0101 8101
00000009: 0202 0202 0202 0202 0202 0202 8202
0000000A: 0404 0404 0404 0404 0404 0404 8404
0000000B: 0808 0808 0808 0808 0808 0808 8808
0000000C: 1010 1010 1010 1010 1010 1010 8010
0000000D: 2020 2020 2020 2020 2020 2020 8020
0000000E: 4040 4040 4040 4040 4040 4040 8040
0000000F: 8080 8080 8080 8080 8080 8080 8080
00000010: 0101 0101 0101 0101 0101 0101 8101
00000011: 0202 0202 0202 0202 0202 0202 8202
00000012: 0404 0404 0404 0404 0404 0404 8404
00000013: 0808 0808 0808 0808 0808 0808 8808
00000014: 1010 1010 1010 1010 1010 1010 8010
00000015: 2020 2020 2020 2020 2020 2020 8020
00000016: 4040 4040 4040 4040 4040 4040 8040
00000017: 8080 8080 8080 8080 8080 8080 8080
00000018: 0101 0101 0101 0101 0101 0101 8101
00000019: 0202 0202 0202 0202 0202 0202 8202
0000001A: 0404 0404 0404 0404 0404 0404 8404
0000001B: 0808 0808 0808 0808 0808 0808 8808
0000001C: 1010 1010 1010 1010 1010 1010 8010
0000001D: 2020 2020 2020 2020 2020 2020 8020
0000001E: 4040 4040 4040 4040 4040 4040 8040
0000001F: 8080 8080 8080 8080 8080 8080 8080

Total of 33 errors
Mod   B: TEST FAILED       # "do_loopback" (1, 1, -1)

pv> x do_clock_test
Mod   B: TEST passed       # "do_clock_test" (1, 0, 1)
pv>

--- End code ---
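XORing expected against actual in the output above shows why the failures now line up with the SDRAM. This is a sketch only. For the word section it assumes the last column should have matched the first column of the same row, since pv does not print an expected value there. `flipped` is a hypothetical helper name.

```python
def flipped(expected: int, actual: int) -> int:
    """Return a mask of the bits that differ between expected and actual."""
    return expected ^ actual


# Byte section: the two distinct expected/got pairs in the log above.
byte_errors = [(0x00, 0x80), (0x55, 0xD5)]
byte_masks = {flipped(e, a) for e, a in byte_errors}
print("byte section masks:", [f"{m:02x}" for m in sorted(byte_masks)])

# 0xff and 0xaa already have bit 7 set, so forcing bit 7 high changes nothing.
# That would explain why offsets 0xd and 0xe no longer appear in the error list.
hidden = [p for p in (0xFF, 0xAA) if (p | 0x80) == p]
print("patterns unaffected by bit 7 high:", [f"{p:02x}" for p in hidden])

# Word section: (first column, last column) for a few rows of the dump.
# ASSUMPTION: the first column is what the last column should have read.
word_rows = [(0x0101, 0x8101), (0x0808, 0x8808), (0x1010, 0x8010), (0x8080, 0x8080)]
word_masks = [flipped(e, a) for e, a in word_rows]
print("word section masks:", [f"{m:04x}" for m in word_masks])

# Every difference is confined to the upper byte of the word.
print("upper byte only:", all(m & 0x00FF == 0 for m in word_masks))
```

The byte section differs only in bit 7, and the word section differs only in the upper byte of the last column. On a x16 SDRAM, DQMH is the mask for the upper data byte, so both sections point at the signal that was being held.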
