Tektronix DPO 4104 kernel panic

MarkF3837:
I flashed a known-good copy of the boot code. The middle part of the console output was different, but the first part and the (failed) last part are identical to the original. The same is true when comparing old and new outputs for a forced update.

I may have to poke at address & mux lines to see if something looks funny. Another (grasping at straws) possibility: what size DDR RAM chips are in others' 'scopes? Mine has MT46V16M16; this is an early-version instrument, so maybe they later changed to MT46V32M16?

In the meantime, I'm going to try and track down the noise source in my current 1 GHz 'scope: a TDS784D. At least there are schematics for TDS5xx & TDS6xx families out in the internets.

pcwrangler:
That's the exact DRAM model my MSO4104 has. Mine were bad, so I replaced them. The whole saga is in my original thread below. It might give you ideas when you're ready to revisit your 4104.
https://www.eevblog.com/forum/repair/tek-mso4104-no-boot/

Good luck with the others.

mikehank:
Replace the CPU memory: the four gullwing chips near the processor, two on either side.
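
As a quick sanity check on the part numbers mentioned above, here is a minimal sketch of the capacity arithmetic. It assumes the organisation can be read straight from the part number suffix (16M16 = 16 Meg words x 16 bits, 32M16 = 32 Meg words x 16 bits) and that four such chips make up the CPU memory; the function names are just for illustration.

```python
# Capacity arithmetic for x16 DDR parts.
# ASSUMPTION: the suffix "16M16" means 16 Meg words x 16 bits, and
# four devices make up the CPU memory (per the post above).

MEG = 1024 * 1024


def chip_megabytes(meg_words: int, width_bits: int) -> int:
    """Capacity of one DRAM device in megabytes."""
    total_bits = meg_words * MEG * width_bits
    return total_bits // 8 // MEG


def bank_megabytes(meg_words: int, width_bits: int, chips: int = 4) -> int:
    """Total capacity of `chips` identical devices in megabytes."""
    return chips * chip_megabytes(meg_words, width_bits)


if __name__ == "__main__":
    for part, meg_words in (("MT46V16M16", 16), ("MT46V32M16", 32)):
        print(part,
              chip_megabytes(meg_words, 16), "MB per chip,",
              bank_megabytes(meg_words, 16), "MB for four chips")
```

Under those assumptions, four MT46V16M16 parts come to 128 MB, which matches the "DRAM:  128 MB" line in the DPO4054 boot log posted further down this thread. Four MT46V32M16 parts would come to 256 MB.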

analogRF:
I hope adrianh and pcwrangler still check this thread and are interested  ;) ;)

I have a DPO4054 which one day just froze, and after a power cycle it never booted again (stuck at the splash screen). All voltages and oscillators are OK and there are no unusual hot spots on the board. Here is the boot log I get:

--- Code: ---U-Boot 1.1.4 (Jan  8 2007 - 11:12:14) Tektronix, Inc. V1.06

CPU:   AMCC PowerPC 440EP Rev. C at 333.333 MHz (PLB=133, OPB=66, EBC=66 MHz)
       I2C boot EEPROM enabled
       Internal PCI arbiter enabled, PCI async ext clock used
       32 kB I-Cache 32 kB D-Cache
Board: Tektronix Route66 IBM 440EP Main Board
        VCO: 666 MHz
        CPU: 333 MHz
        PLB: 133 MHz
        OPB: 66 MHz
        EPB: 66 MHz
I2C:   ready
DRAM:  128 MB
FLASH: 64.5 MB
PCI:   Bus Dev VenId DevId Class Int
        00  13  10b5  9056  0680  18
        00  15  1002  4c59  0300  17
DISP:  Type 1
In:    serial
Out:   serial
Err:   serial
Enter password - autobooting in 3 seconds
## Booting image at f0000000 ...
   Image Name:   Linux-2.4.20_mvl31-440ep_eval
   Image Type:   PowerPC Linux Multi-File Image (gzip compressed)
   Data Size:    1441010 Bytes =  1.4 MB
   Load Address: 00000000
   Entry Point:  00000000
   Contents:
   Image 0:  1033953 Bytes = 1009.7 kB
   Image 1:   407042 Bytes = 397.5 kB
   Verifying Checksum ... OK
   Uncompressing Multi-File Image ... OK
cmdline is console=ttyS0,9600 quiet bigphysarea=519 panic=2 root=/dev/mtdblock7 rw mem=131072k
   Loading Ramdisk to 07f2b000, end 07f8e602 ... OK
Checking for firmware update...
No USB mass storage devices found to update from.
Linux 2.4.20_mvl31-440ep_eval V 1.15 Tektronix Route66 Tue Jun 22 15:19:50 PDT 2010
stat of /var/log/dmesg failed: No such file or directory
Warning: loading NiDKEng-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading NiDUsb-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading tek will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Error programming MIA, USERi stayed low 0x00000000
insmod: init_modScope application starting (normal mode)
---------------------- startScopeApp()Oops: kernel access of bad area, sig: 11
NIP: C00CBA10 XER: 20000000 LR: C003FE20 SP: C06E9E10 REGS: c06e9d60 TRAP: 0800    Tainted: P
MSR: 00009030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DEAR: CE16CA8C, ESR: 00000000
DEAR: CE16CA8C, ESR: 00000000
TASK = c06e8000[68] 'scopeApp.ppcep' Last syscall: 5
last math c06e8000 last altivec 00000000
GPR00: 00000000 C06E9E10 C06E8000 C01F184C C7DB5540 00000000 C7DB55A4 C7E98000
GPR08: C01EB430 C01F1834 C7DB54C0 00000050 28004888 1175BF40 0FF897D0 0FF89F8C
GPR16: 0FF89FB4 0FF89760 12B379E0 12B37940 00009032 C06E9F40 00000000 C00029DC
GPR24: C00026A0 FFFFFFED C719FDC0 C01F0000 C7DB5540 C01F0000 000000F0 CE16CA8C
Call backtrace:
C7EAF008 C003FE20 C003E774 C003E820 C003EA60 C000272C 0FF78720
10CB1D1C 10CB19C0 10CB45EC 10CB5310 10CB4F74 10CB6AAC 10CC5104
10CC5D64
 running Init code ----------------------
versionBuildFWVersionString(): TimestampString: 25-Apr-12  11:13
                               VersionFIRMWAREVERSIONversion: v2.68
                               Major ver num: 2 Minor ver num: 68
mv: canÞù
--- End code ---
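
One thing that can be checked from the Oops alone is whether the faulting address (DEAR) falls inside the kernel's direct mapping of RAM. The sketch below assumes the default PPC32 Linux kernel base of 0xC0000000 (not confirmed for this Tektronix build) and takes the RAM size from mem=131072k on the command line. The helper names are made up for illustration.

```python
# Classify addresses from the Oops against the kernel's linear RAM mapping.
# ASSUMPTION: KERNELBASE is the PPC32 Linux default of 0xC0000000; the
# Tektronix kernel may have been built differently.

KERNELBASE = 0xC0000000
RAM_BYTES = 131072 * 1024  # mem=131072k from the kernel command line

# Values copied from the Oops in the boot log above.
OOPS = {
    "NIP": 0xC00CBA10,
    "LR": 0xC003FE20,
    "SP": 0xC06E9E10,
    "DEAR": 0xCE16CA8C,
}


def lowmem_window(base: int = KERNELBASE, ram: int = RAM_BYTES):
    """Return (start, end) of the virtual range that maps RAM one-to-one."""
    return base, base + ram


def in_lowmem(addr: int) -> bool:
    """True if addr lies inside the direct-mapped RAM window."""
    start, end = lowmem_window()
    return start <= addr < end


if __name__ == "__main__":
    start, end = lowmem_window()
    print(f"direct-mapped RAM: {start:#010x} .. {end:#010x}")
    for name, addr in OOPS.items():
        where = "inside" if in_lowmem(addr) else "outside"
        print(f"{name:4s} {addr:#010x} is {where} the window")
```

With these assumptions NIP, LR and SP are inside the window, but DEAR is above it. So the bad access is not a plain direct-mapped RAM address; it could be a vmalloc or ioremap mapping, or a corrupted pointer. That does not rule DRAM in or out by itself.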

I also entered U-Boot and tried the memory test (mtest). By default it checks from 0x00400000 to 0x07000000, which passes with no errors. I also ran 'mtest' from 0x00100000 to 0x07f00000, again with no errors. However, I do get an error at 0x07f8efb8 when I test RAM from 0x07f00000 to 0x07ff0000:

--- Code: ---=> mtest 07f00000 07ff0000

Testing 07f00000 ... 07ff0000:
Iteration:      1

FAILURE (read/write) @ 0x07f8efb8: expected 0x00023bef, actual 0x07f8efd0)
--- End code ---

However, I tried that on a working DPO4104 with the same FW version and I get exactly the same error at that address with the same value. So I suppose this failure does not mean the RAM is bad.
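
The address arithmetic behind that conclusion can be sketched as follows. All numbers are copied from the boot log and the mtest output above. The interpretation is an assumption: U-Boot normally relocates itself to the top of RAM and keeps its stack and heap there, with the ramdisk placed just below, so an mtest run over that region would be writing over memory U-Boot is using. That is general U-Boot behaviour and is not confirmed for this Tektronix build.

```python
# Where does the mtest failure sit relative to the top of RAM and the ramdisk?
# All constants are copied from the logs in this post.

RAM_TOP = 131072 * 1024      # mem=131072k on the kernel command line
RAMDISK_END = 0x07F8E602     # "Loading Ramdisk to 07f2b000, end 07f8e602"
FAIL_ADDR = 0x07F8EFB8       # address reported by mtest
ACTUAL = 0x07F8EFD0          # value mtest read back at that address

# The three mtest runs described above, as (start, end) pairs.
TESTED = [
    (0x00400000, 0x07000000),  # default range, passed
    (0x00100000, 0x07F00000),  # wider range, passed
    (0x07F00000, 0x07FF0000),  # top of RAM, failed
]


def covering_runs(addr: int):
    """Return the test runs whose range includes addr."""
    return [(s, e) for s, e in TESTED if s <= addr < e]


def looks_like_nearby_pointer(value: int, addr: int, window: int = 0x100) -> bool:
    """True if the value read back is itself an address close to addr."""
    return abs(value - addr) <= window


if __name__ == "__main__":
    runs = covering_runs(FAIL_ADDR)
    print("runs touching the failing address:",
          [(hex(s), hex(e)) for s, e in runs])
    print("bytes above ramdisk end:", FAIL_ADDR - RAMDISK_END)
    print("bytes below top of RAM: ", RAM_TOP - FAIL_ADDR)
    print("read-back value is a nearby address:",
          looks_like_nearby_pointer(ACTUAL, FAIL_ADDR))
```

Only the last run touches the failing address. It sits 2486 bytes above the end of the ramdisk and within the top 1 MB of RAM, and the value read back is an address 24 bytes higher. That pattern is what live stack contents would look like, which fits with a good DPO4104 showing the identical failure.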

I have forced firmware update twice with 2.68 and it all goes through with no problem and after restart I get the same error.

Any idea what I should be looking at?

What is MIA? Is it the Altera Stratix FPGA, the PCI bridge chip, or the ATI Mobility Radeon?

Any help is highly appreciated. I tend to believe it is still a DRAM issue but I think if it was DRAM it should have stopped much earlier in the process.

adrianh:
> I tend to believe it is still a DRAM issue but I think if it was DRAM it should have stopped much earlier in the process

All it takes is one bit to go bad in DRAM. It could end up being in the stack, the heap, the BSS... My point is that it can trip up the kernel or user space anywhere, at any time.

Given that the DRAM repair has fixed at least two problems that didn't have the same signature, I'd say spend the $20-30 and start replacing DRAM.

What I do worry about still with my scope is bad DRAM that is in use by a user-space process. If it ends up in a data processing area, the error will not be caught. So the reality is that we should all replace ALL of our DRAM.
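
To illustrate the point, here is a toy model of a RAM with a single bit stuck at 0. It is purely illustrative: the class, sizes and addresses are made up and have nothing to do with how the scope's memory controller behaves. It shows that a test over part of the range passes, and that data is only corrupted when it happens to need a 1 in the bad bit.

```python
# Toy model: one stuck-at-0 bit in an otherwise good RAM.
# Everything here (class name, size, bad address) is hypothetical.


class FaultyRam:
    """Byte-addressed RAM where one bit at one address always reads 0."""

    def __init__(self, size: int, bad_addr: int, bad_bit: int):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.mask = ~(1 << bad_bit) & 0xFF

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = value & 0xFF

    def read(self, addr: int) -> int:
        value = self.cells[addr]
        return value & self.mask if addr == self.bad_addr else value


def pattern_test(ram: FaultyRam, start: int, end: int, patterns=(0x55, 0xAA)):
    """Return addresses in [start, end) that fail a write/read-back check."""
    failures = []
    for addr in range(start, end):
        for pattern in patterns:
            ram.write(addr, pattern)
            if ram.read(addr) != pattern:
                failures.append(addr)
                break
    return failures


if __name__ == "__main__":
    ram = FaultyRam(size=1024, bad_addr=900, bad_bit=3)
    print("test of lower half:", pattern_test(ram, 0, 512))
    print("test of full range:", pattern_test(ram, 0, 1024))
    ram.write(900, 0x10)   # bit 3 already 0, so the fault stays hidden
    print("wrote 0x10, read", hex(ram.read(900)))
    ram.write(900, 0x18)   # bit 3 should be 1, so the data is silently changed
    print("wrote 0x18, read", hex(ram.read(900)))
```

In this model the partial test reports nothing, the full test finds address 900, and a write of 0x18 reads back as 0x10 with no error raised anywhere. A measurement buffer landing on such a cell would just carry slightly wrong data.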
