EEVblog Electronics Community Forum

Electronics => Repair => Topic started by: adrianh on August 29, 2021, 11:28:58 pm

Title: Tektronix DPO 4104 kernel panic
Post by: adrianh on August 29, 2021, 11:28:58 pm
Hi all. First time post here. Please let me know if I should post to another forum/area.

I recently acquired a non-working DPO 4104. I know nothing of its history. It is in a splash-screen boot-loop. I managed to figure out where the serial port is, and captured the following log:
Quote
... [normal stuff]
## Booting image at f0000000 ...
   Image Name:   Linux-2.4.20_mvl31-440ep_eval
   Image Type:   PowerPC Linux Multi-File Image (gzip compressed)
   Data Size:    1441010 Bytes =  1.4 MB
   Load Address: 00000000
   Entry Point:  00000000
   Contents:
   Image 0:  1033953 Bytes = 1009.7 kB
   Image 1:   407042 Bytes = 397.5 kB
   Verifying Checksum ... OK
   Uncompressing Multi-File Image ... OK
cmdline is console=ttyS0,9600 quiet bigphysarea=519 panic=2 root=/dev/mtdblock7 rw mem=131072k
   Loading Ramdisk to 07f2b000, end 07f8e602 ... OK
kernel BUG at page_alloc.c:104!
Oops: Exception in kernel mode, sig: 4
NIP: C0036248 XER: 00000000 LR: C0036248 SP: C01E8820 REGS: c01e8770 TRAP: 0700    Not tainted
MSR: 00009030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DEAR: 00004000, ESR: 08000000
TASK = c01e6960[0] 'swapper' Last syscall: 0
[... goes into reboot loop]

The error condition is flagged in the kernel when pages are being freed, so this looks like a completely bogus page, or a double-free.
Quote
        if (!VALID_PAGE(page))
                BUG();

This repeats exactly the same way every time.

My initial suspicion is that the scope was improperly updated, or the update failed for some reason and the scope is effectively bricked.

There could be a hardware failure I guess (bad DRAM cell(s) since this is happening with a RAM disk?).
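The numbers in the log can be cross-checked with a little arithmetic. The sketch below uses only values copied from the log above. The 12-byte size table and the 4-byte padding of the first image reflect my understanding of the legacy U-Boot multi-file image layout, so treat that part as an assumption rather than a statement about Tek's build.

```python
# Values copied from the boot log above.
DATA_SIZE = 1441010            # "Data Size"
IMAGE0 = 1033953               # "Image 0" (kernel)
IMAGE1 = 407042                # "Image 1" (ramdisk)
RAMDISK_START = 0x07F2B000     # "Loading Ramdisk to 07f2b000"
RAMDISK_END = 0x07F8E602       # "end 07f8e602"
MEM_BYTES = 131072 * 1024      # "mem=131072k" on the kernel command line


def pad4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


# Assumed layout: one 32-bit size per image plus a zero terminator,
# then image 0 padded to 4 bytes, then image 1.
size_table = 4 * (2 + 1)
expected_data_size = size_table + pad4(IMAGE0) + IMAGE1

ramdisk_len = RAMDISK_END - RAMDISK_START
headroom = MEM_BYTES - RAMDISK_END

print("data size consistent:", expected_data_size == DATA_SIZE)
print("ramdisk length matches Image 1:", ramdisk_len == IMAGE1)
print(f"ramdisk ends {headroom / 1024:.1f} KiB below the top of RAM")
```

Both checks come out true, so the image sizes are self-consistent and the ramdisk copy lands within roughly half a megabyte of the top of the 128 MB of DRAM. That fits a RAM problem better than a corrupt image, though it does not prove it.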

Is there anything that can be done at this point? - either more debug or some sort of flash of a new image? A traditional "firmware update" is not possible since the machine cannot boot to any level of functionality.

Thanks for any ideas!
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on August 29, 2021, 11:53:14 pm
From the boot log, it doesn't look like it gets to the firmware update step, but you might want to try the "FORCE" firmware method in the manual just to be thorough. It is curious that the ramdisk load is the last line. You could be right about a bad flash, but I'm afraid it could also be bad RAM that it tries to access after the initial copy.

I wish I had a copy of the flash chip for you to try programming but I haven't been able to find one for my MSO4104. On mine it is a 4M flash chip at U822 but contains more than just the bootloader or it would be a simple task. I'm interested to see if this thread gains any traction because I am in a similar situation. Please update with anything you learn as it would help me and others in the future.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on September 05, 2021, 10:41:16 pm
I got to thinking about this kernel panic... It is unlikely that the firmware is borked - if that were the case, I doubt that it would even get out of the gate. Now the kernel panic in this case is the result of the kernel finding a free page (i.e., the valid bit in the PTE is clear) where it thinks it should have a valid page. Assuming that the kernel is fine (the scope did work once after all), that means that when the kernel wrote the PTE to set the page base along with the valid and R/W/X bits, the valid bit (and maybe others as well) did not get set. So bad DDR DRAM.

So I bought four new DRAMs ($2.50 each at Mouser). The plan was to desolder and replace each DRAM in turn, verifying that the scope didn't break worse after each one.

Well, I got lucky. The first DRAM I replaced fixed the problem, all the POST tests passed, and the scope is functional!

Now all I need to do is find a replacement map/zoom button, get some probes, and get it calibrated...
Title: Re: Tektronix DPO 4104 kernel panic
Post by: james_s on September 05, 2021, 11:43:49 pm
Wow that's great, thanks for posting the update. Always feels good to fix something like that.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on September 05, 2021, 11:58:11 pm
Great job! Thank you for updating the thread.

Care to speculate on the bootloader freezing at the F in the FLASH: 64MB line? I haven't found the solution yet.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on September 26, 2021, 11:32:36 pm
I've been thinking about your failure... So my failure was caused by bad DRAM - DRAM that was once good. In a way I was very lucky, because I got a kernel fault with bad page tables which clearly indicated that a location had gone bad. But bad DRAM in kernel space could have many other symptoms. For example, let's assume the OS is unpacked from a ROM to DRAM. The kernel is happily going along and then gets an interrupt (timer, device, whatever) and the ISR has some corrupt instructions. An infinite exception loop could result.

I'm obviously speculating, but clearly bad DRAM is (now) a problem with these scopes. The DRAMs are cheap and the repair is fairly straightforward, so I'd say start replacing your DRAM (start with the bottom one on the CPU side) and see how it goes. Worst you will lose is a few hours and ~$20.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on September 27, 2021, 12:05:04 am
Thanks Adrian, that gives me a little hope. I don't think it is even loading the kernel yet because uBoot doesn't get to that point. I just don't know enough about uBoot to determine if it's interacting with DRAM and would fail if DRAM was bad. Either way, the low cost of replacement is worthwhile.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on September 27, 2021, 12:20:10 am
Can you post a link to your boot log? I'm wondering if I could find the generic uboot source given the version (or maybe circa the kernel version year) and figure out what it is doing. It would've been modified heavily by Tek but the overall arc of what it is doing might be helpful. At some point it will unpack the kernel from Somewhere into DRAM, but when is the key.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on September 27, 2021, 06:12:53 pm
This is the entirety of the boot log. It stops at the same place every time. If you want to read more, I started a thread a while back here: https://www.eevblog.com/forum/repair/tek-mso4104-no-boot/msg3596092/#msg3596092
Code:
U-Boot 1.1.4 (Jan  8 2007 - 11:12:14) Tektronix, Inc. V1.06

CPU:   AMCC PowerPC 440EP Rev. C at 333.333 MHz (PLB=133, OPB=66, EBC=66 MHz)
       I2C boot EEPROM enabled                                               
       Internal PCI arbiter enabled, PCI async ext clock used               
       32 kB I-Cache 32 kB D-Cache                                           
Board: Tektronix Route66 IBM 440EP Main Board                               
        VCO: 666 MHz                                                         
        CPU: 333 MHz                                                         
        PLB: 133 MHz                                                         
        OPB: 66 MHz
        EPB: 66 MHz
I2C:   ready
DRAM:  128 MB
F

I appreciate any insight you might have. The more input the better I can wrap my head around things.  :-+
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on September 27, 2021, 08:18:25 pm
I'm still thinking a problem with DRAM. This is why: As soon as uboot has figured out how much RAM is in the system (it might be hardcoded either in source or some blob somewhere) it is going to point its stack somewhere in the DRAM area so that it can actually start doing more complex activities. If the stack is pointing to bad DRAM then the system is going to get completely wedged - which certainly looks like your case.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on September 28, 2021, 03:05:34 am
That makes sense and is at least worth a try. I ordered the last 3 Digikey had in stock. Wish me luck.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on September 28, 2021, 03:22:05 am
Good luck! And if you need one or two extra DRAMs, drop me a PM. FWIW, I used air to get the old chips off the board (and lead-free takes some HOT air), cleaned the pads, and then drag soldered with lots of flux. I put Kapton tape on all the little guys nearby and I didn't lose anyone! :phew:
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on September 28, 2021, 03:35:36 am
I appreciate the offer and will definitely let you know. That's the same technique I usually do with one additional step; I add leaded solder to the leads to lower the temp required to remove the chips with hot air. That way I don't stress the board and hopefully don't kill the chips just in case they are still good. Still, a little stressful working on a valuable piece of equipment. I'll let you know how it goes.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on October 03, 2021, 11:14:16 pm
Update: I just replaced both DRAM chips on the CPU side and it behaves exactly the same way. I suppose it is good news that I didn't make it worse, but it's still non-op. I hesitate to continue after seeing no progress. What do you think?
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on October 03, 2021, 11:39:19 pm
Well… I think you should keep going. The stack would be in high mem so maybe the other side of the board(?). And to motivate you I’ll donate two drams to the cause. PM me.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: aibi1590 on October 07, 2021, 01:41:25 pm
Hey, I have an MSO2012 that shows only a white screen.
I watched the U-Boot log and it stops at DRAM:64MB.
I infer that the flash is faulty.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on October 07, 2021, 10:04:12 pm
That looks basically identical to pcwrangler's failure. I'm curious how his DRAM experiment will turn out...
Title: Re: Tektronix DPO 4104 kernel panic
Post by: aibi1590 on October 08, 2021, 01:37:15 pm
Yes, but I don't know if it is DRAM or flash failure.
The four DRAMs on my MSO2012 are not the same.
And I don’t think I can buy a new flash and pre-burn the firmware into it.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on October 09, 2021, 02:23:20 pm
SUCCESS! I replaced the DRAM and it came right up. I can't thank Adrian enough for the support on this. Although only 2 confirmed cases is anecdotal so far, Adrian might be right; these scopes could have faulty DRAM. I hope more people see this thread.

aibi1590, your bootlog stops almost exactly where mine did. Even though your version is 2000 series, I would still bet the RAM is bad. Give it a try, the cost is minimal.


Now...... where did I put all those screws?!
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on October 09, 2021, 03:56:19 pm
For the record: The bad ram was the bottom one on the back side (opposite of CPU) of the board.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on October 09, 2021, 04:00:34 pm
SWEET! You can ultimately thank the early Linux MMU developers for being professional programmers and asserting what should be true. I work in CPU development so I knew what was going on (we boot Linux all the time on our simulators and real hardware). Hopefully more folks can get these amazing scopes working again!
Title: Re: Tektronix DPO 4104 kernel panic
Post by: aibi1590 on October 13, 2021, 05:21:35 am
OK! I will try it.
I'm very grateful to adrianh and pcwrangler.
I have ordered new DRAM; it should arrive in a few weeks, and I will report the result.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: aibi1590 on November 26, 2021, 04:09:39 pm
Eventually I replaced all the RAM but the problem remained the same :'(
Title: Re: Tektronix DPO 4104 kernel panic
Post by: maxwelllls on December 02, 2021, 01:49:51 am
You can try measuring the impedance of each memory data line to GND. Comparing the readings across the chips should reveal any pins with abnormal values; perhaps some CPU pins have rosin (cold) joints.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on January 04, 2022, 12:54:28 am
Thank you, community, for sleuthing the DPO4xxx 'scopes. I'm thinking (hoping) my DPO4104 might also have bad RAM. This scope also has a "stuck in splash screen" fault. Before I knew about the debug serial port, I tried updating firmware via a USB stick. The update started, but got stuck somewhere. Now that I have a debug port dump, I can see where it fails, but I don't "speak uBoot", so I'm not sure what to make of it.

Erasing and flashing flash pages goes along just fine, until it attempts a file mount. A snapshot of the key part of the dump follows:


    (boot & update messages)
    ...
Erasing 128 Kibyte @ 160000 -- 91 % complete.
  - Writing to flash; this may take a while...
Finished updating backup kernel.
Updating filesystem... DO NOT TURN OFF THE SYSTEM!!!!
Extracting the contents of /usr/local/perm...
  - Erasing flash... this may take a while.
Erasing 128 Kibyte @ 1de0000 -- 99 % complete.
mount: Mounting /dev/mtdblock7 on /mnt/rootfs failed: Invalid argument
  - Writing to flash; this may take a while...
tar: Cannot create directory `./usr/local': No space left on device
tar: Cannot create directory `./usr/local': No space left on device
    ...
    (repeat a bunch of times)
    ...
tar: Cannot create directory `./usr/local/': No space left on device
tar: ./usr/local/bin/scopeApp.ppcep: No such file or directory
An error occured while updating the root filesystem!
    (the end)


Looks like tar was uncompressing a file, but had nowhere to go.
Best case scenario, I replace some RAM and all is good. Worst case -- it's bricked.

-Mark
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on January 04, 2022, 01:11:18 am
Can you please post/attach the *entire* log?
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on January 04, 2022, 02:02:30 am
Here ya go:

U-Boot 1.1.4 (Jan  8 2007 - 11:12:14) Tektronix, Inc. V1.06

CPU:   AMCC PowerPC 440EP Rev. B at 333.333 MHz (PLB=133, OPB=66, EBC=66 MHz)
       I2C boot EEPROM enabled
       Internal PCI arbiter enabled, PCI async ext clock used
       32 kB I-Cache 32 kB D-Cache
Board: Tektronix Route66 IBM 440EP Main Board
        VCO: 666 MHz
        CPU: 333 MHz
        PLB: 133 MHz
        OPB: 66 MHz
        EPB: 66 MHz
I2C:   ready
DRAM:  128 MB
FLASH: 64.5 MB
PCI:   Bus Dev VenId DevId Class Int
        00  13  10b5  9056  0680  18
        00  15  1002  4c59  0300  17
DISP:  Type 1
In:    serial
Out:   serial
Err:   serial
Enter password - autobooting in 3 seconds
## Booting image at f0000000 ...
   Image Name:   Linux-2.4.20_mvl31-440ep_eval
   Image Type:   PowerPC Linux Multi-File Image (gzip compressed)
   Data Size:    1441010 Bytes =  1.4 MB
   Load Address: 00000000
   Entry Point:  00000000
   Contents:
   Image 0:  1033953 Bytes = 1009.7 kB
   Image 1:   407042 Bytes = 397.5 kB
   Verifying Checksum ... OK
   Uncompressing Multi-File Image ... OK
cmdline is console=ttyS0,9600 quiet bigphysarea=519 panic=2 root=/dev/mtdblock7 rw mem=131072k
   Loading Ramdisk to 07f2b000, end 07f8e602 ... OK
Checking for firmware update...
Mounted /dev/sda1 as /mnt/sda1
Checking md5sum...
fwUpdate.sh: OK
md5sum check passed, continuing...
Firmware update script found in image. Executing script...
Firmware platform check passed.
This is a new kernel.
This instrument is a DPO4xxx.
No firmware version file found in flash, performing update
  - kernel.img found.
  - filesystem.tar.gz found.
  - bootloader.img found.
  - comparing versions...
V1.06
V1.06
installed bootloader is equal or newer. Skipping.
  - splash.img found.
  - splashmso.img found.
  - route66_fp.s19 found.
    Front panel version file contents -  Route66 FP 0 27 3
  - fwEnvUpdate.sh found.
Update files were found.  Checking file integrity, please wait...
Running: /usr/bin/md5sum -c < md5sum.txt
bootloader.img: OK
dispBmp: OK
filesystem.tar.gz: OK
firmware_complete4.bmp: OK
firmware_update4.bmp: OK
fwEnvUpdate.sh: OK
fwUpdate.sh: OK
getPlatform: OK
kernel.img: OK
nv.jffs2: OK
ppver.txt: OK
radeonRegs.ppc: OK
radeonVideoCap.scr: OK
root.jffs2: OK
route66_fp.s19: OK
route66_fp_version.txt: OK
splash.img: OK
splashmso.img: OK
uBootExtract: OK
Using /usr/share/modules/radeonfb.o
md5sum check passed for all files, continuing...
Performing update in 15 seconds...
(The front panel will be updated during the next powerup.)
Flashing splash.img... DO NOT TURN OFF THE SYSTEM!!!!
  - Erasing flash; this may take a while...
Erasing 128 Kibyte @ e0000 -- 87 % complete.
  - Writing to flash; this may take a while...
Finished updating splash screen.
Flashing kernel.img... DO NOT TURN OFF THE SYSTEM!!!!
  - Erasing flash; this may take a while...
Erasing 128 Kibyte @ 160000 -- 91 % complete.
  - Writing to flash; this may take a while...
Finished updating kernel.
Flashing kernel.img into backup partition...
  - Erasing flash; this may take a while...
Erasing 128 Kibyte @ 160000 -- 91 % complete.
  - Writing to flash; this may take a while...
Finished updating backup kernel.
Updating filesystem... DO NOT TURN OFF THE SYSTEM!!!!
Extracting the contents of /usr/local/perm...
  - Erasing flash... this may take a while.
Erasing 128 Kibyte @ 1de0000 -- 99 % complete.
mount: Mounting /dev/mtdblock7 on /mnt/rootfs failed: Invalid argument
  - Writing to flash; this may take a while...
tar: Cannot create directory `./usr/local': No space left on device
tar: Cannot create directory `./usr/local': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: Cannot create directory `./usr/local': No space left on device
tar: Cannot create directory `./usr/local/': No space left on device
tar: ./usr/local/bin/scopeApp.ppcep: No such file or directory
An error occured while updating the root filesystem!

Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on January 04, 2022, 03:10:49 am
Well, your issue is certainly different than the two that have been fixed. I suppose bad dram could result in the "no space left on device" error... However the "invalid argument" warning is a little disturbing. I wish I understood more how flash updating works! (Does anyone want to capture a log of a successful fw update??)

Bad dram? maybe. Bad flash chip?? (is the flash mounted as a filesystem after erasing and before writing - probably).

I can imagine a corrupt bit in a ram filesystem that happened to make an inode disappear... Again, it will cost you only $20 in dram and some serious smd soldering, but it sure would be nice to have a more solid theory. Do you know where the flash chip is? Might be easier to try replacing that first.
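One thing the full log does allow is a sanity check on the erase step. The sketch below assumes that each "Erasing 128 Kibyte @ ..." line shown is the final erase block of its partition and that the tool prints the percentage as an integer division of offset by partition size. Both are assumptions on my part, not something the log states.

```python
ERASE_BLOCK = 128 * 1024  # "Erasing 128 Kibyte"

# (label, offset from the last erase line shown, percent printed in the log)
last_erase_lines = [
    ("splash", 0x0E0000, 87),
    ("kernel", 0x160000, 91),
    ("filesystem", 0x1DE0000, 99),
]


def implied_partition_size(offset):
    """Partition size if `offset` is the start of the final erase block."""
    return offset + ERASE_BLOCK


def percent_at(offset, size):
    """Assumed progress formula: integer percent of offset within size."""
    return offset * 100 // size


for label, offset, logged in last_erase_lines:
    size = implied_partition_size(offset)
    print(f"{label}: {size // 1024} KiB in {size // ERASE_BLOCK} blocks, "
          f"computed {percent_at(offset, size)}% vs logged {logged}%")
```

Under those assumptions all three percentages line up, which suggests each erase ran to the end of its partition (the filesystem one being 30 MB). The first real failure in the log is then the mount of /dev/mtdblock7, and the run of "No space left on device" errors would be consistent with tar unpacking into whatever sits under the unmounted /mnt/rootfs, presumably the small ramdisk.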
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on January 04, 2022, 06:23:20 am
Good idea: there's a single flash chip (AM29LV040B-90JC) in a 32-PLCC. Easy to rework. There are four mystery BGAs that appear to be on the same bus; they don't matter, because I don't have BGA rework equipment.

Where does the boot code reside? If it is in this part, then I'll need to get a programmer.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: JohnPi on January 04, 2022, 06:06:02 pm
Can someone please post a photo of where the serial port is on these scopes ? We have about 4 of these in our lab that fail to boot, and I may try exchange DRAMs between them...
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on January 05, 2022, 12:48:02 am
It's on the bottom of my DPO4104. Remove the cover and there is an opening in the aluminum shield. A 1.27 mm pitch connector has 20 pins. I identified two; see the photo. Signal is 3.3V & 9600 b/s.
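For anyone wanting to capture the console to a file, here is a minimal sketch. It assumes a 3.3 V USB-to-UART adapter and the third-party pyserial package; the device name and output file name are hypothetical, and pyserial's default 8N1 framing is assumed since only the 9600 b/s rate is known from this thread.

```python
import io
import time


def capture(stream, out, max_idle_reads=30):
    """Copy lines from `stream` to `out`, prefixed with elapsed seconds.

    `stream` needs a readline() that returns bytes and returns b"" when
    nothing arrived (a pyserial port opened with a timeout behaves this
    way). Stops after `max_idle_reads` empty reads in a row and returns
    the number of lines written.
    """
    start = time.monotonic()
    idle = 0
    count = 0
    while idle < max_idle_reads:
        raw = stream.readline()
        if not raw:
            idle += 1
            continue
        idle = 0
        # A crashing scope can emit garbage bytes; keep them visible.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        out.write(f"[{time.monotonic() - start:8.2f}] {text}\n")
        count += 1
    return count


def capture_from_port(port_name, log_path):
    """Log a real serial port until it has been quiet for about 30 s."""
    import serial  # third-party: pip install pyserial

    with serial.Serial(port_name, baudrate=9600, timeout=1) as port:
        with open(log_path, "w") as log:
            return capture(port, log)


# Offline demo with canned bytes instead of real hardware.
demo = io.StringIO()
capture(io.BytesIO(b"DRAM:  128 MB\r\nF"), demo, max_idle_reads=1)
print(demo.getvalue(), end="")
```

With hardware attached, something like `capture_from_port("/dev/ttyUSB0", "bootlog.txt")` would be the call to make, started before powering on the scope. The timestamps make it easier to see where the boot stalls.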
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on January 09, 2022, 10:40:03 pm
1. The bootloader does reside on that AM29LV040B. It is easy to remove/replace but yes you will need a programmer to do anything with it. Read it multiple times with it firmly in the programmer to compare against each other and rule out intermittent errors (indicating bad chip or bad contacts to programmer) and make backups! But first: Your log shows it skipped updating the bootloader because it was "same or newer". Have you tried the "force update" procedure mentioned in the firmware documentation? If you do, make sure you watch and copy the log in the process.
2. What is the full boot log withOUT trying to update the firmware via USB?
3. It is conceivable that the RAM is bad, as Adrian said, and that is causing havoc with the install. If the above doesn't work, I would spend the ~$20 and give it a try.

Good Luck
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on February 03, 2022, 02:02:26 am
Finally had some free time. After replacing flash and RAM, behavior is the same, even after adding "forceinstall.txt" to the USB drive. Could something in the bootloader be corrupted? Has anyone uploaded their flash contents? I could try booting with that code.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on February 05, 2022, 05:20:12 am
Did you save the log when you tried the force install?
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on February 05, 2022, 05:01:02 pm
... Could something in the bootloader be corrupted? Has anyone uploaded their flash contents? I could try booting with that code.

Sent you a PM
Title: Re: Tektronix DPO 4104 kernel panic
Post by: MarkF3837 on February 07, 2022, 06:50:29 pm
I flashed a known good copy of boot code. The middle part of the console output was different, but the first part and the (failed) last part are identical to the original. Same thing comparing old/new outputs for a forced update.

I may have to poke at address & mux lines to see if something looks funny. Another (grasping at straws) possibility: what size DDR RAM is in others' 'scopes? Mine has MT46V16M16; this is an early version instrument, and maybe they changed to MT46V32M16?
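The capacity question can be checked against the boot log. A quick sketch, assuming the part number encodes depth in Meg words and width in bits (as in "16M16"), and assuming all four chips mentioned in this thread hold data rather than ECC:

```python
import re

CHIPS_ON_BOARD = 4       # four DRAMs, per earlier posts in this thread
REPORTED_MB = 128        # "DRAM:  128 MB" from U-Boot


def chip_megabytes(part):
    """Capacity in MB of an 'MT46V<depth>M<width>' part (Meg words x bits)."""
    match = re.fullmatch(r"MT46V(\d+)M(\d+)", part)
    if match is None:
        raise ValueError(f"unrecognized part number: {part}")
    depth_meg, width_bits = (int(g) for g in match.groups())
    return depth_meg * width_bits // 8


for part in ("MT46V16M16", "MT46V32M16"):
    total = chip_megabytes(part) * CHIPS_ON_BOARD
    verdict = "matches" if total == REPORTED_MB else "does not match"
    print(f"{part}: {total} MB total, {verdict} the reported {REPORTED_MB} MB")
```

Four MT46V16M16 parts add up to the 128 MB that U-Boot reports and that the kernel command line passes as mem=131072k, so the smaller part looks like the right fit for a board that logs those values.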

In the meantime, I'm going to try and track down the noise source in my current 1 GHz 'scope: a TDS784D. At least there are schematics for TDS5xx & TDS6xx families out in the internets.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on February 07, 2022, 09:12:07 pm
That's the exact DRAM model my MSO4104 has. Mine were bad, so I replaced them. The whole saga is in my original thread below. It might give you ideas when you're ready to revisit your 4104.
https://www.eevblog.com/forum/repair/tek-mso4104-no-boot/

Good luck with the others.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: mikehank on March 15, 2022, 08:20:44 pm
Replace the CPU memory.  Four gullwing chips near the processor.  Two on either side.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: analogRF on April 06, 2023, 06:47:16 pm
I hope adrianh and pcwrangler are still checking this thread and are interested  ;) ;)

I have a DPO4054 which one day just froze and after power cycle it never booted again (stuck at splash screen). All voltages and oscillators are ok and no hot spot on the board out of the ordinary. Here is the boot log I get:
Code:
U-Boot 1.1.4 (Jan  8 2007 - 11:12:14) Tektronix, Inc. V1.06

CPU:   AMCC PowerPC 440EP Rev. C at 333.333 MHz (PLB=133, OPB=66, EBC=66 MHz)
       I2C boot EEPROM enabled
       Internal PCI arbiter enabled, PCI async ext clock used
       32 kB I-Cache 32 kB D-Cache
Board: Tektronix Route66 IBM 440EP Main Board
        VCO: 666 MHz
        CPU: 333 MHz
        PLB: 133 MHz
        OPB: 66 MHz
        EPB: 66 MHz
I2C:   ready
DRAM:  128 MB
FLASH: 64.5 MB
PCI:   Bus Dev VenId DevId Class Int
        00  13  10b5  9056  0680  18
        00  15  1002  4c59  0300  17
DISP:  Type 1
In:    serial
Out:   serial
Err:   serial
Enter password - autobooting in 3 seconds
## Booting image at f0000000 ...
   Image Name:   Linux-2.4.20_mvl31-440ep_eval
   Image Type:   PowerPC Linux Multi-File Image (gzip compressed)
   Data Size:    1441010 Bytes =  1.4 MB
   Load Address: 00000000
   Entry Point:  00000000
   Contents:
   Image 0:  1033953 Bytes = 1009.7 kB
   Image 1:   407042 Bytes = 397.5 kB
   Verifying Checksum ... OK
   Uncompressing Multi-File Image ... OK
cmdline is console=ttyS0,9600 quiet bigphysarea=519 panic=2 root=/dev/mtdblock7 rw mem=131072k
   Loading Ramdisk to 07f2b000, end 07f8e602 ... OK
Checking for firmware update...
No USB mass storage devices found to update from.
Linux 2.4.20_mvl31-440ep_eval V 1.15 Tektronix Route66 Tue Jun 22 15:19:50 PDT 2010
stat of /var/log/dmesg failed: No such file or directory
Warning: loading NiDKEng-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading NiDUsb-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading tek will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Error programming MIA, USERi stayed low 0x00000000
insmod: init_modScope application starting (normal mode)
---------------------- startScopeApp()Oops: kernel access of bad area, sig: 11
NIP: C00CBA10 XER: 20000000 LR: C003FE20 SP: C06E9E10 REGS: c06e9d60 TRAP: 0800    Tainted: P
MSR: 00009030 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
DEAR: CE16CA8C, ESR: 00000000
DEAR: CE16CA8C, ESR: 00000000
TASK = c06e8000[68] 'scopeApp.ppcep' Last syscall: 5
last math c06e8000 last altivec 00000000
GPR00: 00000000 C06E9E10 C06E8000 C01F184C C7DB5540 00000000 C7DB55A4 C7E98000
GPR08: C01EB430 C01F1834 C7DB54C0 00000050 28004888 1175BF40 0FF897D0 0FF89F8C
GPR16: 0FF89FB4 0FF89760 12B379E0 12B37940 00009032 C06E9F40 00000000 C00029DC
GPR24: C00026A0 FFFFFFED C719FDC0 C01F0000 C7DB5540 C01F0000 000000F0 CE16CA8C
Call backtrace:
C7EAF008 C003FE20 C003E774 C003E820 C003EA60 C000272C 0FF78720
10CB1D1C 10CB19C0 10CB45EC 10CB5310 10CB4F74 10CB6AAC 10CC5104
10CC5D64
 running Init code ----------------------
versionBuildFWVersionString(): TimestampString: 25-Apr-12  11:13
                               VersionFIRMWAREVERSIONversion: v2.68
                               Major ver num: 2 Minor ver num: 68
mv: canÞù

I also entered U-Boot and tried the memory test (mtest). By default it checks from 0x00400000 to 0x07000000, which passes with no error. I also tried 'mtest' from 0x00100000 to 0x07f00000 and got no errors. However, I do get an error at 0x07f8efb8 when I try to test RAM from 0x07f00000 to 0x07ff0000:
Code:
=> mtest 07f00000 07ff0000

Testing 07f00000 ... 07ff0000:
Iteration:      1

FAILURE (read/write) @ 0x07f8efb8: expected 0x00023bef, actual 0x07f8efd0)

However, I tried that on a DPO4104 with the same FW version and I get exactly the same error at that address with the same value. So I suppose this should not mean the RAM is bad.
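The figures in that failure message support the same conclusion. The sketch below works only from the addresses and values quoted above plus the ramdisk end address from the boot log. The "expected equals word index plus one" check matches my reading of the pattern-fill phase of U-Boot's alternative memory test, and the idea that the overwritten word belongs to U-Boot's own stack near the top of RAM is a guess, not something the log proves.

```python
START = 0x07F00000         # mtest start address
FAIL_ADDR = 0x07F8EFB8     # address reported in the FAILURE line
EXPECTED = 0x00023BEF      # "expected"
ACTUAL = 0x07F8EFD0        # "actual"
RAMDISK_END = 0x07F8E602   # "Loading Ramdisk to 07f2b000, end 07f8e602"
RAM_TOP = 128 * 1024 * 1024

# Index of the failing 32-bit word within the tested range.
word_index = (FAIL_ADDR - START) // 4

# A single bad DRAM cell would flip one bit; count how many bits differ.
flipped_bits = bin(EXPECTED ^ ACTUAL).count("1")

print("expected is word index + 1:", EXPECTED == word_index + 1)
print("bits differing between expected and actual:", flipped_bits)
print("actual value is an address inside RAM:", START <= ACTUAL < RAM_TOP)
print("actual points", ACTUAL - FAIL_ADDR, "bytes above the failing word")
print("failing word is", FAIL_ADDR - RAMDISK_END, "bytes above the ramdisk end")
```

The value read back differs from the pattern in many bits and is itself an address a few bytes higher in the same region. That looks like something rewriting the word between fill and check (for example the bootloader using that memory for itself) rather than a stuck DRAM bit, which would also explain why a known-good DPO4104 fails at the same address with the same value.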

I have forced firmware update twice with 2.68 and it all goes through with no problem and after restart I get the same error.

Any idea what I should be looking at?

What is MIA? Is that the Altera Stratix FPGA? Or the PCI bridge chip? Or the ATI Mobility Radeon?

any help is highly appreciated....I tend to believe it is still a DRAM issue but I think if it was DRAM it should have stopped much earlier in the process
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on April 06, 2023, 07:03:18 pm
> I tend to believe it is still a DRAM issue but I think if it was DRAM it should have stopped much earlier in the process

All it takes is one bit to go bad in DRAM. It could end up being in the stack, the heap, the BSS... My point is that it can trip up the kernel/user-space anywhere at any time.

Given that the DRAM repair has fixed at least two problems that didn't have the same signature, I'd say spend the $20-30 and start replacing dram.

What I do worry about still with my scope is bad DRAM that is in use by a user-space process. If it ends up in a data processing area, the error will not be caught. So the reality is that we should all replace ALL of our DRAM.
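To illustrate why such a fault can go unnoticed, here is a toy model of a single stuck-at-0 bit. The address, bit position, and sample values are made up for the illustration and have nothing to do with the real memory map of the scope.

```python
class FaultyRam:
    """Word-addressed RAM model with one data bit stuck at 0 at one address."""

    def __init__(self, bad_addr, bad_bit):
        self.cells = {}
        self.bad_addr = bad_addr
        self.keep_mask = ~(1 << bad_bit) & 0xFFFFFFFF

    def write(self, addr, value):
        self.cells[addr] = value & 0xFFFFFFFF

    def read(self, addr):
        value = self.cells.get(addr, 0)
        if addr == self.bad_addr:
            value &= self.keep_mask  # the stuck bit always reads back as 0
        return value


ram = FaultyRam(bad_addr=0x1000, bad_bit=4)

samples = [0x0000000F, 0x00000010, 0x12345678, 0xFFFFFFEF]
corrupted = []
for sample in samples:
    ram.write(0x1000, sample)
    if ram.read(0x1000) != sample:
        corrupted.append(sample)

print("values corrupted at the bad address:", [hex(v) for v in corrupted])
```

Only the values that need the stuck bit set come back wrong, so the same cell can look fine for a long time and then corrupt a pointer or a sample without any error being raised. Whether it causes a crash or just wrong data depends entirely on what happens to be stored there.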
Title: Re: Tektronix DPO 4104 kernel panic
Post by: analogRF on April 06, 2023, 07:07:34 pm
Can you decipher anything from the kernel messages? It seems it must have something to do with MIA, whatever that is.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: adrianh on April 06, 2023, 08:29:12 pm
Sorry cannot help you there.

Could you please share your setup to get into u-boot? I was only ever able to get output from the serial port. It would be neat to interact with the bootloader and try the memtest.
Title: Re: Tektronix DPO 4104 kernel panic
Post by: pcwrangler on April 09, 2023, 03:39:48 am
From what I can see in your boot log, it looks like another DRAM-related issue to me. It suddenly stops with garbage output, and the "bad area" error appears not too far above that...
Code:
insmod: init_modScope application starting (normal mode)
---------------------- startScopeApp()Oops: kernel access of bad area, sig: 11
Mine had a similar problem early in the boot process, stopping suddenly with no feedback. I think adrianh might be right: replace the DRAM chips. At that cost it is well worth a try. Either way, please update this thread so we can keep track of these errors and their solutions.

As for the MIA, no idea. The good news is that the system caught the errors and reported them in the boot log, so they weren't fatal (at least not up to that point). Unless a full DRAM swap fails to fix your issue, I wouldn't go down that rabbit hole.
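
For keeping track of these errors, a small script can pull the interesting lines out of a captured serial log. This is only a sketch: the marker list is taken from messages that appear in this thread and is certainly not exhaustive, and `find_markers` is a made-up helper name.

```python
import re

# Failure markers seen in logs posted in this thread (illustrative, not exhaustive).
MARKERS = [
    r"kernel BUG at",
    r"Oops:",
    r"Bus error",
    r"hwInit failed",
]


def find_markers(log_text):
    """Return (line_number, stripped_line) for each line containing a known marker."""
    pattern = re.compile("|".join(MARKERS))
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# The two lines quoted above, used as sample input.
sample = """\
insmod: init_modScope application starting (normal mode)
---------------------- startScopeApp()Oops: kernel access of bad area, sig: 11
"""

for number, line in find_markers(sample):
    print(f"{number}: {line}")
```

Running it over each capture and posting the hits makes it easier to compare signatures between scopes, even when the rest of the log is garbage.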
Title: Re: Tektronix DPO 4104 kernel panic
Post by: boyddotee on May 10, 2023, 12:22:41 pm
> > Can you decipher anything from the kernel messages? It seems it must have something to do with MIA, whatever that is.

> Sorry, I cannot help you there.

> Could you please share your setup to get into U-Boot? I was only ever able to get output from the serial port. It would be neat to interact with the bootloader and try the memtest.

For U-Boot, there is a short "hit any key" window; sending a character over serial also works.

That short window may have been changed by the developer, so it may take a few tries.
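
If the window is too short to hit by hand, the key can be sent by a script as soon as the prompt shows up. A minimal sketch follows, with several assumptions: the prompt text contains "autoboot", a carriage return is an acceptable key, the console runs at 9600 baud (as in the kernel cmdline), and the adapter is `/dev/ttyUSB0`. `interrupt_autoboot` and `run_on_device` are made-up names, and the latter needs the third-party pyserial package. If the bootloader expects a password rather than any key, this will not get you in.

```python
import time


def interrupt_autoboot(port, prompt=b"autoboot", key=b"\r", timeout_s=30.0):
    """Read from port until prompt appears, then send key once.

    port is any object with read(n) -> bytes and write(bytes),
    for example a pyserial Serial opened with a short read timeout.
    Returns True if the key was sent, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    seen = b""
    while time.monotonic() < deadline:
        chunk = port.read(64)
        if not chunk:
            continue
        # Keep a short tail so a prompt split across two reads is still found.
        seen = (seen + chunk)[-256:]
        if prompt in seen:
            port.write(key)
            return True
    return False


def run_on_device(device="/dev/ttyUSB0", baud=9600):
    """Open a real serial port (assumed device name) and try to stop autoboot."""
    import serial  # pyserial; imported here so the helper above has no dependency

    with serial.Serial(device, baud, timeout=0.1) as port:
        return interrupt_autoboot(port)
```

Start `run_on_device()` first, then power-cycle the scope, so the script is already listening when the bootloader banner comes out.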
Title: Re: Tektronix DPO 4104 kernel panic
Post by: chaos_chris on September 30, 2024, 12:53:36 pm
Hi all,
My DPO4104 is not leaving the boot splash screen.
Reading the serial port (thank you!), I get the following:

Does this sound familiar to anyone here?

Thank you very much in advance!

______________________________________________________
U-Boot 1.1.4 (Jan  8 2007 - 11:12:14) Tektronix, Inc. V1.06

CPU:   AMCC PowerPC 440EP Rev. C at 333.333 MHz (PLB=133, OPB=66, EBC=66 MHz)       I2C boot EEPROM enabled
       Internal PCI arbiter enabled, PCI async ext clock used
       32 kB I-Cache 32 kB D-Cache
Board: Tektronix Route66 IBM 440EP Main Board
<9>VCO: 666 MHz
<9>CPU: 333 MHz
<9>PLB: 133 MHz
<9>OPB: 66 MHz
<9>EPB: 66 MHz
I2C:   ready
DRAM:  128 MB
FLASH: 64.5 MB
PCI:   Bus Dev VenId DevId Class Int
        00  13  10b5  9056  0680  18   
        00  15  1002  4c59  0300  17      
DISP:  Type 1
In:    serial
Out:   serial
Err:   serial
Enter password - autobooting in 3 seconds
## Booting image at f0000000 ...
   Image Name:   Linux-2.4.20_mvl31-440ep_eval
   Image Type:   PowerPC Linux Multi-File Image (gzip compressed)
   Data Size:    1441010 Bytes =  1.4 MB
   Load Address: 00000000
   Entry Point:  00000000
   Contents:
   Image 0:  1033953 Bytes = 1009.7 kB
   Image 1:   407042 Bytes = 397.5 kB
   Verifying Checksum ... OK
   Uncompressing Multi-File Image ... OK
cmdline is console=ttyS0,9600 quiet bigphysarea=519 panic=2 root=/dev/mtdblock7 rw mem=131072k
   Loading Ramdisk to 07f2b000, end 07f8e602 ... OK
Checking for firmware update...
No USB mass storage devices found to update from.
Linux 2.4.20_mvl31-440ep_eval V 1.15 Tektronix Route66 Tue Jun 22 15:19:50 PDT 2010
stat of /var/log/dmesg failed: No such file or directory
rm: cannot remove '/var/run/*': No such file or directory
Warning: loading NiDKEng-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading NiDUsb-1.6 will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading tek will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Scope application starting (normal mode)
---------------------- startScopeApp() running Init code ----------------------
versionBuildFWVersionString(): TimestampString: 25-Apr-12  11:13   
                               VersionFIRMWAREVERSIONversion: v2.68                     
                               Major ver num: 2 Minor ver num: 68                     
     Initializing Mia[0]< 
<9>PA Bus error.  Skipping initialization of devices on PA bus
     Initializing Ibm440[0]
     Initializing HFD204ADC[1]
     Initializing HFD204ADC[0]
Model id: 0x03
     Initializing Ltc1658Dac[0] 
Board id: 0x01
     Initializing M859[3]
     Initializing M859[2]
     Initializing M859[1]
     Initializing M859[0]
      HFD144[0] ID_REG = 0x00001440
      HFD144[1] ID_REG = 0x00001440
      HFD144[2] ID_REG = 0x00001440
      HFD144[3] ID_REG = 0x00001440
     Initializing Hfd144[0]
     Initializing Hfd144[1]
     Initializing Hfd144[2]
     Initializing Hfd144[3]
<9>Open Max5362 successful.
     Initializing Max5362[0]
     Initializing Ltc1661Dac[1]
     Initializing Ltc1661Dac[0]
  hwInit failed
 Init ADT7468 and locking.
 Factory Checksum: Stored: 41143, Calculated: 41143  - OK
 Spc CheckSum: stored: 37628 calculated: 37628  - OK
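
A quick sanity check on the memory numbers in the log above, as a sketch. The values are copied from the "Loading Ramdisk", "Image 1" and `mem=` lines; the variable names are mine. It only shows where the ramdisk sits and says nothing about whether the DRAM behind it is healthy.

```python
# Values copied from the boot log above.
ramdisk_start = 0x07F2B000  # "Loading Ramdisk to 07f2b000"
ramdisk_end = 0x07F8E602    # "end 07f8e602"
image1_bytes = 407042       # "Image 1:   407042 Bytes"
mem_kib = 131072            # "mem=131072k" on the kernel cmdline

ramdisk_size = ramdisk_end - ramdisk_start
dram_top = mem_kib * 1024

print(f"ramdisk size: {ramdisk_size} bytes (Image 1 is {image1_bytes} bytes)")
print(f"top of DRAM:  0x{dram_top:08x}")
print(f"ramdisk ends {dram_top - ramdisk_end} bytes below the top of DRAM")
```

The ramdisk size matches Image 1 exactly, and it is loaded just under the 128 MB mark, so a fault near the top of DRAM would land in or near the ramdisk.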
 
Title: Re: Tektronix DPO 4104 kernel panic
Post by: squadchannel on September 30, 2024, 01:25:51 pm
Guessing from the log quoted in this reply, is it stuck on demux initialization?
It could be a hardware issue.

https://www.eevblog.com/forum/testgear/possible-ticking-time-bomb-in-tek-dpo3000-and-mso3000-series-of-scopes-dpo4000/msg3539784/#msg3539784