Author Topic: Agilent 34461A corrupted flash  (Read 23011 times)


Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #75 on: July 30, 2023, 07:56:56 pm »


It does support falling back to a backup config, but I don't see any other copies. That is probably optional (which would be why yours isn't attempting to load it). What does the spare area after that first page of the config look like?


all 00 all the way up to 0xC4000

Are you certain you are reading the spare area? It is the extra 0x40 (64) bytes after the end of the 0x800 byte page. How were you reading the flash?

There is definitely some calculation it is doing with the spare area in the code before it decides it is a bad block... I am just not sure what algorithm it is using.
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #76 on: July 30, 2023, 08:01:57 pm »

Are you certain you are reading the spare area? It is the extra 0x40 (64) bytes after the end of the 0x800 byte page. How were you reading the flash?

There is definitely some calculation it is doing with the spare area in the code before it decides it is a bad block... I am just not sure what algorithm it is using.

I could be quite wrong about this. So here is what I did for this last dump:
Code: [Select]
p510> nand read 0x800000 0xc0000 0x4000
.
.
p510> md.b 0x800000 0x4000

so 0x4000 in the nand read might not mean the same as 0x4000 in the memory read?
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #77 on: July 30, 2023, 08:03:51 pm »
for the previous dump I used 0x800 instead of 0x4000 in nand read

but as you see the length of the file is 0x4000 and the first 4 bytes are crc32 and that leaves 0x3ffc of data as you suggested
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #78 on: July 30, 2023, 08:21:41 pm »
ok for crc calculation I skip the first 4 bytes and select everything all the way up to 0x83F and do a crc32
i also tried selecting up to 0x7FF and also up to 0x3FFF and I never get those 4 bytes
I am using HxD hex editor to do that
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #79 on: July 30, 2023, 08:25:16 pm »
for the previous dump I used 0x800 instead of 0x4000 in nand read

but as you see the length of the file is 0x4000 and the first 4 bytes are crc32 and that leaves 0x3ffc of data as you suggested

I would guess that command is not including the spare area, as that is stored at a "lower level" on the flash and is usually not used by applications.

Every page is actually 0x840 bytes long. The last 0x40 bytes are typically used to store metadata (bad-block markers, ECC information, etc.) and are usually invisible to higher-level operations.
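To make that layout concrete, here is a minimal Python sketch that splits a raw dump into per-page data and spare parts. It assumes the dump stores each page as 0x800 data bytes immediately followed by its 0x40 spare bytes; the function name and the synthetic input are only for illustration and are not taken from the firmware.

```python
PAGE_SIZE = 0x800   # 2048 data bytes per page, as stated in the thread
SPARE_SIZE = 0x40   # 64 spare (OOB) bytes following each page

def split_pages(raw: bytes):
    """Split a raw dump (data + spare per page) into (data, spare) tuples."""
    stride = PAGE_SIZE + SPARE_SIZE  # 0x840 bytes per raw page
    if len(raw) % stride != 0:
        raise ValueError("dump length is not a multiple of 0x840")
    pages = []
    for start in range(0, len(raw), stride):
        chunk = raw[start:start + stride]
        pages.append((chunk[:PAGE_SIZE], chunk[PAGE_SIZE:]))
    return pages

# Synthetic example: two raw pages of zero data with erased (0xFF) spare.
raw = (bytes(PAGE_SIZE) + b"\xff" * SPARE_SIZE) * 2
pages = split_pages(raw)
print(len(pages), len(pages[0][0]), len(pages[0][1]))
```

A dump made with plain "nand read" would not fit this layout, since it presumably contains only the 0x800 data bytes of each page.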

Looking at the code, I see what appears to be a software-implemented ECC check that operates on the spare areas, just before the program decides to throw up that block read failure.

My guess is that data is not checking out, and causing the PBOOT to report a failure, despite the data actually being good.

UBOOT may not be implementing this check.
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #80 on: July 30, 2023, 08:29:14 pm »
ok for crc calculation I skip the first 4 bytes and select everything all the way up to 0x83F and do a crc32
i also tried selecting up to 0x7FF and also up to 0x3FFF and I never get those 4 bytes
I am using HxD hex editor to do that

I am starting at 0xc0000 + 4 to skip the CRC, then running to 0xC3FFF. The CRC32 is good on my area when I do that, and is also good on yours if I pad it out to that length with 00's (you didn't send me the whole thing to test).

So is there anything in there to the end of that area that isn't 00? I am using the CRC-32 in 010 Editor.
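The same check can be sketched in Python. This assumes the config area is 0x4000 bytes with the CRC in the first 4 bytes, and that the CRC is the standard CRC-32 that zlib computes; the byte order of the stored value is not stated in the thread, so the sketch tries both. The input here is synthetic, not a real dump.

```python
import zlib

CONFIG_LEN = 0x4000  # total length of the config area discussed in the thread

def check_config_crc(area: bytes):
    """Compare the stored 4-byte CRC with a CRC-32 over the remaining bytes."""
    # Pad a short dump with 00 bytes up to the full length, as done in the thread.
    area = area.ljust(CONFIG_LEN, b"\x00")
    stored = area[:4]
    crc = zlib.crc32(area[4:CONFIG_LEN]) & 0xFFFFFFFF
    # Byte order of the stored CRC is an assumption, so report both.
    return {
        "little": stored == crc.to_bytes(4, "little"),
        "big": stored == crc.to_bytes(4, "big"),
    }

# Synthetic example: build an area whose CRC is stored little-endian,
# then truncate it to show that the zero padding restores a valid check.
payload = b"config" + bytes(CONFIG_LEN - 4 - 6)
area = zlib.crc32(payload).to_bytes(4, "little") + payload
result = check_config_crc(area[:0x800])
print(result)
```

If neither byte order matches on a real dump, the range or the CRC variant is probably different from what is assumed here.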
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #81 on: July 30, 2023, 08:31:04 pm »
I can use the nand dump command which only dumps one page 0x800 at a time but it also dumps the out of band (.oob) data for that page
is that what you mean?
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #82 on: July 30, 2023, 08:32:43 pm »
There is a slight chance that rewriting all the pages in that block could cause it to rewrite the spare-area ECC calculations. But that depends on whether the thing you are using to rewrite those areas actually implements the ECC algorithm and updates it.

The idea would be that you write that block back exactly as it already exists, and see if the writing process takes care of whatever errors are present in the spare.

In my flash, that entire block is just the config padded by 0s followed by empty space.
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #83 on: July 30, 2023, 08:33:14 pm »
I can use the nand dump command which only dumps one page 0x800 at a time but it also dumps the out of band (.oob) data for that page
is that what you mean?

Yes, oob is another name for the spare, so that should do the trick.
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #84 on: July 30, 2023, 08:44:49 pm »


Yes, oob is another name for the spare, so that should do the trick.

I added those extra 0x40 bytes to the end of the 0x800 bytes page and made a bin file (attached)

still not getting those starting 4 bytes as CRC32  :-[ :-[
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #85 on: July 30, 2023, 08:46:17 pm »
here it is

Thanks. That looks significantly different than mine.

I think what I am going to suggest is that you dump the entire block (64 * 2048 byte pages) starting at that 0xc0000 offset and then re-write exactly the same block back to flash if you can under the Uboot prompt.

Unless there is some actual physical flash damage, that should get the spare areas back in sync for the block.

Of course, this is taking a risk, so usual disclaimer here...
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #86 on: July 30, 2023, 08:49:25 pm »
here it is

Thanks. That looks significantly different than mine.

I think what I am going to suggest is that you dump the entire block (64 * 2048 byte pages) starting at that 0xc0000 offset and then re-write exactly the same block back to flash if you can under the Uboot prompt.

Unless there is some actual physical flash damage, that should get the spare areas back in sync for the block.

Of course, this is taking a risk, so usual disclaimer here...

oh no you looked at the wrong file, I messed up there and deleted that message already  :-[ :-[

please look at the bin file I just posted. change the extension to .bin

 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #87 on: July 30, 2023, 08:51:39 pm »
the file i posted earlier and you looked at was supposed to be this one here but I dont know what happened

it is just the plain text. Then I made a bin file from it and posted above

 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #88 on: July 30, 2023, 08:57:49 pm »
here it is

Thanks. That looks significantly different than mine.

I think what I am going to suggest is that you dump the entire block (64 * 2048 byte pages) starting at that 0xc0000 offset and then re-write exactly the same block back to flash if you can under the Uboot prompt.

Unless there is some actual physical flash damage, that should get the spare areas back in sync for the block.

Of course, this is taking a risk, so usual disclaimer here...

oh no you looked at the wrong file, I messed up there and deleted that message already  :-[ :-[

please look at the bin file I just posted. change the extension to .bin

Looks the same. (I had taken the text and copied it into my hex editor to reproduce it).

The disturbing thing is that looking at your boot process, it basically told you that every block up to the end of flash from there has the same issue, which points to possible physical damage.

I would still try rewriting that block and see if the error clears for that one, but you may have other errors if that fixes the boot config block.
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #89 on: July 30, 2023, 09:01:42 pm »
it does not look out of ordinary to me

by the way, if i want to dump the nand and write it back into the nand in uboot
I cannot write into those out of band areas. "nand dump" dumps one page at a time with those oob
but nand write has no way of writing in those oob area unless they are calculated automatically and updated by the nand controller

in that case I can just read the data by nand read and write it back instead of dumping one page at a time
I am only concerned about nand write command because still I am not sure what is the third parameter
is it a page size or is it # of bytes or some other block size? that scares me  :scared: :scared:
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #90 on: July 30, 2023, 09:03:37 pm »
can you point me to what errors/problems you see in that last bin file with .oob data?
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #91 on: July 30, 2023, 09:07:13 pm »
can you point me to what errors/problems you see in that last bin file with .oob data?

Without knowing the exact structure they are using, I can't say what is wrong with it exactly, except that the structure appears to be different. There could be a lot of reasons for this. See my spare on the first page for the config below.
Code: [Select]
FF FF FF FF FF FF FF FF 95 A5 6A 69 59 FF FF FF
6A F3 3F CC FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

Yours has more actual data in it, and is only filling the one side. Since these are structured in a very specific way, it is unusual to see such a large difference.
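For comparing two spare areas byte by byte, a small Python sketch like the following may help. It parses the 64-byte spare quoted above and lists which offsets hold something other than FF; the helper names are made up for illustration, and nothing here interprets the ECC or metadata layout, which is still unknown.

```python
SPARE_HEX = """
FF FF FF FF FF FF FF FF 95 A5 6A 69 59 FF FF FF
6A F3 3F CC FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
"""

def parse_hex(text: str) -> bytes:
    """Turn a whitespace-separated hex dump into bytes."""
    return bytes.fromhex("".join(text.split()))

def non_erased_offsets(spare: bytes):
    """Offsets within the spare that are not 0xFF (i.e. actually programmed)."""
    return [i for i, b in enumerate(spare) if b != 0xFF]

def diff_spares(a: bytes, b: bytes):
    """List (offset, byte_a, byte_b) for every position where two spares differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

spare = parse_hex(SPARE_HEX)
print([hex(i) for i in non_erased_offsets(spare)])
```

Running diff_spares on this spare and the one from the other unit would show exactly where the two structures diverge.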
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #92 on: July 30, 2023, 09:09:00 pm »
it does not look out of ordinary to me

by the way, if i want to dump the nand and write it back into the nand in uboot
I cannot write into those out of band areas. "nand dump" dumps one page at a time with those oob
but nand write has no way of writing in those oob area unless they are calculated automatically and updated by the nand controller

in that case I can just read the data by nand read and write it back instead of dumping one page at a time
I am only concerned about nand write command because still I am not sure what is the third parameter
is it a page size or is it # of bytes or some other block size? that scares me  :scared: :scared:


DO NOT try to write the OOB data. Just rewrite the block normally. The functions that write the block SHOULD calculate the data that needs to go into the OOB and write it for you!
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #93 on: July 30, 2023, 09:14:00 pm »
it does not look out of ordinary to me

by the way, if i want to dump the nand and write it back into the nand in uboot
I cannot write into those out of band areas. "nand dump" dumps one page at a time with those oob
but nand write has no way of writing in those oob area unless they are calculated automatically and updated by the nand controller

in that case I can just read the data by nand read and write it back instead of dumping one page at a time
I am only concerned about nand write command because still I am not sure what is the third parameter
is it a page size or is it # of bytes or some other block size? that scares me  :scared: :scared:


DO NOT try to write the OOB data. Just rewrite the block normally. The functions that write the block SHOULD calculate the data that needs to go into the OOB and write it for you!

From what I can see in the uboot help, the commands should be similar to the read command.

Have you tried the "nand bad" command to see if any bad blocks are known?

You could also try writing this to the NEXT block assuming it is completely blank already. If you noticed during bootup, it does try the next block all the way to the end of flash.
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #94 on: July 30, 2023, 09:16:49 pm »


DO NOT try to write the OOB data. Just rewrite the block normally. The functions that write the block SHOULD calculate the data that needs to go into the OOB and write it for you!

yeah but still I am not quite sure how the nand write command behaves. Is it taking # of bytes for the size parameter or number of 2kB pages or some other block size
For example, in the uboot variables I see some commands like this:
Code: [Select]
nand erase 0x00620000 ${blocksize};nand write 0x4000000 0x00620000 ${blocksize}

but I cannot find where the blocksize was defined.

Help on nand commands in uboot gives me this:
Code: [Select]
p510> help nand
nand - NAND sub-system

Usage:
nand info - show available NAND devices
nand device [dev] - show or set current device
nand read - addr off|partition size
nand write - addr off|partition size
    read/write 'size' bytes starting at offset 'off'
    to/from memory address 'addr', skipping bad blocks.
nand erase [clean] [off size] - erase 'size' bytes from
    offset 'off' (entire device if not specified)
nand bad - show bad blocks
nand dump[.oob] off - dump page
nand scrub - really clean NAND erasing bad blocks (UNSAFE)
nand markbad off [...] - mark bad block(s) at offset (UNSAFE)
nand biterr off - make a bit error at offset (UNSAFE)

which seems to suggest the third parameter is just # of bytes but those in the uboot variables confuse me

 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #95 on: July 30, 2023, 09:18:48 pm »
"nand bad" shows no bad blocks
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #96 on: July 30, 2023, 09:32:50 pm »


From what I can see in the uboot help, the commands should be similar to the read command.

Have you tried the "nand bad" command to see if any bad blocks are known?

You could also try writing this to the NEXT block assuming it is completely blank already. If you noticed during bootup, it does try the next block all the way to the end of flash.

the errors in the log start from
Code: [Select]
Reading NAND configuration
FMD_DirectRead: Invalid block at sector 0x180 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x1c0 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x200 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x240 bumping by 0x40 sectors

and it keeps bumping by 0x40 sectors whatever that means until the last error which is
Code: [Select]
FMD_DirectRead: Invalid block at sector 0xff80 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0xffc0 bumping by 0x40 sectors

how is sector related to page? how long is it?

"nand info" commands says "sector size 128 KiB" but seems way too large to me
« Last Edit: July 30, 2023, 09:35:08 pm by analogRF »
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #97 on: July 30, 2023, 09:39:42 pm »


From what I can see in the uboot help, the commands should be similar to the read command.

Have you tried the "nand bad" command to see if any bad blocks are known?

You could also try writing this to the NEXT block assuming it is completely blank already. If you noticed during bootup, it does try the next block all the way to the end of flash.



the errors in the log start from
Code: [Select]
Reading NAND configuration
FMD_DirectRead: Invalid block at sector 0x180 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x1c0 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x200 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0x240 bumping by 0x40 sectors

and it keeps bumping by 0x40 sectors whatever that means until the last error which is
Code: [Select]
FMD_DirectRead: Invalid block at sector 0xff80 bumping by 0x40 sectors
FMD_DirectRead: Invalid block at sector 0xffc0 bumping by 0x40 sectors

how is sector related to page? how long is it?

"nand info" commands says "sector size 128 KiB" but seems way too large to me

They are using sector size = 2048 bytes, so the same thing as a page. Multiply the sector number by 2048 to get the byte offset.
It is effectively starting from 0xc0000 and going to the end of flash.
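The arithmetic can be checked with a short Python sketch. It assumes 2048-byte sectors and 0x40 sectors per erase block, as the log and the earlier block-size discussion suggest; the function names are just for illustration.

```python
SECTOR_SIZE = 0x800        # 2048 bytes, same as one page
SECTORS_PER_BLOCK = 0x40   # the log bumps by 0x40 sectors each time
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK  # 0x20000 bytes = 128 KiB

def sector_to_offset(sector: int) -> int:
    """Byte offset in flash for a sector number from the boot log."""
    return sector * SECTOR_SIZE

def sector_to_block(sector: int) -> int:
    """Index of the erase block that contains the sector."""
    return sector // SECTORS_PER_BLOCK

for sector in (0x180, 0x1C0, 0xFFC0):
    print(f"sector {sector:#x}: offset {sector_to_offset(sector):#x}, "
          f"block {sector_to_block(sector)}")
```

So sector 0x180 is offset 0xC0000, and the last sector in the log, 0xffc0, starts the final 128 KiB block of a 128 MiB device. The 128 KiB that "nand info" reports as sector size matches this erase-block size rather than the 2048-byte sector used in the log.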
 

Online ElectronMan

  • Regular Contributor
  • *
  • Posts: 111
  • Country: us
Re: Agilent 34461A corrupted flash
« Reply #98 on: July 30, 2023, 09:48:35 pm »
So I just tested this from uboot: it reads out the whole block and writes it to the next block (the next block was empty in mine).

It seems to work without any issue for me. But YMMV.

The MD was just to check that memory area was mostly empty.
Code: [Select]
md 0x800000 0x20000
nand read 0x800000 0xc0000 0x20000
nand write 0x800000 0xe0000 0x20000
 

Offline analogRF

  • Super Contributor
  • ***
  • Posts: 1024
  • Country: ca
Re: Agilent 34461A corrupted flash
« Reply #99 on: July 30, 2023, 09:52:44 pm »


They are using sector size = 2048 bytes, so the same thing as a page. Multiply that # by 2048 to get offset.
It is effectively starting from c0000 and going to the end of flash.

yes I just figured that out but it is strange because when the unit boots
I am 100% sure it reads a lot from the flash and never complains, for example, about cal data being corrupted etc...
it cannot be bad all the way to 128MB

I was just looking at the dumps of some random pages after 0xC0800 and they are all 00 (with identical oob data which has 4 rows of data and 4 rows of FF alternating) until 0xC4000, where all data becomes FF including all the oob, and that continues for a very long stretch....
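Instead of spot-checking random pages, a dump can be summarized page by page. The Python sketch below classifies each 0x800-byte page as all 00, all FF (erased), or containing data, and reports where the kind changes. It assumes a data-only dump without oob bytes; the input is synthetic and only mimics the pattern described above.

```python
PAGE_SIZE = 0x800  # 2048 data bytes per page

def classify_page(data: bytes) -> str:
    """Label a page as 'zero', 'erased' (all 0xFF) or 'data'."""
    if data == bytes(len(data)):
        return "zero"
    if data == b"\xff" * len(data):
        return "erased"
    return "data"

def summarize(dump: bytes, base: int = 0):
    """Return (flash_offset, kind) for each point where the page kind changes."""
    runs = []
    for start in range(0, len(dump), PAGE_SIZE):
        kind = classify_page(dump[start:start + PAGE_SIZE])
        if not runs or runs[-1][1] != kind:
            runs.append((base + start, kind))
    return runs

# Synthetic dump: one page with data, seven zero pages, two erased pages.
dump = (b"\x01" + bytes(PAGE_SIZE - 1)) + bytes(7 * PAGE_SIZE) + b"\xff" * (2 * PAGE_SIZE)
runs = summarize(dump, base=0xC0000)
for offset, kind in runs:
    print(f"{offset:#x}: {kind}")
```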

 

