Topic: NAND Flash chip replacement / equivalent & programming


Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
NAND Flash chip replacement / equivalent & programming
« on: March 02, 2023, 09:05:08 am »
Hi all. I am working on some main boards for a vending machine that run a cut-down Linux of some kind. They're getting to the age where they are having issues, which I believe are related to the flash. Reloading fresh firmware sometimes spews pages of "skipping bad block 0x000390000" (or whatever address) during the installation. Sometimes even loading fresh firmware won't fix it; other times it helps, but the board will still have random freezes or settings that cannot be adjusted for some strange reason. Due to chip shortages, terrorism, pandemic and solar winds, supply of new original boards is expensive and spotty at best, and I would rather repair something than throw it away, especially when I have a few boxes of them to go through.

The original part is MT29F1G08ABADAH4-IT (NQ277 marking) and I have sourced several "equivalent" parts (as far as I can tell) from a reputable source (Digikey).

Replacement parts I sourced:

S34ML01G200BHI000: Programmed fine; same basic specs as the original, from what I can tell.
IS34ML01G084: Programmed fine; same basic specs as the original, from what I can tell.
MX30LF1G28AD-XKI: The programmer gave errors, apparently because the OOB (spare area) size is different, so I didn't program or try this chip.


I removed an original, good-condition, working flash chip from a board with a fresh installation of the firmware. I then read the contents out with my programmer, programmed an equivalent chip and soldered it to the board, but the board no longer boots. It either does a watchdog-type reset with a flashing red light, or I get no activity lights at all other than the main power light. Programming verified 100% OK, and I tried both the slow speed mode and normal mode during programming just to be sure there were no issues. I'm using an XGECU T56 programmer with the latest 12.50 software and the latest firmware loaded on the programmer (prompted on startup of the latest software).

To rule out an epic fail, as this is the first time I've done any kind of BGA soldering, I then resoldered the original flash chip to the board and it still works, so I haven't killed the board or anything with all this messing around (surprisingly).

I'm out of my depth on how 'specific' the board and other hardware could be to the chip that is installed. The original parts are NLA through the places I know (Mouser, Digikey, element14), and I'm very hesitant to buy from eBay (although they do list some for sale) because I'm considering the boards "refurbished" and have to provide a warranty for my work. Could there be some other data on the chip that isn't being read or copied? Could it be minor differences in how the different brands/models of chips work? I don't really understand most of the in-depth stuff in the datasheets.

Any help would be... helpful.

Thanks.
 

Offline Mario87

  • Regular Contributor
  • *
  • Posts: 249
  • Country: gb
Re: NAND Flash chip replacement / equivalent & programming
« Reply #1 on: March 02, 2023, 10:53:43 am »
According to Silicon Expert (paid service we have access to at work) none of the devices you found are crosses with the original. All crosses are variants of SK Hynix HY27UA081G1M devices and they are a 'B-Cross' which means "Pin to Pin compatible with minor electrical differences and/or minor package dimension."

All other devices which they state are crosses with your original are obsolete now, just as your original component is.
 

Offline fzabkar

  • Super Contributor
  • ***
  • Posts: 2735
  • Country: au
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #3 on: March 02, 2023, 04:42:42 pm »
Thanks for that. I’ll look into the alternate part you mentioned.
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #4 on: March 02, 2023, 04:44:57 pm »
Thanks for that. I did look over the datasheets, but for that one part I didn't notice the difference before I purchased it. Thankfully the software was able to tell me before I went to the effort of soldering it.
 

Offline fzabkar

  • Super Contributor
  • ***
  • Posts: 2735
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #5 on: March 02, 2023, 05:00:52 pm »
The page size of the Hynix part is 512 + 16 bytes.

The Micron device has 4-bit internal ECC. The Hynix part appears to rely totally on external ECC. AFAICT, the Hynix part is less compatible than the others.
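
To make the page-size point concrete, here is a minimal Python sketch comparing two layouts. The 2048 + 64 and 512 + 16 page sizes are the ones mentioned in this thread; the pages-per-block and block counts are assumptions for typical 1 Gbit parts and should be checked against each datasheet. `NandGeometry` and its fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandGeometry:
    name: str
    page_data: int        # main-area bytes per page
    page_spare: int       # spare (OOB) bytes per page
    pages_per_block: int  # assumption, check the datasheet
    blocks: int           # assumption, check the datasheet

    @property
    def raw_size(self) -> int:
        """Size in bytes of a full dump that includes the spare area."""
        return (self.page_data + self.page_spare) * self.pages_per_block * self.blocks

    def same_layout(self, other: "NandGeometry") -> bool:
        """True only if every layout parameter matches."""
        mine = (self.page_data, self.page_spare, self.pages_per_block, self.blocks)
        theirs = (other.page_data, other.page_spare, other.pages_per_block, other.blocks)
        return mine == theirs


# Assumed example geometries for 1 Gbit parts (not taken from any datasheet here).
large_page = NandGeometry("large-page part", 2048, 64, 64, 1024)
small_page = NandGeometry("small-page part", 512, 16, 32, 8192)

if __name__ == "__main__":
    for geo in (large_page, small_page):
        print(geo.name, geo.raw_size)
    print("same layout:", large_page.same_layout(small_page))
```

Under these assumptions both parts have the same raw size, yet the page and block structure differs, so an image read from one cannot simply be written to the other.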
« Last Edit: March 02, 2023, 05:06:31 pm by fzabkar »
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #6 on: March 02, 2023, 05:01:22 pm »
I looked up that part, its package dimensions are larger (8.5x15 vs 9x11) which in this case is no issue as there is space on the board but I’ll have to check if I have an adapter for my programmer to locate the part with a different package size. 

Could I trouble you to list the other parts which are correct crosses? I don't get a lot of hits for the Hynix part except on some Chinese websites and Alibaba.
 

Online mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 14068
  • Country: gb
    • Mike's Electric Stuff
Re: NAND Flash chip replacement / equivalent & programming
« Reply #7 on: March 02, 2023, 05:12:54 pm »
The problem with NAND flash is that parts will have a number of bad blocks, and these will be mapped out by the device's filesystem. If you replace the part, even with the exact same part number, the bad blocks will be in different places, so the bad-block map you copied from the old chip will be wrong.
There are also likely to be subtle differences in commands, layout, erase block sizes etc. which may cause problems.

The way I'd suggest approaching it is to try reading the old device over a range of reduced supply voltages, as a marginal flash cell may read OK at a lower threshold, and then erasing and re-writing.
This method can work on EPROMs and NOR flash. I don't know if there are any differences in NAND that make this less likely to work, though. I'd imagine it probably won't work on MLC flash, as the thresholds are probably defined by internal references.
Another possibility is that there are areas that are being written and have died due to flash endurance issues. NAND can also suffer degradation from very large numbers of reads ("read-disturb" errors).
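
As a rough illustration of the bad-block point, the sketch below scans a raw programmer dump for blocks that carry a bad-block marker. It assumes the dump interleaves each page's data with its spare area, and that a bad block is flagged by a non-0xFF value in the first spare byte of the block's first page. That is a common convention, but the marker position and the default geometry here are assumptions to verify against the datasheet. `find_marked_bad_blocks` is a hypothetical helper, not part of any programmer software.

```python
def find_marked_bad_blocks(dump: bytes,
                           page_data: int = 2048,
                           page_spare: int = 64,
                           pages_per_block: int = 64) -> list[int]:
    """Return indices of blocks whose first page has a non-0xFF first spare byte.

    Assumes `dump` is laid out as page data followed by that page's spare
    bytes, repeated for every page (an assumption about the dump format).
    """
    page_raw = page_data + page_spare
    block_raw = page_raw * pages_per_block
    bad = []
    for block in range(len(dump) // block_raw):
        # First spare byte of the first page in this block.
        marker = dump[block * block_raw + page_data]
        if marker != 0xFF:
            bad.append(block)
    return bad


if __name__ == "__main__":
    # Tiny synthetic dump: 4 blocks, 2 pages per block, 8 + 2 bytes per page.
    demo = bytearray(b"\xff" * 80)
    demo[2 * 20 + 8] = 0x00  # mark block 2 as bad
    print(find_marked_bad_blocks(bytes(demo), 8, 2, 2))
```

Running the same scan on dumps from two different chips would show whether the marked blocks differ, which is the reason a raw image copied from one chip may not suit another.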



 




Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #8 on: March 02, 2023, 07:58:54 pm »
Thanks Mike. To try to avoid bad-block issues, I specifically chose to read out a 'known good' chip, which reported no bad blocks when loading fresh firmware other than one specific address that shows up on every 'good' board I've programmed. I realise that even a new chip can have bad blocks which are internally flagged as bad by the controller, but I don't really understand how that translates to how the data is allocated. I guess there is no 'hard coding' for accessing an address when it's just a Linux file system on the chip, as opposed to a micro accessing an EPROM.

This repair is going to be a lot more complicated than I was hoping it seems.

I’m now thinking I could take another flash chip from a good board, erase it, reprogram it with the image I’ve been using and put it onto another board. If THAT fails then I know it’s my process. If it works then I know it’s a chip compatibility issue and I need to seek out some surplus original chips.
 

Offline fzabkar

  • Super Contributor
  • ***
  • Posts: 2735
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #9 on: March 02, 2023, 08:12:03 pm »
The "spare area" usually contains the ECC bytes plus the logical block address. The controller must keep a map table which correlates each LBA with its corresponding physical block address. This is done for wear levelling purposes. I don't know whether your application warrants wear levelling; if it doesn't, there may not be a Flash Translation Layer (FTL).
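
The idea of a map table can be sketched as follows. This is only an illustration of the general technique: it assumes one spare-area record per physical block and a logical block address stored little-endian at a known offset in that record. The real offset, width and byte order in this firmware are unknown, so `lba_offset`, `lba_size` and `build_block_map` are all hypothetical.

```python
def build_block_map(spares: list[bytes], lba_offset: int, lba_size: int = 2) -> dict[int, int]:
    """Build a logical-to-physical block map from per-block spare records.

    spares[i] is the spare record of physical block i. A field of all 0xFF
    is treated as erased/unused and skipped. The field position and
    little-endian encoding are assumptions for illustration only.
    """
    erased = b"\xff" * lba_size
    mapping: dict[int, int] = {}
    for physical, spare in enumerate(spares):
        field = spare[lba_offset:lba_offset + lba_size]
        if len(field) < lba_size or field == erased:
            continue  # no usable LBA in this record
        mapping[int.from_bytes(field, "little")] = physical
    return mapping


if __name__ == "__main__":
    records = [
        b"\xff\xff\xff\xff",  # erased block
        b"\x00\x00\x02\x00",  # holds logical block 2
        b"\x00\x00\x00\x00",  # holds logical block 0
        b"\x00\x00\x01\x00",  # holds logical block 1
    ]
    print(build_block_map(records, lba_offset=2))
```

If a layer like this is in use, logically consecutive data can sit in scattered physical blocks, which would look like discontinuities when reading the raw dump in physical order.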

If you could upload a complete [compressed] flash dump, including the spare area (OOB), we may be able to see how the firmware is structured.
« Last Edit: March 02, 2023, 08:15:17 pm by fzabkar »
 

Offline thm_w

  • Super Contributor
  • ***
  • Posts: 7361
  • Country: ca
  • Non-expert
Re: NAND Flash chip replacement / equivalent & programming
« Reply #10 on: March 03, 2023, 01:46:08 am »
There is that eBay USA seller, as well as the grey market, if it does turn out that the same NAND is required: https://octopart.com/search?q=MT29F1G08ABADAH4&currency=USD&specs=0

The chance of getting something fake would be low, and if it was fake it should be obvious. If you order from China there is a decent chance of getting a used, reballed IC. But most of those grey market suppliers should be OK.

There may be a small chance that whatever low-level bootloader is on the device is checking the NAND ID and throwing an error. But it sounds like you have a serial debug output and it's not saying anything for the non-working IC, right?
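
For illustration, a bootloader-style NAND ID check might look something like the sketch below. The first byte returned by the READ ID command is the JEDEC manufacturer code; the four codes listed are the standard ones for these vendors. The allow-list, the example ID bytes and the function names are hypothetical, and nothing here is taken from this board's actual bootloader.

```python
# JEDEC manufacturer codes (first byte of the READ ID response).
KNOWN_MANUFACTURERS = {
    0x2C: "Micron",
    0x01: "Spansion/Cypress",
    0xC2: "Macronix",
    0xAD: "SK Hynix",
}


def manufacturer_name(read_id: bytes) -> str:
    """Name the manufacturer from the first ID byte, or 'unknown'."""
    if not read_id:
        return "unknown"
    return KNOWN_MANUFACTURERS.get(read_id[0], "unknown")


def id_accepted(read_id: bytes, allowed_prefixes: set[bytes]) -> bool:
    """True if the ID starts with any prefix on a (hypothetical) allow-list."""
    return any(read_id[:len(prefix)] == prefix for prefix in allowed_prefixes)


if __name__ == "__main__":
    # Hypothetical allow-list: only one manufacturer/device code pair accepted.
    allowed = {bytes([0x2C, 0xF1])}
    original = bytes([0x2C, 0xF1, 0x80, 0x95])     # illustrative ID bytes
    replacement = bytes([0x01, 0xF1, 0x80, 0x1D])  # illustrative ID bytes
    for chip_id in (original, replacement):
        print(manufacturer_name(chip_id), id_accepted(chip_id, allowed))
```

If the firmware does something like this, a replacement from another manufacturer would be rejected even if its geometry matched, which a serial debug log would normally reveal.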
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #11 on: March 03, 2023, 03:21:35 am »
Thanks for the link. Once I test the original flash part that I swap, I’ll know for sure. I haven’t hooked up any serial debug at this point, as there is a specific diagnostic light flash sequence on the board that should happen within the first 5 seconds of power to indicate that it’s happy.
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #12 on: March 03, 2023, 07:56:51 am »
It's a vending machine, so the device is on 24/7 running animations, monitoring and managing fridge temperature, communicating with payment systems etc.

This is the binary I extracted from the good chip. The OTP area is empty, so this is just a dump of the flash blocks. Hopefully that includes what you need.

https://fastupload.io/RGXoJYvusVQjTzx/file
 

Offline fzabkar

  • Super Contributor
  • ***
  • Posts: 2735
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #13 on: March 03, 2023, 08:01:38 pm »
There is a small area at the start of the dump which is configured as 512 + 16 bytes (data + spare area). Thereafter it becomes 2048 + 64 bytes.

I have extracted the data and spare areas as separate components. The data area at the beginning appears to be contiguous and fixed, whereas the rest of the data appear to be subject to wear levelling. This is evidenced by discontinuities in several text blocks.

http://users.on.net/~fzabkar/temp/Vending_MC.7z

You can view the SA files by setting your hex editor to 16 or 64 bytes per line. The patterns will then be self evident. That said, I haven't yet worked out the functions of each byte.
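
For anyone who wants to repeat the split without a hex editor, here is a minimal Python sketch. It assumes the raw dump is a simple repetition of page data followed by that page's spare bytes, with the sizes passed in by the caller (for example 2048 and 64, or 512 and 16). `split_dump` and `hex_rows` are hypothetical helper names, and this is not necessarily how the files in the archive above were produced.

```python
def split_dump(raw: bytes, data_size: int, spare_size: int) -> tuple[bytes, bytes]:
    """Separate an interleaved dump into (all data bytes, all spare bytes).

    Any trailing partial page is ignored.
    """
    page = data_size + spare_size
    data = bytearray()
    spare = bytearray()
    for off in range(0, len(raw) - page + 1, page):
        data += raw[off:off + data_size]
        spare += raw[off + data_size:off + page]
    return bytes(data), bytes(spare)


def hex_rows(buf: bytes, width: int) -> list[str]:
    """Format a buffer as hex text with `width` bytes per row."""
    return [buf[i:i + width].hex(" ") for i in range(0, len(buf), width)]


if __name__ == "__main__":
    # Toy example: two pages of 4 data bytes + 2 spare bytes each.
    raw = bytes(range(6)) * 2
    data, spare = split_dump(raw, data_size=4, spare_size=2)
    print(hex_rows(data, 4))
    print(hex_rows(spare, 2))
```

Printing the spare bytes with the row width equal to the spare size (16 or 64) lines up one page per row, which is what makes repeating fields stand out.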

I use HxD (freeware hex editor).

https://mh-nexus.de/en/hxd/
« Last Edit: March 03, 2023, 08:09:13 pm by fzabkar »
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #14 on: March 03, 2023, 11:52:00 pm »
So it sounds to me then that some kind of initial software loading needs to be done "through the board" as opposed to directly programming the chips before they're installed on the board.
 

Offline Cozzmo (Topic starter)

  • Regular Contributor
  • *
  • Posts: 67
  • Country: au
Re: NAND Flash chip replacement / equivalent & programming
« Reply #15 on: March 04, 2023, 09:43:03 am »
Looking at the original firmware loading file (that is loaded via USB) it is in this format, not sure if that helps at all?

https://yaffs.net/yaffs-2-specification
 

