Author Topic: STM32 interrupt affecting functionality of FatFS (Elm Chan)  (Read 1502 times)

Offline semir-tTopic starter

  • Regular Contributor
  • *
  • Posts: 64
  • Country: ba
Hi everyone,

I am currently working on a project where NOR External Flash is connected to the MCU via QSPI. I am utilizing FatFS (Elm Chan) to store data in memory as files. When data access is needed, a USB connection between the device and PC enables the device to function as a Mass Storage Device.

However, I am encountering an intermittent issue where FatFS stops working, returning an error code of FR_INVALID_OBJECT. This error is difficult to replicate consistently, but I suspect it might be caused by a frequently called ISR. Additionally, I joined this project at a later stage and discovered that this ISR is unusually long. My plan is to eventually optimize the ISR by moving certain tasks to the main program execution to shorten it, though this is not my immediate priority.

Do you think the ISR could be causing this, or is there something else I should be looking for?
 

Offline CountChocula

  • Supporter
  • ****
  • Posts: 208
  • Country: ca
  • I break things—sometimes on purpose.
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #1 on: May 22, 2024, 12:29:13 pm »
Did you try disabling the ISR to see if the error continues to occur?
Lab is where your DMM is.
 

Offline semir-tTopic starter

  • Regular Contributor
  • *
  • Posts: 64
  • Country: ba
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #2 on: May 22, 2024, 01:17:41 pm »
Yes, when I disable this long IRQ the device appears to work, but I can't be sure as the issue is random. Sometimes the device can work for up to a few hours without any issues, and sometimes it fails after a few minutes.

This is what I found on the FatFS document web-site:
Quote
If a write operation to the FAT volume is interrupted due to an accidental failure, such as sudden blackout, wrong media removal and unrecoverable disk error, the FAT structure on the volume can be broken.

This applies to media removal and similar situations, but I was wondering whether the same thing can happen because of the long IRQ. Maybe the I/O driver somehow fails and I get the same kind of damage as described there. One thing I notice is that the file system is broken after this issue, and I see a lot of dummy files with strange names.
 

Offline ataradov

  • Super Contributor
  • ***
  • Posts: 11474
  • Country: us
    • Personal site
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #3 on: May 22, 2024, 01:50:09 pm »
Just to clarify, are you using FatFS at the same time as the host is connected via USB? If so, then this is not something that would ever work.

If FatFS and USB access to the flash are separated, then which specific interrupts do you think cause this?
Alex
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3773
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #4 on: May 22, 2024, 01:56:47 pm »
Are you writing the FLASH device within the ISR? I went around this a while ago and posted loads here :)

e.g.
https://www.eevblog.com/forum/microcontrollers/cleanest-way-to-block-usb-interrupts/

The simplest way to implement FatFS / USB MSC is to do the whole FLASH write inside the ISR so that ISR may hold up lower priority stuff for the write time which might be say 15ms! But apparently (see the various threads of mine) this is what SD cards do and it is the max compatibility mode.

The "proper solution" is to write a "driver" for the FLASH (or equiv) so that the USB MSC ISR is short, but this is really hard unless you have tons of RAM for caching writes, or you return a Busy state to the Host so it keeps retrying (which is poorly documented and often doesn't work).

Quote
are you using FatFS at the same time as the host is connected via USB? If so, then this is not something that would ever work.

Why not?

The USB MSC stuff is done in the USB ISR. The Host (Windows usually) just sees a block device, 512 byte blocks (best done with a FLASH which actually has 512 byte blocks, obviously, but you can do blocking/deblocking as in hard disks) and knows nothing about the embedded code.
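
As a rough illustration of the blocking/deblocking mentioned above, here is a minimal sketch of mapping a 512-byte logical sector onto a larger flash erase block. The 4096-byte erase block size and the names `flash_loc` and `locate_sector` are assumptions made for this example, not taken from any particular driver.

```c
#include <stdint.h>

#define LOGICAL_SECTOR_SIZE 512u    /* what the USB host sees */
#define ERASE_BLOCK_SIZE    4096u   /* assumption: erase granularity of the flash */
#define SECTORS_PER_BLOCK   (ERASE_BLOCK_SIZE / LOGICAL_SECTOR_SIZE)

/* Hypothetical result type: where a logical sector lives in the flash. */
struct flash_loc {
    uint32_t erase_block;   /* index of the erase block */
    uint32_t offset;        /* byte offset inside that erase block */
};

/* Translate a logical block address (LBA) into erase block + offset. */
struct flash_loc locate_sector(uint32_t lba)
{
    struct flash_loc loc;
    loc.erase_block = lba / SECTORS_PER_BLOCK;
    loc.offset      = (lba % SECTORS_PER_BLOCK) * LOGICAL_SECTOR_SIZE;
    return loc;
}
```

A write to one logical sector then typically means reading the whole erase block into RAM, patching the bytes at `offset`, erasing, and programming the block back, which is part of why writes are slow.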

The internal filesystem access is done with FatFS which provides the embedded code with a file-based "view" of the block device. Usually FAT12 unless you have 8MB or more, IIRC, and then FAT16.

There will always be things to watch if both internal code and USB MSC are active concurrently, but there are ways to handle that. These are notes from my project manual:

====

The 2MB FAT filesystem is accessible to both a user application program running in the XXX and via USB as a removable mass storage media visible to the USB Host (Windows etc).
Read speed varies from 300kbytes/sec to 2Mbytes/sec, depending on various factors. Write speed is around 30kbytes/sec - the FLASH programming speed.

File System Limitations
The 4MB FLASH device has an endurance of 100k writes, on a per-byte basis. The FAT directory area is written on every file write and this thus gets the most wear. Care must therefore be exercised when writing code which does file write operations. There is no limit on reading operations since no "last-accessed" timestamp gets written - FAT12/FAT16 file systems have only the "last-modified" timestamp.
Only "8.3" filenames (names in the format xxxxxxxx.xxx) are supported e.g.
filename.txt
file1.txt
f.a
but not filename2.txt3, because more than 8 characters in the name or more than 3 in the extension are not allowed. Wiki article: https://en.wikipedia.org/wiki/8.3_filename
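
The 8.3 rule above can be checked in user code before a file is created. The following is only a sketch: `is_valid_83_name` is a hypothetical helper, and it deliberately simplifies the allowed character set to letters, digits, `_` and `-` rather than implementing the full FAT rules.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if 'name' has 1..8 name characters and,
 * optionally, a dot followed by 1..3 extension characters.
 * Simplification: only letters, digits, '_' and '-' are accepted. */
bool is_valid_83_name(const char *name)
{
    const char *dot = strchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;

    if (base_len < 1 || base_len > 8)
        return false;
    if (dot && (ext_len < 1 || ext_len > 3))
        return false;
    if (dot && strchr(dot + 1, '.'))        /* a second dot is not allowed */
        return false;

    for (const char *p = name; *p; ++p) {
        if (*p == '.')
            continue;
        if (!isalnum((unsigned char)*p) && *p != '_' && *p != '-')
            return false;
    }
    return true;
}
```

With this check, filename.txt, file1.txt and f.a are accepted, while filename2.txt3 is rejected on both the name and the extension length.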

However, nothing stops a USB Host (e.g. Windows) creating whatever filenames and directory structures it is able to according to its own operating system limitations. Within the FAT12 drive it can create more or less anything. If the Host creates a non-8.3 filename, it will be visible to user code under its 8.3 alternate filename which the Host must create, by convention.

All files accessible to user code must be in the root of the drive. Subdirectories (folders) are not supported. If a USB Host creates a subdirectory, this will not be accessible to user code although its name will be visible to the get_file_list() and get_file_properties() functions.

Host Operating System Caching issues
It is important to realise that the USB Host sees the 2MB of FLASH as no more than a number of 512 byte sectors representing a FAT12 removable storage device, which it owns entirely. It can perform whatever operations it wants to in there, without regard to whether the XXX internal software understands it. The XXX system software is viewing this 2MB FLASH block from the other side and uses the FatFS embedded filesystem module
http://elm-chan.org/fsw/ff/00index_e.html
to interpret the data as a FAT12 filesystem.

Looking at this in the opposite direction (XXX to USB Host), files created or modified by the XXX do not immediately become visible to the USB Host. This is due to Host operating system architecture; for example Windows assumes nobody else is changing the data on a removable drive, and almost never checks for changes. It checks if it detects that the media has been removed or mounted. See file_usb_eject() and file_usb_insert() functions on how to make XXX file changes visible to the USB Host.

Files created or modified by the USB Host should become visible to XXX code immediately, assuming the data is actually written and the directory updated. This does not always happen because the OS may not flush the data to the USB drive. This was particularly true for older versions of Windows (before winXP).

Filenames are not case-sensitive. All filenames created by user applications are converted to uppercase.
The FLASH device used for the filesystem implements a per-sector pre-read on writes and performs the write only if the data is different. This increases the write speed by around 50x if the data has not changed. It can sometimes appear confusing in that a file writes extremely fast; this is because it has not been changed, or only a few bytes were changed! This is sometimes visible with XXX firmware updates.
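
The pre-read-before-write idea can be sketched as follows. This is not the actual driver from the project above: the flash is simulated with a RAM array so the example is self-contained, and `flash_read`, `flash_program` and `write_sector_if_changed` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define SECTOR_COUNT 8u

/* Stand-in for the real device, so the sketch runs on a PC. */
static uint8_t  sim_flash[SECTOR_COUNT][SECTOR_SIZE];
static unsigned program_count;   /* how many slow program operations ran */

static void flash_read(uint32_t sector, uint8_t *buf)
{
    memcpy(buf, sim_flash[sector], SECTOR_SIZE);
}

static void flash_program(uint32_t sector, const uint8_t *buf)
{
    memcpy(sim_flash[sector], buf, SECTOR_SIZE);
    program_count++;
}

/* Read the sector first and program it only if the contents differ.
 * Returns true if a program operation was actually performed. */
bool write_sector_if_changed(uint32_t sector, const uint8_t *data)
{
    uint8_t current[SECTOR_SIZE];

    if (sector >= SECTOR_COUNT)
        return false;

    flash_read(sector, current);
    if (memcmp(current, data, SECTOR_SIZE) == 0)
        return false;            /* identical: skip the slow write */

    flash_program(sector, data);
    return true;
}
```

Besides the speed-up, skipping identical writes also saves write cycles on the heavily used FAT and directory sectors.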

====

My FatFS build was for 8.3 only, no LFN, and there were complicated reasons for that related to RTOS usage. Frankly no embedded code needs LFN anyway :)

There are ways to do a concurrent access filesystem but not with a removable USB device profile. You need to implement an ethernet storage device, IIRC, like a Synology network drive.

« Last Edit: May 22, 2024, 02:28:41 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline semir-tTopic starter

  • Regular Contributor
  • *
  • Posts: 64
  • Country: ba
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #5 on: May 22, 2024, 02:46:47 pm »
Thanks, will go through other thread and get back to you.

But just to give more info on this: the device records data from various sensors, so when we start a new recording session we open a new file and write the data in 4K blocks. During recording we don't connect USB (we physically can't, because while recording another connector is attached which blocks the USB port). So the device opens the file at the start of the session and then we use f_write and f_sync. After some random time f_write returns FR_INVALID_OBJECT, and any other function that we call afterwards returns the same error.

The code I inherited, developed by a previous engineer, uses a long IRQ handler to acquire data from one of the sensors. There are some other IRQ handlers, but most of them are short. When I disable this long IRQ handler I don't seem to have the issue described.

Another interesting thing: when I have everything enabled and stop the recording manually before the error occurs, I use USB to download the recording to my PC. But when I mount the device on my Linux PC, I see my file and a lot of other rubbish files. These files are not present when I disable the long IRQ, which is why I assumed that this long IRQ somehow corrupts the writes to the FAT volume (the FAT table).

I know that I should fix the long ISR anyways but somehow that is not priority for us as in my opinion it would require a lot of restructuring of the code as we need accurate timestamp for all measured data.
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3773
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #6 on: May 22, 2024, 03:01:49 pm »
I reckon there is an issue with interfacing to your FLASH device, with the long ISR. Is it SPI? Maybe SPI RX overflow?

You are corrupting the FAT filesystem somehow.

If you can't fix it, try caching the data into RAM and flush that to the FAT FS at some other time.
 

Offline xvr

  • Frequent Contributor
  • **
  • Posts: 371
  • Country: ie
    • LinkedIn
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #7 on: May 22, 2024, 03:16:40 pm »
Looks like a synchronization problem or a weird memory access. FatFS does not have timing dependencies (AFAIK). It should work with any IRQ length.
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3773
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #8 on: May 22, 2024, 03:31:49 pm »
Yes, FatFS has no timers or timeouts AFAIK. It provides mapping between a file API (open, read, etc) and a block storage device. But be careful with some build options. More notes from my project:

====

This area of the FLASH is visible via USB to Windows (etc) as a 2MB removable block storage device, presenting 512 byte sectors.

To enable internal code to access this filesystem for file I/O, FatFS is used: http://elm-chan.org/fsw/ff/00index_e.html. A number of functions are provided for basic file I/O and easy configuration data storage. When writing new file functions, use care to avoid corruption due to concurrent file system access over USB. For an example, see get_space().

The _FS_REENTRANT parameter in FatFS is set to 0 to disable re-entrancy. This was done because if this is enabled, all kinds of extra code gets brought in which basically needs the RTOS to be running. It does run fine with _FS_REENTRANT=1 (lots of testing was done under RTOS) but this prevents the file functions being used early on in main() for checking for config files.

The above would result in the file functions being “not thread-safe” which would mean that they would all need to be all in one RTOS thread. This would greatly complicate any function where file ops are driven from external events, such as during web based config file editing. So each of the file ops is protected with a mutex.

Write endurance

The file system is subject to the 100k write limit of the AT45DB321E serial FLASH.

There is also the 50k sector cumulative write limit.
« Last Edit: May 22, 2024, 03:45:06 pm by peter-h »
 

Offline CountChocula

  • Supporter
  • ****
  • Posts: 208
  • Country: ca
  • I break things—sometimes on purpose.
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #9 on: May 22, 2024, 03:52:56 pm »
I agree that this sounds like a synchronization problem rather than something that's specifically connected to a long-running ISR. Does the code in that ISR access the same resources as FatFS? For example, could it be trying to write to a file while another file operation is in progress? If so, it's possible that FatFS is not built to be reentrant under certain circumstances. If the ISR does perform disk operations, you could try commenting them out to see if they are the cause.

IMO, you should also try to make this heisenbug more reproducible… otherwise, you're going to go crazy trying to solve something that may or may not happen, and you're never going to be 100% sure you've actually fixed it :)
 

Online voltsandjolts

  • Supporter
  • ****
  • Posts: 2338
  • Country: gb
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #10 on: May 22, 2024, 04:55:40 pm »
When recording sensor data to FLASH (via FatFS), how do you stop recording?
By button press, allowing clean flush to FLASH?
Otherwise, if it's by disconnecting the power source, then this could cause filesystem corruption, if power removed while writing to FLASH.
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3773
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #11 on: May 22, 2024, 05:52:49 pm »
Yes; you need to know when to run f_close. Timeout?

If writing flash from an isr, this is very tricky, because the flash write speed is slow. This may be why your isr takes a long time to run. It's a bad architecture. The right way is to log to ram in the isr, and a separate foreground task (or rtos task) flushes this to flash.
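
The "log to RAM in the ISR, flush from the foreground" arrangement could look roughly like the ping-pong buffer below. It is a sketch under assumptions: the 4096-byte block size matches the 4K writes mentioned earlier in the thread, all names (`acq_isr_push`, `acq_take_full_block`, `acq_release_block`, `acq_dropped`) are hypothetical, and it assumes a single ISR producer and a single foreground consumer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u

static uint8_t           blocks[2][BLOCK_SIZE];
static volatile size_t   fill_index;     /* next free byte in the block being filled */
static volatile uint8_t  fill_block;     /* 0 or 1: block the ISR is writing to */
static volatile bool     block_ready;    /* the other block is full, waiting for flush */
static volatile uint32_t dropped_bytes;  /* samples lost because both blocks were full */

/* Hand the full block to the foreground and start filling the other one,
 * but only if the foreground has released the previous full block. */
static void try_swap(void)
{
    if (fill_index == BLOCK_SIZE && !block_ready) {
        block_ready = true;
        fill_block ^= 1u;
        fill_index = 0;
    }
}

/* Called from the ISR: stores one byte and never writes outside 'blocks'. */
void acq_isr_push(uint8_t sample)
{
    try_swap();
    if (fill_index == BLOCK_SIZE) {      /* both blocks full: count and drop */
        dropped_bytes++;
        return;
    }
    blocks[fill_block][fill_index++] = sample;
    try_swap();
}

/* Called from the main loop: returns a full block, or NULL if none is ready. */
const uint8_t *acq_take_full_block(void)
{
    return block_ready ? blocks[fill_block ^ 1u] : NULL;
}

/* Called from the main loop after the block has been written out. */
void acq_release_block(void)
{
    block_ready = false;
}

uint32_t acq_dropped(void)
{
    return dropped_bytes;
}
```

The main loop would poll `acq_take_full_block()`, pass the returned pointer to the file write, and then call `acq_release_block()`. A non-zero `acq_dropped()` shows that the flash write cannot keep up with the acquisition rate, which is easier to diagnose than silent memory corruption.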

Probably you log from an isr because you have a regular isr trigger tick producing a time stamp.

Ideally you also detect power failure and run f_close then, before vcc is gone, so at least you end up with a partial but valid file. Otherwise you will just get orphaned clusters. You can fix these with chkdsk /f over usb though - usually, or sometimes ;)
« Last Edit: May 22, 2024, 05:54:54 pm by peter-h »
 

Offline semir-tTopic starter

  • Regular Contributor
  • *
  • Posts: 64
  • Country: ba
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #12 on: May 23, 2024, 06:51:55 am »
Quote
I agree that this sounds like a synchronization problem rather than something that's specifically connected to a long-running ISR. Does the code in that ISR access the same resources as FatFS? For example, could it be trying to write to a file while another file operation is in progress? If so, it's possible that FatFS is not built to be reentrant under certain circumstances. If the ISR does perform disk operations, you could try to commenting them out to see if they are the cause.

IMO, you should also try to make this heisenbug more reproducible… otherwise, you're going to go crazy trying to solve something that may or may not happen, and you're never going to be 100% sure you've actually fixed it :)

I am going crazy ahaha. Still trying to find a pattern. No, the ISR stores the data to a buffer in RAM, and when we have a full 4K we then have a function which writes it to the flash memory.


Quote
When recording sensor data to FLASH (via FatFS), how do you stop recording?
By button press, allowing clean flush to FLASH?
Otherwise, if it's by disconnecting the power source, then this could cause filesystem corruption, if power removed while writing to FLASH.

We stop by pressing the button on the device. When the button is pressed we store what we have in the buffer at that point in time (we append zeros if the 4K is not full) and we close the file.
 

Offline xvr

  • Frequent Contributor
  • **
  • Posts: 371
  • Country: ie
    • LinkedIn
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #13 on: May 23, 2024, 07:03:19 am »
Check for 4K buffer overflow in ISR.
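
One way to check for such an overflow, sketched here with hypothetical names (`guarded_buf`, `gb_push`, `gb_guards_intact`) and an arbitrary guard value, is to bounds-check every write in the ISR and to place guard words on both sides of the buffer that the main loop can verify periodically.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ACQ_BUF_SIZE 4096u
#define GUARD_WORD   0xDEADBEEFu   /* arbitrary pattern, an assumption of this sketch */

struct guarded_buf {
    uint32_t guard_lo;             /* must stay GUARD_WORD */
    uint8_t  data[ACQ_BUF_SIZE];
    uint32_t guard_hi;             /* must stay GUARD_WORD */
    size_t   used;                 /* number of valid bytes in data[] */
};

void gb_init(struct guarded_buf *b)
{
    b->guard_lo = GUARD_WORD;
    b->guard_hi = GUARD_WORD;
    b->used = 0;
}

/* Bounds-checked store for use in the ISR: refuses to write past the end. */
bool gb_push(struct guarded_buf *b, uint8_t value)
{
    if (b->used >= ACQ_BUF_SIZE)
        return false;
    b->data[b->used++] = value;
    return true;
}

/* Call from the main loop: false means something overwrote a guard word. */
bool gb_guards_intact(const struct guarded_buf *b)
{
    return b->guard_lo == GUARD_WORD && b->guard_hi == GUARD_WORD;
}
```

If `gb_push()` ever returns false, the ISR is producing data faster than it is consumed. If `gb_guards_intact()` returns false, some other code wrote outside its own memory and hit this buffer's neighbourhood.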
 

Offline semir-tTopic starter

  • Regular Contributor
  • *
  • Posts: 64
  • Country: ba
Re: STM32 interrupt affecting functionality of FatFS (Elm Chan)
« Reply #14 on: May 24, 2024, 06:59:20 am »
Thank you all for helping me.

I think I have finally figured out what the issue is. I think that I don't have enough RAM, and that the ISR overwrites the FatFS buffers and corrupts the data. This makes sense, as the issue was first noticed after I increased one of the buffers used to acquire data.

My assumption that the ISR length was the issue came from the observation that FatFS is not affected when I disable the ISR and is affected once it is enabled. But apparently it is not the length of the ISR (even though it is a bad thing to have it this long); it was all the acquisition and buffering that we did inside the ISR.

I have now reduced a few buffers and I don't see the issue for now.
 
The following users thanked this post: voltsandjolts

