Author Topic: 32F417 USB FS (not HS) and DMA, and how to write via USB to a slow FLASH drive (Read 4863 times)

ataradov · « **Reply #25 on:** January 25, 2022, 12:48:43 am »

I mean in the existing code. After STORAGE_Write_FS() returns, the stack would send CSW. You need to find that place and modify it, or see if there is something your code can already do to prevent it from sending the status.

peter-h · « **Reply #26 on:** January 25, 2022, 07:11:33 am »

OK; I get you now.

What we tried was getting Storage_Write_FS() to return a BUSY status but that is evidently not supported. The answer is to return nothing, but since that function has to return "something", one has to return "OK" and block the CSW transmission higher up. That could be easily done with a global flag.

The more tricky thing is how to return the CSW once the FLASH is finished programming. Is it just a packet stuffed into the USB TX FIFO? It can't be just a constant packet because there are potentially loads of devices on the USB bus.

ataradov · « **Reply #27 on:** January 25, 2022, 07:15:05 am »

Just change the code that calls this function and make it accept any code you like. There is no need for any global flags.

And you send CSW by doing the same thing normal stack already does. There is likely a function that formats the payload and sends the response.

peter-h · « **Reply #28 on:** January 25, 2022, 07:32:05 am »

That code was already changed to support the BUSY return (1). Previously it supported only 0 or negative values. It didn't work.

Is the CSW returned asynchronously, or in response to a poll from the host? If the latter, then returning it when the flash programming is done, is easy.

I found some info here
https://wiki.osdev.org/USB_Mass_Storage_Class_Devices

ataradov · « **Reply #29 on:** January 25, 2022, 07:43:02 am »

Quote from: peter-h on January 25, 2022, 07:32:05 am

Is the CSW returned asynchronously, or in response to a poll from the host? If the latter, then returning it when the flash programming is done, is easy.

Everything happens in response to a poll from the host. In USB, devices can't send anything on their own. The host will keep polling for it for a long time until you send it, so functionally it is asynchronous.

But it is not something you can just send. It is not fixed data; it contains information that reflects the state of the MSD stack (the residue part). It has to be correctly filled out. But again, there is a place that already does this. You just need to find it.
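For reference, the CSW defined by the Bulk-Only Transport spec is 13 bytes, little-endian: a fixed signature (0x53425355), the tag copied from the CBW being answered, the data residue, and one status byte. A minimal sketch of filling one in by hand, independent of any stack; `csw_build()`, `put_le32()` and the constant names are made up for illustration and are not the ST identifiers:

```c
#include <assert.h>
#include <stdint.h>

#define CSW_SIGNATURE 0x53425355u  /* reads "USBS" on the wire */
#define CSW_LENGTH    13u

enum { CSW_PASSED = 0, CSW_FAILED = 1, CSW_PHASE_ERROR = 2 };

/* Store a 32-bit value least-significant byte first. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v);
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* Build the 13 bytes of a CSW.
 *   tag       - dCBWTag of the command being answered
 *   requested - dCBWDataTransferLength of that command
 *   processed - bytes the device actually consumed or sent */
static void csw_build(uint8_t out[CSW_LENGTH], uint32_t tag,
                      uint32_t requested, uint32_t processed, uint8_t status)
{
    assert(processed <= requested);
    put_le32(&out[0], CSW_SIGNATURE);
    put_le32(&out[4], tag);
    put_le32(&out[8], requested - processed);  /* the residue */
    out[12] = status;
}
```

The tag and the residue differ per command, which is why a constant packet cannot work and why reusing the stack's own CSW code is the safer route.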

Quote from: peter-h on January 25, 2022, 07:32:05 am

I found some info here
https://wiki.osdev.org/USB_Mass_Storage_Class_Devices

Why not get the actual specification? It is only 22 pages long including TOC and all the legal stuff.

peter-h · « **Reply #30 on:** January 25, 2022, 08:04:11 am »

I found this

Code:

/**
* @brief  MSC_BOT_SendCSW
*         Send the Command Status Wrapper
* @param  pdev: device instance
* @param  status : CSW status
* @retval None
*/
void  MSC_BOT_SendCSW(USBD_HandleTypeDef *pdev, uint8_t CSW_Status)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;

  hmsc->csw.dSignature = USBD_BOT_CSW_SIGNATURE;
  hmsc->csw.bStatus = CSW_Status;
  hmsc->bot_state = USBD_BOT_IDLE;

  (void)USBD_LL_Transmit(pdev, MSC_EPIN_ADDR, (uint8_t *)&hmsc->csw,
                         USBD_BOT_CSW_LENGTH);

  /* Prepare EP to Receive next Cmd */
  (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, (uint8_t *)&hmsc->cbw,
                               USBD_BOT_CBW_LENGTH);
}

It gets called from various places, one of which is the SCSI layer:

Code:

/**
* @brief  SCSI_ProcessWrite
*         Handle Write Process
* @param  lun: Logical unit number
* @retval status
*/
static int8_t SCSI_ProcessWrite(USBD_HandleTypeDef *pdev, uint8_t lun)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;
  uint32_t len = hmsc->scsi_blk_len * hmsc->scsi_blk_size;

  len = MIN(len, MSC_MEDIA_PACKET);

  if (((USBD_StorageTypeDef *)pdev->pClassSpecificInterfaceMSC)->Write(lun, hmsc->bot_data,
                                                      hmsc->scsi_blk_addr,
                                                      (len / hmsc->scsi_blk_size)) < 0)
  {
    SCSI_SenseCode(pdev, lun, HARDWARE_ERROR, WRITE_FAULT);
    return -1;
  }

  hmsc->scsi_blk_addr += (len / hmsc->scsi_blk_size);
  hmsc->scsi_blk_len -= (len / hmsc->scsi_blk_size);

  /* case 12 : Ho = Do */
  hmsc->csw.dDataResidue -= len;

  if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }
  else
  {
    len = MIN((hmsc->scsi_blk_len * hmsc->scsi_blk_size), MSC_MEDIA_PACKET);

    /* Prepare EP to Receive next packet */
    (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, hmsc->bot_data, len);
  }

  return 0;
}

but obviously one can't just call this function asynchronously.

Yes I could read the spec but I am not so clever. I am an "old guy" who can manage basic C and I am trying to finish this product

I've climbed up a huge learning curve over the past year, from a few decades of embedded background with much simpler hardware, and mostly in assembler.

Does anybody fancy doing this, for say USD 100? It's probably dead simple if you know how.

It only needs to be done for the flash write. The read is only 200-300us long so it can remain in the Storage_Write_FS() function (which is an ISR). I can then do an RTOS task to do the actual write, based on a global flag being set.

ataradov · « **Reply #31 on:** January 25, 2022, 08:12:41 am »

It is impossible to tell how it can be called without studying how the stack works. That's the downside of frameworks - if you need something that is not supported out of the box, you have a lot of figuring out to do.

But the first thing I would do is trace which one of them is called and how exactly it is called after your flash write returns.

peter-h · « **Reply #32 on:** January 25, 2022, 08:24:47 am »

I did a breakpoint after the flash write and then went to the drive, and to my alarm the flash write got called just by double-clicking on a file

I will need to check if that happens just by Windows being connected...

Anyway, this is the stack at that point

Code:

Thread #1 [main] 1 [core: 0] (Suspended : Breakpoint)	
	STORAGE_Write_FS() at usbd_storage_if.c:290 0x803ee70	
	SCSI_ProcessWrite() at usbd_msc_scsi.c:1,008 0x800e522	
	SCSI_Write10() at usbd_msc_scsi.c:818 0x800e692	
	SCSI_ProcessCmd() at usbd_msc_scsi.c:177 0x800e8c8	
	MSC_BOT_DataOut() at usbd_msc_bot.c:203 0x800de38	
	USBD_MSC_DataOut() at usbd_msc.c:497 0x800dbfa	
	USBD_MSC_CDC_DataOut() at usbd_msc_cdc.c:392 0x803ed90	
	USBD_LL_DataOutStage() at usbd_core.c:342 0x800ea62	
	HAL_PCD_DataOutStageCallback() at usbd_conf.c:160 0x803ea6a	
	PCD_EP_OutXfrComplete_int() at stm32f4xx_hal_pcd.c:2,143 0x800aca2	
	<...more frames...>

I will investigate.

ataradov · « **Reply #33 on:** January 25, 2022, 08:33:46 am »

Quote from: peter-h on January 25, 2022, 08:24:47 am

I did a breakpoint after the flash write and then went to the drive, and to my alarm the flash write got called just by double-clicking on a file

File access times and other metadata are being modified most likely. Ideally that would be cached, but Windows became less aggressive with caching due to people not properly unmounting the drives.

Quote from: peter-h on January 25, 2022, 08:24:47 am

Anyway, this is the stack at that point

So, assuming hmsc->scsi_blk_len is 0 after the function returns, all you need to do is comment out the call for MSC_BOT_SendCSW(); here, and call it after the flash is written. Again, it all needs to be checked against other parts of the stack.
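A toy model of that change, just to make the ordering concrete. Nothing here is ST code: `msc_model_t`, `on_block_received()`, `on_flash_done()` and `mock_send_csw()` are made-up names standing in for the stack state, the tail of SCSI_ProcessWrite(), a flash-complete notification and MSC_BOT_SendCSW() respectively. It deliberately ignores which context the late call runs in, which would still have to be checked against the real stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the MSC state; field names are hypothetical. */
typedef struct {
    uint32_t blocks_left;   /* plays the role of scsi_blk_len */
    bool     csw_pending;   /* last block accepted, status not yet sent */
    unsigned csw_sent;      /* counts calls to the mock send function */
} msc_model_t;

/* Mock for MSC_BOT_SendCSW(): the real one queues the IN transfer
 * and re-arms the OUT endpoint for the next CBW. */
static void mock_send_csw(msc_model_t *m)
{
    m->csw_sent++;
    m->csw_pending = false;
}

/* USB side: called once per received block. Instead of sending the
 * CSW when blocks_left reaches 0, only record that one is owed. */
static void on_block_received(msc_model_t *m)
{
    assert(m->blocks_left > 0);
    m->blocks_left--;
    if (m->blocks_left == 0) {
        m->csw_pending = true;
    }
}

/* Flash side: called when programming has finished. Sends the
 * owed CSW, if any; otherwise does nothing. */
static void on_flash_done(msc_model_t *m)
{
    if (m->csw_pending) {
        mock_send_csw(m);
    }
}
```

With two blocks outstanding, a flash-done event after the first block sends nothing; the CSW goes out only after the second block has been received and its flash write has completed.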

peter-h · « **Reply #34 on:** January 25, 2022, 08:47:49 am »

Quote

File access times and other metadata are being modified most likely.

Yes indeed. It depends on the app. Opening a .txt (Notepad) does nothing. Opening a .jpg does try to do a write. So this is OK.

Quote

So, assuming hmsc->scsi_blk_len is 0 after the function returns, all you need to do is comment out the call for MSC_BOT_SendCSW(); here

Right after the flash write returns, there is indeed a test for scsi_blk_len == 0 and CSW gets sent. So that could easily be blocked by testing a global "flash write pending" flag. (Rather than commenting it out, since that code is likely used elsewhere e.g. flash read).

Quote

call it after the flash is written. Again, it all needs to be checked against other parts of the stack.

That's the tricky bit. Perhaps the best place would be to test a global "send CSW" flag in the code which processes a regular poll from the host.

I also now realise why the existing setup works. I am doing the ~16ms flash write inside STORAGE_Write_FS, so the CSW gets sent only after the flash write is finished, which is exactly what you are suggesting. The drawback is that all interrupts (of a lower priority) are disabled for 17ms, which screws up a few things. I think all timer interrupts get disabled.

Fortunately, writing the flash from USB will be unusual. There is no requirement to guarantee system performance while e.g. a new config file is being written via USB. But at least I know why the existing scheme doesn't blow up completely

But it would be nice to do it properly. I am happy to pay someone to have a go. It is the standard ST USB library

and AFAIK the only mods done were

- merging in of CDC (virtual com port)
- modding the SCSI layer to accept a USBD_BUSY return from STORAGE_Write_FS - didn't do anything useful

abyrvalg · « **Reply #35 on:** January 25, 2022, 11:28:08 am »

IMO buffering the data and deferring the CSW is a bad idea: the data stage could be much bigger than one sector, so you’ll need enough RAM to buffer it. It should be totally OK to NAK portions of data while flash is busy - several cheaper USB sticks I’ve studied don’t buffer anything, they just set up direct DMA xfers from USB peripheral to NAND sequencer and wait for completion, allowing NAND to throttle USB.
Or you can try the following hack to keep the existing code structure:
- check if the flash is busy before STORAGE_Write_FS() call
- ready: proceed normally
- busy: STALL the data stage, send CSW with csw.dCSWDataResidue = cbw.dCBWDataTransferLength
In theory this should tell "the device was unable to process entire transfer this time, please retry the remaining data in another command". Not sure if OSes will be happy with this, but it should be easy to test.
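The decision in that hack can be written down as a pure function, which keeps it easy to test on a PC before touching the stack. This is only a sketch of the rule as stated; `write_decision_t` and `decide_write()` are invented names, and which CSW status to send alongside the residue is left open because the post does not say.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     stall_data_stage;  /* true: refuse the data for this command */
    uint32_t csw_residue;       /* value to report in dCSWDataResidue */
} write_decision_t;

/* flash_busy       - a firmware-side "flash operation in progress" flag
 * cbw_transfer_len - dCBWDataTransferLength of the write command */
static write_decision_t decide_write(bool flash_busy, uint32_t cbw_transfer_len)
{
    write_decision_t d;
    if (flash_busy) {
        /* Nothing was accepted, so the whole length is residue. */
        d.stall_data_stage = true;
        d.csw_residue = cbw_transfer_len;
    } else {
        /* Proceed normally; residue is 0 once all data is written. */
        d.stall_data_stage = false;
        d.csw_residue = 0;
    }
    return d;
}
```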

But anyway, as it was stated in "USB interrupt disabling" thread, all this is doomed due to OS cache inconsistencies: stale data reads from files updated by device after USB connection, FAT corruption discarding device side FAT updates etc. A FAT volume can't be shared as writable to both parties.

peter-h · « **Reply #36 on:** January 25, 2022, 02:12:07 pm »

Quote

the data stage could be much bigger than one sector,

I may well be misunderstanding but Windows always writes in multiples of 1 sector to the flash. It can't do anything else because that is the standard removable block device API. The write-flash func is just 1 sector.

At the higher USB stack level it may well be more, and the USB stack data buffers are a few k (IIRC) so yes probably a few sectors get in there and then there are multiple calls to my flash-sector function.

The plan would be to offload each flash write to an RTOS task, which checks a global flag every say 5ms and then picks up a buffer address and flashes it.
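A minimal sketch of such a handoff, assuming a single-sector slot and a 512-byte sector (both assumptions; the thread does not give the sector size). `write_slot_t`, `slot_post()`, `slot_service()` and the mock flash function are illustrative names. The slot copies the sector instead of passing an address, on the assumption that the stack may reuse its own buffer before the task gets round to it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u   /* assumed; use the real sector size */

typedef struct {
    atomic_bool pending;            /* set by producer, cleared by task */
    uint32_t    lba;
    uint8_t     data[SECTOR_SIZE];
} write_slot_t;

/* Producer (the USB write callback). Returns false if the previous
 * sector has not been flashed yet, so the caller must hold off. */
static bool slot_post(write_slot_t *s, uint32_t lba, const uint8_t *src)
{
    if (atomic_load(&s->pending)) {
        return false;
    }
    s->lba = lba;
    memcpy(s->data, src, SECTOR_SIZE);
    atomic_store(&s->pending, true);   /* publish only after the copy */
    return true;
}

/* Consumer (the RTOS task, polled every few ms). Flashes the sector
 * through the supplied function, then frees the slot. */
static bool slot_service(write_slot_t *s,
                         void (*flash_write)(uint32_t lba, const uint8_t *d))
{
    if (!atomic_load(&s->pending)) {
        return false;
    }
    flash_write(s->lba, s->data);      /* the slow part */
    atomic_store(&s->pending, false);
    return true;
}

/* Stand-in for the real flash driver; records what it was given. */
static uint32_t mock_last_lba;
static uint8_t  mock_last_byte;
static void mock_flash_write(uint32_t lba, const uint8_t *d)
{
    mock_last_lba  = lba;
    mock_last_byte = d[0];
}
```

The `false` return from `slot_post()` is exactly the "busy" case discussed above: something still has to hold the host off until the slot is free again.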

Quote

check if the flash is busy before STORAGE_Write_FS() call

This gets complicated because while I have a flash check status function, it can't be called from within an ISR because checking the flash status is done over SPI

This is why my flash r/w func is blocking on the SPI portion. That conveniently produces a fully blocking read function. But it doesn't produce a fully blocking write function, because the SPI part is ~200us (and is fully blocking) and then you have ~17ms of the internal programming cycle.

It is actually very interesting how this can work. Until yesterday, the sector write function was blocking only on the SPI part but I realised that it was a miracle it worked, with USB interrupts being able to get in there during the internal programming cycle. I think it worked because the flash is happy to continue the programming cycle while SPI is getting on with something else, and if SPI gets another write, that just gets delayed. I now have a fully blocking flash write, all 17ms of it, and it has reduced some "rare funny stuff" to zero.

No matter how I look at this, I think disabling USB interrupts is essential during all flash SPI activity, but that is quick and affects USB for only a short time (under 300us) and obviously doesn't affect anything else.

So yes I have a check status function but it could be delayed for 300us.

Quote

ready: proceed normally

That will work for the first sector, only

Quote

busy: STALL the data stage, send CSW with csw.dCSWDataResidue = cbw.dCBWDataTransferLength

I think this is problematic due to having to test it with "everything out there".

The Q is how do flash sticks do this? They all work.

Quote

they just set up direct DMA xfers from USB peripheral to NAND sequencer and wait for completion

Which, if I understand right, is exactly what I am doing, with the only issue being that the 17ms USB ISR is affecting other stuff in my system. The host sees my system as it would see a USB stick. No CSW until write (sector or block?) complete, and note that most "cheap" flash chips have a 4k blocksize so probably no CSW for 4k (but then big flash chips have much faster writes than mine).

Well, it also affects the USB VCP (CDC) data flow, which has the benefit of some host retries but which I found gets corrupted if you are doing solid writing. This could be solved by documenting it

I think I need to implement what ataradov is suggesting, but I am not smart enough to understand the ST USB stack to implement the "return CSW after flash write ends" part.

The suggestions in the other thread that I write a "driver" for the flash just postpone the problem further down. One still needs a way to hold off the host on a series of USB writes.

abyrvalg · « **Reply #37 on:** January 25, 2022, 05:29:47 pm »

By “check if the flash is busy” I mean “check if any firmware-initiated flash operation is in progress” (can be indicated e.g. by a software flag set at start/cleared at the end), not “poll the flash chip”.
And I strongly advise getting an idea about the cache inconsistency problem before spending further time on current things, as it can render all this work useless at the end. An example:
- you plug the device into the PC
- the OS reads and caches the FAT (they all do that AFAIK)
- firmware writes some file, allocating new clusters and updating the FAT
- PC has no way to know about that
- PC writes some file and updates the FAT too, but that update is based on a cached FAT version
- your FAT is in a state as if firmware-side file write never happened (but file’s directory entry indicates new size. Or not, if the directory was cached/updated too, depending on file location and luck)
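The sequence above can be replayed on a toy FAT16-style table to see the damage. This is a simulation only: an 8-entry table, a host that caches the whole table and writes it back in one go, and invented names (`fat_t`, `fat_alloc()`, `replay_stale_cache()`).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FAT_ENTRIES 8u
#define FAT_FREE    0x0000u
#define FAT_EOC     0xFFFFu   /* end-of-chain marker */

typedef struct { uint16_t entry[FAT_ENTRIES]; } fat_t;

/* Claim the first free cluster as a one-cluster file. Returns its
 * index, or -1 if the table is full. Entries 0 and 1 are reserved. */
static int fat_alloc(fat_t *fat)
{
    for (unsigned i = 2; i < FAT_ENTRIES; i++) {
        if (fat->entry[i] == FAT_FREE) {
            fat->entry[i] = FAT_EOC;
            return (int)i;
        }
    }
    return -1;
}

/* Replays the scenario and returns the FAT as it ends up on the medium. */
static fat_t replay_stale_cache(void)
{
    fat_t medium;
    memset(&medium, 0, sizeof medium);
    medium.entry[0] = FAT_EOC;
    medium.entry[1] = FAT_EOC;

    fat_t host_cache = medium;             /* PC reads and caches the FAT */

    int fw_a = fat_alloc(&medium);         /* firmware writes two files */
    int fw_b = fat_alloc(&medium);
    int pc   = fat_alloc(&host_cache);     /* PC writes one, from stale data */
    assert(fw_a == 2 && fw_b == 3 && pc == 2);

    medium = host_cache;                   /* PC flushes its copy of the FAT */
    return medium;
}
```

After the flush, cluster 2 is claimed by both a firmware file and the PC file, and cluster 3 is marked free again even though a firmware file still points at it.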

peter-h · « **Reply #38 on:** January 25, 2022, 06:55:33 pm »

Sure; I have internal unmount/mount functions which take care of that, in the inside -> PC direction. To do it reliably involves a ~ 5 second delay before a remount.

In the PC -> filesystem direction it is more complicated, because the PC OS is merely writing sectors, not files etc. The only way I know of involves snooping on the USB data and when you see the final FAT update write, you can tell the embedded code that a file has been written. Obviously this is never 100% but it never will be. In most cases of firmware etc updates, there is a reboot anyway.
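The snooping part boils down to classifying each written LBA. A sketch of the "is this a FAT write?" test, using the standard FAT layout where the FAT copies follow the reserved sectors; `fat_layout_t` and `lba_in_fat()` are made-up names, and the values would come from the volume's boot sector (BPB) in a real build.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t volume_start;     /* LBA of the boot sector (0 if unpartitioned) */
    uint16_t reserved_sectors; /* BPB_RsvdSecCnt */
    uint8_t  num_fats;         /* BPB_NumFATs */
    uint32_t sectors_per_fat;  /* BPB_FATSz16 or BPB_FATSz32 */
} fat_layout_t;

/* True if the given LBA falls inside any copy of the FAT. */
static bool lba_in_fat(const fat_layout_t *l, uint32_t lba)
{
    uint32_t first = l->volume_start + l->reserved_sectors;
    uint32_t count = (uint32_t)l->num_fats * l->sectors_per_fat;
    return lba >= first && (lba - first) < count;
}
```

Deciding which FAT write is the "final" one is the unreliable part; a timeout after the last FAT write is one possible heuristic, but that is a guess and not something the thread specifies.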

peter-h · « **Reply #39 on:** January 26, 2022, 11:22:15 am »

I've had an idea

After the flash write, it returns to the calling function which is

Code:

/**
* @brief  SCSI_ProcessWrite
*         Handle Write Process
* @param  lun: Logical unit number
* @retval status
*/
static int8_t SCSI_ProcessWrite(USBD_HandleTypeDef *pdev, uint8_t lun)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;
  uint32_t len = hmsc->scsi_blk_len * hmsc->scsi_blk_size;

  len = MIN(len, MSC_MEDIA_PACKET);

  if (((USBD_StorageTypeDef *)pdev->pClassSpecificInterfaceMSC)->Write(lun, hmsc->bot_data,
                                                      hmsc->scsi_blk_addr,
                                                      (len / hmsc->scsi_blk_size)) < 0)
  {
    SCSI_SenseCode(pdev, lun, HARDWARE_ERROR, WRITE_FAULT);
    return -1;
  }

  hmsc->scsi_blk_addr += (len / hmsc->scsi_blk_size);
  hmsc->scsi_blk_len -= (len / hmsc->scsi_blk_size);

  /* case 12 : Ho = Do */
  hmsc->csw.dDataResidue -= len;

  if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }
  else
  {
    len = MIN((hmsc->scsi_blk_len * hmsc->scsi_blk_size), MSC_MEDIA_PACKET);

    /* Prepare EP to Receive next packet */
    (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, hmsc->bot_data, len);
  }

  return 0;
}

and notably there is this

Code:

 if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }

which would be trivial to skip based on a global "flash write started" flag.

Now, how to return the CSW when the flash write finishes?

The send CSW function is

Code:

/**
* @brief  MSC_BOT_SendCSW
*         Send the Command Status Wrapper
* @param  pdev: device instance
* @param  status : CSW status
* @retval None
*/
void  MSC_BOT_SendCSW(USBD_HandleTypeDef *pdev, uint8_t CSW_Status)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;

  hmsc->csw.dSignature = USBD_BOT_CSW_SIGNATURE;
  hmsc->csw.bStatus = CSW_Status;
  hmsc->bot_state = USBD_BOT_IDLE;

  (void)USBD_LL_Transmit(pdev, MSC_EPIN_ADDR, (uint8_t *)&hmsc->csw,
                         USBD_BOT_CSW_LENGTH);

  /* Prepare EP to Receive next Cmd */
  (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, (uint8_t *)&hmsc->cbw,
                               USBD_BOT_CBW_LENGTH);
}

I could perhaps cache the pdev structure (size=sizeof()) from the time it was skipped

and call MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED); when the flash write has finished.

It probably won't work because who knows what else has been messing with pdev in the meantime... but this board has only one USB port on it.
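One point on caching pdev: it is already a pointer, so what would be saved is the pointer itself, not a sizeof() copy of the structure. The sketch below shows the difference on stand-in types (all names invented, loosely mirroring a handle that points at separate class data, as pdev->pClassDataMSC does in the code above). It assumes the handle is a long-lived object such as a global, which has to be confirmed in the actual project.

```c
#include <stdint.h>

/* Stand-ins for the class data and the device handle. */
typedef struct { uint32_t residue; } fake_class_data_t;
typedef struct {
    fake_class_data_t *class_data;  /* like pClassDataMSC */
    uint8_t            dev_state;   /* some field the stack updates */
} fake_handle_t;

static fake_handle_t *deferred_pdev;  /* saved pointer: follows the live handle */
static fake_handle_t  deferred_copy;  /* saved snapshot: frozen at save time */

/* Called at the point where the CSW is skipped. */
static void remember_handle(fake_handle_t *pdev)
{
    deferred_pdev = pdev;    /* a few bytes, always current */
    deferred_copy = *pdev;   /* shallow copy: top-level fields go stale */
}
```

A later call through the saved pointer sees whatever the stack has done to the handle in the meantime, which is what the deferred MSC_BOT_SendCSW() call would want; the snapshot does not. Whether the stack's state is still suitable for sending the CSW at that moment is a separate question that this does not answer.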

peter-h · « **Reply #40 on:** January 27, 2022, 10:51:20 am »

I posted it here

https://community.st.com/s/question/0D53W00001KTXLFSA5/how-to-integrate-slowwrite-spi-flash-into-stm32-usb-device-library-to-return-ok-csw-when-flash-write-finishes?

and got some more responses but nothing concrete.

Reading between the lines, it does sound like simply delaying the USBD_OK code for the whole duration of the flash write will work. After all, it is working now. I just need to find out how to implement it. As I say, does anyone fancy doing it, for money?

