Topic: 32F417 USB FS (not HS) and DMA, and how to write via USB to a slow FLASH drive


Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Is it possible to replace this loop with DMA?

Not even sure the 32F417 DMA can connect to that USB FIFO. No mention in the RM. There is a dedicated DMA for the USB (like there is for ETH, where you really need it) but it is not available in FS mode (only in HS mode).

But if the USB FIFO has an address then surely one could do a memory-memory transfer? There is no "busy" status.

« Last Edit: January 27, 2022, 10:52:19 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11260
  • Country: us
    • Personal site
Re: 32F417 USB FS (not HS) and DMA
« Reply #1 on: January 22, 2022, 05:05:08 pm »
Regular DMA should be able to write this FIFO. But if you are not using isochronous transfers, the maximum packet size is 64 bytes (16 words). It is not worth setting up DMA for such a short transfer.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #2 on: January 22, 2022, 05:48:35 pm »
Is FS mode never more than 16 words? That indeed would not be worth doing anything with. Although one could have a DMA channel set up and just load the transfer count and enable it to do the transfer.
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #3 on: January 22, 2022, 06:08:20 pm »
Only isochronous transfers can go up to 1023 bytes (~256 words). All other transfers are 64 bytes maximum.

But then you also need to poll for completion status. With time spent preparing and starting DMA transfer and waiting for it to complete, it will be a wash.
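
The word and packet counts quoted above follow from rounding-up division. A minimal sketch of that arithmetic; the helper names are made up for illustration and are not part of any ST library:

```c
#include <assert.h>
#include <stdint.h>

#define FS_MAX_PACKET 64u   /* full-speed non-isochronous max packet size, bytes */

/* Number of 32-bit FIFO accesses needed to move len bytes (rounds up). */
uint32_t fifo_words(uint32_t len)
{
    return (len + 3u) / 4u;
}

/* Number of USB packets needed to move len bytes (rounds up). */
uint32_t packets_needed(uint32_t len, uint32_t max_packet)
{
    assert(max_packet != 0u);
    return (len + max_packet - 1u) / max_packet;
}
```

So a 64 byte packet is 16 FIFO words, a 1023 byte isochronous packet is 256, and a 512 byte sector needs 8 packets of 64 bytes.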
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #4 on: January 22, 2022, 08:30:25 pm »
I am wondering what actually happens if a Windows machine is accessing an embedded target like mine via USB.

I am running a VCP for debugs; not relevant.
I am also running the block device, which does 512 byte sector r/w from a serial FLASH. I've been optimising this part, as per the other thread. The writing will always be slow but that's OK, and I got the reading down to 200us for a sector. But if the USB comms uses just 64 byte packets, in the case of a write:

Windows must be sending 8 of these in succession, which get placed into some buffer (the USB code has some which are a few k), all under interrupts (which I assume must be separately serviced) and then issues a "write", and that generates the flash write.

In the case of a Read, Windows must request a sector read, and then issue 8 polls to retrieve the 512 bytes.

Is the above about right?

FWIW, I am reading a 2MB file from a FatFS-based 21 Mbps SPI serial FLASH in about 3 seconds (this is doing
copy /b i:*.* nul
in a DOS box in win7-64, obviously taking care to not be reading cached data) which is pretty respectable, but I am more concerned for how long the target interrupts are blocked, which they seem to be comprehensively by USB interrupts :)
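
As a sanity check on the figures above, here is a rough calculation sketch. It assumes the 21 Mbps figure is the raw SPI clock and ignores command/address bytes and chip-select gaps; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Time to clock n_bytes over SPI at the given bit rate, in whole microseconds
   (payload bits only; protocol overhead is ignored). */
uint32_t spi_transfer_us(uint32_t n_bytes, uint32_t bits_per_second)
{
    assert(bits_per_second != 0u);
    return (uint32_t)(((uint64_t)n_bytes * 8u * 1000000u) / bits_per_second);
}

/* Average throughput in bytes per second for total_bytes moved in the given time. */
uint32_t throughput_bytes_per_s(uint32_t total_bytes, uint32_t seconds)
{
    assert(seconds != 0u);
    return total_bytes / seconds;
}
```

With these assumptions a 512 byte sector takes about 195us on the wire, which is consistent with the ~200us sector read quoted earlier, and 2MB (taken as 2 x 1024 x 1024 bytes) in 3 seconds is roughly 700 kbytes/s.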
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #5 on: January 22, 2022, 08:39:01 pm »
Quote
Windows must be sending 8 of these in succession, which get placed into some buffer

Each of the 8 messages is received separately on a hardware level. Your USB stack reads and accumulates all of them. It depends on your USB stack API, but usually you provide a buffer to store the data. This buffer must be big enough to store 512 bytes. The USB stack will reassemble the parts and provide a completion callback for the whole buffer.

Quote
In the case of a Read, Windows must request a sector read, and then issue 8 polls to retrieve the 512 bytes.

Same thing. You provide a buffer and submit the request. The contents of the buffer are sent in 8 packets, and the common callback is called for the whole buffer.
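
A minimal sketch of the reassembly idea for the receive direction. The xfer_t type and function names are invented for illustration and do not correspond to ST's API; the point is only that the application supplies one buffer and gets one callback:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transfer descriptor: the application supplies the buffer,
   the stack fills it packet by packet and calls done() once at the end. */
typedef struct {
    uint8_t  *buf;        /* application buffer */
    uint32_t  expected;   /* total bytes expected, e.g. 512 */
    uint32_t  received;   /* bytes accumulated so far */
    void    (*done)(uint8_t *buf, uint32_t len);
} xfer_t;

unsigned sector_completions;   /* demo counter, bumped by the callback below */

/* Example completion callback: only counts completed buffers. */
void on_sector_done(uint8_t *buf, uint32_t len)
{
    (void)buf;
    (void)len;
    sector_completions++;
}

/* Called by the (imaginary) stack for every packet taken from the RX FIFO. */
void xfer_on_packet(xfer_t *x, const uint8_t *pkt, uint32_t len)
{
    assert(x->received + len <= x->expected);
    memcpy(x->buf + x->received, pkt, len);
    x->received += len;
    if (x->received == x->expected && x->done != NULL) {
        x->done(x->buf, x->received);   /* one callback for the whole buffer */
    }
}
```

Feeding eight 64 byte packets into a 512 byte transfer produces exactly one callback.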


Quote
target interrupts are blocked, which they seem to be comprehensively by USB interrupts :)

It is hard to tell what a specific stack implementation does, but there is generally no need to lock the interrupts for too long. Filling the FIFO is pretty fast. The only possible issue here is that if the completion callback is called from the interrupt handler, you should not do any real work in that callback. Mark the buffer as ready to write and return. Then in the main loop check for the flag and write the data. Again, you need to know how the USB stack is implemented to write optimal code.
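
The "mark the buffer and return" pattern could look roughly like this. This is a sketch only: all names are hypothetical, a single pending sector is assumed, and on a real target the flag handoff may need barriers or a critical section:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

static uint8_t       pending_data[SECTOR_SIZE];
static uint32_t      pending_block;
static volatile bool write_pending;   /* set in ISR context, cleared by the main loop */

uint32_t demo_last_block = 0xFFFFFFFFu;   /* demo only: last block "written" */

/* Demo stand-in for a slow flash write routine. */
void demo_flash_write(const uint8_t *data, uint32_t block)
{
    (void)data;
    demo_last_block = block;
}

/* Completion callback: runs in interrupt context, so it only records the job. */
bool usb_sector_received(const uint8_t *data, uint32_t block)
{
    if (write_pending) {
        return false;   /* previous sector has not been written yet */
    }
    memcpy(pending_data, data, SECTOR_SIZE);
    pending_block = block;
    write_pending = true;
    return true;
}

/* Main loop or task: does the slow work outside the interrupt handler. */
bool service_pending_write(void (*flash_write)(const uint8_t *, uint32_t))
{
    if (!write_pending) {
        return false;
    }
    flash_write(pending_data, pending_block);
    write_pending = false;
    return true;
}
```

The callback refuses a second sector while one is still pending, which is exactly the case that needs some form of flow control towards the host.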
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #6 on: January 22, 2022, 10:38:34 pm »
"Mark the buffer as ready to write and return"

Sure; that was the advice in the other place
https://www.eevblog.com/forum/microcontrollers/cleanest-way-to-block-usb-interrupts/msg3949951/#msg3949951

but it still leaves the question of how to stop the host pushing in more writes. There has to be a way of returning a BUSY to the host, until that 512 byte sector has been written to the FLASH and the FLASH program cycle has finished.

I am using the STM32 USB device library from ST, as supplied with Cube. Someone spent some months getting that to work :) But it seems to work well.
« Last Edit: January 22, 2022, 10:44:07 pm by peter-h »
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #7 on: January 22, 2022, 11:13:55 pm »
Quote
but it still leaves the question of how to stop the host pushing in more writes. There has to be a way of returning a BUSY to the host, until that 512 byte sector has been written to the FLASH and the FLASH program cycle has finished.

It happens automatically. When the host tries to send you new data and there is no buffer to receive that data (endpoint is not armed), the USB controller returns NAK and the host will try again. The data is accepted only once your application requests a read again.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #8 on: January 23, 2022, 02:58:53 pm »
How is the endpoint armed (or not)?
 

Offline Tagli

  • Contributor
  • Posts: 31
  • Country: tr
Re: 32F417 USB FS (not HS) and DMA
« Reply #9 on: January 23, 2022, 03:51:32 pm »
AFAIK, OTG_FS can hold up to 7 packets in its endpoint TX FIFO. So, it could make sense to load 64 x 7 (448) bytes for bulk transfer using DMA. But there are some issues:

1) I'm not sure if DMA can access FIFO push register.
2) I think classical 64 x 2 double buffer is easier to manage in software.
3) OTG_FS can't make DMA requests. So how to trigger DMA? Maybe one can use interrupts to trigger it, but just performing 8 x 32-bit writes is easier. No need to use DMA for that.

For RX, it would be harder I guess. Because there is only one RX buffer, and hardware itself pushes some metadata into RX buffer, along with the actual USB data.

BTW, I have no idea how OTG_HS handles its dedicated DMA. I haven't read that part of the RM.
Gokce Taglioglu
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #10 on: January 23, 2022, 05:13:21 pm »
Quote
How is the endpoint armed (or not)?

No idea in the case of STM32. See what your firmware does when reading stuff. Based on the name of the function you posted, see what USB_ReadPacket() does. Or read the reference manual.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #11 on: January 23, 2022, 05:42:45 pm »
This is both of them

Code:
/**
  * @brief  USB_WritePacket : Writes a packet into the Tx FIFO associated
  *         with the EP/channel
  * @param  USBx  Selected device
  * @param  src   pointer to source buffer
  * @param  ch_ep_num  endpoint or host channel number
  * @param  len  Number of bytes to write
  * @param  dma USB dma enabled or disabled
  *          This parameter can be one of these values:
  *           0 : DMA feature not used
  *           1 : DMA feature used
  * @retval HAL status
  */
HAL_StatusTypeDef USB_WritePacket(USB_OTG_GlobalTypeDef *USBx, uint8_t *src, uint8_t ch_ep_num, uint16_t len, uint8_t dma)
{
  uint32_t USBx_BASE = (uint32_t)USBx;
  uint32_t *pSrc = (uint32_t *)src;
  uint32_t count32b, i;

  if (dma == 0U)
  {
    count32b = ((uint32_t)len + 3U) / 4U;
    for (i = 0U; i < count32b; i++)
    {
      USBx_DFIFO((uint32_t)ch_ep_num) = *((__packed uint32_t *)pSrc);
      pSrc++;
    }
  }

  return HAL_OK;
}

/**
  * @brief  USB_ReadPacket : read a packet from the Rx FIFO
  * @param  USBx  Selected device
  * @param  dest  pointer to destination buffer
  * @param  len  Number of bytes to read
  * @retval pointer just past the last word written to the destination buffer
  */
void *USB_ReadPacket(USB_OTG_GlobalTypeDef *USBx, uint8_t *dest, uint16_t len)
{
  uint32_t USBx_BASE = (uint32_t)USBx;
  uint32_t *pDest = (uint32_t *)dest;
  uint32_t i;
  uint32_t count32b = ((uint32_t)len + 3U) / 4U;

  for (i = 0U; i < count32b; i++)
  {
    *(__packed uint32_t *)pDest = USBx_DFIFO(0U);
    pDest++;
  }

  return ((void *)pDest);
}
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #12 on: January 23, 2022, 06:48:23 pm »
Ok, I read the manual for you.

In the case of STM32, the USB controller automatically ACKs all incoming packets as long as there is space in the FIFO. You can also enable NAK on all OUT transfers using "Global OUT NAK", but this is not useful here.

So, the endpoint is always armed as long as there is space in FIFO. As soon as space runs out, controller would be sending NAKs until you read the data.
Alex
 
The following users thanked this post: voltsandjolts

Offline Tagli

Re: 32F417 USB FS (not HS) and DMA
« Reply #13 on: January 23, 2022, 08:06:05 pm »
It's also possible to enable NAK on a specific endpoint. So, you can stop reception even when you have free space on RX FIFO.
Gokce Taglioglu
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #14 on: January 23, 2022, 08:15:03 pm »
Yes, I did not find that on a quick read, and it made no sense to me, but now I see there is a control of this via SNAK / CNAK bits in the OUT endpoint control register.
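
A sketch of how those two bits might be used, operating on a plain value rather than the real register so it can be tried on a PC. The bit positions (CNAK = bit 26, SNAK = bit 27) are as I read them in the OUT endpoint control register description and should be checked against the reference manual; the function names are made up:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions in the OUT endpoint control register (verify in the RM). */
#define DOEPCTL_CNAK ((uint32_t)1u << 26)   /* write 1 to clear the endpoint NAK */
#define DOEPCTL_SNAK ((uint32_t)1u << 27)   /* write 1 to set the endpoint NAK */

/* Value to write to make the endpoint NAK further OUT packets. */
uint32_t doepctl_with_nak(uint32_t current)
{
    return (current & ~DOEPCTL_CNAK) | DOEPCTL_SNAK;
}

/* Value to write to let the endpoint accept OUT packets again. */
uint32_t doepctl_without_nak(uint32_t current)
{
    return (current & ~DOEPCTL_SNAK) | DOEPCTL_CNAK;
}
```

On the target the result would be written to the OUT endpoint control register of the endpoint in question. Whether doing this underneath ST's stack is safe is a separate question.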
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #15 on: January 23, 2022, 09:17:15 pm »
So you think that one can simply stop usb_read_packet while the flash is programming, and that will take care of it?
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #16 on: January 23, 2022, 09:23:04 pm »
Yes, sure, but it is a very bad solution. It is generally a good idea to receive the full sector (512 bytes) into a temporary buffer as fast as possible, and then write it to the flash from the main task. The actual write status is communicated over separate MSC commands. Your application should not normally even call USB_ReadPacket(); that is done by the MSC stack, which will give you a full payload. You should not block the USB stack from working while the write is happening.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #17 on: January 23, 2022, 09:45:05 pm »
Sure; understood. I am, as you can see, without a full understanding of this incredibly complex subsystem, and I am trying to work out where to do what.

USB_ReadPacket() is called by ST code, not by my code. It's a part of the stack, AFAICT.

Eventually the stack calls this

Code:
/**
  * @brief  .
  * @param  lun: .
  * @retval USBD_OK if all operations are OK else USBD_FAIL
  * The page write is about 18ms.
  *
  */
int8_t STORAGE_Write_FS(uint8_t lun, uint8_t *buf, uint32_t blk_addr, uint16_t blk_len)
{
#ifndef USB_INT_PROTECT
  if (flash_busy) return (USBD_BUSY);
#endif
  Flash_WritePage(buf, blk_len * STORAGE_BLK_SIZ, blk_addr * STORAGE_BLK_SIZ);
  return (USBD_OK);
}

but the line return (USBD_BUSY); didn't work. It sounds like this function should just set a "got a buffer, write it to flash" flag (which gets an RTOS task to flash the buffer) and exits with an OK status immediately. But if the flash is busy programming (e.g. still programming the last sector), what is it to do? It can't exit with "USBD_OK" immediately because the host will just send another sector.

Incidentally, as reported previously, the calling code didn't support any return status other than USBD_OK :) But even after that bug was fixed, it still didn't work. So I guess ST's code was never tested with a storage device other than something fast enough to be fully written inside that ISR.
« Last Edit: January 23, 2022, 10:02:48 pm by peter-h »
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #18 on: January 23, 2022, 10:21:10 pm »
I don't know how their APIs are supposed to work. My personal preference is to spend a bit of time upfront, but end up with something I understand and know how it works exactly.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #19 on: January 23, 2022, 10:51:19 pm »
It looks like there is really no point in DMA for the 16 words, but it would be great if somebody knew how to indicate to the host that the storage device is busy.
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #20 on: January 23, 2022, 11:03:49 pm »
In USB MSC protocol host does not simply push the data. The transfer starts with a Command Block Wrapper (CBW), then data is transferred, which is followed by the Command Status Wrapper (CSW).
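
For orientation, the CBW is a fixed 31 byte structure in the Bulk-Only Transport spec. A sketch of its layout and of a basic validity check; the type and function names are my own, and the packing pragma is the common GCC/Clang/MSVC form:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CBW_SIGNATURE 0x43425355u   /* bytes "USBC" on the wire, little-endian */
#define CBW_LENGTH    31u

#pragma pack(push, 1)
typedef struct {
    uint32_t dCBWSignature;
    uint32_t dCBWTag;                 /* echoed back by the device in the CSW */
    uint32_t dCBWDataTransferLength;  /* bytes the host expects in the data stage */
    uint8_t  bmCBWFlags;              /* bit 7: 1 = data IN (device to host) */
    uint8_t  bCBWLUN;
    uint8_t  bCBWCBLength;            /* valid bytes in CBWCB, 1..16 */
    uint8_t  CBWCB[16];               /* the SCSI command block */
} msc_cbw_t;
#pragma pack(pop)

_Static_assert(sizeof(msc_cbw_t) == CBW_LENGTH, "CBW must be 31 bytes");

/* A received packet is a valid CBW only if it is exactly 31 bytes long
   and carries the right signature. */
bool cbw_is_valid(const msc_cbw_t *cbw, uint32_t received_len)
{
    return received_len == CBW_LENGTH && cbw->dCBWSignature == CBW_SIGNATURE;
}
```

The device answers each CBW, after the optional data stage, with a 13 byte CSW that carries the same tag.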

The host will not send any new commands or data until the status of the last command is received. Quote from the spec:
Quote
The host shall not transfer a CBW to the device until the host has received the CSW for any outstanding CBW. If the host issues two consecutive CBWs without an intervening CSW or reset, the device response to the second CBW is indeterminate.

So all you need to do is figure out how to delay transmission of the CSW until you are ready to get more data, and send it when you are ready.

Check that there is not some status code that STORAGE_Write_FS() can return, which would cause the CSW to not be sent. And then there must be some API that would separately send CSW.

And if not - then here is your simplest solution. Modify STORAGE_Write_FS() to not send CSW and add that API to send it when you are done.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #21 on: January 24, 2022, 07:31:28 am »
OK thank you. Will look into this.

The funny thing was that STORAGE_Write_FS returns only "OK" (which here is 0) and the other codes available are

Code:
/* Following USB Device status */
typedef enum {
  USBD_OK   = 0U,
  USBD_BUSY,
  USBD_EMEM,
  USBD_FAIL,
}USBD_StatusTypeDef;

which are 0 1 2 3

while the code calling STORAGE_Write_FS would accept only 0 and then negative numbers :) So anything other than "OK" was never tested. Unfortunately modding the calling code (which gets to STORAGE_Write_FS via this function table):

Code:
USBD_StorageTypeDef USBD_Storage_Interface_fops_FS =
{
  STORAGE_Init_FS,
  STORAGE_GetCapacity_FS,
  STORAGE_IsReady_FS,
  STORAGE_IsWriteProtected_FS,
  STORAGE_Read_FS,
  STORAGE_Write_FS,
  STORAGE_GetMaxLun_FS,
  (int8_t *)STORAGE_Inquirydata_FS
};

to accept a BUSY, and getting STORAGE_Write_FS to return a BUSY, was not successful.
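
To illustrate the mismatch between the enum and the caller: assuming the caller's check is a plain "< 0" comparison on the int8_t return value, none of the positive status codes can ever be seen as a failure. A small sketch (caller_sees_failure is a made-up name standing in for that check):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as the enum quoted above. */
typedef enum {
    USBD_OK = 0U,
    USBD_BUSY,
    USBD_EMEM,
    USBD_FAIL
} USBD_StatusTypeDef;

/* Models a caller that treats only negative return values as errors. */
bool caller_sees_failure(int8_t write_result)
{
    return write_result < 0;
}
```

With such a check USBD_BUSY (1) and even USBD_FAIL (3) are handled exactly like USBD_OK, so only a negative return value gets noticed.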

I will have another dig around.


 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #22 on: January 24, 2022, 09:40:35 pm »
Looking at this some more, it looks like there is an interrupt when the FIFO is full.

This then triggers the software copy loop, which free runs, without any status checking, moving (presumably) 64 bytes, into the 512 byte buffer.

Then the USB stack requests more data and gets another 64, so another FIFO-full interrupt, and another 64 bytes copied, further along into the 512 byte buffer.

Then, when all 512 bytes have arrived into the 512 byte buffer, the USB stack calls that final function which does the FLASH write.

All the above is an interrupt driven state machine.

IF the assumption is correct that the USB stack is using the FIFO status to request more data, an easy approach might be to disable this interrupt and instead poll that status bit (if there is one) from an RTOS task, and implement the FLASH programming from within that. That way the FIFO will be left full during programming and it all "should just work".

The problem is that interrupt runs a long chain of things...
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #23 on: January 24, 2022, 10:12:43 pm »
This is not correct. As I said, after you do your flash write, you should see MSC driver send CSW. Your goal is to prevent transmission of that CSW. If you do this, there will never be any data in the FIFO until you sent it. This is the only correct way to approach this. Anything else is a hack that may or may not work and may or may not be compatible with different OSes.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #24 on: January 24, 2022, 10:42:14 pm »
"after you do your flash write"

Do you mean "before you commence your flash write"? And then send the CSW when the flash write has finished?

That's probably quite a rewrite of the ST USB stack. I bet it's been done by many, and unpublished :) because this has to be a common requirement. You can read flash very fast, down to ~50us for a sector, but no way to write it fast.

OTOH, the USBD_OK return may be what sends back the CSW, so not doing anything in that lowest level function (where currently the flash write is done) might do it. Then I need to find a way to send the CSW when the flash write ends.
« Last Edit: January 24, 2022, 10:45:23 pm by peter-h »
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #25 on: January 25, 2022, 12:48:43 am »
I mean in the existing code. After STORAGE_Write_FS() returns, the stack would send CSW. You need to find that place and modify it, or see if there is something your code can already do to prevent it from sending the status.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #26 on: January 25, 2022, 07:11:33 am »
OK; I get you now.

What we tried was getting STORAGE_Write_FS() to return a BUSY status but that is evidently not supported. The answer is to return nothing, but since that function has to return "something"... one has to return "OK" and block the CSW transmission higher up. That could be easily done with a global flag.

The more tricky thing is how to return the CSW once the FLASH is finished programming. Is it just a packet stuffed into the USB TX FIFO? It can't be just a constant packet because there are potentially loads of devices on the USB bus.

 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #27 on: January 25, 2022, 07:15:05 am »
Just change the code that calls this function and make it accept any code you like. There is no need for any global flags.

And you send CSW by doing the same thing normal stack already does. There is likely a function that formats the payload and sends the response.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #28 on: January 25, 2022, 07:32:05 am »
That code was already changed to support the BUSY return (1). Previously it supported only 0 or negative values. Didn't work :)

Is the CSW returned asynchronously, or in response to a poll from the host? If the latter, then returning it when the flash programming is done, is easy.

I found some info here
https://wiki.osdev.org/USB_Mass_Storage_Class_Devices
« Last Edit: January 25, 2022, 07:34:00 am by peter-h »
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #29 on: January 25, 2022, 07:43:02 am »
Quote
Is the CSW returned asynchronously, or in response to a poll from the host? If the latter, then returning it when the flash programming is done, is easy.

Everything happens in response to a poll from the host. In USB, devices can't send anything on their own. The host will keep polling for it for a long time until you send it, so functionally it is asynchronous.

But it is not something you can just send. It is not fixed data; it contains information that reflects the state of the MSD stack (the residue part). It has to be correctly filled out. But again, there is a place that already does this. You just need to find it.
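
To show why the CSW is not a constant packet, here is a sketch of how its fields depend on the command being answered. The struct follows the 13 byte layout in the Bulk-Only Transport spec; make_csw() and its parameters are hypothetical and not ST's code:

```c
#include <assert.h>
#include <stdint.h>

#define CSW_SIGNATURE 0x53425355u   /* bytes "USBS" on the wire, little-endian */

enum { CSW_PASSED = 0, CSW_FAILED = 1, CSW_PHASE_ERROR = 2 };

#pragma pack(push, 1)
typedef struct {
    uint32_t dCSWSignature;
    uint32_t dCSWTag;           /* must echo dCBWTag of the command being answered */
    uint32_t dCSWDataResidue;   /* bytes requested by the host but not processed */
    uint8_t  bCSWStatus;
} msc_csw_t;
#pragma pack(pop)

_Static_assert(sizeof(msc_csw_t) == 13u, "CSW must be 13 bytes");

/* Build the CSW for a command with the given tag, requested length
   and number of bytes actually processed. */
msc_csw_t make_csw(uint32_t cbw_tag, uint32_t requested, uint32_t processed,
                   uint8_t status)
{
    msc_csw_t csw;
    assert(processed <= requested);
    csw.dCSWSignature   = CSW_SIGNATURE;
    csw.dCSWTag         = cbw_tag;
    csw.dCSWDataResidue = requested - processed;
    csw.bCSWStatus      = status;
    return csw;
}
```

The tag and the residue change with every command, which is why the stack keeps this state in its own handle and fills it in as the transfer progresses.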


Quote
I found some info here
https://wiki.osdev.org/USB_Mass_Storage_Class_Devices

Why not get the actual specification? It is only 22 pages long including TOC and all the legal stuff.
« Last Edit: January 25, 2022, 07:44:46 am by ataradov »
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #30 on: January 25, 2022, 08:04:11 am »
I found this

Code:
/**
* @brief  MSC_BOT_SendCSW
*         Send the Command Status Wrapper
* @param  pdev: device instance
* @param  status : CSW status
* @retval None
*/
void  MSC_BOT_SendCSW(USBD_HandleTypeDef *pdev, uint8_t CSW_Status)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;

  hmsc->csw.dSignature = USBD_BOT_CSW_SIGNATURE;
  hmsc->csw.bStatus = CSW_Status;
  hmsc->bot_state = USBD_BOT_IDLE;

  (void)USBD_LL_Transmit(pdev, MSC_EPIN_ADDR, (uint8_t *)&hmsc->csw,
                         USBD_BOT_CSW_LENGTH);

  /* Prepare EP to Receive next Cmd */
  (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, (uint8_t *)&hmsc->cbw,
                               USBD_BOT_CBW_LENGTH);
}

It gets called from various places, one of which is the SCSI layer:

Code:
/**
* @brief  SCSI_ProcessWrite
*         Handle Write Process
* @param  lun: Logical unit number
* @retval status
*/
static int8_t SCSI_ProcessWrite(USBD_HandleTypeDef *pdev, uint8_t lun)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;
  uint32_t len = hmsc->scsi_blk_len * hmsc->scsi_blk_size;

  len = MIN(len, MSC_MEDIA_PACKET);

  if (((USBD_StorageTypeDef *)pdev->pClassSpecificInterfaceMSC)->Write(lun, hmsc->bot_data,
                                                      hmsc->scsi_blk_addr,
                                                      (len / hmsc->scsi_blk_size)) < 0)
  {
    SCSI_SenseCode(pdev, lun, HARDWARE_ERROR, WRITE_FAULT);
    return -1;
  }

  hmsc->scsi_blk_addr += (len / hmsc->scsi_blk_size);
  hmsc->scsi_blk_len -= (len / hmsc->scsi_blk_size);

  /* case 12 : Ho = Do */
  hmsc->csw.dDataResidue -= len;

  if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }
  else
  {
    len = MIN((hmsc->scsi_blk_len * hmsc->scsi_blk_size), MSC_MEDIA_PACKET);

    /* Prepare EP to Receive next packet */
    (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, hmsc->bot_data, len);
  }

  return 0;
}

but obviously one can't just call this function asynchronously.

Yes I could read the spec but I am not so clever. I am an "old guy" who can manage basic C and I am trying to finish this product :) I've climbed up a huge learning curve over the past year, from a few decades of embedded background with much simpler hardware, and mostly in assembler.

Does anybody fancy doing this, for say USD 100? It's probably dead simple if you know how.

It only needs to be done for the flash write. The read is only 200-300us long so it can remain in the STORAGE_Read_FS() function (which is called from an ISR). I can then do an RTOS task to do the actual write, based on a global flag being set.
« Last Edit: January 25, 2022, 08:13:44 am by peter-h »
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #31 on: January 25, 2022, 08:12:41 am »
It is impossible to tell how it can be called without studying how the stack works. That's the downside of frameworks - if you need something that is not supported out of the box, you have a lot of figuring out to do.

But the first thing I would do is trace which one of them is called and how exactly it is called after your flash write returns.
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #32 on: January 25, 2022, 08:24:47 am »
I did a breakpoint after the flash write and then went to the drive, and to my alarm the flash write got called just by double-clicking on a file ;) I will need to check if that happens just by Windows being connected...

Anyway, this is the stack at that point

Code:
Thread #1 [main] 1 [core: 0] (Suspended : Breakpoint)
STORAGE_Write_FS() at usbd_storage_if.c:290 0x803ee70
SCSI_ProcessWrite() at usbd_msc_scsi.c:1,008 0x800e522
SCSI_Write10() at usbd_msc_scsi.c:818 0x800e692
SCSI_ProcessCmd() at usbd_msc_scsi.c:177 0x800e8c8
MSC_BOT_DataOut() at usbd_msc_bot.c:203 0x800de38
USBD_MSC_DataOut() at usbd_msc.c:497 0x800dbfa
USBD_MSC_CDC_DataOut() at usbd_msc_cdc.c:392 0x803ed90
USBD_LL_DataOutStage() at usbd_core.c:342 0x800ea62
HAL_PCD_DataOutStageCallback() at usbd_conf.c:160 0x803ea6a
PCD_EP_OutXfrComplete_int() at stm32f4xx_hal_pcd.c:2,143 0x800aca2
<...more frames...>

I will investigate.
 

Online ataradov

Re: 32F417 USB FS (not HS) and DMA
« Reply #33 on: January 25, 2022, 08:33:46 am »
Quote
I did a breakpoint after the flash write and then went to the drive, and to my alarm the flash write got called just by double-clicking on a file ;)

File access times and other metadata are most likely being modified. Ideally that would be cached, but Windows became less aggressive with caching due to people not properly unmounting the drives.

Quote
Anyway, this is the stack at that point

So, assuming hmsc->scsi_blk_len is 0 after the function returns, all you need to do is comment out the call to MSC_BOT_SendCSW() here, and call it after the flash is written. Again, it all needs to be checked against other parts of the stack.
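
A toy model of that change, just to show the intended sequencing. None of this is ST's code: bot_t, the BOT_WAIT_FLASH state and send_csw() are stand-ins, and on the target the real MSC_BOT_SendCSW() would have to be called in a way that cannot race with the USB interrupt:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical states; BOT_WAIT_FLASH does not exist in ST's stack. */
typedef enum { BOT_IDLE, BOT_DATA_OUT, BOT_WAIT_FLASH } bot_state_t;

typedef struct {
    bot_state_t state;
    uint32_t    blocks_left;   /* plays the role of hmsc->scsi_blk_len */
    unsigned    csw_sent;      /* counts CSWs "transmitted" in this model */
} bot_t;

/* Stand-in for MSC_BOT_SendCSW(): here it only counts and goes idle. */
static void send_csw(bot_t *b)
{
    b->csw_sent++;
    b->state = BOT_IDLE;
}

/* ISR side: write data arrived and has been queued for the flash task. */
void on_write_data_complete(bot_t *b, uint32_t blocks)
{
    assert(blocks <= b->blocks_left);
    b->blocks_left -= blocks;
    if (b->blocks_left == 0u) {
        b->state = BOT_WAIT_FLASH;   /* hold the CSW back for now */
    }
}

/* Task side: called once the flash program cycle has finished. */
void on_flash_write_done(bot_t *b)
{
    if (b->state == BOT_WAIT_FLASH) {
        send_csw(b);
    }
}
```

In this model the host sees no CSW, and therefore sends no new CBW, until the flash task reports completion.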
Alex
 

Offline peter-hTopic starter

Re: 32F417 USB FS (not HS) and DMA
« Reply #34 on: January 25, 2022, 08:47:49 am »
Quote
File access times and other metadata are being modified most likely.

Yes indeed. It depends on the app. Opening a .txt (Notepad) does nothing. Opening a .jpg does try to do a write. So this is OK.

Quote
So, assuming hmsc->scsi_blk_len is 0 after the function returns, all you need to do is comment out the call for MSC_BOT_SendCSW(); here

Right after the flash write returns, there is indeed a test for len=0 and CSW gets sent. So that could easily be blocked by testing a global "flash write pending" flag. (Rather than commenting it out, since that code is likely used elsewhere e.g. flash read).

Quote
call it after the flash is written. Again, it all needs to be checked against other parts of the stack.

That's the tricky bit. Perhaps the best place would be to test a global "send CSW" flag in the code which processes a regular poll from the host.

I also now realise why the existing setup works. I am doing the ~16ms flash write inside STORAGE_Write_FS, so the CSW gets sent only after the flash write is finished, which is exactly what you are suggesting. The drawback is that all interrupts (of a lower priority) are disabled for 17ms, which screws up a few things. I think all timer interrupts get disabled.

Fortunately, writing the flash from USB will be unusual. There is no requirement to guarantee system performance while e.g. a new config file is being written via USB. But at least I know why the existing scheme doesn't blow up completely ;)

But it would be nice to do it properly. I am happy to pay someone to have a go. It is the standard ST USB library and AFAIK the only mods done were

- merging in of CDC (virtual com port)
- modding the SCSI layer to accept a USBD_BUSY return from STORAGE_Write_FS - didn't do anything useful

« Last Edit: January 25, 2022, 10:55:03 am by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 824
  • Country: es
Re: 32F417 USB FS (not HS) and DMA
« Reply #35 on: January 25, 2022, 11:28:08 am »
IMO buffering the data and deferring the CSW is a bad idea: the data stage could be much bigger than one sector, so you’ll need enough RAM to buffer it. It should be totally ok to NAK portions of data while flash is busy - several cheaper USB sticks I’ve studied don’t buffer anything, they just set up direct DMA xfers from USB peripheral to NAND sequencer and wait for completion, allowing NAND to throttle USB.
Or you can try the following hack to keep the existing code structure:
- check if the flash is busy before STORAGE_Write_FS() call
- ready: proceed normally
- busy: STALL the data stage, send CSW with csw.dCSWDataResidue = cbw.dCBWDataTransferLength
In theory this should tell the host "the device was unable to process the entire transfer this time, please retry the remaining data in another command". Not sure if OSes will be happy with this, but it should be easy to test.
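
A rough sketch of that decision, written as a standalone model rather than as ST library code. The names `flash_busy`, `write_decision_t` and `decide_write()` are made up for illustration; in the real stack the STALL and the CSW would have to go through the library's own endpoint and BOT functions, and none of this has been tried against a host OS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the "busy -> STALL + full residue" hack.
 * None of these names come from the ST library. */

typedef enum {
    WRITE_PROCEED,      /* flash idle: take the data stage as usual */
    WRITE_STALL_RETRY   /* flash busy: STALL the data stage, report nothing written */
} write_action_t;

typedef struct {
    write_action_t action;
    uint32_t       residue;  /* value for the CSW data residue field */
} write_decision_t;

/* Set when a firmware-initiated flash operation starts, cleared when it ends. */
static volatile bool flash_busy = false;

static write_decision_t decide_write(uint32_t cbw_transfer_length)
{
    write_decision_t d;

    if (flash_busy) {
        /* Nothing accepted: the residue equals the whole requested length. */
        d.action  = WRITE_STALL_RETRY;
        d.residue = cbw_transfer_length;
    } else {
        /* Residue expected once the whole transfer has been accepted. */
        d.action  = WRITE_PROCEED;
        d.residue = 0U;
    }
    return d;
}
```

The flag is meant to be set and cleared by the firmware around its own flash operations, so the check itself never has to touch SPI.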

But anyway, as it was stated in "USB interrupt disabling" thread, all this is doomed due to OS cache inconsistencies: stale data reads from files updated by device after USB connection, FAT corruption discarding device side FAT updates etc. A FAT volume can't be shared as writable to both parties.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 USB FS (not HS) and DMA
« Reply #36 on: January 25, 2022, 02:12:07 pm »
Quote
the data stage could be much bigger than one sector,

I may well be misunderstanding but Windows always writes in multiples of 1 sector to the flash. It can't do anything else because that is the standard removable block device API. The write-flash func is just 1 sector.

At the higher USB stack level it may well be more, and the USB stack data buffers are a few k (IIRC) so yes probably a few sectors get in there and then there are multiple calls to my flash-sector function.

The plan would be to offload each flash write to an RTOS task, which checks a global flag every say 5ms and then picks up a buffer address and flashes it.
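
A minimal sketch of that hand-off, assuming a single-slot "mailbox" and a single-core MCU. Everything here (`flash_job_t`, `flash_job_post()`, `flash_task_poll()`, and the stand-in `flash_program_sector()`) is hypothetical and only models the idea; a real version might well use an RTOS queue or semaphore instead of a polled flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-slot mailbox between the USB write callback and a
 * flash-writer task. All names here are illustrative. */
typedef struct {
    volatile bool   pending;   /* set by the USB side, cleared by the task */
    const uint8_t  *buf;       /* sector data to program */
    uint32_t        blk_addr;  /* destination block number */
} flash_job_t;

static flash_job_t job;

/* Stand-in for the real (slow) sector program routine. */
static uint32_t sectors_written;
static uint32_t last_blk_addr;

static void flash_program_sector(const uint8_t *buf, uint32_t blk_addr)
{
    (void)buf;
    last_blk_addr = blk_addr;
    sectors_written++;
}

/* Called from the USB write path: queue the job and return immediately.
 * Returns false if the previous job has not been picked up yet. */
static bool flash_job_post(const uint8_t *buf, uint32_t blk_addr)
{
    if (job.pending) {
        return false;
    }
    job.buf      = buf;
    job.blk_addr = blk_addr;
    job.pending  = true;   /* publish last, after the fields are filled in */
    return true;
}

/* One iteration of the task loop; the task would call this and then
 * sleep for about 5 ms using the RTOS delay call. */
static void flash_task_poll(void)
{
    if (job.pending) {
        flash_program_sector(job.buf, job.blk_addr);
        job.pending = false;  /* only now may the USB side post again */
    }
}
```

Note that the buffer must stay untouched until the task has finished with it, so the USB side still has to hold off the next data packet (or the CSW) until `pending` clears.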

Quote
check if the flash is busy before STORAGE_Write_FS() call

This gets complicated because while I have a flash check status function, it can't be called from within an ISR because checking the flash status is done over SPI :) This is why my flash r/w func is blocking on the SPI portion. That conveniently produces a fully blocking read function. But it doesn't produce a fully blocking write function, because the SPI part is ~200us (and is fully blocking) and then you have ~17ms of the internal programming cycle.

It is actually very interesting how this can work. Until yesterday, the sector write function was blocking only on the SPI part but I realised that it was a miracle it worked, with USB interrupts being able to get in there during the internal programming cycle. I think it worked because the flash is happy to continue the programming cycle while SPI is getting on with something else, and if SPI gets another write, that just gets delayed. I now have a fully blocking flash write, all 17ms of it, and it has reduced some "rare funny stuff" to zero.

No matter how I look at this, I think disabling USB interrupts is essential during all flash SPI activity, but that is quick and affects USB for only a short time (under 300us) and obviously doesn't affect anything else.

So yes I have a check status function but it could be delayed for 300us.

Quote
ready: proceed normally

That will work for the first sector only.

Quote
busy: STALL the data stage, send CSW with csw.dCSWDataResidue = cbw.dCBWDataTransferLength

I think this is problematic due to having to test it with "everything out there".

The Q is how do flash sticks do this? They all work.

Quote
they just set up direct DMA xfers from USB peripheral to NAND sequencer and wait for completion

Which, if I understand right, is exactly what I am doing, with the only issue being that the 17ms USB ISR is affecting other stuff in my system. The host sees my system as it would see a USB stick. No CSW until write (sector or block?) complete, and note that most "cheap" flash chips have a 4k blocksize so probably no CSW for 4k (but then big flash chips have much faster writes than mine).

Well, it also affects the USB VCP (CDC) data flow, which has the benefit of some host retries but which I found gets corrupted if you are doing solid writing. This could be solved by documenting it :)

I think I need to implement what ataradov is suggesting, but I am not smart enough to understand the ST USB stack to implement the "return CSW after flash write ends" part.

The suggestions in the other thread that I write a "driver" for the flash just postpone the problem further down. One still needs a way to hold off the host on a series of USB writes.


« Last Edit: January 25, 2022, 02:28:44 pm by peter-h »
 

Offline abyrvalg

  • Frequent Contributor
  • **
  • Posts: 824
  • Country: es
Re: 32F417 USB FS (not HS) and DMA
« Reply #37 on: January 25, 2022, 05:29:47 pm »
By “check if the flash is busy” I mean “check if any firmware-initiated flash operation is in progress” (which can be indicated e.g. by a software flag set at the start and cleared at the end), not “poll the flash chip”.
And I strongly advise getting an idea about the cache inconsistency problem before spending further time on current things, as it can render all this work useless at the end. An example:
- you plug the device into the PC
- the OS reads and caches the FAT (they all do that AFAIK)
- firmware writes some file, allocating new clusters and updating the FAT
- PC has no way to know about that
- PC writes some file and updates the FAT too, but that update is based on a cached FAT version
- your FAT is now in a state as if the firmware-side file write never happened (although the file’s directory entry may indicate the new size, or not, if the directory was cached and updated too, depending on file location and luck)
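
The sequence above can be reproduced with a toy model. This is only an illustration: the 8-entry table, the helper names, and the simplification of marking every allocated cluster as end-of-chain (no chain linking) are all assumptions, not how any real FAT driver is written.

```c
#include <stdint.h>
#include <string.h>

#define FAT_ENTRIES 8U
#define FAT_FREE    0x0000U
#define FAT_EOC     0xFFFFU  /* end-of-chain marker, FAT16 style */

/* Toy model: the FAT on the device and the host's cached copy of it. */
static uint16_t disk_fat[FAT_ENTRIES];
static uint16_t host_cache[FAT_ENTRIES];

/* Mark the first free cluster in the given FAT copy as used.
 * Returns its index, or -1 if the table is full. */
static int fat_alloc(uint16_t *fat)
{
    for (unsigned i = 2U; i < FAT_ENTRIES; i++) {  /* entries 0 and 1 are reserved */
        if (fat[i] == FAT_FREE) {
            fat[i] = FAT_EOC;
            return (int)i;
        }
    }
    return -1;
}

/* Runs the sequence from the list above. Returns how many clusters the
 * firmware allocated that the on-disk FAT now shows as free again. */
static int simulate_stale_fat(void)
{
    memset(disk_fat, 0, sizeof disk_fat);

    memcpy(host_cache, disk_fat, sizeof disk_fat);  /* host reads and caches the FAT */
    int fw_a = fat_alloc(disk_fat);                 /* firmware writes a two-cluster file */
    int fw_b = fat_alloc(disk_fat);
    (void)fat_alloc(host_cache);                    /* host allocates from its stale copy */
    memcpy(disk_fat, host_cache, sizeof disk_fat);  /* host writes its FAT back */

    return (disk_fat[fw_a] == FAT_FREE) + (disk_fat[fw_b] == FAT_FREE);
}
```

In this model the host and the firmware both claim cluster 2, and the firmware's second cluster is marked free again after the host's write-back.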
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 USB FS (not HS) and DMA
« Reply #38 on: January 25, 2022, 06:55:33 pm »
Sure; I have internal unmount/mount functions which take care of that in the device -> PC direction. Doing it reliably involves a ~5 second delay before a remount.

In the PC -> filesystem direction it is more complicated, because the PC OS is merely writing sectors, not files etc. The only way I know of involves snooping on the USB data: when you see the final FAT update write, you can tell the embedded code that a file has been written. Obviously this will never be 100% reliable. In most cases of firmware etc. updates, there is a reboot anyway.
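
As a sketch of the snooping part only: a check for whether a sector write overlaps the FAT region. The `fat_layout_t` structure and `write_touches_fat()` are hypothetical names, the layout values would have to come from the boot sector of the actual volume, and the FAT copies are assumed to sit back to back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical volume layout, in sectors. The real values would come from
 * the boot sector of the volume actually in use. */
typedef struct {
    uint32_t fat_start;    /* first sector of the first FAT */
    uint32_t fat_sectors;  /* sectors per FAT */
    uint32_t fat_copies;   /* number of FATs, commonly 2 */
} fat_layout_t;

/* True if a write of `count` sectors starting at `lba` touches any FAT copy. */
static bool write_touches_fat(const fat_layout_t *v, uint32_t lba, uint32_t count)
{
    /* One past the last sector of the last FAT copy. */
    uint32_t fat_end = v->fat_start + v->fat_sectors * v->fat_copies;

    return (count != 0U) && (lba < fat_end) && (lba + count > v->fat_start);
}
```

Deciding that a given FAT write is the "final" one would still need something extra, such as a quiet period with no further writes; that part is not shown here.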
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 USB FS (not HS) and DMA
« Reply #39 on: January 26, 2022, 11:22:15 am »
I've had an idea :)

After the flash write, it returns to the calling function which is

Code: [Select]
/**
* @brief  SCSI_ProcessWrite
*         Handle Write Process
* @param  lun: Logical unit number
* @retval status
*/
static int8_t SCSI_ProcessWrite(USBD_HandleTypeDef *pdev, uint8_t lun)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;
  uint32_t len = hmsc->scsi_blk_len * hmsc->scsi_blk_size;

  len = MIN(len, MSC_MEDIA_PACKET);

  if (((USBD_StorageTypeDef *)pdev->pClassSpecificInterfaceMSC)->Write(lun, hmsc->bot_data,
                                                      hmsc->scsi_blk_addr,
                                                      (len / hmsc->scsi_blk_size)) < 0)
  {
    SCSI_SenseCode(pdev, lun, HARDWARE_ERROR, WRITE_FAULT);
    return -1;
  }

  hmsc->scsi_blk_addr += (len / hmsc->scsi_blk_size);
  hmsc->scsi_blk_len -= (len / hmsc->scsi_blk_size);

  /* case 12 : Ho = Do */
  hmsc->csw.dDataResidue -= len;

  if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }
  else
  {
    len = MIN((hmsc->scsi_blk_len * hmsc->scsi_blk_size), MSC_MEDIA_PACKET);

    /* Prepare EP to Receive next packet */
    (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, hmsc->bot_data, len);
  }

  return 0;
}

and notably there is this

Code: [Select]
if (hmsc->scsi_blk_len == 0U)
  {
    MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED);
  }

which would be trivial to skip based on a global "flash write started" flag.

Now, how to return the CSW when the flash write finishes?

The send CSW function is

Code: [Select]
/**
* @brief  MSC_BOT_SendCSW
*         Send the Command Status Wrapper
* @param  pdev: device instance
* @param  status : CSW status
* @retval None
*/
void  MSC_BOT_SendCSW(USBD_HandleTypeDef *pdev, uint8_t CSW_Status)
{
  USBD_MSC_BOT_HandleTypeDef *hmsc = (USBD_MSC_BOT_HandleTypeDef *)pdev->pClassDataMSC;

  hmsc->csw.dSignature = USBD_BOT_CSW_SIGNATURE;
  hmsc->csw.bStatus = CSW_Status;
  hmsc->bot_state = USBD_BOT_IDLE;

  (void)USBD_LL_Transmit(pdev, MSC_EPIN_ADDR, (uint8_t *)&hmsc->csw,
                         USBD_BOT_CSW_LENGTH);

  /* Prepare EP to Receive next Cmd */
  (void)USBD_LL_PrepareReceive(pdev, MSC_EPOUT_ADDR, (uint8_t *)&hmsc->cbw,
                               USBD_BOT_CBW_LENGTH);
}

I could perhaps cache the pdev structure (size=sizeof()) from the time it was skipped and call MSC_BOT_SendCSW(pdev, USBD_CSW_CMD_PASSED); when the flash write has finished.

It probably won't work because who knows what else has been messing with pdev in the meantime... but this board has only one USB port on it.
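
To make the idea concrete, here is a standalone model of it, not a patch to the ST stack. `usb_dev_t`, `send_csw()` and `CSW_PASSED` are stand-ins for `USBD_HandleTypeDef`, `MSC_BOT_SendCSW()` and `USBD_CSW_CMD_PASSED` so that it compiles on its own; `finish_write()`, `flash_write_done()` and the flags are invented names. It keeps the handle pointer rather than a copy of the structure, on the assumption that the handle is one long-lived object on a single-port board.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own. In the real stack these would
 * be USBD_HandleTypeDef, MSC_BOT_SendCSW() and USBD_CSW_CMD_PASSED. */
typedef struct { int unused; } usb_dev_t;
#define CSW_PASSED 0U

static unsigned csw_sent_count;

static void send_csw(usb_dev_t *pdev, uint8_t status)
{
    (void)pdev;
    (void)status;
    csw_sent_count++;
}

/* Hypothetical deferral state. */
static usb_dev_t     *deferred_pdev;        /* saved handle pointer, not a struct copy */
static volatile bool  csw_deferred;         /* CSW was skipped and is still owed */
static volatile bool  flash_write_pending;  /* set when the flash write is started */

/* Takes the place of the direct send-CSW call at the end of the write. */
static void finish_write(usb_dev_t *pdev)
{
    if (flash_write_pending) {
        deferred_pdev = pdev;   /* remember where to send the CSW later */
        csw_deferred  = true;
    } else {
        send_csw(pdev, CSW_PASSED);
    }
}

/* Called once the flash reports that its program cycle has finished. */
static void flash_write_done(void)
{
    flash_write_pending = false;
    if (csw_deferred) {
        csw_deferred = false;
        send_csw(deferred_pdev, CSW_PASSED);
    }
}
```

In the real stack the late call would presumably have to be made from a context that cannot race the USB interrupt, and the "prepare EP to receive next packet" branch would need the same treatment while the data buffer is still in use.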


« Last Edit: January 26, 2022, 11:38:12 am by peter-h »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 3697
  • Country: gb
  • Doing electronics since the 1960s...
Re: 32F417 USB FS (not HS) and DMA
« Reply #40 on: January 27, 2022, 10:51:20 am »
I posted it here

https://community.st.com/s/question/0D53W00001KTXLFSA5/how-to-integrate-slowwrite-spi-flash-into-stm32-usb-device-library-to-return-ok-csw-when-flash-write-finishes?

and got some more responses but nothing concrete.

Reading between the lines, it does sound like simply delaying the USBD_OK code for the whole duration of the flash write will work. After all, it is working now. I just need to find out how to implement it. As I say, does anyone fancy doing it, for money?
 

