Author Topic: Ethernet interface for MCU  (Read 4196 times)


Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28498
  • Country: nl
    • NCT Developments
Re: Ethernet interface for MCU
« Reply #25 on: January 04, 2025, 01:08:16 pm »
That is another way of doing things (which I have promoted several times before) as the interrupt controller is essentially a pre-emptive task scheduler.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 
The following users thanked this post: Siwastaja

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Ethernet interface for MCU
« Reply #26 on: January 04, 2025, 05:19:09 pm »
Quote
But I recognize it does not work for everyone, and adding another layer of blocking calls (concurrently to blocking networking calls) is not possible without adding an OS with a task scheduler.

It is debatable which calls are "blocking". It is possible to make a TLS-enabled application totally non-blocking (i.e. no blocking IO functions); it's just that the verification routines during the TLS handshake can be damn slow, so the app will be stuck for a long time in a slow crypto function (but not waiting for some condition, as blocking functions usually do).

For example, a non-tuned mbedTLS example shipped with Cube gives ~2.7 seconds handshake time on STM32F756 board. A non-tuned Zephyr example on the same board gives ~4.2 seconds (also uses mbedTLS, but Zephyr's stack instead of LWIP). Mongoose on the same board gives ~0.4 seconds handshake time (Mongoose built-in TCP/IP and TLS). See this TLS explainer video for details:

 

All these are too slow, so yeah, either interrupt-based code would be needed, or an RTOS. Both solutions use interrupts to preempt the CPU while it is executing a long and expensive crypto function.
« Last Edit: January 04, 2025, 05:21:23 pm by tellurium »
Open source embedded network library https://github.com/cesanta/mongoose
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Offline ulixTopic starter

  • Regular Contributor
  • *
  • Posts: 115
  • Country: de
Re: Ethernet interface for MCU
« Reply #27 on: January 07, 2025, 03:56:06 pm »
Thank you to everyone for their contributions to my question. I had already looked at the W5500, but I was not familiar with Wiznet. I will be getting some modules from them.  :-+
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Ethernet interface for MCU
« Reply #28 on: January 07, 2025, 07:39:12 pm »
Quote
Thank you to everyone for their contributions to my question. I had already looked at the W5500, but I was not familiar with Wiznet. I will be getting some modules from them.  :-+

I have used W5500 modules produced by Wiznet (so both W5500 chip and the module are by Wiznet), sourced from mouser and digikey:
https://eu.mouser.com/ProductDetail/WIZnet/WIZ850io?qs=W0yvOO0ixfFLSlENQWBCKg%3D%3D

They are quite pricey (even more expensive than their rpi pico devboards with integrated w5500)

And also I've used Aliexpress clones of the aforementioned modules (around 4 bucks or less), e.g.
https://www.aliexpress.com/item/1005006942624924.html

And Wiznet's devboards with integrated W5500, both rp2040 and rp2350:
https://eu.mouser.com/ProductDetail/WIZnet/W5500-EVB-Pico2?qs=iLKYxzqNS77FKgrqfXqzIQ%3D%3D
https://eu.mouser.com/ProductDetail/WIZnet/W5500-EVB-PICO?qs=t7xnP681wgXCanyLhrM3cQ%3D%3D
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #29 on: January 07, 2025, 08:37:21 pm »
But surely the OP is just doing something like reading remotely located electricity meters. He is not (I hope) building remotely controlled electricity substations in Ukraine ;)

He could just use UDP and a very simple custom protocol. Very likely he can afford to lose some data. TCP gives you error correction and implicit flow control but if there is an actual wire break etc that won't help.
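A "very simple custom protocol" over UDP can be as small as one fixed-size message. The following is a minimal sketch under stated assumptions: the 12-byte layout, the magic value and the names `meter_msg_t`, `meter_msg_encode` and `meter_msg_decode` are all invented for illustration and do not come from any existing product or library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format, all fields big-endian:
 *   bytes 0..1   magic 0x4D52 (rejects stray traffic on the port)
 *   bytes 2..3   sequence number (lets the receiver spot lost packets)
 *   bytes 4..7   meter id
 *   bytes 8..11  reading
 */
#define METER_MSG_MAGIC 0x4D52u
#define METER_MSG_SIZE  12u

typedef struct {
    uint16_t seq;
    uint32_t meter_id;
    uint32_t reading;
} meter_msg_t;

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Serialise one message. Returns the number of bytes written, 0 if buf is too small. */
size_t meter_msg_encode(const meter_msg_t *m, uint8_t *buf, size_t cap)
{
    assert(m != NULL && buf != NULL);
    if (cap < METER_MSG_SIZE) {
        return 0;
    }
    put_be16(buf + 0, (uint16_t)METER_MSG_MAGIC);
    put_be16(buf + 2, m->seq);
    put_be32(buf + 4, m->meter_id);
    put_be32(buf + 8, m->reading);
    return METER_MSG_SIZE;
}

/* Parse one datagram. Returns 0 on success, -1 on wrong length or wrong magic. */
int meter_msg_decode(const uint8_t *buf, size_t len, meter_msg_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len != METER_MSG_SIZE || get_be16(buf) != METER_MSG_MAGIC) {
        return -1;
    }
    out->seq      = get_be16(buf + 2);
    out->meter_id = get_be32(buf + 4);
    out->reading  = get_be32(buf + 8);
    return 0;
}
```

Writing the fields byte by byte avoids any dependence on struct packing or CPU endianness, so the same code can run on the MCU and on the PC that collects the data.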

TLS (X509) is a massively complicated way to get security, and in the self signed cert business (implicit in mass production) it doesn't get you much more than a simple shared key setup.

But "capture via LAN" means a lot of things. If you want to use a modern browser then your customer (being an internet security expert) may "need" TLS. But you probably still can't avoid browser warnings. For now, plain HTTP works... but there are other ways to do this. If you can assemble the retrieving end, it can be much simpler.
« Last Edit: January 07, 2025, 08:50:21 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 9440
  • Country: fi
Re: Ethernet interface for MCU
« Reply #30 on: January 08, 2025, 10:45:26 am »
Well, of course: if the data is not sensitive in itself and the device can't do much harm even if broken into, then why indeed add security if it's not needed?

And UDP is really fine; an added benefit is that it's packetized, so generating and parsing it is much simpler. You don't have to sync to a stream of bytes. If you take care of lost packets by just periodically sending new values (and losing some does not matter), then there is no point in using TCP.
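On the receiving end, "periodically send new values and tolerate loss" needs very little state. A minimal sketch, assuming each datagram carries a 16-bit sequence number (that field and the names `rx_state_t` and `rx_accept` are hypothetical, not part of UDP itself):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Receiver state: remembers the newest value and counts gaps in the sequence. */
typedef struct {
    int      have_first;    /* 0 until the first datagram has been accepted */
    uint16_t last_seq;      /* sequence number of the newest accepted datagram */
    uint32_t lost;          /* number of datagrams that never arrived */
    uint32_t latest_value;  /* payload of the newest accepted datagram */
} rx_state_t;

/* Feed one received datagram. Returns 1 if it was newer and accepted,
 * 0 if it was a duplicate or arrived out of order and was ignored. */
int rx_accept(rx_state_t *s, uint16_t seq, uint32_t value)
{
    assert(s != NULL);
    if (s->have_first) {
        /* Modulo-65536 distance, so counter wrap-around is handled. */
        uint16_t delta = (uint16_t)(seq - s->last_seq);
        if (delta == 0u || delta >= 0x8000u) {
            return 0;                       /* duplicate or older than what we have */
        }
        s->lost += (uint32_t)(delta - 1u);  /* skipped sequence numbers */
    }
    s->have_first   = 1;
    s->last_seq     = seq;
    s->latest_value = value;
    return 1;
}
```

Nothing is retransmitted here: a lost reading is simply superseded by the next one, which is the trade-off described above.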
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #31 on: January 08, 2025, 04:26:24 pm »
Just picking up this older point

Quote
it's just verification routines during the TLS handshake could be damn slow, so the app will be stuck for a long time in a slow crypto function (but not waiting for some condition, as usually blocking functions do).

and how true this is. On my product I had a whole lot of fun with MbedTLS blocking various things, even though FreeRTOS is pre-emptive. It turned out to be peripheral stuff, e.g. TLS was walking through the certificates in cacert.pem, one by one, which (even with a 21MHz SPI FLASH FAT12 FatFS file system) was taking quite some time, plus the software implementations of RSA or EC not being exactly fast (the 32F4 has hardware AES, but nothing else that's useful; it has DES/3DES). I had to mutex the FLASH API (obviously) but also separately had to mutex around TLS, so the HTTP server (which runs on top of LWIP's netconn API) was blocked during TLS, which takes roughly 3 seconds (168MHz).

How do you get say 3 secs down to 0.4 sec? I watched your whole video but - apart from marketing Mongoose ;) - it doesn't say how. Is there some super fast EC algorithm? I am not surprised the crypto suites in MbedTLS are perhaps nothing special; I think they basically trawled the internet for what they could find and stuffed it in there. There are so many, they could hardly do anything else. But their software AES256 is not bad, at 800kbytes/sec which is faster than most "IOT" boxes need (the 32F417 hardware AES is much faster).

Quote
And UDP is really fine, added benefit is that it's packetized so generating and parsing it is much simpler.

How about getting rid of LWIP too and using the 32F4's (etc) ETH controller alone? The solution will be MAC# based and thus not routable but in the right application (wired, with a hub or a switch) it should work.
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 9440
  • Country: fi
Re: Ethernet interface for MCU
« Reply #32 on: January 08, 2025, 05:24:41 pm »
Quote
How about getting rid of LWIP too and using the 32F4's (etc) ETH controller alone? The solution will be MAC# based and thus not routable but in the right application (wired, with a hub or a switch) it should work.

UDP/IP is simple enough that you can implement it from scratch! Same for ARP. I mean, I did that in hardware (VHDL on FPGA) in a project which had a totally different focus, but I needed them for data collection and it was easier to just write them than to try to cobble something together from Opencores or whatever. It was a small enough diversion that I just did it, and it didn't stall the project too much. DHCP would be a nice addition though, to avoid static IP allocation, and that isn't excessively complex either.

In any case much simpler than full TCP/IP stack.
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #33 on: January 08, 2025, 06:35:53 pm »
Yes but you are super clever.

For me, it would be a choice between

- understanding the ETH interface (basically working out ST's "low level input/output" code, interfacing to LWIP) and using that directly
- if one has the above debugged (as I have), using LWIP for UDP, or anything higher up

I can see that implementing UDP on top of the ETH interface (the Synopsys stuff) is probably not much more than adding a header onto the ETH packet. If you did it in an FPGA, what ETH controller did you use?
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 10147
  • Country: gb
Re: Ethernet interface for MCU
« Reply #34 on: January 08, 2025, 06:41:09 pm »
Quote
Yes but you are super clever.

You don't need to be super clever. If you only want to handle small UDP packets it's basically just a matter of constructing a header and trailer and sending, and analysing a header and trailer and delivering. If you want to handle big packets you have to deal with fragmentation, which pushes up the complexity a fair amount, since things can arrive out of order, even if they seldom do. TCP increases the complexity a lot. However, sticking to small UDP packets does lead to a pretty simple job.
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 9440
  • Country: fi
Re: Ethernet interface for MCU
« Reply #35 on: January 08, 2025, 06:59:07 pm »
No, I mean UDP+IP is a really simple protocol. If you are clever enough to set up STM32 ETH peripheral, you most certainly can generate the UDP and IP  headers which just contain stuff like destination IP address and port number. No state management, nothing like that. Even a struct containing this metadata and memcpying it on the buffer is enough.

Just choose not to support large packets (I think the usual limit is around 1500 bytes) and it's very simple.
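As an illustration of how little there is to it, here is a minimal sketch that writes the Ethernet, IPv4 and UDP headers in front of a payload. It is not taken from any poster's project. Assumptions: IPv4 without options, the Don't Fragment flag set (since fragmentation is not supported), the UDP checksum left at zero (which IPv4 permits and means "not computed"), and a MAC that pads short frames and appends the FCS by itself. The names `udp_frame_build` and `ip_checksum` are made up for this example. The fields are written byte by byte instead of through a struct, so struct packing and endianness do not matter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN      14u
#define IP_HDR_LEN       20u
#define UDP_HDR_LEN      8u
#define MAX_UDP_PAYLOAD  (1500u - IP_HDR_LEN - UDP_HDR_LEN)  /* 1472 bytes */

static void put16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

static void put32(uint8_t *p, uint32_t v)
{
    put16(p, (uint16_t)(v >> 16));
    put16(p + 2, (uint16_t)v);
}

/* Internet checksum: one's complement of the one's complement sum of 16-bit words. */
uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Build a complete frame (without FCS). Returns its length, or 0 if it does not fit. */
size_t udp_frame_build(uint8_t *frame, size_t cap,
                       const uint8_t dst_mac[6], const uint8_t src_mac[6],
                       uint32_t src_ip, uint32_t dst_ip,
                       uint16_t src_port, uint16_t dst_port,
                       const uint8_t *payload, size_t payload_len)
{
    assert(frame != NULL && dst_mac != NULL && src_mac != NULL);
    assert(payload != NULL || payload_len == 0);

    size_t total = ETH_HDR_LEN + IP_HDR_LEN + UDP_HDR_LEN + payload_len;
    if (payload_len > MAX_UDP_PAYLOAD || cap < total) {
        return 0;
    }
    uint8_t *eth = frame;
    uint8_t *ip  = eth + ETH_HDR_LEN;
    uint8_t *udp = ip + IP_HDR_LEN;

    memcpy(eth, dst_mac, 6);
    memcpy(eth + 6, src_mac, 6);
    put16(eth + 12, 0x0800u);            /* EtherType: IPv4 */

    ip[0] = 0x45;                        /* version 4, header length 5 words */
    ip[1] = 0x00;                        /* DSCP/ECN */
    put16(ip + 2, (uint16_t)(IP_HDR_LEN + UDP_HDR_LEN + payload_len));
    put16(ip + 4, 0x0000u);              /* identification */
    put16(ip + 6, 0x4000u);              /* flags: Don't Fragment */
    ip[8] = 64;                          /* TTL */
    ip[9] = 17;                          /* protocol: UDP */
    put16(ip + 10, 0x0000u);             /* checksum field is zero while summing */
    put32(ip + 12, src_ip);
    put32(ip + 16, dst_ip);
    put16(ip + 10, ip_checksum(ip, IP_HDR_LEN));

    put16(udp + 0, src_port);
    put16(udp + 2, dst_port);
    put16(udp + 4, (uint16_t)(UDP_HDR_LEN + payload_len));
    put16(udp + 6, 0x0000u);             /* UDP checksum not computed */
    if (payload_len > 0) {
        memcpy(udp + UDP_HDR_LEN, payload, payload_len);
    }
    return total;
}
```

The destination MAC still has to come from somewhere, either ARP or a fixed configuration, and a receiver would do the same steps in reverse: check the EtherType, protocol and port, then hand over the payload.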
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 10147
  • Country: gb
Re: Ethernet interface for MCU
« Reply #36 on: January 08, 2025, 07:12:22 pm »
Quote
No, I mean UDP+IP is a really simple protocol. If you are clever enough to set up STM32 ETH peripheral, you most certainly can generate the UDP and IP  headers which just contain stuff like destination IP address and port number. No state management, nothing like that. Even a struct containing this metadata and memcpying it on the buffer is enough.

Just choose not to support large packets (I think the usual limit is around 1500 bytes) and it's very simple.

I forgot about the separation of UDP and IP. IP is just a bit more header and trailer. It adds no real complexity. If you are really RAM constrained, which is often a problem with comms protocols on MCUs, you can limit yourself to something smaller than the normal 1500 byte MTU.
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #37 on: January 08, 2025, 07:12:43 pm »
OK; I get that. Keep it below the worst case MTU on the LAN; I'd say 1200 bytes to be sure.
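The arithmetic behind such a limit is short. A minimal sketch, assuming IPv4 without options (20-byte header) and the 8-byte UDP header; the helper names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_HEADER_LEN 20u  /* IPv4 header without options */
#define UDP_HEADER_LEN  8u

/* Largest UDP payload that fits in one unfragmented IPv4 packet of size mtu. */
size_t udp_max_payload(size_t mtu)
{
    assert(mtu <= 65535u);   /* the IPv4 total-length field is 16 bits wide */
    size_t overhead = IPV4_HEADER_LEN + UDP_HEADER_LEN;
    return (mtu > overhead) ? (mtu - overhead) : 0;
}

/* 1 if a payload of payload_len bytes can be sent without fragmentation. */
int udp_payload_fits(size_t payload_len, size_t mtu)
{
    return payload_len <= udp_max_payload(mtu);
}
```

With the usual Ethernet MTU of 1500 bytes this gives 1472 bytes of payload, so a self-imposed 1200-byte limit leaves a comfortable margin.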

The interface to the ST ETH subsystem is complex though - unless there is a very simple way of using it for just 1 packet at a time. It runs a linked list of packets.
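For the one-packet-at-a-time case, the "linked list" can in principle shrink to a single descriptor whose next pointer refers back to itself. The following is only a host-side simulation of that idea. The descriptor layout, the bit positions and every name in it (`tx_desc_t`, `DESC_OWN`, `tx_send`, `tx_sim_dma_complete`) are hypothetical and do not match the real STM32 registers or HAL structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chained descriptor, loosely modelled on an OWN-bit DMA scheme. */
typedef struct tx_desc {
    volatile uint32_t status;   /* OWN/FIRST/LAST flags */
    uint32_t          length;   /* number of valid bytes in buffer */
    uint8_t          *buffer;   /* frame data */
    struct tx_desc   *next;     /* next descriptor in the chain */
} tx_desc_t;

#define DESC_OWN   (1u << 31)   /* set: DMA owns the descriptor; clear: CPU owns it */
#define DESC_LAST  (1u << 29)
#define DESC_FIRST (1u << 28)

static uint8_t   tx_buffer[1536];
static tx_desc_t tx_desc;

/* A ring of one: the descriptor chains to itself. */
void tx_init(void)
{
    tx_desc.status = 0;
    tx_desc.length = 0;
    tx_desc.buffer = tx_buffer;
    tx_desc.next   = &tx_desc;
}

int tx_busy(void)
{
    return (tx_desc.status & DESC_OWN) != 0;
}

/* Queue one frame. Returns 0 on success, -1 on bad length,
 * -2 if the previous frame has not been sent yet. */
int tx_send(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len == 0 || len > sizeof tx_buffer) {
        return -1;
    }
    if (tx_busy()) {
        return -2;
    }
    memcpy(tx_desc.buffer, frame, len);
    tx_desc.length = (uint32_t)len;
    /* On real hardware a memory barrier belongs here, and a poll-demand
     * register write would follow the hand-over to wake the DMA. */
    tx_desc.status = DESC_FIRST | DESC_LAST | DESC_OWN;
    return 0;
}

/* Stand-in for the DMA engine finishing the transfer and returning ownership. */
void tx_sim_dma_complete(void)
{
    tx_desc.status &= ~DESC_OWN;
}
```

The price of a single descriptor is that the CPU cannot prepare the next frame while the current one is still on the wire, which is why real drivers use several.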
 

Online Siwastaja

  • Super Contributor
  • ***
  • Posts: 9440
  • Country: fi
Re: Ethernet interface for MCU
« Reply #38 on: January 08, 2025, 07:21:41 pm »
Quote
I forgot about the separation of UDP and IP. IP is just a bit more header and trailer. It adds no real complexity. If you are really RAM constrained, which is often a problem with comms protocols on MCUs, you can limit yourself to something smaller than the normal 1500 byte MTU.

You don't need to separate UDP and IP for a simple implementation. Both are just a few header fields, so no need to complicate it by any generic layering idea.

The ethernet MAC peripheral in any microcontroller (I admit I have never used STM32's peripheral) would have a RAM buffer of some sort, either DMAing from main memory or a separate piece of memory addressable by the CPU, so you would just directly generate your desired frame there, with the header and trailer and then signal the MAC to send it.

For a truly easy-to-set-up system you would also need ARP, but that's still simple. DHCP is already a bit more complex, so if the user can deal with static IP addresses, don't do it. On the other hand, you then need some way other than Ethernet to configure the static IP address (e.g., a UART or an LCD/pushbutton interface), which might be more work than using DHCP to obtain an IP address from the router.
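To give a feel for the ARP part: answering "who has my IP?" is a fixed 42-byte frame. A minimal sketch for Ethernet/IPv4 only, with a hypothetical function name (`arp_make_reply`); it does not cover sending ARP requests or caching the answers, which a device that initiates traffic would also need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARP_FRAME_LEN 42u   /* 14-byte Ethernet header + 28-byte ARP packet */

/* If req is an ARP request for our_ip, write the reply frame into reply and
 * return its length (42); otherwise return 0. req and reply must not overlap. */
size_t arp_make_reply(const uint8_t *req, size_t req_len,
                      const uint8_t our_mac[6], const uint8_t our_ip[4],
                      uint8_t *reply, size_t cap)
{
    /* htype = Ethernet (1), ptype = IPv4 (0x0800), hlen = 6, plen = 4, oper = request (1) */
    static const uint8_t request_fixed[8] = { 0x00, 0x01, 0x08, 0x00, 6, 4, 0x00, 0x01 };

    assert(req != NULL && our_mac != NULL && our_ip != NULL && reply != NULL);
    if (req_len < ARP_FRAME_LEN || cap < ARP_FRAME_LEN) {
        return 0;
    }
    if (req[12] != 0x08 || req[13] != 0x06) {
        return 0;                                /* EtherType is not ARP */
    }
    const uint8_t *arp = req + 14;
    if (memcmp(arp, request_fixed, sizeof request_fixed) != 0) {
        return 0;                                /* not an Ethernet/IPv4 request */
    }
    if (memcmp(arp + 24, our_ip, 4) != 0) {
        return 0;                                /* asking for somebody else */
    }

    /* Ethernet header: unicast back to the requester. */
    memcpy(reply, arp + 8, 6);
    memcpy(reply + 6, our_mac, 6);
    reply[12] = 0x08;
    reply[13] = 0x06;

    /* ARP packet: same fixed fields, oper = reply (2). */
    uint8_t *out = reply + 14;
    memcpy(out, request_fixed, 6);
    out[6] = 0x00;
    out[7] = 0x02;
    memcpy(out + 8,  our_mac, 6);                /* sender hardware address */
    memcpy(out + 14, our_ip, 4);                 /* sender protocol address */
    memcpy(out + 18, arp + 8, 6);                /* target hardware address */
    memcpy(out + 24, arp + 14, 4);               /* target protocol address */
    return ARP_FRAME_LEN;
}
```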
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #39 on: January 08, 2025, 07:34:08 pm »
Quote
The ethernet MAC peripheral in any microcontroller (I admit I have never used STM32's peripheral) would have a RAM buffer of some sort, either DMAing from main memory or a separate piece of memory addressable by the CPU, so you would just directly generate your desired frame there, with the header and trailer and then signal the MAC to send it.

Oh for sure :) but the following is just one little bit of it...

Code:

// ********************************************************************************
// Functions extracted from stm32f4xx_hal_eth.c and simplified
// The original HAL_BUSY lock mechanism was no good
// https://community.st.com/s/question/0D50X0000BOtUflSQF/bug-stm32-lwip-ethernet-driver-rx-deadlock


void IF_HAL_ETH_TransmitFrame(ETH_HandleTypeDef *heth, uint32_t FrameLength)
{
uint32_t bufcount = 0U, size = 0U, i = 0U;

if (FrameLength == 0U)
{
return;
}

/* Check if the descriptor is owned by the ETHERNET DMA (when set) or CPU (when reset) */
if(((heth->TxDesc)->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
{
return;
}

/* Get the number of needed Tx buffers for the current frame */
if (FrameLength > ETH_TX_BUF_SIZE)
{
bufcount = FrameLength/ETH_TX_BUF_SIZE;
if (FrameLength % ETH_TX_BUF_SIZE)
{
bufcount++;
}
}
else
{
bufcount = 1U;
}
if (bufcount == 1U)
{
/* Set LAST and FIRST segment */
heth->TxDesc->Status |=ETH_DMATXDESC_FS|ETH_DMATXDESC_LS;
/* Set frame size */
heth->TxDesc->ControlBufferSize = (FrameLength & ETH_DMATXDESC_TBS1);
/* Set Own bit of the Tx descriptor Status: gives the buffer back to ETHERNET DMA */
//__DMB();
heth->TxDesc->Status |= ETH_DMATXDESC_OWN;
/* Point to next descriptor */
heth->TxDesc= (ETH_DMADescTypeDef *)(heth->TxDesc->Buffer2NextDescAddr);
}
else
{
for (i=0U; i< bufcount; i++)
{
/* Clear FIRST and LAST segment bits */
heth->TxDesc->Status &= ~(ETH_DMATXDESC_FS | ETH_DMATXDESC_LS);

if (i == 0U)
{
/* Setting the first segment bit */
heth->TxDesc->Status |= ETH_DMATXDESC_FS;
}

/* Program size */
heth->TxDesc->ControlBufferSize = (ETH_TX_BUF_SIZE & ETH_DMATXDESC_TBS1);

if (i == (bufcount-1U))
{
/* Setting the last segment bit */
heth->TxDesc->Status |= ETH_DMATXDESC_LS;
size = FrameLength - (bufcount-1U)*ETH_TX_BUF_SIZE;
heth->TxDesc->ControlBufferSize = (size & ETH_DMATXDESC_TBS1);
}

/* Set Own bit of the Tx descriptor Status: gives the buffer back to ETHERNET DMA */
//__DMB();
heth->TxDesc->Status |= ETH_DMATXDESC_OWN;
/* point to next descriptor */
//__DMB();
heth->TxDesc = (ETH_DMADescTypeDef *)(heth->TxDesc->Buffer2NextDescAddr);
}
}

/* When Tx Buffer unavailable flag is set: clear it and resume transmission */
if (((heth->Instance)->DMASR & ETH_DMASR_TBUS) != (uint32_t)RESET)
{
/* Clear TBUS ETHERNET DMA flag */
(heth->Instance)->DMASR = ETH_DMASR_TBUS;
/* Resume DMA transmission*/
//__DMB();
(heth->Instance)->DMATPDR = 0U; // Any value issues a descriptor list poll demand.
}

// Make this function blocking, otherwise following code (very rarely) overwrites
// the last DMA buffer!
// The reason why the calling code doesn't take care of this is unexplained, but this
// hack has no effect on tx speed
// Removed 23/7/22 due to
// https://community.st.com/s/question/0D73W000001P2M3SAK/detail
//if ( (((heth->Instance)->DMASR) & (0x7 << 20)) != 0 ) {}

return;

}



HAL_StatusTypeDef IF_HAL_ETH_GetReceivedFrame(ETH_HandleTypeDef *heth)
{
uint32_t framelength = 0U;

/* Check if segment is not owned by DMA */
/* if (((heth->RxDesc->Status & ETH_DMARXDESC_OWN) == (uint32_t)RESET) && ((heth->RxDesc->Status & ETH_DMARXDESC_LS) != (uint32_t)RESET)) */
//__DMB();
if(((heth->RxDesc->Status & ETH_DMARXDESC_OWN) == (uint32_t)RESET))
{
/* Check if last segment */
if(((heth->RxDesc->Status & ETH_DMARXDESC_LS) != (uint32_t)RESET))
{
/* increment segment count */
(heth->RxFrameInfos).SegCount++;

/* Check if last segment is first segment: one segment contains the frame */
if ((heth->RxFrameInfos).SegCount == 1U)
{
(heth->RxFrameInfos).FSRxDesc =heth->RxDesc;
}

heth->RxFrameInfos.LSRxDesc = heth->RxDesc;

/* Get the Frame Length of the received packet: subtract 4 bytes of the CRC */
framelength = (((heth->RxDesc)->Status & ETH_DMARXDESC_FL) >> ETH_DMARXDESC_FRAMELENGTHSHIFT) - 4U;
heth->RxFrameInfos.length = framelength;

/* Get the address of the buffer start address */
heth->RxFrameInfos.buffer = ((heth->RxFrameInfos).FSRxDesc)->Buffer1Addr;
/* point to next descriptor */
heth->RxDesc = (ETH_DMADescTypeDef*) ((heth->RxDesc)->Buffer2NextDescAddr);

/* Return function status */
return HAL_OK;
}
/* Check if first segment */
else if((heth->RxDesc->Status & ETH_DMARXDESC_FS) != (uint32_t)RESET)
{
(heth->RxFrameInfos).FSRxDesc = heth->RxDesc;
(heth->RxFrameInfos).LSRxDesc = NULL;
(heth->RxFrameInfos).SegCount = 1U;
/* Point to next descriptor */
heth->RxDesc = (ETH_DMADescTypeDef*) (heth->RxDesc->Buffer2NextDescAddr);
}
/* Check if intermediate segment */
else
{
(heth->RxFrameInfos).SegCount++;
/* Point to next descriptor */
heth->RxDesc = (ETH_DMADescTypeDef*) (heth->RxDesc->Buffer2NextDescAddr);
}
}

/* Return function status */
return HAL_ERROR;
}


// ********************************************************************************



/**
 * @brief This function should do the actual transmission of the packet. The packet is
 * contained in the pbuf that is passed to the function. This pbuf
 * might be chained.
 *
 * @param netif the lwip network interface structure for this ethernetif
 * @param p the MAC packet to send (e.g. IP packet including MAC addresses and type)
 * @return ERR_OK if the packet could be sent
 *         an err_t value if the packet couldn't be sent
 *
 * @note Returning ERR_MEM here if a DMA queue of your MAC is full can lead to
 *       strange results. You might consider waiting for space in the DMA queue
 *       to become available since the stack doesn't retry to send a packet
 *       dropped because of memory failure (except for the TCP timers).
 */
static err_t low_level_output(struct netif *netif, struct pbuf *p)
{
err_t errval;
struct pbuf *q;
uint8_t *buffer = (uint8_t *)(EthHandle.TxDesc->Buffer1Addr);
__IO ETH_DMADescTypeDef *DmaTxDesc;
uint32_t framelength = 0;
uint32_t bufferoffset = 0;
uint32_t byteslefttocopy = 0;
uint32_t payloadoffset = 0;

DmaTxDesc = EthHandle.TxDesc;
bufferoffset = 0;

/* copy frame from pbufs to driver buffers */
for(q = p; q != NULL; q = q->next)
{
/* Is this buffer available? If not, goto error */
if((DmaTxDesc->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
{
errval = ERR_USE;
goto error;
}

/* Get bytes in current lwIP buffer */
byteslefttocopy = q->len;
payloadoffset = 0;

/* Check if the length of data to copy is bigger than Tx buffer size*/
// This code never runs. See
// https://www.eevblog.com/forum/microcontrollers/anyone-here-familiar-with-lwip/msg4693118/#msg4693118
while( (byteslefttocopy + bufferoffset) > ETH_TX_BUF_SIZE )
{

//osDelay(2); - was a buffer overwrite issue, not possible to reproduce later
// see mod at the end of IF_HAL_ETH_TransmitFrame() which is a better fix

// Copy data to Tx buffer - should use DMA but actually the perf diff is negligible
#ifdef SPEED_TEST
TopLED(true);
#endif
memcpy_fast( (uint8_t*)((uint8_t*)buffer + bufferoffset), (uint8_t*)((uint8_t*)q->payload + payloadoffset), (ETH_TX_BUF_SIZE - bufferoffset) );
#ifdef SPEED_TEST
TopLED(false);
#endif

/* Point to next descriptor */
DmaTxDesc = (ETH_DMADescTypeDef *)(DmaTxDesc->Buffer2NextDescAddr);

/* Check if the buffer is available */
if((DmaTxDesc->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
{
errval = ERR_USE;
goto error;
}

buffer = (uint8_t *)(DmaTxDesc->Buffer1Addr);

byteslefttocopy = byteslefttocopy - (ETH_TX_BUF_SIZE - bufferoffset);
payloadoffset = payloadoffset + (ETH_TX_BUF_SIZE - bufferoffset);
framelength = framelength + (ETH_TX_BUF_SIZE - bufferoffset);
bufferoffset = 0;
}

/* Copy the remaining bytes */
#ifdef SPEED_TEST
TopLED(true);
#endif
memcpy_fast( (uint8_t*)((uint8_t*)buffer + bufferoffset), (uint8_t*)((uint8_t*)q->payload + payloadoffset), byteslefttocopy );
#ifdef SPEED_TEST
TopLED(false);
#endif
bufferoffset = bufferoffset + byteslefttocopy;
framelength = framelength + byteslefttocopy;
}

/* Prepare transmit descriptors to give to DMA */
IF_HAL_ETH_TransmitFrame(&EthHandle, framelength);

errval = ERR_OK;

error:

/* When Transmit Underflow flag is set, clear it and issue a Transmit Poll Demand to resume transmission */
if ((EthHandle.Instance->DMASR & ETH_DMASR_TUS) != (uint32_t)RESET)
{
/* Clear TUS ETHERNET DMA flag */
EthHandle.Instance->DMASR = ETH_DMASR_TUS;

/* Resume DMA transmission*/
//__DMB();
EthHandle.Instance->DMATPDR = 0; // Any value issues a descriptor list poll demand.
}
return errval;
}

/**
  * @brief Should allocate a pbuf and transfer the bytes of the incoming
  * packet from the interface into the pbuf.
  *
  * @param netif the lwip network interface structure for this ethernetif
  * @return a pbuf filled with the received packet (including MAC header)
  *         NULL on memory error
  */

static struct pbuf * low_level_input(struct netif *netif)
{
struct pbuf *p = NULL, *q = NULL;
uint16_t len = 0;
uint8_t *buffer;
__IO ETH_DMADescTypeDef *dmarxdesc;
uint32_t bufferoffset = 0;
uint32_t payloadoffset = 0;
uint32_t byteslefttocopy = 0;
uint32_t i=0;

/* get received frame */

HAL_StatusTypeDef status = IF_HAL_ETH_GetReceivedFrame(&EthHandle);

if (status != HAL_OK)
{
return NULL; // Return if no RX data
}
else
{
rxactive=true; // set "seen rx data" flag
}

/* Obtain the size of the packet and put it into the "len" variable. */
len = EthHandle.RxFrameInfos.length;
buffer = (uint8_t *)EthHandle.RxFrameInfos.buffer;

// Dump unwanted multicasts, unless g_eth_multi=true.
if (should_accept_ethernet_packet(buffer, len))
{
/* We allocate a pbuf chain of pbufs from the Lwip buffer pool */
p = pbuf_alloc(PBUF_RAW, len, PBUF_POOL);
}

// Load the packet (if not rejected above) into LWIP's buffer
if (p != NULL)
{
dmarxdesc = EthHandle.RxFrameInfos.FSRxDesc;
bufferoffset = 0;

for(q = p; q != NULL; q = q->next)
{
byteslefttocopy = q->len;
payloadoffset = 0;

/* Check if the length of bytes to copy in current pbuf is bigger than Rx buffer size */
// This code never runs. See
// https://www.eevblog.com/forum/microcontrollers/anyone-here-familiar-with-lwip/msg4693118/#msg4693118
while( (byteslefttocopy + bufferoffset) > ETH_RX_BUF_SIZE )
{
/* Copy data to pbuf */
#ifdef SPEED_TEST
TopLED(true);
#endif
memcpy_fast( (uint8_t*)((uint8_t*)q->payload + payloadoffset), (uint8_t*)((uint8_t*)buffer + bufferoffset), (ETH_RX_BUF_SIZE - bufferoffset));
#ifdef SPEED_TEST
TopLED(false);
#endif

/* Point to next descriptor */
dmarxdesc = (ETH_DMADescTypeDef *)(dmarxdesc->Buffer2NextDescAddr);
buffer = (uint8_t *)(dmarxdesc->Buffer1Addr);

byteslefttocopy = byteslefttocopy - (ETH_RX_BUF_SIZE - bufferoffset);
payloadoffset = payloadoffset + (ETH_RX_BUF_SIZE - bufferoffset);
bufferoffset = 0;
}

/* Copy remaining data in pbuf */
#ifdef SPEED_TEST
TopLED(true);
#endif
memcpy_fast( (uint8_t*)((uint8_t*)q->payload + payloadoffset), (uint8_t*)((uint8_t*)buffer + bufferoffset), byteslefttocopy);
#ifdef SPEED_TEST
TopLED(false);
#endif
bufferoffset = bufferoffset + byteslefttocopy;
}
}

/* Release descriptors to DMA. This tells the ETH DMA that the packet has been read */
/* Point to first descriptor */
dmarxdesc = EthHandle.RxFrameInfos.FSRxDesc;
/* Set Own bit in Rx descriptors: gives the buffers back to DMA */
for (i=0; i< EthHandle.RxFrameInfos.SegCount; i++)
{
//__DMB();  - fossil code for the 32F417, apparently.
dmarxdesc->Status |= ETH_DMARXDESC_OWN;
dmarxdesc = (ETH_DMADescTypeDef *)(dmarxdesc->Buffer2NextDescAddr);
}

/* Clear Segment_Count */
EthHandle.RxFrameInfos.SegCount =0;

/* When Rx Buffer unavailable flag is set: clear it and resume reception */
if ((EthHandle.Instance->DMASR & ETH_DMASR_RBUS) != (uint32_t)RESET)
{
/* Clear RBUS ETHERNET DMA flag */
EthHandle.Instance->DMASR = ETH_DMASR_RBUS;
/* Resume DMA reception */
EthHandle.Instance->DMARPDR = 0;
}
return p;
}


Enjoy :)
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Ethernet interface for MCU
« Reply #40 on: January 08, 2025, 10:59:42 pm »
Quote
How do you get say 3 secs down to 0.4 sec? I watched your whole video but - apart from marketing Mongoose ;) - it doesn't say how.

It would be a very long video if it explains all ins and outs.
But here are some points that make it fast:

1. The speed of the TCP/IP stack itself. It depends on many things, but the most crucial one is whether sockets are used or not. Sockets introduce an extra buffering layer, plus they demand that the TCP/IP stack run in a separate task. Each IO call then translates into an RTOS queue call between the app task and the TCP/IP task. Here are some old benchmark numbers on an F746 Nucleo board:
     Zephyr: < 10  QPS (HTTP queries per second)
     LWIP + sockets: 16 QPS
     LWIP + no sockets (raw API): 286 QPS
     Mongoose (no sockets): 1084 QPS
     As you can see, the numbers differ by two orders of magnitude.
2. RSA vs EC. EC is less RAM hungry and more performant. There are a lot of nuances, but generally that's the rule.
3. If EC is used, then the verification step does not use HW acceleration (EC curve math does not use RSA, SHA, etc).
4. TLS 1.3 vs TLS 1.2 - version matters. The TLS 1.3 handshake is ~30% faster, because the client sends everything necessary during the handshake, the server verifies and responds - and that's it. One back and forth, not multiple back and forths like in TLS 1.2.

It is possible to tune mbedTLS to perform as fast, but it takes time and expertise. Mongoose is tuned to max performance and minimum RAM usage by default, there is no need to tune it.
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Ethernet interface for MCU
« Reply #41 on: January 08, 2025, 11:15:44 pm »
Peter,
The STM32 ethernet driver implementation is famous for its bloat and low quality. It is infested with bugs which ST has not managed to fix for years.
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2473
  • Country: br
    • CADT Homepage
Re: Ethernet interface for MCU
« Reply #42 on: January 09, 2025, 02:30:03 am »
I got quick results using an STM32H743 Disco board with their web server example. It was easy to add telnet connectivity using code found on the web. I also managed to implement fiber connectivity using a TI PHY chip.
W5500 was easy to use, too. I spent one or two weeks converting their library to interrupt-driven operation for better integration with other processes. Of course performance will be limited by the SPI interface clock.
An important part in all this was a cheap way of sniffing Ethernet packets at 100 Mbit/s. I used an old Netgear DS104 hub that would do the job as soon as some nodes were configured as half duplex.

Regards, Dieter
« Last Edit: January 09, 2025, 07:59:33 pm by dietert1 »
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 28498
  • Country: nl
    • NCT Developments
Re: Ethernet interface for MCU
« Reply #43 on: January 09, 2025, 12:15:04 pm »
Quote
How do you get say 3 secs down to 0.4 sec? I watched your whole video but - apart from marketing Mongoose ;) - it doesn't say how.

It would be a very long video if it explains all ins and outs.
But here are some points that make it fast:

1. The speed of TCP/IP stack itself. It depends on many things, but most crucial are - are sockets used or not. Sockets introduces extra buffering layer, plus it demands TCP/IP stack be in a separate task. Each IO call would then translate into RTOS queue call between app task and TCP/IP task. Here are some old benchmark numbers on F746 Nucleo board:

Not necessarily; a round-robin (super loop) setup will work just fine as long as the sockets are non-blocking. IMHO BSD-style sockets are a hard requirement, as this allows you to evaluate / test code on a PC and to use existing code which is based on sockets. If BSD sockets are slow, you are doing something wrong, because BSD sockets were designed in the days when computers were significantly slower than modern-day microcontrollers.
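The super-loop pattern with non-blocking sockets can be tried on a PC first. A minimal sketch, assuming a POSIX host; the function names are invented for this example, and on the target the stack's own BSD-style socket layer would take the place of the host calls.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode. Returns 0 on success, -1 on error. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* Open a UDP socket bound to port on all interfaces, already non-blocking.
 * Returns the descriptor, or -1 on error. */
int udp_open_nonblocking(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || make_nonblocking(fd) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One super-loop step: never waits.
 * Returns the datagram length, 0 if nothing is pending, -1 on a real error. */
int poll_datagram(int fd, void *buf, size_t cap)
{
    assert(buf != NULL);
    ssize_t n = recv(fd, buf, cap, 0);
    if (n >= 0) {
        return (int)n;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return 0;               /* nothing to read right now, carry on with other work */
    }
    return -1;
}
```

A main loop would call `poll_datagram()` once per pass, between servicing its other tasks, and act only when the return value is positive.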
 

Offline mikeselectricstuff

  • Super Contributor
  • ***
  • Posts: 14126
  • Country: gb
    • Mike's Electric Stuff
Re: Ethernet interface for MCU
« Reply #44 on: January 09, 2025, 12:51:45 pm »
Quote
An important part in all this was a cheap way of sniffing Ethernet packets at 100 Mbit/s. I used an old Netgear DS104 hub that would do the job as soon as some nodes were configured as half duplex.

Regards, Dieter
If you like to see your ethernet  packets scope/logic analyzer style, I still have a few of these left: www.etherdecode.com
Youtube channel:Taking wierd stuff apart. Very apart.
Mike's Electric Stuff: High voltage, vintage electronics etc.
Day Job: Mostly LEDs
 
The following users thanked this post: nctnico

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #45 on: January 09, 2025, 02:26:24 pm »
Tellurium, I simply do not believe those numbers are meaningful.

Something else is going on.

My target is doing 1MB/sec on file transfers, and that is with a very sub-optimal low level input implementation whereby I have an RTOS task polling for input:

Code:

/**
  * This function is the ethernetif_input task. It uses the function low_level_input()
  * that handles the actual reception of bytes from the network interface.
  *
  * This is a standalone RTOS task so is a forever loop.
  * Could be done with interrupts but then we have the risk of hanging the xxx with fast input
  * (unlikely if ETH is via a switch, but you could have a faulty LAN with lots of
  * broadcasts) plus we have the issue of link status change detection in a thread-safe way.
  *
  */

void ethernetif_input( void * argument )
{
    struct pbuf *p;
    struct netif *netif = (struct netif *) argument;
    uint32_t link_change_check_count = ETH_LINK_CHANGE_COUNT;

    // Define RX activity timer, for dropping fast poll down to slow poll
    // (xTimerCreate returns a TimerHandle_t, not a pointer to one)
    TimerHandle_t rxactive_timer = xTimerCreate("ETH RX active timer", pdMS_TO_TICKS(ETH_SLOW_POLL_DELAY), pdFALSE, NULL, RXactiveTimerCallback);

    // Start "rx active" timer
    xTimerStart(rxactive_timer, 20); // 20 is just a wait time for timer allocation

    do
    {
        p = low_level_input( netif ); // This sets rxactive=true if it sees data

        if (p != NULL)
        {
            if (netif->input( p, netif) != ERR_OK )
            {
                pbuf_free(p);
            }
        }

        if (rxactive)
        {
            rxactive = false;
            // Seen rx data - reload timeout
            xTimerReset(rxactive_timer, 20); // Reload "rx active" timeout (with ETH_SLOW_POLL_DELAY)
            // and get osDelay below to run fast
            rx_poll_period = ETH_RX_FAST_POLL_INTERVAL;
        }

        // This has a dramatic effect on ETH speed, both ways (TCP/IP acks all packets)
        osDelay(rx_poll_period);

        // Do ETH link status change check
        link_change_check_count--;
        if (link_change_check_count == 0)
        {
            // reload counter
            link_change_check_count = ETH_LINK_CHANGE_COUNT;

            // Get most recently recorded link status
            bool net_up = netif_is_link_up(&g_xxx_netconf_netif);

            // Read the physical link status
            ethernetif_set_link(&g_xxx_netconf_netif);

            // Has the link status changed
            if (net_up != netif_is_link_up(&g_xxx_netconf_netif))
            {
                ethernetif_update_config(&g_xxx_netconf_netif);

                if (net_up) {
                    // Link was up so must have dropped
                    debug_thread_printf("Ethernet link down");
                }
                else {
                    // Link was down so must be up - restart DHCP
                    debug_thread("Ethernet link up");
                    xxx_network_restart_DHCP();
                }
            }
        }

    } while(true);

}
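
The fast/slow poll switching in the task above can be modelled without an RTOS. The following is only a sketch of the idea: the names (poll_state_t, poll_init, poll_update) and the three timing constants are hypothetical, the real ETH_* values are not shown in the post, and it assumes the one-shot timer callback (not shown either) simply drops the poll period back to the slow value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real ETH_* constants are not shown. */
#define FAST_POLL_MS      2u    /* poll interval while traffic is flowing */
#define SLOW_POLL_MS      20u   /* poll interval when the link is idle */
#define ACTIVE_TIMEOUT_MS 500u  /* idle time before dropping back to slow polling */

typedef struct {
    uint32_t poll_ms;  /* current delay between polls */
    uint32_t idle_ms;  /* time since a packet was last seen */
} poll_state_t;

void poll_init(poll_state_t *s)
{
    s->poll_ms = SLOW_POLL_MS;
    s->idle_ms = 0;
}

/* Call once per loop pass, after having slept for s->poll_ms.
 * Returns the delay to use before the next poll. */
uint32_t poll_update(poll_state_t *s, bool rx_seen)
{
    if (rx_seen) {
        /* Data seen: restart the idle timeout and poll fast. */
        s->idle_ms = 0;
        s->poll_ms = FAST_POLL_MS;
    } else {
        /* Nothing seen: account for the time just slept. */
        s->idle_ms += s->poll_ms;
        if (s->idle_ms >= ACTIVE_TIMEOUT_MS) {
            /* Idle for long enough: fall back to slow polling. */
            s->poll_ms = SLOW_POLL_MS;
            s->idle_ms = ACTIVE_TIMEOUT_MS; /* saturate, avoids overflow */
        }
    }
    return s->poll_ms;
}
```

With these assumed numbers, a single received packet keeps the task polling every 2 ms for the next 500 ms, after which it returns to 20 ms polling.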

There simply aren't several seconds' worth of internet transactions involved in the TLS setup. The CPU time is spent firmly in the RSA or EC crypto. So your 10x difference could well be RSA versus EC.

This is a very old issue, which is why there are chips which do hardware RSA, in ~100ms. If you could do it in 400ms, then a modern 500MHz arm32 would do RSA in 100ms in software, EC much faster, and nobody would be using those coprocessor chips (well except for "secure" private key storage).

Can you supply actual numbers for your RSA and EC code, and at what MHz?
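
One way to get comparable numbers is to read a free-running cycle counter before and after the crypto call and convert the difference to milliseconds at the known core clock. On Cortex-M3/M4/M7 parts the DWT cycle counter (DWT->CYCCNT) is commonly used for this. The sketch below shows only the arithmetic: the function names are hypothetical, and the 168 MHz figure in the comment is an assumed clock, not a number from this thread.

```c
#include <stdint.h>

/* Cycles elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction gives the right answer across a single wrap. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to whole milliseconds at the given core clock.
 * The 64-bit intermediate avoids overflow in cycles * 1000.
 * Note: at an assumed 168 MHz, a 32-bit counter wraps after about 25.5 s,
 * so a single measurement must be shorter than that. */
uint32_t cycles_to_ms(uint32_t cycles, uint32_t cpu_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000u) / cpu_hz);
}
```

Usage would be along the lines of: read the counter, run one signature verification, read the counter again, then report cycles_to_ms(cycles_elapsed(t0, t1), cpu_hz) together with the clock frequency.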

Other apps I have running that use sockets show no performance issues. For example an MbedTLS application (HTTPS client) does a file download at a decent speed. I personally use the Netconn API of LWIP which is below the socket layer but still...

Quote
STM32 ethernet driver implementation is famous for its bloat and low quality. It is infested with bugs which ST does not manage to fix for years.

Where exactly is the "bloat and low quality"? The 32F4 ETH subsystem runs a linked list of packets (I've not studied it in detail; it is very complicated and I have better things to do). The code I posted interfaces between the buffer structure of LWIP and this packet list. It doesn't look like it can be made much shorter. There are zero-copy versions posted on the ST forum, from "Piranha", but I am not sure anybody understands them, and he appears to have vanished. I too was concerned about the data moving, so I did timing measurements with a scope, and the software copy my (ST) code uses takes an insignificant time. It's hard to see where much time can be saved. Replacing the memcpy() with DMA would have made no difference in any likely IoT application.
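
For anyone unfamiliar with what that software copy looks like, here is a rough sketch of flattening a chained buffer into one contiguous frame buffer with memcpy(). It is not the ST driver code. struct chain_buf is a hypothetical, simplified stand-in for LWIP's struct pbuf (which also has next, payload and len fields), and chain_copy_out is an invented name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified stand-in for a chained packet buffer. */
struct chain_buf {
    struct chain_buf *next;  /* next segment in the chain, or NULL */
    void *payload;           /* data held by this segment */
    uint16_t len;            /* number of bytes in this segment */
};

/* Copy every segment of the chain into dst, back to back.
 * Returns the total number of bytes copied, or 0 if the chain does not
 * fit in dst_size (dst may then hold a partial copy). */
size_t chain_copy_out(const struct chain_buf *p, uint8_t *dst, size_t dst_size)
{
    size_t total = 0;

    for (const struct chain_buf *q = p; q != NULL; q = q->next) {
        if (q->len > dst_size - total) {
            return 0; /* segment would overflow the destination */
        }
        memcpy(dst + total, q->payload, q->len);
        total += q->len;
    }
    return total;
}
```

The cost is one memcpy() per segment, and a full-size Ethernet frame is about 1500 bytes, which fits with the copy time looking insignificant on a scope.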

As regards bugs, the guy who did this project originally spent maybe a month full time equivalent patching the code from bug reports found all over the internet, which indeed is crap from ST, but that is historical.

Regarding LWIP config, yes, very true, it is poorly documented - symptomatic of a project which started ~17 years ago: the devs have long ago got themselves girlfriends and moved on :) and nobody wrote up a "how to get started quick" guide for it. But the 17 years also work in its favour: bugs should have been worked out of it, over literally millions of devices in the field. And it is very unlikely that everybody is using the same 10% of the functionality i.e. concealing major bugs. That could well be true for MbedTLS though. I spent ages googling for info on how to set up the buffer sizes etc., testing and documenting it. It's actually easy if you throw a lot of RAM at it (say 50k) but I am not doing that. A google for lwipopts.h yields loads of hits and eventually you find some useful info... A crap situation but typical of open source software! The alternative is closed commercial sources, or low deployment number sources where one doesn't really know how well they have been tested.

« Last Edit: January 09, 2025, 02:51:00 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline SCSKITS

  • Contributor
  • Posts: 47
  • Country: us
  • Retired electrical engineer
Re: Ethernet interface for MCU
« Reply #46 on: January 09, 2025, 03:30:40 pm »
It's wired Ethernet, but the Lantronix X-Ports are easy to use. From what I remember though, they are a bit expensive, about $50 to $70 depending on version. The ones I used were a simple serial interface. No special PCB requirements either.
SCS, DIY upgrades for older test equipment
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Ethernet interface for MCU
« Reply #47 on: January 09, 2025, 06:36:56 pm »
Quote
If BSD sockets are slow, you are doing something wrong because the BSD sockets were designed in the days when computers were significantly slower compared to modern day micro controllers.

I am not doing anything wrong.

In the benchmark I mentioned, I took a board and the stock examples for Zephyr and ST Cube (which used LWIP and LWIP's httpd), built them, and measured the QPS using the "siege" utility. I did not "do" anything wrong because I did not touch the code.

I know very well how BSD sockets were designed (I have a video about that), and I know how they are implemented in LWIP and Zephyr (see my note above).
Open source embedded network library https://github.com/cesanta/mongoose
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #48 on: January 09, 2025, 06:46:41 pm »
I don't think HTTPD which comes with LWIP (in ST Cube IDE) works as it is. I paid someone to do something with it and it turned out to be largely broken.

It would be great to find out where the 10:1 speed difference comes from.
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 
The following users thanked this post: nctnico

Online peter-h

  • Super Contributor
  • ***
  • Posts: 4420
  • Country: gb
  • Doing electronics since the 1960s...
Re: Ethernet interface for MCU
« Reply #49 on: January 10, 2025, 09:15:46 am »
This is a fairly typical case with MbedTLS pinging the well-known hc-ping.com device failure notification service. The numbers on the left are ms, so you are looking at 6 seconds:

Code: [Select]
2595367457: healthcheck url= https://hc-ping.com/8cd78cf4-eb2b-4844-b572-xxxxxxxxxxxxxxxxx
2595367461:
              . Seeding the random number generator...
2595367468:  ok

2595367469:   . Connecting to tcp/hc-ping.com/443...
2595367503:  connect ok

2595367509:   . Setting up the SSL/TLS structure...
2595367515:  ok

2595367523:   . Performing the SSL/TLS handshake...

Certificate verify (depth 2) for:

cert. version     : 3
serial number     : 5C:8B:99:C5:5A:94:C5:D2:71:56:DE:CD:89:80:CC:26
issuer name       : C=US, ST=New Jersey, L=Jersey City, O=The USERTRUST Network, CN=USERTrust ECC Certification Authority
subject name      : C=US, ST=New Jersey, L=Jersey City, O=The USERTRUST Network, CN=USERTrust ECC Certification Authority
issued  on        : 2010-02-01 00:00:00
expires on        : 2038-01-18 23:59:59
signed using      : ECDSA with SHA384
EC key size       : 384 bits
basic constraints : CA=true
key usage         : Key Cert Sign, CRL Sign

This certificate has no flags

Certificate verify (depth 1) for:

cert. version     : 3
serial number     : F3:64:4E:6B:6E:00:50:23:7E:09:46:BD:7B:E1:F5:1D
issuer name       : C=US, ST=New Jersey, L=Jersey City, O=The USERTRUST Network, CN=USERTrust ECC Certification Authority
subject name      : C=GB, ST=Greater Manchester, L=Salford, O=Sectigo Limited, CN=Sectigo ECC Domain Validation Secure Server CA
issued  on        : 2018-11-02 00:00:00
expires on        : 2030-12-31 23:59:59
signed using      : ECDSA with SHA384
EC key size       : 256 bits
basic constraints : CA=true, max_pathlen=0
key usage         : Digital Signature, Key Cert Sign, CRL Sign
ext key usage     : TLS Web Server Authentication, TLS Web Client Authentication

This certificate has no flags

Certificate verify (depth 0) for:

cert. version     : 3
serial number     : 33:6D:EC:80:31:F7:A3:44:B6:A1:0E:56:DB:7B:9B:CD
issuer name       : C=GB, ST=Greater Manchester, L=Salford, O=Sectigo Limited, CN=Sectigo ECC Domain Validation Secure Server CA
subject name      : CN=hc-ping.com
issued  on        : 2024-09-17 00:00:00
expires on        : 2025-10-16 23:59:59
signed using      : ECDSA with SHA256
EC key size       : 256 bits
basic constraints : CA=false
subject alt name  : hc-ping.com, www.hc-ping.com
key usage         : Digital Signature
ext key usage     : TLS Web Server Authentication, TLS Web Client Authentication

This certificate has no flags

2595373489:  TLS handshake ok

2595373495:   . Verifying peer X.509 certificate...
2595373501:  ok

2595373507: Using TLS ciphersuite: TLS-ECDHE-ECDSA-WITH-CHACHA20-POLY1305-SHA256
2595373513:   > Write to server:
2595373519:  72 initial bytes written:
2595373525:   < Read from server:
2595373931: Headers length = 188, Content-Length = 2

2595373937:  190 bytes read:
2595373943: HTTP/1.1 200 OK
server: nginx
date: Fri, 10 Jan 2025 09:12:02 GMT
content-type: text/plain; chars...


2595374063: http incoming connection

There is a ~3 sec wait after the "Performing the SSL/TLS handshake..." and before the long certificate debugs come out. They come out in about 100ms.
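
The phase durations can be pulled straight out of a log like this by subtracting the leading millisecond timestamps. A small sketch, assuming every timed line starts with "<decimal ms>:" as in the log above (the function names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Extract the leading "<ms>:" timestamp from a log line.
 * Returns false for lines that do not start with a number and a colon. */
bool log_timestamp_ms(const char *line, uint32_t *ts_out)
{
    char *end;
    unsigned long v = strtoul(line, &end, 10);

    if (end == line || *end != ':') {
        return false; /* no digits, or digits not followed by ':' */
    }
    *ts_out = (uint32_t)v;
    return true;
}

/* Milliseconds between two timestamped log lines.
 * Unsigned subtraction stays correct across one 32-bit wrap of the tick count. */
bool log_elapsed_ms(const char *first, const char *second, uint32_t *elapsed_out)
{
    uint32_t t0, t1;

    if (!log_timestamp_ms(first, &t0) || !log_timestamp_ms(second, &t1)) {
        return false;
    }
    *elapsed_out = t1 - t0;
    return true;
}
```

Fed the "Performing the SSL/TLS handshake..." line (2595367523) and the "TLS handshake ok" line (2595373489), this gives 5966 ms for the handshake itself.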

When doing these tests one must do it against a specific website and server configuration, obviously.
« Last Edit: January 10, 2025, 10:50:58 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

