Author Topic: Anyone here familiar with LWIP? (Read 17062 times)

peter-h · « **on:** July 18, 2022, 01:20:41 pm »

This is a weird one. I posted it on the ST forum but nobody there replies (a huge number of posts every day).

To interface LWIP to the low level ETH hardware, one has a file normally called ethernetif.c. This contains low_level_input() and low_level_output() functions. In the simplest case (mine) these are polled, not interrupt-driven.

My latter function - this code is all over the internet, in pretty well identical form - is

Code: [Select]

static err_t low_level_output(struct netif *netif, struct pbuf *p)
{
  err_t errval;
  struct pbuf *q;
  uint8_t *buffer = (uint8_t *)(EthHandle.TxDesc->Buffer1Addr);
  __IO ETH_DMADescTypeDef *DmaTxDesc;
  uint32_t framelength = 0;
  uint32_t bufferoffset = 0;
  uint32_t byteslefttocopy = 0;
  uint32_t payloadoffset = 0;

  DmaTxDesc = EthHandle.TxDesc;
  bufferoffset = 0;
  
  /* copy frame from pbufs to driver buffers */
  for(q = p; q != NULL; q = q->next)
  {
    /* Is this buffer available? If not, goto error */
    if((DmaTxDesc->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
    {
      errval = ERR_USE;
      goto error;
    }
    
    /* Get bytes in current lwIP buffer */
    byteslefttocopy = q->len;
    payloadoffset = 0;
    
    /* Check if the length of data to copy is bigger than Tx buffer size*/
    while( (byteslefttocopy + bufferoffset) > ETH_TX_BUF_SIZE )
    {

  //osDelay(2); // TODO

      /* Copy data to Tx buffer*/
      memcpy( (uint8_t*)((uint8_t*)buffer + bufferoffset), (uint8_t*)((uint8_t*)q->payload + payloadoffset), (ETH_TX_BUF_SIZE - bufferoffset) );
      
      /* Point to next descriptor */
      DmaTxDesc = (ETH_DMADescTypeDef *)(DmaTxDesc->Buffer2NextDescAddr);
      
      /* Check if the buffer is available */
      if((DmaTxDesc->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
      {
        errval = ERR_USE;
        goto error;
      }
      
      buffer = (uint8_t *)(DmaTxDesc->Buffer1Addr);
      
      byteslefttocopy = byteslefttocopy - (ETH_TX_BUF_SIZE - bufferoffset);
      payloadoffset = payloadoffset + (ETH_TX_BUF_SIZE - bufferoffset);
      framelength = framelength + (ETH_TX_BUF_SIZE - bufferoffset);
      bufferoffset = 0;
    }
    
    /* Copy the remaining bytes */
    memcpy( (uint8_t*)((uint8_t*)buffer + bufferoffset), (uint8_t*)((uint8_t*)q->payload + payloadoffset), byteslefttocopy );
    bufferoffset = bufferoffset + byteslefttocopy;
    framelength = framelength + byteslefttocopy;
  }
  
  /* Prepare transmit descriptors to give to DMA */ 
  sys_mutex_lock(&lock_eth_if_out);
  HAL_ETH_TransmitFrame(&EthHandle, framelength);
  sys_mutex_unlock(&lock_eth_if_out);
  
  errval = ERR_OK;
  
error:
  
  /* When Transmit Underflow flag is set, clear it and issue a Transmit Poll Demand to resume transmission */
  if ((EthHandle.Instance->DMASR & ETH_DMASR_TUS) != (uint32_t)RESET)
  {
    /* Clear TUS ETHERNET DMA flag */
    EthHandle.Instance->DMASR = ETH_DMASR_TUS;
    
    /* Resume DMA transmission*/
    EthHandle.Instance->DMATPDR = 0;
  }
  return errval;
}

Note the commented-out osDelay(2). This fixed a bug whereby a large data transfer (2MB) was fairly often corrupted. An examination of the data found that the size was preserved but some data appeared in 2 places consecutively, as if a buffer was being written before previous data was extracted out of it (by the dedicated DMA controller which services the 32F4 ETH subsystem).

I suspected the memcpy was running while the DMA was still reading that buffer. That delay immediately fixed the issue. I use (2) rather than (1) because osDelay(1) is actually 0ms to 1ms.

The obvious Q is: how come this works for other people? Or maybe lots of people fixed this bug and never posted about it. OTOH I can easily see that most users would never discover it, because it shows up only on

- big transfers (1MB+)
- data source is very fast (my flash FS read speed is > 1mbyte/sec)

If point #2 is not the case then this won't show up because LWIP will be feeding the packets to low_level_output too slowly and the DMA will always be under-running. Those people are wasting RAM having more than 1 TX buffer anywhere, too, so all the stuff all over the internet about "tuning TCP" is wasted.

My first suspicion was that this bit of code (which AIUI is supposed to check if there is a DMA TX buffer available) is supposed to be before that memcpy which is above it, but actually if one looks at the program flow, it should be ok because it is repeated elsewhere.

Code: [Select]

      /* Check if the buffer is available */
      if((DmaTxDesc->Status & ETH_DMATXDESC_OWN) != (uint32_t)RESET)
      {
        errval = ERR_USE;
        goto error;
      }

My other thought was: why return an error code if no TX buffer available when a) the TX DMA cannot fail (unless the silicon is duff); b) at 10/100mbps, a buffer will become available very fast; c) the error condition is returned to LWIP and according to google (there is virtually zero support on LWIP anywhere, even on LWIP mailing lists) LWIP does not always retransmit on a TX error.

Anyway, the osDelay(2) is a bad bodge, so I fixed it inside HAL_ETH_TransmitFrame() by waiting for the DMA status to show "all transfers complete", with

Code: [Select]

  // Make this function blocking, otherwise following code overwrites the last DMA buffer!
  if ( (((heth->Instance)->DMASR) & (0x7 << 20)) != 0 )
  {
	  taskYIELD();   // not really necessary since the time here would be a max of 1 MTU at 10-100mbps
  }

which is probably suboptimal because it prevents LWIP getting the next packet ready while DMA is transmitting the previous one to ETH. With some test code, the output speed is about 200kbytes/sec which is totally fine for the application.
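For reference, a small sketch of decoding the field that the `(0x7 << 20)` mask selects. As I read the STM32F4 reference manual (RM0090), bits 22:20 of ETH_DMASR are TPS, the transmit process state; the state names below are from memory and should be checked against the manual. `eth_dma_tps()` and `eth_dma_tps_name()` are made-up helper names, not HAL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Transmit process state field of ETH_DMASR: bits 22:20 (assumed from RM0090). */
#define DMASR_TPS_SHIFT 20u
#define DMASR_TPS_MASK  (0x7u << DMASR_TPS_SHIFT)

/* Hypothetical helper: extract the 3-bit TPS value from a raw DMASR reading. */
static inline uint32_t eth_dma_tps(uint32_t dmasr)
{
    return (dmasr & DMASR_TPS_MASK) >> DMASR_TPS_SHIFT;
}

/* Hypothetical helper: human-readable name for a TPS value (names from memory). */
static const char *eth_dma_tps_name(uint32_t tps)
{
    static const char *const names[8] = {
        "stopped",
        "running: fetching descriptor",
        "running: waiting for status",
        "running: reading buffer into TX FIFO",
        "reserved",
        "reserved",
        "suspended: descriptor unavailable or underflow",
        "running: closing descriptor"
    };
    assert(tps < 8u);
    return names[tps];
}
```

If that table is right, the state after a finished frame with nothing else queued would be 6 (suspended) rather than 0 (stopped), so a test for "transmitter idle" probably wants to accept either value. Logging `eth_dma_tps_name(eth_dma_tps(heth->Instance->DMASR))` at the point of the check would show which states actually occur.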

I also found that replacing the "check if buffer is available" with

Code: [Select]

     // Is this buffer available? If not, wait
      while ( (DmaTxDesc->Status & ETH_DMATXDESC_OWN) != 0 ) {}

works too, and the function never returns an error code.
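A middle ground between "return an error at once" and "spin forever" is a bounded wait: poll the OWN bit a limited number of times, optionally yielding between polls, and only report failure if the DMA never releases the buffer. The sketch below runs on a PC against a simulated descriptor; `sim_tx_desc_t`, `tx_desc_wait_free()` and `sim_dma_tick()` are hypothetical names, and `SIM_TXDESC_OWN` only stands in for ETH_DMATXDESC_OWN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_TXDESC_OWN 0x80000000u   /* stand-in for ETH_DMATXDESC_OWN */

typedef struct {
    volatile uint32_t Status;        /* DMA clears OWN when it is done with the buffer */
} sim_tx_desc_t;

/* Hypothetical hook called between polls, e.g. to yield to other tasks. */
typedef void (*wait_hook_t)(void *ctx);

/* Poll until the CPU owns the descriptor, calling the hook between polls.
 * Gives up after max_spins unsuccessful polls plus one final check.
 * Returns true if the buffer may be written, false on timeout. */
static bool tx_desc_wait_free(sim_tx_desc_t *d, uint32_t max_spins,
                              wait_hook_t hook, void *ctx)
{
    assert(d != NULL);
    for (uint32_t i = 0; i <= max_spins; i++) {
        if ((d->Status & SIM_TXDESC_OWN) == 0u) {
            return true;
        }
        if (hook != NULL) {
            hook(ctx);
        }
    }
    return false;
}

/* Stand-in for the DMA engine: releases the descriptor after a few polls. */
typedef struct {
    sim_tx_desc_t *desc;
    uint32_t polls_until_done;
} sim_dma_t;

static void sim_dma_tick(void *ctx)
{
    sim_dma_t *dma = (sim_dma_t *)ctx;
    if (dma->polls_until_done > 0u && --dma->polls_until_done == 0u) {
        dma->desc->Status &= ~SIM_TXDESC_OWN;
    }
}
```

In low_level_output() the hook would be something like taskYIELD(), and a `false` return would map to the existing `errval = ERR_USE; goto error;` path, so a stuck DMA cannot hang the task.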

Does anyone know anything about this stuff?

ttt · « **Reply #1 on:** July 18, 2022, 08:34:27 pm »

The STM32F4 series have I+D caches. In case you have that enabled have you tried to flush the cache before the memcpy (setting DCRST and ICRST bits in FLASH_CR)?

wek · « **Reply #2 on:** July 18, 2022, 10:32:49 pm »

Quote from: ttt on July 18, 2022, 08:34:27 pm

The STM32F4 series have I+D caches. In case you have that enabled have you tried to flush the cache before the memcpy (setting DCRST and ICRST bits in FLASH_CR)?

Those caches are on the FLASH interface. There's no reason to flush them unless you change the FLASH content (i.e. reprogram FLASH), and it's not the case here.

JW

wek · « **Reply #3 on:** July 19, 2022, 12:13:39 am »

Quote

An examination of the data found that the size was preserved but some data appeared in 2 places consecutively

Dropped frames should result in data missing, not corrupted.

And IP should be OK with missing data, that's why TCP.

I don't offer answers, just doubts.

JW

peter-h · « **Reply #4 on:** July 19, 2022, 09:52:14 am »

Quote

The STM32F4 series have I+D caches. In case you have that enabled have you tried to flush the cache before the memcpy (setting DCRST and ICRST bits in FLASH_CR)?

I found that too but all I could find is that the H7 has this issue, not the F4 - as wek says above.

Quote

Dropped frames should result in data missing, not corrupted.

I don't think frames are dropped. The packet serial numbering system would pick that up. I think, with my very limited knowledge of TCP/IP, that if you overwrite that buffer with "junk" and the buffer was holding a complete packet, then the ETH controller will generate a "good" CRC on the packet (this I believe is a low level hardware feature; LWIP is not computing a CRC32, is it?) as it is being transmitted. And since there is no end-to-end CRC, the corruption is not detected.

Possibly my bold text above is irrelevant and actually each of the buffers does simply get sent as a packet on ETH, with a 1:1 buffer-packet relationship. I know the 32F4 ETH controller picks up buffers according to order in some sort of list, but I don't think it concatenates them to fill up an MTU.

For amusement: Many years ago I designed a token-ring LAN (before ETH was cost-effective or even less than unbelievably complicated, using a Z180+85c30, SDLC packets, and a Manchester encoded isolated MIL1553 physical interface). I did that after a contracted-out LAN project with the WD2840 token ring controller ground to a halt (£500/day in 1985!) and I did it myself the hard way. I had issues like this too. With just CRC16, errors would slip through if there was enough noise.

wek · « **Reply #5 on:** July 19, 2022, 11:17:23 am »

> ETH controller will generate a "good" CRC on the packet (this I believe is a low level hardware feature; LWIP is not computing a CRC32, is it?

If you've set so.

The initial post in https://community.st.com/s/question/0D53W00001bKph9SAC/hardware-ipv4-checksum-on-an-stm32f407-is-not-working-though-all-the-proper-settings-are-set-works-on-rare-occasions-oddly sums it up.

> if you overwrite that buffer with "junk"

The question is, where does the buffer overwrite happen, exactly. It might've happened at the low_level_output() routine - which would imply that the OWN bit checking does not work as intended, for whatever reason; but also before that, still in the IP/TCP stack.

JW

peter-h · « **Reply #6 on:** July 19, 2022, 12:01:19 pm »

Thank you.

Looking in opt.h and lwipopts.h, the way this is supposed to work is that opt.h provides the default defines for lwip and lwipopts.h overrides those defined in opt.h. Looking at both files, in my project, it looks like only ipv6 checksums are done in software. I am not supporting ipv6 anyway.
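The override mechanism is plain preprocessor: opt.h pulls in lwipopts.h first and then wraps each of its own defaults in `#ifndef`, so anything the project already defined wins. A minimal stand-alone illustration follows; the default values are as I remember them from opt.h, the 16k figure is just an example, and the two `effective_*()` helpers are invented so the result can be inspected.

```c
#include <assert.h>

/* --- what a project's lwipopts.h might contain (example value only) --- */
#define MEM_SIZE (16 * 1024)

/* --- what opt.h does for each option: supply a default only if unset --- */
#ifndef MEM_SIZE
#define MEM_SIZE 1600
#endif

#ifndef TCP_MSS
#define TCP_MSS 536
#endif

/* Hypothetical helpers so the effective values can be checked at run time. */
static int effective_mem_size(void) { return MEM_SIZE; }
static int effective_tcp_mss(void)  { return TCP_MSS; }
```

Here MEM_SIZE keeps the project's value and TCP_MSS falls back to the default, which is why an option missing from lwipopts.h still has a setting: the one in opt.h.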

Quote

The STM32F4 Ethernet peripheral is picky; it wants to initially see zeroes in the checksum fields it is going to hardware-calculate. If the checksum field goes in zero, it comes out on the wire calculated correctly. But if it goes in nonzero, it comes out on the wire as zeroes instead.

I haven't got a clue how or where to zero that value. However, this sounds like a consistent thing, so it should be easy to reproduce.
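On the lwIP side, the knobs I know of are the CHECKSUM_GEN_IP, CHECKSUM_GEN_UDP, CHECKSUM_GEN_TCP and CHECKSUM_GEN_ICMP options in lwipopts.h. As far as I understand, setting one to 0 makes the stack skip that checksum and leave the field at zero for the hardware, but that is worth confirming in the lwIP version in use. The sketch below is not lwIP code; it is a plain RFC 1071 Internet checksum showing why the field has to be zero while the sum is being computed. `inet_checksum()` and `ipv4_set_checksum()` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum over len bytes, read as big-endian 16-bit words. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    assert(data != NULL || len == 0);
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1) {
        sum += (uint32_t)data[0] << 8;        /* odd trailing byte, zero padded */
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);  /* fold carries back in */
    }
    return (uint16_t)~sum;
}

/* Fill in the IPv4 header checksum (bytes 10..11): zero the field first,
 * sum the header, then store the result. */
static void ipv4_set_checksum(uint8_t *hdr, size_t hdr_len)
{
    assert(hdr != NULL && hdr_len >= 20);
    hdr[10] = 0;
    hdr[11] = 0;
    uint16_t c = inet_checksum(hdr, hdr_len);
    hdr[10] = (uint8_t)(c >> 8);
    hdr[11] = (uint8_t)(c & 0xFFu);
}
```

A receiver checks a header by summing it including the stored checksum; a correct header gives 0 from `inet_checksum()`. If stale bytes were left in the field while generating, the stored value would be wrong, which is presumably why the hardware insists on seeing zeroes there.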

Quote

The question is, where does the buffer overwrite happen, exactly. It might've happened at the low_level_output() routine - which would imply that the OWN bit checking does not work as intended, for whatever reason; but also before that, still in the IP/TCP stack.

Indeed. I think it was the memcpy that was doing the overwrite, since delaying just before it fixed the problem. But it is very marginal; a board I am working on, with a load of RTOS tasks running, doesn't show it. I am transferring 2MB jpegs as the easiest way to see data corruption and haven't seen this problem. I will come back to it later.

As an interesting aside, ref this
#define ETH_TXBUFNB 2 // (4U) /* 4 Tx buffers of size ETH_TX_BUF_SIZE */
I have found that with 2 or anything bigger I am getting a tx speed of 120kbytes/sec, with 1 it falls to half that. The flash file read speed is about 5x to 10x bigger so there is clearly a big bottleneck in LWIP. For this application this is fine but maybe it is indicative of something? Removing the above mentioned wait for ETH DMA to be totally finished after each transfer (making it blocking) also makes no difference. Increasing MEM_SIZE in lwipopts.h dramatically (from 6k to say 16k) improves the tx speed only 20%. I will post this in the other thread.

peter-h · « **Reply #7 on:** July 21, 2022, 02:25:56 pm »

This issue is very hard to reproduce. Now, it is gone. I have changed a few things around. It was seen only on a board running a version of the software with most RTOS tasks stripped out (which is what I might expect, because it looks like LWIP is feeding in the packets extra fast) and I tried that too, without success.

So I left in that bit of code which checks the 000 "DMA all finished" bits. It makes no difference to performance that I can see.

betocool · « **Reply #8 on:** July 21, 2022, 11:16:23 pm »

I've had failures and wins with LWIP, on F7s and H7, and long long time ago on an Atmel UC100 or something along those lines, 2009.

I don't have the references here unfortunately, but what I do to increase throughput is to increase the size of the receive buffer to, say, 1024 bytes or even 1536 (I think that is the max Ethernet transfer size), as well as increasing the number of buffers used for LwIP on the config files. In an F7 or H7 memory is not a problem usually, so I allocate a lot.

I had one application where I used TCP/IP raw (bare metal), and it worked, but it took a lot of tinkering really to get it to the speeds I needed.

After that, I tried using TCP/IP with FreeRTOS, and again after some tinkering I got it working steady. I never tested it more than a few hours at a time though, but speeds about 10 MBit or so were ok. The important thing here, at least for me, was to have a good enough data receive buffer before writing data from the Micro to the PC, for that I used a reasonably large StreamBuffer.

Also, last thing, I always prefer to set up the micro as a server, and any application as clients on PCs. I feel the micro doesn't really know where he is sending the data to a priori. I usually use micros as data gatherers, eventually some processing, before bulk sending to a PC. I have not come across an application where a PC would have to send tons of data to a micro.

My 2c's worth.

Cheers,

Alberto

peter-h · « **Reply #9 on:** July 22, 2022, 08:21:22 am »

I have been tinkering with it and (in retrospect not surprisingly) found a huge bottleneck in the polled low level receive

Code: [Select]

/**
  * This function is the ethernetif_input task. It uses the function low_level_input()
  * that should handle the actual reception of bytes from the network
  * interface. Then the type of the received packet is determined and
  * the appropriate input function is called.
  *
  * This is a standalone RTOS task so is a forever loop.
  * Could be done with interrupts but then we have the risk of hanging the KDE with fast input.
  *
  */
void ethernetif_input( void * argument )
{
	struct pbuf *p;
	struct netif *netif = (struct netif *) argument;

	// This mutex position is based on an idea from Piranha here, with the delay added
	// https://community.st.com/s/question/0D50X0000BOtUflSQF/bug-stm32-lwip-ethernet-driver-rx-deadlock

	do
    {
		sys_mutex_lock(&lock_eth_if_in);
		p = low_level_input( netif );
		sys_mutex_unlock(&lock_eth_if_in);

		if (p!=NULL)
		{
			if (netif->input( p, netif) != ERR_OK )
			{
				pbuf_free(p);
			}
		}

		osDelay(10);	// this has a dramatic effect on ETH speed, both ways

    } while(true);
}

This controls both up and down speed because TCP/IP is a duplex flow.

I am getting 250kbytes/sec which is plenty for this application. By going for osDelay(1) I get 1200kbytes/sec and one could probably get a bit more by making RX interrupt-driven but then you need to be careful with mutexes (can't call them from an ISR). So that's 10mbps too, like you got. But my LAN connection is 100mbps and I think the 32F4 ETH controller processes the packets so fast that having lots of low level buffering is wasted. In fact I get the same speed with just 1 low level (ETH DMA) buffer; now I am using 2 for TX and 2 for RX.

The above code is transferring 2500 bytes every 10ms. My experiments suggest that tweaking the lwipopts.h buffer numbers etc makes almost no difference at these low packet rates.
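The arithmetic behind that figure, and one possible variation, can be sketched on a PC. The task as posted hands over at most one frame per wake-up and then sleeps, so the sleep period sets a ceiling on throughput; a draining loop keeps calling the input function until it returns nothing before sleeping. Everything here is a simulation: `sim_rx_t` and `sim_low_level_input()` only mimic the driver, and whether draining is safe with the mutex arrangement above is an assumption to verify on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for driver and stack: frames waiting in the RX descriptors and
 * frames already handed to the stack. */
typedef struct {
    unsigned pending;    /* frames the (simulated) DMA has received */
    unsigned delivered;  /* frames passed on to the stack */
} sim_rx_t;

/* Mimics low_level_input(): returns 1 and consumes a frame if one is waiting. */
static int sim_low_level_input(sim_rx_t *rx)
{
    assert(rx != NULL);
    if (rx->pending == 0u) {
        return 0;
    }
    rx->pending--;
    return 1;
}

/* One wake-up of the loop as posted: at most one frame, then sleep. */
static unsigned poll_once(sim_rx_t *rx)
{
    unsigned n = 0;
    if (sim_low_level_input(rx)) { rx->delivered++; n++; }
    return n;
}

/* One wake-up of a draining loop: continue until the driver has nothing left. */
static unsigned poll_drain(sim_rx_t *rx)
{
    unsigned n = 0;
    while (sim_low_level_input(rx)) { rx->delivered++; n++; }
    return n;
}

/* Ceiling on throughput when each wake-up moves a fixed number of bytes. */
static unsigned long poll_limit_bytes_per_sec(unsigned long bytes_per_wakeup,
                                              unsigned long period_ms)
{
    assert(period_ms > 0u);
    return bytes_per_wakeup * 1000ul / period_ms;
}
```

With 2500 bytes per 10 ms wake-up, `poll_limit_bytes_per_sec(2500, 10)` gives 250000, matching the 250kbytes/sec observed. With four frames queued, `poll_once()` leaves three waiting for the next wake-up while `poll_drain()` clears all four.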

There are also some very basic points which you don't read about much. For example, the ETH running at 100mbps (most gear is 1000mbps but your 32F4 ETH is 10 or 100 only, so it will run at 100 unless you deliberately cripple it) and your TX packets will be disappearing down the wire so fast that it makes almost no difference how much or how little buffering you have set up. No point in having 10 packet buffers (15k of RAM) if only 1 ever gets used; one discovers this only by prefilling the buffer area with say 0xAA and then after a while taking a look at it. I reckon in a typical RTOS based scenario where you have a load of other code running, anything more than 2 TX buffers in ETH and 2 in LWIP will be wasted - in most applications.
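The 0xAA trick is easy to automate rather than eyeballing memory in the debugger. A minimal sketch, assuming the buffers sit back to back in one array (as the ETH DMA buffers normally do); `pool_prefill()` and `pool_count_touched()` are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_PATTERN 0xAAu

/* Fill the whole buffer pool with the pattern before the stack starts. */
static void pool_prefill(uint8_t *pool, size_t nbuf, size_t buf_size)
{
    assert(pool != NULL);
    memset(pool, FILL_PATTERN, nbuf * buf_size);
}

/* Count buffers holding at least one byte that differs from the pattern. */
static size_t pool_count_touched(const uint8_t *pool, size_t nbuf, size_t buf_size)
{
    size_t touched = 0;
    assert(pool != NULL);
    for (size_t b = 0; b < nbuf; b++) {
        const uint8_t *buf = pool + b * buf_size;
        for (size_t i = 0; i < buf_size; i++) {
            if (buf[i] != FILL_PATTERN) {
                touched++;
                break;
            }
        }
    }
    return touched;
}
```

Call `pool_prefill()` once at startup, run the worst-case traffic for a while, then read `pool_count_touched()`. A buffer whose only contents happened to be 0xAA bytes would be missed, so the count is a lower bound.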

arijav · « **Reply #10 on:** July 22, 2022, 07:26:39 pm »

I am sorry to be slightly off topic, but I would like to give you a recommendation. After months trying to get a reliable ETH + Raw API LWIP (w/o OS) setup with an STM32H7 for an application that required very high availability (sending high speed tcp data streams over weeks, in some cases months) I gave up, as after approx. 10 hours the STM always stopped sending packets. I have now moved to a different stack, FreeRTOS+TCP, and it is a real breeze; the connection is fully reliable.

https://www.freertos.org/FreeRTOS-Plus/FreeRTOS_Plus_TCP/index.html

If you need high availability I really recommend to move away from the buggy ETH+LWIP STM32 setup.

artag · « **Reply #11 on:** July 22, 2022, 07:52:28 pm »

I don't know that this is relevant, and it refers to an LWIP implementation I did on a PPC555 some years ago. But it did have a cause that surprised me.

I would get errors (exceptions) on transmit frames. The cause of these was DMA underrun, and it occurred when the memory bus was busy with some other DMA transfers. It served as a warning that DMA is not 'free performance' just because the CPU isn't doing it. Even though the DMA bus was wide, relatively high performance, and running between an internal peripheral and internal (afaicr) RAM buffers, it was possible even with 10Mb ethernet to use up all the bandwidth and both the concurrent transfers couldn't keep to their timing.

peter-h · « **Reply #12 on:** July 22, 2022, 09:20:03 pm »

I think so much depends on some small details.

I have FreeRTOS running a load of tasks, some quite complicated e.g. DMA->DACs waveform generator, and I have five ETH-using tasks. One is doing ICMP pings, one is an HTTP server, and two are doing complicated HTTPS/TLS data uploads to servers. The 5th is NTP (UDP packets).

It runs solidly. However, I do get crashes when the HTTP server is running. After a number of hours only. Here:
https://www.eevblog.com/forum/microcontrollers/32f4-hard-fault-trap-how-to-track-this-down/msg4315627/#msg4315627

The HTTP server is the only one using the LWIP "netconn" API. The others are using the RAW or Socket APIs. I have set up LWIP to be thread-safe (lwipopts.h) and that seems to work.

So, yeah, there may be bugs in there, but there can be bugs anywhere, and in today's fast moving world, how good software is depends primarily on a) how many people use it b) how varied their applications are c) whether the developer is responsive. I reckon most open source software is actually pretty crap and requires weeks or months of trawling the internet and patching bugs.

Artag - how did you solve that one?

artag · « **Reply #13 on:** July 22, 2022, 10:27:48 pm »

Quote from: peter-h on July 22, 2022, 09:20:03 pm

Artag - how did you solve that one?

I think we had to protect the ethernet transfer, which had the most stringent timing requirements. But I forget what the conflict was with - it may have been with the cache fill. Though that doesn't really add up with the use of internal memory. Maybe it was an external bandwidth conflict and we buffered the ethernet packet through internal ram to give it more bandwidth.

It was 15 years ago and I haven't worked there for much of that time !

peter-h · « **Reply #14 on:** July 23, 2022, 08:07:56 am »

LWIP, like TLS, FreeRTOS, and everything supplied by STM with Cube IDE, must have had hundreds of bug fixes just in the last few years. By the time I reach my actuarial life expectancy, 95.3% of the bugs will have been fixed, and the remaining 4.7% will remain because the coders have got themselves girlfriends

peter-h · « **Reply #15 on:** January 14, 2023, 04:29:11 pm »

Back to this, and looking for ideas...

I have a couple of my boxes, set up for serial - ETH, back to back, looping back all the way to the start. Sending test packets with a special tester.

It all works. Transit time is 10-20ms which is fully accounted for with my code, but I am finding that very occasionally the delay goes to 1-2 seconds.

I am running on a LAN so the ETH section is just a switch.

Does anyone have any idea what in a TCP/IP stack (LWIP in this case) could be causing these long gaps?

I don't know of any way to troubleshoot LWIP. It is open source but the devs abandoned it > 10 years ago, so a google finds a huge number of hits and few solutions. The usual stuff.

8goran8 · « **Reply #16 on:** January 14, 2023, 08:13:59 pm »

I would try to sniff the ethernet traffic with wireshark and observe the IP packets just before the delay. Perhaps some comparison could be made with recorded content when there is no lag. That might give me an idea where to start looking for the problem.

gmb42 · « **Reply #17 on:** January 16, 2023, 11:34:40 am »

Quote from: 8goran8 on January 14, 2023, 08:13:59 pm

I would try to sniff the ethernet traffic with wireshark and observe the IP packets just before the delay. Perhaps some comparison could be made with recorded content when there is no lag. That might give me an idea where to start looking for the problem.

If you're at all concerned about accurate timing of packets you should consider using a TAP. If the price of a TAP is too rich for your tastes then you could use a span or mirror port on a switch that supports such options but you will then lose some accuracy in timing.

An on-machine capture is the least reliable option for accurate timing, especially as off-loading and other NIC acceleration features can create a false picture of what actually goes out on the wire.

dut:Mark · « **Reply #18 on:** January 21, 2023, 12:14:24 pm »

Quote from: peter-h on January 14, 2023, 04:29:11 pm

Does anyone have any idea what in a TCP/IP stack (LWIP in this case) could be causing these long gaps?

I don't know of any way to troubleshoot LWIP. It is open source but the devs abandoned it > 10 years ago, so a google finds a huge number of hits and few solutions. The usual stuff.

Just FYI: latest 2.1.3 release dates from November 2021 as far as I can see.

With respect to your issues and assuming you are using TCP (not UDP or raw Ethernet): think about what happens if you encounter packet drop, delayed ACKs, exhausted memory pools etc. Any broadcast traffic happening on your network which may affect resources consumed by your network stack? Is anything executing at your nodes that may block your network stack? Keep in mind that what happens on a single node, may affect the other nodes as well. And if you are only sending small packets, look into Nagle's algorithm and the silly window syndrome. As others mentioned, use Wireshark to investigate and keep in mind that TCP does not make any promises about time of delivery.

peter-h · « **Reply #19 on:** January 22, 2023, 11:11:07 am »

I have disabled Nagle algorithm. Or at least I think I have - this is a part of the code

Code: [Select]


        //set the socket to non-blocking
        int flags = O_NONBLOCK;
        ioctl(fd, FIONBIO, &flags);

        // Disable Nagle algorithm for faster response
        flags = TCP_NODELAY;
        ioctl(fd, FIONBIO, &flags);

It would have accounted for a lot of issues I was seeing. But ideally I would like to disable it globally for LWIP; the only code examples I can find disable it on a per-socket basis. Also, tracing through the LWIP code, I don't think the above is doing anything. After a lot of googling I think

Code: [Select]

        // Disable Nagle algorithm for faster response
        int optval = 1;
        int optlen = sizeof(optval);
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, optlen);

may be right. Stepping through it, it executes the right code

Code: [Select]


#if LWIP_TCP
/* Level: IPPROTO_TCP */
  case IPPROTO_TCP:
    /* Special case: all IPPROTO_TCP option take an int */
    LWIP_SOCKOPT_CHECK_OPTLEN_CONN_PCB_TYPE(sock, optlen, int, NETCONN_TCP);
    if (sock->conn->pcb.tcp->state == LISTEN) {
      return EINVAL;
    }
    switch (optname) {
    case TCP_NODELAY:
      if (*(const int*)optval) {
        tcp_nagle_disable(sock->conn->pcb.tcp);
      } else {
        tcp_nagle_enable(sock->conn->pcb.tcp);
      }
      LWIP_DEBUGF(SOCKETS_DEBUG, ("lwip_setsockopt(%d, IPPROTO_TCP, TCP_NODELAY) -> %s\n",
                  s, (*(const int *)optval)?"on":"off") );
      break;

The function which gets eventually called is lwip_setsockopt_impl() and when the correct parms are supplied this does the above tcp_nagle_disable() which does a
#define tcp_nagle_disable(pcb) ((pcb)->flags |= TF_NODELAY)

I wonder if anyone knows about this?

The Nagle algorithm seems a throwback to old days of slow connections. It reportedly introduces delays of up to 300ms. Disabling it modifies packet timing dramatically.
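One way to confirm the option really took effect, rather than stepping through the stack, is to read it back with getsockopt() after setting it. The sketch below is written against the POSIX sockets API so it can be tried on a PC; lwIP's sockets layer offers the same calls (lwip_setsockopt/lwip_getsockopt), and I believe its getsockopt supports TCP_NODELAY, but that is an assumption to check in the version in use. `tcp_set_nodelay()` and `tcp_get_nodelay()` are invented wrapper names.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Turn Nagle off (on != 0) or back on (on == 0). Returns 0 on success, -1 on error. */
static int tcp_set_nodelay(int fd, int on)
{
    int optval = on ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval));
}

/* Read the option back. Returns 1 if Nagle is disabled, 0 if enabled, -1 on error. */
static int tcp_get_nodelay(int fd)
{
    int optval = 0;
    socklen_t optlen = sizeof(optval);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &optval, &optlen) != 0) {
        return -1;
    }
    return optval != 0;
}
```

Calling `tcp_set_nodelay(fd, 1)` right after the socket is created (or accepted) and then checking that `tcp_get_nodelay(fd)` returns 1 shows whether the setting stuck. Since the option is per connection, each new socket needs the call.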

Quote

As others mentioned, use Wireshark to investigate and keep in mind that TCP does not make any promises about time of delivery.

For sure, but anyone who has the deep expertise to be able to do that and get useful information doesn't need to be asking questions on forums

This stuff is probably the most complex stuff anyone is likely to be doing in embedded systems, and largely because one is trying to make use of an incredibly complicated and unsupported chunk of code called LWIP.

Has anyone used the FreeRTOS+TCP stack mentioned above?
https://www.freertos.org/FreeRTOS-Plus/FreeRTOS_Plus_TCP/index.html

peter-h · « **Reply #20 on:** January 24, 2023, 08:37:28 am »

Not really expecting a reply because this stuff is specialised but having spent many days on one issue I think LWIP has a call rate (call frequency) limit on the socket calls. Probably writing but possibly also reading. If this is exceeded, it doesn't actually lose the data but some internal mechanism is triggered which inserts long delays of 1-2 secs in the data flow.

Does anyone recognise this behaviour?

dare · « **Reply #21 on:** January 24, 2023, 03:51:23 pm »

Quote from: peter-h on January 24, 2023, 08:37:28 am

Not really expecting a reply because this stuff is specialised but having spent many days on one issue I think LWIP has a call rate (call frequency) limit on the socket calls. Probably writing but possibly also reading. If this is exceeded, it doesn't actually lose the data but some internal mechanism is triggered which inserts long delays of 1-2 secs in the data flow.

Does anyone recognise this behaviour?

This sounds like packet loss, but it's hard to tell from the limited information available. It could be external, due to a lossy physical network, or it could be internal, due to buffer exhaustion.

With its myriad of configuration options, LwIP can be rather inscrutable, but it does work. Your best bet is to enable LwIP internal logging and then run a packet capture of the conversation. I suggest configuring debug in lwipopts.h roughly as follows:

Code: [Select]

#ifdef LWIP_DEBUG

#define MEMP_OVERFLOW_CHECK            ( 1 )
#define MEMP_SANITY_CHECK              ( 1 )

#define MEM_DEBUG        LWIP_DBG_OFF
#define MEMP_DEBUG       LWIP_DBG_OFF
#define PBUF_DEBUG       LWIP_DBG_ON
#define API_LIB_DEBUG    LWIP_DBG_ON
#define API_MSG_DEBUG    LWIP_DBG_ON
#define TCPIP_DEBUG      LWIP_DBG_ON
#define NETIF_DEBUG      LWIP_DBG_ON
#define SOCKETS_DEBUG    LWIP_DBG_ON
#define DEMO_DEBUG       LWIP_DBG_ON
#define IP_DEBUG         LWIP_DBG_ON
#define IP6_DEBUG        LWIP_DBG_ON
#define IP_REASS_DEBUG   LWIP_DBG_ON
#define RAW_DEBUG        LWIP_DBG_ON
#define ICMP_DEBUG       LWIP_DBG_ON
#define UDP_DEBUG        LWIP_DBG_ON
#define TCP_DEBUG        LWIP_DBG_ON
#define TCP_INPUT_DEBUG  LWIP_DBG_ON
#define TCP_OUTPUT_DEBUG LWIP_DBG_ON
#define TCP_RTO_DEBUG    LWIP_DBG_ON
#define TCP_CWND_DEBUG   LWIP_DBG_ON
#define TCP_WND_DEBUG    LWIP_DBG_ON
#define TCP_FR_DEBUG     LWIP_DBG_ON
#define TCP_QLEN_DEBUG   LWIP_DBG_ON
#define TCP_RST_DEBUG    LWIP_DBG_ON
#define PPP_DEBUG        LWIP_DBG_OFF

#define LWIP_DBG_TYPES_ON         (LWIP_DBG_ON|LWIP_DBG_TRACE|LWIP_DBG_STATE|LWIP_DBG_FRESH|LWIP_DBG_HALT)

#endif

If you are unable to make sense of the packet capture and log output, share it here for others to interpret.
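Two prerequisites that are easy to miss: the block above is only compiled in when LWIP_DEBUG is defined, and the log text only appears if the port supplies an output macro, LWIP_PLATFORM_DIAG, normally in arch/cc.h. lwIP passes the format string and arguments as a single parenthesised list, so the macro has to be written without parentheses of its own around `x`. A stand-alone sketch of that pattern, logging into a RAM buffer; `diag_printf()` and `diag_buf` are hypothetical, and on a target the sink would more likely be a UART or SWO.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical log sink: a RAM buffer that can be inspected in the debugger. */
static char diag_buf[256];
static size_t diag_len;

/* printf-style append into diag_buf; output that does not fit is truncated. */
static void diag_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    if (diag_len < sizeof(diag_buf)) {
        int n = vsnprintf(diag_buf + diag_len, sizeof(diag_buf) - diag_len, fmt, ap);
        if (n > 0) {
            size_t room = sizeof(diag_buf) - diag_len - 1u;
            diag_len += ((size_t)n < room) ? (size_t)n : room;
        }
    }
    va_end(ap);
}

/* The message arrives as one parenthesised argument list, hence the bare "x". */
#define LWIP_PLATFORM_DIAG(x) do { diag_printf x; } while (0)
```

With this in place a call such as `LWIP_PLATFORM_DIAG(("tcp %d\n", 42));` appends "tcp 42" and a newline to the buffer. Printing from the TCP/IP thread changes timing, so a fast sink matters when chasing delays of this kind.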

ejeffrey · « **Reply #22 on:** January 24, 2023, 04:34:31 pm »

The Nagle algorithm is a congestion reduction algorithm to reduce the number of small packets, especially when doing things like sending unbuffered terminal or serial port data over a socket.

The way it generally works is this: if you make a send() call that results in a short packet being sent, future send() calls will be buffered and not immediately sent until either 1) the previous packet is ACKed 2) you have accumulated enough data for a full size packet or 3) I think if the sender times out waiting for the ACK it will retransmit the old data as well as any new data.
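The send decision described above can be written down in a few lines. This is a simplified model of the rule, not the code of lwIP or any other stack, and the struct and field names are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified sender state (field names invented for this sketch). */
typedef struct {
    size_t unacked_bytes;  /* data sent but not yet acknowledged */
    size_t queued_bytes;   /* data waiting to be sent */
    size_t mss;            /* maximum segment size */
    bool   nodelay;        /* true when TCP_NODELAY is set */
} nagle_state_t;

/* Returns true if the queued data may go out now, false if it should be held. */
static bool nagle_may_send(const nagle_state_t *s)
{
    assert(s != NULL && s->mss > 0u);
    if (s->queued_bytes == 0u)     return false; /* nothing to send */
    if (s->nodelay)                return true;  /* Nagle switched off */
    if (s->queued_bytes >= s->mss) return true;  /* a full segment is ready */
    if (s->unacked_bytes == 0u)    return true;  /* nothing in flight: send the short one */
    return false;                                /* short segment, ACK outstanding: hold */
}
```

The only case that is held back is a short write while earlier data is still unacknowledged, which is why a request/response protocol or a sender that batches its writes rarely notices Nagle at all.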

The Nagle algorithm should only really be an issue on high latency links, and can be avoided by batching your send calls manually. It also doesn't cause an issue in protocols that wait for a response from the other end to continue as they are limited by the round trip time anyway. You can turn it off with setsockopt but if your client doesn't try to send multiple short packets in a row that's unlikely to be the cause of your latency.

It was historically a major problem for HTTP as the server, and to some extent the client would send headers first followed by the body and did not want to wait for a response in between.

Also keep in mind that Nagle is a transmit side feature. You can enable/disable it on both sides independently, but one side can't force the other to change its behavior.

peter-h · « **Reply #23 on:** January 24, 2023, 08:12:13 pm »

ejeffrey · « **Reply #24 on:** January 24, 2023, 10:24:40 pm »

You need to be looking at packet capture data from tcpdump/wireshark. There is basically no point in diagnosing any networking issue if you aren't 100% comfortable sorting through network capture data. Don't guess about broadcast traffic, traffic stalls, or other weirdness, measure it.

If one or both endpoints are regular computers you can just run it on the endpoints. If both are embedded systems you will need to find a way to tap the signal. If your network hosts are 100 megabit, you can find an old "known hub", if you are operating at 1 Gb or higher you can use a switch with a MIRROR port or a dedicated tap device like this:

https://www.amazon.com/Dualcomm-1000Base-T-Gigabit-Ethernet-Network/dp/B004EWVFAY/ref=asc_df_B004EWVFAY/

Even just plugging a monitoring host into the same network switch as your devices would tell you about broadcast traffic.

