Can one find out how much data can be sent to a NON blocking socket without loss
peter-h:
--- Quote ---Don't be offended
--- End quote ---
Never offended by anything "technical" - it's all a great learning experience :)
In an RTOS environment, the way buffers move along is likely to be quite random, or maybe sometimes not...
--- Quote ---If all downstream buffers are a multiple of 512 bytes in size, and the TCP MTU is large enough to never split 512 byte packets, then this is to be expected: either the buffer is full, or it has room for a multiple of 512 bytes, all down the buffer chain.
--- End quote ---
That's really interesting. Yes, there are two 512 byte buffers, but in addition there are a few k MTU-sized buffers (1500 + a bit). I tried it with the 512 byte buffers much bigger than the MTU and still didn't see a short write. But this could be for any reason at all. TCP/IP is incredibly complex.
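For anyone following along, here is a minimal sketch of what a "short write" looks like from the caller's side. It uses plain POSIX calls; lwIP's socket layer is modelled on the same interface, but whether it behaves identically in every corner case is an assumption to check against your port. The helper names set_nonblocking() and send_some() are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: hand the stack as much of buf as it will take right now.
 * Returns the number of bytes accepted (may be 0, or less than len),
 * or -1 on a real error. Bytes beyond the return value were NOT queued. */
ssize_t send_some(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = send(fd, p + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;          /* possibly a short write */
        } else if (n == 0) {
            break;                      /* nothing accepted; stop rather than spin */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                      /* buffers full: retry the rest later */
        } else if (errno == EINTR) {
            continue;                   /* interrupted by a signal: try again */
        } else {
            return -1;                  /* genuine failure */
        }
    }
    return (ssize_t)done;
}
```

The point is that nothing is lost as long as the caller keeps the unsent tail (buf + return value) and retries once the socket becomes writable again, e.g. after select() or poll() reports it.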
Nominal Animal:
(Just wanted to be sure. To me, these discussions are like talking shop while standing in a hallway or having a coffee/tea/soda, with lots of interested faces following the conversation, while not actively participating. In real life, I'm the type who observes their nonverbal cues, and when abbreviations/phenomena/algorithms/etc. are discussed that those others seem to fail to follow, I stop and describe/explain things to everyone; and make sure everybody is on the same track. Online, it doesn't work that well –– no cues ––, and sometimes the other persons take my explanations/descriptions as if I thought they might not know that, while it's never that: it's just that I want everybody following the discussion to be kept along. Solving a problem for just one person isn't that interesting or useful, really: it is when you help with the problem solving procedure, possibly introducing new concepts and ways of solving similar problems, alternate causes for such problems, that others might encounter later on and stumble on to the recorded discussion, that makes it worthwhile to spend as much time and effort on the things as I do. I am not very "clever" myself, I just love trying to help solve problems, and do it mostly via brute force effort. :P)
--- Quote from: peter-h on December 23, 2022, 07:19:21 pm ---there are two 512 byte buffers, but in addition there are a few k MTU-sized buffers (1500 + a bit). I tried it with the 512 byte buffers much bigger than the MTU and still didn't see a short write. But this could be for any reason at all. TCP/IP is incredibly complex.
--- End quote ---
Yep, and I'm not at all familiar with how LWIP handles TCP buffering; the upper/lower function dispatch is making it hard to track the exact call chain.
I know a Berkeley/POSIX-type socket interface does require indirect function dispatch, but I prefer function pointers myself. With something like Elixir Cross Referencer or even plain grep -e member -R . they are easier to follow than what LWIP uses. (If you wonder what that does, https://elixir.bootlin.com/ exposes the Linux kernel sources using it. I use it all the time to trace stuff through the Linux kernel. One can install it locally; if you use an httpd server on Linux that only serves on loopback (127.x.y.z), it will not be externally accessible.)
But just because I would do things differently, and dislike the way LWIP does things, does not mean I wouldn't use LWIP myself: it is just a tool, after all, and one with significant resources and real-world testing behind it. I was just grumbling... ;)
peter-h:
Actually almost nobody understands how LWIP manages its buffers etc versus the config options in the lwipopts.h file. Much is online but ambiguous unless you already know the answer. I spent days changing various options and seeing how much RAM got used up and where, and documented them as best as I could.
But as far as free code goes, there probably isn't anything better.
An interesting point relating to this is whether LWIP (or how much of it) is zero-copy. Obviously on the way out (to ETH) it can't be because the least it needs to do to your supplied buffer is to attach the headers etc to it, and in reality it needs to split it if > MTU. On the way in it could be zero-copy but then the biggest buffer it would give you would be MTU sized.
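As a rough illustration of the "split it if > MTU" part: for TCP over IPv4 with no header options, each segment can carry at most MTU minus 40 bytes of payload (20 bytes IPv4 header plus 20 bytes TCP header), so a 1500 byte MTU gives 1460 bytes per segment. The function names below are made up for this sketch, and the actual segment size lwIP uses depends on its configuration, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum header sizes, assuming IPv4 and TCP without options. */
enum { IPV4_HDR = 20, TCP_HDR = 20 };

/* Largest TCP payload that fits in one packet of the given MTU. */
size_t tcp_mss(size_t mtu)
{
    assert(mtu > (size_t)(IPV4_HDR + TCP_HDR));
    return mtu - IPV4_HDR - TCP_HDR;
}

/* Number of segments needed to carry len payload bytes (rounds up). */
size_t tcp_segments(size_t len, size_t mtu)
{
    size_t mss = tcp_mss(mtu);
    return (len + mss - 1) / mss;
}
```

So a 512 byte write fits in a single segment at MTU 1500, while a 4096 byte write needs three.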
Nominal Animal:
--- Quote from: peter-h on December 23, 2022, 10:34:02 pm ---But as far as free code goes, there probably isn't anything better.
--- End quote ---
Quite possible.
I have looked at FNET, because vjmuzik has an Arduino library (FNET fork and NativeEthernet) for use with Teensy, which is more to my liking, but I haven't really used it in anger either. Teensy 4.1 is the only MCU I currently have with a 10/100 Ethernet on it. (I do have several different SBCs running Linux, and I can do zero-copy network I/O in Linux using memory-mapped tx and rx buffers, but that's different: the kernel does all the hard, complex stuff there.)
--- Quote from: peter-h on December 23, 2022, 10:34:02 pm ---An interesting point relating to this is whether LWIP (or how much of it) is zero-copy.
--- End quote ---
True.
I've done lots of MPI stuff, and particularly like the asynchronous I/O interface. Its requirements are very similar to those of zero-copy, in that one must not modify the data buffer until the async operation completes. (Most implementations use a dedicated I/O thread handling the transfers in non-blocking mode.)
Compared to the socket interface, zero-copy would really need a completely different interface, one where a write/send either takes a callback or closure, or returns a token, so that the buffer is retained until the operation completes (by the IP stack doing the callback, or updating the token state). Similarly, a read/receive should really be event-based, with a similar token or closure to tell the IP stack the data is no longer needed.
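To make the token/callback idea concrete, here is a toy sketch of the send side. Every name in it (zc_send, zc_complete_one, struct zc_token and so on) is hypothetical; it is not an lwIP, FNET, or POSIX API, and zc_complete_one() merely stands in for whatever the real stack would do when the hardware has finished with a buffer.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical, for illustration only. */
enum zc_state { ZC_PENDING, ZC_DONE, ZC_FAILED };

struct zc_token;
typedef void (*zc_done_fn)(struct zc_token *tok, void *user);

struct zc_token {
    const void      *buf;      /* caller's buffer: borrowed, never copied */
    size_t           len;
    enum zc_state    state;    /* caller may poll this ... */
    zc_done_fn       on_done;  /* ... or get a callback (may be NULL) */
    void            *user;
    struct zc_token *next;
};

struct zc_stack {              /* FIFO of requests still owned by the stack */
    struct zc_token *head, *tail;
};

/* Queue a send. The caller must not touch buf until state != ZC_PENDING. */
void zc_send(struct zc_stack *s, struct zc_token *tok,
             const void *buf, size_t len, zc_done_fn on_done, void *user)
{
    assert(s != NULL && tok != NULL);
    tok->buf = buf;
    tok->len = len;
    tok->state = ZC_PENDING;
    tok->on_done = on_done;
    tok->user = user;
    tok->next = NULL;
    if (s->tail)
        s->tail->next = tok;
    else
        s->head = tok;
    s->tail = tok;
}

/* Stand-in for the stack's transmit-complete path: retires the oldest
 * request, hands the buffer back to the caller. Returns 1 if a request
 * was retired, 0 if the queue was empty. */
int zc_complete_one(struct zc_stack *s, int ok)
{
    struct zc_token *tok = s->head;
    if (!tok)
        return 0;
    s->head = tok->next;
    if (!s->head)
        s->tail = NULL;
    tok->next = NULL;
    tok->state = ok ? ZC_DONE : ZC_FAILED;
    if (tok->on_done)
        tok->on_done(tok, tok->user);
    return 1;
}

/* Example callback: counts completions in an int supplied via user. */
void zc_count_done(struct zc_token *tok, void *user)
{
    (void)tok;
    ++*(int *)user;
}
```

The key difference from send() is ownership: the call returns immediately, but the buffer belongs to the stack until the token leaves ZC_PENDING, which is exactly the rule that trips people up.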
If I had to support zero-copy on a Berkeley/POSIX sockets-type interface, I'd cry: it really isn't suitable for the task.
At minimum, I'd like to separate the header part and the payload part. Although they would be contiguous in received messages, having them separate in the API, and separate when sending messages –– especially if you could do a scatter-send with multiple recipients, the stack duplicating the data internally as needed –– would make a lot of sense. (The completion tracking would then be per header, not per data.)
Unfortunately, even in MPI, using MPI_Isend() and MPI_Irecv() for "zero-copy"/nonblocking/asynchronous I/O seems to be extremely hard for many programmers to understand. I've even had heated arguments with "MPI Experts" who claim that using these is inherently dangerous (because they just didn't understand how to use them properly). It isn't, and it is the only way to allow an HPC distributed simulator/calculator to both compute and communicate simultaneously, not wasting time.
At the core, they behave just like zero-copy async sends and receives: the call returns immediately, but the data is read from the buffer or written to the buffer at some point in the future, and one needs to check the state of the request object to determine whether it succeeded or not. (I don't like that, I'd much prefer to have a callback/closure/event instead.) So, having dealt with the confusion, I'm not at all surprised that zero-copy I/O interfaces are "hard", when even something as stable and widely used as MPI confuses the "experts".
In IP stacks, the layered OSI model confuses full-stack developers even more, because it takes a lot of experience and a robust personality to understand that such models are abstract, and do not need to –– should not –– reflect the actual API or implementation. (Saying that out loud among network software developers would normally start a shouting match, too... I've got some really unpopular views! But I do try to explain what my views and opinions are based on, so that one can check if they have a reason to agree or disagree.)
peter-h:
This thing is more complicated because the interface between LWIP and the CPU's ETH controller is just moving data packets, to which the TCP (UDP?) headers and tails are added by the ETH controller.
The interface software could just pass a pointer to a list of packet pointers, but my version does actually copy over the data. There is a later zero-copy version which I haven't implemented because I don't need it, there is zero support, and debugging this stuff is almost impossible.