Author Topic: The death of Xon/Xoff flow control...  (Read 7306 times)

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 7211
  • Country: va
Re: The death of Xon/Xoff flow control...
« Reply #75 on: September 16, 2024, 02:57:19 am »
It works.

Shown below, blue is TX sending a stream of zero bytes and yellow is RX waiting for XOFF. As soon as that is received, the UART finishes the current byte and stops transmitting. The whole process takes nanoseconds, which certainly means that a round trip over USB to the CPU and back is not involved. This is PL2303, probably not even a genuine one.

FTDI would likely still send one more byte, because that's what it does when CTS/RTS flow control is used. This makes FTDI unusable with primitive MCUs with one byte RX buffer.
I don't think so. On such MCUs you'll have an interrupt + software ring buffer to receive data. If you use flow control, you set the high-watermark at a certain fill level of the ring buffer (say somewhere between 50% and 90%) which means you can still accept some extra data.

I think you missed where he said it is the UART that sees Xoff and stops sending. Thus, at most, there would be one byte delay (the one already being sent).

The UART is the sensible place to do Xon/Xoff, but since it is in-band and you really don't want UARTs falling over on binary data, many if not most implementations do it in software and have the buffer issues. And, I might note, you also missed the sending buffer that either has to empty or get skipped before the Xoff even leaves the serial port. After all, Xon/Xoff is simply serial flow control using in-band data where there aren't enough signals, and it really wants to work at the same protocol level as the hardware signals it's replaced.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #76 on: September 16, 2024, 05:36:37 am »
Of course a USB HC will ACK packets without CPU involvement, that's its damn job. How software can screw up is by not queuing enough transfer buffers - when the HC runs out of those, it will stop polling the endpoint for new data until more buffers are submitted by software.
Right.  The end result is exactly the same: USB packets not ACK'ed (or not enough transfer buffers queued, so the device has no opportunities to send more data), because the target application is not getting scheduled every few milliseconds; every half a millisecond at 1 Mbaud.
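
To put rough numbers on those deadlines, here is a small Python sketch. It assumes 8N1 framing (10 bits on the wire per byte) and a 64-byte full-speed bulk packet; the function name is made up for illustration and is not from any driver.

```python
def packet_fill_time_ms(baud, packet_bytes=64, bits_per_byte=10):
    """Milliseconds of UART traffic needed to fill one USB packet.

    Assumes 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per byte.
    """
    bytes_per_second = baud / bits_per_byte
    return 1000.0 * packet_bytes / bytes_per_second


for baud in (115200, 1_000_000):
    print(f"{baud} baud: {packet_fill_time_ms(baud):.2f} ms per 64-byte packet")
```

At 115200 baud this comes to about 5.56 ms per packet, and at 1 Mbaud to 0.64 ms, which is roughly the "half a millisecond" figure.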

The solution used in Linux for USB SS –– for example, in SDR –– is to use async USB bulk transfers (via USBFS; not a file system, but the raw USB character devices at /dev/bus/usb/BBB/NNN) and queue a lot of them; and to increase the default USB buffer size from 64k (often up to 1G).

My point is that it is not reasonable for the device manufacturers to expect desktop OS applications to be able to always respond within milliseconds, because they never have. The other option is to say that OSes are buggy, because they cannot guarantee such low latencies.  Both are valid viewpoints, as there are reasons for and against for both.

It works. [...] This is PL2303, probably not even a genuine one.
Thing is, PL2303 uses its own driver and not USB CDC ACM, and it definitely supports both RTS/CTS and XON/XOFF flow control in the device.  (It only supports XON = 021/DC1/Ctrl-Q and XOFF = 023/DC3/Ctrl-S.)

That is, they implement flow control on the USB device itself, and the OS driver just tells it the flow control model to use.

PL2303G (aka type HXN) is extremely close to what a fixed USB CDC ACM would be.  It does have to use two Vendor-specific commands –– one for reading and one for writing the extra termios state –– and extends the SerialState notification message by adding bit 7, CTS.

To keep it short, if USB Class definitions for Communication Devices 1.2, and specifically PSTN120.pdf in it, were to
  • Table 31: UART State Bitmap Values (page 33): add
    $$\begin{array}{l|l|l}
    \text{D7} & \text{bClearToSend} & \text{This signal corresponds to RS-232 signal CTS.} \\
    \end{array}$$
  • Table 13: Class-Specific Request Codes for PSTN Subclasses (page 20): add
    $$\begin{array}{l|l}
    \text{SET_FLOW_CONTROL} & \text{24h} \\
    \text{GET_FLOW_CONTROL} & \text{25h} \\
    \end{array}$$
  • Insert two new chapters between 6.3.13 and 6.3.14: SetFlowControl and GetFlowControl.
    bmRequestType=00100001B, wLength=0/2/4, Data=XON and XOFF codes (5/6/7/8/16-bit), and wValue is the bit mask of supported flow control modes.  I'd use bit 0 for RTS/CTS flow control, and bit 1 for XON/XOFF flow control, because I've heard of one strange device from a friend in Canada that needed both to actually work.  If wLength=0, the codes are fixed at XON = 021/DC1/Ctrl-Q and XOFF = 023/DC3/Ctrl-S.  If wLength=2, the device supports arbitrary 8-bit XON/XOFF codes.  If wLength=4, the device supports arbitrary 16-bit XON/XOFF codes when Line Coding bDataBits==16.
then CDC ACM could support flow control just like PL2303 does.
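
As a sketch of what such a request would look like on the wire, here is the 8-byte SETUP packet built in Python. Everything here is hypothetical: SET_FLOW_CONTROL = 24h and the bit assignments are just the proposal above, not part of the actual CDC PSTN specification, and putting the interface number in wIndex is my assumption (it is what the existing class requests to an interface do).

```python
import struct

# Values from the proposal above; NOT defined by the real CDC PSTN spec.
SET_FLOW_CONTROL = 0x24
FLOW_RTS_CTS = 1 << 0
FLOW_XON_XOFF = 1 << 1


def set_flow_control_request(interface, modes, xon=None, xoff=None):
    """Return (setup_packet, data_stage) for the hypothetical SetFlowControl."""
    if (xon is None) != (xoff is None):
        raise ValueError("give both XON and XOFF codes, or neither")
    if xon is None:
        data = b""                 # wLength=0: fixed DC1/DC3 codes
    else:
        data = bytes([xon, xoff])  # wLength=2: arbitrary 8-bit codes
    setup = struct.pack(
        "<BBHHH",
        0x21,              # bmRequestType 00100001B: host-to-device, class, interface
        SET_FLOW_CONTROL,  # bRequest
        modes,             # wValue: bit mask of flow control modes
        interface,         # wIndex: assumed to be the interface number
        len(data),         # wLength
    )
    return setup, data


setup, data = set_flow_control_request(0, FLOW_XON_XOFF, 0x11, 0x13)
print(setup.hex(" "), "|", data.hex(" "))
```

The 16-bit case (wLength=4) is left out to keep the sketch short.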

Interestingly, the Linux PL2303 driver was written by Greg K-H, not Prolific; and in addition to the about twenty Prolific vendor:product USB IDs it supports, it also supports about fifty other ones too.  The PL2303G (aka TYPE_HXN) is very close to USB CDC ACM, and doesn't require the vendor message tickling (see drivers/usb/serial/pl2303.c:pl2303_startup(), and the vendor-specific messages non-HXN PL2303's require to work).  It only supports one port per device, though.
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #77 on: September 16, 2024, 05:40:11 am »
FTDI would likely still send one more byte, because that's what it does when CTS/RTS flow control is used. This makes FTDI unusable with primitive MCUs with one byte RX buffer.
I don't think so. On such MCUs you'll have an interrupt + software ring buffer to receive data. If you use flow control, you set the high-watermark at a certain fill level of the ring buffer (say somewhere between 50% and 90%) which means you can still accept some extra data.
Fair enough, it will work if this is what you do and if you are always fast enough to read one received byte before another two bytes overrun the UART peripheral's RX queue.

But if you rely on the UART automatically driving flow control for you, you need the transmitter to stop after the current byte when it sees CTS transition in the middle of it. Been there, done that.
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #78 on: September 16, 2024, 05:59:25 am »
Of course a USB HC will ACK packets without CPU involvement, that's its damn job. How software can screw up is by not queuing enough transfer buffers - when the HC runs out of those, it will stop polling the endpoint for new data until more buffers are submitted by software.
Right.  The end result is exactly the same: USB packets not ACK'ed (or not enough transfer buffers queued, so the device has no opportunities to send more data), because the target application is not getting scheduled every few milliseconds; every half a millisecond at 1 Mbaud.

The solution used in Linux for USB SS –– for example, in SDR –– is to use async USB bulk transfers (via USBFS; not a file system, but the raw USB character devices at /dev/bus/usb/BBB/NNN) and queue a lot of them; and to increase the default USB buffer size from 64k (often up to 1G).

My point is that it is not reasonable for the device manufacturers to expect desktop OS applications to be able to always respond within milliseconds, because they never have. The other option is to say that OSes are buggy, because they cannot guarantee such low latencies.  Both are valid viewpoints, as there are reasons for and against for both.

USB has always been designed to be cheap. USB devices are meant to be cheap and they are not expected to buffer more than a few packets worth of data, certainly not more than a (micro)frame worth. Hosts are supposed to poll them often enough for that not to be an issue. Host controllers are cheap too and they use system RAM for transfer buffers, and they have the important property that they accept packet queues in the form of linked lists of arbitrary length (not entirely true for periodic transfers before XHCI, but you could still queue many milliseconds ahead). It's the software's job to queue enough data buffers to cover its latency.
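
A back-of-the-envelope sketch of "enough buffers to cover latency", in Python. It assumes 8N1 framing, 64-byte transfers, and one extra transfer for the packet currently in flight; the function name and the 20 ms latency figure are only illustrative.

```python
import math


def transfers_to_queue(baud, worst_latency_ms, transfer_bytes=64, bits_per_byte=10):
    """How many IN transfers to keep queued so the HC never runs dry.

    worst_latency_ms is the longest the software may go without
    resubmitting buffers (an assumed figure, measure your own system).
    """
    bytes_per_ms = baud / bits_per_byte / 1000.0
    backlog = bytes_per_ms * worst_latency_ms
    # One extra transfer for the packet that is in flight right now.
    return math.ceil(backlog / transfer_bytes) + 1


for baud in (115200, 1_000_000):
    print(baud, "baud, 20 ms latency:", transfers_to_queue(baud, 20), "transfers")
```

With a 20 ms worst case that is 5 transfers at 115200 baud and 33 at 1 Mbaud, i.e. a couple of kilobytes of system RAM at most.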

This is perfectly doable and hardly unique to SDR. When you play or record audio with a USB soundcard, you have many milliseconds worth of packets queued and you don't hear glitches when you resize a GUI window. Hell, even Windows probably gets it right or users would be whining loudly. There is no reason serial couldn't do it too.

It works. [...] This is PL2303, probably not even a genuine one.
Thing is, PL2303 uses its own driver and not USB CDC ACM, and it definitely supports both RTS/CTS and XON/XOFF flow control in the device.  (It only supports XON = 021/DC1/Ctrl-Q and XOFF = 023/DC3/Ctrl-S.)

That is, they implement flow control on the USB device itself, and the OS driver just tells it the flow control model to use.

Yes, that's my whole point. And a quick glance at Linux drivers suggests that most chips have this functionality in hardware. Notably excluding CH341, although it might also be that the HW has the ability but it's not implemented in Linux.
 

Online guenthert

  • Frequent Contributor
  • **
  • Posts: 756
  • Country: de
Re: The death of Xon/Xoff flow control...
« Reply #79 on: September 16, 2024, 07:32:22 am »
The mechanism is this:
[..]
Your complaint is that in Windows and in Linux that interrupt has too much latency for typical RS-232/UART to USB converters.

At 115200 baud, the data rate is about 11520 bytes per second.  If the converter to-host buffer size is N bytes beyond one USB packet –– remember, the converter cannot discard the sent USB packet until it has been ACKed, or indeed data may/will be lost ––, the allowed interrupt latency is N/11520 seconds, or N·86.8µs.  Let's say a typical converter has 128 bytes of buffer (so 64 bytes over one full packet at FS), which corresponds to 5.6ms.  Essentially, it is double-buffering the device-to-host USB packets.

Typical desktop systems often have longer interrupt latencies than that 5.6ms under load, because the scheduling unit is of that order.

In Linux, you can mitigate this by using a kernel configured to use 1000 Hz ticks (CONFIG_HZ=1000, CONFIG_HZ_1000=y).  Debian desktop (and derivatives like Ubuntu and Mint) defaults to 250 Hz, so the scheduling "unit" is 4ms.
[..]
    Sorry, that's nonsense.  You seem to confuse task scheduling with interrupt handling.  Interrupt latencies in Linux on a modern desktop will be in the order of 0.01ms.   There is plenty of information about that out there, have a look at e.g. https://www4.cs.fau.de/Publications/2018/herzog_18_sbesc.pdf
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 2266
  • Country: 00
Re: The death of Xon/Xoff flow control...
« Reply #80 on: September 16, 2024, 08:28:25 am »
The mechanism is this:
[..]
Your complaint is that in Windows and in Linux that interrupt has too much latency for typical RS-232/UART to USB converters.

At 115200 baud, the data rate is about 11520 bytes per second.  If the converter to-host buffer size is N bytes beyond one USB packet –– remember, the converter cannot discard the sent USB packet until it has been ACKed, or indeed data may/will be lost ––, the allowed interrupt latency is N/11520 seconds, or N·86.8µs.  Let's say a typical converter has 128 bytes of buffer (so 64 bytes over one full packet at FS), which corresponds to 5.6ms.  Essentially, it is double-buffering the device-to-host USB packets.

Typical desktop systems often have longer interrupt latencies than that 5.6ms under load, because the scheduling unit is of that order.

In Linux, you can mitigate this by using a kernel configured to use 1000 Hz ticks (CONFIG_HZ=1000, CONFIG_HZ_1000=y).  Debian desktop (and derivatives like Ubuntu and Mint) defaults to 250 Hz, so the scheduling "unit" is 4ms.
[..]
    Sorry, that's nonsense.  You seem to confuse task scheduling with interrupt handling.  Interrupt latencies in Linux on a modern desktop will be in the order of 0.01ms.   There is plenty of information about that out there, have a look at e.g. https://www4.cs.fau.de/Publications/2018/herzog_18_sbesc.pdf

Pay attention: neither Windows nor Linux supports nested interrupts (at least not out of the box)!
That means that when a new interrupt arrives while the OS is still servicing another one, the new interrupt is serviced only after the first has finished and, depending on the
source of the interrupt and how well its driver has been written, that can take a considerable amount of time.
And no, interrupts are not spread over multiple cores, so no multitasking there...


 

Offline m k

  • Super Contributor
  • ***
  • Posts: 2446
  • Country: fi
Re: The death of Xon/Xoff flow control...
« Reply #81 on: September 16, 2024, 05:06:44 pm »
I have full control over the device which is receiving the data. It is the PC side which isn't dealing with xon/xoff properly and that can't be fixed in a way it will always work.

So you can trig when start bit happens?
Send Xoff then.

You'll still receive what is in the furthest buffer, so that you must handle.
After that you can send Xon.
If your Xoff won't go through in time you need more buffer.

If your Xoff is dealt by UART, no problem.
If USB, who knows yes, but if it can't receive your Xoff it can't send anything either, if it's not broken and flow control is on.

Maybe I didn't understand something.
You changed to xmodem protocol, so both ends must do something, since xmodem is not a default thing.
Similarly both ends can switch software flow control on.
Or are you saying that PC is not obeying your flow control setting at all?

software flow control test
copy file com1: to copy com1: con:
W10 -> W95 OSR2, no.
W95 -> W10, no.

W95 mode doesn't have flow control settings.

W95 HyperTerminal to W10 PuTTY is fine, Xoff is supported.
W10 copy file com1: to W95 HyperTerminal is fine, Xoff is supported.

W95 HyperTerminal to W10 copy com1: con: is unknown, Xoff is not sent.
So keyboard is not con: anymore or I can't see the 1200 baud LED change.

After running PuTTY, the W10 COM settings are whatever PuTTY set.
Console mode baud alone defaults parity and data bits, but not flow control.

W95 is different, its DOSbox is not a box.

So native W10 clearly supports software flow control.

XP SP2 is like W10.
So it also supports software flow control natively.

Vista machine has no serial port.
Maybe that's a good thing.
Advance-Aneng-Appa-AVO-Beckman-Danbridge-Data Tech-Fluke-General Radio-H. W. Sullivan-Heathkit-HP-Kaise-Kyoritsu-Leeds & Northrup-Mastech-OR-X-REO-Simpson-Sinclair-Tektronix-Tokyo Rikosha-Topward-Triplett-Tritron-YFE
(plus lesser brands from the work shop of the world)
 

Offline nctnicoTopic starter

  • Super Contributor
  • ***
  • Posts: 27935
  • Country: nl
    • NCT Developments
Re: The death of Xon/Xoff flow control...
« Reply #82 on: September 16, 2024, 09:50:06 pm »
I have full control over the device which is receiving the data. It is the PC side which isn't dealing with xon/xoff properly and that can't be fixed in a way it will always work.

So you can trig when start bit happens?
Send Xoff then.

You'll still receive what is in the furthest buffer, so that you must handle.
After that you can send Xon.
If your Xoff won't go through in time you need more buffer.

If your Xoff is dealt by UART, no problem.
If USB, who knows yes, but if it can't receive your Xoff it can't send anything either, if it's not broken and flow control is on.

Maybe I didn't understand something.
You changed to xmodem protocol, so both ends must do something, since xmodem is not a default thing.
Similarly both ends can switch software flow control on.
Or are you saying that PC is not obeying your flow control setting at all?
Exactly. And the problem lies somewhere between the API I/O layer and the USB dongle doing the USB-to-UART conversion. A cursory look at the Linux kernel source shows that implementing flow control is left to the driver, and the driver does not always do so (or it may have been broken in one of the many Linux kernel refactoring cycles). So I decided to use a transfer method which is handled entirely at the application level, thus bypassing any possible brokenness in the OS layers (including device drivers) and hardware. I have spent a couple of days on this problem and at some point I needed something which just works.
« Last Edit: September 16, 2024, 09:54:01 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #83 on: September 17, 2024, 02:01:05 am »
My point is that it is not reasonable for the device manufacturers to expect desktop OS applications to be able to always respond within milliseconds, because they never have. The other option is to say that OSes are buggy, because they cannot guarantee such low latencies.  Both are valid viewpoints, as there are reasons for and against for both.

USB has always been designed to be cheap. USB devices are meant to be cheap and they are not expected to buffer more than a few packets worth of data, certainly not more than a (micro)frame worth. Hosts are supposed to poll them often enough for that not to be an issue. Host controllers are cheap too and they use system RAM for transfer buffers, and they have the important property that they accept packet queues in the form of linked lists of arbitrary length (not entirely true for periodic transfers before XHCI, but you could still queue many milliseconds ahead). It's the software's job to queue enough data buffers to cover its latency.

This is perfectly doable and hardly unique to SDR. When you play or record audio with a USB soundcard, you have many milliseconds worth of packets queued and you don't hear glitches when you resize a GUI window. Hell, even Windows probably gets it right or users would be whining loudly. There is no reason serial couldn't do it too.
Perhaps.  These USB devices are also prone to stop working (or being yanked out) at any moment, and it is even more important to not lock up (even temporarily) when that happens, or gobble up unlimited amounts of RAM (especially on routers and appliances with very limited amounts available).  You can see this all over the Linux USB device drivers, what with timeouts and automatic backoffs (1ms backoff/latency added whenever USB device NAKs a packet).
Thus, I'm hesitant to fully agree.  I'd need to see/hear it done right on a desktop OS first, to believe it can be done significantly better.

The wrench in the cogs in Linux/Android/Mac/BSDs is the tty layer.  It is traditionally capped at 4096 bytes of buffer, and just isn't designed for high throughput.  (You can observe this by creating a simple pseudoterminal pair, and push data through locally.  The bandwidth you can achieve is surprisingly low, compared to e.g. a socket pair or a pipe.)  When using USB 2.0 HS (480 Mbit/s) microcontrollers, it becomes a bottleneck.

Because of Teensy 4, I've played with a number of possible solutions, suitable for CDC ACM (and close enough converter chips like some PL2303 variants).  The bulk data is trivial via a simple character device driver, and configuration via ioctls (even reusing the termios tty ones).  Even a pair of vendor commands could be reasonably implemented via ioctls, with just the actual USB message format varied depending on vendor:product or quirk.  Notifications are what  has given me a bit of a pause, because you can easily get 8000 of them per second.  POSIX signals are an obvious choice, with dedicated ioctl receiving the off-band message, but considering the possible frequency, it isn't optimal.  Obtaining a secondary descriptor for notifications would be better, but this is unusual, so proper destruction especially when the bulk one is closed first gets a bit complicated.  And then there are more abstract issues, like whether packet boundaries should be respected with read(): yes allows for datagram approaches, no is better for bulk bandwidth.  Linux input subsystem (using whole struct input_event datagrams only) shows how useful the datagram approach is, being rock solid for a quarter of a century now, and yet easily extended.  And we do have both datagram and stream local/UNIX domain sockets for a reason.

Sorry, that's nonsense.  You seem to confuse task scheduling with interrupt handling.
You seem to forget the effect of the tty layer and softirq handling in between the userspace process and the USB device.

Because this is not a Linux kernel developer forum, I'm not trying to be technically correct:  I'm trying to describe the observable behaviour of the system, not its implementation details.  Just pointing out I'm technically wrong is not helping anyone: at minimum, you should provide the correct description.

In particular, see for yourself the difference between CONFIG_HZ=250 and CONFIG_HZ=1000, specifically the difference in latency from the USB device point of view when using USB serial (CDC ACM or vendor driver, doesn't matter).  If we take your post as the whole truth, then there would be no difference, would there?

I do err quite often, and I'm ready to acknowledge my errors.  Fact is, if we ignore the liberties I've taken with the technical descriptions, I still stand by the cause-and-effect chain/interactions I've tried to explain here.  Whether it is the actual interrupt latency, or the latency at which the softirq task in the tty layer manages to push data forwards allowing further transfers to occur, is simply irrelevant to me at the level of complexity/abstraction we're discussing here.  If you can describe it better, please do, and don't just strike down others who try.
 

Offline Karel

  • Super Contributor
  • ***
  • Posts: 2266
  • Country: 00
Re: The death of Xon/Xoff flow control...
« Reply #84 on: September 17, 2024, 06:31:19 am »
The wrench in the cogs in Linux/Android/Mac/BSDs is the tty layer.  It is traditionally capped at 4096 bytes of buffer, and just isn't designed for high throughput.  (You can observe this by creating a simple pseudoterminal pair, and push data through locally.  The bandwidth you can achieve is surprisingly low, compared to e.g. a socket pair or a pipe.)  When using USB 2.0 HS (480 Mbit/s) microcontrollers, it becomes a bottleneck.

The "real" tty driver uses a flipbuffer of 4096 bytes (2 x 4096 bytes). When one is full (or a timeout occurred), it's handed
over to the application layer and it starts filling up the other buffer (it flips the buffer).

The "virtual/usb" tty driver uses the same mechanism but instead of a flipbuffer of 2x 4096 bytes, it uses
10 buffers of 4096 bytes (in a fifo setup). This way, the throughput of the old tty driver is enhanced a lot.
 

Offline Andy Chee

  • Super Contributor
  • ***
  • Posts: 1114
  • Country: au
Re: The death of Xon/Xoff flow control...
« Reply #85 on: September 17, 2024, 06:44:09 am »
Exactly. And the problem lies somewhere between the API I/O layer and the USB dongle doing the USB-to-UART conversion.
In four pages of discussion, you have yet to mention what baud rate you are running at.

Does your Xon/Xoff problem exist at all baud rates?
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #86 on: September 17, 2024, 07:01:50 am »
It exists at all baud rates fast enough to fill the target device's RX buffer in whatever time it takes to wait for the preceding flash write to complete, obviously :P

The OP's problem is that flow control apparently wasn't even enabled in his tests in the first place. And even if it were, a second problem exists that not all HW supports it.
 

Offline Andy Chee

  • Super Contributor
  • ***
  • Posts: 1114
  • Country: au
Re: The death of Xon/Xoff flow control...
« Reply #87 on: September 17, 2024, 07:10:48 am »
And even if it were, a second problem exists that not all HW supports it.
This seems ripe for a benchmarking experiment across multiple OS platforms (Win 98, Win XP, Win 7, macOS, Linux) and multiple USB-serial dongles (FTDI, PL2303, CP2102).

So what's needed is a standardised "black box" which can send/receive xon/xoff bytes, maybe a PIC micro LCD terminal with a 16 byte receive buffer, send Xoff when the buffer reaches 12 bytes full, send Xon when the buffer reaches 4 bytes full?
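
A minimal Python model of that black box, to show what the watermarks buy you. The 16/12/4 figures are the ones suggested above; the class and function names are invented for this sketch, and the model ignores timing entirely (it only counts bytes).

```python
XON, XOFF = 0x11, 0x13


class Receiver:
    """16-byte RX buffer that sends XOFF at 12 bytes and XON at 4 bytes."""

    def __init__(self, size=16, high=12, low=4):
        self.size, self.high, self.low = size, high, low
        self.fill = 0
        self.paused = False
        self.overruns = 0

    def on_byte(self):
        """A byte arrived; return XOFF if the sender should be paused now."""
        if self.fill == self.size:
            self.overruns += 1  # buffer full, byte is lost
            return None
        self.fill += 1
        if not self.paused and self.fill >= self.high:
            self.paused = True
            return XOFF
        return None

    def consume(self, n=1):
        """The application drained n bytes; return XON if sending may resume."""
        self.fill = max(0, self.fill - n)
        if self.paused and self.fill <= self.low:
            self.paused = False
            return XON
        return None


def overruns_after_xoff(extra_bytes):
    """Bytes lost if the sender emits extra_bytes more after XOFF goes out."""
    rx = Receiver()
    while rx.on_byte() != XOFF:
        pass
    for _ in range(extra_bytes):
        rx.on_byte()
    return rx.overruns


for extra in (1, 4, 64):
    print(extra, "extra bytes ->", overruns_after_xoff(extra), "lost")
```

A converter that stops after the current byte corresponds to 1 extra byte and loses nothing here. A sender that keeps going for a whole 64-byte packet after the XOFF loses 60 bytes, so the number of extra bytes is the quantity such a benchmark would really be measuring.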
 

Offline nctnicoTopic starter

  • Super Contributor
  • ***
  • Posts: 27935
  • Country: nl
    • NCT Developments
Re: The death of Xon/Xoff flow control...
« Reply #88 on: September 17, 2024, 07:18:26 am »
It exists at all baud rates fast enough to fill the target device's RX buffer in whatever time it takes to wait for the preceding flash write to complete, obviously :P

The OP's problem is that flow control apparently wasn't even enabled in his tests in the first place. And even if it were, a second problem exists that not all HW supports it.
:palm: Flow control was enabled. Just not handled the way it should be.
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #89 on: September 17, 2024, 07:54:51 am »
OK, cool.

Neither Linux nor Windows has working xon/xoff handshaking implemented. More precisely, the xon/xoff function is handed to the driver and (next to) none of the hardware manufacturers seem to have implemented this form of flow control.
So what sort of hardware have you tested and found not to implement flow control?

Because I have demonstrated that it clearly works as advertised on PL2303 dongles, and later also checked FTDI (as expected - one more byte sent, then it goes silent). I haven't tried built-in motherboard UARTs, but I would be highly surprised if they don't work. I haven't tried Windows either, but pretty sure the above vendors implement it too.

So far the only widespread UART that I know not to support it is CH341.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #90 on: September 17, 2024, 10:22:35 am »
So far the only widespread UART that I know not to support it is CH341.
Actually, the Linux reverse-engineered ch341 driver does not (handle either RTS/CTS or XON/XOFF handshaking), but the WinChipHead CH341 Linux driver – it's provided as GPLv2 source code, not binaries  :-+ – has the code to support RTS/CTS handshaking on the device.  Windows users might wish to check the corresponding Windows driver, too, instead of using the built-in one.

I don't have any CH341-based converters myself to test and verify, though.  It could be it doesn't actually work in real life.  (Because of this, I didn't check the driver sources for anything suspicious, either.)

If that works, the support could be easily ported to the upstream kernel ch341 driver too, the licenses matching and all.  To enable RTS/CTS flow control, one needs to usb_control_msg(dev, pipe, 0x9A, USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT, 0x2727, 0x0101, NULL, 0, DEFAULT_TIMEOUT); and to disable it, usb_control_msg(dev, pipe, 0x9A, USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT, 0x2727, 0x0000, NULL, 0, DEFAULT_TIMEOUT);.  Both utterly normal USB vendor control messages.
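
For anyone wanting to check this with a USB analyzer, here is what those two calls should put on the wire as the 8-byte SETUP packet, sketched in Python. The constants are the standard Linux USB ones; the request, value and index numbers are simply the ones quoted above from the vendor driver and are untested on real hardware. The function name is made up.

```python
import struct

# Standard bmRequestType fields, same values as in the Linux USB headers.
USB_DIR_OUT = 0x00
USB_TYPE_VENDOR = 0x02 << 5
USB_RECIP_DEVICE = 0x00


def ch341_flow_setup(enable):
    """SETUP packet for the vendor request quoted above (unverified)."""
    request_type = USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT  # 0x40
    index = 0x0101 if enable else 0x0000
    # bmRequestType, bRequest, wValue, wIndex, wLength (no data stage)
    return struct.pack("<BBHHH", request_type, 0x9A, 0x2727, index, 0)


print("enable :", ch341_flow_setup(True).hex(" "))
print("disable:", ch341_flow_setup(False).hex(" "))
```

Multi-byte fields are little-endian on the wire, so wIndex 0x0101 shows up as 01 01 and wValue 0x2727 as 27 27.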

So, it looks like for RTS/CTS, we're down to USB CDC ACM.  I suspect there are many that do not implement XON/XOFF; CH341 doesn't.
 
Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #91 on: September 17, 2024, 01:46:13 pm »
https://www.spinics.net/lists/linux-serial/msg21945.html
https://bugzilla.kernel.org/show_bug.cgi?id=197109
https://lore.kernel.org/all/20240905224326.7787-1-me@lodewillems.com/T/
 :-// :wtf: :popcorn:

The vendor driver is GPL, but it uses magic numbers everywhere so probably it wouldn't be accepted upstream, and for some reason either WCH doesn't want to work on improving the existing driver, or some maintainer or another doesn't want their contributions for some reason, or they CBA to clean up their code. I don't know what's going on, but it looks like a case of problem(s) sitting between keyboard and monitor.

(In addition to the fairly anemic response to 3rd parties trying to fix the upstream driver, as shown above.)
« Last Edit: September 17, 2024, 02:04:32 pm by magic »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #92 on: September 17, 2024, 03:43:07 pm »
I don't know what's going on, but it looks like a case of problem(s) sitting between keyboard and monitor.
Yep.  Note how I didn't say it would be easy to push those changes upstream.  Patches along the same lines for ch341 were first submitted in 2016, AFAICT.

The subsystem maintainer is Johan Hovold, and the proper list for this is linux-usb (MARC archives).  I suspect it would take someone willing to obtain a number of ch341 variants for verifying the changes work on them as well, revamping the driver to better conform to current USB driver style (there's a lot of unnecessary old baggage there), and work with the maintainer (Hovold, and Greg K-H for pushing the changes to -stable) at least medium-term to get it up to snuff.  Starting with following the proper patch submission procedure, especially the checklist.

(They're not grumpy, either one, but it takes a resilient person to bug the list once per week until someone responds, then work through every single complaint or suggestion.  It can be frustrating work for very little gain, unless someone does it to gain Linux kernel developer experience, and the know-how on how to go about upstreaming stuff.  They will be blunt, and you will be ignored at first, so gentle but insistent perseverance and resilience is absolutely necessary.)

The reason WCH doesn't do this is because they don't have to.  Paying a Linux kernel dev to do this would not increase their business any.  I don't know if they have samples of different (incl. past) versions of ch341 chips for testing, but since they bothered to GPL their driver, they might help if it doesn't cost them anything.

For the devices I have, I prefer to use Teensy 4's (made into ad-hoc USB serial converters).  Most of the SBC and router consoles are RX+TX only, no RTS/CTS flow control, so I like to use the large buffers to catch everything even if I don't have an application open on the port.  And isolators sprinkled here and there, all willy-nilly.
« Last Edit: September 17, 2024, 03:47:19 pm by Nominal Animal »
 

Offline nctnicoTopic starter

  • Super Contributor
  • ***
  • Posts: 27935
  • Country: nl
    • NCT Developments
Re: The death of Xon/Xoff flow control...
« Reply #93 on: September 17, 2024, 03:50:17 pm »
OK, cool.

Neither Linux nor Windows has working xon/xoff handshaking implemented. More precisely, the xon/xoff function is handed to the driver and (next to) none of the hardware manufacturers seem to have implemented this form of flow control.
So what sort of hardware have you tested and found not to implement flow control?
I'm using CP20x USB-UART converters.

I recall some of the posts you linked to, and while you can argue broken drivers shouldn't exist, the reason they exist is that people apparently no longer care about having working flow control. This may sound like a big leap to you, but I really like to be sure that whatever I send out into the field works with 99.99% of hardware / software, so there is no disappointment on the user's side if that can be avoided. Better safe than sorry.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 7212
  • Country: pl
Re: The death of Xon/Xoff flow control...
« Reply #94 on: September 17, 2024, 06:36:55 pm »
The reason WCH doesn't do this is because they don't have to.  Paying a Linux kernel dev to do this would not increase their business any.  I don't know if they have samples of different (incl. past) versions of ch341 chips for testing, but since they bothered to GPL their driver, they might help if it doesn't cost them anything.
Well, they are maintaining this out of tree driver and adding new kernel version #ifdefs every now and then. So it already costs them something and there is apparently a business case for having a reasonably functional Linux driver.

So what sort of hardware have you tested and found not to implement flow control?
I'm using CP20x USB-UART converters.
Hmm, did you mean Silicon Labs CP210x?

Their datasheets advertise "hardware Xon/Xoff handshaking supported" and something of that sort appears to be implemented in the Linux driver. Theoretically, it may be buggy, but IMO there is at least an equally good chance that your software configuration was simply wrong, as it is my direct experience that getting it right is tricky.

I would test this myself right now if I had this hardware, but I don't, so I can't.
« Last Edit: September 17, 2024, 06:48:58 pm by magic »
 

Online iMo

  • Super Contributor
  • ***
  • Posts: 5154
  • Country: bt
Re: The death of Xon/Xoff flow control...
« Reply #95 on: September 17, 2024, 07:11:55 pm »
Here is an older FTDI PDF on Xon/Xoff.
Reader discretion is advised.
 

Offline nctnico (Topic starter)

  • Super Contributor
  • ***
  • Posts: 27935
  • Country: nl
    • NCT Developments
Re: The death of Xon/Xoff flow control...
« Reply #96 on: September 17, 2024, 10:30:59 pm »
The reason WCH doesn't do this is because they don't have to.  Paying a Linux kernel dev to do this would not increase their business any.  I don't know if they have samples of different (incl. past) versions of ch341 chips for testing, but since they bothered to GPL their driver, they might help if it doesn't cost them anything.
Well, they are maintaining this out of tree driver and adding new kernel version #ifdefs every now and then. So it already costs them something and there is apparently a business case for having a reasonably functional Linux driver.

So what sort of hardware have you tested and found not to implement flow control?
I'm using CP20x USB-UART converters.
Hmm, did you mean Silicon Labs CP210x?

Their datasheets advertise "hardware Xon/Xoff handshaking supported" and something of that sort appears to be implemented in the Linux driver. Theoretically, it may be buggy, but IMO there is at least an equally good chance that your software configuration was simply wrong, as it is my direct experience that getting it right is tricky.
How can setting a program to use software flow control go wrong? Just set 'software flow control' to enabled; it shouldn't be any more difficult than that. Especially since I tested several different terminal emulators, which all have the exact same problem. I can accept that one program might fail due to lack of testing, but minicom and PuTTY are pretty solid in my book and those failed as well.
« Last Edit: September 17, 2024, 10:33:49 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #97 on: September 17, 2024, 10:35:27 pm »
The reason WCH doesn't do this is because they don't have to.  Paying a Linux kernel dev to do this would not increase their business any.  I don't know if they have samples of different (incl. past) versions of ch341 chips for testing, but since they bothered to GPL their driver, they might help if it doesn't cost them anything.
Well, they are maintaining this out of tree driver and adding new kernel version #ifdefs every now and then. So it already costs them something and there is apparently a business case for having a reasonably functional Linux driver.
Implementing a working USB serial driver is easy and cheap: there are many already in the upstream kernel to look at as a guide.  It is finessing the driver into a form that is acceptable to upstream that is more expensive, because it requires someone to persevere and commit to the project until it is in acceptable shape and included – plus stay on afterwards, enough to respond to any bug reports and to add the new device IDs and such.

It is not that expensive, and there are already SoC vendors like Rockchip who employ such developers because it makes financial sense.  In WinChipHead's case, they already have an in-tree driver (that does not support hardware flow control, and apparently has other issues as well, for example with break line condition) and their own out-of-tree GPL2 source driver, which makes it hard to justify any further investment.

Of course, if they had such a developer to support all their chips, with that developer having support from the company to do the job properly, upstreaming new device support and so on, they could change the perception of the company and their products among kernel developers and systems integrators (those building their own "distros" for appliances like routers and such).  I think it would be worth it within a year or two, but me and business planning aren't friends anymore, so I could easily be wrong.

In a way, it might be fun to do, actually.  The hardest part would be to obtain samples of the various versions of the chips, and actually testing that the "fixed" driver works correctly over the whole known range, with some automated test benches or data flow tests.  That sort of hardware testing would also be very much appreciated by the subsystem maintainers, and make patch submission much easier; it looks like a major issue with past patch submissions was the inability to check whether they negatively affected other versions of the ch341 devices.

It would also be fun to find out how the USB Implementers Forum would react to the suggested additions for flow control configuration to USB CDC 1.2 spec.  I very much doubt they'd even read emails from a non-member, much less an open source developer.  (WCH = Nanjing Qinheng Microelectronics Co., Ltd., vendor 0x1A86.)  I don't think it'd succeed, but again, I'm often wrong.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6867
  • Country: fi
    • My home page and email address
Re: The death of Xon/Xoff flow control...
« Reply #98 on: September 17, 2024, 10:55:13 pm »
How can setting a program to use software flow control go wrong?
The Linux CP210x driver does support XON/XOFF flow control, and with any XON and XOFF characters, too.  When you plug in the converter, could you check if
    stty -a -F /dev/ttyconverter
output contains start = ^Q; stop = ^S; ?  It should, but if not, it'd be the reason for the flow control issue.

If not, then in Linux, you need to set them:
    stty -F /dev/ttyconverter start '^Q' stop '^S' ixon ixoff
because obviously the terminal applications won't.  This is easiest to do in a suitable udev rule matching the device, and adding RUN:="/sbin/stty -F /dev/%k start '^Q' stop '^S' ixon ixoff" (or other suitable termios settings).
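
For anyone who would rather do this from a program than from stty, here is a minimal Python sketch of the same settings using the standard termios module. The helper names with_xonxoff and enable_xonxoff are made up for illustration, and this only sets the termios flags; whether the driver or the chip actually honours them is a separate question.

```python
import termios

XON = b"\x11"   # ^Q (DC1), the usual start character
XOFF = b"\x13"  # ^S (DC3), the usual stop character


def with_xonxoff(attrs, start=XON, stop=XOFF):
    """Return a copy of a termios attribute list with XON/XOFF enabled.

    attrs has the layout returned by termios.tcgetattr():
    [iflag, oflag, cflag, lflag, ispeed, ospeed, cc].
    (Hypothetical helper, not part of any library.)
    """
    iflag, oflag, cflag, lflag, ispeed, ospeed, cc = attrs
    cc = list(cc)                      # copy, so the caller's list is untouched
    cc[termios.VSTART] = start         # same as: stty start '^Q'
    cc[termios.VSTOP] = stop           # same as: stty stop '^S'
    iflag |= termios.IXON | termios.IXOFF   # same as: stty ixon ixoff
    return [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]


def enable_xonxoff(fd):
    """Apply the settings immediately to an already open tty file descriptor."""
    termios.tcsetattr(fd, termios.TCSANOW, with_xonxoff(termios.tcgetattr(fd)))
```

To use it, open the port (for example os.open("/dev/ttyUSB0", os.O_RDWR | os.O_NOCTTY)) and pass the file descriptor to enable_xonxoff; stty -a should then list ixon and ixoff without a leading minus.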

However, the driver mentions erratum CP2102N_E104, saying that CP2102N in QFN20, QFN24, QFN28 packages with firmware versions up to 0x10004 do not support flow control.  I hope your device is not a CP2102N?
 

Offline nctnico (Topic starter)

  • Super Contributor
  • ***
  • Posts: 27935
  • Country: nl
    • NCT Developments
Re: The death of Xon/Xoff flow control...
« Reply #99 on: September 17, 2024, 11:18:09 pm »
Output looks OK to me:

stty -a -F /dev/ttyUSB0
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W;
lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc -ixany -imaxbel -iutf8
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke -flusho -extproc

The chip used is a CP2103
With xon/xoff flow control enabled the xon/xoff characters are absorbed somewhere but they just don't have any effect.
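
As a side note, a listing like the one above can be checked mechanically rather than by eye. A minimal Python sketch, assuming the usual stty -a layout where flag lines contain no '=' and a leading '-' means the flag is cleared; stty_flags is a hypothetical helper name:

```python
def stty_flags(stty_output):
    """Return {flag_name: bool} parsed from the text printed by `stty -a`."""
    flags = {}
    for line in stty_output.splitlines():
        # Lines such as "start = ^Q; stop = ^S;" hold character settings, not flags.
        if "=" in line:
            continue
        for token in line.split():
            # A leading '-' marks a cleared flag, e.g. "-ixoff".
            flags[token.lstrip("-")] = not token.startswith("-")
    return flags


# Three lines copied from the output above.
sample = """\
speed 115200 baud; rows 0; columns 0; line = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff -iuclc -ixany -imaxbel -iutf8
"""
flags = stty_flags(sample)
print(flags["ixon"], flags["ixoff"], flags["crtscts"])  # prints: True False False
```

Fed the output above, it reports ixon as set and both ixoff and crtscts as cleared. In termios terms, ixon makes the tty layer pause its own output on a received XOFF, while ixoff makes it send XOFF/XON when its input buffer fills.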

I don't think you should rule out that there is a bug in the specific kernel version I'm using, even though it is a vanilla one from Debian. I've seen some weird crap where some device drivers were partially rewritten to support new kernel data structures and others were not.
« Last Edit: September 17, 2024, 11:20:41 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

