Author Topic: make tmux & screen able to survive socket failure during session attached  (Read 2377 times)

DiTBho (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4452
  • Country: gb
Re: make tmux & screen able to survive socket failure during session attached
« Reply #25 on: January 17, 2025, 09:21:39 pm »
Yes, rsp+, rb532a and rbm33g  :o :o :o

I need to buy a mini-rack, 10"
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

DiTBho (Topic starter)

« Reply #26 on: January 18, 2025, 02:33:40 pm »
More specifically, Linux uses the standard Unix/POSIX termios, where (by default)
  • EOT (ASCII 004, Ctrl+D; termios .c_cc[VEOF]) flushes the outgoing buffer; and if sent at the beginning of a line, read() will return 0, signifying end-of-file/end-of-stream/end-of-transmission (assuming termios ICANON is set)
  • ETX (ASCII 003, Ctrl+C; termios .c_cc[VINTR]) causes an INT signal to be sent to the processes having this as their controlling terminal (assuming termios ISIG is set)
  • FS (ASCII 034, Ctrl+\; termios .c_cc[VQUIT]) causes a QUIT signal to be sent to the processes having this as their controlling terminal (assuming termios ISIG is set)
  • SUB (ASCII 032, Ctrl+Z; termios .c_cc[VSUSP]) causes a TSTP signal to be sent to the processes having this as their controlling terminal (assuming termios ISIG is set)
The standard SSH keepalive is sending null packets, essentially transferring no payload data but ensuring the other end is still in contact.
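
To check the termios defaults listed above on a given system, here is a minimal Python sketch (an illustration, not something from the original posts). It opens a fresh pseudoterminal pair with pty.openpty(), reads the special characters and the ICANON/ISIG flags from the slave side, and then writes "abc", Ctrl+D, Ctrl+D to the master side to see what a reader on the slave side gets. The function names special_chars and eof_demo are made up for this example, and the behavior assumes Linux with the default line discipline settings.

```python
import os
import pty
import termios


def special_chars(fd):
    """Return the special characters and mode flags of the tty open on fd."""
    attrs = termios.tcgetattr(fd)
    lflag, cc = attrs[3], attrs[6]  # attrs[3] is c_lflag, attrs[6] is the c_cc list
    return {
        "VEOF": cc[termios.VEOF],    # end-of-file character (Ctrl+D by default)
        "VINTR": cc[termios.VINTR],  # raises SIGINT when ISIG is set
        "VQUIT": cc[termios.VQUIT],  # raises SIGQUIT when ISIG is set
        "VSUSP": cc[termios.VSUSP],  # raises SIGTSTP when ISIG is set
        "ICANON": bool(lflag & termios.ICANON),
        "ISIG": bool(lflag & termios.ISIG),
    }


def eof_demo():
    """Type 'abc', Ctrl+D, Ctrl+D into a new pty and return what a reader sees."""
    master, slave = pty.openpty()
    try:
        settings = special_chars(slave)
        os.write(master, b"abc\x04")  # Ctrl+D mid-line: hand over the partial line
        first = os.read(slave, 100)
        os.write(master, b"\x04")     # Ctrl+D at the start of a line: end-of-file
        second = os.read(slave, 100)
        return settings, first, second
    finally:
        os.close(master)
        os.close(slave)
```

On a Linux pty with default settings I would expect eof_demo() to report VINTR as b'\x03' and VEOF as b'\x04', the first read to return b'abc' (Ctrl+D pushes the partial line through without a newline), and the second read to return an empty bytes object, which is how read() returning 0 shows up in Python.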


Code:
20250118 12:32:34 Waiting for end of input or terminal to close.
20250118 12:32:37 SIGINT, Interrupted system call.

rsp+ registered this during the experiment.
Screen lost the session, the attached process got terminated  :o :o :o
 

Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7303
  • Country: fi
« Reply #27 on: January 18, 2025, 03:24:30 pm »
Code:
20250118 12:32:34 Waiting for end of input or terminal to close.
20250118 12:32:37 SIGINT, Interrupted system call.
Ewww! :wtf:

Yeah, MobaXterm (or something) injects an ETX (\x03) when the connection is lost, unless the SSH Keepalive setting is enabled.

(In theory, it could be somewhere else in the process chain, but because the setting seems to fix the issue, I'd wager it is MobaXterm.  There is a possibility it is actually sshd on the Linux side: if without keepalive MobaXterm sends incomplete SSH packets, the SSH server may kill the connection due to what it sees as bad data.  Keepalive would then fix this because it somehow ensures that application-internal buffers are flushed to the TCP/IP connection through the SSH transport layer at packet boundaries.  To verify that, I'd need to see MobaXterm sources, which aren't publicly available.  Anyway, if one installed the handlers as SA_SIGINFO, with (int signo, siginfo_t *info, void *context), then info->si_code would be SI_KERNEL if it indeed came from the tty layer, or SI_USER if it was sent by kill, with info->si_pid identifying the sending process.)
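
The si_code check can be sketched in Python as well, with one swap: Python signal handlers do not receive a siginfo_t, so instead of an SA_SIGINFO handler this blocks the signal and collects it with signal.sigwaitinfo(), which does return si_code and si_pid. The numeric SI_USER and SI_KERNEL values are the Linux ones and are hard-coded here as an assumption, since the signal module does not export them. The names describe_origin and demo are hypothetical.

```python
import os
import signal

SI_USER = 0       # assumed Linux value: signal sent with kill(2)
SI_KERNEL = 0x80  # assumed Linux value: signal generated by the kernel


def describe_origin(info):
    """Turn a struct_siginfo into a short description of who sent the signal."""
    if info.si_code == SI_KERNEL:
        return "sent by the kernel"
    if info.si_code == SI_USER:
        return f"sent by kill() from pid {info.si_pid}"
    return f"other origin, si_code={info.si_code}"


def demo(signum=signal.SIGUSR1):
    """Send signum to ourselves and report where the kernel says it came from."""
    # Block the signal so it stays pending instead of running its default action.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        os.kill(os.getpid(), signum)
        info = signal.sigwaitinfo({signum})  # dequeue the pending signal
        return describe_origin(info)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

demo() sends SIGUSR1 to itself with os.kill(), so it should report kill() with its own PID. To investigate the real problem, one would instead wait on SIGINT (and probably SIGHUP) in the process run under screen and look at what describe_origin() says when the connection drops.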



I posted the program and this post just to show the technique I use to solve these kinds of mysteries.  It is a simple procedure, but it can take an annoying amount of time: one has to write the test program, reproduce the problem, and then refine the catching program iteratively to find out more (I am not much limited by time nowadays).  It also relies on being able to reproduce the issue reliably.  One starts by identifying how the problem occurs (here, what signal kills the process, and what kind of situation causes the target process to exit), then tests exactly how that happens (by obtaining further info from the killing signal), and finally isolates the cause of that signal.
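
As a rough illustration of the first step (finding out which signal arrives), a catcher along these lines could be left running inside the screen session. This is a minimal Python sketch, not the program referred to above; the watched signal set and the names record, install and events are assumptions made for the example.

```python
import signal
import time

# Signals that commonly end a terminal session (an assumed, adjustable set).
WATCHED = (signal.SIGHUP, signal.SIGINT, signal.SIGQUIT, signal.SIGTERM)

# Each entry is (timestamp, signal name), in order of arrival.
events = []


def record(signum, frame):
    """Signal handler: note which signal arrived and when."""
    stamp = time.strftime("%Y%m%d %H:%M:%S")
    events.append((stamp, signal.Signals(signum).name))


def install():
    """Route every watched signal to record() instead of its default action."""
    for sig in WATCHED:
        signal.signal(sig, record)
```

After install(), looping on signal.pause() until events is non-empty and then printing the entries gives timestamped lines in the same spirit as the log quoted earlier. Because the handlers replace the default actions, the catcher survives the signal and can report it.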

In this case, we skipped directly to isolating the cause.  I knew the error message was associated with MobaXterm (I know it because it has pretty good remote graphics support; X11, RDP, and VNC at least), and a quick search associated it with SSH keepalive, so I discovered the configuration setting involved.

(A decade to almost two ago, I volunteered maintaining an installation of IHMC CmapServer – part of CmapTools written in Java – for a university department of education, and often had to use a Windows workstation to connect to the Linux server using X11.  I might have used MobaXterm around 2011, but am not absolutely certain.  That installation was quite funky, too: the original instructions wanted to install the closed-source Java stuff as root on Linux, but paranoid me will not!  So, I ran it as a non-privileged user in a sandbox-like environment, with Apache as a reverse proxy and providing SSL/TLS security for connections to the service.  Worked like a charm quite reliably for years.  Nowadays, I'd absolutely recommend implementing the client side in HTML+CSS+JavaScript with active server connections using WebSockets, with server side WebSocket stuff (collaboration) proxied via Nginx/Apache to a service daemon run unprivileged; could even write that in Python for portability.  No Java.)
 
The following users thanked this post: DiTBho

ejeffrey

  • Super Contributor
  • ***
  • Posts: 4060
  • Country: us
« Reply #28 on: January 18, 2025, 05:57:48 pm »
Yes if a connection is dropped without being reset or closed, the screen client can stick around for a long time.
The default behavior of screen is to reattach to the first available disconnected screen instance.
You need to use the -d or -D option to detach the zombie client.

GNU/screen was emerged without -d/-D options.
Re-emerged. Now, it successfully resumes sessions!


However ...
... using MobaXterm (professional edition v21.5) as terminal on Windows 10 ... there's still something wrong.
In this case, something kills the attached process.

e.g. if under MobaXterm, connected to the remote GNU/Linux server via ssh, I run screen -R dev1, and then nano file.txt, ... if the network drops, the GNU/nano process is killed.

This is really weird,

I just found out that this only happens with a mobaxterm terminal on Windows.

I will try later to grab more information and post here. I can't access that laptop at the moment.

Frankly if it is possible you "accidentally" had a version of gnu screen that didn't support the -d/-D arguments your system is insane and there is no telling what other piece of software could be built or configured incorrectly. 

The terminal emulator is definitely suspect, but the dependence on SSH keepalives settings means it could also be a server side problem. 

Try installing a bog standard Debian system and see if you have the same problem with a mobaxterm client?
 

DiTBho (Topic starter)

« Reply #29 on: January 18, 2025, 06:44:31 pm »
Quote
Frankly if it is possible you "accidentally" had a version of gnu screen that didn't support the -d/-D arguments your system is insane and there is no telling what other piece of software could be built or configured incorrectly.

Umm, according to the Diary of Catalyst, app-misc/screen was minimally profiled on December 5, 2009, and since then compiled accordingly with the minimum possible features. A sort of "light" version, which, for all these years, using home network with less than 10% of packet loss, never caused any problem.

The problems started a few months ago, and most likely it is time to replace one or both wifi routers in the two rack cabinets that act as a bridge between where I am now, and where the remote computers are.

When you have a list of USE-flags, and patches to be applied on several multi-libs - uc-libc is still as experimental as it is problematic on mips - it might happen that you want to compile stuff with the minimal dependencies and features possible.

For example. We have been working on this issue for 20 years, and GNU/Linux on ip32r10k still has serious problems related to the speculative execution of mips r10k on a platform that is not cache-coherent ...

So, most likely I copied and pasted the minimum GNU/Screen profile from there, as Screen and Bash are the only two things we run on the experimental ram-rootfs kernels (init=/bin/bash), and then applied it to the other mips profiles of GNU/screen, which are instead stable and can be emerged with full features, and optimized for performance rather than for minimal space to reduce cache pressure.

Quote
The terminal emulator is definitely suspect, but the dependence on SSH keepalives settings means it could also be a server side problem.

=net-misc/openssh-* from v5.5_p1-r2 (~2009) to v9.2_p1-r2 (~2024) still needs to be investigated; I experienced the same behavior.
Anyway, I have never had any problems using a GNU/Linux terminal.
The problem only occurred when using MobaXterm without the KeepAlive option enabled.

I should try other Windows terminals.
I will install one sooner or later. I don't always have my laptop available, it's a company one.

Quote
Try installing a bog standard Debian system and see if you have the same problem with a mobaxterm client?

I don't "install GNU/Linux", I build everything from sources for different archs(1).

(1) { mipsII/be, mipsIII/be, mipsIV/be, mips32r2/be, mips32r2/le, ppc32/be, ppc64/be, hppa2/be }.
 

ejeffrey

« Reply #30 on: January 18, 2025, 07:10:56 pm »

Quote
Umm, according to the Diary of Catalyst, app-misc/screen was minimally profiled on December 5, 2009, and since then compiled accordingly with the minimum possible features. A sort of "light" version, which, for all these years, using home network with less than 10% of packet loss, never caused any problem.

It's nothing about packet loss, it's about how you built screen with core features disabled.  Who knows what else is wrong with your screen, SSH, terminal subsystem, and shell configurations.

Quote
The problems started a few months ago, and most likely it is time to replace one or both wifi routers in the two rack cabinets that act as a bridge between where I am now, and where the remote computers are.


The only thing that is guaranteed to be the problem is the network or routers.  Screen, on a properly configured host, is 100% immune to the problem you are having: packet loss, dropped connections, stalled connections, or whatever.  That is the entire point of the screen program and it works correctly for everyone else.  The terminal client *could* be sending garbage that causes it to disconnect but I seriously doubt it, if for no other reason than that once the connection is dropped the client isn't in a position to send extra data even if it wanted to.

Quote
When you have a list of USE-flags, and patches to be applied on several multi-libs - uc-libc is still as experimental as it is problematic on mips - it might happen that you want to compile stuff with the minimal dependencies and features possible.

If you disable standard features because you don't know what they are, you are going to have problems.

Quote
I don't "install GNU/Linux", I build everything from sources for different archs(1).

(1) { mipsII/be, mipsIII/be, mipsIV/be, mips32r2/be, mips32r2/le, ppc32/be, ppc64/be, hppa2/be }.

That's exactly why I suggested you do so. I'd say it's 90% you have an improperly configured server and 10% the problem is the client.  The fact that it "seemed to work" in the past doesn't mean anything.  So the best course is to try against a known working system.
 

DiTBho (Topic starter)

« Reply #31 on: January 18, 2025, 07:30:53 pm »
Quote
Screen, on a properly configured host is 100% immune to the problem you are having

And in fact it works perfectly with a GNU/Linux terminal!
An attached process is only terminated if I use MobaXterm as the terminal, and only if I don't use the "keepalive" option.

Quote
If you disable standard features because you don't know what they are, you are going to have problems.

The Diary of Catalyst reports that /usr/sbin/sshd(net-misc/openssh) has always been compiled with the default profile, and runs with the default configuration.
« Last Edit: January 18, 2025, 07:44:40 pm by DiTBho »
 

DiTBho (Topic starter)

« Reply #32 on: January 20, 2025, 10:52:33 am »
Further experiment:
MobaXterm without "keepalive" showed the same behavior (the process attached to GNU/Screen gets killed) with dropbear as the SSH server.

Code:
net-misc/dropbear
      Latest version available: 2024.85-r2
      Latest version installed: 2024.85-r2
      Homepage:      https://matt.ucc.asn.au/dropbear/dropbear.html
      Description:   Small SSH 2 client/server designed for small memory environments

Ok, I think it's solved.
I have to work on other development stuff now.
« Last Edit: January 20, 2025, 10:54:55 am by DiTBho »
 
The following users thanked this post: Nominal Animal

