If you only want to access your Linux machine's screen remotely then what is wrong with something like x11vnc or xrdp?
I'm developing X11 terminals { mechanics, hardware, firmware, kernel-support, userspace-support, apps } that must work exactly as "screen/tmux" works, i.e. let you run X11 programs on a remote host, direct their display to your local machine (my X11 terminal), and then disconnect from these programs and reconnect from the same or another machine, without losing any state.
That's "screen for X11", a persistent connection.
That has the advantage that it will also work with modern applications like browsers (or stuff like Electron), which don't work over X11 remote connections anymore due to the ancient OpenGL version supported by the X11 extensions required to make OpenGL work across the network.
I don't know if I will support OpenGL; anyway, web browsers are always problematic for X11.
They require too much bandwidth and open too many TCP/IP connections. Luckily I don't have to support any browser(1) or video player, only X11 applications like { Geany, nedit, ... or even things that rely on OpenMotif }, which are very light.
(1) must be "distraction free": no browsers/YouTube, no social media, ...
Expecting that you will find something with no dependencies but that implements stuff like literally an entire X protocol, has built-in web client, can interact with the various desktop session protocols, forwards stuff like audio, webcams, etc. - good luck. Get real, that ain't gonna happen. Nobody is going to re-implement things like that from scratch (and introduce myriads of bugs in the process) instead of using existing and tested libraries.
This is precisely the point:
to clean up and eliminate everything that is superfluous: physically clone the repo, remove the unwanted code(2), purge the Python parts, distill the project, and package it in a "minimal" and "essential" form.
(2) remove[]={ built-in web client, interaction with the various desktop session protocols, forwarding of stuff like audio, webcams, etc, ... }. I don't need/want these.
feel free to reinvent VNC or RDP, along with the image compression and encryption/authentication, it is your time
Actually, umm, I did a similar thing by "hacking" a couple of strange devices built in China that I bought on eBay. They capture the XGA video stream (1024x768, we are on a 70 MHz pixel clock, so it is less than 20 fps), compress it in H26*, and send it in TCP/IP packets over 1000 Mbit/s Ethernet.
At the other end I built an ad-hoc application in C (fbcon/DirectFB) that receives the packets and reconstructs the transmitted frames on /dev/fb0.
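To give an idea of the last step (putting a decoded frame on /dev/fb0), here is a minimal sketch in C. It is not my actual code: the function name blit_frame and its parameters are made up for illustration, and it assumes the frame is already decoded into a tightly packed buffer in the framebuffer's pixel format. On a real system fb would be the mmap'ed /dev/fb0 and fb_stride would come from the line_length field returned by the FBIOGET_FSCREENINFO ioctl.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy one decoded frame into framebuffer memory.
 *
 * fb        : start of the framebuffer memory (e.g. mmap'ed /dev/fb0)
 * fb_stride : bytes per framebuffer row (may be larger than row_bytes)
 * src       : decoded frame, rows tightly packed one after another
 * row_bytes : bytes of pixel data per row (width * bytes per pixel)
 * rows      : number of rows to copy
 *
 * Returns 0 on success, -1 if the arguments are unusable.
 */
int blit_frame(uint8_t *fb, size_t fb_stride,
               const uint8_t *src, size_t row_bytes, size_t rows)
{
    if (fb == NULL || src == NULL || row_bytes > fb_stride)
        return -1;

    /* Copy row by row, because the framebuffer rows can be padded. */
    for (size_t y = 0; y < rows; y++)
        memcpy(fb + y * fb_stride, src + y * row_bytes, row_bytes);

    return 0;
}
```

The row-by-row copy matters because the framebuffer line length is not guaranteed to equal width times bytes per pixel; a single big memcpy would only be correct when there is no padding.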
It is a very crude system that works in direct contact with the kernel services. I recently also added audio (remote-out, local-in) and remote keyboard/mouse (local-in, remote-out to the kernel's /dev/input).
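For the keyboard/mouse part, the idea can be sketched as packing each input event into a small fixed-size record before sending it over the TCP connection. This is only an illustration under my own assumptions: the 8-byte big-endian layout, the struct wire_event and the two function names are hypothetical, not a real protocol. The three fields mirror the type/code/value members of the kernel's struct input_event.

```c
#include <stdint.h>
#include <string.h>

#define WIRE_EVENT_SIZE 8  /* hypothetical record size: 2 + 2 + 4 bytes */

/* Hypothetical event record, mirroring type/code/value of input_event. */
struct wire_event {
    uint16_t type;   /* e.g. 1 for key events, 2 for relative motion */
    uint16_t code;   /* key or axis code */
    int32_t  value;  /* key state or motion delta */
};

/* Serialize an event into 8 bytes, big-endian, independent of host order. */
void wire_event_pack(const struct wire_event *ev, uint8_t out[WIRE_EVENT_SIZE])
{
    uint32_t v;
    memcpy(&v, &ev->value, sizeof v);   /* reinterpret the signed value's bits */

    out[0] = (uint8_t)(ev->type >> 8);
    out[1] = (uint8_t)(ev->type & 0xFF);
    out[2] = (uint8_t)(ev->code >> 8);
    out[3] = (uint8_t)(ev->code & 0xFF);
    out[4] = (uint8_t)(v >> 24);
    out[5] = (uint8_t)((v >> 16) & 0xFF);
    out[6] = (uint8_t)((v >> 8) & 0xFF);
    out[7] = (uint8_t)(v & 0xFF);
}

/* Rebuild the event on the receiving side from the 8-byte record. */
void wire_event_unpack(const uint8_t in[WIRE_EVENT_SIZE], struct wire_event *ev)
{
    uint32_t v = ((uint32_t)in[4] << 24) | ((uint32_t)in[5] << 16) |
                 ((uint32_t)in[6] << 8)  |  (uint32_t)in[7];

    ev->type = (uint16_t)(((unsigned)in[0] << 8) | in[1]);
    ev->code = (uint16_t)(((unsigned)in[2] << 8) | in[3]);
    memcpy(&ev->value, &v, sizeof v);   /* restore the signed value */
}
```

A fixed-size record keeps the receiver trivial: it reads exactly 8 bytes per event from the stream, unpacks them, and hands the result to whatever injects events on the remote side.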
The problem is: it somehow works, but with three of those things connected to the router I'm literally "pissing off" the OpenSpace guys.