Update
Scope is up and working! All 4 channels are now working flawlessly. Still some improvements to do on the annoying trigger horizontal positioning and zooming, though.
Performance
I was wrong about the Cortex-M7 performance.
After activating data cache and instruction cache, things got much better. M7 runs at 216 MHz, SDRAM at 108 MHz.
Graphics
I went from double buffering to triple buffering for the screen. That way, I avoid the "60 FPS to 30 FPS" drop each time the processing takes a little bit too long. Space in external SDRAM is not an issue.
Now the frame rate is much smoother: I'm always around 55-58 FPS, down to 40-45 FPS when using the FFT or when one channel is noisy all over the screen. That's without compiler optimisation (without -Ofast).
I still need to solve a frame buffer switching issue: I get some ghost images, probably from a race condition in the pointer switching.
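To make the triple-buffering idea above concrete, here is a minimal sketch of the index bookkeeping in C. All names (`triple_buffer_t`, `tb_publish`, `tb_on_vsync`) are hypothetical and not taken from the scope firmware. It also assumes the swap can be made atomic on the target, for example inside a critical section. One possible cause of ghost images, which is only a guess here, is the renderer being handed the buffer that is still being scanned out. In this scheme the front buffer only ever changes at VSYNC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for three frame buffers, identified by index 0..2. */
typedef struct {
    uint8_t front;      /* buffer the display controller is scanning out */
    uint8_t ready;      /* most recently completed frame, not yet displayed */
    uint8_t back;       /* buffer the renderer is drawing into */
    uint8_t has_ready;  /* 1 if 'ready' holds a frame newer than 'front' */
} triple_buffer_t;

void tb_init(triple_buffer_t *tb)
{
    assert(tb != NULL);
    tb->front = 0;
    tb->ready = 1;
    tb->back = 2;
    tb->has_ready = 0;
}

/* Renderer side: call when a frame is complete. Returns the index of the
 * next buffer to draw into. On the target this swap is assumed to run in a
 * critical section so the VSYNC handler never sees a half-updated state. */
uint8_t tb_publish(triple_buffer_t *tb)
{
    assert(tb != NULL);
    uint8_t done = tb->back;
    tb->back = tb->ready;   /* reuse the stale 'ready' slot for drawing */
    tb->ready = done;       /* the finished frame waits for the next VSYNC */
    tb->has_ready = 1;
    return tb->back;
}

/* Display side: call once per VSYNC. Returns the index to scan out next. */
uint8_t tb_on_vsync(triple_buffer_t *tb)
{
    assert(tb != NULL);
    if (tb->has_ready) {
        uint8_t old_front = tb->front;
        tb->front = tb->ready;
        tb->ready = old_front;
        tb->has_ready = 0;
    }
    return tb->front;
}
```

The renderer never has to wait for VSYNC: after `tb_publish()` it always gets a buffer that is neither on screen nor queued for display. If it finishes two frames within one refresh period, the older one is simply dropped.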
The heaviest graphics operation is drawing the lines. With 4 channels and a 700-pixel screen width, that's 2800 lines every 16.6 ms. I'm using Bresenham's algorithm and I tried everything possible to go faster (different algorithms, inline functions, playing with the cache and burst access of the Cortex-M7...) without success. And every AI suggestion was slower, so we can keep our jobs.
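For a sense of the budget: 216 MHz times 16.6 ms is about 3.6 million cycles per frame, so 2800 lines leaves roughly 1280 cycles per line even if the whole frame time went to line drawing. For reference, here is a plain integer Bresenham line in C. It is a generic textbook sketch, not the scope's actual routine. The 8-bit pixel format, the `width`/`height` parameters and the function name `draw_line` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Draw a line from (x0, y0) to (x1, y1) into an assumed 8-bit-per-pixel
 * frame buffer of width x height pixels. Pixels outside the buffer are
 * skipped, so the endpoints may lie off-screen. */
void draw_line(uint8_t *fb, int width, int height,
               int x0, int y0, int x1, int y1, uint8_t color)
{
    assert(fb != NULL);

    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;   /* step direction in x */
    int sy = (y0 < y1) ? 1 : -1;   /* step direction in y */
    int err = dx + dy;             /* combined error term */

    for (;;) {
        if (x0 >= 0 && x0 < width && y0 >= 0 && y0 < height) {
            fb[(size_t)y0 * (size_t)width + (size_t)x0] = color;
        }
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

The inner loop uses only integer additions and comparisons, which is why it is hard to beat with a different algorithm. On a trace where x advances by one pixel per sample, most of the cost is likely the scattered SDRAM writes rather than the arithmetic, though that is a guess and not something measured here.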

I'm thinking of moving to LVGL for smoother graphics, but I'm afraid of losing a lot of performance.
RTOS
I also moved from my super messy big loop to FreeRTOS. First time using it. First step into the middleware world.
I decided to go for it when I wanted to add a calibration function that works with the screen at the same time. I understood that I would have to handle different tasks in parallel.
But it was a goal from the start of the project. I'm gonna need it too with USB and FAT.
Well, got it up and working after an afternoon.
I was reluctant to use AI, but here Gemini (the one from the Google start page) is actually incredible at helping from zero. Without it, it would have been 4-5 days, not half a day.
I'm concerned about the future of the internet tbh. However, for deeper logic, it's still not there. But for how long?
Well, dealing with parallel tasks is actually awesome. It simplifies things SO MUCH. It's just a bit brain-complicated about data access, task sleep/wakeup, semaphores and stuff.
I'm still at 70% CPU time for graphics alone...
The way my graphics library draws is the following:
I use big static "canvas" structures that contain all the dots, lines, rectangles, BMP images and texts.
Inside any function (or task, now) that needs graphics, I just add an object to one of these structures. I have several structures, corresponding to different graphics layers.
When the running functions end, I call a "draw frame" task that browses all the structures, copies BMP images or text fonts from FLASH to the SDRAM frame buffer (using the DMA2D, i.e. the Chrom-ART accelerator), and draws the lines (Bresenham's algorithm). Note to self: need to add some semaphore for double buffering here.
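The canvas scheme described above could look roughly like this in C. This is only a sketch under assumptions: the type and function names, the fixed capacity, the number of layers and the callback-based renderer are all made up for illustration and are not the actual library. `gfx_tally` is a stand-in renderer that only counts objects. On the target, that is where bitmaps and text would go to the DMA2D and lines to the Bresenham routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CANVAS_MAX_OBJECTS 64  /* assumed capacity per layer */
#define CANVAS_LAYERS      3   /* assumed number of layers */

typedef enum { GFX_DOT, GFX_LINE, GFX_RECT, GFX_BITMAP, GFX_TEXT, GFX_KINDS } gfx_kind_t;

typedef struct {
    gfx_kind_t  kind;
    int16_t     x0, y0, x1, y1;  /* position and, where relevant, end point */
    uint16_t    color;
    const void *data;            /* bitmap pixels or string in flash, else NULL */
} gfx_object_t;

typedef struct {
    gfx_object_t objects[CANVAS_MAX_OBJECTS];
    size_t       count;
} canvas_t;

static canvas_t layers[CANVAS_LAYERS];

/* Queue an object on a layer. Returns 0 on success, -1 if the layer is full. */
int canvas_add(unsigned layer, gfx_object_t obj)
{
    assert(layer < CANVAS_LAYERS);
    canvas_t *c = &layers[layer];
    if (c->count >= CANVAS_MAX_OBJECTS) {
        return -1;
    }
    c->objects[c->count++] = obj;
    return 0;
}

typedef void (*gfx_render_fn)(const gfx_object_t *obj, void *ctx);

/* Walk every layer from 0 upwards, hand each queued object to the renderer,
 * then empty the layer for the next frame. Returns the number of objects drawn. */
size_t canvas_draw_frame(gfx_render_fn render, void *ctx)
{
    assert(render != NULL);
    size_t drawn = 0;
    for (unsigned l = 0; l < CANVAS_LAYERS; l++) {
        canvas_t *c = &layers[l];
        for (size_t i = 0; i < c->count; i++) {
            render(&c->objects[i], ctx);
            drawn++;
        }
        c->count = 0;
    }
    return drawn;
}

/* Stand-in renderer: counts objects per kind in a size_t[GFX_KINDS] array. */
void gfx_tally(const gfx_object_t *obj, void *ctx)
{
    size_t *tally = ctx;
    tally[obj->kind]++;
}
```

With several tasks calling `canvas_add()` while the draw task runs, the `count` fields become shared state, which is where the semaphore mentioned in the note above would come in.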
Anyways this takes 70% CPU at 60 FPS.
I didn't mention it, but it's all in C. C++ is for later...
Bare-metal
I'm still not using any ready-made STM32Cube stuff. I include the libraries and write the CMake myself, to understand it all. I'm still bare-metal, without HAL.
It's to understand how everything is working and linked together.
In the future, I may change tactic.
Hardware
I still have issues matching the oscilloscope probe to the attenuation stage input. I've got a simulation running in LTspice, calculating everything: RLC of the PCB traces, impedance of the cheap oscilloscope probe I'm using, length of cable, parasitic capacitance of diodes, relay, PhotoMOS... but still, practical results differ a little from the simulation (but not so much!).
Actually, with the best setting I could get, I sampled with the FPGA a rise time of 20 ns, without ringing. That means about 20-30 MHz bandwidth, which I expected from the start. It's OK for 100 Msps.
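As a sanity check on the rise time figure, the usual rule of thumb is bandwidth ≈ k / rise time, with k = 0.35 for a single-pole response and values up to about 0.45 often quoted for steeper roll-offs. A tiny helper, with a hypothetical name, makes the arithmetic explicit:

```c
#include <assert.h>

/* Rule-of-thumb bandwidth estimate from a 10-90% rise time.
 * k = 0.35 assumes a single-pole (RC-like) response; k around 0.45 is
 * often used for responses with a steeper roll-off. */
double bandwidth_hz(double rise_time_s, double k)
{
    assert(rise_time_s > 0.0);
    return k / rise_time_s;
}
```

For 20 ns this gives 17.5 MHz with k = 0.35 and 22.5 MHz with k = 0.45. The measured rise time also includes the rise time of the test signal itself, so the front end alone may be somewhat faster, which fits the lower end of the 20-30 MHz estimate.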
Well, I ordered a bunch of NP0 picofarad capacitors (like 15, 18, 20, 22, 24...) and will try them all out. At least they are 0603, as the input voltage might be high; it's gonna be bliss to solder!
Also, I've got issues with the offset op-amp. I'm using the multiplexed DAC from the STM32F7 to generate an analog offset for all the channels before they reach the ADC (this helps to keep the 8-bit resolution; any digital offset would leave only 7 bits, and I'm not even speaking of ENOB). Anyway, my fast 180 MHz op-amp followers were oscillating, probably due to the length of the PCB traces and their capacitance. I had the issue from the very first prototype, but then I decided to use an LM324, a much slower op-amp. And it started oscillating even more, very badly, at low frequency.
But now, I tried using these LM324s again, and everything is fine. Analog electronics are a mystery. Anyway, I've got a stable offset voltage.
Sorry for all the un-understandable stuff.
Here is a cute stupid photo of the scope during the calibration stage.
I also wanted to add a cute image of a cat (which would get more likes than any of this stupid work) but there are already plenty over the internet.

Note1: painting in glossy black is a bad idea.
Note2: I'm still waiting for the custom stickers