Pocket-Sized 6 GHz 1 TS/s ET Scope

SJL-Instruments:
So, we found the root cause of the issue.

First, we misread your reply #477, and thought that the commands themselves would cause a lockup. We can confirm that running the commands, in conjunction with our software, reproduces the behavior you're seeing. Sorry about that.

The main problem is that one of your commands sets the CDF tolerance (% command) to 0.000007, and our software does not reset it to the default (0.01) on startup. We will implement this in the next revision. This solves the "first lockup mode" you're describing, where the software connects but does not trigger.

The reason this causes a lockup is that the timeout for the R command is scaled based on the CDF tolerance (the expected amount of time needed to reach the specified tolerance), and we did not put a cap on this. :palm: In v15 firmware we will get rid of this, and allow the user to specify a max timeout manually, in seconds.

This change will solve the "second lockup mode" to some degree, but not completely. As an example, at very low trigger rates (100 Hz), each R command may need 5 seconds to gather enough data. It's possible to queue up hundreds of these read commands in the serial buffer, effectively "locking up" the scope for several minutes.
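To put rough numbers on this, here is a back-of-the-envelope sketch. The 5 seconds per read is the figure from the paragraph above; the backlog of 100 queued reads is only an illustrative assumption, and `queued_lockup_seconds` is a hypothetical helper, not anything in the firmware or software.

```python
def queued_lockup_seconds(queued_reads: int, seconds_per_read: float) -> float:
    """Time the scope stays busy while it works through a backlog of reads."""
    return queued_reads * seconds_per_read


# 5 s per read is from the post above; 100 queued reads is an assumed backlog.
backlog = queued_lockup_seconds(queued_reads=100, seconds_per_read=5.0)
print(f"{backlog:.0f} s ({backlog / 60:.1f} min)")  # prints: 500 s (8.3 min)
```

Because the per-read time depends on the trigger rate and the data, the real figure can be well above or below this estimate.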

This behavior is technically "as designed," but the end result is not desirable. We could of course put a hard cap on the timeout, or decrease the buffer size, but this artificially limits the capability of the scope, and does not completely get rid of the issue. Another idea is to limit the total "queued time" of buffered commands, but the time taken is data-dependent and not predictable. If you have a preferred way to solve this issue, let us know.

On another note, this behavior is unchanged from v13. We did not find it, since our tests did not try fuzzing in conjunction with running the software. We will make this mode part of our testing in the future. The necessary conditions are for the CDF tolerance to be very low (<10 ppm), and the max samples to be very high (>3 million), which is difficult to find with random bytes.

We did check the firmware for any buffer overflow issues, and did not find any. Any commands that would overflow the buffer will be dropped, but should not cause a lockup.


--- Quote from: joeqsmith on March 25, 2024, 11:44:12 pm ---Not that it helps, but on the graphing, LabVIEW allows you to have multiple markers per axis on a graph. The horizontal would normally be 0 - 10 for example, but I can have a second scale of say 10n - 10.5n on that same axis. I can also manually scale the graphs. So, I send the data to the graph and set the min and max horizontal to a half sample from the actual min/max. I then set the second scale to what the actual time is. There's no math or anything to track. It's very clean to do it this way and I don't have a dead spot on the graph, which it sounds like you will have based on your description. If I tell the scope to sweep from 10-11, I am not expecting the graph to be from 9-12 with data only showing up between 10 and 11. I expect it to start at 10 and stop at 11.

--- End quote ---
We've attached a screenshot of the updated scheme - this should be identical to your proposal in #478.

joeqsmith:
One possible solution would be to add a watchdog. The software would send down a refresh command at some minimum rate. The firmware would require this command, or any other communication, at this minimum rate, or it would return to the default power-up state. The firmware can't really count on the calibration command, as the user may not want to send it. The refresh command may not even need an acknowledge returned to the PC.

You may want some sort of fault status that the software could read to determine if a reset occurred.   


--- Quote ---It's possible to queue up hundreds of these read commands in the serial buffer, effectively "locking up" the scope for several minutes.
--- End quote ---

That's fine as long as the PC continues to send the refresh during this time. If something went wrong and no data was being sent from the PC, the firmware would reset to the default state. The timeout could be on the order of several seconds. We are not trying to waste a lot of time with this sanity check. We just want a way to recover if there is a major fault.
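As a rough illustration of the watchdog idea, here is a minimal sketch of the firmware-side logic, written in Python for readability. Everything in it is an assumption made for the example: the class name `CommsWatchdog`, the `cdf_tol` setting key, and the 3 second timeout are hypothetical and do not reflect the actual firmware.

```python
class CommsWatchdog:
    """Restores power-up defaults when the host has been silent too long."""

    def __init__(self, timeout_s, defaults):
        self.timeout_s = timeout_s
        self.defaults = dict(defaults)
        self.settings = dict(defaults)
        self.last_rx = 0.0
        self.reset_flag = False  # fault status the host can read back

    def on_receive(self, now):
        """Call for every received command, including a bare refresh."""
        self.last_rx = now

    def poll(self, now):
        """Call periodically from the main loop."""
        if now - self.last_rx > self.timeout_s:
            self.settings = dict(self.defaults)
            self.reset_flag = True
            self.last_rx = now  # avoid resetting again on every poll

    def read_and_clear_fault(self):
        """Report whether a watchdog reset happened since the last query."""
        flag, self.reset_flag = self.reset_flag, False
        return flag


# Hypothetical usage: the host sets a very low tolerance, then goes quiet.
wd = CommsWatchdog(timeout_s=3.0, defaults={"cdf_tol": 0.01})
wd.settings["cdf_tol"] = 0.000007
wd.on_receive(now=1.0)
wd.poll(now=2.0)  # 1 s of silence: nothing happens
wd.poll(now=5.0)  # 4 s of silence: settings return to defaults
```

The fault flag is what lets the software notice that its earlier settings were discarded, so it can resend them.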

I'm sure there are many ways to skin this one.   Give it some thought.

SJL-Instruments:
After asking the good folks on the MCU forum, seems like the consensus, and industry-standard implementation, is to scrap the FIFO entirely (or reduce its length to 1). In our specific case, there is no extra throughput gain to be had by having a FIFO length of 2 or higher, and a FIFO length of 1 is sufficient if the end-user program can be mapped to a state machine. This avoids the lockup problem entirely.

There is a question of whether to actually remove the FIFO, or just remove it as part of the documented interface. Both options would behave identically as long as the user implementation follows the documentation (wait for response to start before sending next command). Removing it may break backwards compatibility - we are leaning towards keeping it for this reason. If it is kept, we would add a note in the documentation in case anybody does run into the lockup.
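On the host side, following the documented rule amounts to keeping at most one command in flight. A minimal sketch, assuming a stand-in `FakeScope` object in place of the real serial port; the class, its methods, and the placeholder command bytes are all hypothetical. For simplicity it waits for the whole reply, which is stricter than waiting for the response to start.

```python
from collections import deque


class FakeScope:
    """Stand-in for the serial link; produces one reply per command."""

    def __init__(self):
        self.in_flight = 0
        self.max_in_flight = 0
        self._replies = deque()

    def write(self, cmd):
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        self._replies.append(b"reply to " + cmd)

    def read_reply(self):
        self.in_flight -= 1
        return self._replies.popleft()


def run_commands(port, commands):
    """Send commands strictly one at a time, waiting for each reply."""
    replies = []
    for cmd in commands:
        port.write(cmd)                    # one command goes out...
        replies.append(port.read_reply())  # ...and nothing else until it answers
    return replies


scope = FakeScope()
results = run_commands(scope, [b"cmd1", b"cmd2", b"cmd3"])  # placeholder commands
```

With this pattern the device-side buffer never holds more than one pending command, so a long backlog cannot build up whether or not the FIFO is physically removed.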

joeqsmith:
I changed to a full handshake some time ago and don't see a problem with it working as described. As long as everything is documented, it should be good to go. I have added the tol command to my software as well, and am adding the Y to the R command.

***


--- Quote ---The reason this causes a lockup is that the timeout for the R command is scaled based on the CDF tolerance (the expected amount of time needed to reach the specified tolerance), and we did not put a cap on this. :palm: In v15 firmware we will get rid of this, and allow the user to specify a max timeout manually, in seconds.

--- End quote ---

Consider that whatever you come up with, there needs to be a way to abort a command (without pulling the USB cable). As I mentioned, I use a full handshake today. However, if the scope doesn't respond within N time, I do some sort of recovery, normally resending the last command. There isn't any mention in the manual of how you want to handle such cases.
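The recovery described above (wait up to N, then resend the last command) could look roughly like this on the host. It is only a sketch under stated assumptions: `send` and `recv` are hypothetical callables supplied by the caller, `recv(timeout_s)` is assumed to return `None` on timeout, and the default timeout and attempt count are arbitrary.

```python
def send_with_retry(send, recv, cmd, timeout_s=2.0, max_attempts=3):
    """Send cmd and wait for a reply, resending if the wait times out.

    Returns (reply, attempts_used). Raises TimeoutError if every attempt
    times out.
    """
    for attempt in range(1, max_attempts + 1):
        send(cmd)
        reply = recv(timeout_s)  # assumed to return None on timeout
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no reply to {cmd!r} after {max_attempts} attempts")


# Hypothetical usage with a link that loses the first reply.
sent = []
pending = iter([None, b"ok"])
reply, attempts = send_with_retry(sent.append, lambda timeout_s: next(pending), b"cmd")
```

Resending is only safe for commands that can be repeated without side effects, and it does not abort a command the scope is still executing, which is the gap raised above.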

Kean:

--- Quote from: SJL-Instruments on March 27, 2024, 01:52:11 am ---There is a question of whether to actually remove the FIFO, or just remove it as part of the documented interface.

--- End quote ---

Are you able to detect software disconnection of the communication port in order to flush the FIFO, e.g. via the CP2102 hardware handshaking signals or maybe the SUSPEND pins?
I'm not sure SUSPEND is useful, as it probably cannot be triggered under app software control, only by host power management.
