So that you know, ICSP is "In Circuit Serial Programming", which is the protocol to program the firmware inside PIC MCUs
I know, and I actually know most of the ICSP commands various models of PICs use, too. Nevertheless, as I haven't made my own PIC ICSP dongle, I don't know the optimal abstract-command set. You mentioned you have, so you'd be better able to define that.
Maybe i have a misconception of what a driver does, but in my mind it's a small program that read/write from/to the peripheral and GPIO, but running at a higher "priority", and when required it either returns the function, or calls a callback so the software in userspace can go on
Yes. In Linux, the timing requirements mean you'd need to create a driver (specifically a character device driver), in the form of a kernel module.
Linux Device Drivers, 3rd Edition describes the implementation and background, although the in-kernel interfaces have changed somewhat. The idea is that the driver exposes a character device under /dev. A userspace process with suitable privileges (the default is root-only; access requirements are configured via udev rules) opens the device as normal and operates on it, with the driver implementing the operations.
The userspace-facing interface has several mechanisms, but two are simple and recommended:
read()/pread() and write()/pwrite() to read or write data from/to the device, starting at a specific offset or position (default is current file position); and
ioctl() messages with a single userspace memory structure as an input/output parameter.
(This obviously suggests using read/pread/write/pwrite for Flash/RAM access, noting the offsets are 64 bit even on 32-bit architectures, and ioctl()s for commands and requests, but again, not having implemented a PIC ICSP dongle myself, I don't presume to claim that is the best option. In fact, the Linux event device interface –– exactly this kind of a character device, but one that reads/writes only complete struct input_event structures, containing a kernel timestamp, unsigned 16-bit type, unsigned 16-bit code, and signed 32-bit value –– uses reads and writes only, ignoring "file position", and although some idiots tried to move it to an ioctl(), the read/write model has been found best for Human Interface Device messages. Sets of messages that occur at the same time are always separated by a single .type=EV_SYN, .code=SYN_REPORT, .value=0 message.)
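To make the event-device model concrete, here is a minimal Python sketch of how a reader could group complete struct input_event records into reports ended by the EV_SYN/SYN_REPORT message. The helper names (pack_event, split_reports) are made up for illustration, and the layout assumes the host's native struct timeval of two C longs, which holds on common 64-bit Linux systems but not on every ABI.

```python
import struct

# Native layout of struct input_event: struct timeval (two C longs, an
# assumption that matches common 64-bit Linux ABIs), then unsigned 16-bit
# type, unsigned 16-bit code, and signed 32-bit value.
EVENT = struct.Struct("llHHi")

EV_SYN = 0x00       # values from <linux/input-event-codes.h>
EV_KEY = 0x01
SYN_REPORT = 0


def pack_event(sec, usec, etype, code, value):
    """Build one event record (hypothetical helper, handy for testing)."""
    return EVENT.pack(sec, usec, etype, code, value)


def split_reports(data):
    """Group events into reports, each ended by an EV_SYN/SYN_REPORT event."""
    reports, current = [], []
    usable = len(data) - len(data) % EVENT.size   # skip a trailing partial record
    for offset in range(0, usable, EVENT.size):
        _sec, _usec, etype, code, value = EVENT.unpack_from(data, offset)
        if etype == EV_SYN and code == SYN_REPORT:
            reports.append(current)
            current = []
        else:
            current.append((etype, code, value))
    return reports
```

A reader would call split_reports() on whatever read() returned from the event device; events after the last SYN_REPORT are simply not reported yet.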
It is doable, yes. I recommend using the Bootlin Elixir Cross Referencer for investigating the Linux kernel source tree for specific kernel versions, especially when reading LDD3 (2.6.39.4 tree), as it turns all identifiers into links with cross references.
Compare that to using a microcontroller with native full-speed (12 Mbit/s) or high-speed (480 Mbit/s) USB, implementing one or more USB Serial (USB CDC) endpoints. Again, my preference is Teensy 4.0 with Teensyduino (although bare metal is quite possible, as the bootloader is on a separate chip; it only means you don't really have access to the JTAG/SWD pins on the i.MX RT1062), with TXU0102 (UART) or TXU0304 (SPI) voltage translators, or ISO6721/7721 (UART) or ISO6741/7741 (SPI) isolators, both only needing a couple of 100 nF supply bypass capacitors.
You implement a request-response (or command-response) protocol, noting that with high-speed USB, each data packet can have up to 512 bytes of payload data: that much data can appear "atomically" in your receive buffers. I recommend splitting longer data transfers into smaller packets, so you won't need larger buffers, although Teensy 4.x does have 1 Mbyte of RAM (in two 512k parts).
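One way to picture the packet splitting is the following Python sketch. The frame layout (command byte, 32-bit offset, 16-bit payload length), the CMD_WRITE code, and the function names are all hypothetical; only the 512-byte limit comes from the text above.

```python
import struct

MAX_PACKET = 512                        # high-speed USB packet payload limit
HEADER = struct.Struct("<BIH")          # hypothetical: command, offset, length
MAX_PAYLOAD = MAX_PACKET - HEADER.size  # data bytes that fit in one frame

CMD_WRITE = 0x01                        # hypothetical command code


def frames_for_write(offset, data):
    """Yield request frames, each at most MAX_PACKET bytes long."""
    for start in range(0, len(data), MAX_PAYLOAD):
        chunk = data[start:start + MAX_PAYLOAD]
        yield HEADER.pack(CMD_WRITE, offset + start, len(chunk)) + chunk


def parse_frame(frame):
    """Split a frame into (command, offset, payload), as the MCU side would."""
    command, offset, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return command, offset, payload
```

Because every frame carries its own offset and length, the receiving side only ever needs a buffer of MAX_PACKET bytes, and a response can acknowledge each frame individually.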
On the host computer, you configure a udev rule to set the proper owner, group, and mode for the device node, and create a symlink (nobody wants to scan which /dev/ttyACMN it might be!), say /dev/usb-icsp-interface. You write a userspace application in whatever language you want (but I do recommend using termios and not the serial libraries; they're all crappy in my opinion), but note that as explained in man 2 read and man 2 write, each such call can return a smaller value ("short count") than requested, in which case you simply need to retry with the rest of the data. In Python, you use open(devicepath, "r+b", buffering=0) to get raw read/write I/O, so that the object will be of io.RawIOBase class or a derivative; and the termios module to set it to raw mode. This is almost always done by obtaining the current settings, saving a copy to restore just before closing the device, and using them as the basis of the new settings. I can show you exact code if you have decided what programming language you'll be using.
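Should Python be the choice, the approach could look roughly like the sketch below. It opens the device with "r+b" so the same unbuffered object can be both read and written, saves the terminal settings, applies a raw-mode copy, and restores the saved settings on exit. The function names are my own placeholders, and the flag selection follows the usual cfmakeraw() recipe rather than anything specific to a particular device.

```python
import contextlib
import termios


def raw_attrs(saved):
    """Return a raw-mode copy of tcgetattr() settings, leaving saved intact."""
    new = list(saved)
    new[6] = list(saved[6])             # copy the control-character list too
    iflag, oflag, cflag, lflag = new[0:4]
    iflag &= ~(termios.IGNBRK | termios.BRKINT | termios.PARMRK | termios.ISTRIP
               | termios.INLCR | termios.IGNCR | termios.ICRNL | termios.IXON)
    oflag &= ~termios.OPOST
    lflag &= ~(termios.ECHO | termios.ECHONL | termios.ICANON
               | termios.ISIG | termios.IEXTEN)
    cflag = (cflag & ~(termios.CSIZE | termios.PARENB)) | termios.CS8
    new[0:4] = [iflag, oflag, cflag, lflag]
    new[6][termios.VMIN] = 1            # read() blocks until at least one byte
    new[6][termios.VTIME] = 0           # no inter-byte timeout
    return new


@contextlib.contextmanager
def open_raw(devicepath):
    """Open the device unbuffered in raw mode; restore the settings on exit."""
    with open(devicepath, "r+b", buffering=0) as dev:
        saved = termios.tcgetattr(dev.fileno())
        termios.tcsetattr(dev.fileno(), termios.TCSANOW, raw_attrs(saved))
        try:
            yield dev
        finally:
            termios.tcsetattr(dev.fileno(), termios.TCSANOW, saved)


def write_all(dev, data):
    """Retry after short writes until all of data has been written."""
    view = memoryview(data)
    while len(view) > 0:
        count = dev.write(view)
        if not count:
            raise OSError("device accepted no data")
        view = view[count:]


def read_exact(dev, length):
    """Retry after short reads until exactly length bytes have arrived."""
    buffer = bytearray()
    while len(buffer) < length:
        chunk = dev.read(length - len(buffer))
        if not chunk:
            raise EOFError("device closed or no data available")
        buffer += chunk
    return bytes(buffer)
```

Typical use would be `with open_raw("/dev/usb-icsp-interface") as dev:` followed by write_all(dev, request) and read_exact(dev, response_length), with the path being the example symlink name from above.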
In essence, by moving the "icsp driver" code from the kernel to a separate microcontroller, you simplify the programming interface, as you aren't restricted to the kernel interfaces, nor do you need to consider any code running in parallel at all. You have full control and essentially full isolation. Depending on your needs, you can use a microcontroller from any number of families, from the cheap WCH CH55x to the NXP i.MX RT1062 used on Teensy 4, including many Microchip ones.
The main difference is that you need to convert from function calls to serialized command structures or requests and their responses, passed along a bidirectional pipe, and on the MCU, implement the functions from the commands/requests received serially via USB.
The Linux userspace API for SPI devices is documented in the kernel documentation (Documentation/spi/spidev.rst) and in the <linux/spi/spidev.h> header file, both part of the Linux kernel source tree. It is based on 'spidev' character devices, and is rather limited. In particular, read() and write() are half-duplex, with the other side discarded, although that should be fine for a shared MOSI/MISO/DI/DO pin (which I only realized after writing the above).
If and only if that interface is sufficient and implemented for the SBC you want, you can use the SPI interface from a userspace application. Obviously, you can also use GPIO pins in conjunction, but with the timing caveat: you must allow for any operation to be delayed by up to several hundred milliseconds when the SBC is otherwise busy. The same applies to separate transfers over SPI, unless you use the ioctl() interface with one or more struct spi_ioc_transfer structures, in which case they are almost continuous and the chip select is kept asserted for all transfers in the same ioctl command. Raising the userspace process and I/O priority can shrink that delay to maybe two dozen milliseconds under load (less than that when not loaded, of course), but then a bug in it causing a busy loop can make the machine nearly unusable; as in, terminating that process can take a couple of minutes.
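As an illustration of the multi-transfer ioctl(), here is a Python sketch using ctypes. The structure mirrors struct spi_ioc_transfer from <linux/spi/spidev.h> as found in recent kernels (older headers have plain padding in place of the last fields, with the same total size). The request number assumes the asm-generic _IOW() encoding used on x86 and ARM; some architectures encode it differently. The function names are placeholders, and this has not been run against real hardware.

```python
import ctypes
import fcntl


class SpiIocTransfer(ctypes.Structure):
    """Mirror of struct spi_ioc_transfer from <linux/spi/spidev.h>."""
    _fields_ = [
        ("tx_buf", ctypes.c_uint64),        # userspace address, or 0 for none
        ("rx_buf", ctypes.c_uint64),        # userspace address, or 0 for none
        ("len", ctypes.c_uint32),
        ("speed_hz", ctypes.c_uint32),
        ("delay_usecs", ctypes.c_uint16),
        ("bits_per_word", ctypes.c_uint8),
        ("cs_change", ctypes.c_uint8),
        ("tx_nbits", ctypes.c_uint8),
        ("rx_nbits", ctypes.c_uint8),
        ("word_delay_usecs", ctypes.c_uint8),
        ("pad", ctypes.c_uint8),
    ]


def spi_ioc_message(count):
    """SPI_IOC_MESSAGE(count), assuming the asm-generic _IOW() bit layout."""
    size = count * ctypes.sizeof(SpiIocTransfer)
    return (1 << 30) | (size << 16) | (ord("k") << 8) | 0


def write_then_read(fd, command, reply_length, speed_hz=1_000_000):
    """Send command, then read reply_length bytes, within one ioctl() call."""
    tx = ctypes.create_string_buffer(bytes(command), len(command))
    rx = ctypes.create_string_buffer(reply_length)
    transfers = (SpiIocTransfer * 2)()
    transfers[0].tx_buf = ctypes.addressof(tx)
    transfers[0].len = len(command)
    transfers[1].rx_buf = ctypes.addressof(rx)
    transfers[1].len = reply_length
    for transfer in transfers:
        transfer.speed_hz = speed_hz
        transfer.bits_per_word = 8
    # Both transfers run back to back with chip select held asserted.
    fcntl.ioctl(fd, spi_ioc_message(2), transfers)
    return rx.raw
```

Here fd would come from something like os.open("/dev/spidev0.0", os.O_RDWR), where the device path depends on the SBC and its device tree.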
Personally, I'd test the SPI interface on the SBC you consider using first, using a microcontroller as the SPI slave. For bulk data, I like to use the 32 high bits of the Xorshift64* pseudo-random number generator with a userspace-specified 64-bit seed (any nonzero 64-bit state is okay); here, one test command would read the current seed and error counter, another would set the seed and reset the error counter, one would send a sequence of data that the MCU checks against Xorshift64*, and another would receive a sequence of data (with userspace comparing it to Xorshift64*). This is one of the rare generators that passes all BigCrush statistical tests for randomness, which not even Mersenne Twister does. The period is 2⁶⁴-1, which is sufficient. Plus, it is extremely fast on microcontrollers, consisting of (64-bit) bit shifts and exclusive-ors, with a final 64-bit multiplication (of which only the 32 most significant bits are actually used) for mixing. This is what I used for USB Serial testing on Teensy 4.x, getting 200+ Mbit/s sustained indefinitely.
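For reference, the generator described above can be sketched in a few lines of Python, using the usual Xorshift64* shift triple (12, 25, 27) and multiplier. The class and helper names, and the choice of little-endian byte order for the test data, are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1


class Xorshift64Star:
    """Xorshift64* generator that returns only the 32 high bits per step."""

    def __init__(self, seed):
        if seed & MASK64 == 0:
            raise ValueError("state must be nonzero")
        self.state = seed & MASK64

    def next32(self):
        x = self.state
        x ^= x >> 12
        x ^= (x << 25) & MASK64
        x ^= x >> 27
        self.state = x
        # 64-bit multiplication for mixing; keep the 32 most significant bits.
        return ((x * 0x2545F4914F6CDD1D) & MASK64) >> 32

    def fill(self, count):
        """Return count bytes of test data (count should be a multiple of 4)."""
        out = bytearray()
        while len(out) < count:
            out += self.next32().to_bytes(4, "little")
        return bytes(out[:count])


def count_errors(received, seed):
    """Count bytes that differ from the expected sequence for this seed."""
    expected = Xorshift64Star(seed).fill(len(received))
    return sum(1 for a, b in zip(received, expected) if a != b)
```

The sending side calls fill() with the agreed seed, and the receiving side calls count_errors() with the same seed, so neither side needs to store the test data.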
Only after such testing would I trust the SPI and spidev implementation for ICSP use. (It is not used commonly enough to be automatically trustworthy, in my opinion, you see.)