any examples of OS not written in C/C++?

I think the ARM1 was mostly clean because it was new and minimalist. Newer ARM cores are somewhat more raggedy.

As for optimisers, that's mostly GCC, which turns out rubbish, to be fair. I haven't noticed clang do anything quite as terrible at high optimisation levels. I think my favourite one was GCC unrolling a variable-cycle loop. Not sure how it worked out that was possible  :-//

I am on the fence with respect to "bugs". Some are programming errors and some are language specification errors, but the majority, in my experience, come from not understanding the required task to a sufficient degree to start with, then winging it and hoping that, if the problem is solvable, you do enough of it to get paid.


--- Quote from: bd139 on May 05, 2021, 10:35:38 am ---Best use for zero page on the 6502 is stack as it’s cheaper to access it.

--- End quote ---

Ahhh .. I'd contest that.

Ok, you could use X as a kind of stack pointer into zero page and do a PUSH A as "STA $00,X; DEX" and POP A as "INX; LDA $00,X". But those are both 4 + 2 = 6 cycles while the actual PHA is 3 cycles and PLA is 4 cycles. Plus you're using 3 bytes of code instead of 1. And you lose a scarce register! Hard to see the point.

The *only* advantage is you can use LDA $01,X STA $02,X etc to directly access stack elements other than the top one, which you can't do with the real stack.

But with the real stack you can do TSX then LDA $0100,X or STA $0101,X etc if and when needed, at a cost of 2 cycles for the TSX and 4 cycles for loads or 5 for stores -- the same timing as for Zero Page for loads, and 1 cycle more for stores.
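
To make the arithmetic above easy to check, here is a small sketch that just adds up the numbers. The byte and cycle figures are the usual NMOS 6502 table values (absolute,X loads assume no page crossing); the `COST` table and `total` helper are names made up for this illustration.

```python
# Byte and cycle costs from the standard NMOS 6502 instruction tables.
# Absolute,X loads assume no page boundary is crossed.
COST = {
    # mnemonic: (bytes, cycles)
    "STA zp,X":  (2, 4),
    "LDA zp,X":  (2, 4),
    "DEX":       (1, 2),
    "INX":       (1, 2),
    "PHA":       (1, 3),
    "PLA":       (1, 4),
    "TSX":       (1, 2),
    "LDA abs,X": (3, 4),
    "STA abs,X": (3, 5),
}

def total(sequence):
    """Return (bytes, cycles) for a list of instruction names."""
    size = sum(COST[op][0] for op in sequence)
    cycles = sum(COST[op][1] for op in sequence)
    return size, cycles

zp_push = total(["STA zp,X", "DEX"])    # software stack in zero page
zp_pop = total(["INX", "LDA zp,X"])
hw_push = total(["PHA"])                # hardware stack in page 1
hw_pop = total(["PLA"])
hw_peek = total(["TSX", "LDA abs,X"])   # read an element below the top

for name, (size, cycles) in [
    ("zero-page push", zp_push),
    ("zero-page pop", zp_pop),
    ("PHA", hw_push),
    ("PLA", hw_pop),
    ("TSX + LDA $0100,X", hw_peek),
]:
    print(f"{name:20s} {size} bytes, {cycles} cycles")
```

The TSX only has to be paid once if several stack elements are read in a row, so the peek cost is the worst case.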

--- Quote ---I think ARM started as 32 bit 6502 with orthogonal register utility.

--- End quote ---

Not at all. The ways in which 6502 influenced ARM are well documented as:

1) the 6502 uses basically every available bus cycle. You can mostly understand the execution time of programs as being the number of bytes of instruction fetch, loads, and stores they do, with just a couple of exceptions (2 cycle minimum, and extra cycle for indexed addressing modes that need a carry from low byte to high byte).

2) Wilson and Furber visited the Western Design Center to discuss them designing a 32 bit chip for Acorn. The WDC weren't interested, but the Acorn people were stunned at how few employees there were and figured that if WDC could design CPUs then so could they.
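
The rule of thumb in point 1 can be tried against a few documented timings. This is only a sketch: `estimated_cycles` is a made-up helper, and the table deliberately lists simple loads, stores and jumps where the rule holds. It is not a full timing model (read-modify-write and stack instructions cost more than the rule predicts).

```python
def estimated_cycles(instr_bytes, loads=0, stores=0, page_cross=False):
    """Estimate cycles as bus accesses: instruction bytes + data reads + data writes."""
    cycles = instr_bytes + loads + stores
    if page_cross:
        cycles += 1           # carry from low to high address byte
    return max(cycles, 2)     # no instruction takes fewer than 2 cycles

# (name, bus activity, documented NMOS 6502 cycle count)
DOCUMENTED = [
    ("LDA #imm", dict(instr_bytes=2), 2),
    ("LDA zp", dict(instr_bytes=2, loads=1), 3),
    ("LDA abs", dict(instr_bytes=3, loads=1), 4),
    ("STA zp", dict(instr_bytes=2, stores=1), 3),
    ("STA abs", dict(instr_bytes=3, stores=1), 4),
    ("JMP abs", dict(instr_bytes=3), 3),
    ("INX", dict(instr_bytes=1), 2),
    ("LDA abs,X (page crossed)", dict(instr_bytes=3, loads=1, page_cross=True), 5),
]

mismatches = [
    name for name, activity, documented in DOCUMENTED
    if estimated_cycles(**activity) != documented
]
print("mismatches:", mismatches)
```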

--- Quote ---Compiler should make register allocation decisions based on what it’s compiling. Itanium was designed around that concept. The actual architecture was impenetrable by humans. We probably should let the machines design the ISA at this point like we do with the silicon.

--- End quote ---

I disagree with that too :-)  But then I'm spending quite a lot of my time helping design and evaluate RISC-V ISA extensions.

Certainly it's very useful to use the machine to help you sift through existing code looking for improvement opportunities and evaluating the magnitude of them, rather than just going off hunches, but there's still a heck of a lot of good taste involved.

Some of the best (or worst) 6502 code I saw was the BASIC interpreter - 8kbyte of spaghetti (I didn't see the source, just disassembled it). Amazing that it packed all the functions in, dynamic strings etc. I had the interesting task of implementing a sliding-window error-correcting protocol (X.PC) on a C64. It had to play nicely with the BASIC because the application was written in BASIC. It was possible to disable the BASIC ROM and access the 8K of RAM that occupied the same address space, so most of the code and buffers were 'invisible'. The O/S also had to be hacked, to implement a bit bashing 1200 baud software UART (the C64 doesn't have a hardware one).

I agree that ARM architecture is better handled by HLL.

Multithreading on a processor is pretty straightforward. When a process is interrupted, some of the state has to be saved so it can resume without corruption. If the interrupt routine moves that state somewhere, it can then enable interrupts and be interrupted itself. It just has to put back the state of the interrupted process on completion and return.

I am using the technique to deal with NMEA messages from a GPS module. There is a background process that does some calculation every second, and an interrupt routine that buffers incoming NMEA characters as each one is received. When a full NMEA message is buffered, the character receiver saves the state of the background process, branches to the message handler and enables interrupts. That allows the message to be parsed while another buffer is being received. After parsing, the background process state is put back and the parser returns. The background process eventually picks up the parsed data.

The timing constraint is that the parser has to finish parsing a message before the next buffer is filled. In real life there's plenty of cycles to spare.
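
A rough model of the two-buffer handoff, in Python rather than on the actual PIC. It only shows the buffer swap; the saving and restoring of the background state and the re-enabling of interrupts have no equivalent here. `NmeaReceiver`, `on_char`, `parse` and the sample sentences are all invented for the sketch.

```python
class NmeaReceiver:
    """Two receive buffers: one is filled while the other is parsed."""

    def __init__(self, handler):
        self.buffers = [bytearray(), bytearray()]
        self.active = 0          # index of the buffer currently being filled
        self.handler = handler   # called with each complete message

    def on_char(self, ch):
        """Stand-in for the per-character receive interrupt."""
        buf = self.buffers[self.active]
        buf.append(ch)
        if ch == ord("\n"):
            # Swap first, so characters arriving from now on land in the
            # other buffer while this one is handed to the parser.
            self.active ^= 1
            self.buffers[self.active].clear()
            self.handler(bytes(buf))

parsed = []   # what the background process would eventually pick up

def parse(message):
    """Hypothetical message handler: split the sentence into fields."""
    parsed.append(message.decode("ascii").strip().split(","))

rx = NmeaReceiver(parse)
for ch in b"$GPGGA,123519\r\n$GPRMC,123520\r\n":
    rx.on_char(ch)

print(parsed)
```

On real hardware the parser must finish before the buffer that is now being filled completes, because the next swap clears the buffer it was reading from.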

I wanted to run the PIC16F1455 from a precise voltage. The MAX6350 provides 5V up to 15mA, stable to 4uV/C. Wondered if that was enough. Measured the processor current, up to 5mA at times.  Cheering.

I love linear addressing. Seriously, I love it!

I am playing with two 8bit MPUs attached to a shared ram. 2Kbyte of dual port ram, 32Kbyte of ram, 8Kbyte of rom.

This toy needs a kernel, a pico kernel, a femto kernel, or even just a simple scheduler, something to sort out some very wild parallel computing.

Each node runs four cooperatively scheduled tasks. Node A is a producer, node B is a consumer, and the dual-port RAM between them is what they use to communicate, pushing and popping data.

I am still programming everything in assembly.
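
One way the producer/consumer arrangement could be modelled, as a sketch only: a single-producer, single-consumer ring buffer stands in for the dual-port RAM, and Python generators stand in for cooperative tasks that give up the CPU at each `yield`. It runs one task per node in a single scheduler instead of four tasks on two separate CPUs, and every name (`SharedRing`, `producer`, `consumer`, `run`) is an assumption for illustration.

```python
class SharedRing:
    """Single-producer, single-consumer ring buffer standing in for the dual-port RAM."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.head = 0   # written only by the producer
        self.tail = 0   # written only by the consumer

    def push(self, value):
        nxt = (self.head + 1) % len(self.mem)
        if nxt == self.tail:
            return False              # full (one slot is always left empty)
        self.mem[self.head] = value
        self.head = nxt
        return True

    def pop(self):
        if self.tail == self.head:
            return None               # empty
        value = self.mem[self.tail]
        self.tail = (self.tail + 1) % len(self.mem)
        return value

def producer(ring, values):
    """Node A task: push each value, yielding while the ring is full."""
    for v in values:
        while not ring.push(v):
            yield
        yield

def consumer(ring, out, count):
    """Node B task: pop values until `count` have been received."""
    while len(out) < count:
        v = ring.pop()
        if v is not None:
            out.append(v)
        yield

def run(tasks):
    """Round-robin cooperative scheduler: each task runs until its next yield."""
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)

ring = SharedRing(4)
received = []
data = list(range(10))
run([producer(ring, data), consumer(ring, received, len(data))])
print(received)
```

Because each index has exactly one writer, the two sides need no lock as long as an index update is a single write, which is what makes this layout attractive for two 8-bit CPUs sharing a RAM.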

