Not interested in winning some dick size competition. If RISC-V ends up in the middle of the pack and competitive on measures such as code size or number of instructions just by compiling straightforward C code in a wide variety of situations with no special effort then I'm perfectly content.
x86, 68k and VAX were all designed at a time when maximizing the productivity of the assembly language programmer was seen as one of the highest (if not actual highest) priorities. They'd gone past simply trying to make a computer that worked and even making the fastest computer and come to a point that computers were not only fast *enough* for many applications but had hit a speed plateau. (It's hard to believe now that Apple sold 1 MHz 6502 machines for over *seventeen* years, and the Apple //e alone for 11 years.)
The x86, 68k and VAX were all vastly easier for the assembly language programmer than their predecessors the 8080, 6800, and PDP-11 (or PDP-10). They also were better for compilers, though people still didn't trust them.
The RISC people came along and said "If you simplify the hardware in *this* way then you can build faster machines cheaper, compilers actually have an easier time making optimal code, and everyone will be using high level languages in future anyway".
A lot of that was because you had to calculate instruction latencies yourself and place dependent instructions far enough apart that the result of the previous instruction was already available -- and not doing so meant not just that your program was less efficient than it could be, but that it didn't work at all! Fortunately, that stage didn't last long, for two reasons: 1) your next-generation CPU would have different latencies (sometimes longer, as pipeline lengths increased), meaning old binaries would not work, and 2) as CPUs increased in MHz faster than memory did, caches were introduced, and then you couldn't predict whether a load would take 2 cycles or 10 -- the same code had to be able to cope with 10 but run faster when you got a cache hit.
Hey... anybody know if 'The Mill' is still grinding away?
https://millcomputing.com/
ARM was unusual in being designed specifically to take advantage of the fast page mode memory which had become available, leading to instructions such as load multiple and store multiple.
This woman is definitely a superheroine
Roger that, job very well done.
# regs
reg00: 0x00000000
reg01: 0xdeadbeaf
reg02: 0x00000000
reg03: 0x00000000
reg04: 0x00000000
reg05: 0x00000000
reg06: 0x00000000
reg07: 0x00000000
reg08: 0x00000000
reg09: 0x00000000
reg10: 0x00000000
reg11: 0x00000000
reg12: 0x00000000
reg13: 0x00000000
reg14: 0x00000000
reg15: 0x00000000
reg16: 0x00000000
reg17: 0x00000000
reg18: 0x00000000
reg19: 0x00000000
reg20: 0x00000000
reg21: 0x00000000
reg22: 0x00000000
reg23: 0x00000000
reg24: 0x00000000
reg25: 0x00000000
reg26: 0x00000000
reg27: 0x00000000
reg28: 0x00000000
reg29: 0x00000000
reg30: 0x00000000
reg31: 0x00000000
# md 0xf1000000
f1000000..f10007ff 2048 byte I00:0 mem:1 hd:1 magic1 bin/data_cpu1reg.bin
showing memory @ 0xf1000000
0xf1000000 .. 0xf10007ff
f1000000: 00000000 afbeadde 00000000 00000000 [................]
f1000010: 00000000 00000000 00000000 00000000 [................]
f1000020: 00000000 00000000 00000000 00000000 [................]
f1000030: 00000000 00000000 00000000 00000000 [................]
f1000040: 00000000 00000000 00000000 00000000 [................]
f1000050: 00000000 00000000 00000000 00000000 [................]
f1000060: 00000000 00000000 00000000 00000000 [................]
f1000070: 00000000 00000000 00000000 00000000 [................]
f1000080: 00000000 00000000 00000000 00000000 [................]
f1000090: 00000000 00000000 00000000 00000000 [................]
f10000a0: 00000000 00000000 00000000 00000000 [................]
f10000b0: 00000000 00000000 00000000 00000000 [................]
f10000c0: 00000000 00000000 00000000 00000000 [................]
f10000d0: 00000000 00000000 00000000 00000000 [................]
f10000e0: 00000000 00000000 00000000 00000000 [................]
f10000f0: 00000000 00000000 00000000 00000000 [................]
#
I don't mind either byte order.
What burns my goat is the way some documentation insists on labeling bits in decreasing order of importance: most significant bit 0. The only bit labeling that makes any sense to me is the mathematical one: for unsigned integers, bit i corresponding to value 2^i.
I wonder WTF was in the head of Intel when they wanted to use LittleEndian ... it's unnatural for humans
Little-endian naturally casts between bytes, halves and words without the need to move things around. And how things are physically located in memory is mostly irrelevant.
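A small sketch of that point, assuming a little-endian host (true on x86 and most ARM configurations): the least significant byte sits at the lowest address, so a narrower read from the *same* address yields the low part of the value, with no pointer adjustment. The helper names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Read the low byte of a 32-bit value by reading from its address.
   On a little-endian host no offset is needed; on a big-endian host
   you would have to add sizeof(word) - 1. */
static uint8_t low_byte(uint32_t word)
{
    uint8_t b;
    memcpy(&b, &word, sizeof b);
    return b;
}

static uint16_t low_half(uint32_t word)
{
    uint16_t h;
    memcpy(&h, &word, sizeof h);
    return h;
}
```

On a little-endian machine, `low_byte(0x11223344)` gives `0x44` and `low_half(0x11223344)` gives `0x3344`, which is what makes narrowing "casts" free.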
Quote: I wonder WTF was in the head of Intel when they wanted to use LittleEndian ... it's unnatural for humans

Copying the DEC PDP11, as were pretty much all the microcontroller manufacturers at the time. (Although the 68000, with an arguably much-more-PDP11-like instruction set, is big endian.)
#include <stdint.h>

static inline uint32_t unpack_u32le(const unsigned char *const data)
{
    return ((uint32_t)data[0])
         | ((uint32_t)data[1] << 8)
         | ((uint32_t)data[2] << 16)
         | ((uint32_t)data[3] << 24);
}

static inline uint32_t unpack_32be(const unsigned char *const data)
{
    return ((uint32_t)data[0] << 24)
         | ((uint32_t)data[1] << 16)
         | ((uint32_t)data[2] << 8)
         | ((uint32_t)data[3]);
}
depending on the surrounding code. That's why I don't mind.

How exactly do you tag data?
I like to have the byte order conversions explicitly visible.
most of the time you want to convert the data when it enters/leaves the MCU.
I don't want it to do the conversion before each operation.
Current C compilers know how to optimize e.g. static inline uint32_t unpack_32be ...
Quote: current C compilers know how to optimize e.g. static inline uint32_t unpack_32be ...

Which compilers? gcc-arm didn't optimize it "at all" (for CM4), nor does XCode LLVM :-(
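If a given compiler/target fails to fuse the shift-or idiom into a single load, one common workaround is to do the load explicitly and byte-swap with a builtin. This is a sketch, not a guarantee for any particular toolchain: `load_32be` is a made-up name, and `__builtin_bswap32` is a GCC/Clang extension rather than standard C. On Cortex-M4 this pattern typically compiles to LDR + REV.

```c
#include <stdint.h>
#include <string.h>

/* One word-sized load via memcpy (which also handles unaligned
   pointers safely), then a byte swap only on little-endian hosts. */
static inline uint32_t load_32be(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);   /* single REV/BSWAP instruction */
#endif
    return v;
}
```

Whether this beats the shift-or version is compiler- and flag-dependent, so it's worth checking the generated assembly either way.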