Author Topic: Are data type compiler-dependent or target dependent  (Read 10430 times)


Offline King123Topic starter

  • Contributor
  • Posts: 18
  • Country: in
Are data type compiler-dependent or target dependent
« on: August 02, 2022, 01:19:22 pm »
I am being confused with data type in c programming language.

My question:

Are data type compiler-dependent or target dependent?
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1471
  • Country: pl
Re: Are data type compiler-dependent or target dependent
« Reply #1 on: August 02, 2022, 01:50:28 pm »
What exactly do you mean by “data type”? Or rather, what kind of dependence do you mean?

If you mean ranges, sizes, representation: both the compiler and the platform. Even more: they may also depend on particular options passed to the compiler. The platform limits what makes sense, so it’s the primary factor, but not the only one.
« Last Edit: August 02, 2022, 01:52:38 pm by golden_labels »
People imagine AI as T1000. What we got so far is glorified T9.
 

Online tellurium

  • Frequent Contributor
  • **
  • Posts: 287
  • Country: ua
Re: Are data type compiler-dependent or target dependent
« Reply #2 on: August 02, 2022, 02:48:06 pm »
I am being confused with data type in c programming language.

My question:

Are data type compiler-dependent or target dependent?

There are basic types, like int, long. They do not require any header file. Their size depends on the target. For example, if you're compiling on a 64-bit Windows machine using the Arduino IDE, the compiler targets the 8-bit AVR, where the "int" type is 2 bytes. If you're compiling for 64-bit Windows, the "int" type would be 4 bytes.

There are other types, like size_t, uint32_t, etc. They do require header files. When the C compiler compiles a piece of code, all headers get inlined and the more complex types resolve to the basic types. The header files, too, depend on the target. Usually, header files are bundled together with the compiler.

Hope that clarifies
Open source embedded network library https://github.com/cesanta/mongoose
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 
The following users thanked this post: Nominal Animal

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #3 on: August 02, 2022, 03:28:49 pm »
Are data type compiler-dependent or target dependent?

data-type is always the same, but the data-size is a bit weird

e.g.
on MIPS4 "long" means 64-bit, "long long" means 64-bit
on hc11, gcc 3.0.*, "int" means 16 bit, while with icc-v11, "int" means 32-bit
on hc11, gcc.3.4.6 + patch, "int" are either 16 or 32-bit entities depending on a special flag
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #4 on: August 03, 2022, 07:03:33 am »
endianness { BE, LE } is also target dependent.
MIPS, SH, POWER and PowerPC can be LE or BE depending on a configuration bit at boot.
AMD and intel x86 are always LE.
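Byte order only shows up when code looks at the individual bytes of a multi-byte value. A minimal sketch, assuming 8-bit bytes (so that uint8_t exists), of writing and reading a 32-bit value in a fixed byte order using shifts only, so the result is the same on LE and BE targets. The names store_le32, load_le32 and load_be32 are made up for this illustration.

```c
#include <stdint.h>

/* Write v into buf[0..3], least significant byte first (little-endian). */
void store_le32(uint8_t buf[4], uint32_t v)
{
    buf[0] = (uint8_t)(v & 0xFFu);
    buf[1] = (uint8_t)((v >> 8) & 0xFFu);
    buf[2] = (uint8_t)((v >> 16) & 0xFFu);
    buf[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Read four bytes as a little-endian 32-bit value. */
uint32_t load_le32(const uint8_t buf[4])
{
    return  (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Read the same four bytes as a big-endian 32-bit value. */
uint32_t load_be32(const uint8_t buf[4])
{
    return ((uint32_t)buf[0] << 24)
         | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)
         |  (uint32_t)buf[3];
}
```

Because the shifts work on values rather than on memory layout, none of these functions needs to know the byte order of the machine it runs on.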
 

Offline King123Topic starter

  • Contributor
  • Posts: 18
  • Country: in
Re: Are data type compiler-dependent or target dependent
« Reply #5 on: August 03, 2022, 11:05:17 am »
I am trying to understand What are difference between int, short and long in context of c standards?
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #6 on: August 03, 2022, 12:03:18 pm »
I am being confused with data type in c programming language.

My question:

Are data type compiler-dependent or target dependent?

Yes :)

And add processor dependent.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #7 on: August 03, 2022, 12:04:22 pm »
difference between int, short and long in context of c standards

char is always supposed to be 8bit
short, int, long are always bigger than 8bit

int and long can vary their data-size, when unsure, you can check with sizeof(x)

Code: [Select]
#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t, so the matching conversion is %zu, not %d */
    printf("the size of \"char\"      is %zu byte\n", sizeof(char));
    printf("the size of \"short\"     is %zu byte\n", sizeof(short));
    printf("the size of \"int\"       is %zu byte\n", sizeof(int));
    printf("the size of \"long\"      is %zu byte\n", sizeof(long));
    printf("the size of \"long long\" is %zu byte\n", sizeof(long long));

    return 0;
}
the data-size and endianness are the only target&processor-dependent and compiler-dependent differences

Code: [Select]
the size of "char"      is 1 byte
the size of "short"     is 2 byte
the size of "int"       is 4 byte
the size of "long"      is 4 byte
the size of "long long" is 8 byte
Gcc v4.1.2 on PowerPC-7550
« Last Edit: August 03, 2022, 12:06:17 pm by DiTBho »
 

Offline emece67

  • Frequent Contributor
  • **
  • !
  • Posts: 614
  • Country: 00
Re: Are data type compiler-dependent or target dependent
« Reply #8 on: August 03, 2022, 12:13:04 pm »
.
« Last Edit: August 19, 2022, 05:44:06 pm by emece67 »
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #10 on: August 03, 2022, 12:40:50 pm »
I am trying to understand What are difference between int, short and long in context of c standards?
The range of values they can represent.  If you include <limits.h>, the compiler and the C library will expose a set of constants:
  • SHRT_MIN and SHRT_MAX describing the minimum and maximum integer values a short can represent.
    SHRT_MIN will be -32767 or smaller (-32768 is most common), and SHRT_MAX will be 32767 or larger (32767 is most common)
  • INT_MIN and INT_MAX describing the minimum and maximum integer values an int can represent.
    INT_MIN will be -32767 or smaller (-2147483648 is most common), and INT_MAX will be 32767 or larger (2147483647 is most common)
  • LONG_MIN and LONG_MAX describing the minimum and maximum integer values a long can represent.
    LONG_MIN will be -2147483647 or smaller (-2147483648 is most common), and LONG_MAX will be 2147483647 or larger (2147483647 is most common)
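A minimal sketch that prints the constants listed above for whatever compiler and target it is built with. The function name print_integer_ranges is made up for this illustration; the macros are the standard ones from <limits.h>.

```c
#include <limits.h>
#include <stdio.h>

/* Print the ranges this particular compiler/target pair actually uses. */
void print_integer_ranges(void)
{
    /* SHRT_MIN and SHRT_MAX have type int after promotion, hence %d. */
    printf("short: %d .. %d\n", SHRT_MIN, SHRT_MAX);
    printf("int:   %d .. %d\n", INT_MIN, INT_MAX);
    printf("long:  %ld .. %ld\n", LONG_MIN, LONG_MAX);
}
```

Building the same file for two different targets (say, an 8-bit microcontroller and a 64-bit desktop) is a quick way to see which of the ranges change.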

The type char can be signed or unsigned; it too varies depending on the compiler and the target (C library).
The type size_t is some unsigned integer type that can handle any in-memory size; int range may not suffice (and does not on typical 64-bit architectures).

The main thing to remember is that when doing arithmetic or logic, any integer type with smaller range than an int, will be promoted to int.  This is a "quirk" in the C language.  To limit the range and precision of an expression, we use casts: (type)(expression).  For example, if we have unsigned char  a, b; then (a + b) yields an int, but (unsigned char)(a + b) yields an unsigned char.
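A small sketch of that promotion rule, assuming the usual case where unsigned char is 8 bits and int is wider. The function names are made up for this illustration.

```c
#include <stddef.h>

/* (a + b) is computed as int: both operands are promoted first. */
int sum_promoted(unsigned char a, unsigned char b)
{
    return a + b;
}

/* The cast narrows the int result back to unsigned char,
   i.e. reduces it modulo UCHAR_MAX + 1. */
unsigned char sum_narrowed(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* The size of the expression itself, as seen by sizeof. */
size_t size_of_sum(void)
{
    unsigned char a = 0, b = 0;
    return sizeof(a + b);   /* sizeof(int), not sizeof(unsigned char) */
}
```

With 8-bit chars, sum_promoted(200, 100) gives 300, while sum_narrowed(200, 100) gives 44 (300 modulo 256).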

Similarly, when <stdarg.h> variable argument lists are used, types smaller than int are promoted to int, float is promoted to double, and so on.
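A sketch of what this means for the callee: inside a variadic function, the arguments must be fetched with the promoted types, not the original ones. The function sum_pairs is a made-up helper that takes `count` pairs of (small integer, float).

```c
#include <stdarg.h>

/* Sum `count` pairs of (small integer, float) arguments. */
double sum_pairs(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        int    n = va_arg(ap, int);      /* char and short arrive as int */
        double f = va_arg(ap, double);   /* float arrives as double      */
        total += n + f;
    }
    va_end(ap);
    return total;
}
```

Writing va_arg(ap, char) or va_arg(ap, float) here would be undefined behaviour, because no argument of those types is ever passed through the variable part of the list.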

When you include <stdint.h> (or <inttypes.h>), additional types may be (and are in practice) exposed with very useful features:
  • intN_t and uintN_t with N being 8, 16, 32, and 64.
    These are signed (with two's complement representation) and unsigned integer types of exactly N bits with no padding bits.
    These are extremely useful for file formats and other data interchange.  You still need to consider byte order, since that varies from architecture to architecture (although the vast majority is "little endian" or "big endian"), but there are very simple ways to do that.
  • int_fastN_t and uint_fastN_t with N being 8, 16, 32, and 64.
    These are signed (with two's complement representation) and unsigned integer types of at least N bits, that provide "fastest" arithmetic and logic on a given architecture.
    For example, on some architectures 16-bit arithmetic requires extra machine instructions (masking out the extra bits).  On those, int_fast16_t might correspond to int32_t or int64_t for example, whichever yields faster arithmetic.
    These are extremely useful for internal variables, say for loops and such.  Instead of hoping the compiler will generate efficient code, by choosing a suitable N to match the range you expect, using these types the compiler will generate the fastest code it can.
  • intmax_t and uintmax_t, corresponding to the signed and unsigned integer types with the largest range of representable values the architecture supports.
  • intptr_t and uintptr_t, corresponding to signed and unsigned integer types that are compatible with pointers; a pointer converted to one of these and then back to its original pointer type will retain its value.
    These are useful when one needs to represent a pointer as an integer for some reason.  There is a lot of code that does (int)(pointer_expression) , but that is a BUG: the int type often does not have the range to represent all possible pointer values, so such code may work on some machines, but fail on others, on the exact same architecture, depending on the actual pointer values!  The pain and suffering this assumption alone has caused is immense: please do not let anyone do that.
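A short sketch of two of the items above: a pointer round trip through uintptr_t, and a look at what the "fast" types resolve to on the current target. It assumes the implementation provides uintptr_t (it is optional in the standard, but practically universal). The function names are made up for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Round-trip a pointer through uintptr_t; returns 1 if the value survives. */
int pointer_round_trip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* wide enough by definition, unlike int */
    void     *back = (void *)bits;
    return back == p;
}

/* Show which sizes the "fast" types have on this target. */
void print_fast_sizes(void)
{
    printf("int_fast8_t:  %zu bytes\n", sizeof(int_fast8_t));
    printf("int_fast16_t: %zu bytes\n", sizeof(int_fast16_t));
    printf("int_fast32_t: %zu bytes\n", sizeof(int_fast32_t));
    printf("int_fast64_t: %zu bytes\n", sizeof(int_fast64_t));
    /* <inttypes.h> supplies matching printf conversions such as PRIuMAX. */
    printf("largest uintmax_t: %" PRIuMAX "\n", UINTMAX_MAX);
}
```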

https://en.cppreference.com/w/cpp/language/types
C and C++ are two completely different languages.  Please, do not let the superficial similarities confuse people into thinking they are the same.  While a C++ compiler can compile most C code, it cannot compile all C standards compliant code.
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Are data type compiler-dependent or target dependent
« Reply #11 on: August 03, 2022, 12:58:04 pm »
https://en.cppreference.com/w/cpp/language/types
C and C++ are two completely different languages.  Please, do not let the superficial similarities confuse people into thinking they are the same.  While a C++ compiler can compile most C code, it cannot compile all C standards compliant code.

Well fuck me, guess I better dump all my code in the bitbucket and start over. ;)

(The table shows same as what you quoted, unless you mean to tell me they do in fact differ on this most basic of properties and I've missed something?)

Tim
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #12 on: August 03, 2022, 01:10:31 pm »
add processor dependent.

yup, weird example, but I am on my R18200 MIPS4 prototype right now, and I have some issue accessing the ASIC chip due to the way the CPU performs load/store operations.

There is a memory mapped ASIC chip, I need to access its registers with byte-granularity, but the hardware doesn't support such operations.
The CPU is 64bit register file, sizeof(long long) is the same as sizeof(long), and at the load/store side, the CPU only perform 64bit load/store accesses.

When you issue a load/store.byte, the CPU always accesses 64 bits in a single read cycle, and then only considers the lowest byte
e.g.
Code: [Select]
load data from address EA
return data = data bitwiseAnd 0x00.00.00.00.00.00.00.ff <------ only consider the lowest byte

Thus, to access a byte with odd address, you have to properly calculate both EA and the mask

e.g. load byte @ EA=0x8000.0002 <----- you cannot use this address as is because it would trap with a hw exception, bad address alignment
Code: [Select]
load {byte[0..7]} data from address (EA & 0xfffffff8) <----------- the address becomes 0x8000.0000
return data = { 0x00.00.00.00.00.00.00.byte[EA % 8] } <------------ it only considers the third byte, namely "byte2"
This can be done in assembly, by the C compiler, or in hardware if the CPU supports it.
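For illustration only, the same address and lane arithmetic written as plain C on ordinary integers. Two assumptions not taken from the real hardware: a 64-bit access needs the low three address bits cleared (mask 0xFFFFFFF8), and byte lane 0 is the most significant byte of the fetched word (big-endian lane numbering). The function names are made up.

```c
#include <stdint.h>

/* Aligned address the bus actually sees for a byte access at ea. */
uint32_t aligned_address(uint32_t ea)
{
    return ea & UINT32_C(0xFFFFFFF8);   /* clear the low three bits */
}

/* Pick byte lane (ea % 8) out of the 64-bit word fetched from the
   aligned address. Lane 0 is assumed to be the most significant byte. */
uint8_t extract_byte(uint64_t word, uint32_t ea)
{
    unsigned lane  = (unsigned)(ea % 8u);
    unsigned shift = 8u * (7u - lane);
    return (uint8_t)((word >> shift) & 0xFFu);
}
```

For EA=0x8000.0002 the aligned address is 0x8000.0000 and the selected lane is 2, i.e. the third byte of the word.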

From the high level, unless you directly play with a sensible ASIC chip, you don't notice the difference. For you, it's a "long"-datatype read from memory.


Other CPUs like ijvm do exactly the opposite. They have an 8-bit load/store unit, so when you access 64 bits they issue eight read cycles on the bus
e.g.
load data0 from address EA+0
load data1 from address EA+1
load data2 from address EA+2
load data3 from address EA+3
load data4 from address EA+4
load data5 from address EA+5
load data6 from address EA+6
load data7 from address EA+7
return data= { data0.data1.data2.data3.data4.data5.data6.data7 }

Again, from the high level, unless you directly play with a sensible ASIC chip, you don't notice the difference. For you, it's a "long"-datatype read from memory.


Endianness is also exposed, but only if you do low-level things.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #13 on: August 03, 2022, 01:21:07 pm »
C and C++ are two completely different languages.  Please, do not let the superficial similarities confuse people into thinking they are the same.  While a C++ compiler can compile most C code, it cannot compile all C standards compliant code.

Yup
Code: [Select]
c       is 0xdeadbeaf
c++     is 0xdeadcafe
C ^ C++ is 0xdead8aae

 ;D
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #14 on: August 03, 2022, 02:24:07 pm »
(The table shows same as what you quoted, unless you mean to tell me they do in fact differ on this most basic of properties and I've missed something?)
No, that's not what I meant.

I meant that using a C++ reference to show something about C gives new programmers an untenable intuition: that the two are the same, or that one is a superset/subset of the other.

You used the link because in this detail, C and C++ use the same rules.  You didn't mention that, just posted the link.  So, I did the "That dog äläht's to which the kalig calaht's" thing, and barked, because therein lies a hidden danger that I've seen bit others in the butt.

Maybe it does not matter to you, but it for sure matters to me, especially because I like to do stuff in a mixed freestanding C and C++ environment.  Why?  Try explaining such an environment and quirks to a coder who thinks C is a subset of C++, and is angry that nothing works like they expect it to.  I have tried, and I've found it is useless, unless I first disabuse them from their incorrect notions (including C being a subset of C++; just consider the struct namespace, which is separate in C but the same as type namespace in C++).  Or when you port such freestanding code, and instead of understanding the boundary between standards and implementations, the author just made a full set of unstated assumptions that are only valid with that specific version of the compiler and target –– possibly because they checked it on their compiler, and it seemed to produce the wanted results.  Holy hell is that annoying and horrible to work with; basically unsalvageable mess.  I just want to save others from that sort of pain, OK?
 
The following users thanked this post: newbrain, MK14

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
 
The following users thanked this post: MK14

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #16 on: August 03, 2022, 03:14:00 pm »
Fine, use this link:
https://en.cppreference.com/w/c/language/arithmetic_types
No, that's not it either.  I'd prefer you wrote something along the lines of "See C++ Reference for example.  C and C++ have the same definitions for integer types."

Personally, I do not trust cppreference.com at all when it comes to C, by the way.  (Edited to note: It is a Wiki directed towards C++ developers, after all.)  Just like I don't trust microsoft.com when it comes to Mac OS, either.  In particular, any "C reference" that excludes POSIX C is worthless shite in my opinion.  But you do you.
« Last Edit: August 03, 2022, 03:36:06 pm by Nominal Animal »
 

Offline eugene

  • Frequent Contributor
  • **
  • Posts: 497
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #17 on: August 03, 2022, 03:39:26 pm »
I am trying to understand What are difference between int, short and long in context of c standards?

That question has been answered and tellurium mentioned <stdint.h> in reply #2. I just want to emphasize that if what you are really asking is how to get an int of a specific size, then don't try to second guess the compiler. Just use types that are in <stdint.h>: int8_t when you want signed 8 bit integer, uint32_t when you want unsigned 32 bit, etc.
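A minimal sketch of what that buys you, assuming a target that provides the exact-width types used: the behaviour below is the same whatever the native width of int or long is. The function names are made up for this illustration.

```c
#include <stdint.h>

/* An 8-bit sequence counter: wraps from 255 back to 0 on every target. */
uint8_t next_sequence_number(uint8_t n)
{
    return (uint8_t)(n + 1u);
}

/* Clamp a 32-bit value into the 16-bit signed range. */
int16_t saturate_i16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}
```

Writing the same functions with unsigned char, short and long would work on many machines, but the wrap point and the clamping limits would then depend on the target.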
90% of quoted statistics are fictional
 
The following users thanked this post: Siwastaja, Nominal Animal

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #18 on: August 03, 2022, 05:48:11 pm »
In case this is OP's homework, let's drown them with useful information.  >:D

The following terms will help when one is looking for details on this stuff:
  • architecture – the instruction set and/or processor family used on the target

  • ABI – the binary interface provided by the compiler and the base libraries

  • Common C data models:
    • ILP32: int, long, and pointers are all 32 bits in size
    • LLP64: int and long are 32-bit, but long long and pointers are 64-bit values
    • LP64: int is 32-bit, but long and pointers are 64-bit
      (plus some rare others, like ILP64 and SILP64)

  • The primitive used for atomic access: CAS (compare and exchange) or LL/SC (load-link, store-conditional)
    CAS is based on an instruction that will update target memory atomically if it contains a specific value, and will fail otherwise.  LL/SC implements a load instruction that pairs with a store to the same address, with the store only succeeding if nothing modified that memory in between.

  • Endianness: when storing multi-byte values, in which order are the bytes stored in.
    Note that bits have only one order within a byte, based on their numerical value: bit n has numerical value 2^n, with the least significant (integer) bit being bit 0.  In an N-bit unit, the most significant bit is bit N-1.
    When transmitted using a serial connection like SPI or I2C, the bits can be transmitted either the least significant or the most significant bit first.
    Finally, some documentation (like old IBM docs) labels bits starting at 0 at the most significant bit.  This can be problematic when trying to map the bit to a specific value (of a register containing bits labelled using such a scheme).

  • API or application programming interface – the programming interface that the base libraries and/or the kernel provides in a specific programming language like C.



For example, Windows running on 64-bit Intel and AMD x86-64 processors uses the Windows ABI and API, which has an LLP64 data model, is little-endian, and the base libraries provide a subset of C11, and even C code is intended to be compiled using a C++ compiler (as Microsoft does not provide a C compiler, only a C++ compiler that can compile most C code).

In comparison, Linux running on that same hardware uses the System V ABI, providing almost all of POSIX.1-2008 C, and some additional Linux- and GNU-specific C extensions.  It has an LP64 data model, is little-endian.

However, Linux also runs on a lot of other architectures (processors and instruction sets), on both 32-bit and 64-bit ones.  The 32-bit ones all have an ILP32 data model, and the 64-bit ones an LP64 data model.  (Because of this, in Linux, long and unsigned long are the same size as pointers.)
Depending on the architecture, byte order is either little-endian or big-endian.  (Some, like many ARM cores, can even switch between the two at run time, but I do not believe it is supported for userspace programs in Linux.)
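A rough sketch for finding out which of these data models a given compiler and target use, assuming 8-bit bytes so that sizes in bytes map directly onto the 32/64-bit labels. The function name data_model_name is made up for this illustration.

```c
#include <stddef.h>

/* Classify the data model from the sizes of int, long, long long
   and pointers. Returns "other" for anything not listed. */
const char *data_model_name(void)
{
    const size_t i  = sizeof(int);
    const size_t l  = sizeof(long);
    const size_t ll = sizeof(long long);
    const size_t p  = sizeof(void *);

    if (i == 4 && l == 4 && p == 4)
        return "ILP32";
    if (i == 4 && l == 4 && ll == 8 && p == 8)
        return "LLP64";
    if (i == 4 && l == 8 && p == 8)
        return "LP64";
    if (i == 8 && l == 8 && p == 8)
        return "ILP64";
    return "other";
}
```

On 64-bit Linux this would be expected to report LP64, and on 64-bit Windows LLP64, in line with the descriptions above.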



The <stdint.h> header is actually provided by the compiler, in the sense that it is available even for freestanding code (with no standard C library features available).  You could say it provides much saner, easily predictable types for one to use in a reliable, portable manner.



To solve byte order issues, one can use either the __BYTE_ORDER macro (which matches either __BIG_ENDIAN or __LITTLE_ENDIAN, if defined; see Pre-defined Compiler Macros for further info) at compile time, or C code similar to the following at run time:
Code: [Select]
#include <stdint.h>

typedef union {
    uint64_t       u64[1];
    int64_t        s64[1];
    uint32_t       u32[2];
    int32_t        s32[2];
    uint16_t       u16[4];
    int16_t        s16[4];
    uint8_t        u8[8];
    int8_t         s8[8];
    unsigned char  uc[8];
    signed char    sc[8];
    char           c[8];
    double         d[1];    /* Assumes IEEE 754 Binary64 - verify at run time */
    float          f[2];    /* Assumes IEEE 754 Binary32 - verify at run time */
} word64;

typedef union {
    uint32_t       u32[1];
    int32_t        s32[1];
    uint16_t       u16[2];
    int16_t        s16[2];
    uint8_t        u8[4];
    int8_t         s8[4];
    unsigned char  uc[4];
    signed char    sc[4];
    char           c[4];
    float          f[1];    /* Assumes IEEE 754 Binary32 - verify at run time */
} word32;

static inline word64 byteorder64(word64 w, int_fast8_t order)
{
    if (order & 1)
        w.u64[0] = ((w.u64[0] >>  8) & UINT64_C(0x00FF00FF00FF00FF))
                 | ((w.u64[0] & UINT64_C(0x00FF00FF00FF00FF)) <<  8);
    if (order & 2)
        w.u64[0] = ((w.u64[0] >> 16) & UINT64_C(0x0000FFFF0000FFFF))
                 | ((w.u64[0] & UINT64_C(0x0000FFFF0000FFFF)) << 16);
    if (order & 4)
        w.u64[0] = ((w.u64[0] >> 32) & UINT64_C(0x00000000FFFFFFFF))
                 | ((w.u64[0] & UINT64_C(0x00000000FFFFFFFF)) << 32);
    return w;
}

static inline word32 byteorder32(word32 w, int_fast8_t order)
{
    if (order & 1)
        w.u32[0] = ((w.u32[0] >>  8) & UINT32_C(0x00FF00FF))
                 | ((w.u32[0] & UINT32_C(0x00FF00FF)) <<  8);
    if (order & 2)
        w.u32[0] = ((w.u32[0] >> 16) & UINT32_C(0x0000FFFF))
                 | ((w.u32[0] & UINT32_C(0x0000FFFF)) << 16);
    return w;
}
Note that the inline above is irrelevant to the C compiler; static alone suffices.  I just use the static inline as an indicator for us humans that these are accessor or helper functions, typically defined in the header file, and not just ordinary "internal" functions (that I declare static only).

The basic idea here is that type punning via a union is described in a footnote in the ISO C standards (added in C99 TC3 and kept since) as a way to reinterpret the storage of a variable, so we use that to reinterpret the storage representation of the known fixed-size integer types, as well as the two floating-point types that usually match IEEE 754 binary32 (single precision, float) and binary64 (double precision, double) types.

For 32 bit values, there are only four possible byte orders: native (0), swapped (3), or the two intermediate ones (1 and 2) that were only used in certain old systems.  If you check ((word32){ .uc = { 0x01, 0x02, 0x03, 0x04 }}).u32[0], it will have value 0x04030201 on little-endian architectures, and 0x01020304 on big-endian architectures.  The two other possibilities are 0x03040102 and 0x02010403, but they are nowadays extremely, extremely rare.

For 64 bit values, there are eight possible byte orders, with native (0) and swapped (7) being the most common ones.
If you check ((word64){ .uc = { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08 }}).u64[0], it will have value UINT64_C(0x0807060504030201) on little-endian architectures, and UINT64_C(0x0102030405060708) on big-endian architectures.  The others are exceedingly rare.

The reason for using the UINTN_C(value) macros is that since the value might exceed the range int can represent (see my previous post above), <stdint.h> provides macros that add a suffix (U, L, UL, LL, or ULL) if needed to tell the compiler the base type of the constant.  They can be used even in preprocessor #if statements.

If you have a peer (or client or server) you are communicating with, and it uses its native byte order but the types used in the above word32 and word64 unions, all you need to do is to agree with a prototype numeric value –– I recommend one double, one float, one 64-bit signed and negative integer, and one 32-bit signed or unsigned integer, for completeness; these take a total of 24 bytes ––, and then do a byte order discovery loop.  For example:
Code: [Select]
int_fast8_t find_byteorder64(word64 w, word64 expected)
{
    for (int_fast8_t  order = 0; order < 8; order++)
        if ((byteorder64(w, order)).u64[0] == expected.u64[0])
            return order;
    return -1;
}

int_fast8_t find_byteorder32(word32 w, word32 expected)
{
    for (int_fast8_t  order = 0; order < 4; order++)
        if ((byteorder32(w, order)).u32[0] == expected.u32[0])
            return order;
    return -1;
}
These will return the order needed for byteorderN() to convert the byte order to current native byte order, or -1 if no byte order conversion of the given word w matches the expected word expected.



You can use a heuristic check to verify that the target architecture has the same byte order for floating-point and integer types, and that the floating-point types match the assumption above, via for example
Code: [Select]
    if (((word32){ .f[0] = 0.0498918667435646f }).u32[0] != UINT32_C(0x3D4C5B6A))
        fprintf(stderr, "Warning: Invalid 32-bit byte order, or 'float' not IEEE 754-2008 Binary32.\n");
    if (((word64){ .d[0] = -2.125982314494425 }).u64[0] != UINT64_C(0xc001020304050607))
        fprintf(stderr, "Warning: Invalid 64-bit byte order, or 'double' not IEEE 754-2008 Binary64.\n");
Most C compilers, when optimizing (-Og or -O2) this code, generate no machine code because they can determine at compile time that the code cannot ever issue a warning, given the target properties.  Of course, instead of printing a warning to standard error, you can make this an assert() (#include <assert.h>).
 
The following users thanked this post: Siwastaja

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #19 on: August 03, 2022, 06:42:09 pm »
The primitive used for atomic access

CAS, CAS2, TAS, TAS2 are usually read-modify-write, looooooooooooooooong operation on the bus.
I used them on 68020 and 68030.

The CAS instruction compares the value in a memory location with the value in a data-register, and copies a second data register into the memory location if the compared values are equal. This provides a means of updating system counters, history information, and globally shared pointers. The instruction uses an indivisible read-modify-write cycle; after CAS reads the memory location, no other instruction can change that location before CAS has written the new value.
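In portable C11, the same compare-and-swap idea is exposed through <stdatomic.h>. A minimal sketch of a retry loop that adds to a shared counter; the function name counter_add is made up for this illustration, and which machine instructions the compiler emits for it (a CAS-style instruction or an LL/SC loop) depends on the target.

```c
#include <stdatomic.h>

/* Add delta to a shared counter using a compare-and-swap retry loop.
   Returns the value the counter holds after this update. */
unsigned long counter_add(_Atomic unsigned long *counter, unsigned long delta)
{
    unsigned long seen = atomic_load(counter);

    /* The exchange succeeds only if *counter still equals `seen`.
       On failure, `seen` is refreshed with the current value and
       the loop tries again with that. */
    while (!atomic_compare_exchange_weak(counter, &seen, seen + delta))
        ;
    return seen + delta;
}
```

The "weak" variant is allowed to fail spuriously, which is why it is always used inside a loop like this one.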

In a multiprocessor environment, the other processors must wait until the CAS instruction completes before accessing a global pointer, this doesn't perform well because it locks the bus.

read-modify-write is not used in pure RISC design (like MIPS) because for such special and lengthy bus operations the load / store requires a special stage to keep the pipeline busy which would be a big problem with superscalar machines like PowerPC where you would need special pipeline instructions { isync sync and eieio } to have minimal guarantees.

So, you will find CAS/TAS on neither PowerPC nor MIPS

LL/SC is better, and simpler to be implemented in a RISC design, and it's also better with multiprocessors because it doesn't lock the bus.
« Last Edit: August 03, 2022, 06:45:57 pm by DiTBho »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15794
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #20 on: August 03, 2022, 07:21:41 pm »
difference between int, short and long in context of c standards

char is always supposed to be 8bit

Well, nope. It's supposed to be at least 8-bit.

short, int, long are always bigger than 8bit

Well. Not necessarily. They are supposed to be at least 16-bit for short and int, and at least 32-bit for long.
All that being implementation-defined per the standard.
Which means that on some implementation, char, short and int could all be the same, like 32-bit, or even wider, and it would still be compliant with the standard.

The standard gives minimum ranges for all those types. They are not guaranteed to have a different width.
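A sketch that writes those guarantees down as C11 static assertions, next to one project-specific assumption that the standard does not guarantee. The function name bits_in_int is made up for this illustration.

```c
#include <limits.h>

/* Guaranteed on every conforming implementation. */
_Static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
_Static_assert(SHRT_MAX >= 32767, "short covers at least 16 bits");
_Static_assert(INT_MAX >= 32767, "int covers at least 16 bits");
_Static_assert(LONG_MAX >= 2147483647L, "long covers at least 32 bits");

/* NOT guaranteed: an assumption this particular sketch chooses to make.
   On a target with wider chars, the build stops here with this message. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* Storage width of int in bits, including any padding bits. */
unsigned bits_in_int(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}
```

Stating an assumption this way turns a silent portability bug into a compile-time error on the first target where it does not hold.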
 
The following users thanked this post: newbrain, DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #21 on: August 03, 2022, 09:46:42 pm »
Well, nope. It's supposed to be at least 8-bit.

yeah, will a char always-always-always have 8 bits?

Technically not always, most often *de-facto* it will, and that's the big shit with C since on the vast majority of platforms out there (including the Linux kernel and u-boot), "char" is always assumed to be of the same size of a byte even if technically the C standard says that it's supposed to be at least 8-bit.

Why? Well, because "8 bit" on CPUs(1), MPUs and DSPs made in the 90s, 2000s, 2010s and 2020s (earlier CPUs were a bit different) is *de-facto* the smallest addressable amount of memory(2), and so *it should be* a char in C, also because sizeof(char) is supposed to always return 1.

What I usually tend to assume
Code: [Select]
typedef unsigned char uint8_t;
(and it must be always verified)

And you know, it must always be verified because 99% of the time it's true on the vast majority of platforms out there, but then someday you find that "char" on Ti320 is 16bit and, worse still, sizeof(char) should return 1, but on some weird C-Ti320 compiler I have personally seen that sizeof(char) returns 2, which is "intuitively" correct, but it's completely wrong according to the C standard, because sizeof(char) MUST always return 1 even if a char occupies 2 octets (16 bits).

if char is 16bit, sizeof(char)=1
if char is 8bit, sizeof(char)=1

WTF?  :palm:

That's pretty shit with the C language definition, which honestly is nothing but insane.

And, worse still, if you are interested in finding out just exactly how many bits of space your data types consume on your system, you can use the following line of code:

Code: [Select]
sizeof(type) * CHAR_BIT

So, that's how you can verify how many bits char is on a system

Code: [Select]
printf("The number of bits a 'char' has on my system: %zu\n", sizeof(char) * CHAR_BIT);
(taken from the GNU C Library Reference Manual)

crazy, but ... that's it  :-//


(1) CPUs, where "char" is 8 bit
8080, 8085, 8088, z80, z8000, etc
8051, 80c390, 80C400, etc
68hc11 and 68xx
m68xxx & Coldfire
m88xxx
PowerPC, PPC60x, 62x, 7xx, 74xx, e500, 40x, 440, 460, ...
POWER9, POWER10, POWER11
MIPS1, MIPS2, MIPS3, MIPS4, MIPS32, MIPS64
x86
SH1, SH2, SH3, SH4
HPPA1 and HPPA2
ARM*

(2) ok, on my R18200 there is a problem with the load / store unit, and it is not physically possible to output a single-byte bus-cycle: if you want to access a byte, the cpu physically emits a uint64 bus-cycle. But the CPU is somehow still able to handle 8-bit granularity thanks to special instructions that a clever C compiler can use, so there is no reason to define char as 8 bytes in size.

Corollary
it's my personal opinion that defining a hardware architecture where char is more than 8 bits, with no other way to access 8 bits, means the design is crappy.

In fact the Ti320, where char is 16 bit, sucks.
« Last Edit: August 03, 2022, 10:03:17 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15794
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #22 on: August 03, 2022, 09:55:05 pm »
This is why stdint was introduced in C99.
Note that exact-width types are only optional, but they are supported on most platforms these days except possibly for the very odd ones.

While I (and many others) wholeheartedly suggest using stdint's when you need to have some control over the width of integers, since exact-width types are only optional, if you use them, then your code is not guaranteed to be strictly portable anymore. Yeah, just the way it is.

The odd targets on which char may not be 8-bit and on which none, or only some of the exact-width integers are defined, are usually DSPs these days.

All this is not "insane", it all comes from the fact that the C standard has always had the goal of supporting a very wide range of targets while making it possible for compilers to produce efficient code.

 
The following users thanked this post: DiTBho

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4349
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #23 on: August 04, 2022, 08:25:16 am »
Quote
Are data type compiler-dependent or target dependent?
Compiler dependent.

But Compiler authors DO pay attention to the underlying CPU architecture...
You could theoretically write an ARM C compiler that made "int" 16 bits (say, if you were making a misguided attempt to address the 8bit microcontroller market), but AFAIK, no one has done that, and anyone who tried would probably be laughed at.
"char" is famously ambiguous as to whether it is signed or not (and differs between compilers).
And breaking the 4Gbyte memory addressability limit introduced all sorts of "issues" when people were faced with pointers that maybe shouldn't be the same size as either "long" or "int."
 

Offline HwAoRrDk

  • Super Contributor
  • ***
  • Posts: 1607
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #24 on: August 04, 2022, 08:57:41 am »
Personally, I do not trust cppreference.com at all when it comes to C, by the way.  (Edited to note: It is a Wiki directed towards C++ developers, after all.)  Just like I don't trust microsoft.com when it comes to Mac OS, either.  In particular, any "C reference" that excludes POSIX C is worthless shite in my opinion.  But you do you.

I frequently refer to the C language information on cppreference.com. Can't say I recall the information provided there ever leading me astray or being flat-out wrong.

That it doesn't cover POSIX stuff is even a boon for me when doing embedded stuff, because many embedded C platforms don't have POSIX libraries.

What resource would you suggest instead? Straight from the horse's mouth, poring over insipid C standard PDF documents?
 
The following users thanked this post: T3sl4co1l, MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #25 on: August 04, 2022, 09:01:05 am »
This is why stdint was introduced in C99.


All this is not "insane", it all comes from the fact that the C standard has always had the goal of supporting a very wide range of targets while making it possible for compilers to produce efficient code.

Why do you always have to defend the mistakes of the C compiler?

Byte must be 8 bit to have a common and universally accepted reference, like 1000mm is universally accepted as 1m, otherwise you need to introduce other constant which only makes the c compiler messed up with more crap and people more confused.

To really support a very wide range of targets aiming for true portability (which is utopia, anyway), char must be always equal to byte and sizeof(type) must be measured in 8bit modulo, and data type must be expressed with bit and sign.

What the frog is the meaning of sizeof(char)=1 if char is 16bit???

It's like saying that 1m is measuring 2000mm if you are around the north pole because measuring permafrost cannot be measured in 1000mm modulo, and you need a "adapting constant" to adjust things

1000mm_northpole = 1000mm_measured_everywhereelse * magic_adapting_constant

printf("The number of bits a 'char' has on my system: %zu\n", sizeof(char) * CHAR_BIT);

(taken from the GNU C Library Reference Manual)

Seriously, what the fuck is that crap? Call things with their names, that is the biggest bullshit ever seen in computer science, in fact it has the only benefit of causing nothing but tons of stupid bugs

Therefore c must ban char, short, int, long and long long, and only accept uint8, uint16, uint32, uint64 and their signed versions

That also must be applied to pointers

All other things made with c is pure garbage
« Last Edit: August 04, 2022, 09:05:45 am by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #26 on: August 04, 2022, 09:54:29 am »
Why do you always have to defend the mistakes of the C compiler?
Hey, SiliconWizard wasn't defending, only explaining why.
Besides, they weren't mistakes back when C was first designed and later standardized by ISO and ANSI.

Remember, there were both PDP and DECsystems with 18-bit/36-bit words back then, and 8-bit chars weren't universally accepted yet.  That is the reason C calls its smallest integer base type char, and not byte.

(And sizeof expression and sizeof (type) return the size of the expression or type in chars, not bytes, with CHAR_BIT bits per char.  So, sizeof (short) * CHAR_BIT does return the effective number of bits in short.)

It is annoying now, yes, most definitely; but not a mistake.

In Linux, for example, you'll see a preprocessor macro __BITS_PER_LONG used rather extensively.  Because Linux uses either an ILP32 (with __BITS_PER_LONG=32) or LP64 (with __BITS_PER_LONG=64) data model, unsigned long is the optimum base type for bit arrays.  Each limb is a single unsigned long, and contains __BITS_PER_LONG bits.  This is true on all architectures Linux runs on.

In a mixed freestanding C/C++ environment as is typically used when programming microcontrollers, one can do something like
Code: [Select]
#include <stdint.h>
#include <limits.h>

#if ULONG_MAX == UINT32_MAX
// This architecture has 32-bit 'unsigned long'
#elif ULONG_MAX == UINT64_MAX
// This architecture has 64-bit 'unsigned long'
#else
// This architecture has an odd-sized 'unsigned long'; perhaps #error ?
#endif
and similarly with UCHAR_MAX, USHRT_MAX, UINT_MAX, and ULLONG_MAX.  You can even check CHAR_BIT.

My point is that while the C standard allows some really odd things, the fact that most of those odd things are exceedingly rare nowadays means we can use a different set of practical requirements.  This is also why I always say practice trumps theory; that what the C standard says is useful, but not the "law of the land": that which is practical in real life trumps the C (and POSIX C) standards.

But, instead of just making those assumptions silently, we should codify them in a header file explicitly describing and testing for them like above, and then just include that header in our various projects –– make it requirements.h or base-assumptions.h or similar, so it is clear for other human developers too.  (I do like the way GNU C library requires one to define macros like _GNU_SOURCE, _DEFAULT_SOURCE, _POSIX_C_SOURCE, and so on, to expose the interfaces.)

That way, you get exactly what you demand a sane C implementation should nowadays provide (and I don't really disagree), and if someone tries it on a strange architecture, instead of getting odd results and bugs, they'll get a warning/error at compile time that the target is a strange architecture.

Thing is, on those strange architectures (even on DSPs), there are usually compiler options that can be used to trade generated machine code efficiency for a more typical C environment.  On such, one can use the flags to compile the original code, then refactor the code to this strange architecture (typically a DSP with very specific/unique quirks in C), and verify the new code by comparing unit test results to the original (but possibly horribly slow) code.
In particular, on these architectures, you would usually not use the exact-width intN_t/uintN_t types, but either the base types, custom types provided by that target and C compiler, or int_leastN_t/uint_leastN_t/int_fastN_t/uint_fastN_t types.

So, it's really a win-win for us developers, and like SiliconWizard says, it does let C compilers produce more efficient code.

Sure, it is a bit annoying, especially in that us developers need to find the way to express ourselves in C efficiently, which is not always logical (because of the concepts like "char" instead of "byte"); and we really do need to explicitly express the target assumptions (like in the <stdint.h>/<limits.h> check above –– remember, they're always available, even in a freestanding environment; you can consider them to be provided by the C compiler and not the standard C library implementation per se).
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15794
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #27 on: August 04, 2022, 07:15:52 pm »
As Nominal (who kindly took the time for this elaborate answer) said.

If you really want a language more robust and with much fewer quirks, just use Ada. I don't even understand, given the statements you (@DiTBho) make on various programming topics, why you even bother with C. It clearly doesn't match your expectations/requirements.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #28 on: August 04, 2022, 11:05:27 pm »
If you really want a language more robust and with much fewer quirks, just use Ada.

I do use Ada! What is the point now? I think I am the only person in the world supporting Gnat on HPPA and MIPS4; it took me years to prepare a valid Gnat compiler

Do you know how difficult it is to prepare a Gnat compiler for MIPS4? That's why I have to use C.

The same applies to GoLang: porting llvm to HPPA is not a piece of cake. llvm doesn't work on HPPA because it has no support and without llvm you can forget Golang.
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #29 on: August 04, 2022, 11:06:15 pm »
I don't even understand, given the statements you (@DiTBho) make on various programming topics, why you even bother with C. It clearly doesn't match your expectations/requirements.

Because things like sizeof(x) are really stupid in the way they have been implemented, and the annoying and frustrating part is that it would be extremely easy for competent people to fix them, but people argue that it's ok.

Why don't people fix it once and forever in, say, a "C-2022" revision instead of introducing more stupid workarounds? I cannot believe "because otherwise it won't compile for PDP/8"

For the record, I have designed my own C-like compiler targeting MIPS4-only and

sizeof(type) has been renamed byte_sizeof(type), which always returns things measured in byte
char, short, int, long, long long are banned so you cannot use them
It doesn't need any external header to *redefine* the basic data-types
char8_t (always unsigned, and always 8bit, it's for ASCII stuff)
char16_t (always unsigned, and always 16bit, it's for unicode stuff)
uint8, uint16, uint32, uint64,
sint8, sint16, sint32, sint64,
fp32, fp64, fx1616, fx2408,
cplxfp32, cplxfp64, cplxfx1616, cplxfx2408, (complex numbers)
p_uint8, p_uint16, p_uint32, p_uint64,  (p_ means pointer)
p_sint8, p_sint16, p_sint32, p_sint64,
p_fp32, p_fp64, p_fx1616, p_fx2408,
p_cplxfp32, p_cplxfp64, p_cplxfx1616, p_cplxfx2408,
p_this, p_that (like void*)
p_char8
p_char16
string8 (this is like p_char8, but the first cell stores the length)
string16 (this is like p_char16, but the first cell stores the length)

There is also boolean, which is a true type with its logic operators
(p_boolean is its pointer)

All of these are built-in, you don't need any (perpetually wrong, bugged) header.

I am so frustrated with all the shit done by GNU with their glibc headers, always broken, always with problems because the last hacker decided to change something; supporting them on Gentoo is more frustrating than thinking you can cool hell, and this because people continuously modify those bloody headers and you have no more compiling stuff, or worse still, broken stuff.

Is it reasonable? I don't think so! And here, see my simple solution:

my MIPS4 prototype doesn't implement 16 bit operations (some instructions are missing in hardware), and Gcc doesn't fully support a MIPS4 cpu like R18K (which doesn't officially exist anyway; the last one was R16K, and the last supported was R12K)

So, I applied these nazy rules:

uint16, sint16, p_uint16, p_sint16 are NOT defined, and there is no way to define them, so the user cannot mess up anything!

Do you see how coherent, simple, elegant it is?

You have a piece of C code where you see "uint16_t ", the compiler won't accept it and returns

"sorry, this target doesn't support 16 bit datatype"

Which is super-clear, simple, and not bug-prone.

Applied to TI320, ... "char" is banned, the compiler doesn't accept it, you have to use "char8" or "char16", and when you try to define "char8 something"

"sorry, this target doesn't support 8 bit datatype"

I have these operators

bit_sizeof(type), returns the size of a type in bits, e.g. bit_sizeof(uint8) returns 8
byte_sizeof(type), returns the size of a type in bytes (byte=8bit), e.g. byte_sizeof(uint8) returns 1
typeof(type), returns the type (grabbed from the typedef-space), that's super useful for polymorphic code
Code: [Select]
switch (typeof(type))
{
     case fx2408:
          ...
          break;
     case fx1616:
          ...
          break;
     ...
}

case can be of any type, you can compare strings, fp numbers, etc

logicalExOr    ^^ is not defined by C, and again you have to provide a header with this definition
Code: [Select]
/* arguments are parenthesized so that expressions such as (a & b) expand safely */
#define logicalExOr(A, B)     (!(A) != !(B)) // new
#define logicalExNOr(A, B)    (!(A) == !(B)) // new

My compiler comes with a built-in logicalExOr operator, which ONLY works on boolean stuff.
If you try to apply it to say "uint8" it will trigger an error.

Anyway, I cannot call my compiler "C", it's not C-compliant in several aspects, but who cares? Gcc can compile my compiler and it works on HPPA, and I am able to (cross)compile on an HPPA workstation a hacked version of XINU(1) for a MIPS4 prototype.

(the generated machine-code sucks about optimization, I know ... , but) next step, re-targeting for 68K


- in Conclusion -
I hope someone with more skills and attributes than me will one day take all the bullshit out of C and fix it right once and for all without making C too big like C++


(1) Written in C89, rewritten in "myC". It took a while because I also cleaned it up a bit, but it was not a difficult task.
« Last Edit: August 04, 2022, 11:20:05 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #30 on: August 04, 2022, 11:29:32 pm »
This is why stdint was introduced in C99.


All this is not "insane", it all comes from the fact that the C standard has always had the goal of supporting a very wide range of targets while making it possible for compilers to produce efficient code.

Why do you always have to defend the mistakes of the C compiler?

Language definition, not compiler. And why "mistakes"?

Quote
Byte must be 8 bit to have a common and universally accepted reference, like 1000mm is universally accepted as 1m, otherwise you need to introduce other constant which only makes the c compiler messed up with more crap and people more confused.

You may be thinking of "octet".

"Byte" is usually 8 bits these days, but not universally so.

Quote
To really support a very wide range of targets aiming for true portability (which is utopia, anyway), char must be always equal to byte and sizeof(type) must be measured in 8bit modulo, and data type must be expressed with bit and sign.

And yet C is the most portable efficient language we have, and it doesn't do that.

Quote
What the frog is the meaning of sizeof(char)=1 if char is 16bit???

Every object in C is an exact multiple of char in size and every positive integer is a possible object size.

If sizeof(char) was 2 then that would imply you could have something with size 3. Or 1. But you can't.  Char is the smallest addressable unit, and the measure of all things. It is 1.

Quote
All other things made with c is pure garbage

Some of us like it and find it fits our purposes. I'm sorry it doesn't meet your needs.
 
The following users thanked this post: newbrain, MK14

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1801
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #31 on: August 05, 2022, 09:13:51 am »
Quote
You have a piece of C code where you see "uint16_t ", the compiler won't accept it and returns

"sorry, this target doesn't support 16 bit datatype"

Which is super-clear, simple, and not bug-prone.
Oh, but the C standard already defines exact sized types as optional - C11, 7.20.1.1 Exact-width integer types, §3 (there since C99).
It only says that:
Quote
These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.

C23 slightly amended this, removing the 8-16-32-64 size reference (and the two's complement mention, as it's the mandatory signed integer representation):
Quote
If an implementation provides standard or extended integer types with a particular width and no padding bits, it shall define the corresponding typedef names

to exactly the same effect, but in a more general way.
CHAR_BIT is still at least 8, but can be more.

You can still have (u)int_least16_t (which will still be 32 bits) though.

So you reinvented an already well-thought-out wheel.  :-//

Quote
p_uint8, p_uint16, p_uint32, p_uint64,  (p_ means pointer)
Ok, I'll bite: what if I need a pointer to pointer? p_p_uint8?
Are these predefined typedefs or lexing/parsing tricks (i.e. the lexer will blindly consume 'p_' prefixes tokenizing them as '*')?
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: Siwastaja, MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #32 on: August 05, 2022, 10:21:39 am »
what if I need a pointer to pointer? p_p_uint8?

casting: banned
pointer to pointer: banned
pointer to function: allowed
custom typedef: allowed, but pointer to pointer-typedef: banned

Are these predefined typedefs or lexing/parsing tricks

defined at the parsing level, inserted into the "typedef world", so they are really built-in typedef.

i.e. the lexer will blindly consume 'p_' prefixes tokenizing them as '*')?

"*" is only allowed (at the parsing level) for multiplications, and only if the target has a multiply-unit

otherwise, nazy-rule out:

"sorry, this hardware doesn't have any hardware multiply, you have to use softmul"
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1801
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #33 on: August 05, 2022, 10:50:32 am »
...(The Horror! The Horror!)...
Ah well, some say there's pleasure in pain - who am I to judge other people's perversions...still, I cannot escape a feeling of morbid fascination.

How do you cope with, say, arrays of strings (which will decay to pointers to pointers in most contexts, first of all as function parameters)?
Do you also forego all the standard library functions that take ** arguments (e.g. strtod())?
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #34 on: August 05, 2022, 12:28:44 pm »
How do you cope with, say, arrays of strings

String8 and string16 are not pointer types; they are class0 types (basic types), therefore an array of them is allowed as a class1 type (pointer type).

« Last Edit: August 05, 2022, 01:02:57 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #35 on: August 05, 2022, 12:40:47 pm »
If sizeof(char) was 2 then that would imply you could have something with size 3. Or 1. But you can't.  Char is the smallest addressable unit, and the measure of all things. It is 1.

Motorola 56000 is a very old and weird DSP. It never really had a C compiler until the architecture evolved into 56300, but there was an attempt to support 56000 during Gcc v2.95 era.

And you guess what? Operand sizes are defined as follows:
  • byte is 8 bits long
  • short word is 16 bits long
  • word is 24 bits long
  • long word is 48 bits long
  • accumulator is 56 bits long
People gave up, and programmed it in assembly.


myC can easily solve the problem

  • byte --> uint8, sint8
  • short word --> uint16, sint16
  • word --> uint24, sint24
  • long word is 48 --> uint48, sint48
  • accumulator is 56 --> uint56, sint56
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #36 on: August 05, 2022, 12:43:59 pm »
Oh, but the C standard

but C90, 5.2.4.2.1 requires CHAR_BIT >= 8 and UCHAR_MAX >= 255. C89 uses a different section number (2.2.4.2.1) but identical content.

They treat "char" and "byte" as essentially synonymous

What the fuck? Again  :palm:
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #37 on: August 05, 2022, 01:48:48 pm »
casting: banned
pointer to pointer: banned
pointer to function: allowed
custom typedef: allowed, but pointer to pointer-typedef: banned
You forget: You're not really working with C, you are working with an externally defined/dictated subset/superset of C.  You cannot really blame C for not being fit for those, can you?

I mean, if one could not pronounce the consonants r, t, or m, it wouldn't be the fault of the language they're speaking that others would have difficulty understanding them, right?  You could say it is stupid for languages to differentiate between the consonants L and R (as not all do), but it would not be a fair characterisation.  It is just a difficult situation compounded by a number of things, so we muddle through (grumbling as we go) the best we can.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #38 on: August 05, 2022, 02:18:41 pm »
You forget: You're not really working with C, you are working with an externally defined/dictated subset/superset of C.  You cannot really blame C for not being fit for those, can you?

Yes, I can!  :D

I can, for all the time I wasted on Glibc on HPPA and MIPS, because C favors bad programming practice that is very prone to generating errors and misunderstandings(1), and therefore bugs. I can because C desperately needs headers to correctly address its features, and again this generates errors, misunderstandings, and therefore bugs.

Silly stuff like assuming char is unsigned when you don't explicitly define it. Last time it wasted 45 minutes of my time before I got it.
It was in a header that says

Code: [Select]
typedef char u8;

Fsk shit!!! 45 minutes of my life on that!

I'm working with myC not because I am masochist but rather because C sucks at certain things (to the point we had to invent MISRA to contain them) and people don't want to fix them at the language-level, and if nothing has changed until now, nothing will change in the near future.

Therefore I do blame C for all the flaws that people don't fix. I blame C because if those flaws are tolerable with mono cpu and common memory models, they become completely out of control, with machines like the R18200 and its tr-memory, and, worse still, because threads cannot be implemented as a C-library.

Oh, and if you think tr-memory is just something you'll never see in a consumer computer: my boss's POWER10 workstation has tr-memory in every POWER-core.

Now, if you try to handle it with C ... you'll have a lot of trouble, and be unable to complete a single working program without wasting weeks of time on a debugger.



edit:
(1) here the problem is: glibc is regularly fixed only for mainstream hardware { x86, arm }; HPPA, MIPS, and SH are not so lucky, therefore you have to fix stuff yourself. Usually, a lot of these bugs are in headers, and are related to wrong size_t, wrong pointer size, wrong data-type size, a wrong directive telling the compiler the wrong typedef, constant, etc. All silly stuff that wastes a lot of time.
« Last Edit: August 05, 2022, 02:40:11 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1801
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #39 on: August 05, 2022, 06:25:50 pm »
  • byte is 8 bits long
  • short word is 16 bits long
  • word is 24 bits long
  • long word is 48 bits long
  • accumulator is 56 bits long
[...]
myC can easily solve the problem
As I said earlier, since C99 this is not a problem in C.
int8_t, int16_t, int24_t, int48_t and int56_t are all perfectly fine fixed size integers, together with their unsigned siblings.
No need to provide int32_t or int64_t; in fact, it might even be impossible on that architecture (e.g. due to alignment constraints): exact-size types must not have padding.

Then you have to provide the mandatory "least" and "fast" types:
int_least8_t -> int8_t
int_least16_t -> int16_t
int_least32_t -> int48_t
int_least64_t -> this needs to be implemented, not being native. Possibly as a 48+48.

yourC does not seem to have a real advantage here.

I'm still curious about how pointers to pointers are handled in library calls (unless you've thrown away parts of the standard library) and in other cases when you need to pass a pointer to a function that needs to change it and give it back to the caller.

Of course one can wrap the target pointer in a structure; then the actual argument becomes a pointer to a struct that only contains a pointer to something else... it does not run afoul of the rules, just makes them pointless.

Or does yourC also allow passing by reference, à la C++?

Oh, and how is constantness (and volatility etc.)  addressed? Is it possible to declare a const pointer to a const object (and all the other combinations)?

Fascinating. I'll have to recite the 8 translation phases (5.1.1.2) ten times to cleanse my soul after staring into the abyss.
(Just joking. Not going to atone for other people's sins  >:D)
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: MK14

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15794
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #40 on: August 05, 2022, 06:58:32 pm »
Designing a better C (I happened to have opened a thread dedicated to this a while ago and it unexpectedly got nowhere) is almost a lost cause. I know it sounds corny, but we now have 50 years to back this up.

Recent attempts that have claimed to be that while keeping the "spirit" of C (so no object orientation or any other fancy mix of paradigms) are actually relatively few. We can name Zig and Odin. Rust is definitely not that. While Zig is interesting, it just has its own set of problems. Odin also has interesting ideas, but it's a bit too opinionated to ever be widespread IMHO.

As I hinted and DiTBho confirmed, he actually likes Ada and would apparently like to use it more, his problem being that there is no Ada compiler for some of the platforms he targets. He doesn't need a "better C", he needs Ada.

So his preferred language has already been designed. No need to try and design another one, for which there wouldn't be any more compiler available anyway.

I personally find C pretty usable. For any advanced use, I definitely recommend reading the standard for whatever revision you're going to use. It's a must. And reading the latest revisions (C11, and even the C23 draft) could also give you a couple ideas and show you what "modern C" can bring to the table.

As to Ada, this is certainly a language that I would like to use, but not as is, at least not for most projects I work on. (I would have no problem using it for the super-critical stuff it's usually used for.)
So I would like some successor of Ada with only a subset of it (but which subset is the hard task), possibly a slightly more compact syntax, and a very limited or even no runtime required.

But that does not exist, and being pragmatic tells us that making the best of existing and well-established tools is way more productive than chasing after hypothetical ones.
 
The following users thanked this post: newbrain, MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #41 on: August 05, 2022, 08:06:16 pm »
yourC does not seem to have a real advantage here.

I spent two years on this problem for two reasons
1) hacking Gcc in order to support "MIPS4++" is beyond my skills, and dealing with GNU guys is not ... sane
2) I had to solve tr-memory and concurrency problems with R18200, which is a weird beast

Cray is also odd, sizeof(char) = 1, but char is 32 bit, as well as short, int, and long.
MIPS4++, as is, exposes a similar problem: everything is 64 bit at the hardware level
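For reference, the gap between sizeof and the real bit width can be checked in standard C. This is only a minimal sketch: BITS_OF and the bits_of_* helpers are hypothetical names, not part of any library. sizeof counts in units of char, so sizeof(char) is always 1, and the width in bits comes from CHAR_BIT in <limits.h>.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a char on this implementation */
#include <stddef.h>

/* sizeof counts in units of char, so sizeof(char) is 1 by definition.
   The storage width in bits is sizeof(T) * CHAR_BIT. */
#define BITS_OF(T) (sizeof(T) * (size_t)CHAR_BIT)

/* Hypothetical helpers, for illustration only. */
size_t bits_of_char(void) { return BITS_OF(char); }
size_t bits_of_int(void)  { return BITS_OF(int); }
size_t bits_of_long(void) { return BITS_OF(long); }
```

On a typical desktop target bits_of_char() gives 8; on a machine like the one described above it would give 32, while sizeof(char) is 1 in both cases.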

uint8 comes through several tricks in the machine-level, but at least ... I have a couple of hardware instructions to correctly address a byte during Load/Store
Code: [Select]
size byte3 byte2 byte1 byte0  shift auto_mask    address_mask
08                         x  R(00) 0x0000.00ff  0xffff.fffc
08                   x        R(08) 0x0000.ff00  0xffff.fffc
08             x              R(16) 0x00ff.0000  0xffff.fffc
08       x                    R(24) 0xff00.0000  0xffff.fffc
16                   *** not supported ***
32       x     x     x     x  R(00) 0xffff.ffff  0xffff.ffff

thanks to this, I can still have byte-granularity, and I can also have 4byte granularity :D
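The table above can be emulated in portable C, as a rough sketch. Assumptions: the two low address bits select the lane (0 selects byte0), memory is modelled as an array of 32-bit words, and load_byte / store_byte are hypothetical names, not the real machine-level routines.

```c
#include <stdint.h>

/* Emulates the byte-lane table: the word address comes from address_mask,
   the lane gives the shift (0, 8, 16, 24) and the matching auto_mask.
   Assumption: the low two address bits select the lane, lane 0 = byte0. */
uint8_t load_byte(const uint32_t *mem, uint32_t addr)
{
    uint32_t shift = (addr & 3u) * 8u;              /* R(00), R(08), R(16), R(24) */
    uint32_t mask  = (uint32_t)0xffu << shift;      /* auto_mask for that lane */
    uint32_t index = (addr & 0xfffffffcu) >> 2;     /* address_mask, then word index */
    return (uint8_t)((mem[index] & mask) >> shift);
}

void store_byte(uint32_t *mem, uint32_t addr, uint8_t value)
{
    uint32_t shift = (addr & 3u) * 8u;
    uint32_t mask  = (uint32_t)0xffu << shift;
    uint32_t index = (addr & 0xfffffffcu) >> 2;
    /* Clear the selected lane, then merge the new byte into it. */
    mem[index] = (mem[index] & ~mask) | ((uint32_t)value << shift);
}
```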

tr-memory is even more problematic because it's 35bit (quad port), but you need to address it with 32bit modulo0, hence like a 32-bit-only memory.

Extra bits are "meta". The machine layer copies them into Reg14, and exposes them on demand through the operator ".meta", which doesn't return uint8 (casting is banned in myC) and it's only usable through a bit operator ".bit[]";

For the user it is really simple and intuitive

Code: [Select]
uint32_t addr;
tr_t trmem[4K_cells];
uint32_t data;
booleant_t bit_0;
booleant_t bit_1;
booleant_t bit_2;

addr = 0x8000.0000;
data = trmem[addr].data;
bit_0 = trmem[addr].meta.bit[0];
bit_1 = trmem[addr].meta.bit[1];
bit_2 = trmem[addr].meta.bit[2];

bit_sizeof(trmem); /* myC built-in operator, returns 35 */
bit_sizeof(trmem.data); /* myC built-in operator, returns 32 */
bit_sizeof(trmem.meta); /* myC built-in operator, returns 3 */
n_of_cells(trmem); /* myC built-in operator, return 4000 */

tr_t is built_in and accessible in every detail from the high level!
sizeof(type) makes no sense, and it has been removed.
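To show why sizeof is the wrong tool here, this is how a 32+3 bit cell might be approximated in plain C. It is only a sketch: tr_cell_t, tr_meta_bit and tr_bit_sizeof are hypothetical names, and a C struct cannot really be 35 bits wide, so sizeof reports the padded storage size, not the logical width.

```c
#include <stdbool.h>
#include <stdint.h>

#define TR_DATA_BITS 32u
#define TR_META_BITS 3u

/* Hypothetical C approximation of one tr-memory cell. */
typedef struct {
    uint32_t data;      /* the 32 data bits */
    unsigned meta : 3;  /* the 3 extra "meta" bits */
} tr_cell_t;

/* Reads meta bit n (0..2); out-of-range bit numbers read as false. */
bool tr_meta_bit(const tr_cell_t *cell, unsigned n)
{
    return n < TR_META_BITS && ((cell->meta >> n) & 1u) != 0;
}

/* Logical width in bits, which sizeof cannot express. */
unsigned tr_bit_sizeof(void)
{
    return TR_DATA_BITS + TR_META_BITS;
}
```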

when you need to pass a pointer to a function that needs to change it and give it back to the caller

Code: [Select]
p_uint32_t do_xxxx
(
     p_uint32_t p_x
)
{
     p_uint32_t  ans;

     ans = p_x;
     return ans;
}

arguments

if you need a function that can modify an argument ... well it's not possible, it has been deliberately banned since the beginning, you have to create an object and pass its pointer, which means you have to re-think your software design carefully, which is my purpose, since other people tend to mess up code.

Code: [Select]
public void matrix_cell_is_fx1616
(
    p_matrix_t p_matrix
)
{
    p_matrix->context.method.cmp.isle = cmp_isle;
    p_matrix->context.method.cmp.islt = cmp_islt;
    p_matrix->context.method.cmp.isge = cmp_isge;
    p_matrix->context.method.cmp.isgt = cmp_isgt;
    p_matrix->context.method.cmp.is_0 = cmp_is_0;
    p_matrix->context.method.cmp.iseq = cmp_iseq;
    p_matrix->context.method.let.show = let_show;
    p_matrix->context.method.let.copy = let_copy;
    p_matrix->context.method.let.swap = let_swap;
    p_matrix->context.method.eval.add = eval_add;
    p_matrix->context.method.eval.sub = eval_sub;
    p_matrix->context.method.eval.mul = eval_mul;
    p_matrix->context.method.eval.div = eval_div;
    p_matrix->context.method.eval.mac = eval_mac;
    p_matrix->context.method.eval.msc = eval_msc;
    p_matrix->context.method.eval.rem = eval_rem;
    p_matrix->context.method.eval.abs = eval_abs;
    p_matrix->context.method.eval.clr = eval_clr;

Code: [Select]
    matrix_t        matrix;
    p_matrix_t      p_matrix;

    p_matrix = get_address(matrix);
    matrix_init(p_matrix, 4, 4, matrix0_data, matrix_cell_is_fx1616);

This is a polymorphic linear system solver written in myC: 70% is portable to C with a few modifications (thanks to a "compatibility header")

The code (cross)compiles on myC targeting the MIPS4++ R18200. It works correctly but the machine code is not optimized, because well ... there isn't yet an optimizer, myC only outputs -o0 assembly.

But it's stable and allows me to play with concurrency: I can split the LU decomposition into blocks, and perform the evaluations on four cores with results exposed on the tr-memory.

The function above modifies a lot of function pointers: this is the ONLY allowed way to modify a pointer in myC.
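For readers who only know C, the same pattern looks roughly like this. It is a cut-down sketch, not the real solver: the struct layout, the two-entry method table and the Q16.16 helpers are all assumptions made for illustration.

```c
#include <stdint.h>

typedef int32_t fx1616_t;   /* assumed Q16.16 fixed-point format */

/* Cut-down method table: the real one has many more entries. */
typedef struct {
    fx1616_t (*add)(fx1616_t a, fx1616_t b);
    fx1616_t (*mul)(fx1616_t a, fx1616_t b);
} cell_methods_t;

typedef struct {
    cell_methods_t method;
} matrix_t;

/* No overflow handling: this is only a sketch. */
static fx1616_t fx_add(fx1616_t a, fx1616_t b) { return a + b; }

static fx1616_t fx_mul(fx1616_t a, fx1616_t b)
{
    /* Widen to 64 bits, then scale back by 2^16 (truncates toward zero). */
    return (fx1616_t)(((int64_t)a * b) / 65536);
}

/* Binds the fixed-point implementations to the matrix, so generic code
   can work through the function pointers without knowing the cell type. */
void matrix_cell_is_fx1616(matrix_t *p_matrix)
{
    p_matrix->method.add = fx_add;
    p_matrix->method.mul = fx_mul;
}
```

A second binder, say for float cells, would fill the same table with different functions, and the solver code would stay unchanged.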

Is it possible to declare a const pointer to a const object (and all the other combinations)?

"const" is one of the deceiving words that I immediately banned, as well as "break", "goto", "volatile", "static" and "extern".

myC v1: "Break" is banned when inside a loop, only allowed inside a switch case
myC v2: "Break" is entirely banned, switch case must use {}

Code: [Select]
switch ()
{
     case xxx /* <----- note: xxx no longer looks like a label, the symbol ":" is banned */
          {
          }
     default xxx
          {
          }
}

"=", "==", "&", "&&", "!", "|", "||", "~", "^" are banned and replaced with operators.
if(..) and while(..) cannot accept an expression, only a boolean, this helps the ICE.

pointer arithmetic is banned, and when you need to dereference, you need to call the built-in operator dereference(..), which removes the ambiguity with "*" and helps the ICE





Personally, I have to say, myC is less frustrating than C by several orders of magnitude, not because I made it, but rather because I made it exactly in the way that helps make the code clean and simple, especially during my ICE-debugging sessions.
« Last Edit: August 05, 2022, 08:14:10 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 
The following users thanked this post: newbrain

Offline newbrain

  • Super Contributor
  • ***
  • Posts: 1801
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #42 on: August 05, 2022, 08:44:04 pm »
Quote
I had to solve tr-memory and concurrency problems with R18200, which is a weird beast
Wow.
See attachment.
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 161
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #43 on: August 05, 2022, 09:48:12 pm »
Are data type compiler-dependent or target dependent?

Absolutely _everything_ is compiler dependent (or, in more formal terms, implementation dependent). No exceptions.

However, compilers are not created in a vacuum. For the sake of efficiency, they do take into account specific properties of the target: hardware, OS, etc. But these are nothing more than considerations of common sense and efficiency. All of them can be ignored, circumvented and overridden by the compiler, should it become necessary for some reason.
« Last Edit: August 05, 2022, 09:50:19 pm by TheCalligrapher »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4349
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #44 on: August 05, 2022, 10:56:24 pm »
Quote
You can still have (u)int_least16_t
I think the _least and _fast types are probably vastly under-used, even by programmers that use int8_t and similar religiously.

Which vaguely makes sense, since they're UGLY.  Part of a programming language's features is supposed to be readability.  "char msg[] = "Hello World"; is obvious and easy to read.   The "const unsigned char msg[]" form that C++ wants you to use, considerably less so.  "const uint_least8_t msg[]" is pretty horrible.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #45 on: August 06, 2022, 04:03:12 am »
Quote
You can still have (u)int_least16_t
I think the _least and _fast types are probably vastly under-used, even by programmers that use int8_t and similar religiously.

Which vaguely makes sense, since they're UGLY.  Part of a programming language's features is supposed to be readability.

They are certainly ugly.

The solution (which there are many other reasons to follow) is to not sprinkle them throughout your code, but to typedef meaningful names and use them everywhere instead of built in C types e.g. "typedef uint_least16_t CustID; typedef NativeChar CharArr[]".
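A small sketch of what that looks like in practice. CustID is the name from the sentence above; LoopCount, CUSTID_MAX and next_cust_id are hypothetical additions for illustration.

```c
#include <stdint.h>

/* The ugly names appear once, here; the rest of the code uses the typedefs. */
typedef uint_least16_t CustID;     /* at least 16 bits on every conforming implementation */
typedef uint_fast16_t  LoopCount;  /* whatever the implementation considers fastest, >= 16 bits */

#define CUSTID_MAX UINT_LEAST16_MAX

/* Hypothetical helper: next customer ID, wrapping to 0 at the maximum. */
CustID next_cust_id(CustID id)
{
    return id < CUSTID_MAX ? (CustID)(id + 1) : (CustID)0;
}
```

If the underlying type ever has to change for a new target, only the typedef lines change.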

Quote
"char msg[] = "Hello World"; is obvious and easy to read.   The "const unsigned char msg[]" form that C++ wants you to use, considerably less so.  "const uint_least8_t msg[]" is pretty horrible.

All the more so because it's wrong on x86 and MIPS (?), either way, where char and string literals are signed.
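Whether plain char is signed is implementation-defined, and it can be checked with CHAR_MIN from <limits.h>. A minimal sketch; the two function names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* CHAR_MIN is negative exactly when plain char is a signed type. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Going through unsigned char gives the same non-negative value on every
   implementation, e.g. before using a char as a table index. */
unsigned char_to_index(char c)
{
    return (unsigned char)c;
}
```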
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #46 on: August 06, 2022, 08:18:07 am »
_fast types

Probably only IBM guys love it.

If you program POWER10 or POWER11 machines you have hardware-loop instructions and it would be great if you can tell the compiler to make good use of them.
Code: [Select]
uint_fast32_t i0; /* hope the compiler is smart, and understands what I am telling it here */
...
for (i0 = 0; i0 < n ; i0++) /* please, use a special loop register, and use a loop instruction */
{
         /* you also have to use a general purpose register, i0 needs to go from 0 to n-1 */
         /* while the special loop register goes from n to 0 */
}
Why is it better? Well, because a hardware loop instruction never causes a misprediction: it's not an if-then-else branch, it's a down-counter kind of instruction, and when your pipeline is 14 or 20 stages deep, it saves you from refilling those 14 or 20 stages.

You just like it when you do a big decomposition of the LUP matrix because you basically have three big nested loops :D
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #47 on: August 06, 2022, 08:31:52 am »
_fast types

Probably only IBM guys love it.

If you program POWER10 or POWER11 machines you have hardware-loop instructions and it would be great if you can tell the compiler to make good use of them.
Code: [Select]
uint_fast32_t i0; /* hope the compiler is smart, and understands what I am telling it here */
...
for (i0 = 0; i0 < n ; i0++) /* please, use a special loop register, and use a loop instruction */
{
         /* you also have to use a general purpose register, i0 needs to go from 0 to n-1 */
         /* while the special loop register goes from n to 0 */
}
Why is it better? Well, because a hardware loop instruction never causes a misprediction: it's not an if-then-else branch, it's a down-counter kind of instruction, and when your pipeline is 14 or 20 stages deep, it saves you from refilling those 14 or 20 stages.

You just like it when you do a big decomposition of the LUP matrix because you basically have three big nested loops :D

There is only one CTR register (which lives in the instruction fetch unit, along with the PC and LR), so only the innermost loop can use it. CTR is also used for calling function pointers / C++ virtual functions, so if you're doing any of that in the inner loop (which of course you shouldn't be) then you don't get to use CTR as a loop counter at all.

CTR is the same size as any other integer register, so it's no great trick to get the compiler to use it. You simply need any counted loop that doesn't have another counted loop or indirect function call within it.
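For illustration, this is the shape of loop meant here: the trip count is known on entry, and there is no nested counted loop and no indirect call in the body. The function is a hypothetical example; whether a given compiler actually uses CTR for it depends on the compiler and options, and I have not verified that here.

```c
#include <stddef.h>
#include <stdint.h>

/* Innermost counted loop: n is fixed on entry, the body has no inner
   counted loop and no call through a function pointer. */
int64_t dot_product(const int32_t *a, const int32_t *b, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int64_t)a[i] * b[i];
    return sum;
}
```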

All this worked fine and as expected with the C/C++ compilers we had on PowerPC Macs 25 years ago.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #48 on: August 06, 2022, 08:44:14 am »
Wow.

yes, you will find nothing on Google, that's a big problem for me: no documentation.

MIPS R16000 was a product sold by MIPS for high-end SGI workstations like Fuel and Tezro, but you get nothing when you try to get the cpu datasheet or its user manual.

The last document available is for R10000, a bit less for R12000, and we know about R14000 thanks to all the reverse engineering done to support R12K on Linux.

R16K is pretty undocumented, and not exactly compatible with R14K
R18K is very different in several aspects.

R16K documentation exists, but it's not publicly available. You may find it on the underground internet, but it's not legal, and MIPS is a very aggressive company about its intellectual properties.

R18200 is a prototype, pretty dead, and nobody will ever use it, but at least I got some documentation with it, even if it's the kind that can't be printed, saved, emailed, uploaded to your Kindle ...

Physically there is a big FPGA soldered on the CPU module for the Atlas MIPS EVB board, which accepts MIPS32 and MIPS64 CPU modules.

I have no HDL code, but I have the ISA documentation plus a little document that tells about the bus implementation and a second document that tells everything about the tr-memory with 35 bit of data they implemented inside the fpga; there is nothing more, but hey? it's better than nothing.


So, I know everything about the motherboard, just a fraction about the CPU.



Tr-memory has a similar story in its background, but at least you can find something on Wikipedia. Unfortunately it's an abstract article with no implementation details. I am not sure the tr-memory implemented in my boss's POWER10 is the same as the one I am working on in the R18200 prototype  :-//
 
The following users thanked this post: newbrain

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #49 on: August 06, 2022, 09:42:15 am »
it's no great trick to get the compiler to use it

d'oh, that's probably the reason why IBM recommends and supports -mcpu=power10 -funroll-loops

It's set as default compiler(1) flag in their MMA demos (AI stuff), and - according to the readme.txt - it's the best trick for better performance even if it performs more aggressive duplication of loop bodies than the compiler normally would.

(1) GCC v11.2
 

Online ledtester

  • Super Contributor
  • ***
  • Posts: 3275
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #50 on: August 06, 2022, 01:11:04 pm »
...
You have a piece of C code where you see "uint16_t ", the compiler won't accept it and returns

"sorry, this target doesn't support 16 bit datatype"
...

gcc has the "unavailable" attribute which will emit an error message when a type is used "anywhere in the source file" (see docs for specifics):

https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#Common-Type-Attributes

This code generated the unavailable error:

Code: [Select]
#include <stdint.h>

typedef uint16_t uint16_t __attribute((unavailable("sorry, this target doesn't support 16 bit datatype")));

int main()
{
    uint16_t foo;
}

I tested it at https://gcc.godbolt.org/ which uses gcc 12.1.

Types like uint16_t are defined in stdint.h, so you could also modify that file for your target to add that attribute.
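Separately from the attribute, portable code can detect at compile time whether an exact-width type exists, because the limit macro (UINT16_MAX here) is only defined when the type is. A minimal sketch; u16_exact, u16_least, HAVE_EXACT_U16 and add_mod_2_16 are hypothetical names.

```c
#include <stdint.h>

/* uint16_t is optional in the standard; UINT16_MAX is defined only if it exists. */
#ifdef UINT16_MAX
typedef uint16_t u16_exact;
#define HAVE_EXACT_U16 1
#else
#define HAVE_EXACT_U16 0
#endif

/* uint_least16_t is required, so this always compiles. */
typedef uint_least16_t u16_least;

/* 16-bit wrap-around done explicitly, so the result is the same even
   where the underlying type is wider than 16 bits. */
u16_least add_mod_2_16(u16_least a, u16_least b)
{
    return (u16_least)(((unsigned long)a + (unsigned long)b) & 0xFFFFul);
}
```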
« Last Edit: August 06, 2022, 01:14:24 pm by ledtester »
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #51 on: August 07, 2022, 07:41:43 pm »
gcc has the "unavailable" attribute which will emit an error message when a type is used "anywhere in the source file" (see docs for specifics):

this can be useful as a patch to the main gcc file for a specific architecture (so, when you are compiling the cross-compiler) rather than in a header somewhere that someone usually messes up.

nice to have! nice to know  :D
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #52 on: August 07, 2022, 09:50:40 pm »
Designing a better C (I happened to have opened a thread dedicated to this a while ago and it unexpectedly got nowhere) is almost a lost cause. I know it sounds corny, but we now have 50 years to back this up.
Yup.  However, what we can kinda-sorta do, is replace the standard C library with something else.  (Which is something I've discussed in another thread.)
There are some house-sized warts to deal with, like the compiler emitting calls to memcpy() and memmove(), reserved names, implementation-defined details in the standard regarding freestanding C, and they only grow bigger if you want it to be compilable with a C++ compiler also, but they're annoyances, not showstoppers.

What one cannot do, is change the base semantics and rules.  The compiler needs to support the ABI you choose.  You can set additional rules, like POSIX C does, if there is a way to tell the C compiler to abide by those rules.  You will end up having some compiler dependencies, especially function and variable attributes, but you can choose them from the set already supported by the main C compilers: GCC and clang already have a large common set.

It is also impossible to stop other developers from shooting themselves in the foot.  You can only make the robust, sensible ways easier, but you cannot idiot-proof anything based on top of C.

This is where my view diverges from e.g. DiTBho's.  I see things like MISRA C (and Annex K in ISO C11) as futile, and definitely do not want to work under such rules, because I see them as fundamentally flawed in their purpose: that they try to fix a perceived problem at the wrong complexity level.
I can accept the abstractions the (freestanding) C standard does, because to change those would be to fundamentally change the language, and as mentioned by SiliconWizard, none of the alternatives have fared as well as C has for the last half century or so.  Thus, while they are far, far from optimal, I see them as workable; and instead of fighting against them, I try to create interfaces and patterns that take advantage of them.

One of the details I've thought about a lot, and repeatedly see being a crucial piece in many embedded appliances (especially those with limited RAM), is memory allocation.  I've discussed arena-based allocators, but fact is, the separate arenas are not the reason why: the reason is, with arena-based allocators you can set practical, reliable run-time limits on any sub-task using a specific arena for allocations.

In a separate thread, peter-f is currently wrestling with a HTTP/HTTPS server running on an STM32 microcontroller.  Because the environment does not support arena-based or inheritably-limited allocations, memory use is a critical and hard part of the puzzle to solve.  Even with a single heap, if each allocation, reallocation, and free are done with respect to a context, with such contexts themselves forming a tree hierarchy, one can assign reasonable but dynamic limitations for each logical operation/task –– like responding to a HTTP request –– with nothing but standard C.  (You can even use a chain of longjmp()-based handlers, so that if a new allocation in a context cannot be fulfilled, the execution context reverts to the creation of the context failing, with a suitable error.  You'll most likely want to add cleanup handlers, closures, too, but it is all quite straightforward to implement.  It is the API design, the interface for others to use, that is hard to get right.)
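To make the idea concrete, here is a minimal sketch of context-limited allocation on top of a single heap. Everything in it is hypothetical (alloc_ctx, ctx_alloc, ctx_free are illustration names, not an existing library), it simply returns NULL on failure instead of using longjmp()-based handlers, and it has no cleanup handlers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A context with a byte budget; contexts form a tree through parent. */
typedef struct alloc_ctx {
    struct alloc_ctx *parent;  /* NULL for the root context */
    size_t limit;              /* most bytes this context and its children may hold */
    size_t used;               /* bytes currently charged to this context */
} alloc_ctx;

/* Header in front of each block; the union keeps the payload aligned. */
typedef union {
    struct { alloc_ctx *ctx; size_t size; } info;
    max_align_t align;
} alloc_header;

/* Charges size to ctx and all ancestors, or changes nothing and returns 0. */
static int ctx_charge(alloc_ctx *ctx, size_t size)
{
    for (alloc_ctx *c = ctx; c != NULL; c = c->parent)
        if (size > c->limit - c->used)
            return 0;
    for (alloc_ctx *c = ctx; c != NULL; c = c->parent)
        c->used += size;
    return 1;
}

static void ctx_uncharge(alloc_ctx *ctx, size_t size)
{
    for (alloc_ctx *c = ctx; c != NULL; c = c->parent)
        c->used -= size;
}

void *ctx_alloc(alloc_ctx *ctx, size_t size)
{
    if (size > SIZE_MAX - sizeof(alloc_header) || !ctx_charge(ctx, size))
        return NULL;
    alloc_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        ctx_uncharge(ctx, size);
        return NULL;
    }
    h->info.ctx = ctx;
    h->info.size = size;
    return h + 1;   /* payload starts right after the header */
}

void ctx_free(void *ptr)
{
    if (ptr == NULL)
        return;
    alloc_header *h = (alloc_header *)ptr - 1;
    ctx_uncharge(h->info.ctx, h->info.size);
    free(h);
}
```

A per-request context with the server's root context as parent then caps what one HTTP request can take, without needing a separate arena for it.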

As I hinted and DiTBho confirmed, he actually likes Ada and would apparently like to use it more, his problem being that there is no Ada compiler for some of the platforms he targets. He doesn't need a "better C", he needs Ada.
While I haven't done any serious programming in Ada, I can see its roots, and do believe it – or a strict subset or derivative like SPARK – would be better suited to the tasks MISRA C is usually applied to.  (However, from outside, it looks like the big companies using Ada tend to hoover up all the Ada developers, so that there isn't that big of a free/open source community around it.  In particular, what the hell is "GNAT Pro"?  Is it just a repeat of what Microchip did to GCC, to be able to hoodwink its customers into paying for a free compiler?)

It is for these reasons, that I firmly, but friendly ;), believe that it is wrong to blame the features of C.  It just isn't the right tool for the job here.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4349
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #53 on: August 07, 2022, 10:47:31 pm »
Quote
Designing a better C ... is almost a lost cause. ...  we now have 50 years to back this up.
But C HAS "gotten better" in that 50 years.
Maybe some of the original warts that are particularly horrible to modern eyes have turned out to be important features that can not be changed.  But still. I'd say it's gotten "much better."
 
The following users thanked this post: newbrain

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #54 on: August 07, 2022, 10:54:51 pm »
Designing a better C (I happened to have opened a thread dedicated to this a while ago and it unexpectedly got nowhere) is almost a lost cause. I know it sounds corny, but we now have 50 years to back this up.

My main complaint against the C language as such (rather than the standard library) is the syntax, especially the declaration syntax. I prefer something along Modula-2 / Ada lines.

However there is 30 or 40 years of experience of any language daring to ask people to type "end" instead of "}" failing in the market, while anything that copies C syntax is wildly successful, no matter how different the semantics are (Perl, Java, JavaScript, Go, Rust, ...).

I don't like it, but that's how it is.

Quote
As I hinted and DiTBho confirmed, he actually likes Ada and would apparently like to use it more, his problem being that there is no Ada compiler for some of the platforms he targets. He doesn't need a "better C", he needs Ada.

So his preferred language has already been designed. No need to try and design another one, for which there wouldn't be any more compilers available anyway.

As I understand it, Ada (GNAT) has long been available anywhere gcc is available, and there is also GNAT front-end combined with LLVM back end.

So Ada should be available pretty much everywhere except 6502, z80, 8051, PIC. And I wouldn't bet against 8051, actually.

Quote
I personally find C pretty usable. For any advanced use, I definitely recommend reading the standard for whatever revision you're going to use. It's a must. And reading the latest revisions (C11, and even the C23 draft) could also give you a couple ideas and show you what "modern C" can bring to the table.

C isn't my ideal language, certainly not for high-level programming in large programs. My preference is the Lisp family, and especially Dylan, as being expressive, efficient, garbage-collected, and with better macros, object systems, and exception handling than C/C++/Java.

But for small simple programs, especially things running on microcontrollers and other constrained systems, C gets the job done, and with a better mix of performance and portability than anything else.
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #55 on: August 07, 2022, 11:10:21 pm »
Quote
Designing a better C ... is almost a lost cause. ...  we now have 50 years to back this up.
But C HAS "gotten better" in that 50 years.
Maybe some of the original warts that are particularly horrible to modern eyes have turned out to be important features that can not be changed.  But still. I'd say it's gotten "much better."

That's debatable.

K&R C was a reasonable match to PDP-11 era machines with simple caches and a single core. That meant the ambiguities were easily understandable (e.g. The C Puzzle Book) and the explicit lack of support for threading was acceptable.

Time has marched on and brought much more complex hardware. C responded by trying to be both a low-level and a high-level (for want of a better term) language. The response made C vastly more complex with many arcane rules that few programmers using the language and compilers fully understand. The consequence is that C has become a poorer match for many of today's applications and architectures.

In that practical sense, C has become worse, not better.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #56 on: August 07, 2022, 11:14:57 pm »
As I understand it, Ada (GNAT) has long been available anywhere gcc is available, and there is also GNAT front-end combined with LLVM back end.

You need a bootstrapper to compile Gnat.
No bootstrap no party. A large part of Gnat is written in Ada.

Gnat is anywhere Gcc-compiled-with-ada-support  is available.

Code: [Select]
[2] hppa64-unknown-linux-gnu-4.9.4-gnat2016
[3] hppa64-unknown-linux-gnu-6.5.0
[4] hppa64-unknown-linux-gnu-7.3.1-gnat2018
[5] hppa64-unknown-linux-gnu-7.5.0
[6] hppa64-unknown-linux-gnu-8.3.1-gnat2019
[7] hppa64-unknown-linux-gnu-8.4.0
[8] hppa64-unknown-linux-gnu-9.3.0-gnat2021 *
[9] hppa64-unknown-linux-gnu-10.2.0

2,4,6,8 have this
Code: [Select]
--enable-languages=c,c++,fortran,ada
and they can be used to bootstrap gnat

It looks simple, but it was not, and it's not, it's pretty difficult to obtain.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #57 on: August 07, 2022, 11:29:38 pm »
I see things like MISRA C as futile, and definitely do not want to work under such rules

Years ago I thought the same. MISRA? Wtf!?!
Then I professionally met things like DO178B and DO178C, which require ICE-sessions.
And ICE-sessions became the reason why MISRA makes sense.

 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #58 on: August 07, 2022, 11:36:32 pm »
there is also GNAT front-end combined with LLVM back end.

LLVM doesn't have any support on HPPA and SH, and it has several problems on MIPS.
LLVM on mainstream architectures { x86, arm } is one thing; LLVM elsewhere is another.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4349
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #59 on: August 07, 2022, 11:38:47 pm »
Quote
Ada (GNAT) has long been available anywhere gcc is available, and there is also GNAT front-end combined with LLVM back end.
My impression is that with C, you can get away with very little in the way of "runtime library support", while most other modern languages require a lot more, and more that is architecture and OS-specific.  So that while you can release a "C compiler for Padauk" without even "libc", and a lot of people will be happy, the equivalent for Ada is much more difficult.
And I think there is less "portable" run-time library code as well.  (Although the portable C libraries are sometimes not much to write home about...)
Certainly "old" languages (Fortran, PL/1, Pascal) were very explicit about including massive amount of "IO" capabilities in the language itself (C has NOTHING at the language level!)  I haven't paid enough attention to "new" languages to see if any have been careful to separate the language from the OS and runtime environment, or even whether it's possible to do so and still have the improvements that people want.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #60 on: August 07, 2022, 11:49:59 pm »
the explicit lack of support for threading was acceptable.

With modern hardware, threading and tr-memory are among the problems today.

"Threads Cannot be Implemented as a Library" has been known since 2004, but very little has been done to address the problem.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #61 on: August 08, 2022, 12:02:44 am »
So that while you can release a "C compiler for Padauk" without even "libc", and a lot of people will be happy, the equivalent for Ada is much more difficult.
And I think there is less "portable" run-time library code as well. 

Yes, this is problem n2, once you somehow get a working Ada compiler - you need a working Ada runtime library.

The first run on HPPA was a bloody hell not only for adding "ada" to the gcc-enabled-language, but also to get a working run-time library.

With C you need only "cc1", and it will be enough to obtain an assembly file from a C file; you can just create a crt0.S and some small libc.S, assemble them to .o files, and link them all together into a runnable application.

That's indeed how "myC" works  :D
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #62 on: August 08, 2022, 12:17:17 am »
the explicit lack of support for threading was acceptable.

With modern hardware, threading and tr-memory are among the problems today.

"Threads Cannot be Implemented as a Library" has been known since 2004, but very little has been done to address the problem.

K&R C in ~1982 is explicit that multithreading is a library problem, and that the language does not provide primitives. In other words multithreading had to be a machine+compiler specific hack in a library.

The 2004 papers were necessary, remarkably, because youngsters had forgotten the foundations, and the foundations were obscured by all the baroque complexity.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #63 on: August 08, 2022, 01:05:04 am »
there is also GNAT front-end combined with LLVM back end.

LLVM doesn't have any support on HPPA and SH, and it has several problems on MIPS.
LLVM on mainstream architectures { x86, arm } is one thing; LLVM elsewhere is another.

It's not mainstream vs non-mainstream. LLVM support for RISC-V is now better than gcc support for RISC-V.

The distinction is current ISAs vs legacy ISAs.

New work on new ISAs is now almost always done (or done first) in LLVM because more people can understand how to work with the LLVM source code than with the GCC source code, and because LLVM's more liberal licensing is easier for companies to deal with, and because it is much easier to upstream new and experimental code into LLVM than into GNU.

Old dead ISAs got GCC support in the 90s or 00s before LLVM existed.

Old inactive ISAs: use gcc, therefore GNAT

New and active ISAs: use llvm, therefore GNAT-on-LLVM

Disclaimer: I haven't used ADA since 1984 on a VAX. I seem to recall the compiler was from New York University and was written in SETL.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #64 on: August 08, 2022, 01:14:00 am »
I see things like MISRA C as futile, and definitely do not want to work under such rules
Years ago I thought the same. MISRA? Wtf!?!
Then I professionally met things like DO178B and DO178C, which require ICE-sessions.
And ICE-sessions became the reason why MISRA makes sense.
For me, it boils down to using the language that is most appropriate for the task.  I know C well enough to see that it is impossible to make it "safe" like MISRA and the various aeronautical etc. rules try to; it is futile.

The fact that there aren't toolchains available for the languages that would be much more appropriate for the task at hand, isn't the fault of an unrelated programming language, is my point.

If I wanted to work on something as safety critical as aeronautics or medical appliances, I for sure would like to use a programming language that can be statically verified and proven.  If that means porting the toolchain first, well, that would be the cost I'd have to accept, or not do it at all.  But that is just me, and my own personal quirks.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4767
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #65 on: August 08, 2022, 01:23:53 am »
the explicit lack of support for threading was acceptable.

With modern hardware, threading and tr-memory are among the problems today.

"Threads Cannot be Implemented as a Library" has been known since 2004, but very little has been done to address the problem.

I feel this paper is misrepresented.  It's not the library part that is the problem.  As soon as you have two CPU cores running two threads there may well be no more library calls, ever, after the initial spawning of the 2nd thread.

The problem was that C, at that time, didn't have a concept of memory model or things such as acquire and release semantics. Neither did any other language pretty much, other than Java.

Heck, most ISAs at the time didn't even have a formal notion of memory model.
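
To make "acquire and release semantics" concrete: C did eventually get them, in C11's <stdatomic.h>. A minimal sketch, assuming a C11 compiler and POSIX threads; the variable and function names are invented for illustration:

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;       /* plain, non-atomic data */
static atomic_int ready;  /* flag that publishes payload; starts at 0 */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;  /* ordinary store */
    /* Release: the store to payload cannot be reordered after this. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Spawns the producer, waits for the flag, and returns what it saw. */
int publish_and_read(void)
{
    pthread_t t;
    int seen;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    /* Acquire: once we read 1, the producer's earlier stores are visible. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;  /* spin; note that no library call is involved here */

    seen = payload;
    pthread_join(t, NULL);
    return seen;
}
```

Between the spawn and the join the two threads communicate only through memory, so the guarantee that the reader sees the payload has to come from the language's memory model, not from a library.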
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #66 on: August 08, 2022, 07:17:47 am »
the explicit lack of support for threading was acceptable.

With modern hardware, threading and tr-memory are among the problems today.

"Threads Cannot be Implemented as a Library" has been known since 2004, but very little has been done to address the problem.

I feel this paper is misrepresented.  It's not the library part that is the problem.  As soon as you have two CPU cores running two threads there may well be no more library calls, ever, after the initial spawning of the 2nd thread.

The problem was that C, at that time, didn't have a concept of memory model or things such as acquire and release semantics. Neither did any other language pretty much, other than Java.

Heck, most ISAs at the time didn't even have a formal notion of memory model.

Yes, the lack of a memory model was the key issue - and I believe the reason Boehm wrote the paper was to force people to realise it. I still find it remarkable and noteworthy that the general ignorance in the (supposedly) expert community meant he had to write the paper.

People seem to have forgotten that K&R C stated both threading was a library issue and that the language specifically omitted the features required to allow it to be implemented in a library. That was acceptable in the 70s, but ridiculous a professional lifetime (>30 years) later.

Java was ahead of its time in many ways, and behind the times in good ways. If you look at Gosling's original whitepaper, he gives reasons why historic experience showed all the language features are beneficial and play well together, plus why he omitted features that experience showed were problematic, especially multiple inheritance of implementations (not traits/interfaces). Gosling showed extreme good taste, as I would have expected from someone who created my first favourite editor: Unipress Emacs.
« Last Edit: August 08, 2022, 07:25:55 am by tggzzz »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4349
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #67 on: August 08, 2022, 08:51:19 am »
Quote
K&R C in ~1982 is explicit that multithreading is a library problem, and that the language does not provide primitives. In other words multithreading had to be a machine+compiler specific hack in a library.
I just skimmed it, but didn’t the paper say that threading couldn’t be implemented in a library, even IF the library implemented machine and compiler specific hacks?


That’s not how I interpreted the claim the first time I heard it, though.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #68 on: August 08, 2022, 10:40:35 am »
Multithreading and tr-memory need language support; otherwise you can only work around the problem with machine+compiler specific hacks in a library. This is particularly problematic with PowerPC and POWER, due to the pipeline-specific instructions for out-of-order (OoO) load/store, which make things even worse.

So you usually have critical code implemented in assembly to manage the shared memory ... but it is an ugly workaround
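
For comparison, where a C11 toolchain with <stdatomic.h> exists for the target, the same kind of critical section can be written without assembly. This is only a sketch of a spinlock around a shared counter; all names are hypothetical, and it says nothing about how myC does it:

```c
#include <stdatomic.h>

static atomic_flag counter_lock = ATOMIC_FLAG_INIT;  /* clear = unlocked */
static long shared_counter;                          /* protected by counter_lock */

static void lock_acquire(void)
{
    /* test-and-set returns the previous value: spin while it was already set */
    while (atomic_flag_test_and_set_explicit(&counter_lock, memory_order_acquire))
        ;
}

static void lock_release(void)
{
    atomic_flag_clear_explicit(&counter_lock, memory_order_release);
}

/* Increment the shared counter inside the critical section; return the new value. */
long bump_counter(void)
{
    long v;
    lock_acquire();
    v = ++shared_counter;
    lock_release();
    return v;
}
```

The compiler is then responsible for emitting whatever barrier instructions the target needs for acquire and release, instead of the programmer hand-placing them.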

myC has built-in tr-memory support, deadlock-free
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #69 on: August 08, 2022, 10:58:16 am »

I know C well enough to see that it is impossible to make it "safe" like MISRA and the various aeronautical etc. rules try to; it is futile.

My point is that MISRA does make it safer, as a consequence of making it easier to debug automatically.

Easier to be debugged => safer

DO178 is about the life cycle, the life cycle is also about debugging sessions and test cases, and debugging sessions are about AI-assisted debuggers; what I usually call "ICEs" are smart devices with a super-fast optical link (48Mbps) and strong MMA processors able to understand C at the language level, but you have to help them, and that is the purpose of MISRA

myC is more AI-ICE friendly than common C

So: decide what you want to achieve and how to verify it, then help the ICE AI to help you

Does not sound so terrible ... It is not "dark side", but rather simply collaboration between humans and AI
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #70 on: August 08, 2022, 11:36:20 am »
...what I usually call "ICEs" ...

To me, in the context of an embedded system, an "ICE" is an In-Circuit Emulator: https://en.wikipedia.org/wiki/In-circuit_emulation

That appears, at a glance, different to your definition. A "Humpty Dumpty" situation, I suppose.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #71 on: August 08, 2022, 12:23:12 pm »
I know C well enough to see that it is impossible to make it "safe" like MISRA and the various aeronautical etc. rules try to; it is futile.

My point is that MISRA does make it safer
safer ≢ safe.

Wearing rubber boots while standing under a tree in a thunderstorm is also safer than not wearing rubber boots, but you can still die from lightning hitting the tree.

myC is more AI-ICE friendly than common C
Sure, but that just proves that C is simply the wrong tool for the job!

And that is precisely my point: you should not blame C for not being suitable for your task at hand –– do not forget, you yourself claimed that it is somehow C that is to blame here; that C should be "fixed".

Our difference in opinion is that I believe that C is unsuitable for your task –– and for any task for which MISRA etc. act as band-aids ––, and you have argued that no, C itself should be changed and "fixed" because it is so hard to apply to your task at hand and MISRA etc. do help applying it somewhat better to the task; that it is C's fault for not being more amenable to the task at hand.

I could just as well blame my $25 cordless drill for its poor CNC capabilities, and demand the manufacturer adds at least DRO's, gyroscopic stabilizers, and a few servo motors to control it better.  Do you see my point?  :popcorn:
 

Online tggzzz

  • Super Contributor
  • ***
  • Posts: 21218
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Are data type compiler-dependent or target dependent
« Reply #72 on: August 08, 2022, 12:37:38 pm »
I could just as well blame my $25 cordless drill for its poor CNC capabilities, and demand the manufacturer adds at least DRO's, gyroscopic stabilizers, and a few servo motors to control it better.  Do you see my point?  :popcorn:

Isn't that what's happened, especially with C++  >:D
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #73 on: August 08, 2022, 01:06:16 pm »
To me, in the context of an embedded system, an "ICE" is an In-Circuit Emulator: https://en.wikipedia.org/wiki/In-circuit_emulation
That appears, at a glance, different to your definition. A "Humpty Dumpty" situation, I suppose.

Well, ICE is also a term used in cyberpunk literature to refer to security programs which protect computerized data from being accessed by hackers.

ICE
Intrusion
Countermeasures
Electronics

Gibson  ;D

Among avionics testers and developers, "ICE" for a "hardware debugger" is considered common specialist jargon.
And here you are right: jargon is when you use words or expressions that belong to your particular profession and are difficult for others to understand.

I mean, like "legal jargon" or "math jargon" ... only lawyers and mathematicians fully understand it  :o :o :o


Anyway, I do "ICEs"; I mean, I physically design and build debuggers, from the firmware up, so for me it's an automatic association that an "in-circuit emulator" nowadays is more like a TAP interface (BDM, Nexus, EJTAG, JTAG, etc.) able to talk to the host via an interface for reading/writing registers, and therefore indirectly also the memory.

In my case, I tend to forget to mention that it is paired with a smart device that can perform analysis directly on the target
  • coverage
  • performance
  • testcases

To perform them, code instrumentation is always required, but the ICE helps because it provides an instrumenting tool (on the host side) plus the in-circuit capability to map test-points to a specific piece of C code that it only needs to query from the host, and this is where MISRA helps. The host replies by sending the source code and the test plan; the "ICE" parses the C source (on the target side, the ICE is smart) and applies the test points to verify them.

Everything is automatic and AI-supervised, unless you require "manual override" operation.

MISRA helps because it makes those testing points explicit, and makes life easier for the human being who has to "instrument" the ICE.

Practically
- you write MISRA && DO178 compliant C code
- you pair the C code to the instrumenting engine
- the skeleton of the basic test-cases is automatically generated, you can add more test-cases
- you modify/approve the test-cases
- once approved, the instrumenting engine compiles it
- a DWARF binary is generated, along with an instrumentation file (IIF)
- DWARF and IIF are uploaded to the ICE (here is where a fast link helps)
- the ICE automatically starts analysis
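
To show what a "test point" can look like at the source level, here is a generic sketch of coverage instrumentation. It is not the actual format my tools use; the TEST_POINT macro, the counter table and the example function are all hypothetical:

```c
#include <stddef.h>

#define TP_COUNT 3                       /* hypothetical number of test points */
static unsigned long tp_hits[TP_COUNT];  /* one hit counter per test point */

/* Hypothetical marker that an instrumenting tool might insert per branch. */
#define TEST_POINT(id) (tp_hits[(id)]++)

/* Example function under test, with one test point on each path. */
int clamp_to_byte(int v)
{
    if (v < 0) {
        TEST_POINT(0);
        return 0;
    }
    if (v > 255) {
        TEST_POINT(1);
        return 255;
    }
    TEST_POINT(2);
    return v;
}

/* Number of test points that were reached at least once. */
size_t tp_covered(void)
{
    size_t n = 0;
    for (size_t i = 0; i < TP_COUNT; i++)
        if (tp_hits[i] != 0)
            n++;
    return n;
}
```

A debugger that can read the counter table from the target can then report which paths the test cases never reached.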

« Last Edit: August 08, 2022, 01:11:30 pm by DiTBho »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #74 on: August 08, 2022, 01:22:13 pm »
safer ≢ safe.

sure, but "safer" is a step forward in progress to achieve my purpose (see below)

just proves that C is simply the wrong tool for the job!

myC demonstrates that even C89 can be modified to help debuggers. Which makes it a better tool, not perfect, but better.

From my point of view, Ada is difficult to support at the language-bootstrapping and run-time level (and, talking about supporting new targets, it's also extremely hard to pass certifications), while a compiler like myC is as easy to bootstrap as a C compiler for a target.

You see, get { CC1, crt0.s, libc.s } and you get a compiler with tr-memory support! Already ICE-friendly!

Purpose fully achieved  :D

Unfortunately, I'm not good enough with compilers to push a dominant change into Gcc or LLvm or something; myC is 60% based on lcc-v4, I am not able to write a C compiler from scratch, and the result requires people to adapt their coding style, that is *the* serious problem because most C programmers won't.

So, for me, I have the feeling that it's more an ideological problem, especially for those who don't care about debuggers.
« Last Edit: August 08, 2022, 01:31:18 pm by DiTBho »
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #75 on: August 08, 2022, 02:51:35 pm »
So, for me, I have the feeling that it's more an ideological problem, especially for those who don't care about debuggers.
Not in my case; the opposite.  If ICE or provably correct operation is desirable or required (like for life-saving equipment), then proper tools need to be available; and tools that only get part way there are just not good enough for me.  You might say I care too much...

Unfortunately, I'm not good enough with compilers to push a dominant change into Gcc or LLvm or something; myC is 60% based on lcc-v4, I am not able to write a C compiler from scratch, and the result requires people to adapt their coding style, that is *the* serious problem because most C programmers won't.
This is absolutely true.  A bit over a decade or so ago I did spend a couple of years (not continuously or every day, but you know, whenever I could) trying to push some fixes upstream to GCC and gfortran, and it being still the era of "Unless you have a doctorate in CS and peer-reviewed papers on compiler theory, we do not deign to respond to your posts, you insignificant loser" -type core GCC developers, it really put me off working on any kind of compiler.  The people at GCC are much different now, and there is even clang, but porting a toolchain to a new (old) architecture takes a lot of careful work, and has a steep learning curve.

I definitely would not want to have to port any GCC or clang frontends or backends to a new architecture, that's for sure.  It's just that if I were to start working on safety-critical code, that's pretty much what I feel I personally would have to do, because I just do not believe any ruleset or small tweak of C syntax can make it sufficiently safe, even if it were to fulfill some industry rulesets.

In particular, the kinds of problems I end up spending most time with, are 1) emergent behaviour arising from complex interactions between subsystems and mechanisms, and 2) efficient and effective interfaces (and abstractions at the correct complexity level); not typos or memory management or buffer overrun or misalignment type things.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #76 on: August 08, 2022, 04:39:54 pm »
I definitely would not want to have to port any GCC or clang frontends or backends to a new architecture

Yup, I have in mind MIPS4++ and MIPS5, but also HPPA2 if GCC ever drops support.
"just in case" plan  ;D

that's for sure.  It's just that if I were to start working on safety-critical code, that's pretty much what I feel I personally would have to do, because I just do not believe any ruleset or small tweak of C syntax can make it sufficiently safe, even if it were to fulfill some industry rulesets.

Under DO178, whatever you do, the main QA rule tells you that everything is always subjected to several verifications and inspections at different QA levels.

draft -> engineering version -> QA approved level1 -> QA approved level2 -> product -> product revision -> ...

This is the software life cycle, and it's regulated by DO178.

It's not a matter of being sufficiently safe at the compiled level; things like myC cannot guarantee that, and even an official tool like Stood cannot guarantee that. It is rather a matter of having debugging sufficiently facilitated at the self-instrumentation level so you can run more testing activities within the same budget and time.

$200/person per day
8-9 hours per day

the more you can test, the more bugs you can find

my AI-ICEs cost ~$20 per day(1) :D

(1) talking about a 5-year mortgage for the company to pay for the equipment:
$20,000 total cost, spread over 5 years at 200 working days per year (1000 days), i.e. ~$20 per day
 
The following users thanked this post: Nominal Animal

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7186
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #77 on: August 08, 2022, 06:35:52 pm »
It's not a matter of being sufficiently safe at the compiled level; things like myC cannot guarantee that, and even an official tool like Stood cannot guarantee that. It is rather a matter of having debugging sufficiently facilitated at the self-instrumentation level so you can run more testing activities within the same budget and time.
Right.  In other words, you don't do myC because you think it fixes problems inherent in C, but because it fits better into the (human) verification framework around the target implementations.  :-+

The point about SPARK/Ada is that verifiability is one of its design goals, and the existing tools –– I believe, not having used them myself –– have rather extensive verification and validation options compared to what is possible with C.  Having tools that can tear down the code into Hoare logic that is then passed to automated theorem provers, and find out its fundamental properties or deficiencies, is pretty mind-bending to me...  (It's exactly the kind of complex interactions that that kind of analysis can reveal that gives me most pause nowadays, like I already mentioned; not the "easy" stuff like memory management.)
Not using them for safety critical stuff, and "just settling" with MISRA and similar rulesets, seems like taking the easy way out to me, though.  You know, like choosing not to spend X to reduce the risk of loss of life by Y, because of commercial or business reasons.
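
To give a flavour of the Hoare-logic view mentioned above, here is a tiny sketch in C of a function annotated as a triple {P} code {Q}. The function name is made up, and the assert() calls only check the conditions at run time; the point of the SPARK-style tools is to prove such conditions statically for all inputs, which this sketch does not do.

```c
#include <assert.h>

/*
 * Hoare-style contract, written informally:
 *   {P: n >= 0}   r = isqrt_checked(n)   {Q: r*r <= n < (r+1)*(r+1)}
 */
int isqrt_checked(int n)
{
    int r = 0;

    assert(n >= 0);  /* precondition P */

    /* Loop invariant: r*r <= n (holds initially because 0 <= n). */
    while ((long long)(r + 1) * (r + 1) <= n)
        r++;

    /* Postcondition Q: invariant plus the negated loop condition. */
    assert((long long)r * r <= n);
    assert((long long)(r + 1) * (r + 1) > n);
    return r;
}
```

The postcondition follows from the invariant together with the fact that the loop has exited, which is exactly the kind of step an automated prover discharges.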

Then again, I'm the last person anyone should consult about business matters!  :P
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15794
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #78 on: August 08, 2022, 06:51:26 pm »
Well. Relying on debugging for proving the safety and correctness of code is possibly understandable in some situations, but definitely not the approach most people implementing safety-critical stuff would like to take. Eek. Testing and "formal" proving are two different things. They are complementary. But focusing on debugging sounds pretty horrible generally speaking.

MISRA is a mixed bag. While some of its rules are common sense or just good practice, others are like removing as much from C as possible while keeping it Turing-complete.
It's sometimes too close to "let's forbid additions, since additions in C can overflow". Just barely exaggerating. Yeah OK.
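
For what it's worth, the overflow concern itself can be handled without forbidding anything. A minimal sketch of an addition that checks the operands before adding, so the signed overflow never happens; the function name checked_add is just for this example:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Adds a and b. Returns true and stores the sum in *out if it fits in an
 * int; returns false and leaves *out untouched if it would overflow.
 */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||   /* sum would exceed INT_MAX */
        (b < 0 && a < INT_MIN - b))     /* sum would fall below INT_MIN */
        return false;

    *out = a + b;
    return true;
}
```

The comparisons are arranged so that they cannot overflow themselves: INT_MAX - b is only evaluated for positive b, and INT_MIN - b only for negative b.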

The thing with Ada is that, apart from gcc (which is already a good thing), tools are not abundant and tend to be expensive. And while it *can* be used for "small" targets, it's nowhere near as efficient (neither performance- nor code size-wise) as C. So this is often a tough choice. Not to mention it's basically taught almost nowhere these days, so Ada developers are either pretty old by now or just self-taught.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4366
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #79 on: August 08, 2022, 07:26:29 pm »
Right.  In other words, you don't do myC because you think it fixes problems inherent in C, but because it fits better into the (human) verification framework around the target implementations.  :-+

Yup, that's the purpose  ;D
 

