Author Topic: Oh, C3!  (Read 5320 times)

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Oh, C3!
« Reply #25 on: April 23, 2023, 08:54:50 pm »
'nextcase' doesn't allow jumping around freely, only to the next 'case', as far as I understood it at least.

No, look closer - clearly that was their original idea, but they feature creeped their "fallthrough" to take an argument, allowing you to jump to any other cases, in any order. I'm ridiculing this feature.

Oh, you're right. You had a closer look than I did. ;D

There's this infamous "labelled nextcase". https://c3-lang.org/statements/#nextcase-and-labelled-nextcase

It's not even merely a "label", it's an expression that can be evaluated at run-time and this acts as though the code flow was looping back to the switch select with a value given by the "label" (which again isn't a label.)

That doesn't look good. :-DD Oh, and incidentally, with this construct, if the expression given to nextcase evaluates to a value that is not handled by any 'case', then it becomes an infinite loop. Nice!! :-DD
Or does the switch just exit in this case? Who knows, not sure I saw that clearly.

That said, the guy maybe had the typical state machine construct in mind. For which I would favor setting a variable with the next state, rather than directly controlling the flow anyway.
His approach avoids having to put the switch inside a loop, but then it makes a potentially implicit loop, which is horrific.

The benefit of using a variable holding the state is that it's much easier to trace. If you control the flow directly in many places, it makes tracing much more tedious.
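A minimal sketch of the "variable holding the state" approach in plain C, for comparison. The state names and the trace buffer are hypothetical, chosen only for illustration; nothing here comes from the C3 documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states, chosen only for illustration. */
typedef enum { ST_IDLE, ST_RUN, ST_DONE } state_t;

/* Runs the machine, recording each visited state in trace[].
   Returns the number of states recorded. */
static size_t run_machine(size_t max_trace, state_t trace[max_trace])
{
    state_t state = ST_IDLE;
    size_t n = 0;
    int ticks = 0;

    for (;;) {
        if (n < max_trace)
            trace[n++] = state;      /* one place to log or break on */

        switch (state) {
        case ST_IDLE:
            state = ST_RUN;
            break;
        case ST_RUN:
            if (++ticks >= 2)
                state = ST_DONE;
            break;
        case ST_DONE:
        default:                     /* unknown values end the loop explicitly */
            return n;
        }
    }
}

static void run_machine_example(void)
{
    state_t trace[8];
    size_t n = run_machine(8, trace);
    assert(n == 4);                  /* IDLE, RUN, RUN, DONE */
    assert(trace[0] == ST_IDLE && trace[3] == ST_DONE);
}
```

The loop is explicit, every transition goes through the single 'state' variable, and an unhandled value leaves the loop instead of spinning.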
« Last Edit: April 23, 2023, 08:58:26 pm by SiliconWizard »
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6848
  • Country: va
Re: Oh, C3!
« Reply #26 on: April 25, 2023, 06:45:31 pm »
Quote
I'm sick and tired of bugs introduced by using sizeof() on an array when you should have used sizeof(a)/sizeof(a[0])

All my projects have:
Code: [Select]
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
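
A small usage sketch of the macro above (the table and function names are made up for the example). Note that the macro only works on true arrays; handed a pointer, it silently computes the wrong value.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

static int table[5] = { 1, 2, 3, 4, 5 };

/* Sums every element; the loop bound is the element count,
   not the size in bytes that a bare sizeof(table) would give. */
static int sum_table(void)
{
    int sum = 0;
    for (size_t i = 0; i < ARRAY_SIZE(table); i++)
        sum += table[i];
    return sum;
}

static void sum_table_example(void)
{
    assert(ARRAY_SIZE(table) == 5);
    assert(sizeof(table) == 5 * sizeof(int));   /* bytes, not elements */
    assert(sum_table() == 15);
}
```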
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6265
  • Country: fi
    • My home page and email address
Re: Oh, C3!
« Reply #27 on: April 25, 2023, 07:19:46 pm »
Ever since the last C-related programming language discussion, I've now and then examined the typical cases where human programmers make most errors in C.

I'm pretty darned convinced that the only fundamental change needed (in addition to the various syntax etc. additions we have discussed here in other threads previously) would be to replace pointers with arrays as the base memory reference type.  (There are details wrt. read-only strings I'm not sure about, though.)  That would let the compiler do compile-time memory access validation, helping kill buffer-related bugs.  All other changes could be done incrementally, by replacing the C standard library.

Whenever I do such experiments (i.e., what would code look like and what would it compile to, if I tweaked the compiler and libraries just so), I always discover that the end result I desire is obtainable by a smaller 'real' change (but a larger paradigm/approach/theory-wise change) than one would initially assume or believe.  Furthermore, the things most developers get stuck on (myself included, unless I monitor myself to explicitly avoid this) end up not affecting the actual language use much; only how it looks on the surface, and how it can be described to other people.  Unimportant fluff, in other words.

To repeat from those other threads, I would like additional features that provide an unordered (data-parallel) for loop construct, as well as a way to tell the computer that two independent code sections can be interleaved (that their relative order is unimportant, perhaps at block level).  But these are optimizations, things that are somewhat difficult for compilers to do under current C rules; and I have not verified what kind of constructs would be needed, or what kind of changes to the C standard abstract machine model would be needed, to address these.  The all-pointers-are-actually-arrays change, however, would be surprisingly straightforward.
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6848
  • Country: va
Re: Oh, C3!
« Reply #28 on: April 25, 2023, 08:36:25 pm »
Quote
replace pointers with arrays as the base memory reference type

Why would that be better?
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #29 on: April 25, 2023, 09:43:49 pm »
Quote
replace pointers with arrays as the base memory reference type

Why would that be better?

because they can carry their bounds and size: { begin, end, size }
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6265
  • Country: fi
    • My home page and email address
Re: Oh, C3!
« Reply #30 on: April 25, 2023, 09:53:48 pm »
Quote
replace pointers with arrays as the base memory reference type
Why would that be better?
It makes it possible for the compiler (with current gcc and clang static analysis/warning capabilities) to verify buffer accesses are valid.  (That is, the compiler knows at compile time if each access is "valid" (safe, within the buffer), "invalid" (overrun/underrun), or "undetermined"; with the last one only affecting code that uses pointers or tricky indexing math whose limits are unknown at compile time.)

To explore this yourself, write some test code using the pattern
    type1 somefunc(size_t len, type2 buffer[len], ...)
i.e. instead of pointers, you pass an array; and to avoid having the array auto-decay to a pointer, you need to specify its size too.
If you introduce a typical buffer overrun bug in such a function, no matter how deep in a call chain, the compiler will tell you if you enable the relevant warnings.  There are no added run-time checks at all; look at the generated machine code too.
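
A minimal sketch of that pattern, with made-up function names. Whether a given overrun is actually reported depends on the compiler, its version and the options used (for GCC, -Wall together with -Warray-bounds and -Wstringop-overflow are the usual candidates; that choice is an assumption, the post does not name the flags).

```c
#include <assert.h>
#include <stddef.h>

/* 'len' comes first so that it is in scope when 'buf' is declared. */
static long sum_ints(size_t len, const int buf[len])
{
    long total = 0;
    for (size_t i = 0; i < len; i++)   /* 'i <= len' would be the classic overrun */
        total += buf[i];
    return total;
}

static long sum_first_three(void)
{
    int data[4] = { 10, 20, 30, 40 };
    return sum_ints(3, data);          /* the callee is told about 3 elements only */
}

static void sum_ints_example(void)
{
    assert(sum_first_three() == 60);
}
```

The generated code is the same as for a pointer-plus-length signature; the only thing that changes is what the declaration tells the compiler.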

This alone does not do anything for existing code, say for example strlen().  The idea is to use the change to rewrite the standard library in a form where pointers are replaced with array references.  Currently it is a bit cumbersome, for example strlen() would best be written as
    ssize_t strlen(size_t len, const char s[len]);
so at minimum we'd need the compiler to allow the size of a parameter array to be defined later in the explicit parameter list (with ssize_t stolen from POSIX; C currently uses int for it, which is problematic on LP64 architectures).

The necessary changes, as I said, are surprisingly small.  Their effect on how easily efficient code can be statically analyzed, however, is surprising.  You really do need to experiment with it to see the possibilities.  As I also said, read-only/immutable strings have peculiarities I'm not sure yet how best to deal with, but basically the rest of POSIX C -like functionality (i.e., with different API/function signatures, but same or very similar functionality) is quite straightforward.  Oh, and functions allocating or reallocating memory, returning an array reference, may need syntactic sugar (as currently they really need to return a struct containing the start address of the allocated memory, and length in bytes).
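
A sketch of what "return a struct containing the start address and length in bytes" can look like in current C. The type and function names (block_t, block_alloc, block_free) are hypothetical, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: base address plus usable size in bytes. */
typedef struct {
    void  *base;
    size_t size;
} block_t;

/* Returns { NULL, 0 } when the allocation fails. */
static block_t block_alloc(size_t bytes)
{
    block_t b = { malloc(bytes), 0 };
    if (b.base)
        b.size = bytes;
    return b;
}

static void block_free(block_t *b)
{
    free(b->base);
    b->base = NULL;
    b->size = 0;
}

static int block_example(void)
{
    block_t b = block_alloc(64);
    int ok = (b.base != NULL && b.size == 64);
    block_free(&b);
    return ok && b.base == NULL && b.size == 0;
}
```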

(What I am not sure about yet, is whether we need a new non-scalar base "type" with two properties, start address and length.  Currently, we can do that and slicing (three properties: start address, step size, and count), just fine using structures.  But there might be additional compiler optimization/compile time static analysis opportunities, if it was a base type to begin with.)
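
For reference, a sketch of the three-property slice done with a structure. The names are made up, and the step is counted in elements here, which is an assumption; the post does not say whether the step is in elements or bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice type: start address, step and count. */
typedef struct {
    int   *start;   /* address of the first element */
    size_t step;    /* distance between elements, in elements */
    size_t count;   /* number of elements in the slice */
} int_slice;

static int slice_get(int_slice s, size_t i)
{
    assert(i < s.count);
    return s.start[i * s.step];
}

/* Slice covering elements 0, 2, 4, ... of buf. */
static int_slice every_other(size_t len, int buf[len])
{
    return (int_slice){ buf, 2, (len + 1) / 2 };
}

static void slice_example(void)
{
    int data[5] = { 0, 1, 2, 3, 4 };
    int_slice s = every_other(5, data);
    assert(s.count == 3);
    assert(slice_get(s, 2) == 4);
}
```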
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6848
  • Country: va
Re: Oh, C3!
« Reply #31 on: April 25, 2023, 10:02:12 pm »
Quote
replace pointers with arrays as the base memory reference type
Why would that be better?
It makes it possible for the compiler (with current gcc and clang static analysis/warning capabilities) to verify buffer accesses are valid.  (That is, the compiler knows at compile time if each access is "valid" (safe, within the buffer), "invalid" (overrun/underrun), or "undetermined"; with the last one only affecting code that uses pointers or tricky indexing math whose limits are unknown at compile time.)

Ah! Of course, I was stuck in pointer mode thinking the array would be passed as a pointer and just look like an array to the programmer. But I see now that's not the idea :)
 

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Oh, C3!
« Reply #32 on: April 25, 2023, 11:11:07 pm »
That's more or less akin to always using the base pointer to an allocated block (rather than accessing it through a pointer that could point arbitrarily inside, or even outside of it) and some index for accessing its content. You also need to store the size.

After which it does look like a full-fledged array indeed.

That's something you can always do in pure C though, even if that means a bit more programming overhead and possibly a bit less opportunity for optimization (even though that would remain to be seen in practice.)

I wrote a header file years ago that I still use to this day (with some minor evolutions). It exposes a few macros to encapsulate dynamic memory management, including "dynamic arrays", which are accessed via indices only (either with or without bounds checking, depending on the use case.) I haven't directly called any malloc/realloc/free ever since. The runtime overhead is either zero or extremely small.

Sure having that built-in would be nice, but point is, this approach can still be - at least in essence - used without designing a new language. I for one wouldn't use C without this small "library" I wrote more than a decade ago, at least for anything requiring dynamic allocations. For pure static allocation stuff, part of it can still be used.

The basic idea is to create a type for your 'arrays', something like this:

Code: [Select]
typedef struct
{
    BaseType *Block;    // pointer to your memory block, from static or dynamic allocation
    size_t n;    // current number of items of type 'BaseType' in the memory block
    size_t nMax;    // max number of items of type 'BaseType' in the memory block
}   Array_t;

That can be initialized from a statically-allocated array just like so:
Code: [Select]
BaseType Array[xxx];

Array_t MyArray = { .Block = Array, .n = 0, .nMax = ARRAY_SIZE(Array) };

The variant for dynamically-allocated arrays is also easy. With this simple construct, one can see that handling dynamic arrays becomes 'straightforward'.
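
A possible sketch of the dynamic variant, assuming 'BaseType' is int and using hypothetical helper names (Array_Push, Array_Free); the author's actual header is not shown in the thread. Overflow checking of the size computation is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int BaseType;   /* assumption: any element type would do */

typedef struct {
    BaseType *Block;
    size_t n;
    size_t nMax;
} Array_t;

/* Appends one item, doubling the capacity when full.
   Returns 0 on success, -1 if the allocation fails. */
static int Array_Push(Array_t *a, BaseType value)
{
    if (a->n == a->nMax) {
        size_t newMax = a->nMax ? a->nMax * 2 : 4;
        BaseType *p = realloc(a->Block, newMax * sizeof *p);
        if (!p)
            return -1;
        a->Block = p;
        a->nMax = newMax;
    }
    a->Block[a->n++] = value;
    return 0;
}

static void Array_Free(Array_t *a)
{
    free(a->Block);
    a->Block = NULL;
    a->n = a->nMax = 0;
}

static int Array_Example(void)
{
    Array_t a = { .Block = NULL, .n = 0, .nMax = 0 };
    for (int i = 0; i < 10; i++)
        if (Array_Push(&a, i) != 0) { Array_Free(&a); return 0; }
    int ok = (a.n == 10 && a.nMax == 16 && a.Block[9] == 9);
    Array_Free(&a);
    return ok;
}
```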

Accessing it is just:
Code: [Select]
MyArray.Block[index] // no bounds-checking, can be used with zero overhead when guarantees about 'index' are sufficient
MyArray.Block[index < MyArray.n? index : MyArray.n - 1] // bounds-checking, default to "saturating" the index

// You can add variants of bounds-checking that will execute some code in case of an out-of-bounds condition if needed.

With a few macros, that can be declared, and manipulated for just any base type very easily.
Some will find that clunky, especially if they don't like macros, as doing this without macros will be even clunkier.
Others will do this kind of stuff with macros and move on.
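
One way such macros could look; the macro names are hypothetical and only illustrate the idea. Note that the saturating access evaluates 'index' twice and assumes the array is not empty (with n == 0, 'n - 1' wraps around).

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical macro: declares an array type for any base type. */
#define DECLARE_ARRAY_TYPE(BaseType, Name) \
    typedef struct { \
        BaseType *Block; \
        size_t n; \
        size_t nMax; \
    } Name

/* Saturating access as in the post; requires (a).n > 0. */
#define ARRAY_AT_SAT(a, index) \
    ((a).Block[(index) < (a).n ? (index) : (a).n - 1])

DECLARE_ARRAY_TYPE(int, IntArray_t);
DECLARE_ARRAY_TYPE(double, DoubleArray_t);

static void array_macro_example(void)
{
    int storage[4] = { 1, 2, 3, 4 };
    IntArray_t a = { .Block = storage, .n = 3, .nMax = ARRAY_SIZE(storage) };

    assert(ARRAY_AT_SAT(a, 1) == 2);
    assert(ARRAY_AT_SAT(a, 99) == 3);   /* clamped to the last used item */
}
```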

In any case, you'll indeed realize that directly playing with pointers is rarely necessary, and can be left to the very occasional and very low-level stuff.

The benefits of keeping all your pointers only pointing to *objects*, rather than potentially pointing arbitrarily *inside* an object are multiple.
« Last Edit: April 25, 2023, 11:13:15 pm by SiliconWizard »
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6848
  • Country: va
Re: Oh, C3!
« Reply #33 on: April 25, 2023, 11:43:22 pm »
I do a similar thing but use functions rather than macros. There's an overhead in the function call, but more scope to mess around when debugging without affecting anything else. Also doesn't rely on the programmer remembering to use the safe access when appropriate :)
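
A sketch of the function flavour, with a hypothetical accessor name; returning NULL for an out-of-bounds index is one possible policy, not necessarily the poster's.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int   *Block;
    size_t n;
    size_t nMax;
} IntArray_t;

/* Returns a pointer to the element, or NULL when 'index' is out of bounds.
   A single function like this is also a convenient breakpoint location. */
static int *IntArray_At(IntArray_t *a, size_t index)
{
    return (index < a->n) ? &a->Block[index] : NULL;
}

static void int_array_at_example(void)
{
    int storage[3] = { 5, 6, 7 };
    IntArray_t a = { storage, 2, 3 };

    assert(*IntArray_At(&a, 1) == 6);
    assert(IntArray_At(&a, 2) == NULL);   /* allocated but not in use */
}
```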
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6265
  • Country: fi
    • My home page and email address
Re: Oh, C3!
« Reply #34 on: April 26, 2023, 12:12:52 am »
It is very interesting (but unclear) to me exactly why most C programmers – myself included – prefer
    rettype funcname(elemtype *ptr, size_t len);
over
    rettype funcname(size_t len, elemtype buf[len]);
even though the only difference in machine code is the order of the parameters.  Yet the latter API pattern allows much better buffer access checking at compile time, even through deep call chains (each call limiting to a smaller sub-array), helping catch buffer underrun/overrun errors.
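
A small sketch of the "each call limiting to a smaller sub-array" part, using the second signature. The function is made up for illustration; how much of this a particular compiler version actually checks is not something this example proves.

```c
#include <assert.h>
#include <stddef.h>

/* Sums buf[0..len-1] by splitting it into two sub-arrays.
   Each recursive call states exactly how many elements it may touch. */
static long sum_range(size_t len, const int buf[len])
{
    if (len == 0)
        return 0;
    if (len == 1)
        return buf[0];

    size_t half = len / 2;
    return sum_range(half, buf) + sum_range(len - half, buf + half);
}

static void sum_range_example(void)
{
    int data[7] = { 1, 2, 3, 4, 5, 6, 7 };
    assert(sum_range(7, data) == 28);
    assert(sum_range(3, data + 4) == 18);   /* 5 + 6 + 7 */
}
```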

The easy answer is inertia (or habit or familiarity or because everyone else does it that way too), but I'm not sure it is the whole answer.
Isn't it interesting how rarely anything like this (arrays-not-pointers) is suggested for "the next C", even though memory or buffer over/underrun bugs are the most common issues in C code?

Quote
replace pointers with arrays as the base memory reference type
Why would that be better?
It makes it possible for the compiler (with current gcc and clang static analysis/warning capabilities) to verify buffer accesses are valid.  (That is, the compiler knows at compile time if each access is "valid" (safe, within the buffer), "invalid" (overrun/underrun), or "undetermined"; with the last one only affecting code that uses pointers or tricky indexing math whose limits are unknown at compile time.)
Ah! Of course, I was stuck in pointer mode thinking the array would be passed as a pointer and just look like an array to the programmer. But I see now that's not the idea :)
Yep.  It's more like a cultural change than a technical one, even though its purpose is purely technical: help with compile-time static analysis wrt. buffer accesses.

That's more or less akin to always using the base pointer to an allocated block (rather than accessing it through a pointer that could point arbitrarily inside, or even outside of it) and some index for accessing its content. You also need to store the size.
Actually, what I want is for the compiler to be aware of the size whenever it is known at compile time.

If you consider the two funcname() definitions at the beginning of this post, you can clearly see the difference between the pointer and the array approach.  This difference is the critical one; it is not about adding explicit size information to interfaces that currently use a pointer only.  (Except for memory allocation functions: these should return both the allocated size and the base address, instead of just the base address.  This would actually be desirable in many grow-as-needed use cases, considered completely separately.  Oh, and possibly the string functions, which deserve to be redesigned anyway.)

That's something you can always do in pure C though, even if that means a bit more programming overhead and possibly a bit less opportunity for optimization (even though that would remain to be seen in practice.)
Note that the change would not cause any change to runtime code, no inherent additional runtime memory or CPU overhead at all.

Many string functions would actually add an explicit size parameter (ABI-wise), but I consider that a plus (and a deficiency in current standard C library string functions).  I've discussed the related issues especially in embedded environments before; let's just say that string handling can be done much better (faster, more reliably) even in current C than what the standard C library provides.

Passing an array forwards is trivial even in current C (since C99), although the size of the array must be before the array in the parameter list, but receiving an array from a function call is not supported.  Thus far, in my experiments I've simply assumed syntax "elemtype arrayname[sizetype count] = ...;" (declaring two variables at once, initialized by a single function call returning both the base pointer and the size, with the size divided by the element size to obtain the count "automagically"), but I'm sure better syntax can be devised.

The basic idea is to create a type for your 'arrays', something like this:
Code: [Select]
typedef struct
{
    BaseType *Block;    // pointer to your memory block, from static or dynamic allocation
    size_t n;    // current number of items of type 'BaseType' in the memory block
    size_t nMax;    // max number of items of type 'BaseType' in the memory block
}   Array_t;
Yes, I use this pattern extensively.  For some reason, I use 'used' for the current number of items, and 'size' for the maximum number of items, and 'item' for the pointer or C99 flexible array member.  It is very common to see a variant of
    typedef struct {
        size_t   size;
        size_t   used;
        elemtype item[];
    } elem_array;
in my code.
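
For completeness, a sketch of how such a structure is typically allocated and filled. 'elemtype' is assumed to be int here, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int elemtype;   /* assumption for the example */

typedef struct {
    size_t   size;      /* maximum number of items */
    size_t   used;      /* current number of items */
    elemtype item[];    /* C99 flexible array member */
} elem_array;

/* Allocates header and 'size' items in one block; NULL on failure. */
static elem_array *elem_array_new(size_t size)
{
    elem_array *a = malloc(sizeof *a + size * sizeof a->item[0]);
    if (a) {
        a->size = size;
        a->used = 0;
    }
    return a;
}

/* Returns 0 on success, -1 when the array is full. */
static int elem_array_add(elem_array *a, elemtype v)
{
    if (a->used >= a->size)
        return -1;
    a->item[a->used++] = v;
    return 0;
}

static int elem_array_example(void)
{
    elem_array *a = elem_array_new(2);
    if (!a)
        return 0;
    int ok = elem_array_add(a, 10) == 0
          && elem_array_add(a, 20) == 0
          && elem_array_add(a, 30) == -1    /* full */
          && a->used == 2 && a->item[1] == 20;
    free(a);
    return ok;
}
```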

Indeed, whenever this information is already available, why don't we tell the C compiler about it, so it can help check the array boundaries for us at run time?

This is the core of this suggestion.  Not to add size and/or used everywhere (except to functions that in my opinion should have had the size from the beginning even in the standard C library), but to help the compiler understand better exactly what we humans intend, and help catch our thinkos at compile time.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #35 on: April 26, 2023, 08:54:36 am »
It is very interesting (but unclear) to me exactly why most C programmers – myself included – prefer
    rettype funcname(elemtype *ptr, size_t len);
over
    rettype funcname(size_t len, elemtype buf[len]);

actually I do prefer

    ans_t funcname(buffer_t buffer)

 :D
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #36 on: April 26, 2023, 09:08:11 am »
whenever this information is already available, why don't we tell the C compiler about it, so it can help check the array boundaries for us at run time?

even better, why don't we tell the ICE about it? so it can help automatic test-cases and autonomously check boundaries for us at run time?

even better++, why don't we facilitate AI-assisted ICEs? So they can also use that information to identify common patterns in pieces of code that have a high probability of containing bugs.
 

Offline JPortici

  • Super Contributor
  • ***
  • Posts: 3461
  • Country: it
Re: Oh, C3!
« Reply #37 on: April 26, 2023, 10:20:59 am »
Quote
I'm sick and tired of bugs introduced by using sizeof() on an array when you should have used sizeof(a)/sizeof(a[0])

All my projects have:
Code: [Select]
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

of course, and i should do that.
However a lengthof(x) operator that can only take arrays as an input would have been better than a macro (ISTR lengthof as an extension in some compilers)
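
Until something like that exists as a real operator, a macro can be made to reject pointers on compilers that support the GNU extensions __typeof__ and __builtin_types_compatible_p (GCC and Clang do). This is only a sketch with made-up names (IS_ARRAY, ARRAY_LEN), along the lines of what the Linux kernel does for its ARRAY_SIZE; it is not standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero when 'a' is a true array: an array type is not compatible
   with the pointer type it decays to. Relies on GNU extensions. */
#define IS_ARRAY(a) \
    (!__builtin_types_compatible_p(__typeof__(a), __typeof__(&(a)[0])))

/* Element count; fails to compile when given a pointer. */
#define ARRAY_LEN(a) \
    (sizeof(a) / sizeof((a)[0]) + \
     0 * sizeof(struct { \
         _Static_assert(IS_ARRAY(a), "ARRAY_LEN needs an array"); \
         int unused; \
     }))

static size_t array_len_example(void)
{
    int arr[5] = { 0 };
    /* int *p = arr;  ARRAY_LEN(p);   would be rejected at compile time */
    return ARRAY_LEN(arr);
}
```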
« Last Edit: April 26, 2023, 10:23:30 am by JPortici »
 

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Oh, C3!
« Reply #38 on: April 26, 2023, 08:40:00 pm »
I do a similar thing but use functions rather than macros. There's an overhead in the function call, but more scope to mess around when debugging without affecting anything else. Also doesn't rely on the programmer remembering to use the safe access when appropriate :)

Sure, problem is that you can't avoid macros to generate the type definitions themselves (such as the Array_t example I gave, for any given base type.)
You can kind of work around it by using a void * pointer for the allocated block and add an additional member for the 'element size', but then you lose any basic static check, and you suddenly get an even better way of shooting yourself in the foot than directly messing with pointers. :popcorn:
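
A sketch of that void * variant, to show what gets lost. The names are hypothetical, and the example assumes sizeof(int) <= sizeof(double), which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical untyped array: the element type is reduced to a size. */
typedef struct {
    void  *Block;
    size_t elemSize;
    size_t n;
    size_t nMax;
} GenericArray_t;

/* Copies elemSize bytes from 'value' into slot 'index'.
   Returns 0 on success, -1 when 'index' is out of bounds. */
static int Generic_Set(GenericArray_t *a, size_t index, const void *value)
{
    if (index >= a->n)
        return -1;
    memcpy((char *)a->Block + index * a->elemSize, value, a->elemSize);
    return 0;
}

static void generic_example(void)
{
    int storage[2] = { 0, 0 };
    GenericArray_t a = { storage, sizeof storage[0], 2, 2 };
    double wrong = 1.0;

    /* Compiles without any diagnostic although the array holds ints:
       the compiler only sees a 'const void *'. */
    assert(Generic_Set(&a, 0, &wrong) == 0);
    assert(Generic_Set(&a, 2, &wrong) == -1);   /* bounds are still checked */
}
```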

Or you hand-write every 'array' type definition, which is horrible. Macros are for avoiding retyping the same text over and over again, and that's what I use them for.
Want to add a member to your 'generic' type? Just modify the macro. Macros need some care to avoid the usual pitfalls, but when used with some care, they are infinitely preferable to duplicating code.
« Last Edit: April 26, 2023, 08:42:06 pm by SiliconWizard »
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8179
  • Country: fi
Re: Oh, C3!
« Reply #39 on: April 27, 2023, 09:06:56 am »
Or you hand-write every 'array' type definition, which is horrible. Macros are for avoiding retyping the same text over and over again, and that's what I use them for.
Want to add a member to your 'generic' type? Just modify the macro. Macros need some care to avoid the usual pitfalls, but when used with some care, they are infinitely preferable to duplicating code.

And I truly believe the C preprocessor is one of its strongest points and a reason why C became so popular. People who invent "C replacements" tend to miss this fact. The first thing they do is remove the preprocessor because it's so inelegant and dangerous; yet they fail to come up with something with at least the same capabilities.

C programmers have this love-hate relationship with the preprocessor. It's horrible, but it's surprisingly powerful and makes it possible to do generic programming in C.
 
The following users thanked this post: SiliconWizard

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #40 on: April 27, 2023, 11:23:51 am »
Or you hand-write every 'array' type definition, which is horrible. Macros are for avoiding retyping the same text over and over again, and that's what I use them for.
Want to add a member to your 'generic' type? Just modify the macro. Macros need some care to avoid the usual pitfalls, but when used with some care, they are infinitely preferable to duplicating code.

And I truly believe the C preprocessor is one of its strongest points and a reason why C became so popular. People who invent "C replacements" tend to miss this fact. The first thing they do is remove the preprocessor because it's so inelegant and dangerous; yet they fail to come up with something with at least the same capabilities.

C programmers have this love-hate relationship with the preprocessor. It's horrible, but it's surprisingly powerful and makes it possible to do generic programming in C.

sure! why not? in fact cpp was the first thing being banned and removed entirely in my-c

It can be done, and my-c doesn't miss anything, just it solves problems differently and makes life easier :D

Whereas C/89/99 ... well, the last bug i fought in the Linux kernel was a typo with "#define something SPACE MISTAKE etc" which got through the build Gcc-v12 steps but caused a silent but catastrophic and sneaky bug, and I wasted three weeks on it  :o :o :o
« Last Edit: April 27, 2023, 11:28:09 am by DiTBho »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #41 on: April 27, 2023, 11:31:56 am »
C programmers have this love-hate relationship with the preprocessor

Also, include those who write analysis software and develop ICE tools.
For all of us cpp is more than terrible.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8179
  • Country: fi
Re: Oh, C3!
« Reply #42 on: April 27, 2023, 01:55:14 pm »
CPP = C Plus Plus
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #43 on: April 27, 2023, 02:08:10 pm »
CPP = C Pre Processor

belongs to
sys-devel/gcc ---> /usr/$arch-$computer-linux-gnu/gcc-bin/$gcc_version/cpp
dev-lang/gcc_gnat ---> /usr/$arch-$computer-linux-gnu/gnat-bin/$gcc_version/cpp

(gcc_gnat is ... gcc + gnat_ada_core, recompiled as gcc with languages={C, Ada} )

c++ = C plus plus
g++ = GNU C plus plus
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #44 on: April 27, 2023, 02:18:08 pm »
cpp also belongs to
overlay@idp: sys-devel/my-c~MIPS5++ ---> /usr/idp/my-c-bin/$my-c_version/cpp


what?  :o :o :o

didn't you say that cpp was banned?

Yup!

So why is cpp there in the my-c tree?

to cure your inertia: if you are tempted to invoke it, you get random punishments in the form of
- console blocked for 5 minutes (like with the "SL" ncurses program)
- you cannot do anything but read your random insults, ncurses full screen

(so at the end of the day you would like to DELETE it, and you cannot because you don't have root permissions)
 

Offline SiliconWizardTopic starter

  • Super Contributor
  • ***
  • Posts: 14488
  • Country: fr
Re: Oh, C3!
« Reply #45 on: April 27, 2023, 07:41:13 pm »
C programmers have this love-hate relationship with the preprocessor. It's horrible, but it's surprisingly powerful and makes it possible to do generic programming in C.

I personally don't hate the preprocessor at all - I find it very useful.

Yep, every attempt at replacing the preprocessor to achieve the same level of generic programming has led either to something much less flexible/powerful, or to truly untamable and unverifiable monsters.

What many people seem to miss - and that Wirth has kept saying over and over again - is that simplicity should be a goal.
C is simple, the C preprocessor is simple.

C++ templates are monsters.
 
The following users thanked this post: Karel

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #46 on: April 28, 2023, 10:14:54 am »
every attempt at replacing the preprocessor to achieve the same level of generic programming has led either to something much less flexible/powerful, or to truly untamable and unverifiable monsters.

every attempt? except my-c, so it's some but not all  :o :o :o

 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #47 on: April 28, 2023, 10:20:45 am »
even replacing  #define macro() in cpp with a true compiler built-in macro() mechanism is better

everything that doesn't pre-process the source is better because it doesn't hide information
 

Online PlainName

  • Super Contributor
  • ***
  • Posts: 6848
  • Country: va
Re: Oh, C3!
« Reply #48 on: April 28, 2023, 10:39:41 am »
Quote
everything that doesn't pre-process the source is better because it doesn't hide information

But isn't that one of the main features of functions? They hide lots of nitty gritty detail behind a simple name (and, of course, let you reuse code without repeating it, which is also what macros can do).
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 3915
  • Country: gb
Re: Oh, C3!
« Reply #49 on: April 28, 2023, 03:26:26 pm »
macros (by cpp) vs functions:
- functions are not pre-processed but compiled
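- a macro is expanded as text before compilation, so compile-time checks (argument types, for example) only ever see the expanded code, never the macro itself; a function is checked by the compiler against its declaration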

the second is what I meant: you lose information during pre-processing.
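
A small illustration of what the compiler no longer sees after preprocessing. The names are made up; the point is only that the macro's argument is pasted in as text (and here evaluated twice), while the function's argument is evaluated once and checked against the parameter type.

```c
#include <assert.h>

static int calls;

/* Counts how often it is evaluated. */
static int next_value(void)
{
    return ++calls;
}

/* Text substitution: 'a' appears twice in the expansion. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Compiled function: arguments are evaluated once and type-checked. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

static int macro_call_count(void)
{
    calls = 0;
    (void)MAX_MACRO(next_value(), 0);
    return calls;
}

static int function_call_count(void)
{
    calls = 0;
    (void)max_int(next_value(), 0);
    return calls;
}

static void macro_vs_function_example(void)
{
    assert(macro_call_count() == 2);      /* argument evaluated twice */
    assert(function_call_count() == 1);   /* argument evaluated once */
}
```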
 

