Author Topic: the c-semantics project is aimed against - C undefined behavior -

Offline legacy (Topic starter)

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Quote
C provides just enough abstraction above assembly language for programmers to get their work done without having to worry about the details of the machines on which the programs run. Despite this abstraction, C is also known for the ease in which it allows programmers to write buggy programs. With no runtime checks and little static checking, in C the programmer is to be trusted entirely. Despite the abstraction, the language is still low-level enough that programmers can take advantage of assumptions about the underlying architecture. Trust in the programmer and the ability to write non-portable code are actually two of the design principles under which the C standard was written [14]. These ideas often work in concert to yield intricate, platform-dependent bugs. The potential subtlety of C bugs makes it an excellent candidate for formalization, as subtle bugs can often be caught only by more rigorous means.
...

To address this, the c-semantics project was started, as described in its paper, "An Executable Formal Semantics of C with Applications".

What do you think about this project? Could it help us with C?
See its git repo here (note that the project requires dev-lang/ocaml)
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #1 on: March 26, 2014, 01:55:52 pm »
Interesting, but sometimes you want that odd behavior that you can achieve with C, so unless you can define the behaviour you want, it will take features away instead of adding new ones.

For example, on the timer counting front, you want to be able to use current_time - previous_time and intentionally allow for rollover.
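
A minimal sketch of the rollover idiom mentioned above, assuming a free-running 16-bit tick counter. The helper name elapsed_ticks and the 16-bit width are assumptions for illustration, not something from the post. Unsigned arithmetic in C is defined to wrap modulo 2^N, so the subtraction gives the right answer even if the counter rolled over once between the two readings:

```c
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two readings of a
   free-running 16-bit counter (the width is an assumption). */
static uint16_t elapsed_ticks(uint16_t current_time, uint16_t previous_time)
{
    /* The operands are promoted before the subtraction; converting the
       difference back to uint16_t reduces it modulo 65536, which is
       well defined for unsigned types. */
    return (uint16_t)(current_time - previous_time);
}
```

For instance, elapsed_ticks(5, 65530) yields 11, because the counter wrapped between the two readings. The result is only meaningful if less than one full counter period elapsed.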

Also, the size of int isn't fixed, so it's not portable. If you are checking for portability across an 8-bit, a 16-bit and a 32-bit MCU then yes, it would be nice not to have to change the code, but that comes at the price of more code than you would otherwise need.

That said, for some tasks it looks great. If only it allowed sections of the code to be exempt from the semantics rules and defined by the user instead: a kind of namespace that tells the kcc program to leave me alone, I know what I'm doing.
 

Offline granz

  • Regular Contributor
  • *
  • Posts: 136
  • Country: us
  • 6.62606957
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #2 on: March 26, 2014, 02:01:13 pm »
Quote
What do you think about this project? Could it help us with C?
See its git repo here (note that the project requires dev-lang/ocaml)

Valuable for proving that pure computational code is correct, but of little use for most firmware developers I would think (which I assume are the majority here).  If you are developing (or even just implementing) a complex algorithm in C it is very useful.  It doesn't help you with interacting with hardware peripheral registers on a particular microcontroller, for example.

 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #3 on: March 26, 2014, 02:35:25 pm »
It is a mis-guided effort, in my view.

C is useful because it is flawed and imperfect.
================================
https://dannyelectronics.wordpress.com/
 

Offline granz

  • Regular Contributor
  • *
  • Posts: 136
  • Country: us
  • 6.62606957
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #4 on: March 26, 2014, 02:45:37 pm »
Quote
It is a mis-guided effort, in my view.

 :-+  Agreed.  Seems like it tries to force C to be something that it is not.

Quote
C is useful because it is flawed and imperfect.

Well, I'm not sure I would say "flawed," more like no hand-holding.
 

Offline Bored@Work

  • Super Contributor
  • ***
  • Posts: 3932
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #5 on: March 26, 2014, 04:00:05 pm »
Well, that project seems to start with some unproven assumptions: the old claim that C is just a kind of better assembler, and the similarly old claim that it is easy to write buggy programs in C.

Since these claims are unproven there is actually no reason to read on. That c-semantics project looks awfully like yet another of those "I am too stupid to program, let's blame the language and invent a new vanity language" projects. Next.
I delete PMs unread. If you have something to say, say it in public.
For all else: Profile->[Modify Profile]Buddies/Ignore List->Edit Ignore List
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #6 on: March 26, 2014, 11:17:06 pm »
Quote
Well, that project seems to start with some unproven assumptions: the old claim that C is just a kind of better assembler, and the similarly old claim that it is easy to write buggy programs in C.
Do you really not think it's easy to write buggy programs in C?  This is a truly strange claim.

Quote
Since these claims are unproven there is actually no reason to read on. That c-semantics project looks awfully like yet another of those "I am too stupid to program, let's blame the language and invent a new vanity language" projects. Next.
Except that's the exact opposite of what they're doing.  As far as I understand, they're trying to formalize the C11 standard, and create a C compiler whose output can be trusted to produce object code in compliance with the spec.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #7 on: March 26, 2014, 11:24:16 pm »
It's always seemed strange to me that people see C as a language particularly well-suited to microcontroller development, rather than a flawed language that we ended up with for historical reasons that's just barely good enough for programmers (who are creatures of habit) to not seek alternatives.  Even some things that are trivial to code in assembler, like adding up two signed integers and checking for overflow, are next to impossible to do in C.
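
For reference, a minimal sketch of how the overflow check can be written in portable C, under the assumption that the test is done before the addition, since the overflowing addition itself is undefined behavior for signed types. The helper name add_overflows is hypothetical. It is more awkward than reading a carry or overflow flag in assembler, which is the point being made above:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if a + b would overflow an int.
   The comparison is arranged so that no intermediate expression
   can itself overflow. */
static bool add_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* would exceed INT_MAX */
    if (b < 0)
        return a < INT_MIN - b;   /* would go below INT_MIN */
    return false;                 /* adding zero never overflows */
}
```

A caller would test add_overflows(a, b) first and only then compute a + b. Some compilers (GCC and Clang, for example) also provide a __builtin_add_overflow extension that does the check and the addition in one step, but that is not standard C.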
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #8 on: March 26, 2014, 11:54:06 pm »
Quote
Except that's the exact opposite of what they're doing.  As far as I understand, they're trying to formalize the C11 standard, and create a C compiler whose output can be trusted to produce object code in compliance with the spec.

I should have read the paper in more detail; tjaeger seems to be right. I only skimmed the doc up to section 3.2, and it seemed to imply that
Code:
unsigned int a = 1000, b = 1000;
long int c = a * b;

should give you c = 1000000, but it actually states that the result is defined to be the value that fits in an unsigned int, since the operands are not promoted to the type of the receiving variable; i.e. 0x000F4240 truncated to 16 bits is 0x4240, so the result will be 16960.

So what they are really doing is getting the code to comply with the standard, not redefining the standard, and they are focusing on C1X since it will supersede C99 as far as they are concerned.
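
A minimal sketch of the arithmetic described above. It assumes a target whose unsigned int is 16 bits wide and models that with uint16_t so it can be tried on a desktop compiler; both function names are hypothetical. The first function reproduces the truncated result, the second widens an operand before multiplying:

```c
#include <stdint.h>

/* Models "long c = a * b;" on a target with a 16-bit unsigned int:
   the product is reduced modulo 2^16 before it is widened to long. */
static long mul_as_16bit_unsigned(uint16_t a, uint16_t b)
{
    uint16_t product = (uint16_t)((uint32_t)a * b);
    return (long)product;
}

/* Widening an operand first makes the multiplication happen in long,
   so the full product is kept. */
static long mul_widened(uint16_t a, uint16_t b)
{
    return (long)a * (long)b;
}
```

With a = b = 1000 the first function returns 16960 (0x4240) and the second returns 1000000.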
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #9 on: March 27, 2014, 12:26:08 am »
Quote
It's always seemed strange to me that people see C as a language particularly well-suited to microcontroller development, rather than a flawed language that we ended up with for historical reasons that's just barely good enough for programmers (who are creatures of habit) to not seek alternatives.  Even some things that are trivial to code in assembler, like adding up two signed integers and checking for overflow, are next to impossible to do in C.

Agreed, C is not well suited.
Disagree on programmers not seeking alternatives (they just don't exist, and no one is trying to make a high-level MCU language).
Lastly, C is flexible and will let you check flags etc. by using inline assembly, with macros to support different MCUs, but that opens a can of worms in most environments because macros can be very dangerous if not understood.

Still, C compilers are not as good at generating assembler as most people tend to believe; the common assumption is that the optimizer writes better code than a programmer doing it in assembler in the first place.

For example, MS Visual C doesn't even honor the register storage class specifier, and MPLAB XC8 seems to abuse register 24: each instruction seems to use it as an intermediate step, making some code 5 times bigger or more. It seems to want the PIC to just move stuff around more than actually do work.

Can a new language be developed? Yes. Is someone doing it? We will see. Will it become standard? Probably not, and it will fade away because it's not proven.

The only way a new language will have a chance is if a consortium of all the major MCU manufacturers created it and used it internally. Or if one of the big players created it and adopted it, but that is risky so I doubt it will happen.

 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #10 on: March 27, 2014, 01:54:49 am »
Hands off my beloved language! :-)

I think C is a great language. It does just what you want, and nothing more. Sure there are some ragged edges, but it is the responsibility of the programmer to unambiguously say what they want.

If I was to code  "i = ++i + 1 " then I get all I deserve.

Also, the project can't have it both ways - you won't get "C with no undefined behaviour" if enabling optimizations can cause the same code's output to be different (Yes, I'm looking at you, floating point!).


Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline legacy (Topic starter)

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #11 on: March 27, 2014, 11:35:05 am »
What do I think? The C language is like a carving knife: simple, sharp, and extremely useful in skilled hands, but like any sharp tool, C can injure people who don’t know how to handle it.

The problem is the "skilled hands" part; do not underestimate it. I keep reading too much shit on the web from people who are supposedly "skilled" when they are not.
 

Offline Jarrod Roberson

  • Regular Contributor
  • *
  • Posts: 71
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #12 on: March 27, 2014, 12:56:47 pm »
I agree with the previous post that says the entire charter for this "project" is based on a flawed assumption.

C isn't any more responsible for bad programming than Intel is.

Most of the "complaints" I saw were not C-specific but have more to do with the limitations of the binary representation of numbers. Every language handles those with varying degrees of "correctness", offset of course by various tradeoffs in time and space.

Blaming the language is blaming the tool; picking the wrong tool doesn't make the tool to blame.
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #13 on: March 27, 2014, 01:16:46 pm »
My impression of their objectives is to make C harmless.

Unfortunately, a harmless tool is a useless tool.
 

Offline hans

  • Super Contributor
  • ***
  • Posts: 1720
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #14 on: March 27, 2014, 01:27:30 pm »
I've seen some talks about semantics before with regard to the safety/security of programs written in C/C++. Most of that talk was targeted towards hacking programs, especially code vulnerabilities (as windows & data vulnerabilities are basically non-patchable..).

The overall conclusion was: C(++) is a very powerful language, but it places a lot of responsibility on the programmer. The performance is available, which is why it's still used. "Higher level" languages can't always reach the same level of performance, because they have more run-time checks, which slow programs down but do make them safer. In addition, JIT may add a bit of unpredictability to the mix as well.

The biggest drawback for C/C++ is that these extra semantics add a huge performance hit, especially those that require no modification to the code. For example, there is a semantic that does a static call analysis and keeps a list of possible return addresses for a method. If a hacker modifies the stack (which is possible: a data vulnerability), it is detected and the program won't continue (blindly) to the hacker's code. It would be very effective, but it slows things down a lot, and you need more than just a header file to do this for DLLs too.

Thus I believe there is no "solution" (as with everything in engineering). It's a trade-off. You want the performance? Write C, make assumptions. Want more secure programs? Write in a high-level language, or use these semantics, or add the checks yourself manually in C (though it's easy to overlook what can go wrong or be abused), but in all cases take the performance hit.

Mozilla is developing a new language, "Rust", which relies on the LLVM compiler back end. So you can compile it for your ARM Cortex devices too, and people are already doing it. Maybe I will pick it up one day and see what the benefits are; but the problem with a new language is that you have to port/rewrite/drop/respin legacy libraries, code and "ways of doing things". That's a huge time hit, plus you probably have to drop some platforms which will not keep up (like some stubborn manufacturers who have barely got a C compiler, like Microchip).
« Last Edit: March 27, 2014, 01:30:03 pm by hans »
 

Offline granz

  • Regular Contributor
  • *
  • Posts: 136
  • Country: us
  • 6.62606957
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #15 on: March 27, 2014, 01:44:28 pm »
Quote
It's always seemed strange to me that people see C as a language particularly well-suited to microcontroller development, rather than a flawed language that we ended up with for historical reasons that's just barely good enough for programmers (who are creatures of habit) to not seek alternatives.  Even some things that are trivial to code in assembler, like adding up two signed integers and checking for overflow, are next to impossible to do in C.

Microcontrollers vary widely.  How would you create a universally consistent language for microcontrollers anyhow?  C has several undefined behaviors to keep it from being tied to any particular architecture.  C allows inline asm to check platform dependent flags etc.

Many people have tried for a long time to come up with alternatives.  Interpreted languages such as Java or C# essentially solve all of the undefined behavior issues, but the price is paid for in performance.  On a PC when you are writing application software it's worth it, but on a Microcontroller you aren't writing portable code anyhow--you are very tied to the hardware anyway.
 

Offline Kjelt

  • Super Contributor
  • ***
  • Posts: 6610
  • Country: nl
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #16 on: March 27, 2014, 01:58:30 pm »
Yeah, but you could abstract from the hardware with a HAL API. You then get two worlds: everything below the HAL would be C or asm, and everything above it whatever you want. Still, it would cost quite some code, which would add to the final product cost.
No, if C is written by an experienced C programmer there are few worries. Personally, I think you get into bigger trouble by forcing that experienced C programmer to program in a new higher-level language (s)he is not familiar with.
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 28622
  • Country: nl
    • NCT Developments
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #17 on: March 27, 2014, 03:08:11 pm »
Quote
It is a mis-guided effort, in my view.

C is useful because it is flawed and imperfect.
The answer is already there: C++
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #18 on: March 27, 2014, 05:12:39 pm »
Quote
It's always seemed strange to me that people see C as a language particularly well-suited to microcontroller development, rather than a flawed language that we ended up with for historical reasons that's just barely good enough for programmers (who are creatures of habit) to not seek alternatives.  Even some things that are trivial to code in assembler, like adding up two signed integers and checking for overflow, are next to impossible to do in C.

Quote
Microcontrollers vary widely.  How would you create a universally consistent language for microcontrollers anyhow?  C has several undefined behaviors to keep it from being tied to any particular architecture.  C allows inline asm to check platform dependent flags etc.
That's not really true.  C has implementation-defined behavior to cope with different architectures.  So for example when you cast a signed int to an unsigned int, you can get two's complement, one's complement or sign-magnitude depending on architecture. Undefined behavior is much more insidious and its rationale is to enable certain optimizations (if not outright mean-spiritedness).  So if you're implementing a binary heap and you realize that it costs fewer cycles to move between parent and children in 1-indexed arrays rather than 0-indexed arrays and you do things like
Code:
uint32_t a[16] = {0, };
uint32_t *heap = a-1;
you might be in for a surprise depending on how aggressively your compiler optimizes.
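
The a-1 above is undefined because pointer arithmetic is only defined when the result points into the array or one element past its end. A minimal sketch of one well-defined way to keep 1-based heap indexing, assuming a wasted element is affordable; HEAP_CAPACITY and the helper names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAPACITY 16

/* One extra element so that slots 1..HEAP_CAPACITY are usable;
   slot 0 is simply left unused instead of forming the pointer a-1. */
static uint32_t heap[HEAP_CAPACITY + 1];

/* With 1-based indexing the parent/child moves are a shift or a
   shift plus one, with no extra offset adjustments. */
static size_t heap_parent(size_t i) { return i / 2; }
static size_t heap_left(size_t i)   { return 2 * i; }
static size_t heap_right(size_t i)  { return 2 * i + 1; }
```

The cost is one unused uint32_t; in exchange every pointer the compiler sees stays inside the array.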

Quote
Many people have tried for a long time to come up with alternatives.  Interpreted languages such as Java or C# essentially solve all of the undefined behavior issues, but the price is paid for in performance.  On a PC when you are writing application software it's worth it, but on a Microcontroller you aren't writing portable code anyhow--you are very tied to the hardware anyway.
It's the other way around.  Undefined behavior is a problem when you want to be tied to the hardware as closely as possible because the compiler is under no obligation to translate your code to what you would think it should correspond to.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #19 on: March 27, 2014, 05:15:52 pm »
Quote
It is a mis-guided effort, in my view.

C is useful because it is flawed and imperfect.
The answer is already there: C++
C++ has exactly the same issues with respect to undefined behavior as C.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #20 on: March 27, 2014, 05:18:40 pm »
Quote
My impression of their objectives is to make C harmless.

Unfortunately, a harmless tool is a useless tool.
You'd probably do well to familiarize yourself with their project before criticizing it for something that it's not.
 

Offline GiskardReventlov

  • Frequent Contributor
  • **
  • Posts: 598
  • Country: 00
  • How many pseudonyms do you have?
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #21 on: March 27, 2014, 05:33:41 pm »
Has anyone tried kcc?  I have a seasick-like feeling when I hear about YAC (yet another compiler). Could they not fork an existing compiler in hopes of a future merge? Clang? Gcc? Have I missed the whole point?

Or is kcc a fork of gcc?
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2952
  • Country: gb
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #22 on: March 27, 2014, 05:39:03 pm »
Quote
The answer is already there: C++
But what was the question?

C is (or more accurately was) a fine language - just high level enough to be productive but low enough to get most system tasks done. The analogy to a sharp knife is a good one - you can cut yourself but only if you use the language without care or without a decent understanding of its nuances.

C++ - if we are keeping with the blade analogy - would be some sort of whirling dervish of a gadget with blades sticking out at all angles. Think of the Warner Brothers version of a Tasmanian devil crossed with Edward Scissorhands.

You can occasionally get some good results but you are almost certain to cut yourself somewhere.

Not that this has stopped me designing and writing some large projects in C++ :)

 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #23 on: March 27, 2014, 06:05:28 pm »
Quote
The answer is already there: C++
C is (or more accurately was) a fine language ...

Definitely IS. If you think it's gone and/or replaced by C++ you are mistaken. Try to submit a kernel change in C++, or better yet, try to convince Linus Torvalds that C++ belongs in the Linux kernel.

I'll bring the popcorn for the rest of us :)

BTW I'm 100% with tjaeger, after I actually read the document they don't want to change the language, but they want to make a compiler that complies better with the specs.

Although they are not even close yet, it's a good effort, IMHO.

(edit: I guess I wasn't 100% with tjaeger, I read more and now I'm a naysayer)
« Last Edit: March 28, 2014, 12:55:20 am by miguelvp »
 

Offline mtdoc

  • Super Contributor
  • ***
  • Posts: 3575
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #24 on: March 27, 2014, 06:14:25 pm »
I know nothing about C++ and am just learning C, but I found this plot interesting when presented in the embedded C online course I am taking

 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #25 on: March 27, 2014, 06:27:44 pm »
Quote
Has anyone tried kcc?  I have a seasick-like feeling when I hear about YAC (yet another compiler). Could they not fork an existing compiler in hopes of a future merge? Clang? Gcc? Have I missed the whole point?
No, they couldn't given the goals of their project.  This is not a compiler that spits out object code.  Its output is interpreted by the rewriting logic engine Maude.  The point of this is not to actually run the code in a production environment, but to perform static and run-time analysis (for example to catch undefined behavior).
 

Offline GiskardReventlov

  • Frequent Contributor
  • **
  • Posts: 598
  • Country: 00
  • How many pseudonyms do you have?
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #26 on: March 27, 2014, 06:35:24 pm »
@tjaeger, ok that's good, then I can try it out, I've been building kicad and haven't looked at the code (apparently it's not recommended to look at) but will try kcc on it. Emphasis on "try".
 

Offline Jarrod Roberson

  • Regular Contributor
  • *
  • Posts: 71
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #27 on: March 27, 2014, 06:41:54 pm »
There are lies, damn lies and the TIOBE index.

Do a simple google search and you will find out why this TIOBE index is worse than useless: it doesn't use any scientific principles to gather OR interpret the data.

Any professor who presents the TIOBE index as anything other than something that should be ignored isn't qualified to be teaching anyone anything!
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #28 on: March 27, 2014, 06:49:49 pm »
Quote
@tjaeger, ok that's good, then I can try it out, I've been building kicad and haven't looked at the code (apparently it's not recommended to look at) but will try kcc on it. Emphasis on "try".

kicad is written in C++, so this won't work.  Even if it were written in C, I doubt kcc could handle a project of this scale.
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #29 on: March 27, 2014, 06:54:51 pm »
Quote
but I found this plot interesting when presented in the embedded C online course I am taking

Looks like some embedded engineers are trying to make a living writing code for smart phones and apps.

Good for them.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12590
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #30 on: March 27, 2014, 07:57:55 pm »
Quote
I should have read the paper more in detail, tjaeger seems to be right, I just scanned the doc until section 3.2 and it seemed to imply that
Code:
unsigned int a = 1000, b = 1000;
long int c = a * b;

should give you c = 1000000 but it really states that it's defined to be the value that fits in an unsigned int since it's not promoted to the type of the receiving variable. i.e. 0x000F4240 truncated to 16 bits 0x4240 so the result will be 16960.

So what they really are doing is to get the code to comply with the standard, not redefine the standard and they are focusing on C1X since it will overcome C99 as far as they are concerned.

But the standard defines exactly what should happen in the code sample you posted. In that sample "a * b" is an expression and the evaluation of that expression depends on the variable types of a and b according to very well defined promotion rules. Once the expression is evaluated and a result generated, then something may be done with that result, for example assign it to another variable. The key thing is that what you are going to do with the result later on does not influence how the result is computed to begin with. This is a principle found not only in C.

In regard to your closing paragraph, this behavior has nothing to do with noncompliance with standards. The standard is unambiguous about what should happen there.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #31 on: March 27, 2014, 08:09:04 pm »
Quote
I should have read the paper more in detail, tjaeger seems to be right, I just scanned the doc until section 3.2 and it seemed to imply that
Code:
unsigned int a = 1000, b = 1000;
long int c = a * b;

should give you c = 1000000 but it really states that it's defined to be the value that fits in an unsigned int since it's not promoted to the type of the receiving variable. i.e. 0x000F4240 truncated to 16 bits 0x4240 so the result will be 16960.

So what they really are doing is to get the code to comply with the standard, not redefine the standard and they are focusing on C1X since it will overcome C99 as far as they are concerned.

Quote
But the standard defines exactly what should happen in the code sample you posted. In that sample "a * b" is an expression and the evaluation of that expression depends on the variable types of a and b according to very well defined promotion rules. Once the expression is evaluated and a result generated, then something may be done with that result, for example assign it to another variable. The key thing is that what you are going to do with the result later on does not influence how the result is computed to begin with. This is a principle found not only in C.
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2952
  • Country: gb
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #32 on: March 27, 2014, 08:42:47 pm »
Quote from: IanB
But the standard defines exactly what should happen in the code sample you posted.

Quote from: tjaeger
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

In some ways both of the above statements are actually true.

The promotion rules for the expression evaluation are well defined.

The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).

 

Offline jancumps

  • Supporter
  • ****
  • Posts: 1273
  • Country: be
  • New Low
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #33 on: March 27, 2014, 08:59:27 pm »
Quote
...

Not that this has stopped me designing and writing some large projects in C++ :)
Same here. I earned a decent paycheck for years writing large projects (ERP and aeroplane-handling related) in C++ in the 90s.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #34 on: March 27, 2014, 09:00:07 pm »
Quote
The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).
No, the behavior of the program is undefined.  There are absolutely no guarantees of this sort.  For example, if the next few lines are
Code:
if (c == 0)
  a();
if (c != 0)
  b();
then the compiler is under no obligation to call either a() or b().
 

Offline granz

  • Regular Contributor
  • *
  • Posts: 136
  • Country: us
  • 6.62606957
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #35 on: March 27, 2014, 09:10:32 pm »
Quote
It's always seemed strange to me that people see C as a language particularly well-suited to microcontroller development, rather than a flawed language that we ended up with for historical reasons that's just barely good enough for programmers (who are creatures of habit) to not seek alternatives.  Even some things that are trivial to code in assembler, like adding up two signed integers and checking for overflow, are next to impossible to do in C.

Quote
Microcontrollers vary widely.  How would you create a universally consistent language for microcontrollers anyhow?  C has several undefined behaviors to keep it from being tied to any particular architecture.  C allows inline asm to check platform dependent flags etc.
That's not really true.  C has implementation-defined behavior to cope with different architectures.  So for example when you cast a signed int to an unsigned int, you can get two's complement, one's complement or sign-magnitude depending on architecture. Undefined behavior is much more insidious and its rationale is to enable certain optimizations (if not outright mean-spiritedness).  So if you're implementing a binary heap and you realize that it costs fewer cycles to move between parent and children in 1-indexed arrays rather than 0-indexed arrays and you do things like
Code:
uint32_t a[16] = {0, };
uint32_t *heap = a-1;
you might be in for a surprise depending on how aggressively your compiler optimizes.

Quote
Many people have tried for a long time to come up with alternatives.  Interpreted languages such as Java or C# essentially solve all of the undefined behavior issues, but the price is paid for in performance.  On a PC when you are writing application software it's worth it, but on a Microcontroller you aren't writing portable code anyhow--you are very tied to the hardware anyway.
It's the other way around.  Undefined behavior is a problem when you want to be tied to the hardware as closely as possible because the compiler is under no obligation to translate your code to what you would think it should correspond to.

I'm not quite sure *how* you are disagreeing with me, but it seems you are.  I never said undefined behavior wasn't a problem, and you may have noticed that my first post mentioned that the c-semantics project would have uses.  I just doubt its usefulness for many firmware development projects, since often bugs stem from incorrectly interacting with peripherals etc.  This is how I interpreted the original question.

Undefined behavior is intended to aid in optimizations for sure, but I don't see how that negates it being related to platform differences.

For example, from the LLVM blog:

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Quote
Shifting a uint32_t by 32 or more bits is undefined. My guess is that this originated because the underlying shift operations on various CPUs do different things with this: for example, X86 truncates 32-bit shift amount to 5 bits (so a shift by 32-bits is the same as a shift by 0-bits), but PowerPC truncates 32-bit shift amounts to 6 bits (so a shift by 32 produces zero). Because of these hardware differences, the behavior is completely undefined by C (thus shifting by 32-bits on PowerPC could format your hard drive, it is *not* guaranteed to produce zero). The cost of eliminating this undefined behavior is that the compiler would have to emit an extra operation (like an 'and') for variable shifts, which would make them twice as expensive on common CPUs.

My point was essentially, how would you fix those undefined behaviors in a consistent manner across architectures, without affecting performance on some platforms?  Sure you can identify them in generic C code, but then you just avoid those particular situations.  If you really need the performance you write that part in asm where the behavior is defined by the particular architecture.


 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12590
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #36 on: March 27, 2014, 09:43:41 pm »
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

If the rules say the behavior is undefined in that situation, you have no business writing that code. More fool you.
« Last Edit: March 27, 2014, 09:45:19 pm by IanB »
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #37 on: March 27, 2014, 09:59:58 pm »
It's the other way around.  Undefined behavior is a problem when you want to be tied to the hardware as closely as possible because the compiler is under no obligation to translate your code to what you would think it should correspond to.

I'm not quite sure *how* you are disagreeing with me, but it seems you are.  I never said undefined behavior wasn't a problem, and you may have noticed that my first post mentioned that the c-semantics project would have uses.  I just doubt its usefulness for many firmware development projects, since often bugs stem from incorrectly interacting with peripherals etc.  This is how I interpreted the original question.

Undefined behavior is intended to aid in optimizations for sure, but I don't see how that negates it being related to platform differences.
Undefined behavior is generally not a good way to deal with hardware differences.  The point of undefined behavior is to allow the compiler to make assumptions about the code that it can't prove statically.
Quote
For example, from the LLVM blog:

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Quote
Shifting a uint32_t by 32 or more bits is undefined. My guess is that this originated because the underlying shift operations on various CPUs do different things with this: for example, X86 truncates 32-bit shift amount to 5 bits (so a shift by 32-bits is the same as a shift by 0-bits), but PowerPC truncates 32-bit shift amounts to 6 bits (so a shift by 32 produces zero). Because of these hardware differences, the behavior is completely undefined by C (thus shifting by 32-bits on PowerPC could format your hard drive, it is *not* guaranteed to produce zero). The cost of eliminating this undefined behavior is that the compiler would have to emit an extra operation (like an 'and') for variable shifts, which would make them twice as expensive on common CPUs.
While the post that you're quoting is generally well-written, this paragraph just doesn't make any sense.  The standard committee had a few reasonable choices as to how to deal with shift-past-bitwidth:
  • Define shift-past-bitwidth to be 0: This would incur a performance penalty on some architectures, but programmers could easily get around that if compilers would emit efficient code for "assert(i >=0 && i < 32); foo = bar >> i;"
  • Make this behavior implementation-defined: In that case, you could still rely on a certain behavior on a particular architecture, but your code would be non-portable.
  • Leave the value of the operation unspecified: Now it's perfectly legal to execute a shift-past bitwidth operation, but you don't have any guarantees about what the result is.
Making the behavior of shift-past-bitwidth undefined is not reasonable unless there are architectures that handle it like some do division by zero -- by emitting a trap.
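To make the first option concrete, here is a small sketch of a shift whose result is defined for every count, next to a variant that mimics the x86 masking described in the quoted blog post. Both function names are invented for this example:

```c
#include <stdint.h>

/* Logical right shift that yields 0 for any count >= 32
   instead of invoking undefined behavior. */
uint32_t shr32_or_zero(uint32_t value, unsigned count)
{
    return (count < 32u) ? (value >> count) : 0u;
}

/* Variant that masks the count to 5 bits, as x86 does in hardware,
   so a shift by 32 behaves like a shift by 0. */
uint32_t shr32_masked(uint32_t value, unsigned count)
{
    return value >> (count & 31u);
}
```

Either wrapper costs an extra compare or 'and' on targets where the hardware does not already behave that way, which is the performance trade-off under discussion.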

Quote
My point was essentially, how would you fix those undefined behaviors in a consistent manner across architectures, without affecting performance on some platforms?  Sure you can identify them in generic C code, but then you just avoid those particular situations.  If you really need the performance you write that part in asm where the behavior is defined by the particular architecture.
Having the behavior be implementation-defined rather than undefined would give you most of the benefits.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #38 on: March 27, 2014, 10:02:07 pm »
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

If the rules say the behavior is undefined in that situation, you have no business writing that code. More fool you.
What??? You were the one that wrote
Quote
But the standard defines exactly what should happen in the code sample you posted.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #39 on: March 27, 2014, 10:22:46 pm »
I have to agree that C isn't suited to some targets, as ANSI C makes quite a few assumptions about the hardware model, including:

- That you have a deep stack,
- That you have STDIN, STDOUT and STDERR
- That you have something to return to (e.g. an OS or DOS prompt)
- That your program and data memory space is the same address space

This is why there is no decent 'C' for 16F84 series PICs (which are great microcontrollers BTW).

I guess that AVR were careful when designing the ATmega to ensure that 'C' almost fits their hardware, except maybe for all that silly PROGMEM stuff...
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2952
  • Country: gb
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #40 on: March 27, 2014, 10:29:54 pm »
The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).
No, the behavior of the program is undefined.  There are absolutely no guarantees of this sort.  For example, if the next few lines are
Code: [Select]
if (c == 0)
  a();
if (c != 0)
  b();
then the compiler is under no obligation to call either a() or b().
OK, technically you are completely correct - in fact it would be standards conforming for the compiler to emit code to detect the overflow, print "I am a frog" and then explode, although that's not an especially helpful way to write a compiler. Mind you, ISTR that gcc did detect some #pragmas and, after trying to start hack, rogue or the Towers of Hanoi simulation in emacs, printed the message "You are in a maze of twisty compiler features, all different"

It's not even hard to envisage a scenario where neither of the if's in your example would execute - because the overflow caused a hardware exception and the code never got to that point.

But, assuming that's not the case and that the compiler writers were not especially inventive in their vindictiveness the most likely scenario is that the multiply will silently fail and some value will end up in c - not necessarily 16960 either.

Arguably this is actually less useful than printing "I am a frog" and then exploding  >:D
« Last Edit: March 27, 2014, 11:29:03 pm by grumpydoc »
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2952
  • Country: gb
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #41 on: March 27, 2014, 10:48:15 pm »
I have to agree that C isn't suited to some targets, as ANSI C makes quite a few assumptions about the hardware model, including:

- That you have a deep stack,
- That you have STDIN, STDOUT and STDERR
- That you have something to return to (e.g. an OS or DOS prompt)
- That your program and data memory space is the same address space

This is why there is no decent 'C' for 16F84 series PICs (which are great microcontrollers BTW).

I guess that AVR were careful when designing the ATmega to ensure that 'C' almost fits their hardware, except maybe for all that silly PROGMEM stuff...
I'm not clear that ANSI C does make those assumptions, at least not from the point of view of the language itself. The standard C library might be a different issue.

You don't need a hardware stack and as long as you have some sort of "jump and store the current program counter" and the ability to load the program counter with a computed value (even if via a trampoline) you can manage with a stack implemented in software.

STDIN, STDOUT and STDERR are more library than language concepts. You don't need to #include <stdio.h> to write C code.

There is also no need for "somewhere to return to" since a C program needn't terminate. Some sort of environment to invoke your program is needed but that could itself be written in C, perhaps with a little bit of assembly language thrown in (hint, quite a few OSes are substantially written in C).

I'm pretty certain that C itself does not require a single address space - you do need to be able to manipulate code addresses to the extent of being able to store them and jump (or rather call) to a computed address (otherwise function pointers can't be implemented) but the language doesn't say you can read anything from the memory referenced by a function pointer.

Edit
Quote
This is why there is no decent 'C' for 16F84 series PICs (which are great microcontrollers BTW).

To be fair, if I'm reading the datasheet correctly (I've never used one) the 16F84 has 1024 words of program memory, 68 bytes of RAM and 64 bytes of EEPROM.

It wouldn't be impossible to target that with a C compiler but I'd think it was a bit of a challenge for any high level language.
« Last Edit: March 27, 2014, 10:54:42 pm by grumpydoc »
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12590
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #42 on: March 27, 2014, 11:08:35 pm »
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

If the rules say the behavior is undefined in that situation, you have no business writing that code. More fool you.
What??? You were the one that wrote
Quote
But the standard defines exactly what should happen in the code sample you posted.

But I also read and take account of what others write and include it in my considerations.

The standard says that a particular implementation defines the size of an int on that platform (within certain constraints). The standard defines promotion rules for expressions involving mixed types. The outcome of these rules is that the result of a multiplication between two 16 bit signed integers will be a 16 bit signed integer.

But from what I read subsequently, the standard does not define what the result should be if that multiplication overflows (something I did not consider originally).

That being the case, you should be careful that your multiplications do not overflow if you care about having a portable outcome. One possibility in the case given (where perhaps inputs are in the range 0 .. 1000) is to use "long int" instead of "int", since a long int is guaranteed to have at least 32 bits.
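As a sketch of that suggestion: converting the operands to long before multiplying means the product is computed in an arithmetic type of at least 32 bits, so 1000 * 1000 cannot overflow even where int is 16 bits. The function name is hypothetical:

```c
/* Multiplies two ints in long arithmetic. The casts happen before
   the multiplication, so the product is never computed in int. */
long mul_int_wide(int a, int b)
{
    return (long)a * (long)b;
}
```

This only moves the limit: it is safe for operands in the 0 .. 1000 range mentioned above, but on a platform where int and long are both 32 bits, large operands can still overflow.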
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #43 on: March 28, 2014, 12:38:12 am »
I should have read the paper more in detail, tjaeger seems to be right, I just scanned the doc until section 3.2 and it seemed to imply that
Code: [Select]
unsigned int a = 1000, b = 1000;
long int c = a * b;

should give you c = 1000000 but it really states that it's defined to be the value that fits in an unsigned int, since the product is not promoted to the type of the receiving variable. With a 16-bit int, 0x000F4240 is truncated to 16 bits, giving 0x4240, so the result will be 16960.
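The same arithmetic can be reproduced on a desktop compiler, where unsigned int is usually 32 bits, by using uint16_t for the result. This is only a sketch; the function name is made up, and the operands are widened to uint32_t on purpose:

```c
#include <stdint.h>

/* Multiplies in 32-bit unsigned arithmetic, then reduces modulo 65536
   by converting back to uint16_t. Widening to uint32_t first avoids
   the promotion of uint16_t operands to (signed) int, which could
   overflow for large inputs on a 32-bit-int platform. */
uint16_t mul_u16_wrapping(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}
```

Calling mul_u16_wrapping(1000, 1000) gives 16960, i.e. 0x4240, the low 16 bits of 0x000F4240.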

So what they are really doing is getting the code to comply with the standard, not redefining the standard, and they are focusing on C1X since it will supersede C99 as far as they are concerned.

But the standard defines exactly what should happen in the code sample you posted. In that sample "a * b" is an expression and the evaluation of that expression depends on the variable types of a and b according to very well defined promotion rules. Once the expression is evaluated and a result generated, then something may be done with that result, for example assign it to another variable. The key thing is that what you are going to do with the result later on does not influence how the result is computed to begin with. This is a principle found not only in C.

In regard to your closing paragraph, this behavior is nothing to do with noncompliance with standards. The standard is unambiguous about what should happen there.

Yes, that's what I was trying to say, that when it's defined they leave it alone. I did read the paper further and changed my mind on kcc being a good idea and here is why:

When it's undefined they just stop execution, for example if those variables were signed then the result will be undefined by the standard so it will stop execution in that case and give a run time error. That itself implies programming overhead, I don't want that.

The worst of it is when I found out that they are actually checking for things that are not even part of the C language, for example checking if strcpy is copying out of bounds. I don't consider strcpy part of C so the compiler shouldn't even care about functions or their interpretation on how a function should behave.

So I'm changing sides again and I'm back on the side of this compiler not being worth a thing.

Edit: Clarification, not worth a thing for embedded programming. (also I said numbers when I meant variables so I changed that too).

And btw I love C++ but I love C too and Assembly as well, and even Python. But they are tools that you use depending on the job. Sure I can use a butter knife to unscrew something, but I'd rather use the proper tool for the job.

C++ can be abused even more than C, it has all the problems C has and then adds more.
« Last Edit: March 28, 2014, 12:50:43 am by miguelvp »
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #44 on: March 28, 2014, 01:11:21 am »
There are lies, damn lies and the TIOBE index.

Do a simple google search and you will find out why this TIOBE index is worse than useless, it doesn't use any scientific principles to gather OR interpret the data.

Any professor who presents TIOBE index as anything other than something that should be ignored isn't qualified to be teaching anyone anything!

Interesting site
http://langpop.com/

Btw I use C++ at work 99.999% of the time, so I'm not one of those that think C is better, not for what I do for a living anyways. The other 0.001% minor things in python.

At home playing with small processors, assembler, C, VHDL and Verilog.

C++ is great for very complex systems (meaning hundreds of thousands of lines of code) that run on computers with lots of resources i.e. cycles, memory, bandwidth, etc

C is great for medium-sized, non-complex programs (under 100,000 lines of code), and also for well-defined kernels; in general it suits more compact and efficient systems that are close to the OS or are the OS themselves. git for example is pure C and more efficient than C++ implementations of source control.

Assembly can't be beat when you want the most out of a system, and it's not any more difficult than C.

Python, well, that's a flexible beast for anything else on a system that has enough resources as well, it has everything plus the kitchen sink but I wouldn't use it for anything other than non time critical things. But I could do the time critical things in C++, C or even assembly and import the module to python if needed. Still only for systems with lots of resources. I find python to be a good candidate for embedding a console to an application so you can allow the user to create new behavior instead of making them compile plug-ins.
 

Offline dfmischler

  • Frequent Contributor
  • **
  • Posts: 548
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #45 on: March 28, 2014, 01:43:19 am »
- That your program and data memory space is the same address space
I can assure you that early C implementations could easily deal with separate instruction and data spaces (on some PDP-11 models, for example).  Nothing in ANSI C has changed that.  grumpydoc is correct that there is no guaranteed ability to read instruction memory by type punning (and this does not work when there are separate instruction and data spaces).

 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #46 on: March 28, 2014, 02:33:23 am »
I can assure you that early C implementations could easily deal with separate instruction and data spaces (on some PDP-11 models, for example).  Nothing in ANSI C has changed that.  grumpydoc is correct that there is no guaranteed ability to read instruction memory by type punning (and this does not work when there are separate instruction and data spaces).

Yep, I am sure that is the case. I also learnt C on Real mode 8086, with all its segment registers - _near, _far and _huge pointers and so on. I remember the joy of moving to flat 32-bit address spaces :-)

It isn't only trying to read instruction space. One behaviour that gets a lot of people is excess RAM usage by constant strings / character arrays that should be held in ROM or FLASH. During initialisation the strings have to get copied from the program ROM address space into RAM in case they are ever updated.

Makes perfect sense if you really know what is going on at the lowest levels...


Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #47 on: March 28, 2014, 02:45:07 am »
I also learnt C on Real mode 8086, with all its segment registers - _near, _far and _huge pointers and so on. I remember the joy of moving to flat 32-bit address spaces :-)
Circa 1988
Code: [Select]
void far *finddata()
{
   void far *ptr;
   int *aux;

   aux = settextstyle ;
   aux = aux + 21;
   ptr = ((void far *) (((unsigned long)(_DS) << 16) | (unsigned)(*aux)));
   return (ptr);
}

Yes, that's how you accessed the Data Segment back then, let kcc figure that code out :-DD
« Last Edit: March 28, 2014, 02:47:19 am by miguelvp »
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #48 on: March 28, 2014, 02:52:48 am »
Yes, that's how you accessed the Data Segment back then, let kcc figure that code out :-DD

And don't get me started on using code overlays!

However, one thing I dearly miss is A000:0000-FFFF, where you could write pixels directly to the frame buffer. Oh the good old days!
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #49 on: March 28, 2014, 03:41:21 am »

And don't get me started on using code overlays!

However, one thing I dearly miss is A000:0000-FFFF, where you could write pixels directly to the frame buffer. Oh the good old days!

I had a hercules card so I was bound to B000

From my never released ZX Spectrum emulator that took the emulated frame buffer and displayed it on my hercules monitor, file date 2/24/1989:

Code: [Select]
typedef union {
   unsigned char far *ptr;
   struct {
      unsigned offset;
      unsigned segment;
   } addr;
} memtype ;

...

void refreshscr(unsigned segment)
{
   memtype source,destination;
   int XINC,YINC,x,y;


   source.addr.segment = segment;
   destination.addr.segment = 0xB000;
   YINC = 4;
   XINC = 90;

   for (y=0;y<192;y++) {
       source.addr.offset = 0x4000 + (unsigned)256*(8*(y/64)+y%8)
                          + (unsigned)32*((y%64)/8);
       destination.addr.offset = 8192*(y%YINC) + XINC*(y/YINC);
       for (x=0;x<32;x++)
           destination.ptr[x] = source.ptr[x];
   }
}
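For anyone trying to follow the address arithmetic without a far-pointer compiler, here is a sketch of the two offset calculations pulled out into plain functions. The function names are invented; the constants are the ones in the loop above, with YINC = 4 and XINC = 90:

```c
/* Offset of pixel row y (0..191) in the ZX Spectrum display file.
   The screen is stored as three 64-line thirds; within a third the
   eight pixel rows of each character cell are 256 bytes apart. */
unsigned spectrum_row_offset(unsigned y)
{
    unsigned third     = y / 64;        /* which third of the screen  */
    unsigned pixel_row = y % 8;         /* row inside the 8-line cell */
    unsigned char_row  = (y % 64) / 8;  /* cell row inside the third  */
    return 0x4000u + 256u * (8u * third + pixel_row) + 32u * char_row;
}

/* Offset of row y in the Hercules frame buffer: four interleaved
   banks of 8192 bytes, 90 bytes per scan line. */
unsigned hercules_row_offset(unsigned y)
{
    return 8192u * (y % 4u) + 90u * (y / 4u);
}
```

Row 0 lands at 0x4000, row 1 at 0x4100, row 8 at 0x4020 and the last row, 191, at 0x57E0, which matches the interleaved layout the loop is undoing.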

Yeah, that could be optimized more but hey this is 25 years ago, the C code is 300 lines with comments, the assembler to emulate the z80 was 2000 lines with comments as well. Made my own cross assembler and typed the full disassembly mnemonics from a book that produced the rom. Also had an assembly routine that read the cassette tapes directly to an original IBM PC via the audio DIN connector, respecting the timings of the tape.

Back on topic before I destroy this thread going down memory lane.

The C language only defines the compiler itself, but those guys are trying to define stdlib, and those are just functions. I know it's not a good thing, but I did write many applications that self-modify on purpose, like a run-time compiler that takes a math expression, compiles it in memory to 8087 code (actually the emulated opcodes that would be changed to 8087 ones if the math co-processor was available) and calls it from the main program. Data, program space, who cares? If you can do it, then why not?

So what will kcc do about that?
Let me exploit the language the way I want.

« Last Edit: March 28, 2014, 04:02:00 am by miguelvp »
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #50 on: March 28, 2014, 04:53:11 am »
Yes, that's what I was trying to say, that when it's defined they leave it alone. I did read the paper further and changed my mind on kcc being a good idea and here is why:

When it's undefined they just stop execution, for example if those variables were signed then the result will be undefined by the standard so it will stop execution in that case and give a run time error. That itself implies programming overhead, I don't want that.
Guess what, they'll also give you all possible outcomes if your program is non-deterministic.  This is not meant to be (nor is it even possible for it to be) executed on an embedded processor.  Doesn't mean it's not a worthwhile endeavor.

Quote
The worst of it is when I found out that they are actually checking for things that are not even part of the C language, for example checking if strcpy is copying out of bounds. I don't consider strcpy part of C so the compiler shouldn't even care about functions or their interpretation on how a function should behave.
Nobody gives a crap if you consider strcpy to be part of C or not.  The standard committee clearly did, that's why they included it in the standard.  These things are not hard to look up.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #51 on: March 28, 2014, 05:00:29 am »
Quote
The worst of it is when I found out that they are actually checking for things that are not even part of the C language, for example checking if strcpy is copying out of bounds. I don't consider strcpy part of C so the compiler shouldn't even care about functions or their interpretation on how a function should behave.
Nobody gives a crap if you consider strcpy to be part of C or not.  The standard committee clearly did, that's why they included it in the standard.  These things are not hard to look up.

I guess then all my code has to have the whole stdlib linked in order to run, brilliant!
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #52 on: March 28, 2014, 05:14:19 am »
Quote
The worst of it is when I found out that they are actually checking for things that are not even part of the C language, for example checking if strcpy is copying out of bounds. I don't consider strcpy part of C so the compiler shouldn't even care about functions or their interpretation on how a function should behave.
Nobody gives a crap if you consider strcpy to be part of C or not.  The standard committee clearly did, that's why they included it in the standard.  These things are not hard to look up.

I guess then all my code has to have the whole stdlib linked in order to run, brilliant!

Let me rephrase that, some processors actually have an instruction to copy from source to destination until 0 is found on the source. That's a single instruction that allows overlaps. If I do want to use it that way, I guess the committee won't let me, right?

tempted to put Linus picture giving the finger at MS here :)

Edit: And say what you want, libraries are not part of the C language definition
Edit2: furthermore, I can link to stdlib with cobol if I wanted to, or visual basic, or fortran or pascal or ..... so I guess everything is the C language then. Read the books those are extensions to C by definition not part of it.

Edit3: just for kicks take this test: http://www.indiabix.com/c-programming/library-functions/114001

Edit4: (yeah on a roll here because I don't want to post so I rather edit)
riddle me this, can you recompile the C language itself? now what about the libraries?
And by that I don't mean recompile the compiler, I mean the language.
The libraries are written in C, therefore they can't be C themselves!
The language is just definitions not written in a language but interpreted to make a compiler.
And this following comment is as nasty as I can get: Get a CLue!

Edit5: Multiple choice time, simple questions:
Code: [Select]
1) Associate a verb to a compiler:
    a) Run
    b) Link
2) Associate a verb to a library:
    a) Run
    b) Link
3) What generates assembly code from a source:
    a) Library
    b) Compiler
4) What aggregates assembly code to a program:
    a) Library
    b) Compiler
I could keep going but why?
« Last Edit: March 28, 2014, 06:19:40 am by miguelvp »
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #53 on: March 28, 2014, 06:21:52 am »
I guess then all my code has to have the whole stdlib linked in order to run, brilliant!
Basically, yes, the compiler is free to assume that it can take advantage of the standard library.  Try it out for yourself if you don't believe me:
Code: [Select]
void foo(uint32_t *p) {
  for (int i = 0; i < 128; ++i)
    p[i] = 0;
}
Now run arm-none-eabi-gcc with -O3 and look at the output:
Code: [Select]
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r3, lr}
        mov     r1, #0
        mov     r2, #512
        bl      memset
        ldmfd   sp!, {r3, lr}
        bx      lr
It basically just calls memset.
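The 512 loaded into r2 is simply 128 elements times sizeof(uint32_t). A small sketch of why the compiler is allowed to make that substitution: the loop and the memset call below have the same observable effect. Both function names are illustrative:

```c
#include <stdint.h>
#include <string.h>

/* The loop from the post, unchanged apart from the name. */
void zero_with_loop(uint32_t *p)
{
    for (int i = 0; i < 128; ++i)
        p[i] = 0;
}

/* What the optimizer effectively turned it into:
   128 * sizeof(uint32_t) = 512 bytes set to zero. */
void zero_with_memset(uint32_t *p)
{
    memset(p, 0, 128 * sizeof *p);
}
```

As far as I know, GCC expects memcpy, memmove, memset and memcmp to be available even when building with -ffreestanding, so on a bare-metal target you may have to supply them yourself.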
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #54 on: March 28, 2014, 06:24:52 am »
I guess then all my code has to have the whole stdlib linked in order to run, brilliant!
Basically, yes, the compiler is free to assume that it can take advantage of the standard library.  Try it out for yourself if you don't believe me:
Code: [Select]
void foo(uint32_t *p) {
  for (int i = 0; i < 128; ++i)
    p[i] = 0;
}
Now run arm-none-eabi-gcc with -O3 and look at the output:
Code: [Select]
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r3, lr}
        mov     r1, #0
        mov     r2, #512
        bl      memset
        ldmfd   sp!, {r3, lr}
        bx      lr
It basically just calls memset.

That doesn't define C, it's an implementation, so you mean that when you divide numbers on a processor that doesn't have an instruction and the compiler brings _lldiv then _lldiv is part of C?

Please try again!
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #55 on: March 28, 2014, 06:36:27 am »
The C standard defines the behavior (not the implementation) of the C standard library.  If you want to split hairs and say that this means that the standard library is not a "part of C" then that's fine with me.  If you take that to mean that an analysis tool isn't allowed to check whether a program invokes a standard library function in an undefined way (as you clearly did above), we have a problem.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #56 on: March 28, 2014, 06:43:45 am »
The C standard defines the behavior (not the implementation) of the C standard library.  If you want to split hairs and say that this means that the standard library is not a "part of C" then that's fine with me.  If you take that to mean that an analysis tool isn't allowed to check whether a program invokes a standard library function in an undefined way (as you clearly did above), we have a problem.

But strcpy is well defined (even if it's not part of the C language), it takes a source and a destination and copies from the source to the destination until the source reaches the value 0 and copies that to the end of the buffer and it's done.

I don't want some tool to tell me that I'm misusing it because they overlap or I don't have enough space.
I might want to avoid bringing the strtok functions due to space limitations and use strcpy to tokenize strings just because I'm already using it somewhere and using strcpy by making it overrun buffers if I feel that's more optimal for my program.
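For comparison, this is roughly what a copy that stays within the destination looks like when the size is known. It is only a sketch with an invented name, and it still assumes the two buffers do not overlap, since memcpy has the same restriction as strcpy there:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copies the NUL-terminated string src into dst only when it fits,
   terminator included. Returns false and leaves dst untouched when
   dst_size is too small. The buffers must not overlap. */
bool copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;    /* +1 for the terminating NUL */
    if (needed > dst_size)
        return false;
    memcpy(dst, src, needed);
    return true;
}
```

Writing past the end of the destination is exactly the case a checker like kcc reports, because the standard gives that write no defined meaning.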

I have a better brain than the compiler and if I know what I want to do, let me!

 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #57 on: March 28, 2014, 06:50:49 am »
Let me rephrase that, some processors actually have an instruction to copy from source to destination until 0 is found on the source. That's a single instruction that allows overlaps. If I do want to use it that way, I guess the committee won't let me, right?
You really don't appear to understand undefined behavior.  If you have two overlapping ranges and you want to copy from one to the other, you are not allowed to use strcpy, period.  So if your compiler can statically prove that the ranges overlap, it is perfectly within its rights to "optimize out" the call to strcpy altogether or do any number of nasty things.  The only way to rely on your architecture's behavior is to use inline asm.
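
For what it's worth, here is a minimal sketch of the portable way to copy between overlapping ranges: memmove is specified to handle overlap, where strcpy and memcpy are not. The helper name drop_prefix is made up for illustration and does not come from anyone's code in this thread.

```c
#include <string.h>

/* Hypothetical helper (not from the thread): drop the first n characters
 * of a NUL-terminated string in place. Source and destination overlap, so
 * strcpy(s, s + n) would be undefined; memmove is defined for overlap. */
static char *drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);

    if (n > len)
        n = len;                     /* clamp so we never read past the NUL */
    memmove(s, s + n, len - n + 1);  /* +1 also moves the terminating NUL */
    return s;
}
```

With a buffer holding "hello world", drop_prefix(buf, 6) should leave "world" in the same buffer, and no compiler is entitled to assume the ranges don't overlap.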
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #58 on: March 28, 2014, 06:52:29 am »
The C standard defines the behavior (not the implementation) of the C standard library.  If you want to split hairs and say that this means that the standard library is not a "part of C" then that's fine with me.  If you take that to mean that an analysis tool isn't allowed to check whether a program invokes a standard library function in an undefined way (as you clearly did above), we have a problem.

But strcpy is well defined (even if it's not part of the C language), it takes a source and a destination and copies from the source to the destination until the source reaches the value 0 and copies that to the end of the buffer and it's done.

I don't want some tool to tell me that I'm misusing it because they overlap or I don't have enough space.
I might want to avoid bringing the strtok functions due to space limitations and use strcpy to tokenize strings just because I'm already using it somewhere and using strcpy by making it overrun buffers if I feel that's more optimal for my program.

I have a better brain than the compiler and if I know what I want to do, let me!
Then you have to use a different language than C.  Sorry, them's the rules.
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #59 on: March 28, 2014, 07:06:34 am »
Let me rephrase that, some processors actually have an instruction to copy from source to destination until 0 is found on the source. That's a single instruction that allows overlaps. If I do want to use it that way, I guess the committee won't let me, right?
You really don't appear to understand undefined behavior.  If you have two overlapping ranges and you want to copy from one to the other, you are not allowed to use strcpy, period.  So if your compiler can statically prove that the ranges overlap, it is perfectly within its rights to "optimize out" the call to strcpy altogether or do any number of nasty things.  The only way to rely on your architecture's behavior is to use inline asm.

I do understand undefined behavior, but I also understand historical implementations.
memcpy and strcpy have always been implemented from low memory to high memory.
Yeah, I get that I can do inline assembly, but why if I don't need to?

Then again, neither of them is part of C; that's my point. I can rewrite them the way I want and link them instead of the ones someone else wrote (in C at that).
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #60 on: March 28, 2014, 07:25:07 am »
The C standard defines the behavior (not the implementation) of the C standard library.  If you want to split hairs and say that this means that the standard library is not a "part of C" then that's fine with me.  If you take that to mean that an analysis tool isn't allowed to check whether a program invokes a standard library function in an undefined way (as you clearly did above), we have a problem.

But strcpy is well defined (even if it's not part of the C language), it takes a source and a destination and copies from the source to the destination until the source reaches the value 0 and copies that to the end of the buffer and it's done.

I don't want some tool to tell me that I'm misusing it because they overlap or I don't have enough space.
I might want to avoid bringing the strtok functions due to space limitations and use strcpy to tokenize strings just because I'm already using it somewhere and using strcpy by making it overrun buffers if I feel that's more optimal for my program.

I have a better brain than the compiler and if I know what I want to do, let me!
Then you have to use a different language than C.  Sorry, them's the rules.

Again, they have nothing to do with C; they are extensions, i.e. object files that get linked. C doesn't even do I/O. That's the beauty of C: it only cares about code generation, and I don't want some compiler telling me how to use libraries that I can write myself to begin with.

Take for example asctime, what if I don't want it in English and have day of week, month, hour, minute, second & year? Oh but the standard is English, so adapt right?

Well it's the standard! you'll say, but that's not even C, and that's my point.

Anyway, I'm done; I've spent too much time on this. I won't adopt kcc and will stick to my tools. It's been a while since I needed a babysitter.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #61 on: March 28, 2014, 07:26:59 am »
You really don't appear to understand undefined behavior.  If you have two overlapping ranges and you want to copy from one to the other, you are not allowed to use strcpy, period.  So if your compiler can statically prove that the ranges overlap, it is perfectly within its rights to "optimize out" the call to strcpy altogether or do any number of nasty things.  The only way to rely on your architecture's behavior is to use inline asm.

I do understand undefined behavior, but I also understand historical implementations.
memcpy and strcpy have always been implemented from low memory to high memory.
Yeah, I get that I can do inline assembly, but why if I don't need to?

Then again, neither of them is part of C; that's my point. I can rewrite them the way I want and link them instead of the ones someone else wrote (in C at that).
It doesn't work that way.  The compiler is allowed to assume that if it encounters a function named strcpy, this function's behavior conforms to the C standard.  And to do whatever it wants if it encounters undefined behavior.  Most compilers won't be that mean-spirited, but they could if they wanted to.  Try it out for yourself:
Code: [Select]
#include <stdint.h>
#include <string.h>

void foo(uint32_t *p) {
  memset(p, 0, 4);
}
compiles to
Code: [Select]
foo:
        .cfi_startproc
        movl    $0, (%rdi)
        ret
        .cfi_endproc
This happens before any linking takes place.
 

Offline andersm

  • Super Contributor
  • ***
  • Posts: 1198
  • Country: fi
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #62 on: March 28, 2014, 07:37:13 am »
Quote from: IanB
But the standard defines exactly what should happen in the code sample you posted.
Quote from: tjaeger
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

In some ways both of the above statements are actually true.

The promotion rules for the expression evaluation are well defined.

The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).
Quote from: ISO/IEC 9899:TC3 §6.2.5.9
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The fragment in question used unsigned integers, so there is nothing undefined about it.

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #63 on: March 28, 2014, 07:44:57 am »
here is my foo with memset included on a P12F609

Code: [Select]
_memset:
;__Lib_CString.c,77 ::
;__Lib_CString.c,80 ::
0x0009 0x1283      BCF        STATUS, 5
0x000A 0x084B      MOVF       FARG_memset_p1, 0
0x000B 0x00F2      MOVWF      R2
;__Lib_CString.c,81 ::
L_memset20:
0x000C 0x084D      MOVF       FARG_memset_n, 0
0x000D 0x00F0      MOVWF      R0
0x000E 0x084E      MOVF       FARG_memset_n+1, 0
0x000F 0x00F1      MOVWF      R0+1
0x0010 0x3001      MOVLW      1
0x0011 0x02CD      SUBWF      FARG_memset_n, 1
0x0012 0x1C03      BTFSS      STATUS, 0
0x0013 0x03CE      DECF       FARG_memset_n+1, 1
0x0014 0x0870      MOVF       R0, 0
0x0015 0x0471      IORWF      R0+1, 0
0x0016 0x1903      BTFSC      STATUS, 2
0x0017 0x281E      GOTO       L_memset21
;__Lib_CString.c,82 ::
0x0018 0x0872      MOVF       R2, 0
0x0019 0x0084      MOVWF      FSR
0x001A 0x084C      MOVF       FARG_memset_character, 0
0x001B 0x0080      MOVWF      INDF
0x001C 0x0AF2      INCF       R2, 1
0x001D 0x280C      GOTO       L_memset20
L_memset21:
;__Lib_CString.c,83 ::
0x001E 0x084B      MOVF       FARG_memset_p1, 0
0x001F 0x00F0      MOVWF      R0
;__Lib_CString.c,84 ::
L_end_memset:
0x0020 0x0008      RETURN
; end of _memset
_foo:
;MyProject.c,1 :: void foo(unsigned long *p) {
;MyProject.c,2 :: memset(p, 0, 4);
0x0021 0x1283      BCF        STATUS, 5
0x0022 0x084A      MOVF       FARG_foo_p, 0
0x0023 0x00CB      MOVWF      FARG_memset_p1
0x0024 0x01CC      CLRF       FARG_memset_character
0x0025 0x3004      MOVLW      4
0x0026 0x00CD      MOVWF      FARG_memset_n
0x0027 0x3000      MOVLW      0
0x0028 0x00CE      MOVWF      FARG_memset_n+1
0x0029 0x2009      CALL       _memset
;MyProject.c,3 :: }
L_end_foo:
0x002A 0x0008      RETURN
; end of _foo

Seems it has no trouble at all, and that's as primitive as a chip can be.

Edit: But yeah it can be optimized if you are setting a 32bit unsigned integer and your processor can clear it that way. But memcpy and strcpy can be optimized that way.

Still memcpy, memset, strcpy etc are not part of C, implementation aside.
« Last Edit: March 28, 2014, 08:00:22 am by miguelvp »
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #64 on: March 28, 2014, 07:50:19 am »
Quote from: IanB
But the standard defines exactly what should happen in the code sample you posted.
Quote from: tjaeger
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

In some ways both of the above statements are actually true.

The promotion rules for the expression evaluation are well defined.

The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).
Quote from: ISO/IEC 9899:TC3 §6.2.5.9
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The fragment in question used unsigned integers, so there is nothing undefined about it.
This was the fragment in question, as far as I can tell.
Code: [Select]
unsigned int a = 1000, b = 1000;
long int c = a * b;
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #65 on: March 28, 2014, 07:59:04 am »
here is my foo with memset included on a P12F609

Code: [Select]
_memset:
;__Lib_CString.c,77 ::
;__Lib_CString.c,80 ::
0x0009 0x1283      BCF        STATUS, 5
0x000A 0x084B      MOVF       FARG_memset_p1, 0
0x000B 0x00F2      MOVWF      R2
;__Lib_CString.c,81 ::
L_memset20:
0x000C 0x084D      MOVF       FARG_memset_n, 0
0x000D 0x00F0      MOVWF      R0
0x000E 0x084E      MOVF       FARG_memset_n+1, 0
0x000F 0x00F1      MOVWF      R0+1
0x0010 0x3001      MOVLW      1
0x0011 0x02CD      SUBWF      FARG_memset_n, 1
0x0012 0x1C03      BTFSS      STATUS, 0
0x0013 0x03CE      DECF       FARG_memset_n+1, 1
0x0014 0x0870      MOVF       R0, 0
0x0015 0x0471      IORWF      R0+1, 0
0x0016 0x1903      BTFSC      STATUS, 2
0x0017 0x281E      GOTO       L_memset21
;__Lib_CString.c,82 ::
0x0018 0x0872      MOVF       R2, 0
0x0019 0x0084      MOVWF      FSR
0x001A 0x084C      MOVF       FARG_memset_character, 0
0x001B 0x0080      MOVWF      INDF
0x001C 0x0AF2      INCF       R2, 1
0x001D 0x280C      GOTO       L_memset20
L_memset21:
;__Lib_CString.c,83 ::
0x001E 0x084B      MOVF       FARG_memset_p1, 0
0x001F 0x00F0      MOVWF      R0
;__Lib_CString.c,84 ::
L_end_memset:
0x0020 0x0008      RETURN
; end of _memset
_foo:
;MyProject.c,1 :: void foo(unsigned long *p) {
;MyProject.c,2 :: memset(p, 0, 4);
0x0021 0x1283      BCF        STATUS, 5
0x0022 0x084A      MOVF       FARG_foo_p, 0
0x0023 0x00CB      MOVWF      FARG_memset_p1
0x0024 0x01CC      CLRF       FARG_memset_character
0x0025 0x3004      MOVLW      4
0x0026 0x00CD      MOVWF      FARG_memset_n
0x0027 0x3000      MOVLW      0
0x0028 0x00CE      MOVWF      FARG_memset_n+1
0x0029 0x2009      CALL       _memset
;MyProject.c,3 :: }
L_end_foo:
0x002A 0x0008      RETURN
; end of _foo

Seems it has no trouble at all, and that's as primitive as a chip can be.
Congratulations! On this particular compiler, with the particular optimization settings you used, memset doesn't get "inlined".  So you'll probably be fine with an overlapping memcpy, too.  You'd still be relying on undefined behavior, though.  Just because you can get away with it doesn't make it right.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 12590
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #66 on: March 28, 2014, 08:00:19 am »
Quote from: ISO/IEC 9899:TC3 §6.2.5.9
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The fragment in question used unsigned integers, so there is nothing undefined about it.

Thanks for that.

So the standard says that "1000 * 1000" with unsigned 16 bit integer types must produce 1000000 modulo 65536 = 16960? And that this is not implementation defined, but required. Now that I have access to my copy of K&R I see that this is indeed the case.
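
A quick way to check that arithmetic on a desktop compiler, as a sketch only: uint16_t stands in for the 16-bit unsigned int under discussion, and the function name mul_u16_wrap is invented for illustration. The product is formed in uint32_t on purpose, because on a host with 32-bit int the operands would otherwise be promoted to signed int before the multiply.

```c
#include <stdint.h>

/* Assumption: uint16_t models a 16-bit unsigned int. The multiplication is
 * done in uint32_t (the largest product, 65535 * 65535, fits), and the
 * conversion back to uint16_t reduces the result modulo 65536. */
static uint16_t mul_u16_wrap(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}
```

mul_u16_wrap(1000, 1000) gives 16960, i.e. 1000000 - 15 * 65536, matching the figure above.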
 

Offline miguelvp

  • Super Contributor
  • ***
  • Posts: 5550
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #67 on: March 28, 2014, 08:03:52 am »
Quote from: IanB
But the standard defines exactly what should happen in the code sample you posted.
Quote from: tjaeger
No. In this example, sizeof(int) was supposed to be 2 and the behavior of the code fragment is undefined.

In some ways both of the above statements are actually true.

The promotion rules for the expression evaluation are well defined.

The value of what gets assigned to c is undefined purely because C does not define what happens when an integer expression overflows (normally it's just whatever the underlying CPU does).
Quote from: ISO/IEC 9899:TC3 §6.2.5.9
The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The fragment in question used unsigned integers, so there is nothing undefined about it.
This was the fragment in question, as far as I can tell.
Code: [Select]
unsigned int a = 1000, b = 1000;
long int c = a * b;

Even the OP paper mentions that the operation is totally defined

Quote
What if we make the types of a and b unsigned (0 to 65535)?
Code: [Select]
unsigned int a = 1000, b = 1000;
long int c = a * b;
Here, the arithmetic is again performed at the level of the operands,
but overflow on unsigned types is completely defined in C. The result
is computed by simply reducing the value modulo one more than
the max value [15, §6.3.1.3:2]. 1000000 mod 65536 gives us 16960
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #68 on: March 28, 2014, 08:07:05 am »
This was the fragment in question, as far as I can tell.
Code: [Select]
unsigned int a = 1000, b = 1000;
long int c = a * b;
Sorry about this, I finally see what you are saying. Yes, you're absolutely right, this is well-defined.  In my head, the fragment read as follows the whole time (which is actually the first example they give in the paper).
Code: [Select]
int a = 1000, b = 1000;
long int c = a * b;
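
For the signed version, the usual fix is to widen one operand before multiplying so the product is computed in the wider type. A minimal sketch, assuming int16_t stands in for a 16-bit int and int32_t for long; the name mul_i16_wide is made up for this example.

```c
#include <stdint.h>

/* Assumption: int16_t models a 16-bit int, int32_t models long. Casting
 * before the multiply makes the arithmetic happen in 32 bits, where every
 * product of two 16-bit values is representable, so nothing overflows. */
static int32_t mul_i16_wide(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
```

In the original fragment that corresponds to writing long int c = (long int)a * b; instead of letting the multiplication happen at the width of int.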
 

Offline grumpydoc

  • Super Contributor
  • ***
  • Posts: 2952
  • Country: gb
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #69 on: March 28, 2014, 08:08:33 am »
Quote
In my head, the fragment read as follows the whole time (which is actually the first example they give in the paper).
You're not the only one to have read it like that - I did as well.
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #70 on: March 28, 2014, 08:38:27 am »
here is my foo with memset included on a P12F609

Code: [Select]
...

Seems it has no trouble at all, and that's as primitive as a chip can be.

Still memcpy, memset, strcpy etc are not part of C, implementation aside.

How about a recursive function?

Code: [Select]
   unsigned factorial(unsigned n)
   {
      return n > 1 ? n * factorial(n-1) : 1;
   }

IIRC that chip has 8 levels of call stack - do you think factorial(8) will work? How is the support for floats and doubles, which are also part of standard C? How about using pointers to functions?

I guess a really smart compiler could implement jump tables and other tricks, but it is pretty clear to me that C isn't a natural fit for these MCUs.
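
On a part with a shallow hardware call stack, the recursive factorial above would normally be rewritten as a loop. A sketch, with factorial_iter being a name chosen here for illustration:

```c
/* Iterative version of the recursive factorial above. It needs a single
 * call frame no matter how large n is, so call-stack depth is not a limit.
 * The result still wraps once it exceeds the range of unsigned long. */
static unsigned long factorial_iter(unsigned n)
{
    unsigned long result = 1;

    while (n > 1)
        result *= n--;
    return result;
}
```

factorial_iter(8) is 40320, which would still fit in a 16-bit unsigned value.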
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #71 on: March 28, 2014, 08:51:41 am »
And to do whatever it wants if it encounters undefined behavior.  Most compilers won't be that mean-spirited, but they could if they wanted to.
Maybe I was a little hasty here.  Check out gcc (version 4.8.2):
Code: [Select]
#include <stdint.h>
#include <string.h>
#include <stdio.h>

uint64_t __attribute__ ((noinline)) foo(size_t i) {
  uint64_t a[3] = {1, 2, 3};
  memcpy(a+1, a, 2*sizeof(uint64_t));
  return a[i];
}

uint64_t bar(size_t i) {
  uint64_t a[3] = {1, 2, 3};
  memcpy(a+1, a, 2*sizeof(uint64_t));
  return a[i];
}

void *memcpy(void *dest, const void *src, size_t n) {
  printf("Why won't anyone pay attention to me?\n");
  return NULL;
}

int main() {
  printf("%lu, %lu\n", foo(2), bar(2));
}
If I compile with -O2 this outputs
Code: [Select]
1, 2
The only difference between foo() and bar() is that foo() is not allowed to be inlined.  For both clang and gcc, the output of the program changes with optimization settings.
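
For comparison, a sketch of the same copy written with memmove, which is defined for overlapping ranges and so should give the same answer at every optimization level. The function name shifted_element is invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Same shift as foo()/bar() above, but with memmove, which is defined for
 * overlapping source and destination. */
static uint64_t shifted_element(size_t i)
{
    uint64_t a[3] = {1, 2, 3};

    memmove(a + 1, a, 2 * sizeof(uint64_t));  /* a is now {1, 1, 2} */
    return a[i];                              /* caller must keep i < 3 */
}
```

Here shifted_element(2) is 2 regardless of inlining, because the behavior no longer depends on how the compiler chooses to expand the copy.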
 

Offline hamster_nz

  • Super Contributor
  • ***
  • Posts: 2823
  • Country: nz
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #72 on: March 28, 2014, 09:05:26 am »
Code: [Select]
uint64_t __attribute__ ((noinline)) foo(size_t i) {
  uint64_t a[3] = {1, 2, 3};
  memcpy(a+1, a, 2*sizeof(uint64_t));
  return a[i];
}

For some reason, seeing non-static arrays declared within functions with their elements being initialized makes me feel a little bit ill... I know it is a contrived example but EEEWWWW!
Gaze not into the abyss, lest you become recognized as an abyss domain expert, and they expect you keep gazing into the damn thing.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #73 on: March 28, 2014, 12:54:56 pm »
the issue seems related to uint64_t
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #74 on: March 28, 2014, 12:57:32 pm »
real men code;

everybody else debates about coding.

:)
================================
https://dannyelectronics.wordpress.com/
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #75 on: March 28, 2014, 01:20:37 pm »
Quote
Cyclone is a safe dialect of C.

Cyclone is like C: it has pointers and pointer arithmetic, structs, arrays, goto, manual memory management, and C’s preprocessor and syntax.
Cyclone adds features such as pattern matching, algebraic datatypes, exceptions, region-based memory management, and optional garbage collection.
Cyclone is safe: pure Cyclone programs are not vulnerable to a wide class of bugs that plague C programs: buffer overflows, format string attacks, double free bugs, dangling pointer accesses, etc.

Have you seen the Cyclone programming language?

Unfortunately the last version was released in 2006 and … it seems dead; it also has issues with modern gcc compilers.
 

Offline tjaeger

  • Regular Contributor
  • *
  • Posts: 101
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #76 on: March 28, 2014, 03:49:59 pm »
the issue seems related to uint64_t
There is no issue.  memcpy() is undefined for overlapping memory areas, so compilers can do whatever they want.
 

Offline GiskardReventlov

  • Frequent Contributor
  • **
  • Posts: 598
  • Country: 00
  • How many pseudonyms do you have?
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #77 on: March 28, 2014, 09:22:58 pm »
kicad is written in C++, so this won't work.  Even if it were written in C, I doubt kcc could handle a project of this scale.

You know I knew kicad was written in C++.... Well anyway I will try it out on some other C code. I think Bind is C.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #78 on: March 28, 2014, 10:05:48 pm »
Have you seen Cyclone?

I'd like to compile it, but it seems to have issues with modern compilers/systems; I am having issues on gentoo/x86 with gcc-3.4.6.

Also, has anybody tried to compile and install the C-semantics tool?

I am considering both of these tools as "C validators". On that note, do you know a good MISRA C validator? Anything commercial (but cheap) or open source around?
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4382
  • Country: us
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #79 on: March 29, 2014, 03:07:36 am »
Quote
programmers can take advantage of assumptions about the underlying architecture.  ... These ideas often work in concert to yield intricate, platform-dependent bugs
I'm not sure about that "often" part.
Yes, C has a bunch of "undefined" and implementation-dependent areas that can be annoying.
(and probably impossible to fix, although not necessarily impossible to check for.  Endianness?)

But I bet you could fix more actual real-world bugs by adding a "string" datatype and associated set of functions and operators, and having people use them when they needed, you know, STRINGS.
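
To make that concrete, here is a rough sketch of what a length-tracked string with a bounds-checked append could look like in plain C. Everything here (str_buf, str_init, str_append) is hypothetical and invented for illustration; it is not an existing library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical string type: caller-supplied storage plus explicit length
 * and capacity, so every operation can check bounds. */
typedef struct {
    char  *data;  /* always NUL-terminated */
    size_t len;   /* characters stored, not counting the NUL */
    size_t cap;   /* total bytes available in data, including the NUL */
} str_buf;

/* Returns 0 on success, -1 if there is no room even for the NUL. */
static int str_init(str_buf *s, char *storage, size_t cap)
{
    if (cap == 0)
        return -1;
    s->data = storage;
    s->len = 0;
    s->cap = cap;
    s->data[0] = '\0';
    return 0;
}

/* Appends src (which must not point into s->data). Returns 0 on success,
 * -1 and leaves s unchanged if the result would not fit. */
static int str_append(str_buf *s, const char *src)
{
    size_t n = strlen(src);

    if (n >= s->cap - s->len)  /* need len + n + 1 <= cap */
        return -1;
    memcpy(s->data + s->len, src, n + 1);
    s->len += n;
    return 0;
}
```

The point of the design is that an append that would overrun the buffer fails with an error code instead of invoking undefined behavior.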
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #80 on: March 29, 2014, 05:01:30 pm »
Quote
do you know good MISRA C validator ?

Lint is widely used.

I personally use IAR's built-in MISRA validators - not as elaborate, but a good compromise between functionality and cost. You can also try Green Hills, but they are pricey.

MISRA to me is the starting point of good coding habits / common sense practice. If you can be 90% MISRA compliant, you are doing a very good job.
================================
https://dannyelectronics.wordpress.com/
 

Offline Q-Kernel

  • Contributor
  • Posts: 13
Re: the c-semantics project is aimed against - C undefined behavior -
« Reply #81 on: April 02, 2014, 03:10:26 am »
do you know good MISRA C validator ? Anything commercial (but cheap) or open source around ?

PC-Lint $389
 

