EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: MikeK on August 05, 2023, 10:51:45 pm

Title: Program output different on two machines
Post by: MikeK on August 05, 2023, 10:51:45 pm
I have a simple C program on two machines. One desktop, one laptop...both running Linux Mint 21 and the same version of gcc. But when I compile and run the program I get different output (a file) on each machine. The output files have the same number of bytes, but display differently. I wonder if one of the machines is missing a language pack or something like that? How can I figure this out and get them to behave identically?

Attached is a screenshot showing both files in Notepad++ (yes, from my Win10 machine, but that isn't the problem).  Also attached is the first record from each machine, in hex.  And I might as well attach the program (mmap_eg.c).

Any clue what is going on?

[attachimg=1]

[attachimg=2]

[attachimg=3]
Title: Re: Program output different on two machines
Post by: ataradov on August 05, 2023, 11:06:32 pm
You are not fully initializing the stack variable, so whatever comes after the printed string is left filled with random garbage from the stack.

memset() the "record" variable to 0 before sprintf() if you want consistent results.
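
A minimal sketch of that suggestion, assuming the RECORD layout from the program under discussion. The helper names make_record() and records_identical() are made up for illustration and are not part of the original program. The idea is to give every byte of the struct a defined value before it is written out raw:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int integer;
    char string[24];
} RECORD;

/* Hypothetical helper: fill *rec so that every byte, including the tail of
   rec->string, has a defined value before the struct is written out raw. */
void make_record(RECORD *rec, int n)
{
    assert(rec != NULL);
    memset(rec, 0, sizeof *rec);   /* clear all bytes first */
    rec->integer = n;
    snprintf(rec->string, sizeof rec->string, "RECORD-%d", n);
}

/* Hypothetical helper: returns 1 if the two records are byte-identical. */
int records_identical(const RECORD *a, const RECORD *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof *a) == 0;
}
```

With this, two records built from the same number should compare equal byte for byte no matter what was in the memory beforehand, which is what makes the output files match across machines.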
Title: Re: Program output different on two machines
Post by: MikeK on August 05, 2023, 11:45:52 pm
Both outputs are identical now.  Thanks, Alex.

Code: [Select]
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int integer;
    char string[24];
} RECORD;

#define NRECORDS (100)

int main(void) {
    RECORD record, *mapped;
    int i, f;
    FILE *fp;

    fp = fopen("records.dat", "w+");
    for (i=0; i<NRECORDS; i++) {
        record.integer = i;
        memset(record.string, 0, 24);
        sprintf(record.string, "RECORD-%d", i);
        fwrite(&record, sizeof(record), 1, fp);
    }
    fclose(fp);

    fp = fopen("records.dat", "r+");
    fseek(fp, 43*sizeof(record), SEEK_SET);
    fread(&record, sizeof(record), 1, fp);

    record.integer = 143;
    memset(record.string, 0, 24);
    sprintf(record.string, "RECORD-%d", record.integer);

    fseek(fp, 43*sizeof(record), SEEK_SET);
    fwrite(&record, sizeof(record), 1, fp);
    fclose(fp);

    f = open("records.dat", O_RDWR);
    mapped = (RECORD *)mmap(0, NRECORDS*sizeof(record),
                        PROT_READ|PROT_WRITE, MAP_SHARED, f, 0);

    mapped[43].integer = 243;
    memset(mapped[43].string, 0, 24);   /* clear the mapped record, not the local copy */
    sprintf(mapped[43].string, "RECORD-%d", mapped[43].integer);

    msync((void *)mapped, NRECORDS*sizeof(record), MS_ASYNC);
    munmap((void *)mapped, NRECORDS*sizeof(record));
    close(f);

    exit(0);
}
Title: Re: Program output different on two machines
Post by: TheCalligrapher on August 06, 2023, 01:45:12 am
Quote
memset() the "record" variable to 0 before sprintf() if you want consistent results.

... which would be a questionable practice, unless you absolutely positively have to set padding bytes to zero as well. `memset`? Do not use `memset`. You have `= { 0 }` initializers for that purpose (and `= {}` starting from C23)

Code: [Select]
RECORD record = { 0 };

and even where initialization is not an option, assignment from a compound literal is a better idea than `memset`

Code: [Select]
RECORD record;
...
record = (RECORD) { 0 };

The OP's code is poorly structured, since it follows the bad practice of declaring the variables at the beginning of the function. If the OP opted to declare `record` locally, they'd be able to simply use `= { 0 }` initializer at the point of declaration.

P.S. Having said that, the OP's code might indeed be better off with `memset`, since the object is then streamed to a binary stream in its entirety, padding bytes and all. Even if padding byte values might not matter, it is better to zero them out before sending the struct through a raw binary interface. But this is only true because C language seems to remain lazy/undecided on matters of padding initialization. Let's see what C23 will look like in that regard once it is ready.
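
To make the padding point concrete, here is a rough sketch. The struct and function names are hypothetical, and the actual amount of padding is implementation-defined. `= { 0 }` is only required to zero the members, while `memset` clears every byte of the object:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on typical ABIs there are padding bytes between
   'tag' and 'value', but the exact layout is implementation-defined. */
struct padded {
    char tag;
    int  value;
};

/* Number of bytes in struct padded that belong to no member. */
size_t padding_bytes(void)
{
    return sizeof(struct padded) - sizeof(char) - sizeof(int);
}

/* Zero every byte of *p, padding included, then set the members. */
void init_for_raw_write(struct padded *p, char tag, int value)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->tag = tag;
    p->value = value;
}
```

Strictly speaking the standard lets padding bytes take unspecified values whenever a member is stored, so this is a practical habit rather than a hard guarantee. In practice mainstream compilers leave the padding alone after the memset.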
Title: Re: Program output different on two machines
Post by: MikeK on August 06, 2023, 02:19:32 am
That code wasn't mine; it's taken directly from a Linux Programming book.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 06, 2023, 02:23:44 am
Quote
That code wasn't mine; it's taken directly from a Linux Programming book.

Ouch. So much bad C in books out there.
Title: Re: Program output different on two machines
Post by: ataradov on August 06, 2023, 02:34:27 am
If we are criticizing the code, then writing types like "int" directly into the file is not the best idea either.
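
One common alternative, sketched here under the assumption that the file format should not depend on the host's `int` size or byte order, is to pick a fixed width and byte order and serialize byte by byte. The function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as 4 bytes, least significant byte first, regardless of the
   host's byte order or the size of int. */
void put_u32_le(uint8_t out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Inverse of put_u32_le(). */
uint32_t get_u32_le(const uint8_t in[4])
{
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The four bytes would then be written with fwrite() instead of the raw `int` member, at the cost of no longer being able to mmap() the file directly as an array of structs.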

This code is good enough for a quick demo of mmap() stuff. I personally would much rather see simple code that has limitations than the most secure code that is huge and requires removal of a lot of crap before it is clear what it is actually doing. But I can see how there may be different expectations from different people.

But I also don't copy-paste random code into my stuff without understanding what it is doing.
Title: Re: Program output different on two machines
Post by: peter-h on August 07, 2023, 08:32:17 am
Quote
You have `= { 0 }` initializers for that purpose

I thought this is not guaranteed, except for structures.

Title: Re: Program output different on two machines
Post by: ataradov on August 07, 2023, 03:24:50 pm
It is guaranteed for all aggregate types (arrays and structures):

Quote
Array and structure types are collectively called aggregate types.
and
Quote
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
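
As a small illustration of the quoted rule (function and struct names are hypothetical), a brace-enclosed list with fewer initializers than elements leaves the remainder zeroed, for arrays and structs alike, even for automatic variables:

```c
#include <assert.h>
#include <stddef.h>

struct point3 { int x, y, z; };

/* Sum of an automatic array that has fewer initializers than elements. */
int sum_partial_array(void)
{
    int a[5] = { 1, 2 };   /* a[2], a[3], a[4] are implicitly 0 */
    int sum = 0;

    assert(a[2] == 0 && a[3] == 0 && a[4] == 0);
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++)
        sum += a[i];
    return sum;
}

/* A struct with only its first member given an initializer. */
struct point3 partial_point(void)
{
    struct point3 p = { 7 };   /* p.y and p.z are implicitly 0 */
    return p;
}
```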

Title: Re: Program output different on two machines
Post by: peter-h on August 07, 2023, 04:01:15 pm
I am trying to work this out.

From my project notes:

With ARM GCC, statics are categorised thus:
int fred; goes into BSS (used to go into COMMON in GCC versions before v10)
int fred=0; goes into BSS (which by definition is zeroed by startup code)
int fred=1; goes into DATA (statics initialised to nonzero, copied to RAM at start)

Now for arrays:

uint8_t fred[1000] = {0};   // I don't think this does anything other than fred[0]=0;

Only a struct = {0} gets filled completely.

If I am wrong then (assuming the following are statics)

uint8_t fred[1000] = {0} places fred into BSS
uint8_t fred[1000] = {1} places fred into DATA and sets up a block of 1000 bytes, all 0x01, in FLASH, to be copied to DATA at startup.

and similarly (assuming the following are on some function stack)

uint8_t fred[1000] = {0} calls memset() to fill with zeroes, every time the function is called
uint8_t fred[1000] = {1} does god knows what...

I have seen various weird things so I use {x,y,z} only to fill an array with 3 elements, and use memset() etc for everything else.

Title: Re: Program output different on two machines
Post by: ataradov on August 07, 2023, 04:14:53 pm
You need to be more specific. Are you talking about global or local variables? For global variables initialization is guaranteed with no extra effort. Not that it will help in this case, since the variable is set multiple times.

Quote
uint8_t fred[1000] = {0};   // I don't think this does anything other than fred[0]=0;
There is no need to think, I quoted the standard that specifically says that the rest of the members would be initialized to 0 (as anything with static storage duration).

Quote
uint8_t fred[1000] = {1} places fred into DATA and sets up a block of 1000 bytes, all 0x01, in FLASH, to be copied to DATA at startup.
Not all 0x01, just the first one.

Quote
uint8_t fred[1000] = {1} does god knows what...
The standard knows. Read it, it is good.
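
For the `{1}` case, a quick check along these lines (the function name is made up) shows what the rule implies. Only the first element gets the value 1, and the other 999 are zero:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts how many elements of a freshly initialized automatic array
   are nonzero. */
size_t count_nonzero_after_init(void)
{
    uint8_t fred[1000] = { 1 };   /* fred[0] == 1, fred[1..999] == 0 */
    size_t nonzero = 0;

    assert(fred[0] == 1);
    for (size_t i = 0; i < sizeof fred; i++)
        if (fred[i] != 0)
            nonzero++;
    return nonzero;
}
```

If the intent really is 1000 bytes of 0x01, memset(fred, 1, sizeof fred) after the declaration is the portable way to get it.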
Title: Re: Program output different on two machines
Post by: ejeffrey on August 07, 2023, 04:34:40 pm
Quote
I am trying to work this out.

From my project notes:

With ARM GCC, statics are categorised thus:
int fred; goes into BSS (used to go into COMMON in GCC versions before v10)
int fred=0; goes into BSS (which by definition is zeroed by startup code)
int fred=1; goes into DATA (statics initialised to nonzero, copied to RAM at start)

It's important to remember that the C standard does not specify anything about .data, .bss, and COMMON.  These are how platforms satisfy the requirements of the standard.  It's definitely good to understand as part of how your code goes from your intent to the machine execution, and also to help debug when your toolchain is doing something wrong or you are responsible for part of the implementation.  But observing what section the compiler puts symbols in is not authoritative and can certainly be misleading.  For instance the standard is totally fine with the compiler putting a zero initialized value in .data instead of .bss, or it could put an object in bss and emit startup code to initialize it. These would both be unusual behavior and might violate platform specific guarantees, but the standard doesn't care.

The point here is not that you should worry about whether your compiler is doing something tricky, it's that you should understand the required behavior of the C standard.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 07, 2023, 10:06:28 pm
Quote
It is guaranteed for all aggregate types (arrays and structures):

Quote
Array and structure types are collectively called aggregate types.
and
Quote
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.

Yes it is guaranteed, absolutely, unambiguously.

C23 introduces the empty initializer '{}', which initializes with zeroes without having to care about the type of the first item/field.

Title: Re: Program output different on two machines
Post by: TheCalligrapher on August 08, 2023, 03:28:38 am
Quote
C23 introduces the empty initializer '{}', which initializes with zeroes without having to care about the type of the first item/field.

Um, this is misleading. In C constant zero can initialize anything. For which reason in C you never had "to care about the type of the first item/field". Since the beginning of times `= { 0 }` has been used in C as a "universal zero initializer" idiom. It can zero-initialize absolutely anything. It can be used with scalar objects too (not only with aggregates)

Code: [Select]
int a = { 0 }; /* This has always been allowed in standard C */

The reasons C23 introduced `= {}` are different: 1) improve cross-compilability with C++ code (since in C++ you do have to care about the type of the first item/field), and 2) this is allowed as a VLA initializer.
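
A short sketch of the "universal zero initializer" idiom, using made-up types. The same `= { 0 }` spelling works for a scalar, a pointer, and a nested aggregate, without caring what the first member is:

```c
#include <assert.h>
#include <stddef.h>

struct inner { int a; double b; };
struct outer { struct inner in[2]; const char *name; };

/* Returns 1 if every object below came out zero / null. */
int all_zero_initialized(void)
{
    int          i = { 0 };
    double       d = { 0 };
    const char  *p = { 0 };   /* 0 is a null pointer constant here */
    struct outer o = { 0 };   /* first member is itself an array of structs */

    assert(p == NULL);
    return i == 0 && d == 0.0 && p == NULL
        && o.in[0].a == 0 && o.in[0].b == 0.0
        && o.in[1].a == 0 && o.in[1].b == 0.0
        && o.name == NULL;
}
```

Some compilers may warn about missing braces on the nested case depending on version and flags, but the initialization itself is well defined.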
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 09:34:31 am
Quote
For instance the standard is totally fine with the compiler putting a zero initialized value in .data instead of .bss, or it could put an object in bss and emit startup code to initialize it. These would both be unusual behavior and might violate platform specific guarantees, but the standard doesn't care.

Then one cannot implement anything because you need to know how the sections work in order to build a working product :) So the standard is useless; you have to write some code with your chosen compiler and see where the variables ended up in the .map file.

Quote
C23 introduces the empty initializer '{}', which initializes with zeroes without having to care about the type of the first item/field.

In GCC v10:

uint8_t fred[1000];   // gets placed into BSS and thus filled with zeroes (I would have run memset() on that one, just in case)
uint8_t fred[1000] = {0};    // according to above, gets filled with zeroes (not sure if GCC v8 did that though)

For local (stack based) variables, I posted that above. The compiler generates explicit code which executes when the function is called. That bit is fairly obvious, but AIUI

void function_joe (void)
{
   uint8_t fred[1000];  // does not get initialised
   uint8_t fred[1000] = {0};  // gets zeroed with special code executed when the function is called (I would have run memset() on that one)
   static uint8_t fred[1000];  // is placed into BSS (under a private name) and zeroed via BSS getting zeroed   (I would have run memset() on that one)
   static uint8_t fred[1000] = {0};  // same as above (I would have run memset() on that one)
}

Which bits are wrong?

Quote
Yes it is guaranteed, absolutely, unambiguously.

Quote
Um, this is misleading

Interesting how stupid people like me can start a discussion ;)
Title: Re: Program output different on two machines
Post by: ejeffrey on August 08, 2023, 01:24:28 pm

Quote
Then one cannot implement anything because you need to know how the sections work in order to build a working product :)

That's really not true.  If you are writing the startup code you need to make sure that bss is zeroed, that data is loaded to the appropriate memory address and so on.  But the rules for variable initialization are spelled out in the standard and that is what you should use as your primary reference.  Trying to figure this stuff out by looking at map files is wrong, incomplete (it doesn't include runtime initialization code) and more work than just using the standard.  You are making your own life harder and making more mistakes because you don't just read the definition of how variables are initialized.

Quote
So the standard is useless; you have to write some code with your chosen compiler and see where the variables ended up in the .map file.

You often post self deprecating things here like "I may be stupid... You are all smarter.."  I don't think you are stupid and I commend your interest in tracking down how things happen but your unwillingness to actually read and understand the standard puts you at a significant disadvantage.  Looking at compiler behavior is great but it is no substitute for learning the rules of why a compiler does a particular thing.
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 01:30:36 pm
Quote
unwillingness to actually read and understand the standard puts you at a significant disadvantage

I can't understand obscure writing like that - sorry. It was written by experts for experts, like "It can be used with scalar objects too (not only with aggregates)" ... what does that actually mean?

I am not a CS professor; I just do hardware, software, and run a small business selling little boxes, and been doing that for 45 years. I also avoid pointers ;)

That is why I posted those concrete examples with fred[1000]. I was hoping the experts might comment on them.
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 01:45:10 pm
Haha you said you put me in your killfile, and I felt truly honoured, but it seems to have slipped out ;)

This isn't Usenet, old chum...
Title: Re: Program output different on two machines
Post by: ataradov on August 08, 2023, 03:11:30 pm
Quote
So the standard is useless; you have to write some code with your chosen compiler and see where the variables ended up in the .map file.
No, quite the opposite, if you have a standard compiler, then you have guaranteed behavior on any environment.

But keep in mind that initialization code is also part of the "compiler". The standard only deals with the stuff that happens after main() is called. The environment setup you see in the reset handler is there to prepare the environment so that main() behaves as expected. You never have to think about where things went after main(). Embedded environment just puts some of the initializations on you, so you have to know a few details about your compiler. On a full OS you never have to see a linker script or startup code, your part starts from main().

Quote
Which bits are wrong?
None of this is wrong.
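
The cases asked about can be checked with something like the following sketch. The names are hypothetical, and the uninitialized automatic array is deliberately left commented out because reading it would be reading indeterminate values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t file_scope[1000];   /* static storage: zero before main() runs */

static int is_all_zero(const uint8_t *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Returns 1 if every array that the standard requires to be zero is zero. */
int check_guaranteed_zeroing(void)
{
    static uint8_t static_local[1000];   /* static storage: zeroed as well */
    uint8_t auto_braced[1000] = { 0 };   /* automatic: zeroed on every call */
    /* uint8_t auto_plain[1000];            indeterminate, must be written before read */

    return is_all_zero(file_scope, sizeof file_scope)
        && is_all_zero(static_local, sizeof static_local)
        && is_all_zero(auto_braced, sizeof auto_braced);
}
```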
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 03:21:51 pm
This happens to be something I am doing right now, and revisiting some old code

(https://peter-ftp.co.uk/screenshots/20230808005791916.jpg)

The answer seems to be Yes. Curiously the two arrays initialised with {0} were done by someone who knew about that behaviour. I have avoided that in my code, since one experienced dev I used to work with told me this cannot be relied on.

Quote
The environment setup you see in the reset handler is there to prepare the environment so that main() behaves as expected. You never have to think about where things went after main(). Embedded environment just puts some of the initializations on you, so you have to know a few details about your compiler. On a full OS you never have to see a linker script or startup code, your part starts from main().

Right, but in embedded stuff, in order to meet the expectations of the compiler standards, the init code needs to be done right.

And this varies. See e.g.

Quote
int fred; goes into BSS (used to go into COMMON in GCC versions before v10)

above. It is an example where a new compiler version can break something, despite both being "standards compliant". In this case, the change from COMMON to BSS needs to be correctly picked up by the linker script.
Title: Re: Program output different on two machines
Post by: ataradov on August 08, 2023, 03:33:21 pm
Quote
The answer seems to be Yes.
Yes, you can just use = {0}.

Quote
Right, but in embedded stuff, in order to meet the expectations of the compiler standards, the init code needs to be done right.
Yes, but it is compiler creator's job to write a linker script for their compiler. You may choose to mess with it for your own needs, but then you are taking a job of a compiler writer and obviously need to have understanding of what compiler is doing. This is not a fault of the standard.

Linker scripts are in no way specified in the standard, so compiler may change their format or details at any point, it is on you to track those changes. Or just take the linker script from the compiler every time you update.
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 03:40:24 pm
Quote
it is compiler creator's job to write a linker script for their compiler.

That's complete news to me. Where would this be found, for example, for GCC v10 as supplied with Cube IDE v1.13.1?

There is a linkfile which comes with Cube but it is some hack which makes no reference to GCC.

Quote
Linker scripts are in no way specified in the standard, so compiler may change their format or details at any point, it is on you to track those changes.

That seems to be a contradiction of the above :)

BTW I also see a trap: if you write

uint8_t fred[1000] = {0};

for a local (stack) array, that is quite likely to call memset(), and if it doesn't actually do that (and puts in a loop) then at some level of optimisation the compiler will replace it with memset() anyway, and that will happen even if you have not #included stdlib.h, which will bite you in the bum if your project is not loading stdlib to save space, etc. For such cases I write a custom memset but one cannot get the compiler to use that instead.

I had a whole lot of fun with that in a 32k boot loader. I tried the compiler option but it didn't work reliably, and it was solved by an -O0 attribute for each function which might contain a candidate loop.
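
For readers hitting the same problem, one workaround that is sometimes used is a zeroing loop through a volatile pointer. This is a sketch, not what was done in the project described above, and the function name is made up. Volatile stores have to be performed as written, so the compiler cannot fold the loop into a memset() call:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero n bytes without relying on a library memset(). Each store goes
   through a volatile lvalue, so the loop is not turned into a memset call. */
void zero_bytes(volatile uint8_t *dst, size_t n)
{
    /* Host-side sanity check only; compiled out with -DNDEBUG, and to be
       dropped entirely in a build without a C library. */
    assert(dst != NULL || n == 0);

    while (n--)
        *dst++ = 0;
}
```

Usage would be to declare the array without an initializer and then call zero_bytes(fred, sizeof fred). The trade-off is that byte-wise volatile stores are usually slower than an optimized memset(). GCC also has the -fno-tree-loop-distribute-patterns option, which is aimed at the loop-to-memset transformation, though the GCC manual still states that even a freestanding environment must provide memcpy, memmove, memset and memcmp.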
Title: Re: Program output different on two machines
Post by: ataradov on August 08, 2023, 03:51:55 pm
Quote
That's complete news to me. Where would this be found, for example, for GCC v10 as supplied with Cube IDE v1.13.1?
The stock example file that comes with GCC itself is located here arm-gnu-toolchain\share\gcc-arm-none-eabi\samples\ldscripts  (in the stock GCC, I have no idea how ST packages it).

To be clear, I'm not saying that it is convenient or even a good idea to use those standard linker scripts for embedded development. But if you choose not to use them, you are taking on responsibility for the correct environment setup.

Quote
There is a linkfile which comes with Cube but it is some hack which makes no reference to GCC.
This is because "embedded" varies a lot, so vendors opt for making their own files. And you can make your own if you want. But you need to realize that this file would be specific to the compiler. IAR linker scripts and general memory model look nothing like this.


Quote
That seems to be a contradiction of the above :)
How so? Linker scripts are implementation detail of a specific compiler. They don't even need to exist. It is possible to make a compiler that does not use them at all.

Quote
For such cases I write a custom memset but one cannot get the compiler to use that instead.
Well, this is your problem. memset() is part of a C library that standard defines. Any low level removal of the libraries you do for optimization are up to you, but you are no longer in the standard territory.


Title: Re: Program output different on two machines
Post by: ejeffrey on August 08, 2023, 05:01:10 pm
Quote
Right, but in embedded stuff, in order to meet the expectations of the compiler standards, the init code needs to be done right.
Yes, but it is compiler creator's job to write a linker script for their compiler. You may choose to mess with it for your own needs, but then you are taking a job of a compiler writer and obviously need to have understanding of what compiler is doing. This is not a fault of the standard.

Also: if you are taking on part of the job of the compiler (such as modifying the startup code or linker script) you absolutely, 100% need to know what the standard says.  It's not an excuse to ignore the standard, it's an absolute mandate that you understand it.  If your startup code and linker scripts don't satisfy those requirements, you are going to break the standard library and every third party library which I promise you are written by people who are expecting the compiler (including your changes) to follow the standard.

Can you imagine if the GCC or clang maintainers just said "we don't read the standard because it's too complicated.  We just try to make sure that we like the way it operates?"  If you are messing with startup code that's basically what you are saying.

And yes, as an embedded developer you * also* need to know how your particular compiler works.  You need to know that variables that need zero initialization will be placed in .bss and that you need to make sure you zero it before you call main.  You need to know how the compiler is expecting you to copy data into memory.  That is what you have to do in order to have a compliant system.  Once you do that and main() is called, you should be referring to the standard for things like initializer syntax.
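
Roughly, the two startup duties mentioned here look like the sketch below. In a real reset handler the addresses would come from symbols defined in the linker script, whose names vary by toolchain and vendor. Here they are plain parameters, and the function name is hypothetical, so the logic can be run on a host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-runnable sketch of what startup code does before calling main():
   1. copy the initial values of .data from their load address to RAM,
   2. zero .bss.
   All pointers and counts stand in for linker-provided symbols. */
void init_c_runtime(uint32_t *data_dst, const uint32_t *data_src, size_t data_words,
                    uint32_t *bss, size_t bss_words)
{
    assert(data_words == 0 || (data_dst != NULL && data_src != NULL));
    assert(bss_words == 0 || bss != NULL);

    for (size_t i = 0; i < data_words; i++)
        data_dst[i] = data_src[i];

    for (size_t i = 0; i < bss_words; i++)
        bss[i] = 0;
}
```

Once both loops have run, statics with nonzero initializers hold their values and everything else with static storage duration reads as zero, which is the state the standard promises to main().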
Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 08:38:46 pm
I've been doing this 40+ years and I bet this has bitten a lot of people in the bum who mostly didn't post about it.

On the one hand people talk about standards, you must read the standards, then they tell you that the standards do not apply to basically most embedded applications.
Title: Re: Program output different on two machines
Post by: ataradov on August 08, 2023, 08:57:47 pm
The standard applies here fully.  It specifies the execution environment and what a program loader must do to prepare that environment.  C is not specified for real hardware, it is specified for an abstract machine. It is up to the compiler/OS authors to make sure that a real machine is adapted to appear as that abstract machine.

It is not a standard's fault you have not read it and don't understand it.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 08, 2023, 09:13:48 pm
Quote
In GCC v10:

uint8_t fred[1000];   // gets placed into BSS and thus filled with zeroes (I would have run memset() on that one, just in case)
uint8_t fred[1000] = {0};    // according to above, gets filled with zeroes (not sure if GCC v8 did that though)

For local (stack based) variables, I posted that above. The compiler generates explicit code which executes when the function is called. That bit is fairly obvious, but AIUI

void function_joe (void)
{
   uint8_t fred[1000];  // does not get initialised
   uint8_t fred[1000] = {0};  // gets zeroed with special code executed when the function is called (I would have run memset() on that one)
   static uint8_t fred[1000];  // is placed into BSS (under a private name) and zeroed via BSS getting zeroed   (I would have run memset() on that one)
   static uint8_t fred[1000] = {0};  // same as above (I would have run memset() on that one)
}

Which bits are wrong?

None, this is all correct and has been the normal behavior beginning with C89. (I don't know about pre-standard C, but you are unlikely to ever run into it unless you're suddenly getting into dinosaur computing.)
Explicitly initializing a statically allocated object with '{0}' is not required as it's guaranteed, but if you prefer doing it for a matter of code style, it won't hurt (much).

It should yield the exact same compiled code in theory, although I have already seen compilers which would place an explicitly initialized static variable, even with just a 0 initializer, in the 'data' section rather than 'bss'. The behavior will be the same though, but it may make the startup phase slightly longer (probably insignificantly so though).

As to whether a given compiler for a given target, a given code context and given optimization options will call memset() or inline code for zeroing out an object in memory, it's impossible to predict unless you know your tools extremely well, and it should usually not matter. But if you want complete control over the initialization of local variables for some reason, declare them with no initializer and initialize them by hand in the way you want, not that it should be useful in most cases though, but just in case.

Title: Re: Program output different on two machines
Post by: peter-h on August 08, 2023, 09:20:41 pm
Quote
I have already seen compilers which would place an explicitly initialized static variable, even with just a 0 initializer, in the 'data' section rather than 'bss'.

That is dumb, surely, because what is the point of the concept of BSS? The DATA section is placed in FLASH and then copied over to RAM, so it is wasting a load of FLASH.

Quote
unless you're suddenly getting into dinosaur computing

Or have to work on an old product. One of my best selling boxes was done in 1995 :) In asm and Hitech H8/300 C. Still sells very well, no bugs found. It would be crazy to port it to a new environment. My current project is actually similar but with more features and ETH, USB, etc.

Quote
declare them with no initializer

I am fairly sure that old compilers did not initialise them at all. I recall reading about that as a common criticism of C.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 08, 2023, 09:32:35 pm
Quote
I have already seen compilers which would place an explicitly initialized static variable, even with just a 0 initializer, in the 'data' section rather than 'bss'.

That is dumb, surely, because what is the point of the concept of BSS? The DATA section is placed in FLASH and then copied over to RAM, so it is wasting a load of FLASH.

That can be considered suboptimal in some use cases (including when the target is a MCU), but that is not completely dumb.
https://en.wikipedia.org/wiki/.bss

Memory segments are not part of the standard AFAIK. Compilers are free to use them as they see fit, or even make no use of segments if there is no separate linker with the concept of segments.

Quote
unless you're suddenly getting into dinosaur computing

Or have to work on an old product. One of my best selling boxes was done in 1995 :) In asm and Hitech H8/300 C. Still sells very well, no bugs found. It would be crazy to port it to a new environment. My current project is actually similar but with more features and ETH, USB, etc.

Well, the question is, is the C compiler compliant with at least C89 or not. If so, then the above should be guaranteed. I don't know if this Hitech compiler was.
Title: Re: Program output different on two machines
Post by: ejeffrey on August 09, 2023, 02:09:20 am

Quote
I am fairly sure that old compilers did not initialise them at all. I recall reading about that as a common criticism of C.

"The C programming language", 1st edition, from 1978 says on page 82:

"In the absence of explicit initialization, external and static variables are guaranteed to be initialized to zero; automatic and register variables have undefined (i.e., garbage) values."

That is the closest thing there was to a standard for years straight from the original designers.  For reference, this book includes descriptions of C operating on actual machines using 9 bit bytes and non-ASCII character sets.  Maybe in the early 70s you could find a prototype compiler that didn't do that, but it has been mandatory for as long as C has been widely used and I can almost guarantee that you have never used a compiler old enough that it legitimately did not do this.  That doesn't mean you haven't used a non-compliant compiler, but I seriously doubt it.  I think it's just a rumor that you think you heard once and never checked to see if it was true or not.
Title: Re: Program output different on two machines
Post by: TheCalligrapher on August 09, 2023, 06:54:47 am

Quote
I am fairly sure that old compilers did not initialise them at all. I recall reading about that as a common criticism of C.

"The C programming language", 1st edition, from 1978 says on page 82:

"In the absence of explicit initialization, external and static variables are guaranteed to be initialized to zero; automatic and register variables have undefined (i.e., garbage) values."

"The C programming language", 1st edition is beside the point in this case, since in that version of the language it was explicitly prohibited to supply initializers when declaring automatic aggregates (same book, page 198). In K&R 1st edition automatic aggregates always began their lives fully uninitialized. No way around it. So, the matter of "full vs. partial" initialization of automatic aggregates simply did not exist at that time. (Meanwhile, for static aggregates zeroing-out beyond partial initializer was guaranteed by the 1st edition.)

Support for initializers for automatic aggregates appeared in the 2nd edition, and at that time C has already adopted "all-or-nothing" approach to initialization: if you explicitly initialize only a part of an aggregate, the rest is always guaranteed to be zeroed-out.

Note, that this behavior is also critical to proper functionality of fixed-width strings: `char fw[64] = "abc"` has to zero-out the whole array for `fw` to become a proper fixed-width string.
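
The contrast with the original problem can be sketched like this. The function names are made up, and the 0xAA fill is only a stand-in for whatever garbage happens to be on the stack. The initializer defines all 64 bytes, while a formatted write into an uninitialized buffer only defines the characters plus the terminator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Nonzero bytes in a 64-byte field initialized from a short literal. */
size_t fixed_width_nonzero(void)
{
    char fw[64] = "abc";   /* fw[3..63] are guaranteed to be '\0' */
    size_t nonzero = 0;

    assert(fw[63] == '\0');
    for (size_t i = 0; i < sizeof fw; i++)
        if (fw[i] != '\0')
            nonzero++;
    return nonzero;
}

/* Bytes actually overwritten when the same text is formatted into a
   buffer pre-filled with a marker value. */
size_t bytes_touched_by_snprintf(void)
{
    unsigned char buf[64];
    size_t touched = 0;

    memset(buf, 0xAA, sizeof buf);             /* stand-in for stack garbage */
    snprintf((char *)buf, sizeof buf, "abc");  /* writes 'a','b','c','\0' only */
    for (size_t i = 0; i < sizeof buf; i++)
        if (buf[i] != 0xAA)
            touched++;
    return touched;
}
```

The remaining 60 bytes in the second case keep whatever was there before, which is exactly the part of each record that differed between the two machines.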
Title: Re: Program output different on two machines
Post by: ejeffrey on August 09, 2023, 04:39:34 pm
Quote
"The C programming language", 1st edition is beside the point in this case, since in that version of the language it was explicitly prohibited to supply initializers when declaring automatic aggregates (same book, page 198). In K&R 1st edition automatic aggregates always began their lives fully uninitialized. No way around it. So, the matter of "full vs. partial" initialization of automatic aggregates simply did not exist at that time. (Meanwhile, for static aggregates zeroing-out beyond partial initializer was guaranteed by the 1st edition.)

As near as I can tell, peter-h is confused about or unwilling to believe that the C language guarantees initialization of static and global variables in general.  While initialization syntax has added some new options, C has guaranteed initialization of global and static variables, aggregate or not, regardless of whether a value is specified in the source and has for as long as it has existed.  He seems to believe this is an implementation choice or a recent addition to the language, and that an arbitrarily old or obscure compiler shouldn't be expected to do so.  If my understanding of his complaints is correct and his qualms only relate to partial specification of initialization of automatic variables, then indeed the 1st edition book is not relevant.
Title: Re: Program output different on two machines
Post by: MikeK on August 09, 2023, 05:59:11 pm
This has me thinking about my original question...

I understand now that I wasn't initializing the record.string variable.  But the desktop and laptop should have shown similar behavior, shouldn't they?  I wouldn't expect them to have the same random characters, but the desktop was obviously zeroing-out the variable and the laptop wasn't.  (See the image of the hex dump in my original post #1).
Title: Re: Program output different on two machines
Post by: ataradov on August 09, 2023, 07:02:12 pm
It might depend on how the memory pages got mapped, which would depend on what other software is running and how long the OS was running.
Title: Re: Program output different on two machines
Post by: ejeffrey on August 09, 2023, 07:17:08 pm
The garbage is probably whatever is left on the stack by the libc startup code.  This would depend less on the version of gcc than on glibc and the dynamic loader machinery.  It can also depend on the contents of the environment variables.  One thing to try if you are curious about this is to create a subshell with a minimal environment and then try again.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 09, 2023, 09:58:31 pm
Zeroes are valid random values. Seriously. Yes, as said above, it depends on what was in memory at that location.

Note that it usually also depends on the compiler itself and optimization options.
Most C compilers zero out uninitialized local variables when compiling with zero optimizations/debug mode.
That's why many developers have been bitten by this - I'd be curious to find even one person that has never at least encountered this when they were learning (and many long after that!).
Classic symptom: your code appears to run fine in "debug mode" and starts behaving erratically when you switch to a "release mode".
Title: Re: Program output different on two machines
Post by: ataradov on August 09, 2023, 10:16:18 pm
I'm not sure I've seen a single compiler that would initialize locals in any mode.

Neither GCC nor Clang does that, for sure.

But both GCC and Clang warn about using uninitialized variables even at the lowest warning level.
Title: Re: Program output different on two machines
Post by: MikeK on August 09, 2023, 10:30:44 pm
I ran the program, without memset, on the desktop and in a subshell and it still filled the extra record space with zeros.  I'm compiling and executing without any options:

Code: [Select]
$ gcc -o mmap_eg mmap_eg.c
$ ./mmap_eg

Still seems odd to me that the desktop does this, but the laptop doesn't.
Title: Re: Program output different on two machines
Post by: ejeffrey on August 09, 2023, 11:00:56 pm
The compiler won't generate code to initialize locals if you don't request it, but the Linux kernel always provides zero-cleared memory to the program.  That means the first time you use a given stack location, it will be zero.  So yes, zero is a commonly observed value for uninitialized data.  If glibc startup doesn't need to call any complex functions that write data on the stack, then your code is most likely to see zeros.  But if any of the startup code does use the stack, it may leave other data on the stack.

I don't remember any compilers deliberately zeroing uninitialized variables even in debug mode, although that doesn't mean such compilers don't exist.  But there are a number of other mechanisms that can produce similar results.  In debug mode, generally each variable is assigned dedicated space on the stack to make it easier for the debugger to display local variables.  If the stack starts off zeroed, then every variable will start zeroed.  However, in an optimized build, variables whose lifetimes don't overlap can share the same memory.  A write to one variable can then be read back via a second if that variable was not initialized.

Quote
Still seems odd to me that the desktop does this, but the laptop doesn't.

If you want to track this down, you can use gdb to trace the startup code and see what is writing to the stack, but it's going to be something boring.
Title: Re: Program output different on two machines
Post by: MikeK on August 09, 2023, 11:31:59 pm
Quote
Classic symptom: your code appears to run fine in "debug mode" and starts behaving erratically when you switch to a "release mode".

But I'm not running in "debug mode".  I'm building on both machines the same way, as I mention in one of my later posts.  And executing the same way...no switching at all.  And "behaving erratically" is quite an exaggeration.
Title: Re: Program output different on two machines
Post by: SiliconWizard on August 09, 2023, 11:47:13 pm
Quote
I'm not sure I've seen a single compiler that would initialize locals in any mode.

Neither GCC nor Clang does that, for sure.

That was something common with MS compilers. But I've seen it with other compilers and platforms certainly.

Quote
But both GCC and Clang warn about using uninitialized variables even at the lowest warning level.

Sure, that wasn't the point. The point was about what *can* happen when variables with auto storage are not initialized. And it's of course just a matter of implementation as it's UB.
So you may find as many actual behaviors as there are compilers and platforms. It doesn't really matter, what matters is to know that it's UB and so shouldn't be relied upon.

Now, whether it is the compiler that does this, or the CRT, or even the OS itself, is likewise just a matter of implementation. It doesn't really matter to the developer, except out of interest.

What happened (happens?) with MS compilers, from what I know, is not so much that the compilers directly initialized uninitialized variables in "debug mode", but that a different CRT was linked: there was always one debug and one release CRT with MS tools. The debug CRT would initialize stack memory (and also heap memory, I think) with some predefined values, which have changed over the years.

At the OS level, it can also be that allocated memory (stack or heap, stack in this case) is filled with some predefined values before being handed to the executable.

The kind of predefined values that are used in those cases varies. MS likes to fill stacks with non-zero values (like 0xCC) to help catch bugs (that's the developer's POV) and some OSs like to zero them out as it's usually considered a "safer" default.

Again in any case, all this can be interesting or good to know, but should never be relied upon. Initialize your local variables, or more correctly, your variables with auto storage.
Title: Re: Program output different on two machines
Post by: ataradov on August 09, 2023, 11:51:37 pm
I tried on Ubuntu 22.04, and I get random garbage in the empty space.

I checked how much of the stack was used by the time main() was called, so I made a 1 MB array and checked elements from the top for 0. Most of the array was 0, but the last ~6 KB had non-zero values. So, yes, it looks like on load the stack was zeroed out, but by the time main() got called 6 KB of the stack had been used, and was now being reused by main().
Title: Re: Program output different on two machines
Post by: MikeK on August 10, 2023, 12:16:48 am
Alright, I'll let this lie.  Thanks for all the input, guys.
Title: Re: Program output different on two machines
Post by: peter-h on August 10, 2023, 04:57:19 pm
Good thread. I've learnt a lot :)