Author Topic: Why compiler doesn't allocate required memory for struct  (Read 1228 times)

Offline Kittu20 (Topic starter)

  • Regular Contributor
  • *
  • Posts: 96
  • Country: in
Why compiler doesn't allocate required memory for struct
« on: January 23, 2024, 03:22:42 pm »
I'm trying to understand why compilers allocate additional memory (padding) when a structure is used in a program. I'm puzzled as to why the compiler doesn't allocate exactly the memory needed for each structure object.

In my observation, with two structure members, one of int type and the other of char type, the compiler allocates a total of 8 bytes of memory (4 bytes for the int object and 4 bytes for the char object). Why are an extra 3 bytes allocated for the char object? Why doesn't the compiler allocate just 1 byte?

Could someone kindly explain the reason for structure padding in the C language?
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11653
  • Country: my
  • reassessing directives...
Re: Why compiler doesn't allocate required memory for struct
« Reply #1 on: January 23, 2024, 04:14:33 pm »
it goes back to computer architecture... https://en.wikipedia.org/wiki/32-bit_computing
if you are compiling 64-bit app or on 64-bit machine, as commonly used PC today... you'll see worse... https://en.wikipedia.org/wiki/64-bit_computing
more about data alignment in memory.. https://en.wikipedia.org/wiki/Data_structure_alignment

data padding and packing...
https://www.skillvertex.com/blog/structure-member-alignment-padding-and-data-packing/
https://www.geeksforgeeks.org/structure-member-alignment-padding-and-data-packing/
fwiw...
Nature: Evolution and the Illusion of Randomness (Stephen L. Talbott): Its now indisputable that... organisms “expertise” contextualizes its genome, and its nonsense to say that these powers are under the control of the genome being contextualized - Barbara McClintock
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Why compiler doesn't allocate required memory for struct
« Reply #2 on: January 23, 2024, 04:23:37 pm »
I assume you are working with C. With most compilers you have a choice. The default is not to pack structures tightly, but to keep members aligned in a way that maximises performance. If you really want them packed, most compilers have proprietary language extensions (e.g. "#pragma pack", or an "attribute" declaration) to control packing. This can be good if you have a large array of structures and their not being packed starts to waste substantial memory. If you are expecting your packed structure to look like, say, a record in a communications structure, you won't achieve your goal simply by packing the structure. You need to allow for the processor's byte ordering, too. A gotcha with some compilers is they will freely reorder the data in a structure, to minimise the wastage caused by alignment. I can't remember which compilers have hit me with that one, but they are out there.
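
To make the packing option concrete, here is a minimal sketch. It assumes a compiler that accepts `#pragma pack(push, 1)` and `#pragma pack(pop)` (GCC, Clang and MSVC do); this is an extension, not ISO C. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after `b`. */
struct record_default {
    int32_t a;
    char    b;
};

/* Packed layout via a compiler extension (not part of ISO C). */
#pragma pack(push, 1)
struct record_packed {
    int32_t a;
    char    b;
};
#pragma pack(pop)

/* With 1-byte packing there is no padding: 4 + 1 bytes. */
static_assert(sizeof(struct record_packed) == 5,
              "packed struct should be 5 bytes");

/* Bytes saved per element by packing (commonly 3). */
static size_t bytes_saved_per_element(void)
{
    return sizeof(struct record_default) - sizeof(struct record_packed);
}
```

Call `bytes_saved_per_element()` from your own `main` to see the difference on your platform. Note that this only removes padding; it does nothing about byte ordering.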
 

Offline Kittu20 (Topic starter)

  • Regular Contributor
  • *
  • Posts: 96
  • Country: in
Re: Why compiler doesn't allocate required memory for struct
« Reply #3 on: January 23, 2024, 04:38:24 pm »
Quote from: coppice
. aligned
Can you please help me understand the meaning of "aligned" in this context?

Basically, I am trying to understand why the compiler thinks structure padding is required.
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 11895
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #4 on: January 23, 2024, 04:44:35 pm »
The answers were provided in the first reply in the thread. If you are not prepared to read the information given, why are you asking the question?
 

Online IanB

  • Super Contributor
  • ***
  • Posts: 11895
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #5 on: January 23, 2024, 05:02:38 pm »
To help you make sense of the information in the links, you should know that the basic unit of memory in most computers is the "word". Depending on the computer, the word size may be 16 bits, 32 bits or 64 bits. (Other word sizes like 36 bits have existed in the past but are uncommon now.)

Memory access is typically most efficient when data is aligned on word boundaries. Therefore compilers will try to organize data in memory to conform to this.
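
As a rough illustration of what "aligned" means here: an address is aligned to N bytes when it is a multiple of N. A minimal sketch, assuming N is a power of two; `is_aligned` is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if `addr` is a multiple of `alignment` (which must be a power of two).
   Equivalently: the lowest log2(alignment) bits of the address are zero. */
static bool is_aligned(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

For example, address 0x1000 is 4-byte aligned, while 0x1001 is not.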
 
The following users thanked this post: Kittu20

Offline audiotubes

  • Regular Contributor
  • *
  • Posts: 176
  • Country: cz
Re: Why compiler doesn't allocate required memory for struct
« Reply #6 on: January 23, 2024, 05:38:56 pm »
For character and byte-oriented data, alignment (usually) does not matter.

Create a few structure examples, some without integers or floats, and others with.

You will be able to see the difference, since integers and floats need to be aligned (in general) and byte data (bit strings, some number of bytes or characters) does not.

When fixed-length types like integers and floats are interspersed with byte data in the structure, they would not otherwise be aligned correctly in storage. So compilers insert slack bytes so that these types are aligned when mapped in storage. Assemblers also do this, at least the ones I know of.
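
A minimal sketch of that experiment, with made-up struct names. The exact numbers depend on the platform; on common ones I would expect one slack byte in the second struct.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Byte data only: nothing here needs more than 1-byte alignment. */
struct bytes_only {
    char tag[3];
    char flag;
};

/* Byte data followed by an int: the int must start on an int boundary. */
struct mixed {
    char tag[3];
    int  value;
};

/* Guaranteed by the language: `value` sits at a suitably aligned offset. */
static_assert(offsetof(struct mixed, value) % alignof(int) == 0,
              "int member must be aligned");

/* Slack bytes inserted between the 3-byte `tag` and `value`. */
static size_t slack_before_value(void)
{
    return offsetof(struct mixed, value) - 3;
}
```

Printing `sizeof` for both structs and the result of `slack_before_value()` shows where the extra bytes went.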
« Last Edit: January 23, 2024, 05:42:10 pm by audiotubes »
I have taken apart more gear than many people. But I have put less gear back together than most people. So there is still room for improvement.
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 3722
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #7 on: January 23, 2024, 05:54:48 pm »
First off, it's not about the C language except that the C language defines that platforms are *allowed* to add padding bytes within a structure (with some restrictions).  It's a platform specific requirement.

The reason they do this, and the reason it is allowed, is that most CPUs require data of different types to be aligned (the lowest few bits of the address are zero) for maximum performance.  For instance, on most 32-bit and 64-bit platforms, a 32-bit integer needs to be aligned to an address that is a multiple of 4 bytes.  Failure to do that will either result in the CPU taking more cycles than necessary to load the data, or it could even cause a fault that either crashes the program or requires a software handler to fix.

When you write:

Code: [Select]
#include <stdint.h>

struct foo {
   int32_t a;   /* 4 bytes */
   char b;      /* 1 byte, usually followed by 3 bytes of padding */
};

the structure needs to be aligned on a 4 byte boundary for performance.  For a single structure you might think it would be OK to have the length be 5 as long as the starting address is properly aligned.  But if you want to make an array of struct foo, the elements need to be spaced 8 apart to keep the alignment correct for every element.  C doesn't allow padding *between* elements of arrays, so the struct itself must have padding.  That is why sizeof(struct foo) == 8.
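
A minimal sketch to check this on your own platform. The helper names are hypothetical, and the struct is the one from this post written with `int32_t`; the value 8 is what I would expect on common platforms, not something the language promises.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct foo {
    int32_t a;
    char    b;
};

/* The size is always a multiple of the alignment, so arrays stay aligned. */
static_assert(sizeof(struct foo) % alignof(struct foo) == 0,
              "size must be a multiple of alignment");

/* Distance in bytes between consecutive elements of an array of struct foo. */
static size_t element_stride(void)
{
    static struct foo arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}

/* Bytes after `b` that still belong to the struct (trailing padding). */
static size_t trailing_padding(void)
{
    return sizeof(struct foo) - (offsetof(struct foo, b) + sizeof(char));
}
```

`element_stride()` always equals `sizeof(struct foo)`, which is exactly why the trailing padding has to be counted in the size.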

The way data is represented by compilers on a specific platform is governed by the platform ABI (application binary interface).  This sets up rules to make sure that code interoperates properly, and includes specifying the data representation and layout of data types.  The ABI will specify padding based on the architecture's requirements, and compilers will follow the ABI even when it wouldn't be necessary in a particular situation so that code can interoperate.
 

Offline Bud

  • Super Contributor
  • ***
  • Posts: 6912
  • Country: ca
Re: Why compiler doesn't allocate required memory for struct
« Reply #8 on: January 23, 2024, 05:58:25 pm »
Kind of the same concept as storing data on a hard drive. A one-byte file will occupy an entire sector on the HDD, because the HDD controller treats the sector as the minimum unit of data space. Create a 1-byte file on the HDD and check its Properties. In Windows you will get two numbers: the actual file size and the size on disk. Size on disk will generally be bigger than the actual file size, unless the file size is an exact multiple of the sector size.
Facebook-free life and Rigol-free shack.
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 8652
  • Country: gb
Re: Why compiler doesn't allocate required memory for struct
« Reply #9 on: January 23, 2024, 06:05:37 pm »
Mechatrommer had already given you references to understand alignment, and I gave you its benefit: speed. I also gave you the alternative for when you care more about dense memory usage than speed, or where you have something big you don't access a lot, so there is little performance to be gained by aligning. What else are you looking for?
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 151
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #10 on: January 24, 2024, 04:43:18 pm »
Quote from: Bud
Kind of same concept as storing data on a hard drive. A one byte file will occupy the entire sector on the HDD, because the HDD controller "thinks" sector size as the minimum data space unit. Create a 1 byte long file on the HDD and check its Properties. In Windows you will get two numbers : The actual file size and Size on disk. Size on disk will generally be always bigger than the actual file size, unless file size exactly aligns with sector size.

This is not a very precise analogy, since it is rooted in completely different considerations. HDD space allocation is quite analogous to dynamic memory allocation through `malloc`. E.g. if you request 1 byte from `malloc` on x64 you will actually get 16. Alignment plays some role in this, but it is secondary (and is based on max-alignment, not on individual alignment).

Structure member alignment is much more elaborate than that: each field's alignment depends on its data type and can be different for different data types. For this reason, for example, the size of a struct type might easily depend on the ordering of its fields.
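
A minimal sketch of the ordering effect, with made-up struct names. On common platforms I would expect 12 bytes for the first struct and 8 for the second, but the exact values are platform-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* char, int32_t, char: padding is typically needed after each char. */
struct interleaved {
    char    c1;
    int32_t n;
    char    c2;
};

/* Same fields, widest first: the two chars can share one padded tail. */
struct widest_first {
    int32_t n;
    char    c1;
    char    c2;
};

/* Standard C lays members out in declaration order, so the
   compiler will not do this reordering on its own. */
static_assert(offsetof(struct interleaved, c2) > offsetof(struct interleaved, n),
              "members keep their declaration order");

/* Bytes saved per object by the reordering (often 4). */
static size_t reorder_savings(void)
{
    return sizeof(struct interleaved) - sizeof(struct widest_first);
}
```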
« Last Edit: January 24, 2024, 04:45:07 pm by TheCalligrapher »
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1210
  • Country: pl
Re: Why compiler doesn't allocate required memory for struct
« Reply #11 on: January 24, 2024, 10:40:39 pm »
Mechatrommer’s post answers the question of alignment. None answers the actual question posted. Alignment alone explains padding between elements, not after the last element.

The answer to the question of why C allocates more memory is: it does, because you tell it to do so. C has no features to allocate structures. malloc, calloc and realloc are unaware of the type being allocated. Each of them receives the number of bytes to allocate. You, the programmer, indicate how many bytes to allocate. If you think otherwise: stop here, carefully examine the expressions you are using, and what the meaning of each of them is at each step.

What you probably meant to ask is why sizeof returns 8 instead of 5. In C this is required to fulfil two requirements. An array type is defined as “contiguously allocated nonempty set of objects”. At the same time each object (in an array or not) must be aligned. The only way to make this true is if sizeof returns a value compatible with the structure’s alignment.

“Because the standard says so” is rarely a satisfying answer. In this case it’s even less satisfying: the same thing applies not only to C, and it doesn’t seem like this alignment is always required. Yes, there is a deeper, more technical reason: the way modern memory management is done. I don’t know how much you have learned about memory allocation so far. Is the picture you hold like this: a program wants to allocate 7 bytes of memory, so it asks the operating system to give it 7 bytes, and then passes them to the caller? If yes, it’s wrong. Operating systems pass memory to programs in pages: for example, on modern Linux systems a page is typically 4096 bytes. It would also be inefficient and burdensome for a program to constantly ask for single pages, so more often this is done in bunches ranging from megabytes to gigabytes. This address space is then distributed by the allocator. Managing memory can’t be done efficiently if an exact number of bytes is reserved each time. Allocators keep pools of chunks of pre-defined sizes, e.g. 4 bytes, 16 bytes, 64 bytes, and 1024 bytes.(1)(2) They then decide which chunk size fits the request best and return a chunk from that pool. If you see the problem from this perspective, the extra bytes at the end become irrelevant. They will be allocated anyway, because the memory management simply doesn’t offer finer granularity.


(1) The actual numbers and their count differ between environments. You may see jemalloc paper for an overview of one of such systems.
(2) Note that in C sizeof still returns what is indicated by alignment, not by how much memory the allocator may need to reserve.
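
A toy sketch of the size-class idea, using the example sizes from this post. The table and the function names are purely illustrative and do not describe any real allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative size classes only; real allocators use many more. */
static const size_t size_classes[] = { 4, 16, 64, 1024 };

/* Returns the smallest class that can hold `request`,
   or 0 if the request is larger than every class. */
static size_t pick_size_class(size_t request)
{
    size_t count = sizeof size_classes / sizeof size_classes[0];
    for (size_t i = 0; i < count; i++) {
        if (request <= size_classes[i])
            return size_classes[i];
    }
    return 0; /* a real allocator would take a different path here */
}

/* Bytes left unused inside the chosen chunk for a given request. */
static size_t internal_waste(size_t request)
{
    size_t chunk = pick_size_class(request);
    assert(chunk != 0 && "request too large for this toy table");
    return chunk - request;
}
```

With this table a 5-byte and an 8-byte request both land in the 16-byte class, so the 3 bytes of trailing padding change nothing about what gets reserved.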
People imagine AI as T1000. What we got so far is glorified T9.
 

Offline mianos

  • Contributor
  • Posts: 18
  • Country: au
Re: Why compiler doesn't allocate required memory for struct
« Reply #12 on: January 25, 2024, 12:42:19 am »
Malloc and friends are the 'C' higher level interface to 'set break'. I can assure you, they are very aware of word boundaries and will never return a pointer that is not on a word boundary. "The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object"

 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1210
  • Country: pl
Re: Why compiler doesn't allocate required memory for struct
« Reply #13 on: January 25, 2024, 03:22:50 am »
In C itself there is no such requirement. The actual quote, with the part you removed, is (deleted part underlined):
Quote from: 7.22.3§1
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and size less than or equal to the size requested.

The reason you may see apparent alignment to some specific value is the implementation detail of the allocator, not language’s requirement. Described in my post above. It’s also likely to be higher than “word”. For example on my x86_64 Linux glibc uses 32-byte alignment for one-byte allocations.
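
A minimal sketch of what the quoted rule does promise, assuming a C11 library that provides `max_align_t`. The function name is hypothetical. It deliberately checks only requests at least as large as `max_align_t`, since smaller requests are not covered by the guarantee.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if a block of `size` bytes from malloc is aligned for
   max_align_t, 0 if it is not, and -1 if the allocation failed. */
static int malloc_is_max_aligned(size_t size)
{
    /* Only requests this large are covered by the quoted guarantee. */
    assert(size >= sizeof(max_align_t));

    void *p = malloc(size);
    if (p == NULL)
        return -1;
    int ok = ((uintptr_t)p % alignof(max_align_t)) == 0;
    free(p);
    return ok;
}
```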

The C language is perfectly fine with creating objects as dictated by their actual alignment requirements. The example below works only in one direction (only the positive result is meaningful) and it must use stack to avoid using allocator, but — if it works on your platform — you can clearly see the language itself doesn’t care at all:
Code: [Select]
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    char a[1];
    char b[2];
    char c[4];
    char d[1];
    char e[3];
    char f[8];
   
    printf("%p\n%p\n%p\n%p\n%p\n%p\n", (void *)a, (void *)b, (void *)c, (void *)d, (void *)e, (void *)f);

    return EXIT_SUCCESS;
}

malloc and friends are also not “higher level interface” to brk, as they do a completely different thing. brk is merely a step in obtaining address space from the system. You wouldn’t be able to do anything with it alone. Management is done by the allocator, and malloc & co. are a standardized interface to the allocator. Relying on brk, or at least solely on brk, is also a somewhat dated concept. Nowadays mmap is used also/instead. For example glibc uses hybrid approach.
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 151
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #14 on: January 25, 2024, 06:39:44 am »
Quote from: golden_labels
What you probably meant to ask is why sizeof returns 8 instead of 5. In C this is required to fulfil two requirements. An array type is defined as “contiguously allocated nonempty set of objects”. At the same time each object (in an array or not) must be aligned. The only way to make this true is if sizeof returns a value compatible with the structure’s alignment.

While the above is correct, it still lacks an important piece of the puzzle.

Yes, all elements in an array must be aligned. But how does that prevent sizeof from returning 5 anyway (for struct from post #7)? Just let sizeof return 5 and just keep aligning the elements in an array on 8-byte boundary. What's the problem with that? The extra 3 padding bytes are, of course, still necessary between array elements, but it does not have to be included into the above struct's sizeof. In other words, trailing padding does not have to count as part of each array element. It is only added by the array itself if/when necessary. Done.

However, C and C++ decided to make an additional guarantee. For an array type T [N] the following is required to hold true: sizeof(T [N]) == sizeof(T) * N. This guarantee is what immediately shoots down the idea of adding trailing padding only in arrays "as necessary". This is what forces the trailing padding into struct types, making it an integral part of the struct type. This is why the aforementioned sizeof has to return 8, not 5.
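
A minimal sketch of that guarantee, using the struct from post #7 written with `int32_t`. `N`, `ARRAY_LEN` and `count_elements` are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foo {
    int32_t a;
    char    b;
};

enum { N = 10 };

/* The guarantee discussed above, checked at compile time. */
static_assert(sizeof(struct foo[N]) == sizeof(struct foo) * N,
              "array size is element size times element count");

/* The familiar element-count idiom relies on the same guarantee. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

static size_t count_elements(void)
{
    struct foo items[N];
    return ARRAY_LEN(items);
}
```

If trailing padding lived only in arrays and not in `sizeof(struct foo)`, the `ARRAY_LEN` idiom would give the wrong count.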
« Last Edit: January 25, 2024, 06:49:04 am by TheCalligrapher »
 

Offline Mechatrommer

  • Super Contributor
  • ***
  • Posts: 11653
  • Country: my
  • reassessing directives...
Re: Why compiler doesn't allocate required memory for struct
« Reply #15 on: January 25, 2024, 07:21:28 am »
What's the point of all that discussion? You declare pragma or attribute 'pack', and all that alignment thing becomes nonsense. Anyway, it's the programmer's job to be aware of this matter and take action according to the SW spec, especially in a cross OS/machine/storage-medium environment. Apart from sizeof, you can analyze the offsets of the member addresses to see whether padding or packing occurred. And there are many more techniques to guard against reading the wrong memory. YMMV.
 

Offline golden_labels

  • Super Contributor
  • ***
  • Posts: 1210
  • Country: pl
Re: Why compiler doesn't allocate required memory for struct
« Reply #16 on: January 25, 2024, 03:22:18 pm »
Quote from: golden_labels
What you probably meant to ask is why sizeof returns 8 instead of 5. In C this is required to fulfil two requirements. An array type is defined as “contiguously allocated nonempty set of objects”. At the same time each object (in an array or not) must be aligned. The only way to make this true is if sizeof returns a value compatible with the structure’s alignment.
Quote from: TheCalligrapher
(…) But how does that prevent sizeof from returning 5 anyway (for struct from post #7)? (…)
Through the word I underlined (“contiguously”). The array must be contiguous: there can be no additional space between objects.

While the language never gives that as an explicit reason, pointer arithmetic as defined by C wouldn’t be possible without the contiguity requirement. Casting to char*, offsetting by the size of the object, and casting back to the original type must yield a valid pointer to the object.(1) If sizeof returned 5 instead of 8, the resulting pointer would be invalid, both in terms of program logic (not actually indicating an object) and at the implementation level (invalid alignment).
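
A minimal sketch of the char* round trip described above, using the struct from post #7 written with `int32_t`. `next_via_char` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct foo {
    int32_t a;
    char    b;
};

/* Steps from one array element to the next by going through char*.
   `elem` must point to an array element that has a successor. */
static struct foo *next_via_char(struct foo *elem)
{
    assert(elem != NULL);
    char *raw = (char *)elem;      /* view the object as bytes */
    raw += sizeof(struct foo);     /* offset by the size of the object */
    return (struct foo *)raw;      /* cast back to the original type */
}
```

Given `struct foo arr[3]`, `next_via_char(&arr[0])` is the same pointer as `&arr[1]`, which only works because `sizeof` includes the trailing padding.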

Quote from: Mechatrommer
You declare pragma or attribute 'pack', and all that alignment thing becomes nonsense.
OP asked about why, not how to avoid that. Using “packing” is not answering the “why” part at all.

Other than that, the question was stated in the context of C, not C-on-particular-platform-with-particular-compiler. C itself has no concept of “packing”: it’s a platform-specific extension.


(1) Assuming there is any object there, of course.
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 151
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #17 on: January 25, 2024, 04:25:10 pm »
Quote from: golden_labels
Through the word I underlined. The array must be contiguous: there can be no additional space between objects.

Well, no. Theoretically, as one alternative language design decision, there can be additional space between objects. It is possible to detach the concept of "alignment" from the concept of "size". It would require more work, make the language spec more complicated in some areas (and less complicated in others), but it is doable. However, C and C++ chose to follow a different design alternative and tie these concepts together. I.e., as I said above, C and C++ decided to postulate that sizeof(T [N]) = sizeof(T) * N. Tight stacking of array elements one after another should automatically produce proper alignment. This is a simpler approach.

It does cause some problems, like the well-known catastrophically problematic situation with sizeof(long double) == 16 in GCC on x64 today. But it is a simpler approach.

Quote from: golden_labels
While the language never gives that as an explicit reason, pointer arithmetic as defined by C wouldn’t be possible without the contiguity requirement. Casting to char*, offsetting by the size of the object, and casting back to the original type must yield a valid pointer to the object.

That's not a reason, but a consequence, tailored to serve the same design choice I mentioned above.

Just as well, the language could have postulated that pointer arithmetic shall work in terms of alignment, not in terms of size. I.e. "casting to char*, offsetting by the alignment requirement of the object, and casting back to the original type must yield a valid pointer to the object." (Here, fixed that for you.) That would also work.

This alternative approach would change nothing wrt arrays, i.e. instead of sizeof(T [N]) = sizeof(T) * N we'd have sizeof(T [N]) = alignof(T) * N, which is the same thing. However, it would've allowed tighter packing of standalone objects, like consecutive fields in a struct. Modern C and C++ language specs make serious efforts trying to resolve such issues within the current approach.
« Last Edit: January 25, 2024, 04:32:31 pm by TheCalligrapher »
 

Offline mianos

  • Contributor
  • Posts: 18
  • Country: au
Re: Why compiler doesn't allocate required memory for struct
« Reply #18 on: January 29, 2024, 12:17:58 am »
Maybe not true these days, but the original malloc called brk, that's what I call a higher level interface, opinions may differ.
Mallocs have always tended to return word-aligned blocks. There is also a second reason. A structure used to manage the free list is placed in front of the returned pointer. It contains pointers for the list, which want to be pointer-aligned, so the returned pointer sits at the end of this structure and is aligned as well.
There are probably exceptions but this is just from the source for the old malloc and friends.
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 151
  • Country: us
Re: Why compiler doesn't allocate required memory for struct
« Reply #19 on: January 29, 2024, 12:30:45 am »
Quote from: mianos
There is a structure that is placed in front of the pointer returned to manage the free list, which contains pointers for the list, which want to be pointer aligned so a pointer returned will be on the end of this, and aligned.

The "classic" implementation of `malloc` stores size of the block in front of the pointer. There's no need to store anything for the free list there simply because the block is not free. Once the block is freed, its own "normal" memory will be re-used to store additional information used by free list. This is the reason why freed memory blocks generally retain their contents, except for a few initial bytes.

This means that a "classic" `malloc`ed memory block must be at least large enough to store two pointers (assuming the size is as large as a pointer). And should stick to size/pointer alignment requirements. This is how many implementations work to this day.
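
A toy sketch of the "size stored in front of the pointer" idea. This is not any real `malloc`: it is a bump allocator over a static arena that never frees anything, and every name in it (`block_header`, `toy_alloc`, `toy_block_size`, `toy_is_aligned`) is made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Header kept directly in front of every block handed out. The union with
   max_align_t gives the header the maximum fundamental alignment. */
typedef union {
    size_t      size;    /* size the caller asked for */
    max_align_t align;   /* never read; only forces size and alignment */
} block_header;

static_assert(sizeof(block_header) % alignof(max_align_t) == 0,
              "header must not disturb the payload's alignment");

/* Toy arena measured in header-sized units; blocks are never freed. */
enum { ARENA_UNITS = 256 };
static block_header arena[ARENA_UNITS];
static size_t units_used = 0;

/* Returns at least `size` bytes, or NULL if the toy arena is exhausted. */
static void *toy_alloc(size_t size)
{
    if (size > (ARENA_UNITS - 1) * sizeof(block_header))
        return NULL;                          /* also rules out overflow below */
    size_t payload_units =
        (size + sizeof(block_header) - 1) / sizeof(block_header);
    size_t units = payload_units + 1;         /* +1 for the header itself */
    if (units > ARENA_UNITS - units_used)
        return NULL;
    block_header *hdr = &arena[units_used];
    hdr->size = size;
    units_used += units;
    return hdr + 1;                           /* payload starts after header */
}

/* Reads the size stored in front of a block returned by toy_alloc. */
static size_t toy_block_size(const void *block)
{
    return ((const block_header *)block - 1)->size;
}

/* True if `block` is aligned for any type with fundamental alignment. */
static int toy_is_aligned(const void *block)
{
    return (uintptr_t)block % alignof(max_align_t) == 0;
}
```

Because the header is itself a multiple of the maximum alignment, every returned pointer comes out aligned without any extra work, which is the effect being described in this exchange.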

 

Offline mianos

  • Contributor
  • Posts: 18
  • Country: au
Re: Why compiler doesn't allocate required memory for struct
« Reply #20 on: January 29, 2024, 05:48:56 am »
Maybe true these days, but we used to store the pointers to adjacent blocks so we don't have to search the whole free list to join them up.
Again, in practice it may be different, this is only from my knowledge implementing the allocator for the M68K C runtime. (no mmap at the time).
 

