Author Topic: Typecasting pointer in C  (Read 11167 times)

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Typecasting pointer in C
« on: November 03, 2018, 01:30:33 pm »
I have a basic question:

Why are we typecasting pointers in C like this:
Code: [Select]

    int b = 65;
    int *p = &b;
   
    char *c = (char *)p;

and not as this:

Code: [Select]
char *c = (char) *p;

Thanks.
 

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #1 on: November 03, 2018, 01:38:22 pm »
I have a basic question:

Why are we typecasting pointers in C like this:
Code: [Select]

    int b = 65;
    int *p = &b;
   
    char *c = (char *)p;

and not as this:

Code: [Select]
char *c = (char) *p;

Thanks.
*p is "the thing that p points to". In this case, with p being an int*, *p is an int.

Your second piece of code would:
Take *p (which is an int).
Cast it to a char.
Assign that char to the char* c (nonsensical, since a character value is not an address).

In your first code example, p is an int* again. Your assignment to c (a char*) takes the pointer-to-int p, casts it to a pointer-to-char, and assigns that to c. That means that c points to some portion of b (because char is narrower than int on [almost?] every platform).
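To make the difference concrete, here is a minimal sketch. The helper names first_byte_of and value_as_char are made up for illustration. The first casts the pointer and looks at the byte stored at the lowest address of b; the second casts the pointed-to value. Which byte of b sits at the lowest address depends on the machine's byte order, so the first helper gives 65 on a little-endian machine and (with a multi-byte int) 0 on a big-endian one.

```c
/* Hypothetical helper: return the byte stored at the lowest address of *p.
   Any object may be inspected through an unsigned char pointer. */
unsigned char first_byte_of(const int *p)
{
    const unsigned char *c = (const unsigned char *)p; /* cast the pointer */
    return c[0];
}

/* Hypothetical helper: convert the pointed-to value instead.
   The result is a char, not a pointer, so it belongs in a char variable. */
char value_as_char(const int *p)
{
    return (char)*p; /* cast the value */
}
```

With int b = 65, value_as_char(&b) is always 'A' in ASCII, while first_byte_of(&b) depends on endianness as described above.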
 
The following users thanked this post: nForce

Offline legacy

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Typecasting pointer in C
« Reply #2 on: November 03, 2018, 02:06:06 pm »
look at this difference

On m68k, casting to (uint8_t) in arithmetic operations forces the compiler to use 8-bit operations (in Motorola assembly, the ones with the ".b" suffix). This matters for how the ALU sets the flags in the status register, in particular the overflow flag, which is computed for a 2^(8-1), 2^(16-1) or 2^(32-1) range depending on the suffix { ".b", ".w", ".l" }.

on MIPS32, casting (uint8_t) in arithmetic operations ... seems to be completely irrelevant, what really matters on my Atlas C compiler is if you cast signed or unsigned. MIPS has no sized operation, everything is supposed to be 32bit. Oh, I still have to check what happens on MIPS64 where the ALU is 64bit.

Casting a pointer sets the operation size:

e.g. if you cast a pointer to an 8-bit type, the element size is 1 byte, and pointer++ increments it by 1.
e.g. if you cast a pointer to a 16-bit type, the element size is 2 bytes, and pointer++ increments it by 2.
e.g. if you cast a pointer to a 32-bit type, the element size is 4 bytes, and pointer++ increments it by 4.

Code: [Select]
private void hAllo()
{
    p_this_t   p_this; /* void pointer */
    p_uint8_t  p_08;
    p_uint16_t p_16;
    p_uint32_t p_32;
    char_t     body[] = "0123456789abcdef";

    p_this = body;
    p_08   = p_this;
    p_16   = p_this;
    p_32   = p_this;

    p_08++;
    p_16++;
    p_32++;

    console_out_char( "08: ", (char_t) dereference(p_08), "\n" );
    console_out_char( "16: ", (char_t) dereference(p_16), "\n" );
    console_out_char( "32: ", (char_t) dereference(p_32), "\n" );
}

Code: [Select]
OrangeCube pointers $ obj/hAllo

08: 1
16: 2
32: 4
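The same effect can be shown in plain standard C, without the custom typedefs used above. This is only a sketch: the step_* helper names are made up, and each pointer walks over an array of its own type so that no misaligned access is involved. Each helper measures how many bytes a single increment moves the pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: how many bytes does p + 1 advance for each type? */
size_t step_u8(void)
{
    uint8_t buf[2];
    uint8_t *p = buf;
    uint8_t *q = p + 1;                 /* same address p++ would produce */
    return (size_t)((char *)q - (char *)p);
}

size_t step_u16(void)
{
    uint16_t buf[2];
    uint16_t *p = buf;
    uint16_t *q = p + 1;
    return (size_t)((char *)q - (char *)p);
}

size_t step_u32(void)
{
    uint32_t buf[2];
    uint32_t *p = buf;
    uint32_t *q = p + 1;
    return (size_t)((char *)q - (char *)p);
}
```

On a platform with 8-bit bytes the three helpers return 1, 2 and 4, which matches the characters printed above.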
 
The following users thanked this post: nForce

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Typecasting pointer in C
« Reply #3 on: November 03, 2018, 03:30:51 pm »
I have a basic question:

Why are we typecasting pointers in C like this:
Code: [Select]

    int b = 65;
    int *p = &b;
   
    char *c = (char *)p;

and not as this:

Code: [Select]
char *c = (char) *p;

Thanks.
Why? Because the programmer who wrote that clearly doesn't know how to write C. Do yourself a favour and never ever use typecasting.

One of the problems you'll run into is that many architectures (ARM for example) don't allow unaligned access. This can lead to two problems: 1) the compiler ignores the alignment and you get hard-to-track-down data corruption or random crashes; 2) the compiler sees the unaligned access and decides to produce code which accesses the data byte by byte, making the program bloated and slow.

PS: 'char *c = (char) *p' is different from 'char *c = (char *) p'. The first says to give pointer c the value of a character at the address p points to. Even if it compiles, it is still wrong. The second says make pointer c point to whatever pointer p is pointing to and treat pointer p as being a pointer to a character.
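One common way to sidestep the alignment problem is to copy the bytes with memcpy instead of dereferencing a cast pointer. The sketch below assumes a hypothetical helper name, load_u32, and reads the value in the machine's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from any byte address.
   memcpy has no alignment requirement, unlike *(const uint32_t *)src. */
uint32_t load_u32(const unsigned char *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Compilers typically turn this into a single load on targets that allow unaligned access and into byte accesses where they must, so the source stays correct either way.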
« Last Edit: November 03, 2018, 03:35:08 pm by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline legacy

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: Typecasting pointer in C
« Reply #4 on: November 03, 2018, 04:13:11 pm »
the compiler ignores the alignment and you'll get hard to track down data corruption or the program crashes randomly

yeah, the Atlas C89 compiler is a neat example of this for MIPS :D
 

Offline ale500

  • Frequent Contributor
  • **
  • Posts: 415
Re: Typecasting pointer in C
« Reply #5 on: November 03, 2018, 04:19:30 pm »
Quote
One of the problems you'll run into is that many architectures (ARM for example) don't allow unaligned access

I'm pretty sure Cortex-M3 cores allow it, while MIPS cores do not (there are special instructions for unaligned accesses). I think older ARMs didn't either.
 

Offline joshtyler

  • Contributor
  • Posts: 36
  • Country: gb
Re: Typecasting pointer in C
« Reply #6 on: November 03, 2018, 04:25:19 pm »
Do yourself a favour and never ever use typecasting.

I think this advice is a bit too strong. Typecasting in C does make it easy to get yourself into a mess (as your comment demonstrates), but sometimes you need to typecast variables, and in some circumstances doing so is perfectly safe.

For example, I would argue the following code is a perfectly safe and valid thing to write, despite the implicit typecast:
Code: [Select]
char c = 'a';
putchar(c); //Note, putchar has signature: int putchar(int character)
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Typecasting pointer in C
« Reply #7 on: November 03, 2018, 06:24:31 pm »
Do yourself a favour and never ever use typecasting.

I think this advice is a bit too strong. Typecasting in C does make it easy to get yourself into a mess (as your comment demonstrates), but sometimes you need to typecast variables, and in some circumstances doing so is perfectly safe.

For example, I would argue the following code is a perfectly safe and valid thing to write, despite the implicit typecast:
Code: [Select]
char c = 'a';
putchar(c); //Note, putchar has signature: int putchar(int character)
The problem with this particular example is that it is an inheritance from a past where safe coding wasn't very high on anyone's list and saving memory space had a higher priority. It leads to situations where the char type just doesn't cut it. Another fine example of this duplicity is getchar(), which returns an int: when the returned value is EOF (a negative value), no character has been received.
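The reason the return type is int shows up in the usual read loop. The sketch below uses fgetc (the FILE * sibling of getchar) so it can work on any stream; the helper name count_bytes is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in a stream.
   The result of fgetc() is kept in an int so that EOF stays
   distinguishable from every valid character value. */
long count_bytes(FILE *in)
{
    long n = 0;
    int ch;                              /* int, not char */
    while ((ch = fgetc(in)) != EOF)
        n++;
    return n;
}
```

Storing the result in a char first would either make a 0xFF byte look like EOF or make EOF undetectable, depending on whether char is signed.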
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3642
  • Country: us
Re: Typecasting pointer in C
« Reply #8 on: November 03, 2018, 06:33:42 pm »
I wouldn't call that duplicitous. It's the programmer's job to read and internalize his language manual instead of making assumptions.
In C you don't have fancy exceptions and so the return value is used to signal success or failure.

I don't know if I agree that "saving memory space had a higher priority in the past." In K&R C, all function arguments are either ints or doubles, you cannot even have a function taking only a char. The automatic widening in putchar(c) would have also been true for any function you could have defined to operate on characters.
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Typecasting pointer in C
« Reply #9 on: November 03, 2018, 07:18:53 pm »
I wouldn't call that duplicitous. It's the programmer's job to read and internalize his language manual instead of making assumptions.
In C you don't have fancy exceptions and so the return value is used to signal success or failure.

I don't know if I agree that "saving memory space had a higher priority in the past." In K&R C, all function arguments are either ints or doubles, you cannot even have a function taking only a char. The automatic widening in putchar(c) would have also been true for any function you could have defined to operate on characters.
Remember that an int's actual size (=number of bits) depends on the architecture. On some (ancient) architectures an int may be an 8 bit signed.
 

Offline joshtyler

  • Contributor
  • Posts: 36
  • Country: gb
Re: Typecasting pointer in C
« Reply #10 on: November 03, 2018, 07:39:52 pm »
Even if you don't like that example, it is trivial to construct another one. Let's say we have an 8-bit UART, and want a program that sums all of the numbers received over that UART:
Code: [Select]
uint8_t read_from_uart(void);
...
uint16_t sum = 0;
while(1)
{
    sum += read_from_uart(); // Typecast
    ...
    // Do something useful here
}

Or maybe you're writing a desktop application and need some memory off the heap. You will need malloc, good luck using the output without casting void * to something else!
Code: [Select]
void * malloc(size_t size);


I think it is impossible to avoid the fact that typecasting is inherent to the C language, and you are forced to do it sometimes. C doesn't give you a lot of safety nets; it is up to you as the programmer to build them in.
 
The following users thanked this post: nForce

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #11 on: November 03, 2018, 08:09:21 pm »
Remember that an int's actual size (=number of bits) depends on the architecture. On some (ancient) architectures an int may be an 8 bit signed.
Ints are guaranteed since at least the C89 language spec to span [-32767, 32767] (inclusive).
In practice, that guarantees they're going to be at least 16 bits on a C compiler meeting the 1989 or later C spec.
I'm too lazy to go chase down my 1978 copy of K&R to verify that it was spec'd the same back then as well.
 

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #12 on: November 03, 2018, 08:15:33 pm »
I found a scanned version of 1978 K&R online.

It does not specify that ints must be at least that large, but rather leaves it as "the natural size of integers on the host machine".
At that time, no implementation detailed in the table had fewer than 16 bits.

Reference: https://archive.org/details/TheCProgrammingLanguageFirstEdition/page/n41
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Typecasting pointer in C
« Reply #13 on: November 03, 2018, 08:46:27 pm »
You've got to love that text: 'short is no longer than long'. Having types with basically unknown ranges causes lots of grief.
 

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #14 on: November 03, 2018, 08:51:42 pm »
You've got to love that text: 'short is no longer than long'. Having types with basically unknown ranges causes lots of grief.
Totally agreed. That's why <limits.h> exists, but the programmers who are most likely to be bitten by overflow bugs (ones on the junior "half" of the experience curve) are the ones least likely to know about limits.h
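As a small illustration of what <limits.h> buys you, here is a sketch of an overflow check done before the addition rather than after. The helper name add_would_overflow is made up; INT_MAX and INT_MIN are the standard macros.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: would a + b overflow an int?
   The comparison is arranged so that it never overflows itself. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* a + b would exceed INT_MAX */
    return a < INT_MIN - b;       /* a + b would fall below INT_MIN */
}
```

Because the limits come from the header, the same check works whether int is 16, 32 or 64 bits wide.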
 

Offline glarsson

  • Frequent Contributor
  • **
  • Posts: 814
  • Country: se
Re: Typecasting pointer in C
« Reply #15 on: November 03, 2018, 09:07:37 pm »
You've got to love that text: 'short is no longer than long'. Having types with basically unknown ranges causes lots of grief.
But it made it possible to write efficient C compilers for machines with what are today odd word sizes (12, 18, 24, 36...). Also, it didn't prevent C from prospering and surviving, outclassing many other languages.
 

Offline nctnico

  • Super Contributor
  • ***
  • Posts: 26906
  • Country: nl
    • NCT Developments
Re: Typecasting pointer in C
« Reply #16 on: November 03, 2018, 09:08:42 pm »
You've got to love that text: 'short is no longer than long'. Having types with basically unknown ranges causes lots of grief.
Totally agreed. That's why <limits.h> exists, but the programmers who are most likely to be bitten by overflow bugs (ones on the junior "half" of the experience curve) are the ones least likely to know about limits.h
True. I always tell people to use the types from stdint.h to avoid any nasty surprises.
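For instance, the UART sum from earlier in the thread can be written with fixed-width types so that its range is the same on every platform. This is only a sketch: sum_bytes is a made-up name, and the samples come from a plain array rather than real hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add 8-bit samples into a 16-bit total.
   The total wraps modulo 65536 regardless of how wide int is. */
uint16_t sum_bytes(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint16_t)(sum + data[i]);  /* operands promote to int, result is narrowed back */
    return sum;
}
```

The explicit (uint16_t) cast is there to document the narrowing; the addition itself relies on the usual implicit promotions.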
 

Offline bsfeechannel

  • Super Contributor
  • ***
  • Posts: 1667
  • Country: 00
Re: Typecasting pointer in C
« Reply #17 on: November 03, 2018, 09:46:27 pm »
I have a basic question:

Why are we typecasting pointers in C like this:
Code: [Select]

    int b = 65;
    int *p = &b;
   
    char *c = (char *)p;

and not as this:

Code: [Select]
char *c = (char) *p;

Thanks.

Because this is one of the "quirks" of the C language, recognized by its author, Dennis Ritchie. The asterisk * means different things in different contexts in C. The most confusing aspect of the use of the asterisk is exactly in declarations.

On the left side of the equals sign of a declaration, an asterisk means "is a pointer to a(n)". Declarations in C are read from right to left. So

char *c;

means "c is a pointer to a char".

On the right side of the equals sign, the asterisk means "get the value pointed to by". So

char *c = (char) *p;

means "get the value pointed to by p (which is an integer), cast it to a char and assign it to c, which is a pointer to a char".

You'll be yelled at by the compiler, because c expects an address, and you are assigning it a char.

When you write

char *c = (char *) p;

you are telling the compiler to take the value of p (not the value pointed to by p), cast it to a pointer to a char, which is an address, and assign it to c.

However, if I'm not mistaken, the C standard only objects to (char *) p not being a constant expression when c has static storage duration (for example, a variable defined at file scope). Inside a function the initializer is fine.
 
The following users thanked this post: nForce, BrianHG

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Re: Typecasting pointer in C
« Reply #18 on: November 04, 2018, 02:00:19 pm »
Thanks   :)

I have another question: Why can't we use this syntax for multi-dimensional arrays:

Code: [Select]

void pointerArray(int **A){
    //Some code
}

int main() {
   
    int somearray[2][3] = {{1,2,3},{4,5,6}};
   
    pointerArray(somearray);

    return 0;
}

Because we can use for the one-dimensional array as this :

Code: [Select]
void pointerOneDimensional(int *A){
    //Some code
}


Thanks again, you guys are great.
 

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #19 on: November 04, 2018, 02:17:08 pm »
Because int** doesn't convey the dimension of the array.

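Give the compiler an int **A, and ask it to evaluate A[1][1]: it will load the pointer stored at A[1] and index through that, but somearray holds ints, not pointers. To index a true 2-D array the compiler needs the row length, which int** doesn't carry, so it can't compute the offset.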

An int[2][3] array is not stored as an array of 2 int*s, each pointing to an array of 3 ints, but rather more likely stored as a block of 6 ints, with the compiler knowing how to calculate the offsets into the array to make it appear as a 2-by-3 array.

I don't know if the spec requires this behavior, but it seems almost universal.
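A version of the original example that does compile might look like the sketch below. The function name sum_rows is made up; the parameter type int (*a)[3] ("pointer to an array of 3 ints") is what somearray actually converts to when it is passed, and the number of rows has to be passed separately.

```c
#include <stddef.h>

/* Hypothetical helper: sum a 2-D array that has exactly 3 columns. */
int sum_rows(int (*a)[3], size_t rows)
{
    int total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < 3; c++)
            total += a[r][c];     /* offset is r * 3 + c ints from the start */
    return total;
}
```

Calling sum_rows(somearray, 2) with int somearray[2][3] = {{1,2,3},{4,5,6}} needs no cast and returns 21.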
 
The following users thanked this post: nForce

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1719
  • Country: se
Re: Typecasting pointer in C
« Reply #20 on: November 05, 2018, 10:15:40 am »
I don't know if the spec requires this behavior, but it seems almost universal.

Yes, it does!
Chapter 6.5.2.1 of C99 (AFAICR, unchanged from C89), end of clause 3:
Quote
It follows from this that arrays are stored in row-major order (last subscript varies fastest).
The example following in clause 4 is very close to your description.

Code: [Select]
    sum += read_from_uart(); // Typecast
No typecast there, I see only an implicit conversion.

Or maybe you're writing a desktop application and need some memory off the heap. You will need malloc, good luck using the output without casting void * to something else!
Code: [Select]
void * malloc(size_t size);
You might be mixing up C and C++.

In C one can go from pointer to void to pointer to object and back without casts (C99 chapter 6.3.2.3).
In C++ only going from pointer to object to pointer to void is allowed without an explicit cast.
But maybe you are thinking of an implicit conversion also in this case.  :-//
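A short sketch of the C side of this, with a made-up helper name make_sequence: the void * returned by malloc is assigned straight to an int * with no cast. Overflow of n * sizeof *v is not checked here, to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an array of n ints holding 0..n-1.
   Returns NULL on failure; the caller frees the result. */
int *make_sequence(size_t n)
{
    int *v = malloc(n * sizeof *v);   /* no cast: void * converts implicitly in C */
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        v[i] = (int)i;
    return v;
}
```

The same line would need a cast (or, better, new) to compile as C++.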
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: sokoloff, joshtyler

Offline sokoloff

  • Super Contributor
  • ***
  • Posts: 1799
  • Country: us
Re: Typecasting pointer in C
« Reply #21 on: November 05, 2018, 11:19:19 am »
I don't know if the spec requires this behavior, but it seems almost universal.
Yes, it does!
Chapter 6.5.2.1 of C99 (AFAICR, unchanged from C89), end of clause 3:
I normally look those things up myself, but didn't have time at that moment.
Very much appreciate you finding and giving the reference. Thanks!
 

Offline joshtyler

  • Contributor
  • Posts: 36
  • Country: gb
Re: Typecasting pointer in C
« Reply #22 on: November 05, 2018, 07:20:13 pm »

No typecast there, I see only an implicit conversion.

...

You might be mixing up C and C++.

In C one can go from pointer to void to pointer to object and back without casts (C89 chapter 6.3.2.3).
In C++ only going from pointer to object to pointer to void is allowed without an explicit cast.
But maybe you are thinking of an implicit conversion also in this case.  :-//

I was using typecasting to cover both explicit (using a cast operator) and implicit type conversions, as I've usually heard it used to mean both. However I can see that there can be some ambiguity there since, as far as I am aware, the C standard doesn't actually use the term "typecasting" anywhere!

Edit: Having a quick google, it does seem that whilst the terms are used loosely, typecasting is usually reserved for explicit casting, rather than implicit conversion. So I can see that my comment was likely misleading.
« Last Edit: November 05, 2018, 07:27:10 pm by joshtyler »
 
The following users thanked this post: newbrain

Offline nForce (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 393
  • Country: ee
Re: Typecasting pointer in C
« Reply #23 on: November 05, 2018, 07:27:17 pm »
Because int** doesn't convey the dimension of the array.

Give the compiler an int **A, and ask the compiler to dereference A[1][1]. Without knowing the dimensions (which int** doesn't carry), it can't do it.

An int[2][3] array is not stored as an array of 2 int*s, each pointing to an array of 3 ints, but rather more likely stored as a block of 6 ints, with the compiler knowing how to calculate the offsets into the array to make it appear as a 2-by-3 array.

I don't know if the spec requires this behavior, but it seems almost universal.

What would be the correct way to pass a multidimensional array to a function?
And why are these equal:

Code: [Select]
char **argv

//and

char *argv[]
?
 

Offline helius

  • Super Contributor
  • ***
  • Posts: 3642
  • Country: us
Re: Typecasting pointer in C
« Reply #24 on: November 05, 2018, 08:01:35 pm »
And why are these equal?
They aren't equal! When you declare a variable, a pointer type (with a *) is only a scalar variable that holds an address. An array type (with []s) is an array variable with multiple slots. However, when declaring the parameters in a function prototype, they both mean that a pointer is passed. This is because array variables are converted to ("decay to") pointers to their first element when they are used in any way other than as the operand of the sizeof or unary & operators.

Together with the definition of strings as arrays, this has some surprising consequences. Consider the main(int argc, char *argv[]) function. You can write the second parameter as char **argv as well: the two mean the same thing in a function prototype. (char argv[][] is not allowed, though: only the outermost dimension may be left empty.)

The main function never knows the dimensions of argv. Instead, it relies on the fact that strings are null-terminated and the number of strings in the argv array is passed in the first argument, argc. So when it calls strlen(argv[j]), it is guaranteed to only access legal elements of the array of strings, as long as (j<argc) is true. The result of strlen tells it how many elements of the given string it can access.
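Here is a minimal sketch of that pattern, using a made-up helper name total_arg_length. The parameter is written as char **argv; declaring it as char *argv[] would give exactly the same function type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total length of all argument strings.
   argc bounds the outer loop; the terminating '\0' bounds each string. */
size_t total_arg_length(int argc, char **argv)
{
    size_t total = 0;
    for (int j = 0; j < argc; j++)
        total += strlen(argv[j]);
    return total;
}
```

Calling it from main as total_arg_length(argc, argv) works unchanged whichever of the two spellings main itself uses.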

This dependence on the data content being well-formed to prevent out-of-bounds memory access is one reason C is weakly typed.

What would be the correct way to pass a multidimensional array to a function?
You can define a parameter with dimensions except for the most-major:
Code: [Select]
int example(int multi[][10][16]) {
    if (multi[0][9][15] == 0) ...
}
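The important bit is that the minor (rapidly varying) dimensions are known so that the function knows how to calculate the address. multi[j][k][l] is the same as *((int *)multi + l + z * (k + y * j)), where x, y, and z are the dimensions given in the array's definition (here y = 10 and z = 16). Note that x, the most-major dimension, does not appear in the calculation, which is why it can be left out.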
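The offset calculation can be checked against real addresses. In the sketch below, Y and Z are the two minor dimensions from the example (10 and 16), and both helper names are made up: flat_index computes (j * Y + k) * Z + l by hand, and measured_index derives the same number from the address the compiler produces.

```c
#include <stddef.h>

enum { Y = 10, Z = 16 };   /* the two minor dimensions, as in the post */

/* Hypothetical helper: offset, in ints, of multi[j][k][l] from multi[0][0][0]. */
size_t flat_index(size_t j, size_t k, size_t l)
{
    return (j * Y + k) * Z + l;
}

/* Hypothetical helper: measure the same offset from the actual addresses. */
size_t measured_index(int multi[][Y][Z], size_t j, size_t k, size_t l)
{
    ptrdiff_t bytes = (char *)&multi[j][k][l] - (char *)&multi[0][0][0];
    return (size_t)(bytes / (ptrdiff_t)sizeof(int));
}
```

For an array declared as int m[2][10][16], both helpers give 319 for the element m[1][9][15], the last one in the array.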

Here's an example: https://onlinegdb.com/HyYy9HCn7
« Last Edit: November 06, 2018, 05:56:45 am by helius »
 
The following users thanked this post: nForce

