-
Hey guys!
Okay, so I made this function in C, "add_two", which just adds two to each value in an array.
It works, but I'm wondering why it doesn't have to return anything to main. Is it because the array values are stored in the program memory (accessible to both my function and main), or is there another reason? Pointers puzzle me a bit. Here's the code:
Thanks,
-Dom
/******************************************************************************
pointers (a memory address that POINTS to a variable)
*******************************************************************************/
#include <stdio.h>
void add_two(int *array, int size);
int main()
{
const char * name = "John"; /* "name" is a stack variable holding a pointer to a single character; the rest of the characters follow it in memory.
Think of a stack of registers, where "J" is the first (top), followed by "o","h","n","\0"
therefore the pointer points to a larger amount of memory */
// arrays are not pointers, but an array name converts to a pointer to its first element when passed to a function
int my_array[5] = {2,2,2,2,2};
int my_size = sizeof(my_array) / sizeof(my_array[0]);
add_two(my_array,my_size);
for(int i = 0; i < my_size; i++)
{
printf("%d ",my_array[i]);
}
return 0;
}
void add_two(int *array, int size)
{
for(int i = 0; i < size; i++)
{
array[i] += 2;
} // dont need to return anything because array is still stored in memory??
}
-
Im wondering why it doesnt have to return anything to main.
Because you provide the function with a pointer to the data to be modified, the modifications are visible to the caller.
If you were to change the pointer itself, say array++ at the end of the function, that change would not be visible to the caller.
In C, the parameters themselves are local to the function, and changing their value does not change their value in the caller.
So, when you pass a pointer (or an array, since in a function call the name of an array is converted to a pointer to its first element), any changes to that pointer itself are not visible to the caller. But if you dereference the pointer to modify the data it refers to, those changes are visible to the caller.
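To make the difference concrete, here is a minimal sketch. The names bump_and_write and pointer_copy_demo are made up for this illustration; they are not part of the original code.

```c
#include <assert.h>

/* Hypothetical helper: advances its own copy of the pointer, then writes
   through it. Only the write is visible to the caller. */
static void bump_and_write(int *p)
{
    p++;        /* changes the local copy of the pointer only */
    *p = 42;    /* changes the caller's data (the second element) */
}

/* Checks the behaviour described above with assert(). */
void pointer_copy_demo(void)
{
    int data[3] = { 1, 2, 3 };
    int *start = data;

    bump_and_write(start);

    assert(start == data);   /* the caller's pointer did not move */
    assert(data[0] == 1);    /* first element untouched */
    assert(data[1] == 42);   /* second element was overwritten */
    assert(data[2] == 3);
}
```

Calling pointer_copy_demo() from main should pass silently: the pointer in the caller still refers to the first element, while the data it refers to has changed.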
Pointer puzzle me a bit.
Pointers are references to other things in memory. What things, is described by the type of the pointer.
When you read the definition of a pointer, read the parts separated by * from right to left, with each * as "a pointer to". For example, int *array reads as "array is a pointer to int(s)". The final (s) is because only the programmer knows whether a pointer points to a single element, or the first element of many.
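A few declarations read with that rule, as a small sketch. The variable names are just examples, and the lines marked "would not compile" are left as comments on purpose.

```c
#include <assert.h>

void declaration_reading_demo(void)
{
    int value = 5;

    int *p = &value;            /* p is a pointer to int */
    int **pp = &p;              /* pp is a pointer to a pointer to int */
    const int *ro = &value;     /* ro is a pointer to const int:
                                   "*ro = 1;" would not compile */
    int *const fixed = &value;  /* fixed is a const pointer to int:
                                   "fixed = 0;" would not compile */

    **pp += 1;      /* follows two pointers; value becomes 6 */
    *fixed += 1;    /* the pointed-to int is not const; value becomes 7 */

    assert(*ro == 7);   /* reading through ro is fine */
    assert(*p == 7);
}
```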
-
great answer thank you!!
-
...
Pointer puzzle me a bit. heres the code:
...
Another way to demystify pointer is: It is an address, a location.
So,
A pointer is the address of a house. The contractor (ie: the function) knows what to do (say for example, paint it green). You tell the contractor (ie: pass it to the function) the address of the house, the contractor can go in and paint it green. When you get to that address (ie:that house) later, it would be painted green.
A pointer to an array is the address of the first house in a row of houses. Now the contractor can go there and paint each house, the same or different colors depending on what the contractor (the function) is supposed to do. Note also that the contractor (the function) needs to know when to stop, be it after two houses or twenty. That end point is another piece of information you must pass along. You can pass the number 2 to mean "paint two houses", OR you can make sure that you blew the last house up (set the last entry of the array to null), so the contractor (the function) knows "to stop when there is no house sitting on the foundation". A well-written function would not leave the contractor stuck at the empty foundation, or let him continue in a straight line past the end of the row and paint the park (ie: alter memory locations that are not part of the array).
An array of pointers is an address book: you have a list of addresses that may be consecutive and/or may be houses in different towns.
A pointer to an array of pointers is the location of the address book. You can merely pass along the information that the address book is "under the welcome mat of the main entrance of your shop"; the contractor can go there, get the address book, and then go paint.
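In code, the two ways of telling the contractor where to stop might look like this. It is only a sketch: the function names are invented here, and the second version assumes that 0 never occurs as real data, so it can serve as the "blown-up house".

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: the caller says how many houses to paint. */
static void add_two_counted(int *array, size_t count)
{
    for (size_t i = 0; i < count; i++)
        array[i] += 2;
}

/* Style 2: stop at a sentinel. Assumes the array ends with a 0 entry
   and that 0 never appears as real data. Returns how many were changed. */
static size_t add_two_until_zero(int *array)
{
    size_t painted = 0;

    while (*array != 0) {
        *array++ += 2;
        painted++;
    }
    return painted;
}

void stop_condition_demo(void)
{
    int counted[3] = { 2, 2, 2 };
    int terminated[4] = { 5, 6, 7, 0 };   /* last entry is the sentinel */
    size_t n;

    add_two_counted(counted, 3);
    assert(counted[0] == 4 && counted[1] == 4 && counted[2] == 4);

    n = add_two_until_zero(terminated);
    assert(n == 3);
    assert(terminated[0] == 7 && terminated[1] == 8 && terminated[2] == 9);
    assert(terminated[3] == 0);           /* the sentinel is left alone */
}
```

C strings use the second style (the terminating '\0'), while most functions that take arrays of numbers use the first, because any number might be valid data.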
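As a rough sketch of the address book in C (the names paint_all, book and colour are invented for this example): the function receives the location of the first entry of the book, and each entry is itself the address of one int "house".

```c
#include <assert.h>
#include <stddef.h>

/* "book" points at the first of "count" entries in the address book;
   each entry is the address of one house (an int). */
static void paint_all(int **book, size_t count, int colour)
{
    for (size_t i = 0; i < count; i++)
        *book[i] = colour;      /* go to the i-th address and paint there */
}

void address_book_demo(void)
{
    int row[2] = { 0, 0 };      /* two neighbouring houses */
    int elsewhere = 0;          /* a house in a different town */
    int *book[3] = { &row[0], &row[1], &elsewhere };

    paint_all(book, 3, 7);

    assert(row[0] == 7 && row[1] == 7);
    assert(elsewhere == 7);
}
```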
-
Im wondering why it doesnt have to return anything to main.
Because you provide the function with a pointer to the data to be modified, the modifications are visible to the caller.
If you were to change the pointer itself, say array++ at the end of the function, that change would not be visible to the caller.
Not even necessarily the end of the function. add_two() could be written like this...
void add_two(int *array, int size)
{
while (size > 0)
{
*array += 2;
array++;
size--;
}
}
.. or even this (which means the same) ...
void add_two(int *array, int size)
{
while (size-- > 0) *array++ += 2;
}
-
Not even necessarily the end of the function.
True! But I worded it that way because if one were to only add array++ before the existing loop, then that loop would end up accessing one element beyond the array; a buffer overrun of the off-by-one type.
Your versions change the loop too, and do the right thing, obviously.
I often write such functions this way:
void add_two(int *array, const size_t size)
{
int *const end = array + size;
while (array < end)
*(array++) += 2;
}
Here, the two consts are mostly for us humans to see the intention that the code will not try to change the values. Theoretically, it might help some C compilers to generate better code, but all the current compilers seem to optimize this kind of code just as well without the extra const qualifiers. However, I find them informative for myself.
size_t is the proper type for sizes in memory (or ssize_t, from POSIX, if you need negative values too). On LP64 architectures (64-bit Linux), int is only 32 bits, whereas pointers and size_t are 64-bit. This allows you to use arrays (and memory regions) with more than four billion (2^32) elements.
GCC tends to generate better code when using pointers instead of array indexing on some architectures. Not a big difference, but I like it.
The parentheses around (array++) are there only to remind the human programmer that the expression dereferences the current value of the array pointer and adds two to that element, while the pointer itself is incremented to point to the next element in the array.
None of these are any kind of hard rules – or always true! – but it is the "coding style" that has served me well.
The key point I'd like to emphasize here is how to read pointer types, as I explained at the end of my previous post. It clarifies the use of such pointers. An additional technique I've used is to add parentheses according to the C operator precedence rules when there is a possibility of ambiguity.
Even when you see the famed three-star code (const char *volatile *const *const ptr), you can immediately read it in human terms: ptr is a const pointer to a const pointer to a volatile pointer to const char. That means the pointer itself and the pointer it points to are not changed by the current code (except via casts), but the pointer that one points to can be. Being volatile, it may be modified at any point in time by code not visible here, so the compiler must not cache its value and instead has to read its current value in each expression. That last pointer points to constant character data, usually a string, possibly in read-only memory.
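If you want to check that reading against a compiler, something like the following sketch should do. The variable names inner and middle are made up, and the lines that the compiler should reject are left as comments.

```c
#include <assert.h>

void three_star_demo(void)
{
    const char *volatile inner = "first";               /* volatile pointer to const char */
    const char *volatile *const middle = &inner;        /* const pointer to that */
    const char *volatile *const *const ptr = &middle;   /* const pointer to that */

    /* ptr = 0;       would not compile: ptr itself is const             */
    /* *ptr = 0;      would not compile: the pointer it points to is const */
    /* ***ptr = 'x';  would not compile: the characters are const         */

    **ptr = "second";   /* allowed: the volatile pointer is not const */

    assert(inner[0] == 's');    /* inner now points at "second" */
    assert(***ptr == 's');
}
```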
Here is a real world example of code I do actually use: this is a POSIX low-level async-signal safe write() function that returns zero if the entire data was successfully written, or an errno code otherwise, while keeping errno code unchanged:
static inline int write_all(const int fd, const void *const src, const size_t len)
{
const char *ptr = (const char *)src;
const char *const end = (const char *)src + len;
ssize_t n;
int saved_errno, retval;
if (fd == -1)
return EBADF;
if (len < 1)
return 0;
if (!src)
return EINVAL;
saved_errno = errno;
retval = 0;
while (ptr < end) {
n = write(fd, ptr, (size_t)(end - ptr));
if (n > 0) {
ptr += n;
} else
if (n != -1) {
retval = EIO;
break;
} else
if (errno != EINTR) {
retval = errno;
break;
}
}
errno = saved_errno;
return retval;
}
It is particularly useful when experimenting with (POSIX) signals and (POSIX) signal handlers. Signal handlers are special functions that are called when a signal is delivered to a process (or to a specific thread within a process; usually the kernel just picks one thread that does not currently block that signal). Within a signal handler, only async-signal safe functions (https://man7.org/linux/man-pages/man7/signal-safety.7.html) can be safely called; using any other function (including printf() and so on) can produce unexpected effects. (And it is not enough to assume that if they work for you in one situation, that pattern works in others, because there are complex asynchronous dependencies here. There are only a couple of standard C functions whose async-signal safeness is still being discussed, because some of the implementations are and others are not.)
However, for signal handlers, one can use the following wrerr() wrappers:
static int wrerr(const char *msg)
{
if (msg) {
const char *end = msg;
while (*end)
end++;
return write_all(STDERR_FILENO, msg, (size_t)(end - msg));
} else
return 0;
}
static int wrerrl(const long value)
{
char buffer[32];
char *ptr = buffer + sizeof buffer;
unsigned long u = (value < 0) ? -(unsigned long)value : (unsigned long)value; /* negate as unsigned: avoids overflow for LONG_MIN */
do {
*(--ptr) = '0' + (u % 10);
u /= 10;
} while (u);
if (value < 0)
*(--ptr) = '-';
return write_all(STDERR_FILENO, ptr, (size_t)(buffer + (sizeof buffer) - ptr));
}
so that one can safely explore POSIX signal handling using e.g.
void my_signal_handler(int signum)
{
wrerr("Received signal ");
wrerrl(signum);
wrerr(".\n");
}
or, if installed with SA_SIGINFO flag,
void my_signal_handler(int signum, siginfo_t *info, void *context)
{
wrerr("Process ");
wrerrl(getpid());
wrerr(" received signal ");
wrerrl(signum);
if (info->si_pid) {
wrerr(" from process ");
wrerrl(info->si_pid);
wrerr(".\n");
} else
wrerr(" from the kernel.\n");
}
On Linux, the following variant is more informative,
pid_t gettid(void) { return syscall(SYS_gettid); }
void my_signal_handler(int signum, siginfo_t *info, void *context)
{
wrerr("Process ");
wrerrl(getpid());
wrerr(" task ");
wrerrl(gettid());
wrerr(" received signal ");
wrerrl(signum);
if (info->si_pid) {
wrerr(" from process ");
wrerrl(info->si_pid);
wrerr(".\n");
} else
wrerr(" from the kernel.\n");
}
because this latter will also tell you the task ID (similar purpose as pthread_t, but different numerical values), and show in practice how signals can be delivered to signal handlers using essentially a random thread in a process. (Well, not "random" in the sense of random numbers, but in the sense that there are no guarantees, unless you block the signal in all but one thread. Which is common to do.)
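For completeness, here is one way a three-argument handler could be installed and triggered. This is a self-contained sketch rather than the exact code I use: note_signal, install_handler and last_signal are names invented here, the handler calls write() directly instead of the wrappers above, and SIGUSR1 is just a convenient signal to test with.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler so the rest of the program can see what arrived. */
static volatile sig_atomic_t last_signal = 0;

/* Only async-signal-safe work here: write() and a flag store.
   errno is saved and restored so interrupted code does not see it change. */
static void note_signal(int signum, siginfo_t *info, void *context)
{
    static const char msg[] = "Received a signal.\n";
    const int saved_errno = errno;

    (void)info;
    (void)context;
    if (write(STDERR_FILENO, msg, sizeof msg - 1) == -1) {
        /* Nothing safe to do about a failed write in a handler. */
    }
    last_signal = signum;
    errno = saved_errno;
}

/* Installs note_signal() for signum. Returns 0 on success, -1 on failure. */
static int install_handler(int signum)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = note_signal;
    act.sa_flags = SA_SIGINFO;      /* selects the three-argument handler */
    return sigaction(signum, &act, NULL);
}

void signal_demo(void)
{
    int rc = install_handler(SIGUSR1);
    assert(rc == 0);

    rc = raise(SIGUSR1);            /* deliver the signal to ourselves */
    assert(rc == 0);
    assert(last_signal == SIGUSR1); /* the handler ran before raise() returned */
}
```

In a single-threaded program, raise() does not return until the handler has finished, which is why the flag can be checked immediately afterwards.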
Oops, sorry for the wall of text.
-
Not even necessarily the end of the function.
True! But I worded it that way because if one were to only add array++ before the existing loop, then that loop would end up accessing one element beyond the array; a buffer overrun of the off-by-one type.
Yup. Not trying to correct you in any way, just add more examples for the benefit of the OP.
I often write such functions this way:
void add_two(int *array, const size_t size)
{
int *const end = array + size;
while (array < end)
*(array++) += 2;
}
Me too.
-
An intuitive way of thinking about pointers in the function the OP is talking about goes like this. When you're buying pizza, you have to call the pizzeria, order your pizza flavor and wait for delivery.
The pizzeria is the function. The pizza flavor is the argument and the pizza is the return value.
Now let's suppose that you want a painter to paint your house with a different color. You call the painter and tell him your address. The painter comes and paints your house.
The painter is the function. Your address is the pointer. There's no need to deliver, i.e., to return, anything since the job has been done on location at the desired address.
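Side by side in C, the two could look like this small sketch (plus_two and add_two_at are names made up for the example):

```c
#include <assert.h>

/* The pizzeria: takes a value and delivers a new value back. */
static int plus_two(int x)
{
    return x + 2;
}

/* The painter: takes an address and does the work on location. */
static void add_two_at(int *where)
{
    *where += 2;
}

void pizza_and_painter_demo(void)
{
    int a = 5;
    int b = plus_two(a);    /* a is untouched; the result is delivered */

    assert(a == 5 && b == 7);

    add_two_at(&a);         /* nothing is delivered; a itself changes */
    assert(a == 7);
}
```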
-
Your function modifies the "array" *in place*. The array *must exist* before you call the function, and will still exist at the same scope level when it returns. It's basic C.
The function itself doesn't need to return anything to do that. You can of course decide to return some kind of error code, or anything else you see fit, but that's up to you.
Maybe you were thinking of a function taking an array as an argument, and returning the modified array. This is what a functional language would do. C is not a functional language, and C functions are not exactly true functions mathematically speaking, as they can change the state of their parameters (when passed as pointers to something), and otherwise global states, which a mathematical function can't do, and which is "frowned upon" by true functional languages.
Of course, C, as many other languages, allows you to use a "functional style", and it's even common for simpler tasks. For instance:
double MySin2(double x)
{
double y = sin(x);
return y * y;
}
But in your case, the C language just doesn't allow returning arrays, so you can't do that. C is a simple language, and would have no means of implementing that in an efficient way - parameters and return values are passed either in registers or on the stack. Functional languages have a number of sophisticated ways of doing that efficiently under the hood.
Still, you could do this with your function with an array, but with a workaround. C doesn't allow returning arrays, but you can return structs. You could thus encapsulate an array inside a struct (with the limitation of having to define an array member of the max size you'd ever need), and return it. Unless said array is very small, do not do this though - it would be pretty inefficient, and would look weird to any C programmer. I'm still giving this as an illustration of what I just said - and don't suggest actually doing it unless you have a major reason for doing so. (Alternatively, you could define your own dynamic array type, with all the required boilerplate functions, and return this instead. Could be workable but you'd need again a very good reason for doing so.) So here is a naive version:
#define NMAX 100
typedef struct
{
int array[NMAX];
int size;
} Array_t;
Array_t add_two(const Array_t *pArray)
{
Array_t Result;
for(int i = 0; i < pArray->size; i++)
Result.array[i] = pArray->array[i] + 2;
Result.size = pArray->size;
return Result;
}
The const qualifier for the parameter means that the function can't modify the content of the passed array. So here you somewhat get the functional equivalent of your function.
Again, this is just for illustration purposes. You should usually NOT do this.
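For the curious, calling such a function could look like the sketch below. It repeats the struct so that it compiles on its own; the function is renamed add_two_copy here only to avoid clashing with the other add_two versions in this thread, and Result is zero-initialized so the unused entries have defined values.

```c
#include <assert.h>

#define NMAX 100

typedef struct
{
    int array[NMAX];
    int size;
} Array_t;

/* Returns a modified copy; the caller's struct is left untouched. */
static Array_t add_two_copy(const Array_t *pArray)
{
    Array_t Result = { { 0 }, 0 };

    for (int i = 0; i < pArray->size; i++)
        Result.array[i] = pArray->array[i] + 2;
    Result.size = pArray->size;
    return Result;      /* the whole struct, all NMAX ints, is copied out */
}

void struct_return_demo(void)
{
    Array_t in = { .array = { 2, 2, 2 }, .size = 3 };
    Array_t out = add_two_copy(&in);

    assert(out.size == 3);
    assert(out.array[0] == 4 && out.array[1] == 4 && out.array[2] == 4);
    assert(in.array[0] == 2);   /* the original is unchanged */
}
```

Every call copies the full NMAX-element struct, which is the inefficiency mentioned above.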
Of course, apart from the fact that the above would look "functional", the other benefit is that it wouldn't modify the original array. If you don't want to modify the array "in place", the simple, efficient and idiomatic way of doing this in C would be to declare your function with two pointers as arguments instead of one (original and resulting array). Again, in that case, both must exist (declared, allocated...) before calling the function. Your function would simply become:
void add_two(int *arrayDest, const int *arraySrc, int size)
{
for(int i = 0; i < size; i++)
{
arrayDest[i] = arraySrc[i] + 2;
}
}
This would be the basic idiomatic C way of doing it.
In "idiomatic" C, it's also common to declare the destination as the first argument, such as with the standard memcpy() function and the likes.
Speaking of which (and it will likely become confusing to you), an idiomatic approach for such a function would be to return the destination pointer, like memcpy() does. Note that it just returns the first argument (which allows you to chain calls), but it's just a pointer. It wouldn't return the "array". That would be:
int * add_two(int *arrayDest, const int *arraySrc, int size)
{
for(int i = 0; i < size; i++)
{
arrayDest[i] = arraySrc[i] + 2;
}
return arrayDest;
}
Which would allow you to chain calls such as:
add_two(array2, add_two(array2, array1, size), size);
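Spelled out as a small sketch (chaining_demo and the array contents are just for illustration), the inner call fills array2 from array1 and returns array2, which the outer call then uses as its source:

```c
#include <assert.h>

int *add_two(int *arrayDest, const int *arraySrc, int size)
{
    for (int i = 0; i < size; i++)
    {
        arrayDest[i] = arraySrc[i] + 2;
    }
    return arrayDest;
}

void chaining_demo(void)
{
    int array1[3] = { 2, 2, 2 };
    int array2[3] = { 0, 0, 0 };
    int *result;

    /* Inner call: array2 = array1 + 2. Outer call: array2 = array2 + 2. */
    result = add_two(array2, add_two(array2, array1, 3), 3);

    assert(result == array2);   /* only a pointer comes back, not an array */
    assert(array2[0] == 6 && array2[1] == 6 && array2[2] == 6);
    assert(array1[0] == 2);     /* the source array is unchanged */
}
```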
-
I often write such functions this way:
void add_two(int *array, const size_t size)
{
int *const end = array + size;
while (array < end)
*(array++) += 2;
}
Well, I don't. This very much looks like how many compilers would compile the OP's code on many architectures, and looks too low-level for my taste. A decent modern optimizing compiler will already compile simple array indexing as this in many cases, and array indexing usually looks clearer to read, which is always the number one thing I consider.
The "const" qualifier on the 'size' parameter is a good idea though. As in brucehoult's example, it's often tempting to use a parameter as a local variable to iterate. Avoids having to declare an extra local variable. In small pieces of code like this, it's usually OK, but if the function becomes larger, it can be a good source of bugs: you may need the original 'size' value later on (after some iteration for instance) and forget it lost its initial value.
-
A pointer is the address of a house. The contractor (ie: the function) knows what to do (say for example, paint it green). You tell the contractor (ie: pass it to the function) the address of the house, the contractor can go in and paint it green. When you get to that address (ie:that house) later, it would be painted green.
That's a neat way of putting it I haven't seen before.