Integers written in the usual English order are, from the algorithm's point of view, in reverse: you necessarily have to start at the LSBs, divide to find the last digit, then the second-to-last, and so on until the value reaches zero (or continue past zero until the required number of digits has been generated, if you want leading zeroes). These have to go into a sequential array (buffer) so that they can come out correctly (left to right, rising index) after being stored backwards (i.e., descending index).
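A minimal sketch of that buffered approach, assuming decimal output and a hypothetical write(char) routine (e.g. a UART putchar):

    #include <stdint.h>

    void write(char c);  /* assumed output routine, e.g. a UART putchar */

    /* Print an unsigned 16-bit integer in decimal: digits are generated
       LSB-first by repeated division, stored backwards in a small buffer,
       then emitted in reading order. */
    void print_u16(uint16_t n)
    {
        char buf[5];                    /* 65535 has at most 5 decimal digits */
        uint8_t i = 0;
        do {
            buf[i++] = '0' + (n % 10);  /* least significant digit first */
            n /= 10;
        } while (n != 0);
        while (i > 0)
            write(buf[--i]);            /* emit most significant digit first */
    }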
Now, if you can do shifts instead of divisions (i.e., the base is a power of two), you can cheat and pull the digits out starting from the most significant end, so you can do it in O(1) memory, i.e., no array/buffer. You have to know what size of number you're starting with (and you have to iterate through leading zeroes whether you print them or not, rather than stopping when the remaining value reaches zero), but that's rather trivial, as the datatype must be specified in the first place. (Speaking of which, do use size-specific integers! int should be deprecated, because it has inconsistent sizes on different platforms; uint16_t and friends are preferred.)
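For a power-of-two base the divisions become shifts and the buffer disappears entirely. A sketch for hexadecimal, again assuming a hypothetical write(char) output routine:

    #include <stdint.h>

    void write(char c);  /* assumed output routine */

    /* Print a 16-bit value in hex, most significant nibble first.
       No buffer: the digit count is fixed by the type, so we just
       walk the nibbles from the top down with shifts and masks,
       printing leading zeroes as we go. */
    void print_hex16(uint16_t n)
    {
        for (int8_t shift = 12; shift >= 0; shift -= 4) {
            uint8_t d = (n >> shift) & 0xF;
            write(d < 10 ? '0' + d : 'A' + (d - 10));
        }
    }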
As for actual memory usage, you'll have to check the assembled output. Even on "high" optimization settings, C compilers are usually mediocre (and this has been my experience with GCC on AVR). The entirety of this routine should be doable in maybe three registers, including the input variable, and probably under 50 words of instructions. It will use more stack than strictly necessary, of course, due to its own call frame and that of the write(char) function it requires. (If you wish to save that as well, consider combining the two into a single custom function; it will take more ROM due to the duplicated write functionality, though. That in turn might be saved by nesting jumps inside the assembly routine... ah, but now we're bordering on the dark realm of spaghettification in search of dubious bytes, but I digress.)
Justification for "reverse order", as it may not be obvious:
1. Wouldn't it be nice to tell how big a number is, at a glance?
2. Wouldn't it make sense that each subsequent digit you read is more and more significant, until the final and most significant digit at the end?
3. Wouldn't it make sense that integer and fractional parts read with the same order of significance?
4. Wouldn't it be easier to write all of these on a [little-endian] computer system?
If one million were written "0000001", then it would be very simple to print: each character (or position in the output, incrementing from left to right) is generated in order by each division step, in real time, with no buffering needed.
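Under that hypothetical convention the buffer vanishes even in decimal. A sketch, again assuming a write(char) routine:

    #include <stdint.h>

    void write(char c);  /* assumed output routine */

    /* Print a 32-bit value in decimal, least significant digit first,
       so 1000000 comes out as "0000001".  Each digit is written the
       moment the division produces it -- no buffer, no second pass. */
    void print_u32_lsd_first(uint32_t n)
    {
        do {
            write('0' + (n % 10));
            n /= 10;
        } while (n != 0);
    }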
This is very natural to a [little-endian] computer, because if you mentally group the bits in a variable as numerals in a base-2^N number, then you can treat each group in precisely the same order as stored in memory and as printed to the screen. Suppose instead of hexadecimal (4 bits per digit) you use what I suppose would be called sexaquinbicentimal (i.e., base 256, 8 bits per digit -- a byte at a time). And suppose you have a 32-bit integer stored in byte-wide memory. This number has up to four digits (except not "digits", but, sexaquinbicentits? erm, never mind..). The first byte (at address (pointer)) is the low byte, the ones column (except not really "ones", but representing the addition of 0-255 to the total). The second byte is at the second address, (pointer+1), and represents the "256s column" (a value of 0-255, multiplied by 256^1). And so on, with the high byte at (pointer+3) being the most significant. To print this number (in base 256, assuming you had a character encoding to do so), you simply copy the bytes straight out of memory!
Note that:
1. The order of magnitude is simply the number of digits (well.. duh, that's always been true, so it's no help),
2. The weight of each digit is increasing at the same time as the address is increasing, and at the same time as the iteration count is increasing.
Everything is always consistently increasing, from left to right.
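A sketch of that base-256 observation, assuming a little-endian target (as with GCC on AVR) and a hypothetical write_b256() routine that could render a base-256 "digit":

    #include <stdint.h>

    void write_b256(uint8_t d);  /* hypothetical: emit one base-256 "digit" */

    /* On a little-endian machine the bytes of a uint32_t already sit in
       memory least significant first, so printing the number in base 256
       (lowest digit first) is a straight copy out of memory. */
    void print_b256(const uint32_t *value)
    {
        const uint8_t *p = (const uint8_t *)value;
        for (uint8_t i = 0; i < 4; i++)
            write_b256(p[i]);   /* p[0] is the "ones" digit, p[3] the highest */
    }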
Conversely, for fractional components, starting from the decimal point, you multiply iteratively, so that the overflow from each step is the digit to print or store. (Which again, in base 2^N, is just shifting.) These come out in the correct (normal reading) order as-is, so nothing's wrong to begin with. That's good.
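A sketch of that fractional loop, assuming a 0.16 fixed-point fraction (value/65536) and the same hypothetical write(char):

    #include <stdint.h>

    void write(char c);  /* assumed output routine */

    /* Print the fractional part of a 0.16 fixed-point value in decimal.
       Each iteration multiplies by 10; whatever overflows above the
       16-bit fraction is the next digit, already in reading order. */
    void print_frac16(uint16_t frac, uint8_t digits)
    {
        while (digits--) {
            uint32_t t = (uint32_t)frac * 10;
            write('0' + (uint8_t)(t >> 16));  /* the overflow digit, 0-9 */
            frac = (uint16_t)t;               /* keep the remainder */
        }
    }

For example, 0.1875 stored as 12288/65536 prints "1875", one digit per multiply.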
Although, if we were to start writing numbers as, e.g. "one thousand and three sixteenths" =>
0001.1875
we have the problem of increasing powers, then a sudden jump down to negative, decreasing powers!
I would propose that numbers "should" be written thusly:
0001
.1875
so that, from left to right, the orders are always iterating, one in the positive direction and one in the negative direction. It's complementary. You can tell at a glance what the integral and fractional parts are (just as you can with the mixed number "1000 3/16"). And you can compute, represent and store them at the same time (i.e., within the same loop, one digit of each at a time), on a little-endian computer, without any extra trouble.
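A sketch of that combined loop in base 16, assuming a 16.16 fixed-point input and the proposed layout (integer digits least significant first, then the point, then fractional digits most significant first); the function name and formatting convention here are mine, not established practice:

    #include <stdint.h>

    static char hex(uint8_t d) { return d < 10 ? '0' + d : 'A' + (d - 10); }

    /* Format a 16.16 fixed-point value as "iiii.ffff" in the proposed
       order.  One loop produces one digit of each part per iteration,
       both by a 4-bit shift: the integer part shifts out of the bottom,
       the fractional part shifts out of the top. */
    void format_fixed_proposed(uint32_t fx, char out[10])
    {
        uint16_t ipart = (uint16_t)(fx >> 16);
        uint16_t fpart = (uint16_t)fx;
        out[4] = '.';
        out[9] = '\0';
        for (uint8_t i = 0; i < 4; i++) {
            out[i]     = hex(ipart & 0xF);            /* rising integer weight     */
            out[5 + i] = hex((uint8_t)(fpart >> 12)); /* falling fractional weight */
            ipart >>= 4;
            fpart <<= 4;
        }
    }

With fx = 0x00013000 (one and three sixteenths) this produces "1000.3000": reading left to right, both halves iterate away from the point.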
This does hinge on the commonality of the little-endian computer. Reverse digit order would be just as justifiable on a big-endian machine. However, we commonly refer to the "upper" bits positively: bit 7 is stored in byte 0, so bit 8 should be stored in byte 1, not byte -1. I suggest little-endian is the more ideal system for this reason. (Which, still, is rooted in our bias towards incrementality, if that can be considered such a thing. But having one or two of something still makes more physical sense than the absence of one or two, so this doesn't seem like much of a cop-out, even on an abstract basis. That is, the notion of positivity is rooted in "successor of" and set theory counting principles, with negative numbers seemingly defined by abstraction of the process.)
Tim