I'm building this for an ATtiny2313A, which has 2K of flash and 128 bytes of RAM. The build produced a 3.8K HEX file and the output below. This is my first real project in AVR Studio other than just blinking an LED.

So if I understand this correctly, the output should fit? 1.3K for the program and 44 bytes of RAM in use?
I also noticed it said debug, but since I'll be flashing the hex with a third-party device I don't need any debug support, so that should be something I can turn off to save space, right? Is 44 bytes of RAM all it needs in total, or is that just an estimate?

The code I'm using below won't be the final version. I just found a similar project and am basing mine off of it. I may need more space because I'll be adding more custom characters/symbols. So I may need a little more RAM.

Also, I need to make sure I don't mess up and leave the chip in a state where I can't reprogram it. I prefer to use MOSI, MISO and SCK, which is easy to do with an Arduino as the programmer. I've also got a cheap Chinese USB programmer that I have managed to get working with AVRDude in the past; it also programs over MOSI, MISO and SCK.

------ Build started: Project: Scroll, Configuration: Debug AVR ------
Build started.
Project "Scroll.cproj" (default targets):
Target "PreBuildEvent" skipped, due to false condition; ('$(PreBuildEvent)'!='') was evaluated as (''!='').
Target "CoreBuild" in file "C:\Program Files\Atmel\Atmel Studio 6.1\Vs\Compiler.targets" from project "C:\Users\stonent\Documents\Atmel Studio\6.1\Scroll\Scroll\Scroll.cproj" (target "Build" depends on it):
Task "RunCompilerTask"
C:\Program Files\Atmel\Atmel Studio 6.1\shellUtils\make.exe all
Building file: .././Scroll.c
Invoking: AVR/GNU C Compiler : 3.4.2
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-gcc.exe"  -funsigned-char -funsigned-bitfields -DDEBUG  -O1 -ffunction-sections -fdata-sections -fpack-struct -fshort-enums -g2 -Wall -mmcu=attiny2313a -c -std=gnu99 -MD -MP -MF "Scroll.d" -MT"Scroll.d" -MT"Scroll.o"   -o "Scroll.o" ".././Scroll.c"
Finished building: .././Scroll.c
Building target: Scroll.elf
Invoking: AVR/GNU Linker : 3.4.2
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-gcc.exe" -o Scroll.elf  Scroll.o   -Wl,-Map="" -Wl,--start-group -Wl,-lm  -Wl,--end-group -Wl,--gc-sections -mmcu=attiny2313a
Finished building target: Scroll.elf
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-objcopy.exe" -O ihex -R .eeprom -R .fuse -R .lock -R .signature  "Scroll.elf" "Scroll.hex"
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-objcopy.exe" -j .eeprom  --set-section-flags=.eeprom=alloc,load --change-section-lma .eeprom=0  --no-change-warnings -O ihex "Scroll.elf" "Scroll.eep" || exit 0
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-objdump.exe" -h -S "Scroll.elf" > "Scroll.lss"
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-objcopy.exe" -O srec -R .eeprom -R .fuse -R .lock -R .signature  "Scroll.elf" "Scroll.srec"
"C:\Program Files\Atmel\Atmel Toolchain\AVR8 GCC\Native\\avr8-gnu-toolchain\bin\avr-size.exe" "Scroll.elf"
   text    data     bss     dec     hex filename
   1320      30      14    1364     554 Scroll.elf
Done executing task "RunCompilerTask".
Task "RunOutputFileVerifyTask"
Program Memory Usage : 1350 bytes   65.9 % Full
Data Memory Usage : 44 bytes   34.4 % Full
Done executing task "RunOutputFileVerifyTask".
Done building target "CoreBuild" in project "Scroll.cproj".
Target "PostBuildEvent" skipped, due to false condition; ('$(PostBuildEvent)' != '') was evaluated as ('' != '').
Target "Build" in file "C:\Program Files\Atmel\Atmel Studio 6.1\Vs\Avr.common.targets" from project "C:\Users\stonent\Documents\Atmel Studio\6.1\Scroll\Scroll\Scroll.cproj" (entry point):
Done building target "Build" in project "Scroll.cproj".
Done building project "Scroll.cproj".

Build succeeded.
========== Build: 1 succeeded or up-to-date, 0 failed, 0 skipped ==========

// BMATRIX: NECKLACE with LED-MATRIX that displays text messages
// Based on Nuts&Volts Jul 2013 article: "Smart Necklace", p. 40
// Author : Bruce E. Hall
// Website :
// Version : 1.0
// Date : 27 Jul 2013
// Target : ATTINY4313 or ATTINY2313 microcontroller
// Language : C, using AVR studio 6
// ---------------------------------------------------------------------------
// Uses LITE-ON LTP-757G 5x7 LED MATRIX (Column Cathode) Display
//
// LED FUNCTION to PORT (PIN)             Bit  PORTA  PORTB  PORTD
// ----------------------------------     ------------------------
// Col 0 - PD1   //   Row 0 - PB6          7     -      -      -
// Col 1 - PA0   //   Row 1 - PB5          6     -     row0    -
// Col 2 - PB4   //   Row 2 - PA1          5     -     row1    -
// Col 3 - PB1   //   Row 3 - PB3          4     -     col2   row6
// Col 4 - PB2   //   Row 4 - PD2          3     -     row3   row5
//               //   Row 5 - PD3          2     -     col4   row4
//               //   Row 6 - PD4          1    row2   col3   col0
//                                         0    col1    -      -
//
// Since this is a column cathode display,
// Columns are active LOW; to set a col, PORTx &= ~(1<<bit)
// Rows are active HIGH; to set a row, PORTx |= (1<<bit)
// Fuse settings: 4 MHz osc with 65 ms Delay, SPI enable; *NO* clock/8

// ---------------------------------------------------------------------------

#define F_CPU 4000000L // run CPU at 4 MHz
#define ROWS 7 // LED matrix has 7 rows, 5 columns
#define COLS 5
#define SCROLLDELAY 15 // delay in cs between column shifts
#define FLASHDELAY 17 // delay in cs between symbol flashes
#define BEATDELAY 30 // delay in cs between heartbeats
#define HEARTCHAR 99
#define TEXT1 "I Love You! "
#define TEXT2 "Ich Liebe Dich! "

// ---------------------------------------------------------------------------

#include <avr/io.h> // deal with port registers
#include <avr/interrupt.h> // deal with interrupt calls
#include <avr/pgmspace.h> // put character data into progmem
#include <util/delay.h> // used for _delay_ms function
#include <string.h> // string manipulation routines
#include <avr/sleep.h> // used for sleep functions

// ---------------------------------------------------------------------------

char buf[12]; // display buffer; each byte = 1 column
// buf[0] is the left-most column (col0)
// buf[4] is the right-most column (col4)
// buf[5] is a blank column between chars
// buf[6]..buf[10] are scrolled onto display

int curCol; // current column; values 0-4

const unsigned char FONT_CHARS[107][5] PROGMEM =
{
{ 0x00, 0x00, 0x00, 0x00, 0x00 }, // (space)
{ 0x00, 0x00, 0x5F, 0x00, 0x00 }, // !
{ 0x00, 0x07, 0x00, 0x07, 0x00 }, // "
{ 0x14, 0x7F, 0x14, 0x7F, 0x14 }, // #
{ 0x24, 0x2A, 0x7F, 0x2A, 0x12 }, // $
{ 0x23, 0x13, 0x08, 0x64, 0x62 }, // %
{ 0x36, 0x49, 0x55, 0x22, 0x50 }, // &
{ 0x00, 0x05, 0x03, 0x00, 0x00 }, // '
{ 0x00, 0x1C, 0x22, 0x41, 0x00 }, // (
{ 0x00, 0x41, 0x22, 0x1C, 0x00 }, // )
{ 0x08, 0x2A, 0x1C, 0x2A, 0x08 }, // *
{ 0x08, 0x08, 0x3E, 0x08, 0x08 }, // +
{ 0x00, 0x50, 0x30, 0x00, 0x00 }, // ,
{ 0x08, 0x08, 0x08, 0x08, 0x08 }, // -
{ 0x00, 0x60, 0x60, 0x00, 0x00 }, // .
{ 0x20, 0x10, 0x08, 0x04, 0x02 }, // /
{ 0x3E, 0x51, 0x49, 0x45, 0x3E }, // 0
{ 0x00, 0x42, 0x7F, 0x40, 0x00 }, // 1
{ 0x42, 0x61, 0x51, 0x49, 0x46 }, // 2
{ 0x21, 0x41, 0x45, 0x4B, 0x31 }, // 3
{ 0x18, 0x14, 0x12, 0x7F, 0x10 }, // 4
{ 0x27, 0x45, 0x45, 0x45, 0x39 }, // 5
{ 0x3C, 0x4A, 0x49, 0x49, 0x30 }, // 6
{ 0x01, 0x71, 0x09, 0x05, 0x03 }, // 7
{ 0x36, 0x49, 0x49, 0x49, 0x36 }, // 8
{ 0x06, 0x49, 0x49, 0x29, 0x1E }, // 9
{ 0x00, 0x36, 0x36, 0x00, 0x00 }, // :
{ 0x00, 0x56, 0x36, 0x00, 0x00 }, // ;
{ 0x00, 0x08, 0x14, 0x22, 0x41 }, // <
{ 0x14, 0x14, 0x14, 0x14, 0x14 }, // =
{ 0x41, 0x22, 0x14, 0x08, 0x00 }, // >
{ 0x02, 0x01, 0x51, 0x09, 0x06 }, // ?
{ 0x32, 0x49, 0x79, 0x41, 0x3E }, // @
{ 0x7E, 0x11, 0x11, 0x11, 0x7E }, // A
{ 0x7F, 0x49, 0x49, 0x49, 0x36 }, // B
{ 0x3E, 0x41, 0x41, 0x41, 0x22 }, // C
{ 0x7F, 0x41, 0x41, 0x22, 0x1C }, // D
{ 0x7F, 0x49, 0x49, 0x49, 0x41 }, // E
{ 0x7F, 0x09, 0x09, 0x01, 0x01 }, // F
{ 0x3E, 0x41, 0x41, 0x51, 0x32 }, // G
{ 0x7F, 0x08, 0x08, 0x08, 0x7F }, // H
{ 0x00, 0x41, 0x7F, 0x41, 0x00 }, // I
{ 0x20, 0x40, 0x41, 0x3F, 0x01 }, // J
{ 0x7F, 0x08, 0x14, 0x22, 0x41 }, // K
{ 0x7F, 0x40, 0x40, 0x40, 0x40 }, // L
{ 0x7F, 0x02, 0x04, 0x02, 0x7F }, // M
{ 0x7F, 0x04, 0x08, 0x10, 0x7F }, // N
{ 0x3E, 0x41, 0x41, 0x41, 0x3E }, // O
{ 0x7F, 0x09, 0x09, 0x09, 0x06 }, // P
{ 0x3E, 0x41, 0x51, 0x21, 0x5E }, // Q
{ 0x7F, 0x09, 0x19, 0x29, 0x46 }, // R
{ 0x46, 0x49, 0x49, 0x49, 0x31 }, // S
{ 0x01, 0x01, 0x7F, 0x01, 0x01 }, // T
{ 0x3F, 0x40, 0x40, 0x40, 0x3F }, // U
{ 0x1F, 0x20, 0x40, 0x20, 0x1F }, // V
{ 0x7F, 0x20, 0x18, 0x20, 0x7F }, // W
{ 0x63, 0x14, 0x08, 0x14, 0x63 }, // X
{ 0x03, 0x04, 0x78, 0x04, 0x03 }, // Y
{ 0x61, 0x51, 0x49, 0x45, 0x43 }, // Z
{ 0x00, 0x00, 0x7F, 0x41, 0x41 }, // [
{ 0x02, 0x04, 0x08, 0x10, 0x20 }, // "\"
{ 0x41, 0x41, 0x7F, 0x00, 0x00 }, // ]
{ 0x04, 0x02, 0x01, 0x02, 0x04 }, // ^
{ 0x40, 0x40, 0x40, 0x40, 0x40 }, // _
{ 0x00, 0x01, 0x02, 0x04, 0x00 }, // `
{ 0x20, 0x54, 0x54, 0x54, 0x78 }, // a
{ 0x7F, 0x48, 0x44, 0x44, 0x38 }, // b
{ 0x38, 0x44, 0x44, 0x44, 0x20 }, // c
{ 0x38, 0x44, 0x44, 0x48, 0x7F }, // d
{ 0x38, 0x54, 0x54, 0x54, 0x18 }, // e
{ 0x08, 0x7E, 0x09, 0x01, 0x02 }, // f
{ 0x08, 0x14, 0x54, 0x54, 0x3C }, // g
{ 0x7F, 0x08, 0x04, 0x04, 0x78 }, // h
{ 0x00, 0x44, 0x7D, 0x40, 0x00 }, // i
{ 0x20, 0x40, 0x44, 0x3D, 0x00 }, // j
{ 0x00, 0x7F, 0x10, 0x28, 0x44 }, // k
{ 0x00, 0x41, 0x7F, 0x40, 0x00 }, // l
{ 0x7C, 0x04, 0x18, 0x04, 0x78 }, // m
{ 0x7C, 0x08, 0x04, 0x04, 0x78 }, // n
{ 0x38, 0x44, 0x44, 0x44, 0x38 }, // o
{ 0x7C, 0x14, 0x14, 0x14, 0x08 }, // p
{ 0x08, 0x14, 0x14, 0x18, 0x7C }, // q
{ 0x7C, 0x08, 0x04, 0x04, 0x08 }, // r
{ 0x48, 0x54, 0x54, 0x54, 0x20 }, // s
{ 0x04, 0x3F, 0x44, 0x40, 0x20 }, // t
{ 0x3C, 0x40, 0x40, 0x20, 0x7C }, // u
{ 0x1C, 0x20, 0x40, 0x20, 0x1C }, // v
{ 0x3C, 0x40, 0x30, 0x40, 0x3C }, // w
{ 0x44, 0x28, 0x10, 0x28, 0x44 }, // x
{ 0x0C, 0x50, 0x50, 0x50, 0x3C }, // y
{ 0x44, 0x64, 0x54, 0x4C, 0x44 }, // z
{ 0x00, 0x08, 0x36, 0x41, 0x00 }, // {
{ 0x00, 0x00, 0x7F, 0x00, 0x00 }, // |
{ 0x00, 0x41, 0x36, 0x08, 0x00 }, // }
{ 0x08, 0x08, 0x2A, 0x1C, 0x08 }, // ->
{ 0x08, 0x1C, 0x2A, 0x08, 0x08 }, // <-
{ 0xFF, 0x41, 0x5D, 0x41, 0xFF }, // 096: psycho 2
{ 0x00, 0x3E, 0x22, 0x3E, 0x00 }, // 097: psycho 1
{ 0x06, 0x15, 0x69, 0x15, 0x06 }, // 098: nuke
{ 0x0C, 0x1E, 0x3C, 0x1E, 0x0C }, // 099: solid heart
{ 0x0C, 0x12, 0x24, 0x12, 0x0C }, // 100: outline heart
{ 0x0A, 0x00, 0x55, 0x00, 0x0A }, // 101: flower
{ 0x08, 0x14, 0x2A, 0x14, 0x08 }, // 102: diamond
{ 0x07, 0x49, 0x71, 0x49, 0x07 }, // 103: cup
{ 0x22, 0x14, 0x6B, 0x14, 0x22 }, // 104: star2
{ 0x36, 0x36, 0x08, 0x36, 0x36 }, // 105: star3
{ 0x0F, 0x1A, 0x3E, 0x1A, 0x0F } // 106: fox
};

// ---------------------------------------------------------------------------
// Function: Light a column on the LED matrix display, according to contents
// of display buffer. buf[0] = leftmost column; buf[4] = rightmost
// This routine is called about 390 times per second, yielding a refresh
// rate for the whole display of 390/5 = 78 frames per second.

ISR (TIMER0_COMPA_vect) // header missing from the post; vector assumed from the OCIE0A setup in init()
{
    if (++curCol >= COLS) // advance column counter
        curCol = 0;

    // turn off all LEDS, by taking cathode (column) pins high
    PORTA = 0x01;
    PORTB = 0x16;
    PORTD = 0x02;

    // turn on individual row bits in this column
    char i = buf[curCol];
    if (i & _BV(0)) PORTB |= _BV(6);
    if (i & _BV(1)) PORTB |= _BV(5);
    if (i & _BV(2)) PORTA |= _BV(1);
    if (i & _BV(3)) PORTB |= _BV(3);
    if (i & _BV(4)) PORTD |= _BV(2);
    if (i & _BV(5)) PORTD |= _BV(3);
    if (i & _BV(6)) PORTD |= _BV(4);

    // turn selected column on
    switch (curCol) // switch line missing from the post; implied by the case labels
    {
        case 0: PORTD &= ~_BV(1); break;
        case 1: PORTA &= ~_BV(0); break;
        case 2: PORTB &= ~_BV(4); break;
        case 3: PORTB &= ~_BV(1); break;
        case 4: PORTB &= ~_BV(2); break;
    }
}

// ---------------------------------------------------------------------------

void init ()
{
    // set output pins
    DDRA = 0x03; // 0000.0011
    DDRB = 0x7E; // 0111.1110
    DDRD = 0x1E; // 0001.1110

    // setup Timer/Counter0 for LED refresh
    TCCR0A = _BV(WGM01); // Set CTC mode
    TCCR0B = _BV(CS02); // Set prescaler clk/256 = 15625 Hz
    OCR0A = 40; // 15625/(40+1) = ~381 interrupts/sec (5 cols = ~76fps)
    TIMSK = _BV(OCIE0A); // Enable T/C 0A interrupt

    MCUCR = 0x30; // 0011.0000 (sleep enabled, power down)
    WDTCR = 0x18; // 0001.1000 set WD turn-off and WD enable bits
    WDTCR = 0x10; // 0001.0000 reset WD enable to complete WD turnoff

    sei(); // enable global interrupts
}

void DelayCS(int cs)
// Delays CPU for specified time, in centiseconds (1/100 sec)
// Calling _delay_ms in a routine prevents inlining, reducing code size,
// at the expense of slight timing inaccuracies.
{
    for (int i=0; i<cs; i++)
        _delay_ms(10); // loop body missing from the post; 10 ms = 1 cs
}

void DelaySecond()
{
    DelayCS(100); // body missing from the post; assumed 100 cs = 1 second
}

// -------------------------------------------------------------------------

void ShiftLeft()
// shifts the entire display buffer one column to the left
{
    for (int i=0; i<11; i++)
    {
        buf[i] = buf[i+1]; // buf[0] on left; buf[11] on right
    } // each element represents a column
} // buf[0..4] are only elements visible

void Scroll()
// scrolls a character onto the display
{
    for (int i=0; i<COLS+1; i++)
    {
        ShiftLeft(); // shift display 1 column to left
        DelayCS(SCROLLDELAY); // and wait a while
    } // repeat for 5 character columns plus 1 spacing column
}

void LoadSymbol(int index)
// loads a font symbol into the non-visible part of display buffer
{
    for (int y = 0; y < COLS; y++)
        buf[y+5] = pgm_read_byte(&(FONT_CHARS[index][y]));
    buf[11] = 0x00; // add character spacing
}

void MakeVisible()
// copies char from non-visible to visible part of buffer
{
    for (int i=0; i<COLS; i++)
        buf[i] = buf[i+5];
}

void DisplaySymbol(int index)
// loads a font symbol into the visible display buffer
{
    LoadSymbol(index); // body missing from the post; reconstructed from
    MakeVisible();     // the two helper routines above
}

void ScrollText(const char *text)
// scrolls given text across matrix, right to left
{
    for (int i=0; i<strlen(text); i++)
    {
        LoadSymbol(text[i]-' '); // get char
        Scroll(); // and scroll it
    } // repeat for all chars
}

void DisplayText(const char *text)
// displays given text, one character at a time
{
    for (int i=0; i<strlen(text); i++)
    {
        DisplaySymbol(text[i]-' '); // display char
        DelaySecond(); // wait a while
    } // repeat for all chars
}

// -------------------------------------------------------------------------

void FlashHeart()
{
    DisplaySymbol(HEARTCHAR); // flash heart on
    DelayCS(FLASHDELAY); // wait
    DisplaySymbol(0); // flash heart off
    DelayCS(FLASHDELAY); // wait
}

void HeartBeat()
{
    FlashHeart(); // heart on/off
    FlashHeart(); // heart on/off
    DelayCS(BEATDELAY); // wait
    FlashHeart(); // do it again!
}

// -------------------------------------------------------------------------

void main_loop ()
{
    for (int i=100; i<107; i++)
    {
        DisplaySymbol(i); // display a fun symbol
        HeartBeat(); // heartbeats
        DisplayText(TEXT1); // display text1
        HeartBeat(); // more heartbeats
        ScrollText(TEXT2); // scroll text2
    } // repeat 7 times
    sleep_cpu(); // turn off display
}

// ---------------------------------------------------------------------------

int main (void)
{
    init(); // set up ports, CPU registers
    main_loop(); // do the display, then sleep
    return (0); // that's all, folks!
}

Now, I am no software expert, so hopefully someone else can validate this.

I believe you're going to use more than 44 bytes of RAM, as that is only the static usage, such as strings and global variables that are set up in memory before main() runs. It does not take into account your function-specific variables, like the ints in your for loops.

(Still, in the back of my head it sounds funny describing RAM in bytes.)
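
For what it's worth, the avr-size line in the build log can be turned into the two figures Atmel Studio reports with simple arithmetic. Below is a minimal sketch in plain C, meant to run on a PC rather than on the AVR. The struct and function names are made up for illustration; the section meanings are the usual avr-gcc ones.

```c
#include <stddef.h>

/* Section sizes as printed by avr-size (bytes). */
struct avr_size {
    unsigned text;  /* code and PROGMEM constants, lives in flash */
    unsigned data;  /* initialised globals and string literals: stored in
                       flash and copied to RAM by the startup code */
    unsigned bss;   /* zero-initialised globals, RAM only */
};

/* Flash holds the code plus the initial values of .data. */
unsigned flash_used(struct avr_size s) { return s.text + s.data; }

/* Static RAM is .data plus .bss; the stack is NOT included. */
unsigned static_ram_used(struct avr_size s) { return s.data + s.bss; }

/* The two string literals from the posted code, including their NUL bytes. */
size_t literal_bytes(void)
{
    return sizeof("I Love You! ") + sizeof("Ich Liebe Dich! ");
}
```

With the numbers from the log (text 1320, data 30, bss 14) this gives 1350 bytes of flash and 44 bytes of static RAM, matching the "Program Memory Usage" and "Data Memory Usage" lines. The two literals come to 13 + 17 = 30 bytes, which is all of .data, and buf[12] plus a 2-byte int curCol would explain the 14 bytes of .bss. Whatever is left of the 128 bytes (84 here) is what the stack has to fit in.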

Yeah, that's what I wasn't really sure about: whether it was truly giving me an accurate amount or just what will be needed as a minimum. Looking back, I probably should have gone with the ATtiny4313A, but I had ordered the parts long before the project came along.

The majority of your RAM usage is here:

#define TEXT1 "I Love You! "
#define TEXT2 "Ich Liebe Dich! "

You will need to modify your DisplayText() function (and ScrollText(), which takes TEXT2), but you can put these strings in program space too, as you have with your font data, e.g.:

const char string_1[] PROGMEM = "I Love You! ";

Also, you might try optimisation switch -Os instead of -O1 as your code will likely be smaller with no real speed impact.
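
A rough sketch of what the modified loop could look like. The helper name ForEachSymbol_P and its callback parameter are hypothetical, not from the original program, and the #else branch is only there so the sketch also compiles on a PC. On the AVR it relies on pgm_read_byte() from avr/pgmspace.h, which the posted code already uses for the font table.

```c
#include <stddef.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host fallback so the sketch also builds on a PC (for testing only). */
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

const char string_1[] PROGMEM = "I Love You! ";

/* Walk a string stored in flash one byte at a time and hand each font
   index to a callback. strlen() and text[i] cannot be used here, because
   the characters no longer live in RAM. Returns the number of characters. */
size_t ForEachSymbol_P(const char *text, void (*show)(int index))
{
    size_t n = 0;
    unsigned char c;
    while ((c = pgm_read_byte(text + n)) != '\0') {
        if (show)
            show(c - ' ');   /* same font-index mapping as the original code */
        n++;
    }
    return n;
}
```

Something like ForEachSymbol_P(string_1, DisplaySymbol) would then replace the loop in DisplayText(); the per-character DelaySecond() call would have to move into a small wrapper that is passed as the callback.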

I was an avid Gentoo Linux user for a few years and fiddling with compiler flags was always fun. I remember a lot of discussion on GCC optimization levels. I thought there was something with -Os where it traded program size for RAM or vice versa. But that was a while back and with x86 GCC.

I see -g2 as one of your compiler flags, which means there's a bunch of debugging info in the output. If you manage to find where to turn off debugging, there should be no -g option at all on the command line. There may also be extra code in there because the symbol DEBUG is defined on the command line with -DDEBUG. This means that any sections of code wrapped in #ifdef DEBUG ... will also be compiled into your program.

Kremmen
Re: Making sure I have enough free RAM in my AVR
« Reply #7 on: December 18, 2013, 03:44:26 pm »
You can (in Studio 6.1) go to Project / [Scroll] Properties and, under the Build heading, change Debug to Release, but it won't change anything in this case (oh, it changes the compiler flags all right). All the debug info (or not) will be in the .elf file, but none of it will be flashed to the device. As far as the debug info goes, both options produce the same code and static RAM size; any difference would come from a different optimisation level, not from -g.

P.S. Offhand and without checking - yes your ram is fine. You have used a third of the available, no way does your stack grow so large in such a simple piece of code that you need to start worrying about that.
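
In the same spirit, the 3.8K HEX file is not 3.8K of flash either: Intel HEX is a text format that spends two characters per byte plus some overhead on every line. A small sketch of the arithmetic, assuming 16 data bytes per record, CR+LF line endings, a single end-of-file record and no other record types. Those are assumptions about what avr-objcopy writes, and the function name is made up.

```c
/* Estimate the size of an Intel HEX file for a given number of payload bytes.
   Each record is ':' + LL + AAAA + TT + data + CC, i.e. 11 characters of
   overhead plus 2 characters per data byte, plus the line ending. */
unsigned long ihex_size(unsigned long payload, unsigned per_record, unsigned eol_len)
{
    unsigned long overhead = 11ul + eol_len;
    unsigned long full = payload / per_record;   /* completely filled records */
    unsigned long rest = payload % per_record;   /* bytes in the last record */
    unsigned long size = full * (overhead + 2ul * per_record);

    if (rest)
        size += overhead + 2ul * rest;
    return size + overhead;                      /* end-of-file record, no data */
}
```

For the 1350 bytes reported as program memory usage this comes to 3818 characters, which lines up with the 3.8K file from the first post.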
RAM usage is usually divided in two parts: static and stack.

Static is statically allocated, for example your "char buf[12]; int curCol;" and, depending on the optimizer, possibly the variables in the main() block (not nested blocks). It also covers all variables marked "static", which gives them one fixed memory address forever without pushing them to global scope.

Stack is temporary storage: the return address of a function call or ISR, function parameters/return values, and temporary variables that do not fit in the CPU registers.

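You may be able to save an extra 2 bytes (16-bit address) of stack by declaring your main as non-returning; the idea is that the return address to the assembly reset handler is then not kept on the stack. With GCC that would be something along the lines of:
int main(void) __attribute__((noreturn));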

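Your worst-case stack usage is roughly the maximum function call depth times 2 (one return address per level), plus all arguments and locals that end up on the stack, plus whatever an interrupt pushes on top of that.
However, the compiler usually puts arguments and return values in registers, if they fit.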
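
That rule of thumb can be written down as a couple of one-liners. This is only a back-of-the-envelope sketch: the function names are made up, and every input is a guess you have to supply yourself by reading the code or the .lss listing.

```c
/* Rough worst-case stack estimate in bytes.
   call_depth    : deepest chain of nested calls (2-byte return address each)
   spilled_bytes : arguments and locals that did not fit in registers
   isr_bytes     : return address plus registers saved by the deepest ISR */
unsigned stack_estimate(unsigned call_depth, unsigned spilled_bytes, unsigned isr_bytes)
{
    return call_depth * 2u + spilled_bytes + isr_bytes;
}

/* Non-zero if static RAM plus the estimated stack fits in the device's RAM. */
int ram_fits(unsigned static_bytes, unsigned stack_bytes, unsigned ram_size)
{
    return static_bytes + stack_bytes <= ram_size;
}
```

For the posted program the deepest chain looks like startup code, main, main_loop, ScrollText, Scroll, DelayCS, so a depth of 5 if nothing gets inlined. With placeholder guesses of 8 spilled bytes and 12 bytes for the timer interrupt, stack_estimate(5, 8, 12) gives 30 bytes, and 44 + 30 is comfortably below 128.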

Also, #defines never make it to the chip; they don't even reach the compiler proper. If the define expands to a literal (inline string/value) in the preprocessor, the literal is usually stored in flash and copied to RAM at boot time (between the reset handler and the main routine). However, this depends on the compiler and optimizer. This can seriously hurt on AVR due to the architecture of the chip (flash and RAM are separate address spaces, so such literals get a RAM copy).

Also, you are using type int for curCol, which only goes from 0 to 4; you waste one full byte and roughly double the time spent working with it. Make that a uint8_t (unsigned char): 8 bits instead of 16 bits.
Remember you have an 8-bit MCU, which is very good with 8 bits, less good with more, and a disaster with floating point.

*The reset handler is where code starts; it is well hidden in the Atmel toolchain, but when reading raw assembly you can find it.

I find all of this fascinating. Right now the code is exactly as the original author wrote it, with one change: he omitted one semicolon, which I fixed. I don't think my changes will noticeably increase the code size. I will mainly be adding some custom characters, so only a few bytes of progmem will be eaten up by that, I think.

I will also add some randomization to the way the different symbols I've come up with are displayed.

The ELF contains information for the debugger to translate program addresses (the machine code) to your C code. This way it also knows what is stored on the stack and in the CPU registers at any given point, so your debugger can give prettier readouts.

I recommend a "Debug" profile with initially no optimisations at all. Otherwise the compiler may optimize some steps away, so single-stepping the code isn't as detailed. Variables may also become 'unavailable', especially those pushed onto the stack.

In addition: if you turn up optimizations you may get issues running code. For example, the compiler suddenly makes assumptions that turn out to be incorrect, and you end up abusing 'volatile'.

Regarding all RAM usage: any variable defined outside a function is statically allocated, and the compiler knows how much space it occupies. Any variable defined inside a function, including function arguments and return values, lives in registers or is pushed onto and popped from the stack. The stack grows from the end of the memory space towards the front. If you nest function calls too deep (too many layers on layers) and run out of room, the stack will run into the statically allocated variables and overwrite them.

It gets even trickier if you use malloc (which I don't recommend for controllers with little RAM): you have a 'heap' area which grows from the end of your static RAM towards the stack. These two can also collide and cause issues.

So the reported RAM usage is only an estimate: it covers just the statically allocated RAM, not the stack or heap.
As said, I don't recommend using the heap.
The stack is hard to predict, but it can be minimized by avoiding too many levels of function calls and too many function arguments.
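
One common way to get a measured figure instead of a guess is "stack painting": fill the unused RAM with a known marker at startup, let the program run, then count how many marker bytes survived. The sketch below only demonstrates the idea on an ordinary array on a PC. On the real chip the region would be the gap between the end of the static data and the stack pointer, and the function names and the marker value are arbitrary choices for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define PAINT 0xC5u   /* arbitrary marker value (assumption) */

/* Fill the unused RAM region with the marker before the program uses it. */
void paint_region(uint8_t *region, size_t len)
{
    for (size_t i = 0; i < len; i++)
        region[i] = PAINT;
}

/* Count marker bytes still intact at the low end of the region.
   The stack grows downward from the top, so everything it ever touched
   lies above the first overwritten byte. */
size_t untouched_bytes(const uint8_t *region, size_t len)
{
    size_t n = 0;
    while (n < len && region[n] == PAINT)
        n++;
    return n;
}
```

The result is a high-water mark, not a guarantee: it only reflects the code paths that actually ran, and a stack byte that happens to equal the marker at the boundary would make the free space look slightly larger than it is.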

Well, I finally got a chance to load the hex file, and it flickers really badly and some of the characters seem messed up. I've got all my wires in the right place. Every once in a while, putting 5V to it fixes it, and dropping to a CR2032 battery (which is significantly dimmer) seems to work as well, but the 3.3V supply from my Arduino seems to mess it up.

So the dimness of the CR2032 will be a problem. Does anyone have any other battery ideas that are small but with a bit more kick than the CR2032? I seem to remember some sort of battery that's a little wider than a AAA but shorter. The datasheet on the ATtiny2313A says 1.8 to 5.5V, so I can go higher. Small and discreet is what I need, though, or the necklace will be too bulky.

I'm guessing that the place where the speed is defined in the code is not actually setting the speed, just telling the program what the speed is. Either way I need to kick it up. Playing with the number in the software just seems to make it cycle between parts faster, but the flicker is still bad.
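
That guess can be checked with a bit of arithmetic: F_CPU only feeds the delay routines, while the display refresh rate follows from whatever clock the fuses actually select. A small sketch, runnable on a PC; the function names are made up, and the formula is the usual one for Timer0 compare matches in CTC mode.

```c
/* Compare-match interrupts per second for Timer0 in CTC mode:
   f_cpu / (prescaler * (OCR0A + 1)). */
unsigned long isr_rate_hz(unsigned long f_cpu, unsigned prescaler, unsigned ocr0a)
{
    return f_cpu / ((unsigned long)prescaler * (ocr0a + 1u));
}

/* One column is lit per interrupt, so a full frame takes `cols` interrupts. */
unsigned long frame_rate_hz(unsigned long f_cpu, unsigned prescaler,
                            unsigned ocr0a, unsigned cols)
{
    return isr_rate_hz(f_cpu, prescaler, ocr0a) / cols;
}
```

With the settings in init() (prescaler 256, OCR0A = 40, 5 columns), a real 4 MHz clock gives about 381 interrupts per second, or roughly 76 frames per second. If the chip is still running at 1 MHz (which, if I remember the datasheet correctly, is the factory default of the 8 MHz internal oscillator with the divide-by-8 fuse programmed), the same code only manages about 19 frames per second, which would flicker visibly, and all the delays would run four times longer than intended.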

OK, well, I got the fuses set and the flickering is gone. However, I think the chip browned out during one of my numerous reprogrammings with a tweaked program and got "bricked", because now it won't respond any more.

Here's the other strange thing. The chip now seems to run great at 3.3V, but when I bump it up to 5V, it just flickers the LEDs.

On top of that, I created all my symbols with an Excel file designed for exporting hex from a 5x8 grid (I'm using a 5x7 matrix), and unbeknownst to me I had started from 8 and not 0, so all the characters have one blank row and are shifted down.

Also, the characters inside the program are flipped 180 degrees from what the other one was. So basically:

Have to change from 3.3 to 5 to program otherwise I start getting programming errors.
After programming change from 5 back to 3.3 or it will just flicker.
Need to flip all my symbols 180 degrees and shift back one spot.

Anyone know of an easy way to correct these?

The "Tree 1" with the comment marker is one that is wrong, Tree 1 without the comment marker is corrected.

All the ones underneath that need to be fixed.

// { 0x28, 0x78, 0xFE, 0x78, 0x28 }, // 107: Tree 1
{ 0x14, 0x1E, 0x7F, 0x1E, 0x14 }, // 107: Tree 1
{ 0x10, 0x38, 0xEE, 0x38, 0x10 }, // 108: Star 1
{ 0x10, 0x54, 0xEE, 0x54, 0x10 }, // 109: Star 2
{ 0x20, 0x20, 0xFE, 0x20, 0x20 }, // 110: Cross
{ 0x10, 0x38, 0x6C, 0x38, 0x10 }, // 112: Small Star 1
{ 0x38, 0x54, 0xEE, 0x54, 0x38 }, // 113: Ornament 1
{ 0x80, 0x8C, 0x64, 0x74, 0xFE }, // 114: Sleigh
{ 0x54, 0x38, 0xEE, 0x38, 0x54 }, // 115: Big Flake 1
{ 0x54, 0x28, 0xBA, 0x28, 0x54 }, // 116: Big Flake 2
{ 0x1E, 0x32, 0x76, 0x32, 0x9E }, // 117: Manger
{ 0x24, 0x6C, 0xFE, 0x6C, 0x24 }, // 118: Tree 2
{ 0xD6, 0xD6, 0xFE, 0xD6, 0xD6 }, // 119: Present 1
{ 0xFE, 0xD6, 0xFE, 0xD6, 0xFE }, // 120: Present 2
{ 0x3E, 0x52, 0xBE, 0x2A, 0x3E }, // 121: Church
{ 0x86, 0x22, 0xA2, 0x22, 0x26 }, // 122: Snow Face
{ 0xFE, 0x32, 0xF2, 0x32, 0xFE }, // 123: Crown
{ 0x5E, 0x6A, 0xBE, 0x6A, 0x5E }, // 124: Present 3
{ 0x10, 0x28, 0x54, 0x28, 0x10 } //  125: Small Star 2
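
Comparing the commented-out Tree 1 with the corrected one, each byte is simply the 8-bit mirror image of the original (0x28 becomes 0x14, 0x78 becomes 0x1E, 0xFE becomes 0x7F), which takes care of the flip and the one-row shift in a single step. Below is a minimal sketch of a PC-side helper that does this for a whole glyph; the function names are made up. One caveat: Tree 1 is left-right symmetric, so it cannot show whether the column order also needs reversing for a true 180 degree rotation. Asymmetric symbols such as the Sleigh would have to be checked on the display.

```c
#include <stdint.h>

/* Mirror the bits of one column byte: bit 7 <-> bit 0, bit 6 <-> bit 1, ... */
uint8_t reverse_bits(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1u));
        b >>= 1;
    }
    return r;
}

/* Fix one 5-column glyph in place by mirroring every column byte. */
void fix_glyph(uint8_t glyph[5])
{
    for (int c = 0; c < 5; c++)
        glyph[c] = reverse_bits(glyph[c]);
}
```

Running each of the remaining rows through fix_glyph() and printing the result as hex should give values that can be pasted straight back into the font table.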
