I'm developing a product using an FTDI+AVR combo, so I can share some advice. I support two programming methods: bitbang through the FTDI interface, mainly used for burning the bootloader, and a bootloader that communicates over the serial line. I'm also using the FTDI as an external clock source instead of a crystal.
At first I had difficulty programming the chip. What I finally found out is that the chip needs a clock source in order to work, and that the clock needs to be fast enough for the programming speed you're using. The device ships with the fuses set to use the internal RC oscillator so that the chip can be programmed out of the box, as the datasheet spells out explicitly:
The device is shipped with internal RC oscillator at 8.0MHz and with the fuse CKDIV8 programmed, resulting in 1.0MHz system clock. The startup time is set to maximum and time-out period enabled. (CKSEL = "0010", SUT = "10", CKDIV8 = "0"). The default setting ensures that all users can make their desired clock source setting using any available programming interface.
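To see how those default fields map onto the low fuse byte you pass to avrdude, here is a minimal sketch. It assumes the ATmega328-family low fuse layout (CKDIV8 in bit 7, CKOUT in bit 6, SUT in bits 5:4, CKSEL in bits 3:0) and that CKOUT is left unprogrammed; the function name is made up for illustration, so check the layout against the datasheet for your exact part.

```python
# Pack low-fuse fields into one byte, assuming the ATmega328-family layout:
#   bit 7: CKDIV8, bit 6: CKOUT, bits 5:4: SUT, bits 3:0: CKSEL
# On AVR a fuse bit reads 0 when "programmed" and 1 when "unprogrammed".

def low_fuse_byte(cksel: int, sut: int, ckdiv8: int, ckout: int = 1) -> int:
    """Return the low fuse byte for the given field values (hypothetical helper)."""
    if not 0 <= cksel <= 0b1111:
        raise ValueError("CKSEL must fit in 4 bits")
    if not 0 <= sut <= 0b11:
        raise ValueError("SUT must fit in 2 bits")
    if ckdiv8 not in (0, 1) or ckout not in (0, 1):
        raise ValueError("CKDIV8 and CKOUT are single bits")
    return (ckdiv8 << 7) | (ckout << 6) | (sut << 4) | cksel


# Factory defaults quoted above: CKSEL = "0010", SUT = "10", CKDIV8 = "0"
default_lfuse = low_fuse_byte(cksel=0b0010, sut=0b10, ckdiv8=0)
print(hex(default_lfuse))  # 0x62
```

Any CKSEL value other than the internal RC setting changes the low nibble of this byte, which is the part that can lock you out if no matching clock is present.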
If you program the CKSEL fuses to use anything but the internal RC oscillator before providing a matching external clock source, you're stuck until you can program the chip with such a clock present. The downside of the default setting is that you need to program the chip at a slower clock rate, which can be a pain. I had luck with the -B 1.0 avrdude switch, which sets the programmer's bit clock period to 1 µs. The exact interpretation of the switch might vary between programmers, so experiment with the value if needed.
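If you need a starting point for experimenting, a common rule of thumb for AVR serial programming is that the programmer's clock should be slower than a quarter of the target's system clock. The sketch below turns that into a minimum -B period; the factor of 4 and the function name are assumptions for illustration, and a bitbang programmer may interpret -B differently, as noted above.

```python
def min_bitclock_period_us(f_cpu_hz: float, cycles_per_sck: float = 4.0) -> float:
    """Shortest programming clock period in microseconds under the
    'programming clock < f_cpu / 4' rule of thumb (an assumption here)."""
    if f_cpu_hz <= 0:
        raise ValueError("f_cpu_hz must be positive")
    f_cpu_mhz = f_cpu_hz / 1e6
    return cycles_per_sck / f_cpu_mhz


# Factory default (8 MHz RC with CKDIV8 programmed) versus an undivided 8 MHz clock.
for f_cpu in (1_000_000, 8_000_000):
    period = min_bitclock_period_us(f_cpu)
    print(f"f_cpu = {f_cpu} Hz -> try -B {period} or larger")
```

Treat the result as a lower bound to start from, then go slower if programming is unreliable.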
In my case, since I had an external high speed clock available, my programming procedure was to first program the fuses using the -B 1.0 option, then program the flash with no -B flag (to get the fastest possible flash programming speed). A more appropriate procedure for you is probably to program everything using -B 1.0 or a slower setting, and to program LFUSE last (LFUSE is where the clock source selection fuses live on the ATmega328), so that you don't lock yourself out of the chip until you're ready to put it in its final circuit, where you hopefully have an appropriate crystal.
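The "LFUSE last" ordering could be sketched roughly like this. The script only builds and prints the avrdude command lines; it doesn't run them. The programmer id ft232r, the part m328p, the file firmware.hex and the LFUSE value 0xFF are all placeholders I picked for illustration, so substitute whatever matches your setup and crystal.

```python
import shlex


def avrdude_steps(programmer, part, hex_file, lfuse, bitclock_us=1.0):
    """Build avrdude command lines that write the flash first and LFUSE last.

    Every argument value used below is a placeholder; only the -c, -p, -B
    and -U options themselves are standard avrdude switches.
    """
    base = ["avrdude", "-c", programmer, "-p", part, "-B", str(bitclock_us)]
    return [
        # Step 1: flash, while the chip still runs from its current clock source.
        base + ["-U", f"flash:w:{hex_file}:i"],
        # Step 2: LFUSE last, since it holds the clock source selection.
        base + ["-U", f"lfuse:w:0x{lfuse:02X}:m"],
    ]


steps = avrdude_steps("ft232r", "m328p", "firmware.hex", lfuse=0xFF)
for step in steps:
    print(shlex.join(step))
```

If you also change HFUSE or EFUSE, they would go before the final LFUSE step so the clock source stays untouched until the very end.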