Just throwing in my 2 cents here, because I see many old comparisons going back to the classic 'AVR vs PIC' battles.
The assembler of PIC 8-bit devices is horrible. End of story. Not many registers, banked memory, slow instruction speed. Ugh.
The assembler of AVR seems nicer, and because of the much higher instruction speed it's much faster. Flat memory, nice!
The C/C++ compiler for 8-bit AVR is good, because you get support for full C and C++, which I really miss elsewhere, even on 8-bit. I agree, you wouldn't write C++ on a PIC16 device, but on a PIC18 that carries on-board USB or Ethernet plus a bunch of sensors, it can be useful to program with objects.
The C compiler for PIC16/18 is a pain. The free XC8 seems to scatter NOPs and BRAs all over the place to make the commercial version look worthwhile. If possible, only use these devices for non-timing-critical projects.
Packaging/pricing/availability: if it's really hard for you to get hold of a certain device, pick from what you can get. Otherwise, get SMT breakout boards, and there are basically no limits for most QFP parts.
I work with SMD components on breadboards all the time, using some simple breakout boards etched at college. If required, get some wire wrap and prototype that way.
Like here with an ENC28J60, with 0402 50R termination resistors bodged between the pins of an Ethernet transformer.
Restricting yourself to 'DIP only' chips just makes life hard for yourself, because the more popular and common chips are moving towards SMT these days.
I've never written ASM on ARM (only C, and looking at the generated ASM), but I do know the ARM architecture is very nice, with data and code in one address space, instruction sets like Thumb to reduce code size, better interrupt vectors, etc.
I'm writing some inline ASM on PIC24/dsPIC33 and it seems OK. It has 16 working registers, stack pointers, and seems like a performance-oriented instruction set.
MOV W0, [W8++] ; store W0 at the address pointed to by W8, then advance W8 to the next word
SUB W0, #0x10, [W8++] ; subtract 0x10 from W0, store the result at the address pointed to by W8, then advance W8 to the next word
MOV W0, [W8+0x1FA] ; store W0 at address W8 + 0x1FA, W8 itself is unchanged
Each of these operations is 1 instruction word (which is actually 24 bits, or 3 bytes) and takes 1 cycle.
However, moving a literal to a RAM location still takes two instructions:
MOV #24h, W0
MOV W0, 0xCAFE ; CAFE = your RAM address
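For those who only read C, here is a rough sketch of what the instructions above do, written as plain C that runs on a PC. It is only a model under my own assumptions: the `ram` array and the function names are made up for illustration, and the offset in the indexed store is treated as a byte offset, which is how I understand the PIC24 addressing mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for data RAM; on the real chip the pointer
   would address the device's own memory. */
static uint16_t ram[512];

/* Like the post-increment MOV: store, then step the pointer one word. */
static uint16_t *store_postinc(uint16_t *w8, uint16_t w0)
{
    *w8++ = w0;
    return w8;
}

/* Like the SUB example: subtract a small literal, store, step the pointer. */
static uint16_t *sub_store_postinc(uint16_t *w8, uint16_t w0)
{
    *w8++ = (uint16_t)(w0 - 0x10u);
    return w8;
}

/* Like the indexed MOV: store at pointer + offset, pointer unchanged.
   The offset is assumed to be in bytes, so it must be even for a word. */
static void store_offset(uint16_t *w8, uint16_t w0)
{
    assert(0x1FA % 2 == 0);
    w8[0x1FA / 2] = w0;
}

/* One C statement, but the literal has to pass through a register first,
   which is why it costs two instructions in the listing above. */
static void store_literal(uint16_t *addr)
{
    *addr = 0x24;
}
```

Calling the first two functions in a row on a pointer into `ram` leaves the pointer two words further along, which is exactly the kind of buffer-filling loop these addressing modes are good at.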
At least it also has some software traps and interrupt vectors, so the hardware pre-sorts interrupt sources to your handlers.
Unfortunately, you have to maintain the stack in software, which can cost some cycles. But if your ISR is really that time-critical, write it in ASM so it doesn't produce much overhead anyway.
Personally, I wouldn't touch 8-bit PICs much, especially not if I'm looking to write decent, clean code. I always have to make adjustments here and there to get it to compile and run efficiently on 8-bit PICs.
16-bit and 32-bit PICs seem OK, just like 8-bit AVRs have been for years. Unfortunately, C++ support from Microchip is very shabby: they have only just released it for PIC32. It sounds to me like the code density on 8-bit AVRs may be a bit higher than on PIC24s, but then again 8-bit AVRs work with 8 bits and PIC24s with 16 bits.