Such a device has always sparked my interest, and I have an ongoing project that does something very similar.
I've had pretty good results with a conventional approach: an AD8253 pre-amplifier plus a 16-bit ADC. For starters I looked at how an MCP3911 would perform (answer: not too brilliantly), but it did give me some useful info for a new revision.
As an example, this is a quick capture of a PIC16 RF node (another WIP project) waking up, starting conversion of a digital temp sensor (~200uA current draw), transmitting that data over 868MHz RF (25mA), waiting for an ACK and going back to sleep.
https://dl.dropboxusercontent.com/u/207647/PIC16%20Sleep%20test.png
Unfortunately the wake-up of the PIC to start the sensor is off screen; but you can see the trace slightly offset before the transmit compared to afterwards.
The AD8253 has a 4MHz bandwidth at a gain of 10. I measure across a 10R 0.1% current shunt. The voltage is attenuated and goes straight into the A/D. With G=10 I get about +/- 136mA FSR and 4.15uA/LSB (16-bit words).
If the current draw stays within about 13mA, everything drops tenfold, to 0.415uA/LSB and 500kHz bandwidth.
That sounds similar to their "100s of nA to mA" range but without special trickery.
(plus the AD8253 has 1 range left - but then the offset/calibration is really fiddly)
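To make the range arithmetic above easy to check, here is a rough Python sketch. It assumes the ADC delivers signed 16-bit counts spanning the full bipolar range, and that the ~13mA range corresponds to the AD8253's G=100 setting (my reading of "drops tenfold"). The function names are made up for illustration and are not from any firmware.

```python
def amps_per_lsb(full_scale_amps, bits=16):
    """Resolution of a bipolar +/- full_scale_amps range on a bits-wide ADC."""
    # The ADC codes cover -FS..+FS, so the total span is 2 * FS.
    return 2 * full_scale_amps / 2 ** bits


def counts_to_amps(raw_count, full_scale_amps, bits=16):
    """Convert a signed ADC count to amps (assumes zero offset, ideal gain)."""
    return raw_count * amps_per_lsb(full_scale_amps, bits)


# +/- 136mA range (G=10) and the assumed +/- 13.6mA range (G=100)
g10 = amps_per_lsb(0.136)
g100 = amps_per_lsb(0.0136)

print(f"G=10:  {g10 * 1e6:.2f} uA/LSB")
print(f"G=100: {g100 * 1e6:.3f} uA/LSB")
```

This reproduces the 4.15uA/LSB and 0.415uA/LSB figures. A real conversion would also need the offset and gain calibration terms, which are left out here.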
I also tried measuring very brief sleep wakeup periods (at 1000x gain so limited bandwidth):
https://dl.dropboxusercontent.com/u/207647/PIC16DeepSleep%20-%20Wakeup%20pattern.png
I suspect this pulse is severely damped by the limited bandwidth at such high gain, and also by the fact that the current is measured through a 20cm cable and ~20uF of ceramic capacitors.
The transition would likely show a lot more detail if I intercepted the power pin of the microcontroller.
Nevertheless; if I wanted to know how long my microcontroller stays alive on each wake-up, I would probably toggle an I/O pin near the sleep() instruction and measure the pulse lengths with an oscilloscope.
MSP's profiler seems powerful, because it gives a quick overview,
The reason I DIY-ed it is that I've seen some professional tools that do similar things, but they are often unaffordable or locked in to 1 microcontroller vendor (Microchip's Real ICE has a similar power meter board plug-in).
I know of CMicrotek (startup), which attempted a crowd-funding campaign earlier this year and got fame on this forum for calling its current-scope probe the 'uCurrent'.
https://www.eevblog.com/forum/crowd-funded-projects/current-probe-cmicrotek-ultra-low-current-measurement-tools/
They also have a uPower analyzer (which is functionally identical to the instrument I am working on).
I should probably open source my project once I have revision 1 debugged well enough.

There are probably some more devices around, but they likely carry the same $500+ price tag.