Andrei Alexandrescu gave a nice talk on allocators at CppCon 2015:
It's somewhat C++-specific, but he advocates treating allocation and object creation as quite separate concerns, so the allocation side applies to malloc just as well as it does to new.
His basic point is that it's hard, but the right strategy is probably this: once you understand how your app uses memory, create multiple allocators for different sizes and alignments, then have a master allocator that dispatches to one of them based on the type/size being requested.
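To make the dispatching idea concrete, here is a rough C sketch of my own (not code from the talk): two fixed-size block pools plus a malloc fallback, with a front end that picks one by requested size. Every name (`pool_t`, `dispatch_alloc`, ...) and the 32/256-byte thresholds and pool counts are assumptions for illustration; real numbers should come from measuring your app.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Fixed-size block pool. Free blocks are chained through their first
   sizeof(void *) bytes. All names here are hypothetical. */
typedef struct pool {
    unsigned char *base;   /* start of backing storage */
    size_t block_size;     /* bytes per block */
    size_t block_count;    /* number of blocks in the pool */
    void *free_list;       /* head of the free-block chain */
} pool_t;

static void pool_init(pool_t *p, void *storage, size_t block_size, size_t block_count)
{
    assert(block_size >= sizeof(void *));
    p->base = storage;
    p->block_size = block_size;
    p->block_count = block_count;
    p->free_list = NULL;
    /* Push blocks in reverse so the first allocation returns block 0. */
    for (size_t i = block_count; i-- > 0;) {
        void *blk = p->base + i * block_size;
        memcpy(blk, &p->free_list, sizeof(void *));
        p->free_list = blk;
    }
}

static void *pool_alloc(pool_t *p)
{
    void *blk = p->free_list;
    if (blk != NULL)
        memcpy(&p->free_list, blk, sizeof(void *)); /* pop the head */
    return blk;                                     /* NULL if pool is empty */
}

static void pool_free(pool_t *p, void *blk)
{
    memcpy(blk, &p->free_list, sizeof(void *));     /* push onto the chain */
    p->free_list = blk;
}

static int pool_owns(const pool_t *p, const void *ptr)
{
    uintptr_t a = (uintptr_t)ptr, lo = (uintptr_t)p->base;
    return a >= lo && a < lo + p->block_size * p->block_count;
}

/* Assumed size classes: up to 32 bytes and up to 256 bytes. */
static _Alignas(max_align_t) unsigned char small_mem[32 * 64];
static _Alignas(max_align_t) unsigned char medium_mem[256 * 16];
static pool_t small_pool, medium_pool;

void dispatch_init(void)
{
    pool_init(&small_pool, small_mem, 32, 64);
    pool_init(&medium_pool, medium_mem, 256, 16);
}

/* The "master" allocator: choose a pool by size, spill over to the next
   larger option when a pool is exhausted. */
void *dispatch_alloc(size_t n)
{
    void *p = NULL;
    if (n <= 32)
        p = pool_alloc(&small_pool);
    if (p == NULL && n <= 256)
        p = pool_alloc(&medium_pool);
    if (p == NULL)
        p = malloc(n);
    return p;
}

/* Route the pointer back to whichever allocator it came from. */
void dispatch_free(void *p)
{
    if (p == NULL)
        return;
    if (pool_owns(&small_pool, p))
        pool_free(&small_pool, p);
    else if (pool_owns(&medium_pool, p))
        pool_free(&medium_pool, p);
    else
        free(p);
}
```

Fixed-size pools can't fragment internally, which is the appeal; the cost is wasted space when a request is much smaller than its size class. This sketch is also not thread- or interrupt-safe.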
I've spent enough time in small-memory embedded systems that I have to say I don't personally recommend trying to solve the fragmentation problem by making a smarter allocator. If you can, don't use dynamic memory. Otherwise, if your program just does a lot of allocations at startup and doesn't free anything, then don't worry about it; it'll be fine with just a simple buffer-pointer-increment-and-round allocator. If it does malloc/free a lot, then change your program! That's an embedded disaster in the making. Make it stop happening by changing your code: allocate buffers in advance and reuse them, perhaps in a ring if necessary.
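For what it's worth, here is roughly what I mean by both of those, as a minimal C sketch. The names (`bump_alloc`, `buf_ring_t`, ...) and all the sizes are made up for illustration, and none of it is interrupt- or thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* --- Increment-and-round ("bump") allocator: allocate at startup, never free. --- */
#define ARENA_SIZE 4096u   /* assumed memory budget */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* align must be a power of two no larger than alignof(max_align_t).
   Returns NULL once the arena is exhausted. */
void *bump_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (arena_used + align - 1) & ~(align - 1); /* round up */
    if (start > ARENA_SIZE || size > ARENA_SIZE - start)
        return NULL;
    arena_used = start + size;                              /* increment */
    return arena + start;
}

/* --- Ring of preallocated buffers, reused in FIFO order. --- */
#define RING_SLOTS 4u      /* assumed number of buffers */
#define SLOT_BYTES 64u     /* assumed size of each buffer */

typedef struct {
    unsigned char data[RING_SLOTS][SLOT_BYTES];
    size_t head;   /* next slot to hand out */
    size_t tail;   /* oldest slot still in use */
    size_t count;  /* slots currently in use */
} buf_ring_t;

/* Take the next free buffer, or NULL if every buffer is still in use. */
unsigned char *ring_acquire(buf_ring_t *r)
{
    if (r->count == RING_SLOTS)
        return NULL;
    unsigned char *buf = r->data[r->head];
    r->head = (r->head + 1) % RING_SLOTS;
    r->count++;
    return buf;
}

/* Peek at the oldest in-use buffer, or NULL if none is in use. */
unsigned char *ring_oldest(buf_ring_t *r)
{
    return r->count ? r->data[r->tail] : NULL;
}

/* Give the oldest buffer back to the ring so it can be reused. */
void ring_release(buf_ring_t *r)
{
    assert(r->count > 0);
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count--;
}
```

The point of both is that memory use is fixed and known up front: the arena can only run out at startup, where you will notice, and the ring either has a free buffer or it doesn't, so you get backpressure instead of fragmentation.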