Author Topic: Help needed with some heap test code  (Read 3261 times)


Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Help needed with some heap test code
« on: April 29, 2023, 03:49:06 pm »
This is just weird. I am using the Newlib arm32 ex Cube IDE heap code.

My printf does not use the heap.

Code: [Select]
int blocksize = 1000;

while (true)
{
    osDelay(200);

    char *buf = malloc(blocksize);

    if (buf != NULL)
    {
        uint32_t addr = (uint32_t) buf;
        // cast so that the argument matches the format whatever uint32_t is on this target
        printf("malloc good, bk=%d addr=%08lx", blocksize, (unsigned long) addr);
        free(buf);
        blocksize += 200;
    }
    else
    {
        printf("malloc failed, bk=%d", blocksize);
    }
}

I am allocating a 1000 byte block, freeing it, and then repeating with a bigger one. It is supposed to fail around 57k bytes. The base of the heap area is shown correctly at 0x2000fb68. The top is 0x2001e000, which is 58520 bytes of heap space.

The output is:

malloc good, bk=1000 addr=2000fb68
malloc good, bk=1200 addr=2000fb68
malloc good, bk=1400 addr=2000fb68
...
malloc good, bk=29400 addr=2000fb68
malloc good, bk=29600 addr=2000fb68
malloc good, bk=29800 addr=2000fb68

It fails at 29800+200 bytes. Way too soon.

If I increment the blocksize by 1000 instead of 200, I get the same result, so it isn't caused by the number of times malloc/free get called.

If I start with a blocksize of 57000, I get

malloc good, bk=57000 addr=2000fb68
malloc good, bk=57200 addr=2000fb68
malloc good, bk=57400 addr=2000fb68
malloc good, bk=57600 addr=2000fb68
malloc good, bk=57800 addr=2000fb68
malloc good, bk=58000 addr=2000fb68
malloc good, bk=58200 addr=2000fb68
malloc good, bk=58400 addr=2000fb68

which is correct!

Working downwards from 57000 I find that the lowest initial blocksize which reaches the correct max blocksize is 55000. 54000 doesn't cut the mustard.

This heap has been working for years, with TLS allocating and freeing 48k blocks, and of course (looking at the above) that works fine, millions of times (literally tested thus). But always 48k! I am working through a list of final things to test because I am writing up some user docs for this thing, and I had a hunch that this was never tested.

Can someone please tell me whether I need to go watch some youtube videos on how to write C, or I need to replace the shitty heap code :)

Same result is obtained if there is no debug output until failure, and incrementing with no delays, in 1 byte blocksize increments.

FWIW the malloc+free time is 10 microseconds, which is not bad considering there are four mutex calls there (FreeRTOS).

====

Digging around for alternative malloc code, and particularly looking around to see whether I have a bug in _sbrk, I came across this
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c
where it states

Quote

Because freed chunks may be overwritten with bookkeeping fields, this
malloc will often die when freed memory is overwritten by user
programs.  This can be very effective (albeit in an annoying way)
in helping track down dangling pointers.

That is quite a huge limitation on heap usage, no?

On _sbrk, I found e.g. this
https://github.com/zephyrproject-rtos/zephyr/blob/main/lib/libc/newlib/libc-hooks.c
which the original ST lib was based on, and which stupidly sets the heap limit at the current value of SP. This is dumb; I am setting it at the lowest stack address.

Digging around some more, I found this
https://stackoverflow.com/questions/39088598/malloc-in-newlib-does-it-waste-memory-after-one-big-failure-allocation
which is not quite the above problem, but it shows that just because code is old doesn't mean it isn't buggy, and one has to be careful picking up sourcecode from around the net, especially when it is as massively complicated as malloc is. Even the smallest lib is about 10k lines of C. Also, it turns out that my ST lib newlib malloc doesn't have that problem:

10909: malloc failed, bk=60000
11911: malloc failed, bk=59000
12913: malloc good, bk=58000 addr=2000fb68
12915: malloc good, bk=57000 addr=2000fb68
12917: malloc good, bk=56000 addr=2000fb68
12919: malloc good, bk=55000 addr=2000fb68
« Last Edit: April 29, 2023, 07:56:52 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #1 on: April 30, 2023, 07:21:12 am »
A bit more data... with initial blocksize of 54000 it fails to use up all the RAM. With 55000 it works ok.

With 54000, I get the following (showing sbrk input and output; numbers on the left are a milliseconds counter)

10908: HEAP TEST
10910: malloc sbrk, incr=54024, ret=2000fb60
10912: malloc sbrk, incr=408, ret=2001ce68
10914: malloc good, bk=54000 addr=2000fb68
10926: malloc good, bk=54100 addr=2000fb68
10938: malloc good, bk=54200 addr=2000fb68
10950: malloc good, bk=54300 addr=2000fb68
10962: malloc good, bk=54400 addr=2000fb68
10974: malloc sbrk, incr=57344, ret=ffffffff
10976: malloc failed, bk=54500

With 55000 I get this

10908: HEAP TEST
10910: malloc sbrk, incr=55024, ret=2000fb60
10912: malloc sbrk, incr=3504, ret=2001d250
10914: malloc good, bk=55000 addr=2000fb68
10926: malloc good, bk=55100 addr=2000fb68
10938: malloc good, bk=55200 addr=2000fb68
10950: malloc good, bk=55300 addr=2000fb68
10962: malloc good, bk=55400 addr=2000fb68
10974: malloc good, bk=55500 addr=2000fb68
10986: malloc good, bk=55600 addr=2000fb68
10998: malloc good, bk=55700 addr=2000fb68
11010: malloc good, bk=55800 addr=2000fb68
11022: malloc good, bk=55900 addr=2000fb68
11034: malloc good, bk=56000 addr=2000fb68
11046: malloc good, bk=56100 addr=2000fb68
11058: malloc good, bk=56200 addr=2000fb68
11070: malloc good, bk=56300 addr=2000fb68
11082: malloc good, bk=56400 addr=2000fb68
11094: malloc good, bk=56500 addr=2000fb68
11106: malloc good, bk=56600 addr=2000fb68
11118: malloc good, bk=56700 addr=2000fb68
11130: malloc good, bk=56800 addr=2000fb68
11142: malloc good, bk=56900 addr=2000fb68
11154: malloc good, bk=57000 addr=2000fb68
11166: malloc good, bk=57100 addr=2000fb68
11178: malloc good, bk=57200 addr=2000fb68
11190: malloc good, bk=57300 addr=2000fb68
11202: malloc good, bk=57400 addr=2000fb68
11214: malloc good, bk=57500 addr=2000fb68
11226: malloc good, bk=57600 addr=2000fb68
11238: malloc good, bk=57700 addr=2000fb68
11250: malloc good, bk=57800 addr=2000fb68
11262: malloc good, bk=57900 addr=2000fb68
11274: malloc good, bk=58000 addr=2000fb68
11286: malloc good, bk=58100 addr=2000fb68
11298: malloc good, bk=58200 addr=2000fb68
11310: malloc good, bk=58300 addr=2000fb68
11322: malloc good, bk=58400 addr=2000fb68
11334: malloc good, bk=58500 addr=2000fb68
11346: malloc sbrk, incr=61440, ret=ffffffff
11348: malloc failed, bk=58600

I reckon it is obvious, but can't get my head around it :)

This is _sbrk

Code: [Select]

// This is used by malloc().
// The original Newlib version of this, on which the ST code was based
// https://github.com/zephyrproject-rtos/zephyr/blob/main/lib/libc/newlib/libc-hooks.c
// allowed the heap to go all the way up to the current SP value, which is stupid.
// This one sets the limit at the base (lowest memory address) of the stack area.

#include <errno.h>
#include <stdio.h>
#include <sys/types.h> // caddr_t

caddr_t _sbrk(int incr)
{
    // These two are defined in the linkfile
    extern char end asm("_end"); // end of BSS
    extern char top asm("_top"); // base of the general stack

    static char *heap_end; // this gets initialised to NULL by C convention
    char *prev_heap_end;   // set on every call: the break as it was before this request

    // This sets heap_end to end of BSS, on the first call to _sbrk
    if (heap_end == NULL)
        heap_end = &end;

    prev_heap_end = heap_end;

    // top = top of RAM minus size of stack
    if ((heap_end + incr) > &top)
    {
        errno = ENOMEM;              // not apparently used by anything
        prev_heap_end = (char *) -1; // failure value; heap_end is left unchanged
    }
    else
    {
        heap_end += incr;
    }

    printf("malloc sbrk, incr=%d, ret=%08x", incr, (unsigned int) prev_heap_end); // TODO

    return (caddr_t) prev_heap_end;
}
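A sketch of how the same bounds check could be exercised on a PC before trusting it on the target. Everything named here is an assumption made for illustration: `arena`, `ARENA_SIZE` and `arena_sbrk` are hypothetical stand-ins for the `_end`..`_top` region and for `_sbrk`, and the debug printf is left out. It is not the code running in the product.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 1024u            /* hypothetical stand-in for &top - &end */

static char arena[ARENA_SIZE];      /* hypothetical stand-in for the heap region */
static char *arena_end = arena;     /* current break, like heap_end above */

/* Same contract as _sbrk: return the old break, or (void *)-1 if the
 * request would leave the region. The break does not move on failure. */
void *arena_sbrk(int incr)
{
    char *prev = arena_end;
    ptrdiff_t used = arena_end - arena;

    /* Invariant: the break always stays inside the arena. */
    assert(used >= 0 && (size_t)used <= ARENA_SIZE);

    /* Compare sizes instead of pointers so a huge incr cannot wrap around. */
    if (incr > 0 && (size_t)incr > ARENA_SIZE - (size_t)used)
        return (void *)-1;
    if (incr < 0 && (ptrdiff_t)incr < -used)
        return (void *)-1;

    arena_end += incr;
    return prev;
}
```

Calling `arena_sbrk(0)` returns the current break without moving it, which is a convenient way to check that a refused request really left everything untouched.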


If I start with a blocksize of 1000, we get a lot more output... sorry about that

Code: [Select]
10908: HEAP TEST
10910: malloc sbrk, incr=1024, ret=2000fb60
10912: malloc sbrk, incr=160, ret=2000ff60
10914: malloc good, bk=1000 addr=2000fb68
10926: malloc good, bk=1100 addr=2000fb68
10938: malloc sbrk, incr=4096, ret=20010000
10940: malloc good, bk=1200 addr=2000fb68
10952: malloc good, bk=1300 addr=2000fb68
10964: malloc good, bk=1400 addr=2000fb68
10976: malloc good, bk=1500 addr=2000fb68
10988: malloc good, bk=1600 addr=2000fb68
11000: malloc good, bk=1700 addr=2000fb68
11012: malloc good, bk=1800 addr=2000fb68
11024: malloc good, bk=1900 addr=2000fb68
11036: malloc good, bk=2000 addr=2000fb68
11048: malloc good, bk=2100 addr=2000fb68
11060: malloc good, bk=2200 addr=2000fb68
11072: malloc good, bk=2300 addr=2000fb68
11084: malloc good, bk=2400 addr=2000fb68
11096: malloc good, bk=2500 addr=2000fb68
11108: malloc good, bk=2600 addr=2000fb68
11120: malloc good, bk=2700 addr=2000fb68
11132: malloc good, bk=2800 addr=2000fb68
11144: malloc good, bk=2900 addr=2000fb68
11156: malloc good, bk=3000 addr=2000fb68
11168: malloc good, bk=3100 addr=2000fb68
11180: malloc good, bk=3200 addr=2000fb68
11192: malloc good, bk=3300 addr=2000fb68
11204: malloc good, bk=3400 addr=2000fb68
11216: malloc good, bk=3500 addr=2000fb68
11228: malloc good, bk=3600 addr=2000fb68
11240: malloc good, bk=3700 addr=2000fb68
11252: malloc good, bk=3800 addr=2000fb68
11264: malloc good, bk=3900 addr=2000fb68
11276: malloc good, bk=4000 addr=2000fb68
11288: malloc good, bk=4100 addr=2000fb68
11300: malloc good, bk=4200 addr=2000fb68
11312: malloc good, bk=4300 addr=2000fb68
11324: malloc good, bk=4400 addr=2000fb68
11336: malloc good, bk=4500 addr=2000fb68
11348: malloc good, bk=4600 addr=2000fb68
11360: malloc good, bk=4700 addr=2000fb68
11372: malloc good, bk=4800 addr=2000fb68
11384: malloc good, bk=4900 addr=2000fb68
11396: malloc good, bk=5000 addr=2000fb68
11408: malloc good, bk=5100 addr=2000fb68
11420: malloc good, bk=5200 addr=2000fb68
11432: malloc sbrk, incr=8192, ret=20011000
11434: malloc good, bk=5300 addr=2000fb68
11446: malloc good, bk=5400 addr=2000fb68
11458: malloc good, bk=5500 addr=2000fb68
11470: malloc good, bk=5600 addr=2000fb68
11482: malloc good, bk=5700 addr=2000fb68
11494: malloc good, bk=5800 addr=2000fb68
11506: malloc good, bk=5900 addr=2000fb68
11518: malloc good, bk=6000 addr=2000fb68
11530: malloc good, bk=6100 addr=2000fb68
11542: malloc good, bk=6200 addr=2000fb68
11554: malloc good, bk=6300 addr=2000fb68
11566: malloc good, bk=6400 addr=2000fb68
11578: malloc good, bk=6500 addr=2000fb68
11590: malloc good, bk=6600 addr=2000fb68
11602: malloc good, bk=6700 addr=2000fb68
11614: malloc good, bk=6800 addr=2000fb68
11626: malloc good, bk=6900 addr=2000fb68
11638: malloc good, bk=7000 addr=2000fb68
11650: malloc good, bk=7100 addr=2000fb68
11662: malloc good, bk=7200 addr=2000fb68
11674: malloc good, bk=7300 addr=2000fb68
11686: malloc good, bk=7400 addr=2000fb68
11698: malloc good, bk=7500 addr=2000fb68
11710: malloc good, bk=7600 addr=2000fb68
11722: malloc good, bk=7700 addr=2000fb68
11734: malloc good, bk=7800 addr=2000fb68
11746: malloc good, bk=7900 addr=2000fb68
11758: malloc good, bk=8000 addr=2000fb68
11770: malloc good, bk=8100 addr=2000fb68
11782: malloc good, bk=8200 addr=2000fb68
11794: malloc good, bk=8300 addr=2000fb68
11806: malloc good, bk=8400 addr=2000fb68
11818: malloc good, bk=8500 addr=2000fb68
11830: malloc good, bk=8600 addr=2000fb68
11842: malloc good, bk=8700 addr=2000fb68
11854: malloc good, bk=8800 addr=2000fb68
11866: malloc good, bk=8900 addr=2000fb68
11878: malloc good, bk=9000 addr=2000fb68
11890: malloc good, bk=9100 addr=2000fb68
11902: malloc good, bk=9200 addr=2000fb68
11914: malloc good, bk=9300 addr=2000fb68
11926: malloc good, bk=9400 addr=2000fb68
11938: malloc good, bk=9500 addr=2000fb68
11950: malloc good, bk=9600 addr=2000fb68
11962: malloc good, bk=9700 addr=2000fb68
11974: malloc good, bk=9800 addr=2000fb68
11986: malloc good, bk=9900 addr=2000fb68
11998: malloc good, bk=10000 addr=2000fb68
12010: malloc good, bk=10100 addr=2000fb68
12022: malloc good, bk=10200 addr=2000fb68
12034: malloc good, bk=10300 addr=2000fb68
12046: malloc good, bk=10400 addr=2000fb68
12058: malloc good, bk=10500 addr=2000fb68
12070: malloc good, bk=10600 addr=2000fb68
12082: malloc good, bk=10700 addr=2000fb68
12094: malloc good, bk=10800 addr=2000fb68
12106: malloc good, bk=10900 addr=2000fb68
12118: malloc good, bk=11000 addr=2000fb68
12130: malloc good, bk=11100 addr=2000fb68
12142: malloc good, bk=11200 addr=2000fb68
12154: malloc good, bk=11300 addr=2000fb68
12166: malloc good, bk=11400 addr=2000fb68
12178: malloc good, bk=11500 addr=2000fb68
12190: malloc good, bk=11600 addr=2000fb68
12202: malloc good, bk=11700 addr=2000fb68
12214: malloc good, bk=11800 addr=2000fb68
12226: malloc good, bk=11900 addr=2000fb68
12238: malloc good, bk=12000 addr=2000fb68
12250: malloc good, bk=12100 addr=2000fb68
12262: malloc good, bk=12200 addr=2000fb68
12274: malloc good, bk=12300 addr=2000fb68
12286: malloc good, bk=12400 addr=2000fb68
12298: malloc good, bk=12500 addr=2000fb68
12310: malloc good, bk=12600 addr=2000fb68
12322: malloc good, bk=12700 addr=2000fb68
12334: malloc good, bk=12800 addr=2000fb68
12346: malloc good, bk=12900 addr=2000fb68
12358: malloc good, bk=13000 addr=2000fb68
12370: malloc good, bk=13100 addr=2000fb68
12382: malloc good, bk=13200 addr=2000fb68
12394: malloc good, bk=13300 addr=2000fb68
12406: malloc good, bk=13400 addr=2000fb68
12418: malloc sbrk, incr=16384, ret=20013000
12420: malloc good, bk=13500 addr=2000fb68
12432: malloc good, bk=13600 addr=2000fb68
12444: malloc good, bk=13700 addr=2000fb68
12456: malloc good, bk=13800 addr=2000fb68
12468: malloc good, bk=13900 addr=2000fb68
12480: malloc good, bk=14000 addr=2000fb68
12492: malloc good, bk=14100 addr=2000fb68
12504: malloc good, bk=14200 addr=2000fb68
12516: malloc good, bk=14300 addr=2000fb68
12528: malloc good, bk=14400 addr=2000fb68
12540: malloc good, bk=14500 addr=2000fb68
12552: malloc good, bk=14600 addr=2000fb68
12564: malloc good, bk=14700 addr=2000fb68
12576: malloc good, bk=14800 addr=2000fb68
12588: malloc good, bk=14900 addr=2000fb68
12600: malloc good, bk=15000 addr=2000fb68
12612: malloc good, bk=15100 addr=2000fb68
12624: malloc good, bk=15200 addr=2000fb68
12636: malloc good, bk=15300 addr=2000fb68
12648: malloc good, bk=15400 addr=2000fb68
12660: malloc good, bk=15500 addr=2000fb68
12672: malloc good, bk=15600 addr=2000fb68
12684: malloc good, bk=15700 addr=2000fb68
12696: malloc good, bk=15800 addr=2000fb68
12708: malloc good, bk=15900 addr=2000fb68
12720: malloc good, bk=16000 addr=2000fb68
12732: malloc good, bk=16100 addr=2000fb68
12744: malloc good, bk=16200 addr=2000fb68
12756: malloc good, bk=16300 addr=2000fb68
12768: malloc good, bk=16400 addr=2000fb68
12780: malloc good, bk=16500 addr=2000fb68
12792: malloc good, bk=16600 addr=2000fb68
12804: malloc good, bk=16700 addr=2000fb68
12816: malloc good, bk=16800 addr=2000fb68
12828: malloc good, bk=16900 addr=2000fb68
12840: malloc good, bk=17000 addr=2000fb68
12852: malloc good, bk=17100 addr=2000fb68
12864: malloc good, bk=17200 addr=2000fb68
12876: malloc good, bk=17300 addr=2000fb68
12888: malloc good, bk=17400 addr=2000fb68
12900: malloc good, bk=17500 addr=2000fb68
12912: malloc good, bk=17600 addr=2000fb68
12924: malloc good, bk=17700 addr=2000fb68
12936: malloc good, bk=17800 addr=2000fb68
12948: malloc good, bk=17900 addr=2000fb68
12960: malloc good, bk=18000 addr=2000fb68
12972: malloc good, bk=18100 addr=2000fb68
12984: malloc good, bk=18200 addr=2000fb68
12996: malloc good, bk=18300 addr=2000fb68
13008: malloc good, bk=18400 addr=2000fb68
13020: malloc good, bk=18500 addr=2000fb68
13032: malloc good, bk=18600 addr=2000fb68
13044: malloc good, bk=18700 addr=2000fb68
13056: malloc good, bk=18800 addr=2000fb68
13068: malloc good, bk=18900 addr=2000fb68
13080: malloc good, bk=19000 addr=2000fb68
13092: malloc good, bk=19100 addr=2000fb68
13104: malloc good, bk=19200 addr=2000fb68
13116: malloc good, bk=19300 addr=2000fb68
13128: malloc good, bk=19400 addr=2000fb68
13140: malloc good, bk=19500 addr=2000fb68
13152: malloc good, bk=19600 addr=2000fb68
13164: malloc good, bk=19700 addr=2000fb68
13176: malloc good, bk=19800 addr=2000fb68
13188: malloc good, bk=19900 addr=2000fb68
13200: malloc good, bk=20000 addr=2000fb68
13212: malloc good, bk=20100 addr=2000fb68
13224: malloc good, bk=20200 addr=2000fb68
13236: malloc good, bk=20300 addr=2000fb68
13248: malloc good, bk=20400 addr=2000fb68
13260: malloc good, bk=20500 addr=2000fb68
13272: malloc good, bk=20600 addr=2000fb68
13284: malloc good, bk=20700 addr=2000fb68
13296: malloc good, bk=20800 addr=2000fb68
13308: malloc good, bk=20900 addr=2000fb68
13320: malloc good, bk=21000 addr=2000fb68
13332: malloc good, bk=21100 addr=2000fb68
13344: malloc good, bk=21200 addr=2000fb68
13356: malloc good, bk=21300 addr=2000fb68
13368: malloc good, bk=21400 addr=2000fb68
13380: malloc good, bk=21500 addr=2000fb68
13392: malloc good, bk=21600 addr=2000fb68
13404: malloc good, bk=21700 addr=2000fb68
13416: malloc good, bk=21800 addr=2000fb68
13428: malloc good, bk=21900 addr=2000fb68
13440: malloc good, bk=22000 addr=2000fb68
13452: malloc good, bk=22100 addr=2000fb68
13464: malloc good, bk=22200 addr=2000fb68
13476: malloc good, bk=22300 addr=2000fb68
13488: malloc good, bk=22400 addr=2000fb68
13500: malloc good, bk=22500 addr=2000fb68
13512: malloc good, bk=22600 addr=2000fb68
13524: malloc good, bk=22700 addr=2000fb68
13536: malloc good, bk=22800 addr=2000fb68
13548: malloc good, bk=22900 addr=2000fb68
13560: malloc good, bk=23000 addr=2000fb68
13572: malloc good, bk=23100 addr=2000fb68
13584: malloc good, bk=23200 addr=2000fb68
13596: malloc good, bk=23300 addr=2000fb68
13608: malloc good, bk=23400 addr=2000fb68
13620: malloc good, bk=23500 addr=2000fb68
13632: malloc good, bk=23600 addr=2000fb68
13644: malloc good, bk=23700 addr=2000fb68
13656: malloc good, bk=23800 addr=2000fb68
13668: malloc good, bk=23900 addr=2000fb68
13680: malloc good, bk=24000 addr=2000fb68
13692: malloc good, bk=24100 addr=2000fb68
13704: malloc good, bk=24200 addr=2000fb68
13716: malloc good, bk=24300 addr=2000fb68
13728: malloc good, bk=24400 addr=2000fb68
13740: malloc good, bk=24500 addr=2000fb68
13752: malloc good, bk=24600 addr=2000fb68
13764: malloc good, bk=24700 addr=2000fb68
13776: malloc good, bk=24800 addr=2000fb68
13788: malloc good, bk=24900 addr=2000fb68
13800: malloc good, bk=25000 addr=2000fb68
13812: malloc good, bk=25100 addr=2000fb68
13824: malloc good, bk=25200 addr=2000fb68
13836: malloc good, bk=25300 addr=2000fb68
13848: malloc good, bk=25400 addr=2000fb68
13860: malloc good, bk=25500 addr=2000fb68
13872: malloc good, bk=25600 addr=2000fb68
13884: malloc good, bk=25700 addr=2000fb68
13896: malloc good, bk=25800 addr=2000fb68
13908: malloc good, bk=25900 addr=2000fb68
13920: malloc good, bk=26000 addr=2000fb68
13932: malloc good, bk=26100 addr=2000fb68
13944: malloc good, bk=26200 addr=2000fb68
13956: malloc good, bk=26300 addr=2000fb68
13968: malloc good, bk=26400 addr=2000fb68
13980: malloc good, bk=26500 addr=2000fb68
13992: malloc good, bk=26600 addr=2000fb68
14004: malloc good, bk=26700 addr=2000fb68
14016: malloc good, bk=26800 addr=2000fb68
14028: malloc good, bk=26900 addr=2000fb68
14040: malloc good, bk=27000 addr=2000fb68
14052: malloc good, bk=27100 addr=2000fb68
14064: malloc good, bk=27200 addr=2000fb68
14076: malloc good, bk=27300 addr=2000fb68
14088: malloc good, bk=27400 addr=2000fb68
14100: malloc good, bk=27500 addr=2000fb68
14112: malloc good, bk=27600 addr=2000fb68
14124: malloc good, bk=27700 addr=2000fb68
14136: malloc good, bk=27800 addr=2000fb68
14148: malloc good, bk=27900 addr=2000fb68
14160: malloc good, bk=28000 addr=2000fb68
14172: malloc good, bk=28100 addr=2000fb68
14184: malloc good, bk=28200 addr=2000fb68
14196: malloc good, bk=28300 addr=2000fb68
14208: malloc good, bk=28400 addr=2000fb68
14220: malloc good, bk=28500 addr=2000fb68
14232: malloc good, bk=28600 addr=2000fb68
14244: malloc good, bk=28700 addr=2000fb68
14256: malloc good, bk=28800 addr=2000fb68
14268: malloc good, bk=28900 addr=2000fb68
14280: malloc good, bk=29000 addr=2000fb68
14292: malloc good, bk=29100 addr=2000fb68
14304: malloc good, bk=29200 addr=2000fb68
14316: malloc good, bk=29300 addr=2000fb68
14328: malloc good, bk=29400 addr=2000fb68
14340: malloc good, bk=29500 addr=2000fb68
14352: malloc good, bk=29600 addr=2000fb68
14364: malloc good, bk=29700 addr=2000fb68
14376: malloc good, bk=29800 addr=2000fb68
14388: malloc sbrk, incr=32768, ret=ffffffff
14390: malloc failed, bk=29900
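One possible reading of these logs, offered as a guess from the numbers and not as something confirmed from the newlib source: when the free space at the top of the heap is too small, malloc seems to ask _sbrk for the whole padded request rounded up to a 4096 byte page, not just for the missing bytes, and _sbrk refuses the increment outright if it does not fit. The small model below replays that idea. The constants `PAGE`, `HDR`, `ALIGN` and `MINSIZE` and all function names are assumptions chosen to match the logged `incr` values.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses taken from the logs above. */
#define HEAP_BASE  0x2000fb60u   /* first value returned by _sbrk */
#define HEAP_LIMIT 0x2001e000u   /* _top */

/* ASSUMED allocator parameters, inferred from the logged incr values. */
#define PAGE    4096u   /* sbrk requests after the first are rounded to this */
#define HDR     4u      /* per-block size field */
#define ALIGN   8u      /* block sizes are multiples of this */
#define MINSIZE 16u     /* extra bytes that must remain in the top chunk */

struct model {
    uint32_t brk;       /* current break */
    uint32_t top_size;  /* bytes between HEAP_BASE and brk, all free */
    int      started;   /* 0 until the first successful sbrk */
};

/* _sbrk as posted: all or nothing. Returns 1 on success. */
static int model_sbrk(struct model *m, uint32_t incr)
{
    if (m->brk + incr > HEAP_LIMIT)
        return 0;
    m->brk += incr;
    return 1;
}

/* Returns 1 if malloc(request) followed by free() would succeed. */
int model_malloc_free(struct model *m, uint32_t request)
{
    uint32_t need = ((request + HDR + ALIGN - 1u) & ~(ALIGN - 1u)) + MINSIZE;

    if (m->started && need <= m->top_size)
        return 1;                               /* fits, no sbrk call */

    uint32_t incr = need;                       /* the full request, not the shortfall */
    if (m->started)
        incr = (incr + PAGE - 1u) & ~(PAGE - 1u);
    if (!model_sbrk(m, incr))
        return 0;
    if (!m->started) {
        /* Second call seen in the logs: pad the break up to a page boundary. */
        model_sbrk(m, (PAGE - (m->brk & (PAGE - 1u))) & (PAGE - 1u));
        m->started = 1;
    }
    m->top_size = m->brk - HEAP_BASE;
    return 1;
}

/* Last block size that succeeds when starting at `start` and growing by `step`. */
uint32_t model_last_good(uint32_t start, uint32_t step)
{
    struct model m = { HEAP_BASE, 0, 0 };
    uint32_t last = 0;

    assert(step > 0);
    for (uint32_t bk = start; model_malloc_free(&m, bk); bk += step)
        last = bk;
    return last;
}
```

Under these assumptions `model_last_good(1000, 100)` gives 29800, `model_last_good(54000, 100)` gives 54400 and `model_last_good(55000, 100)` gives 58500, the same last good sizes as in the three logs. In the 1000 case the break sits at 0x20017000 when the 29900 request arrives, the model asks for 32768 more bytes, and 0x2001f000 is past `_top`, so the request is refused even though only a few dozen bytes were missing.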
« Last Edit: April 30, 2023, 07:26:15 am by peter-h »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4074
  • Country: us
Re: Help needed with some heap test code
« Reply #2 on: April 30, 2023, 03:29:39 pm »
One possibility is that malloc() or free() themselves are occasionally allocating memory to manage the free list, and if that allocation lands in the middle of your memory segment it then limits the maximum size allocation that can succeed.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #3 on: April 30, 2023, 04:54:50 pm »
Yeah... but that would be a really serious bug, no?

I wonder if this
https://stackoverflow.com/questions/39088598/malloc-in-newlib-does-it-waste-memory-after-one-big-failure-allocation/76138157#76138157
is related. It refers to mallocr.c. The problem described there I don't have (I tested it). I made a post there but some mod deleted it so I don't want to waste my time on that weird site where most replies are totally off the mark anyway.

In my product, the heap is used to allocate some buffers at startup (according to product options in a config file) and these are never freed. Then there is TLS allocating 48k and freeing it when the session ends, and this repeats many times (and works ok, 48k every time).

What makes it fail is allocating increasing size blocks.
« Last Edit: April 30, 2023, 04:57:02 pm by peter-h »
 

Offline DavidAlfa

  • Super Contributor
  • ***
  • Posts: 6469
  • Country: es
Re: Help needed with some heap test code
« Reply #4 on: April 30, 2023, 05:23:28 pm »
Yes, for some reason it discards the previously allocated block if the new request is larger.

Thus, first allocating 1k+1k, freeing them, and then allocating a 2 KB block won't use the area previously used by 1k+1k.
But if you allocate the 2 KB block first and free it, that area will then be reused by 1k+1k.
So my workaround (example) was to allocate the entire RAM at boot, then free it.
Now malloc is aware of the entire pool and behaves as expected.
Of course, fragmentation will be a different issue.
 
The following users thanked this post: peter-h

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 9551
  • Country: fi
Re: Help needed with some heap test code
« Reply #5 on: April 30, 2023, 05:32:08 pm »
The whole point of malloc() is it stores metadata for free().

If all you need is one-time allocation of things depending on configuration (which is only read at boot time) and no free, writing your own "replacement" for malloc would be an order of magnitude simpler than doing your own malloc from scratch. In other words, you would have your own reliable and less wasteful allocation system working in less time than it takes to debug Newlib's malloc.

Something with this idea:
Code: [Select]
   uint8_t my_buf[123456];
   uint8_t* cur_ptr = my_buf;   // byte pointer: arithmetic on void* is not standard C
   if(config.thing_a_enabled)
   {
      // equivalent of malloc(555):
      thing_a_buf = cur_ptr;
      cur_ptr += 555;
   }
   if(config.thing_b_enabled)
    ....

Add alignment and range check and that's pretty much it, no?
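A minimal sketch of that idea with the alignment and range check added. The names `pool`, `POOL_SIZE` and `boot_alloc` are hypothetical, and the pool size is a placeholder; there is deliberately no free, so this only suits buffers that live until reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4096u                 /* hypothetical; size to suit the product */

/* Aligning the pool itself means offsets into it are aligned as well. */
static _Alignas(max_align_t) uint8_t pool[POOL_SIZE];
static size_t pool_used;                /* bytes handed out so far */

/* Hand out `size` bytes aligned to `align`, which must be a power of two
 * no larger than the pool's own alignment. Returns NULL when the pool is
 * exhausted. Blocks are never returned to the pool. */
void *boot_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(align <= _Alignof(max_align_t));

    size_t start = (pool_used + align - 1) & ~(align - 1);   /* round up */

    /* Range check, written so that start + size cannot overflow. */
    if (start > POOL_SIZE || size > POOL_SIZE - start)
        return NULL;

    pool_used = start + size;
    return &pool[start];
}
```

The boot code would then call something like `thing_a_buf = boot_alloc(555, 1);` once per enabled option and treat a NULL return as a configuration error.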

And if you have a few combinations only, then it's even simpler to just use a union.
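For the union variant, a sketch might look like the following. The two layouts and their sizes are invented for illustration; the point is only that configurations which are never active together can share one statically sized region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical buffer layouts, one per product configuration. */
struct config_a_bufs { uint8_t rx[2048]; uint8_t tx[1024]; };
struct config_b_bufs { uint8_t log[3000]; };

/* Only one configuration is active at a time, so they share storage. */
union product_bufs {
    struct config_a_bufs a;
    struct config_b_bufs b;
};

/* The compiler sizes this to the largest member; no allocator involved. */
static union product_bufs bufs;

/* Compile-time check that every layout fits in the shared region. */
static_assert(sizeof(union product_bufs) >= sizeof(struct config_a_bufs),
              "config A must fit");
static_assert(sizeof(union product_bufs) >= sizeof(struct config_b_bufs),
              "config B must fit");
```

The RAM cost is visible in the map file at link time, which is the main attraction compared with finding out at run time that an allocation failed.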
 
The following users thanked this post: DiTBho

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #6 on: April 30, 2023, 06:44:51 pm »
Quote
Yes, for some reason it discards the previously allocated block if the new request is larger.

*I* am always freeing each block immediately afterwards.

The interesting observation is that each newly allocated block is being allocated at the base of the heap area. I've read that a lot of heap code doesn't do that. So that is a plus, which should prevent the behaviour described here.

Quote
If all you need is one-time allocation of things depending on configuration

See above; that is true but I also have that 48k TLS block. It just happens to be the same size every time, which is why it works ;)

Quote
So my workaround (example) was to allocate the entire RAM at boot, then free it.
Now malloc is aware of the entire pool and behaves as expected.

It works!!! This is the code I used; same as yours basically, and it will go in main.c before any heap usage:

Code: [Select]
/*
 * This establishes the biggest block which can be allocated on the heap, allocates
 * it and frees it. This is necessary to work around a bug in ST's Newlib heap code
 * which fails to use the whole heap area if some smaller blocks were allocated
 * previously.
 * One has to start with the max heap space (&top-&end) and loop down from there
 * because we don't know exactly what initial allocation overhead there is in the
 * heap code. Actually it is only about 5x8 bytes.
 * There is a limit counter to prevent some huge overrun if top<end, etc.
 */

extern char end asm("_end"); // end of BSS (from linker script)
extern char top asm("_top"); // base of the general stack (as above)

char *heap_ptr = NULL;
uint32_t test_block = (uint32_t) (&top - &end); // max theoretical heap area
uint32_t test_limit = 1000; // counter to set an upper limit on looping

while ((heap_ptr == NULL) && (test_limit > 0))
{
    heap_ptr = malloc(test_block);
    test_block -= 8;
    test_limit--;
}

free(heap_ptr); // free(NULL) does nothing, so this is safe even if every attempt failed
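The same loop could be wrapped in a function that reports how much it managed to claim, which makes it easy to log at boot or to try out on a PC. This is a sketch only; `heap_prime` and its parameters are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Try to allocate the largest block not exceeding max_bytes, shrinking the
 * request by `step` bytes after each failure, for at most `tries` attempts.
 * The block is freed again straight away. Returns the size that succeeded,
 * or 0 if nothing could be allocated. */
size_t heap_prime(size_t max_bytes, size_t step, unsigned tries)
{
    assert(step > 0);

    size_t size = max_bytes;
    while (tries-- > 0 && size > 0) {
        void *p = malloc(size);
        if (p != NULL) {
            free(p);
            return size;
        }
        if (size <= step)
            break;              /* cannot shrink the request any further */
        size -= step;
    }
    return 0;
}
```

On the target the call corresponding to the code above would be `heap_prime((size_t)(&top - &end), 8, 1000)`, and a return value of 0 would indicate that the heap could not be claimed at all.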

It's funny that you came across this bug too

Quote
// Allocate max possible ram, then release it. This fills the heap pool and avoids internal fragmentation due (ST's?) poor malloc implementation.

I also notice that _sbrk is not getting called until the very end when it fails i.e. I am no longer seeing the various calls with 4096, 8192, etc. The first five sbrk calls below are from the above loop. The rest is as previously

10908: HEAP TEST
10910: malloc sbrk, incr=58552, ret=ffffffff
10912: malloc sbrk, incr=58544, ret=ffffffff
10914: malloc sbrk, incr=58536, ret=ffffffff
10916: malloc sbrk, incr=58528, ret=2000fb60
10918: malloc sbrk, incr=0, ret=2001e000
10920: malloc good, bk=1000 addr=2000fb68
10932: malloc good, bk=1100 addr=2000fb68
10944: malloc good, bk=1200 addr=2000fb68
10956: malloc good, bk=1300 addr=2000fb68
10968: malloc good, bk=1400 addr=2000fb68
10980: malloc good, bk=1500 addr=2000fb68
...
17808: malloc good, bk=58400 addr=2000fb68
17820: malloc good, bk=58500 addr=2000fb68
17832: malloc sbrk, incr=61440, ret=ffffffff
17834: malloc failed, bk=58600


« Last Edit: April 30, 2023, 07:13:27 pm by peter-h »
 

Offline ejeffrey

  • Super Contributor
  • ***
  • Posts: 4074
  • Country: us
Re: Help needed with some heap test code
« Reply #7 on: April 30, 2023, 07:23:01 pm »
Quote
Yeah... but that would be a really serious bug, no?

Depends on what you mean by bug.  As long as it returns 0 when unable to fulfill your request it is not doing anything incorrect.  Doesn't mean it works for every situation.

Quote
I wonder if this
https://stackoverflow.com/questions/39088598/malloc-in-newlib-does-it-waste-memory-after-one-big-failure-allocation/76138157#76138157
is related. It refers to mallocr.c. The problem described there I don't have (I tested it). I made a post there but some mod deleted it so I don't want to waste my time on that weird site where most replies are totally off the mark anyway.

If the question was correctly answered, including a reference to a bug fix submitted to the upstream package, and you posted a comment about a different bug, of course it got deleted.  That is exactly working as intended.  It is part of how SO keeps their answers mostly focused, and as a result is 100x more useful as a search result than discussion forums like eevblog.

Quote
What makes it fail is allocating increasing size blocks.

It definitely sounds like non-ideal behavior, and possibly a bug in newlib.  However, heap fragmentation is an unsolvable problem in this environment unless you are willing to make every allocation relocatable by using handles (which carries its own costs and limitations).  There is always an allocation pattern that is going to leave most of the memory free but unable to handle even a medium sized request.  malloc() and free() are normally optimized around the case of allocating objects much smaller than available memory, and doing so with relatively low memory and CPU overhead.  It's why a lot of people say "don't use dynamic memory allocation in microcontrollers" which is too general to be correct, but is understandable.
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #8 on: April 30, 2023, 07:48:29 pm »
The problem is that if say you design your RAM to have 60k heap space, and malloc returns a 0 for a 29k block, that is not right :)

In this case I have worked hard all around to make as much RAM as possible so that TLS is able to malloc its 48k. TLS doesn't really work with much less than that - unless you control both ends and then you can do what you like.

Fragmentation I do understand. It's an entirely different problem.

I too don't use the heap in embedded, except as a one-off at startup as mentioned previously, but in the case of TLS there isn't much of a choice. It is
- an optional feature, so a startup malloc makes sense
- it can be invoked by different RTOS tasks (it is mutexed)
- not enabled as such in a config file; it gets started up when an RTOS task invokes it

If it wasn't for the last one, and its use was enabled by a config file parameter, TLS could just use one malloc which is never freed. Yes this is debatable both ways. But fragmentation should be impossible if
- all blocks except TLS are allocated only at startup (and these are typically only a few k)
- TLS allocates and frees a 48k block (repeatedly)
- total heap space is 57k

If one had several functions like TLS and they were allocating say 10k each, and there was no mutual exclusion on these executing, then fragmentation would be very possible.

This has been another 3 day rabbit hole for me :)

Can C even do the "garbage collection" thing? AFAICT that needs double indirection on all blocks, so the source code doesn't need to care if the address of the block has been physically changed.
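C has no built-in collector, but the double indirection can be built by hand. The sketch below is a toy, with hypothetical names throughout (`h_alloc`, `h_free`, `h_deref`, `h_compact`): callers hold a handle, meaning an index into a table, and look up the real address each time, so a compaction pass is free to slide blocks down and update the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_BYTES 256u
#define MAX_HANDLES 8

static uint8_t arena[ARENA_BYTES];
static size_t  arena_top;                  /* first byte never yet handed out */

struct slot { size_t off, len; int live; };
static struct slot slots[MAX_HANDLES];     /* the indirection table */

/* Returns a handle (table index), or -1 if out of space or handles. */
int h_alloc(size_t len)
{
    if (len == 0 || len > ARENA_BYTES - arena_top)
        return -1;
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (!slots[h].live) {
            slots[h] = (struct slot){ arena_top, len, 1 };
            arena_top += len;
            return h;
        }
    }
    return -1;
}

void h_free(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && slots[h].live);
    slots[h].live = 0;                     /* leaves a hole until h_compact() */
}

/* The returned pointer is only valid until the next h_compact(). */
void *h_deref(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && slots[h].live);
    return &arena[slots[h].off];
}

/* Slide live blocks down over the holes, lowest address first. */
void h_compact(void)
{
    size_t dst = 0;
    for (;;) {
        int next = -1;                     /* lowest live block not yet moved */
        for (int h = 0; h < MAX_HANDLES; h++)
            if (slots[h].live && slots[h].off >= dst &&
                (next < 0 || slots[h].off < slots[next].off))
                next = h;
        if (next < 0)
            break;
        memmove(&arena[dst], &arena[slots[next].off], slots[next].len);
        slots[next].off = dst;
        dst += slots[next].len;
    }
    arena_top = dst;
}
```

The cost is exactly the one described: every access goes through `h_deref`, and no raw pointer may be kept across a call to `h_compact`, which rules this out for buffers handed to third-party code such as a TLS library.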

Re SO, that would be off topic :) but I often see a question to which nearly all the answers are simply BS.
« Last Edit: April 30, 2023, 09:26:37 pm by peter-h »
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #9 on: May 02, 2023, 02:36:48 pm »
This gets even more funny.

I reckon lots of people knew about this but nobody, apart from DavidAlfa, looked into it properly.

Looking at a 2019 (very early, before I got involved) version of my project, it has code in main.c to malloc and free a 48k block, and it talks about a bug in the heap code:

Code: [Select]
// Allocate and immediately free the entire heap memory in order that it becomes available to RTOS threads
// This is a workaround for an RTOS heap management bug
void *dummy = malloc(48*1024);
free(dummy);

But that malloc() would have gone to the newlib heap code. Nothing to do with the RTOS heap (heap_4.c) whereby you give FreeRTOS a chunk of RAM and it runs its own private heap within that. Currently I am using the 64k CCM for the RTOS RAM block (which is at least partly a "heap").

Just like LWIP does, in the infamous and crappily documented
#define MEM_SIZE                (6*1024)
and just like TLS does (currently it is given a 48k block).

Otherwise I need to test the RTOS heap also (and the TLS heap, and the LWIP heap).

This stuff could all be buggy, if they use any of a number of heap sourcecode found online.

I have not yet found a definitive sourcecode for the newlib heap I am using. Based on Cube IDE disassembly, it should contain calls to empty functions for obtaining and returning mutexes. I found sourcecode for the newlib printf family.

More tests... I think this heap doesn't do what I think is called compaction. If you malloc/free and repeat, it does each block at the same address - the base of the heap. But if you are running two blocks, say one 5k and one 48k, the 5k one ends up quite high up (when it got allocated after the 48k one) and then when the 48k one stops being allocated, the 5k block never drops back down. I don't think this is "fragmentation"...
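To make the two-block case concrete, here is a toy first-fit model of a 57k heap. It is only a sketch: the names (toy_alloc, toy_free, toy_stats) are hypothetical, block headers and alignment are ignored, and first fit is an assumption rather than a claim about what the newlib heap actually does. Once the 48k block is freed while the 5k block is still live, the model reports 52k free in total but a largest single gap of 48k. That difference between "total free" and "largest contiguous" is what is usually called external fragmentation, even though in this particular case a fresh 48k request still fits.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE  (57u * 1024u)   /* round figure, not the exact 58520 bytes */
#define MAX_BLOCKS 16

typedef struct { size_t off, len; } blk_t;

static blk_t  live[MAX_BLOCKS];    /* live blocks, kept sorted by offset */
static size_t n_live;

/* First fit: place the block in the lowest gap that is big enough.
 * Returns the block's offset from the heap base, or -1 on failure. */
long toy_alloc(size_t len)
{
    size_t gap_start = 0;

    if (len == 0 || n_live == MAX_BLOCKS)
        return -1;
    for (size_t i = 0; i <= n_live; i++) {
        size_t gap_end = (i < n_live) ? live[i].off : HEAP_SIZE;
        if (gap_end - gap_start >= len) {
            /* make room at index i so the array stays sorted */
            for (size_t j = n_live; j > i; j--)
                live[j] = live[j - 1];
            live[i].off = gap_start;
            live[i].len = len;
            n_live++;
            return (long)gap_start;
        }
        if (i < n_live)
            gap_start = live[i].off + live[i].len;
    }
    return -1;
}

/* Release the block that starts at off. Live blocks are never moved. */
void toy_free(long off)
{
    for (size_t i = 0; i < n_live; i++) {
        if ((long)live[i].off == off) {
            for (size_t j = i; j + 1 < n_live; j++)
                live[j] = live[j + 1];
            n_live--;
            return;
        }
    }
    assert(0 && "toy_free: unknown offset");
}

/* Report total free bytes and the largest single free gap. */
void toy_stats(size_t *total, size_t *largest)
{
    size_t gap_start = 0;

    *total = 0;
    *largest = 0;
    for (size_t i = 0; i <= n_live; i++) {
        size_t gap_end = (i < n_live) ? live[i].off : HEAP_SIZE;
        size_t gap = gap_end - gap_start;
        *total += gap;
        if (gap > *largest)
            *largest = gap;
        if (i < n_live)
            gap_start = live[i].off + live[i].len;
    }
}
```

Running toy_alloc(48k), toy_alloc(5k), toy_free(first block) and then toy_stats() reproduces the layout described above: the 5k block stays at offset 48k, and a 50k request would be refused although 52k is nominally free.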
« Last Edit: May 02, 2023, 04:41:13 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4562
  • Country: gb
Re: Help needed with some heap test code
« Reply #10 on: May 02, 2023, 02:50:16 pm »
Code: [Select]
boolean_t is_bug;

is_bug = (measured isNotEqualTo expected);

which brings to
-1- are you sure you have planned correctly what you expect?
-2- are you sure you measure it correctly?
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline dare

  • Contributor
  • Posts: 40
  • Country: us
Re: Help needed with some heap test code
« Reply #11 on: May 03, 2023, 05:01:04 pm »
Quote
I have not yet found a definitive sourcecode for the newlib heap I am using. Based on Cube IDE disassembly, it should contain calls to empty functions for obtaining and returning mutexes.

Generally speaking, malloc comes with the toolchain, as part of libc.  If you're using a relatively modern toolchain (and building with --specs=nano.specs) this is the version of libc malloc you should be using: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/nano-mallocr.c;h=a2b50facc35ebdfb4fc3c8fb11cd170d634649b3;hb=HEAD.  (Note that there are #defines that map the nano_* names onto the regular malloc/free names.)

One quick way to tell if you're using this malloc is look for the following symbols in your linker map file: __malloc_free_list, __malloc_sbrk_start, sbrk_aligned

If you're not using this malloc, then I reckon you should switch toolchains.  Also, I can't quite tell from your previous comments whether you've set up newlib's retargetable locking mechanism to work with FreeRTOS, but if not you will need to do that.

FWIW, I ran the following code in a test application I'm building and got perfectly acceptable behavior for a total heap size of 260012 bytes:

Code: [Select]
void TestHeap(void)
{
  int blocksize = 1000;
  int increment = 1023;

  Log("Total Heap Size: %" PRIu32 "\n", (uint32_t)GetHeapTotalSize());

  while (true)
  {
    vTaskDelay(pdMS_TO_TICKS(200));

    char *buf = malloc(blocksize);

    if (buf!=NULL)
    {
      Log("malloc good, bk=%d addr=0x%08" PRIX32 "\n", blocksize, (uint32_t)buf);
      free(buf);
      blocksize += increment;
    }
    else
    {
      Log("malloc failed, bk=%d\n",blocksize);
      blocksize -= increment;
      increment /= 2;
      if (increment == 0)
        break;
      blocksize += increment;
    }
  }
}

Code: [Select]
Total Heap Size: 260012
malloc good, bk=1000 addr=0x20010858
malloc good, bk=2023 addr=0x20010C48
malloc good, bk=3046 addr=0x20011438
malloc good, bk=4069 addr=0x20010858
...
malloc good, bk=256750 addr=0x20010858
malloc good, bk=257773 addr=0x20010858
malloc good, bk=258796 addr=0x20010858
malloc good, bk=259819 addr=0x20010858
malloc failed, bk=260842
malloc failed, bk=260330
malloc failed, bk=260074
malloc good, bk=259946 addr=0x20010858
malloc failed, bk=260073
malloc failed, bk=260009
malloc good, bk=259977 addr=0x20010858
malloc failed, bk=260008
malloc good, bk=259992 addr=0x20010858
malloc failed, bk=260007
malloc good, bk=259999 addr=0x20010858
malloc failed, bk=260006
malloc failed, bk=260002
malloc good, bk=260000 addr=0x20010858
malloc failed, bk=260001
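The ramp-then-halve search in the test above can be pulled out into a small helper that is easy to check on a PC before running it on the target. This is a sketch only: find_max_block, probe_malloc, probe_fake and fake_limit are hypothetical names, and the fake probe simply pretends the heap can satisfy anything up to a fixed limit so that the search logic can be verified on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A probe returns true if a block of size bytes can be allocated. */
typedef bool (*probe_fn)(size_t size);

/* Probe backed by the real heap: allocate, then free immediately. */
bool probe_malloc(size_t size)
{
    void *p = malloc(size);
    free(p);                /* free(NULL) is a no-op */
    return p != NULL;
}

/* Host-side stand-in: pretends any request up to fake_limit succeeds. */
size_t fake_limit = 260000;

bool probe_fake(size_t size)
{
    return size <= fake_limit;
}

/* Grow by step while probes succeed; halve step after each failure.
 * Returns the largest size that succeeded, or 0 if even start failed. */
size_t find_max_block(probe_fn probe, size_t start, size_t step)
{
    size_t good;

    assert(probe != NULL && start > 0 && step > 0);
    if (!probe(start))
        return 0;
    good = start;
    while (step > 0) {
        /* the first test guards against size_t overflow */
        if (good <= SIZE_MAX - step && probe(good + step))
            good += step;
        else
            step /= 2;
    }
    return good;
}
```

On the target one would call find_max_block(probe_malloc, 1000, 1023) from a task. On a desktop OS the same call is not very meaningful, since malloc there can succeed for very large sizes and the ramp would take a long time. With the fake probe and a limit of 260000 the search visits the same sizes as the log above (259819, then 260842 failing, and so on down to 260000).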
 

Offline peter-hTopic starter

  • Super Contributor
  • ***
  • Posts: 4519
  • Country: gb
  • Doing electronics since the 1960s...
Re: Help needed with some heap test code
« Reply #12 on: May 04, 2023, 11:36:33 am »
Nope, those symbols are not found.

I don't think I am using newlib-nano. I can't remember the decision chain now but it may have been to do with support for double floats, which I do want.

This is my current setting:



But - see below - those parts of it which reside in libc.a have been bypassed.

Quote
whether you've setup newlib's retargetable locking mechanism to work with FreeRTOS, but if not you will need to do that.

I edited the heap code in (the weakened) libc.a to protect calls to malloc and free

Code: [Select]
/*
 * newlib_locking.c
 *
 *  Created on: 24 Jul 2022
 *      Author: peter
 *
 *  These functions replace the empty ones in the newlib heap code.
 *  See notes under LIBC on what this relates to - readme.txt.
 *  We don't have the source for the libc.a library so this code replaces
 *  the empty mutex stubs.
 *
 *  Based around
 *  https://gist.github.com/thomask77/3a2d54a482c294beec5d87730e163bdd
 *
 *  Also see
 *  https://www.eevblog.com/forum/programming/st-cube-gcc-how-to-override-a-function-not-defined-as-weak-(no-sources)
 *
 *  27/7/22 PH Currently implements just heap mutexes.
 *
 *
 *
 */


#include "FreeRTOS.h"
#include "cmsis_os2.h"
#include <newlib_locking.h>


extern osMutexId_t g_HEAP_Mutex;


// The mutex lock is *not* recursive (as is shown in many examples online) because
// that seems pointless.
// These two functions are used by both malloc and free.

void __malloc_lock(void)
{
    osMutexAcquire(g_HEAP_Mutex, osWaitForever);
}

void __malloc_unlock(void)
{
    osMutexRelease(g_HEAP_Mutex);
}

// This one is not used
void __malloc_lock_acquire(void)
{
    osMutexAcquire(g_HEAP_Mutex, osWaitForever);
}


The heap code in libc.a contained calls to __malloc_lock and __malloc_unlock. Once libc.a was weakened (I could have done just the required functions but it was just as easy to do the whole lib) the above functions replaced the dummy stubs in libc.a.

Various people online have since commented that this is all BS and that there are much smarter ways but I never found any. And assertions that libc.a is weak are certainly BS for the code which came with Cube IDE c. 2019 which is what I am working with.

Exhaustive multi-thread tests over months suggest this is working as intended.

I then froze a local copy of libc.a so future Cube (and occasionally GCC) updates do not affect it. I spent some time identifying which of the 66 (just counted them) libc.a files is the right one, made a local copy, and wrote a batch script to weaken it and replace the offending functions. There are posts here about it
https://www.eevblog.com/forum/microcontrollers/is-st-cube-ide-a-piece-of-buggy-crap/msg4363183/#msg4363183
The most important criteria were that it does hardware floats and supports the 32F4; most of the 66 files are obviously not applicable but some are close.
« Last Edit: May 04, 2023, 05:58:01 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

