I implemented and tested mmalloc, mmfree, and mmaddref on Linux using gcc. The only data types I used are 16-bit unsigned ints (u_int16_t) and void*, so it SHOULD run the same on both x86 and AVR. No void* values are stored in the heap itself; the only way to find things in the heap is to walk it. In theory the heap could be as large as a size_t can describe and this would still work, as long as each data block is 512 bytes or less.
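To illustrate the "no pointers in the heap, just walk it" idea, here is a minimal sketch. It is not the actual mmalloc code. The header layout (one 16-bit word per block, low 8 bits holding the payload length in words minus one, high 8 bits holding a reference count) is an assumption picked so that the largest block comes out to 512 bytes, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_WORDS 1024u        /* 2K heap, viewed as 16-bit words */
#define MAX_BLOCK_BYTES 512u

/* Assumed header layout (hypothetical, not taken from the real code):
 *   bits 0..7  = payload length in words, minus 1 (1..256 words)
 *   bits 8..15 = reference count, 0 means the block is free        */
static uint16_t heap[HEAP_WORDS];

static unsigned hdr_words(uint16_t h) { return (h & 0xFFu) + 1u; }
static unsigned hdr_refs(uint16_t h)  { return (unsigned)h >> 8; }
static uint16_t hdr_make(unsigned words, unsigned refs) {
    return (uint16_t)((refs << 8) | (words - 1u));
}

/* Carve the heap into free blocks of at most 256 payload words each. */
static void sketch_init(void) {
    unsigned i = 0;
    while (i < HEAP_WORDS) {
        unsigned w = HEAP_WORDS - i - 1u;   /* payload words after the header */
        if (w > 256u) w = 256u;
        assert(w >= 1u);                    /* HEAP_WORDS must not leave a 1-word tail */
        heap[i] = hdr_make(w, 0);
        i += 1u + w;
    }
}

/* First fit: walk header to header; nothing but lengths links the blocks. */
static void *sketch_alloc(size_t bytes) {
    if (bytes == 0 || bytes > MAX_BLOCK_BYTES) return NULL;
    unsigned need = (unsigned)((bytes + 1u) / 2u);
    for (unsigned i = 0; i < HEAP_WORDS; i += 1u + hdr_words(heap[i])) {
        unsigned have = hdr_words(heap[i]);
        if (hdr_refs(heap[i]) != 0 || have < need) continue;
        if (have - need >= 2u) {            /* split: room for a header + 1 word */
            heap[i + 1u + need] = hdr_make(have - need - 1u, 0);
            have = need;
        }
        heap[i] = hdr_make(have, 1);        /* new block starts with one reference */
        return &heap[i + 1u];
    }
    return NULL;
}

/* Add one reference to a live block. */
static void sketch_addref(void *p) {
    uint16_t *h = (uint16_t *)p - 1;
    assert(hdr_refs(*h) > 0 && hdr_refs(*h) < 255u);
    *h = (uint16_t)(*h + 0x0100u);
}

/* Drop one reference; on the last one, merge adjacent free blocks. */
static void sketch_free(void *p) {
    uint16_t *h = (uint16_t *)p - 1;
    assert(hdr_refs(*h) > 0);
    *h = (uint16_t)(*h - 0x0100u);
    if (hdr_refs(*h) != 0) return;
    unsigned i = 0;
    while (i < HEAP_WORDS) {
        unsigned n = i + 1u + hdr_words(heap[i]);   /* index of the next header */
        if (n < HEAP_WORDS && hdr_refs(heap[i]) == 0 && hdr_refs(heap[n]) == 0 &&
            hdr_words(heap[i]) + 1u + hdr_words(heap[n]) <= 256u) {
            /* absorb the neighbour, including its header word */
            heap[i] = hdr_make(hdr_words(heap[i]) + 1u + hdr_words(heap[n]), 0);
        } else {
            i = n;
        }
    }
}
```

Call sketch_init() once, then sketch_alloc(10) hands back a pointer just past a one-word header. The per-block overhead is two bytes, which is the main attraction on a machine with only a few K of RAM; the cost is that every allocation and every merge is a linear walk.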
Mini-malloc ported easily to the AVR. It has a 2K heap, and there is plenty of data space left for other things. I would be happy with a 3K user heap on a 4K machine.
Next is to write the XRAM alloc functions, which will be along the same lines as this. For now, so that I can get going on other parts of the project, XRAM will have a stub that only allocates and never frees.
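An allocate-only stub of this kind can be as small as a bump pointer. The sketch below is one possible shape, not the real XRAM code: the 32K XRAM size, the use of a 16-bit offset as the "address", and the names xram_alloc_stub, xram_free_stub, and XRAM_NONE are all assumptions made for illustration.

```c
#include <stdint.h>

#define XRAM_SIZE 32768u    /* assumed XRAM size, in bytes */
#define XRAM_NONE 0xFFFFu   /* hypothetical "allocation failed" value */

static uint16_t xram_next = 0;   /* offset of the first unused XRAM byte */

/* Hand out the next `bytes` bytes of XRAM; space is never reused. */
static uint16_t xram_alloc_stub(uint16_t bytes) {
    if (bytes == 0 || bytes > XRAM_SIZE - xram_next) return XRAM_NONE;
    uint16_t addr = xram_next;
    xram_next = (uint16_t)(xram_next + bytes);
    return addr;
}

/* Freeing is accepted and ignored until the real allocator exists. */
static void xram_free_stub(uint16_t addr) {
    (void)addr;
}
```

Because callers already go through an alloc/free pair, swapping in a real allocator later should not require touching them.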
The intended use of handles is to create a sort of virtual memory arrangement, though not 'paged' virtual memory as modern CPUs would use. User programs will ONLY allocate through handles. A program will ask for access to a handle, get a pointer, modify the data at the pointer, and then release the handle. The program should not keep many handles 'open' at a time, just whatever the current working set is. The simple, initial implementation will do a lot of copying back and forth from XRAM. But just because a handle is released does not mean it has to be copied back immediately. When the last 'holder' of a handle releases it, the item can be put in a queue instead of being mmfree'd. Only when mmalloc is about to fail does the object really have to be copied back to XRAM. This resembles swapping on demand in a paged memory system.
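The acquire/release cycle with deferred write-back could look roughly like the sketch below. Everything in it is an assumption made to keep the example short and runnable on a PC: XRAM is simulated with an array, every object is a fixed 64 bytes, malloc/free stand in for mmalloc/mmfree, a MAX_RESIDENT limit plays the part of "mmalloc is about to fail", and the queue of released items is represented by a release stamp (oldest release is evicted first) rather than a real linked queue. The handle_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NHANDLES 8
#define OBJ_BYTES 64u      /* fixed object size, for brevity */
#define MAX_RESIDENT 2u    /* pretend the heap only fits two objects */

static uint8_t xram[NHANDLES][OBJ_BYTES];   /* simulated external RAM */

struct slot {
    uint8_t *ptr;          /* in-heap copy, NULL when the data is only in XRAM */
    uint8_t holders;       /* callers that currently hold the handle */
    uint32_t released_at;  /* stamp of the last full release (wrap ignored) */
};
static struct slot slots[NHANDLES];
static unsigned resident;      /* objects currently in the heap */
static unsigned writebacks;    /* copies back to XRAM so far */
static uint32_t release_clock;

/* Copy the longest-released object back to XRAM and free its heap copy. */
static int evict_oldest(void) {
    int victim = -1;
    for (unsigned h = 0; h < NHANDLES; h++) {
        if (slots[h].ptr == NULL || slots[h].holders != 0) continue;
        if (victim < 0 || slots[h].released_at < slots[victim].released_at)
            victim = (int)h;
    }
    if (victim < 0) return 0;               /* everything resident is held */
    memcpy(xram[victim], slots[victim].ptr, OBJ_BYTES);
    free(slots[victim].ptr);
    slots[victim].ptr = NULL;
    resident--;
    writebacks++;
    return 1;
}

/* Get a pointer to the object, pulling it in from XRAM if needed. */
static void *handle_acquire(unsigned h) {
    assert(h < NHANDLES);
    struct slot *s = &slots[h];
    if (s->ptr == NULL) {
        /* Only when the heap is full do released objects get written back. */
        if (resident == MAX_RESIDENT && !evict_oldest()) return NULL;
        s->ptr = malloc(OBJ_BYTES);
        if (s->ptr == NULL) return NULL;
        memcpy(s->ptr, xram[h], OBJ_BYTES);
        resident++;
    }
    s->holders++;
    return s->ptr;
}

/* Release a hold; the object stays resident until space is needed. */
static void handle_release(unsigned h) {
    assert(h < NHANDLES && slots[h].holders > 0);
    slots[h].holders--;
    if (slots[h].holders == 0) slots[h].released_at = ++release_clock;
}
```

A caller would acquire a handle, write through the pointer, and release it. If the same handle is acquired again before the heap fills up, the same pointer comes back and no XRAM traffic happens at all; the copy back only occurs when a later acquire needs the space. The sketch always writes back on eviction; a dirty flag would avoid copying objects that were only read.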