This is mostly for making the info of the 6502.org forum topic "dynamic memory allocator, methods, uses, limitations, etc." both more concise and more complete. I was interested in writing a dynamic memory allocator for the 65c02 and later the 65816 microprocessors, for my workbench computers.
I started with the idea of "What if the memory allocator worked like a FAT-less RAM file chain, rather than a disc?" It would be basically a linked list. New additions always go on the end, so there's no need to search for space, "best fit" or otherwise, and there are no block boundaries. In fact, there are no blocks. If a buffer that's not at the end gets deleted, later-created ones get scooted up to close up the space, taking care of the garbage-collection and defragmentation issues. Re-sizing a buffer that's not at the end also requires scooting the following ones around, either to make more room or to close up space. The memory-moving is not instant, but it may be an acceptable penalty, to gain the other benefits.
I did it first in Forth to prove the concept, and intended to re-write it later in assembly (possibly as a Forth primitive) for better performance; but unless/until the need arises, I'll save the time and leave it now in Forth only. I did see some things in common for later writing words for local Forth variables including arrays (which ANS's way does not accommodate), and with improving my Forth assembler's way to deal with temporary labels; so I might come back and improve this later. Using locals (or even non-local variables!) for this should reduce the stack gymnastics.
What are the typical applications of these memory allocations? What applications would need the allocation (creation) and freeing (deletion) to be fast? The ones I'm thinking of don't need to get created and destroyed particularly fast, like tens of thousands times per second; but when you do need the speed, it will probably be for the one at the end of the chain, which can be deleted without moving other buffers anyway. Moving might be the only thing that takes much time.
Otherwise, what negative implications would there be? It's proven working, but I have not put it to practical use yet. I'm disappointed it's about 500 bytes (including headers). If I were to put it in my workbench computer's ROM, there's room and it's fine; but if I run the code in RAM, it might take some pretty substantial buffer work for it to pay for itself in RAM usage compared to just having the array space there full time when the program(s) that need it are resident.
In what I've done below, the calling program gives ALLOC the address of the calling program's variable which will point to the buffer. ALLOC puts this variable's address in the buffer's housekeeping bytes so FREE and RESIZE know where to update the variable if they move the buffer after deleting or re-sizing an earlier-created one.
There's no need for "blocks," since you can get the exact number of bytes requested (plus four housekeeping bytes). The collection of buffers does not take any more room than it needs to, unless you assign a little extra for each so they can handle a degree of resizing without having to move all the following buffers every time. I imagine the need to re-size buffers is rare, unlike the situation with files.
Here's a poor screenshot of a text diagram in the DOS/ANSI (code page 437) characters, showing the arrangement of bytes. The diagram shows three buffers allocated; but you can have any number you want as long as it all fits in 32KB. (It might work past that, but there's no reason to even check, especially since my current workbench computer only has 16KB of RAM.) Again, there's never any fragmentation with this method.
Each part of a program that uses a buffer should check the buffer's beginning address (at its variable) again anytime there's a chance it got moved. As long as multitasking is not preëmptive, it should be ok to keep a buffer's address on the stack for later use if convenient.
ISRs should not be moving buffers, and in fact they should run and be done faster than a buffer can be created, deleted, or resized.
Each program that uses a buffer has a variable to remember the buffer's address (ie, the beginning address of the data portion). So far, this is no different from ANS Forth's ALLOCATE. I've called mine ALLOC instead though because there's a difference in operation and in parameters passed back and forth. In mine, the program, when it requests the buffer, tells ALLOC what the address of this variable is, and ALLOC writes the buffer address into it, rather than passing it back on the stack. ALLOC also puts that address in a pair of housekeeping bytes kept with the buffer, so that if an earlier-created buffer later gets re-sized or deleted, such that this newer buffer gets moved, FREE or RESIZE knows where to write the buffer's new location. FREE or RESIZE may need to do this for several buffers at once.
Various parts of the program that use the buffer will all refer to the same variable. The program should remember the size it told ALLOC or RESIZE. That info could be recovered by looking at the pair of bytes at the first data address minus 4; but that's probably less efficient.
Stepping through the chain can be done starting at the lowest address, which has the 0000 end-of-chain marker in it. That marker is pointed to by the ALLOC_END variable. From there, the next pair of bytes is the data length of the newest buffer. Add four to that length, to get the overall length with housekeeping bytes, and that takes you to the next-oldest buffer's length cell. Repeat the process until you arrive at the ALLOC_END variable which is always in the same place, with only four bytes between it and the end of RAM. This is what ADJ_PGM_VARs below does which is used by FREE and RESIZE to adjust the contents of the program variables that point to their buffers.
Hopefully the following will not wrap on your monitor. I do this stuff in DOS (and yes, I do have 132 columns visible at once in DOS, and usually 43 lines, although I can go up to 60). The ^ in the comments indicates that the following tells what's on the data stack when that line is finished executing. the ^R is the same for the return stack. Default BASE is hex. Abbreviations: "buf"=buffer; "adr"=address; "pgm"=program; "var"=variable; "mem"=memory; "req"=request; "tot"=total; "len"=length; "dif"=difference; "rec"=record"; "del"=delete; "val"=value; "incr"=increment; "decr"=decrement; "adj"=adjust; "cnt"=count; "err"=error.
: UNUSED ?WORKSPACE 100 - ; ( -- bytes ) \ Modify this as necessary for your system. It should return the \ number of bytes available after leaving room for things like PAD. \ EM ("end of memory") below is the last byte of RAM, which on my workbench computer is 3FFF. \ The following become variables (even though it says "CONSTANT"): \ ALLOC_BUSY and BUF_MOVEs are 2 bytes each, to prevent trouble if you forget to use C@. EM-1 CONSTANT ALLOC_BUSY \ Flag var. Words below set it while they're working, and clear when done, like a semaphore. EM-3 CONSTANT BUF_MOVEs \ INCRement this every time buffers are moved. The calling program looks for changes if necessary. EM-5 CONSTANT ALLOC_END \ 2-byte var holding the adr of the 0000 signifying the end of buffers : INIT_ALLOCS ( -- ) \ This shouldn't be called when programs are still expecting to have buf space. ALLOC_END 2- ALLOC_END ! \ Make ALLOC_END point to one cell below it, ALLOC_END 2- OFF \ and zero that cell. ALLOC_BUSY OFF ; \ Turn off flag so any ISRs or tasks that may use buffers can know it's ok. \ ALLOCATE is an ANS word, but I'm calling mine ALLOC because I'm adding the var_adr input, and since \ ALLOC stores the buffer adr there, it doesn't need to be on the stack at the output. Good for any \ length of buffer that will fit in available memory; but do not use size=0. (No error checking.) : ALLOC ( size var_adr -- f ) UNUSED 2PICK U< \ Available mem less than req'd len? ^ var_adr size f IF 2DROP FALSE EXIT THEN \ If so, just return a false flag saying it didn't work. ALLOC_BUSY ON \ Now we know the req'd len will fit; so: ^ size var_adr OVER \ ^ size var_adr size ALLOC_END @ OVER - 2- 2! \ Sto len & pgm var adr together at start of buf. ^ size ALLOC_END @ SWAP - 4- \ Get the final ALLOC_END val ^ new_ALLOC_END_val DUP ALLOC_END ! \ and store it. ^ new_ALLOC_END_val DUP OFF \ Do the 0000 end-of-buffers marker. ^ new_ALLOC_END_val 6+ \ Get buf data adr to put in pgm var. ^ buf_adr DUP 2- @ ! \ Get adr of pgm var & store data adr there. ^ empty ALLOC_BUSY OFF TRUE ; \ ^ true \ This version, shorter than my original one, automatically does all bufs. : ADJ_PGM_VARs ( -- ) \ Adjust the pgm vars (not the records of their addresses in the buffer area!) ALLOC_END DUP @ 6+ \ Each "I" will give the data beginning adr of a buf. DO I DUP 2- @ ! \ Store the buf adr where the var adr's field points. I 4- @ 4+ \ For the incr, get the contents of the len byte, +4 for each buf's overhead. +LOOP ; \ Do not give FREE a wrong address! (There's no error-checking.) : FREE ( adr -- ) \ Adr is of 1st data byte of buf. Previous pair of bytes holds the pgm_var_adr. ALLOC_BUSY ON DUP 6- ALLOC_END @ = \ Is it the last buf created? IF \ If so, there are no later ones to scoot. ^ buf_adr DUP 4- @ + \ If del'ing newest buf, get data bytes count, add it to adr, ^ adr_of_next_buf's_len 2- DUP OFF \ decr adr to get adr of new end cell, and zero it, meaning ^ adr_of_end_len_cell ALLOC_END ! \ this last buf is del'ed. Update the rec of adr of 0000 cell. ^ empty ELSE \ ^ buf_adr ALLOC_END @ \ Get the "from" adr for CMOVE>. ^ buf_adr from OVER 4- @ 4+ \ Get the buf len + 4 overhead bytes. ^ buf_adr from tot_len DUP>R \ Keep rec of tol len of buf we're del'ing, to adj ALLOC_END later. ^R tot_len OVER + \ Add tot len to "from" adr to get "to" adr. ^ buf_adr from to ^R tot_len ROT ALLOC_END @ - 4- \ Finally, get the count. ^ from to cnt ^R tot_len CMOVE> \ Del buf, filling in the space it occupied. ^ empty ^R tot_len R> ALLOC_END +! \ Update ALLOC_END by inc'ing by tot len of del'ed buf. ^R empty ADJ_PGM_VARs \ Now adj the adrs of newer bufs in pgms' vars. BUF_MOVEs INCR \ Show that 1 or more vars that pgms use to remember buf adrs have been altered. THEN \ No need to incr BUF_MOVEs since no buffers were moved. ALLOC_BUSY OFF ; : RESIZE ( buf_adr new_len -- f ) \ Flag returned: True means successful. Don't req size<=0 though. No err checking. UNUSED \ Get the amount of available mem. ^ buf_adr new_len available OVER \ Copy the new len ^ buf_adr new_len available new_len 3PICK 4- @ - DUP>R \ subtract the old len to get difference ^ buf_adr new_len available len_dif - 0< \ and subtract that difference from available mem. ^ buf_adr new_len f IF R> 3DROP FALSE EXIT THEN \ If the margin was negative, exit with only false flag. ALLOC_BUSY ON \ ^ buf_adr new_len ^R len_dif ALLOC_END @ \ Get the "from" adr. ^ buf_adr new_len from ^R len_dif DUP R@ - \ Get the "to" adr. ^ buf_adr new_len from to ^R len_dif 2SWAP \ ^ from to buf_adr new_len ^R len_dif OVER 4- \ Get adr of len cell. ^ from to buf_adr new_len len_cell_adr ^R len_dif 2DUP >R >R \ Keep copy of new len & len cell adr to update latter after fetching the old len. @ R> R> ! MIN \ ^ from to buf_adr min_of_old_&_nu_len ^R len_dif + ALLOC_END @ - \ ^ from to cnt ^R len_dif R@ 0< \ Is a contraction being req'd? IF CMOVE> ELSE CMOVE THEN \ ^ empty ^R len_dif R> ALLOC_END -! \ Update ALLOC_END. ADJ_PGM_VARs BUF_MOVEs INCR \ Show that 1 or more vars used by pgms to remember buf adrs have been altered. ALLOC_BUSY OFF TRUE ; \ Indicate success. ^ true