libk  kmem.md at [49bf71fb47]

File kmem/kmem.md artifact 73d0a96333 part of check-in 49bf71fb47


# kmem

**kmem** is a libk module that contains various functions for memory allocation and deallocation. it uses the **short** naming convention with the glyph `m`.

kmem allocators can work in several different ways. they can allocate memory directly from the heap (like `kmheapa()` and `kmlina()`), use a header that has already been allocated by another function, or allocate memory only from a pre-allocated pool. linear allocation with pool allocation is particularly useful, as it permits the very rapid allocation and deallocation of lots of objects with only a few adjustments to the heap, and no possibility of fragmentation or need for expensive algorithms like `malloc()` or `kmheapa()`

## module functions

kmem supplies two module-level functions, used to interact with the `kmptr` container type.

 * `kmfree(kmptr) → void` - free, downref, or ignore the pasted object as appropriate
 * `kmshred(kmptr) → void` - free, downref, or ignore the pasted object as appropriate. if deallocating, zero its contents
 * `kmstat(void*) → kmptr` - convenience function to wrap a pointer to a non-managed object in a `kmptr` struct, so it can be passed to functions that accept arbitrary objects. `kmptr p = kmstat(raw)` is equivalent to `kmptr p = { kmkind_none, raw, NULL }`.
 * `kmtaint(&kmptr) → void` - "taints" a `kmptr` object by setting it to be shredded when freed. this may be desirable if the object pointed to contains privileged information.
 * `kmzero(void*,sz) → void` - zeroes a region of memory
 * `kmozero(kmptr) → void` - zeroes an object in memory
 * `kmcopy(void* dest, void* src, sz) → void` - copies one region of memory to another
 * `kmdup(kmptr) → kmptr` - duplicates an object in memory, allocating it as sibling of the original

## types

kmem defines the following types:
 
 * `enum kmkind` - enumerates allocation strategies
 * `struct kmptr` - abstract pointer object
 * `struct kmcell` - abstract memory cell
 * `struct kmref` - a reference-counted cell
 * `struct kmnode` - a node in an allocation tree
 * `struct kmpool` - a memory pool

`kmptr` and `kmcell` are both very similar. the difference is that a kmptr points to a region in memory and can be passed around freely. a `kmcell` is the actual in-memory representation of an allocation cell. a `kmcell` cannot be usefully instantiated; rather, it is downcast from an actual cell type (e.g. `kmnode n; kmcell* s = (kmcell*)(&n)`)


### kmkind

`kmkind` is an enum that specifies an allocation function.
 
 * `kmkind_none` - no allocation
 * `kmkind_lin` - linear heap allocation
 * `kmkind_heap` - random heap allocation
 * `kmkind_pool` - pool allocation
 * `kmkind_ref` - reference-counting allocation
 * `kmkind_tree` - tree allocation

### kmptr

kmem functions can operate on both raw pointers and the `kmptr` struct type. `kmptr` is a generic struct that can contain any kind of pointer. this is useful if you wish to allocate different objects in different manners, but pass them on into a single interface.

memory pointed at by `kmptr` pointers can be freed either with the usual specialized function, or by passing the `kmptr` structure itself to the generic function `kmfree`, which will handle it appropriately, even if it's a pointer to a garbage-collected object or to a static region of memory.

a `kmptr` has the following layout:

 * `kmkind kind` - codes the type of pointer; `kmkind_none` indicates a non-allocated pointer to a static (global or on-stack) object.
 * `kmshred shred` - an enum. if `kmshred_yes`, the value will be zeroed or otherwise made unreadable on free. if no, `kmfree` will consult `src` for shred policy if it is not NULL.
 * `void* ref` - the raw pointer enclosed by `cell`
 * `kmcell* cell` - a pointer to an object enclosure, typically either a memory pool or a referencing-counting object. NULL if not needed.
 
the convenience function `kmstat(void*) → kmptr` wraps a pointer to a static object in a `kmptr` struct.

### struct kmcell

`kmcell` is a stub struct used to disambiguate between source types. a "source" is an object that can hold an allocated object, such as the heap, a memory pool, a fixed-length array on stack, or a fixed-length global array. all values produced by a kmem allocation function can be cast to `kmcell*`, and have an intial field `id` that contains a `kmcell`.

 * `kmkind kind` - kind of cell
 * `size_t size` - size of cell (data plus all fields)
 * `kmshred shred` - shredding flag

### struct kmref

`kmref` is a struct that constitutes the in-memory representation of a reference-counted cell.

 * `kmcell id = { .kind = kmkind_ref, … } ` - kind of cell
 * `size_t refs` - number of active references 
 * `kmcell* src` - source, if any
 * `char data[]` - content of cell

### struct kmnode

`kmnode` is the header struct for tree nodes. all tree nodes pointers can yield a `kmnode` structure by subtracting `sizeof (kmnode)` from the pointer. a utility function and macro are made available to automate this safely.

 * `kmcell id = { .kind = kmkind_tree, … } ` - kind of cell
 * `kmnode* parent` - parent node
 * `kmnode* child` - first child node
 * `kmnode* lastchild` - last child node
 * `kmnode* prev` - previous sibling, NULL if first
 * `kmnode* next` - next sibling, NULL if last

### struct kmpool

 * `kmcell id = { .kind = kmkind_pool, … } ` - kind of cell
 * `size_t cellsz` - size of individual pool cells
 * `kmpoolcell* top` - pointer to most recently allocated pool cell
 * `kmpoolcell* bottom` - pointer to most recently freed pool cell
 * `kmpoolcell data[]` - content of cell

#### struct kmpoolcell

 * `kmpoolcell* last` - pointer to last element allocated before this one
 * `char data[]` - pool data

### enum kmshred

`kmshred` is an enum used to indicate whether an object should be "shredded" (written over) in memory when it's deleted. this is a useful means to ensure that privileged information is not accidentally left in memory after use. if the shredding mechanism is not useful, compile libk with the flag `KFmem_noshred` to exclude its functions and fields.

 * `kmshred_no = 0` - marks an object not to shred on free
 * `kmshred_yes = 1` - marks an object to shred on free

## naming convention

kmem function names are based on the **method** of allocation and the **action** being performed. methods are listed in the section below. kmem defines a number of standardized actions, though not every method uses every action. the character listed in brackets is suffixed to the name of the method to produce a function name: for instance, `kmheapa` will allocate memory on the heap, while `kmrefd` will decrement the reference count of its argument.

 * initialize [i]  - initializes a memory store on the heap
 * initialize fixed [if]  - initialize a memory store on the stack or in a fixed-size global
 * allocate [a]  - return a raw pointer to a new region of memory of the given size, ready to write, or NULL if not possible. contents of that region undefined. takes parameter (size_t sz).
 * allocate pointer object [ao]  - like *allocate*, but returns a `kmptr` instead of a raw `void*`.
 * zero [z]  - allocate a new region of memory and zero it before returning it for writing. 
 * zero pointer object [zo]  - like *zero*, but returns a `kmptr` instead of a raw `void*`.
 * free [f]  - free a section of memory, either decrementing a reference count or returning it to whatever pool it came from.
 * shred [s]  - destroy whatever was in the segment of memory, then return it to the pool it came from.
 * destroy [x]  - tears down a memory store
 * upref [u] - increments a reference counter

## methods

kmem currently supports the following methods of memory management, along with which methods are defined for it. (note that `a` implies `z` and `f` implies `s`). a method may be excluded from a libk binary by defining the flag `KFmem_no[name]`, e.g. `KFmem_noheap`.

the fastest allocator is the linear allocator, which should be sufficient for most simple programs. it allocates and deallocates memory simply by resizing the stack; there is no fragmentation, but objects must be freed in the order they are allocated. however, entire groups of objects can be freed at once at very little cost.

 * `lin` [iax] - linear heap allocator
   * `kmlini(void) → void*` - return a pointer to the current top of the heap
   * `kmlina(size_t) → void*` - allocate space on the heap and increase its size appropriately
   * `kmlinz(size_t) → void*` - allocate zero-filled space on the heap and increase its size appropriately
   * `kmlinx(void*) → void*` - returns the top of the heap to the location specified, freeing all memory allocated since the call to kmlini() or `kmlina()` that produced it
 * `heap` [af] - random heap allocation
   * `kmheapa(size_t) → void*` - allocate
   * `kmheapz(size_t) → void*` - zero-allocate
   * `kmheapao(size_t) → kmptr` - allocate pointer object
   * `kmheapzo(size_t) → kmptr` - zero-allocate pointer object
   * `kmheapf(void*) → void` - free
   * `kmheaps(void*) → void` - shred
 * `ref` [afu] - reference-counted heap object
   * `kmrefa(kmcell*, size_t) → void*` - allocate
   * `kmrefz(kmcell*, size_t) → void*` - zero-allocate
   * `kmrefao(kmcell*, size_t) → void*` - allocate pointer object
   * `kmrefzo(kmcell*, size_t) → void*` - zero-allocate pointer object
   * `kmreff(void*) → void` - downref; free if last ref
   * `kmrefs(void*) → void` - downref and mark for shred on last ref
 * `pool` [ixaf] - memory pool
   * `kmpooli(kmcell*, size_t sz, size_t n) → kmpool*` - initialize a fixed memory pool (a pool of `n` cells of length `sz`)
   * `kmpoolx(kmpool*) → void` - tear down a memory pool
   * `kmpoola(kmpool*) → void*` - allocate from pool
   * `kmpoolz(kmpool*, size_t) → void*` - zero-allocate from pool
   * `kmpoolao(kmpool*, size_t) → void*` - allocate pointer object
   * `kmpoolzo(kmpool*, size_t) → void*` - zero-allocate pointer object
   * `kmpoolf(void*) → void` - downref; free if last ref
   * `kmpools(void*) → void` - downref and mark for shred on last ref
 * `tree` [af] - uses a node-child strategy. when a node is freed, all of its children are automatically freed as well.
   * `kmtreea(kmcell* src, void* parent, size_t) → void*` - create a tree node. if `parent` is NULL, the node will the top of a new tree. if src is null, allocate on-heap.
   * `kmtreez(kmcell* src, void* parent, size_t) → void*` - like `kmtreea` but zeroed 
   * `kmtreeao(kmcell* src, void* parent, size_t) → kmptr` - like `kmtreea` but returns a `kmptr` 
   * `kmtreezo(kmcell* src, void* parent, size_t) → kmptr` - like `kmtreez` but returns a `kmptr` 
   * `kmtreef(void*) → kmptr` - frees a node and all its children

## macros

kmem defines the following macros.

 * `Kmsz(array)` - a convenience macro to return the number of elements in a static array. inserts the text `( sizeof (array) / sizeof (array) [0] )`
 * `Kmszs(type, struct)` - a convenience macro to return the size of a struct. requires compound literals.
 * `Kmszsa(type, array)` - calculates the number of elements in an array of a given struct type. inserts the text `( sizeof ( (type) array ) / sizeof (type) )`
 * `Kmpsa(type, struct)` - a convenience macro to insert a "pascal struct array" - that is, a struct array prefixed with a size value. this is equivalent to `Kmszsa(type, array), array`