File mod/kstr/kstr.md artifact 7f4b99ca6c part of check-in 8d6b36fcac


# kstr

**kstr** is the libk string library. it uses the **short** naming convention with the glyph `s`. **kstr** implies `#include <k/mem.h>`.

# types

## struct kstr
`struct kstr` is a structure for holding pascal strings (length-prefixed strings). it is the basic libk string type. **note:** if `ptr.ref` ≠ NULL and `size` = 0, the string's length is unknown; any function that operates on a kstr should calculate it and store the result in the object if possible.
 * `sz size` - length of string, excluding any null terminator
 * `kmptr ptr` - pointer to string in memory
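
the lazy-length rule in the note above can be sketched as follows. this is a minimal illustration, not libk's implementation: `demo_kmptr`, `demo_kstr`, and `demo_kstr_len` are hypothetical stand-ins, and because the layout of `kmptr` is not described in this file, only its `ref` field is modeled.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical stand-ins for kmptr and struct kstr; only the
 * fields named in this document are modeled */
typedef struct { void* ref; } demo_kmptr;
typedef struct { size_t size; demo_kmptr ptr; } demo_kstr;

/* return the length of s, computing and caching it when it is
 * unknown (ptr.ref != NULL and size == 0) */
size_t demo_kstr_len(demo_kstr* s) {
	if (s->ptr.ref != NULL && s->size == 0)
		s->size = strlen((const char*)s->ptr.ref);
	return s->size;
}
```

one consequence of this encoding is that a genuinely empty string is indistinguishable from one of unknown length, so the sketch rescans it on every call.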

## struct ksraw
`struct ksraw` is like `kstr` except it uses raw `char` pointers instead of a `kmptr`.
 * `sz size` - length of string, excluding any null terminator
 * `char* ptr` - pointer to string in memory

## struct ksbuf
`struct ksbuf` is a structure used for buffered IO.
 * `sz run` - maximum size of buffer, including any null terminator
 * `kiochan channel` - the channel that output will be written to when flushed
 * `char* cur` - a pointer that tracks the length of the buffer
 * `char buf []` - region of memory to store buffer in

## struct kschain
`struct kschain` is a structure used for string accumulators that works by aggregating pointers to strings, instead of copying the strings themselves.
 * `kschain_kind kind` - kind of chain
 * `kmkind rule` - kind of allocation to use if `kind` ≠ `kschain_kind_linked`
 * `pstr* ptrs` - pointer to pointer list
 * `sz ptrc` - number of pointers
 * `sz size` - total amount of space in `ptrs`

### enum kschain_kind
 * `kschain_kind_block` - occupies a single block of memory
 * `kschain_kind_linked` - uses a linked list, allocated and deallocated as necessary

## enum ksalloc
`enum ksalloc` is an enumeration that tells libk what strategy to use when filling a `kschain` struct.
 * `ksalloc_static` - do not allocate memory, fill an already-allocated, statically-sized array.
 * `ksalloc_alloc` - allocate a string in memory using the specified kind of allocator.
 * `ksalloc_dynamic` - fill an already-allocated array if possible, allocate a string in memory if the string length exceeds available space.
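
the difference between the three strategies may be easier to see in code. the sketch below applies them to a single string copy rather than to a `kschain`, and uses `malloc` in place of libk's allocators; `demo_strategy` and `demo_place` are hypothetical names invented for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical mirror of ksalloc_static / ksalloc_alloc / ksalloc_dynamic */
typedef enum { demo_static, demo_alloc, demo_dynamic } demo_strategy;

/* store a copy of src according to `how`. returns the storage used,
 * or NULL on failure. *heap is set to 1 when the caller must free()
 * the result. malloc stands in for a real allocator here. */
char* demo_place(demo_strategy how, const char* src,
                 char* fixed, size_t cap, int* heap) {
	size_t need = strlen(src) + 1;
	int fits = fixed != NULL && need <= cap;
	*heap = 0;

	/* static: never allocate, so a string that does not fit is an error */
	if (how == demo_static && !fits) return NULL;

	/* alloc always allocates; dynamic allocates only as a fallback */
	if (how == demo_alloc || (how == demo_dynamic && !fits)) {
		char* mem = malloc(need);
		if (mem == NULL) return NULL;
		memcpy(mem, src, need);
		*heap = 1;
		return mem;
	}

	memcpy(fixed, src, need);
	return fixed;
}
```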

# functions

## kssz
`size_t kssz(char* str, size_t max)` returns the number of characters in a C string, **including** the final null. will count at most `max` characters if `max` > 0.
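
the counting rule can be sketched as below. `demo_kssz` is a hypothetical reimplementation written for this document, not libk's source, and it assumes that the terminating null counts toward `max`.

```c
#include <stddef.h>

/* hypothetical sketch of the rule described for kssz: the count
 * includes the terminating null, and max == 0 means "no limit" */
size_t demo_kssz(const char* str, size_t max) {
	size_t n = 0;
	while (max == 0 || n < max) {
		if (str[n++] == '\0') break; /* the null itself is counted */
	}
	return n;
}
```

under these assumptions `demo_kssz("abc", 0)` yields 4, and `demo_kssz("abc", 2)` stops early and yields 2.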

## kstr
`kstr kstr(char* str, size_t max)` takes a C string and returns a P-string, calculating the length of `str` and storing it in the return value. `max` works as in `kssz`.

## kstoraw
`ksraw ksref(kstr)` is a simple convenience function that returns the `ksraw` form of a `kstr`.

## kscp
`kscond kscp(ksraw src, ksmut dest, sz* len)` copies the string pointed to by `src` into `dest`. its behavior varies depending on the value of `src.size`. if the size is already known, an attempt to copy a longer string on top of a shorter one fails immediately, with no changes made to either string. if the size is zero, `kscp()` copies as many bytes as it can before it either hits a NUL terminator in the source string or reaches the end of the destination string. if `dest.size` is zero, `kscp` simply copies until it hits the first NUL or reaches `src.ptr[src.size - 1]`. for safety reasons, `kscp` always terminates `dest` with a NUL when it has enough space to, even if neither string ended with a NUL. if a partial copy occurs, `kscp` returns a `kscond` of `kscond_partial`.
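
the rules above can be read as the following sketch. it is not libk's implementation: `demo_raw`, `demo_mut`, `demo_cond`, and `demo_cp` are hypothetical stand-ins, `ksmut` is assumed to be a size plus a writable pointer, and `len` is assumed to receive the number of bytes copied (this file does not say what it holds).

```c
#include <stddef.h>

typedef struct { size_t size; const char* ptr; } demo_raw; /* stand-in for ksraw */
typedef struct { size_t size; char* ptr; } demo_mut;       /* assumed shape of ksmut */
typedef enum { demo_ok, demo_partial, demo_fail } demo_cond;

demo_cond demo_cp(demo_raw src, demo_mut dest, size_t* len) {
	/* both sizes known and the source is longer: fail, touch nothing */
	if (src.size != 0 && dest.size != 0 && src.size > dest.size)
		return demo_fail;

	size_t i = 0;
	demo_cond result = demo_ok;
	/* stop at the end of a sized source or at the first NUL */
	while (!(src.size != 0 && i == src.size) && src.ptr[i] != '\0') {
		/* a bounded destination is full while source bytes remain */
		if (dest.size != 0 && i == dest.size) {
			result = demo_partial;
			break;
		}
		dest.ptr[i] = src.ptr[i];
		++i;
	}

	/* terminate whenever there is room (always, if dest is unbounded) */
	if (dest.size == 0 || i < dest.size) dest.ptr[i] = '\0';
	if (len != NULL) *len = i;
	return result;
}
```

note that an unbounded destination (`dest.size` of zero) places the burden of providing enough space entirely on the caller.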

## ksbufmk
`ksbuf* ksbufmk(void* where, kiochan channel, sz run)` initializes a new buffer at the specified address. `run` should be equivalent to the full length of the memory region minus `sizeof(struct ksbuf)` - in other words, the size of the string the `ksbuf` can hold. memory should be allocated by the user, either by creating an array on the stack or with `kmem` allocation functions. `ksbufmk()` returns a pointer to the new structure. the return value will always point to the same point in memory as `where`, but will be of the appropriate type to pass to buffer functions.

## ksbufput
`kcond ksbufput(ksbuf* b, ksraw str)` copies a string into a buffer with `kscp`, flushing it as necessary.

## ksbufflush
`kcond ksbufflush(ksbuf* b)` flushes a buffer to its assigned channel, emptying it and readying it for another write cycle. a buffer should almost always be flushed before it goes out of scope or is deallocated.
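
the three buffer functions fit together roughly as sketched below. everything here is a hypothetical stand-in: `demo_chan` is an in-memory sink used in place of `kiochan` so the sketch needs no real IO, the functions return `int` rather than `kcond`, and `demo_bufput` takes a pointer and length rather than a `ksraw`.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for kiochan: an in-memory sink */
typedef struct { char data[256]; size_t used; } demo_chan;

typedef struct {
	size_t run;         /* capacity of buf */
	demo_chan* channel; /* where flushed bytes go */
	char* cur;          /* next free byte in buf */
	char buf[];         /* storage follows the header */
} demo_buf;

/* set up a buffer header at `where`. the caller owns the memory, which
 * must be aligned for demo_buf and hold sizeof(demo_buf) + run bytes */
demo_buf* demo_bufmk(void* where, demo_chan* channel, size_t run) {
	demo_buf* b = where;
	b->run = run;
	b->channel = channel;
	b->cur = b->buf;
	return b;
}

/* write pending bytes to the sink and reset the buffer;
 * returns 0 on success, -1 if the sink has no room */
int demo_bufflush(demo_buf* b) {
	size_t n = (size_t)(b->cur - b->buf);
	if (n > sizeof b->channel->data - b->channel->used) return -1;
	memcpy(b->channel->data + b->channel->used, b->buf, n);
	b->channel->used += n;
	b->cur = b->buf;
	return 0;
}

/* append len bytes, flushing whenever the buffer fills up */
int demo_bufput(demo_buf* b, const char* str, size_t len) {
	if (b->run == 0) return -1; /* nothing can ever be buffered */
	while (len > 0) {
		size_t room = b->run - (size_t)(b->cur - b->buf);
		if (room == 0) {
			if (demo_bufflush(b) != 0) return -1;
			room = b->run;
		}
		size_t n = len < room ? len : room;
		memcpy(b->cur, str, n);
		b->cur += n;
		str += n;
		len -= n;
	}
	return 0;
}
```

in this sketch a put only flushes when the buffer is full, so the tail of the last write stays buffered until an explicit flush, which is why a buffer should be flushed before it is discarded.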

## kscomp
`char* kscomp(size_t ct, ksraw chain[], kmbuf* buf)` is a **string composition** function. it serves as an efficient, generalized replacement for functions like `strcat` and `strdup`.

to use kscomp, create an array of `kstr` and fill it with the strings you wish to concatenate. for example, to programmatically generate an HTML link tag, you might use the following code.

	char mem[512];
	kmptr text = <...>;
	char* src = <...>;
	kmbuf buf = { sizeof mem, &mem, kmkind_none };
	kstr chain[] = {
		Kstr("<a href=\""), { 0, src }, Kstr("\">"),
			ksref(text),
		Kstr("</a>")
	};
	char* html = kscomp(Kmsz(chain), chain, &buf);

kscomp calculates the lengths of individual strings only when they are not already known. when it needs to calculate the length of a string, it stores that length in the original array, so repeated calls do not need to recalculate it. this is not always desirable, so the variant `kscompc` exists, which is exactly the same as `kscomp` in every respect except that `chain` is not altered in any way.
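
the length-caching behaviour can be sketched as follows. this is an illustration under stated assumptions rather than libk's code: `demo_raw` and `demo_comp` are hypothetical, a plain `out`/`cap` pair takes the place of the `kmbuf`, and sizes exclude the terminator.

```c
#include <stddef.h>
#include <string.h>

typedef struct { size_t size; const char* ptr; } demo_raw; /* stand-in for ksraw */

/* concatenate ct strings into out (capacity cap, terminator included).
 * returns out, or NULL if the result does not fit. unknown lengths
 * (size == 0) are computed once and written back into chain. */
char* demo_comp(size_t ct, demo_raw chain[], char* out, size_t cap) {
	size_t total = 0;
	for (size_t i = 0; i < ct; ++i) {
		if (chain[i].size == 0 && chain[i].ptr != NULL)
			chain[i].size = strlen(chain[i].ptr); /* cache for later calls */
		total += chain[i].size;
	}
	if (total + 1 > cap) return NULL;

	char* cur = out;
	for (size_t i = 0; i < ct; ++i) {
		if (chain[i].size != 0)
			memcpy(cur, chain[i].ptr, chain[i].size);
		cur += chain[i].size;
	}
	*cur = '\0';
	return out;
}
```

because every length is known before any byte is copied, the sketch can reject an oversized result up front and then copy each piece exactly once, which is the advantage over chained `strcat` calls.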

# macros
if `KFclean` is not set when `<k/str.h>` is included, the following macros are defined.

 * `Kstr(string)` - the compile-time equivalent to `kstr()`. `Kstr` takes a literal string and inserts the text `{ sizeof (string), string }` into the document, suitable for initializing a kstr.
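
the expansion described above can be tried out with a stand-in macro. `DEMO_KSTR`, `demo_raw`, and `demo_greeting` are hypothetical names used only for this sketch, and the struct models just a size and a pointer.

```c
#include <stddef.h>

typedef struct { size_t size; const char* ptr; } demo_raw; /* size + pointer only */

/* hypothetical macro mirroring the expansion described above */
#define DEMO_KSTR(string) { sizeof (string), string }

/* usable in static initializers, since sizeof is a constant expression */
static const demo_raw demo_greeting = DEMO_KSTR("hello");
```

note that `sizeof` applied to a string literal counts the terminating null, so `demo_greeting.size` is 6 for the five-character string `"hello"`.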