File mod/kstr/kstr.md artifact 0b0860371c part of check-in 12a51d9c50


# kstr

**kstr** is the libk string library. it uses the **short** naming convention with the glyph `s`. **kstr** implies `#include <k/mem.h>`.

# types

## struct kstr
`struct kstr` is a structure for holding pascal strings (length-prefixed strings). it is the basic libk string type. **note:** if `ptr.ref` ≠ NULL and `size` = 0, the string's length is unknown. any function that operates on a kstr should calculate the length in that case and store the result in the object if possible.
 * `sz size` - length of string, excluding any null terminator
 * `kmptr ptr` - pointer to string in memory
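
the lazy length rule in the note above can be sketched as follows. this is a minimal illustration, not libk code: `demo_kmptr`, `struct demo_kstr` and `demo_kstr_len` are hypothetical stand-ins, and only the `ref` field of `kmptr` is modeled.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical stand-ins for the libk types; the real definitions
 * come from <k/mem.h> and <k/str.h> and may differ */
typedef size_t sz;
typedef struct { void* ref; } demo_kmptr;   /* assumed: only `ref` is modeled */
struct demo_kstr { sz size; demo_kmptr ptr; };

/* return the length of `s`, measuring and caching it when it is unknown */
sz demo_kstr_len(struct demo_kstr* s) {
	if (s->ptr.ref != NULL && s->size == 0)
		s->size = strlen((const char*)s->ptr.ref);   /* excludes the NUL */
	return s->size;
}
```

under this scheme an empty string with a non-NULL pointer is measured again on every call, since its cached size is still zero.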

## struct ksraw
`struct ksraw` is like `kstr` except it uses raw `char` pointers instead of a `kmptr`.
 * `sz size` - length of string, excluding any null terminator
 * `char* ptr` - pointer to string in memory

## struct ksbuf
`struct ksbuf` is a structure used for buffered IO.
 * `sz run` - maximum size of buffer, including any null terminator
 * `kiochan channel` - the channel that output will be written to when flushed
 * `char* cur` - a pointer to the current position in `buf`, which tracks how much of the buffer is in use
 * `char buf []` - region of memory to store buffer in

## struct kschain
`struct kschain` is a structure used for string accumulators that works by aggregating pointers to strings, instead of copying the strings themselves.
 * `kschain_kind kind` - kind of chain
 * `kmkind rule` - kind of allocation to use if `kind` ≠ `kschain_kind_linked`
 * `pstr* ptrs` - pointer to pointer list
 * `sz ptrc` - number of pointers
 * `sz size` - total amount of space in `ptrs`

### enum kschain_kind
 * `kschain_kind_block` - occupies a single block of memory
 * `kschain_kind_linked` - uses a linked list, allocated and deallocated as necessary

## enum ksalloc
`enum ksalloc` is an enumeration that tells libk what strategy to use when filling a `kschain` struct.
 * `ksalloc_static` - do not allocate memory, fill an already-allocated, statically-sized array.
 * `ksalloc_alloc` - allocate a string in memory using the specified kind of allocator.
 * `ksalloc_dynamic` - fill an already-allocated array if possible, allocate a string in memory if the string length exceeds available space.

# functions

## kssz
`size_t kssz(char* str, size_t max)` returns the number of characters in a C string, **including** the final null. will count at most `max` characters if `max` > 0.
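
a rough sketch of the counting rule, for illustration only. `demo_kssz` is a hypothetical name, and the way the limit interacts with the final NUL (the count simply stops at `max`) is an assumption, not something the text above specifies.

```c
#include <stddef.h>

/* sketch only: not the libk implementation */
size_t demo_kssz(const char* str, size_t max) {
	size_t n = 0;
	while (str[n] != '\0') {
		++n;
		if (max > 0 && n >= max) return n;   /* limit reached before the NUL */
	}
	return n + 1;   /* count the terminating NUL as well */
}
```

with these assumptions, `demo_kssz("abc", 0)` yields 4 and `demo_kssz("abc", 2)` yields 2.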

## kstr
`kstr kstr(char* str, size_t max)` takes a C string and returns a P-string, calculating the length of `str` and storing it in the return value. `max` works as in `kssz`.

## kstoraw
`ksraw kstoraw(kstr)` is a simple convenience function that returns the `ksraw` form of a `kstr`.

## kscp
`kscond kscp(ksraw src, ksmut dest, sz* len)` copies the string pointed to by `src` into `dest`. its behavior depends on which sizes are known:

 * if `src.size` is nonzero, the size is already known, and an attempt to copy a longer string on top of a shorter one fails immediately with no changes made to either string.
 * if `src.size` is zero, `kscp()` copies as many bytes as it can before it either hits a NUL terminator in the source string or reaches the end of the destination string.
 * if `dest.size` is zero, `kscp` simply copies until it hits the first NUL or reaches `src.ptr[src.size - 1]`.

for safety reasons, `kscp` always terminates `dest` with a NUL when it has enough space to, even if neither string ended with a NUL. if a partial copy occurs, `kscp` returns a `kscond` of `kscond_partial`.
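
the copy rules above can be sketched roughly as follows. everything named `demo_*` is a hypothetical stand-in, including the layout of the mutable string and the `demo_cond_ok` and `demo_cond_fail` values. the sketch leaves out the `len` parameter, which the text does not describe, and it assumes that a NUL ends the source in every case.

```c
#include <stddef.h>

typedef size_t sz;
struct demo_raw { sz size; const char* ptr; };   /* stand-in for ksraw */
struct demo_mut { sz size; char* ptr; };         /* stand-in for ksmut (assumed layout) */
enum demo_cond { demo_cond_ok, demo_cond_partial, demo_cond_fail };

enum demo_cond demo_kscp(struct demo_raw src, struct demo_mut dest) {
	/* both sizes known and the source is longer: refuse before touching anything */
	if (src.size != 0 && dest.size != 0 && src.size > dest.size)
		return demo_cond_fail;

	sz i = 0;
	enum demo_cond result = demo_cond_ok;
	for (;;) {
		if (src.size != 0 && i == src.size) break;   /* end of known-length source */
		if (src.ptr[i] == '\0') break;               /* NUL ends the source (assumption) */
		if (dest.size != 0 && i == dest.size) {      /* destination is full */
			result = demo_cond_partial;
			break;
		}
		dest.ptr[i] = src.ptr[i];
		++i;
	}
	/* terminate only when there is known room, or when the size is unknown */
	if (dest.size == 0 || i < dest.size) dest.ptr[i] = '\0';
	return result;
}
```

when `dest.size` is zero the sketch cannot check the destination's capacity at all, so the caller has to guarantee that the buffer is large enough.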

## ksbufmk
`ksbuf* ksbufmk(void* where, kiochan channel, sz run)` initializes a new buffer at the specified address. `run` should be equivalent to the full length of the memory region minus `sizeof(struct ksbuf)` - in other words, the size of the string the `ksbuf` can hold. memory should be allocated by the user, either by creating an array on the stack or with `kmem` allocation functions. `ksbufmk()` returns a pointer to the new structure. the return value will always point to the same point in memory as `where`, but will be of the appropriate type to pass to buffer functions.

## ksbufput
`kcond ksbufput(ksbuf* b, ksraw str)` copies a string into a buffer with `kscp`, flushing it as necessary.

## ksbufflush
`kcond ksbufflush(ksbuf* b)` flushes a buffer to its assigned channel, emptying it and readying it for another write cycle. a buffer should almost always be flushed before it goes out of scope or is deallocated.
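
the make, put, flush cycle can be sketched as below. this is not libk code. `demo_sink` is a hypothetical in-memory stand-in for `kiochan` so that the sketch runs on its own, the `demo_*` functions return nothing instead of a `kcond`, `memcpy` stands in for `kscp`, and NUL termination of the buffer is ignored. the region is sized as `sizeof(struct demo_ksbuf) + run`, following the description of `ksbufmk`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef size_t sz;

/* hypothetical stand-in for kiochan: collects flushed bytes in memory */
typedef struct { char data[256]; sz used; } demo_sink;

struct demo_ksbuf {
	sz run;               /* capacity of buf */
	demo_sink* channel;   /* where flushed bytes go */
	char* cur;            /* next free byte in buf */
	char buf[];           /* storage, placed directly after the header */
};

struct demo_ksbuf* demo_bufmk(void* where, demo_sink* channel, sz run) {
	struct demo_ksbuf* b = where;   /* same address, buffer type */
	b->run = run;
	b->channel = channel;
	b->cur = b->buf;                /* empty buffer */
	return b;
}

void demo_bufflush(struct demo_ksbuf* b) {
	sz n = (sz)(b->cur - b->buf);
	sz space = sizeof b->channel->data - b->channel->used;
	if (n > space) n = space;       /* the demo sink is finite; drop the overflow */
	memcpy(b->channel->data + b->channel->used, b->buf, n);
	b->channel->used += n;
	b->cur = b->buf;                /* ready for another write cycle */
}

void demo_bufput(struct demo_ksbuf* b, const char* str, sz len) {
	if (b->run == 0) return;        /* nothing can be buffered */
	while (len > 0) {
		sz room = b->run - (sz)(b->cur - b->buf);
		if (room == 0) { demo_bufflush(b); room = b->run; }
		sz n = len < room ? len : room;
		memcpy(b->cur, str, n);
		b->cur += n; str += n; len -= n;
	}
}

/* usage: returns the number of bytes that reached the sink */
sz demo_buf_usage(demo_sink* out) {
	sz run = 8;
	void* region = malloc(sizeof(struct demo_ksbuf) + run);
	if (region == NULL) return 0;
	out->used = 0;
	struct demo_ksbuf* b = demo_bufmk(region, out, run);
	demo_bufput(b, "hello, ", 7);
	demo_bufput(b, "world", 5);     /* overflows the 8-byte buffer, forcing a flush */
	demo_bufflush(b);               /* flush before the region is released */
	free(region);
	return out->used;
}
```

`malloc` is used here only to keep the sketch portable. with libk the region would come from the stack or from `kmem` allocation functions.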

## kscomp
`char* kscomp(size_t ct, ksraw chain[], kmbuf* buf)` is a **string composition** function. it serves as an efficient, generalized replacement for functions like `strcat` and `strdup`.

to use kscomp, create an array of `ksraw` and fill it with the strings you wish to concatenate. for example, to programmatically generate an HTML link tag, you might use the following code.

	char mem[512];       /* backing storage for the composed string */
	kmptr text = <...>;  /* link text */
	char* src = <...>;   /* link target; its length is not known in advance */
	kmbuf buf = { sizeof mem, &mem, kmkind_none };
	ksraw chain[] = {
		/* a size of 0 asks kscomp to measure src itself */
		Kstr("<a href=\""), { 0, src }, Kstr("\">"),
			ksref(text),
		Kstr("</a>")
	};
	char* html = kscomp(Kmsz(chain), chain, &buf);

kscomp will only calculate the length of individual strings if they are not already known. when it needs to calculate the length of a string, it will store that length in the original array so repeated calls can be made without needing to repeatedly calculate the lengths. this is not always desirable, so the variant `kscompc` exists, which is exactly the same as `kscomp` in every respect except that `chain` is not altered in any way.
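
the length caching described above can be sketched as follows. this is an illustration under simplifying assumptions, not the libk implementation: `demo_comp`, `struct demo_raw` and `DEMO_STR` are hypothetical stand-ins, and a plain `char` buffer with an explicit capacity replaces `kmbuf`.

```c
#include <stddef.h>
#include <string.h>

typedef size_t sz;
struct demo_raw { sz size; const char* ptr; };   /* stand-in for ksraw */

/* stand-in for Kstr: length of a string literal, known at compile time */
#define DEMO_STR(lit) { sizeof (lit) - 1, (lit) }

/* concatenate `ct` strings into `out`; returns NULL if they do not fit */
char* demo_comp(sz ct, struct demo_raw chain[], char* out, sz cap) {
	sz total = 0;
	for (sz i = 0; i < ct; ++i) {
		if (chain[i].size == 0 && chain[i].ptr != NULL)
			chain[i].size = strlen(chain[i].ptr);   /* cached for later calls */
		total += chain[i].size;
	}
	if (total + 1 > cap) return NULL;   /* need room for the final NUL */

	char* cur = out;
	for (sz i = 0; i < ct; ++i) {
		if (chain[i].size > 0) memcpy(cur, chain[i].ptr, chain[i].size);
		cur += chain[i].size;
	}
	*cur = '\0';
	return out;
}
```

because the measured lengths are written back into `chain`, a second call with the same array skips the `strlen` pass. a non-mutating variant in the spirit of `kscompc` would keep the measured lengths in local variables instead.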

## ksemit
`ksemit(sz len, ksraw* array, kiochan channel)` takes a `len`-length `array` of `ksraw`s and prints them to a channel. a buffer will be allocated based on the total length of the strings to avoid unnecessary write calls.

## ksemitc
`ksemitc(const char** array, sz bufsz, kiochan channel)` takes a null-terminated `array` of NUL-terminated strings and buffer-prints them to a channel. `bufsz` controls the size of the buffer used, and should be as close as possible to the size of the strings emitted. the buffer will be kept on the stack, so no memory management or cleanup is necessary.

# macros
if `KFclean` is not set when `<k/str.h>` is included, the following macros are defined.

 * `Kstr(string)` - the compile-time equivalent to `kstr()`. `Kstr` takes a string literal and expands to the text `{ sizeof (string) - 1, string }`, suitable for initializing a kstr.
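
a small sketch of why the `sizeof` expression gives the right length. `DEMO_KSTR` and `struct demo_raw` are hypothetical stand-ins with an assumed size-then-pointer layout, chosen so that the example compiles without libk.

```c
#include <stddef.h>

struct demo_raw { size_t size; const char* ptr; };   /* assumed layout */

/* same expansion as described for Kstr, under a different name */
#define DEMO_KSTR(string) { sizeof (string) - 1, string }

/* sizeof "hello" is 6 (five characters plus the NUL), so size becomes 5 */
static const struct demo_raw demo_greeting = DEMO_KSTR("hello");

size_t demo_greeting_len(void) { return demo_greeting.size; }
```

this only works for string literals and `char` arrays. given a `char*`, `sizeof` would measure the pointer rather than the text, so strings of unknown length should go through `kstr()` instead.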