File mod/kstr/kstr.md artifact 0b0860371c part of check-in 8d478e0b3c


kstr

kstr is the libk string library. it uses the short naming convention with the glyph s. kstr implies #include <k/mem.h>.

types

struct kstr

struct kstr is a structure for holding pascal strings (length-prefixed strings). it is the basic libk string type. note: if ptr.ref ≠ NULL and size = 0, the string's length is unknown and should be calculated by any function that operates on a kstr, storing the result in the object if possible.

  • sz size - length of string, excluding any null terminator
  • kmptr ptr - pointer to string in memory

struct ksraw

struct ksraw is like kstr except it uses raw char pointers instead of a kmptr.

  • sz size - length of string, excluding any null terminator
  • char* ptr - pointer to string in memory

struct ksbuf

struct ksbuf is a structure used for buffered IO.

  • sz run - maximum size of buffer, including any null terminator
  • kiochan channel - the channel that output will be written to when flushed
  • char* cur - a pointer that tracks the length of the buffer
  • char buf [] - region of memory to store buffer in

struct kschain

struct kschain is a structure used for string accumulators that works by aggregating pointers to strings, instead of copying the strings themselves.

  • kschain_kind kind - kind of chain
  • kmkind rule - kind of allocation to use if kind = kschain_kind_linked
  • pstr* ptrs - pointer to pointer list
  • sz ptrc - number of pointers
  • sz size - total amount of space in ptrs

enum kschain_kind

  • kschain_kind_block - occupies a single block of memory
  • kschain_kind_linked - uses a linked list, allocated and deallocated as necessary

enum ksalloc

enum ksalloc is an enumeration that tells libk what strategy to use when filling a kschain struct.

  • ksalloc_static - do not allocate memory, fill an already-allocated, statically-sized array.
  • ksalloc_alloc - allocate a string in memory using the specified kind of allocator.
  • ksalloc_dynamic - fill an already-allocated array if possible, allocate a string in memory if the string length exceeds available space.

functions

kssz

size_t kssz(char* str, size_t max) returns the number of characters in a C string, including the final null. will count at most max characters if max > 0.
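the counting rule can be sketched in plain C as follows. this is not libk's implementation: sketch_kssz is a hypothetical name, and the sketch assumes that max itself is returned when the limit is reached before a NUL is seen.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-in for kssz(): counts the characters of a C string
 * up to and including the terminating NUL. if max > 0, at most max
 * characters are counted (assumption: max is returned in that case). */
size_t sketch_kssz(const char* str, size_t max) {
	assert(str != NULL);
	size_t n = 0;
	for (;;) {
		if (max > 0 && n == max) return n; /* limit reached, no NUL seen */
		if (str[n++] == '\0') return n;    /* the count includes the NUL */
	}
}
```

under these assumptions, sketch_kssz("abc", 0) counts four characters (three letters plus the NUL), while sketch_kssz("abc", 2) stops at two.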

kstr

kstr kstr(char* str, size_t max) takes a C string and returns a P-string, calculating the length of str and storing it in the return value. max works as in kssz.

kstoraw

ksraw ksref(kstr) is a simple convenience function that returns the ksraw form of a kstr.

kscp

kscond kscp(ksraw src, ksmut dest, sz* len) copies the string pointed to by src into dest. its behavior varies depending on the value of src.size:

  • if the size is already known, attempts to copy a longer string on top of a shorter one will immediately fail with no changes made to either string.
  • if the size is set to zero, kscp() will copy as many bytes as it can before it either hits a NUL terminator in the source string or reaches the end of the destination string.

if dest.size is zero, kscp simply copies until it hits the first NUL or reaches src.ptr[src.size - 1]. for safety reasons, kscp always terminates dest with a NUL when it has enough space to, even if neither string ended with a NUL. if a partial copy occurs, kscp will return a kscond of kscond_partial.
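the copy rules for kscp can be sketched as a single loop. everything named sketch_ below is a hypothetical stand-in, not libk's code. two details are assumptions because the text does not specify them: that len receives the number of bytes copied, and that a destination of unknown size has room for the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-ins for ksraw, ksmut and kscond */
typedef struct { size_t size; const char* ptr; } sketch_raw;
typedef struct { size_t size; char* ptr; } sketch_mut;
typedef enum { sketch_ok, sketch_partial, sketch_fail } sketch_cond;

sketch_cond sketch_kscp(sketch_raw src, sketch_mut dest, size_t* len) {
	assert(src.ptr != NULL && dest.ptr != NULL);

	/* both sizes known and the source is longer: fail, touching nothing */
	if (src.size && dest.size && src.size > dest.size) return sketch_fail;

	size_t i = 0;
	for (;;) {
		if (src.size && i == src.size) break;   /* end of a sized source */
		if (dest.size && i == dest.size) break; /* destination is full */
		if (src.ptr[i] == '\0') break;          /* NUL in the source */
		dest.ptr[i] = src.ptr[i];
		++i;
	}

	/* the copy is complete if the source was exhausted, not just dest */
	int done = (src.size && i == src.size) || src.ptr[i] == '\0';

	/* terminate when there is room; an unknown dest size is assumed
	 * to have room (assumption of this sketch) */
	if (dest.size == 0 || i < dest.size) dest.ptr[i] = '\0';

	if (len != NULL) *len = i; /* assumption: len reports bytes copied */
	return done ? sketch_ok : sketch_partial;
}
```

copying an unsized "hello" into an 8-byte destination yields sketch_ok with a NUL appended, while copying it into a 3-byte destination yields sketch_partial with no room left for a terminator.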

ksbufmk

ksbuf* ksbufmk(void* where, kiochan channel, sz run) initializes a new buffer at the specified address. run should be equivalent to the full length of the memory region minus sizeof(struct ksbuf) - in other words, the size of the string the ksbuf can hold. memory should be allocated by the user, either by creating an array on the stack or with kmem allocation functions. ksbufmk() returns a pointer to the new structure. the return value will always point to the same point in memory as where, but will be of the appropriate type to pass to buffer functions.

ksbufput

kcond ksbufput(ksbuf* b, ksraw str) copies a string into a buffer with kscp, flushing it as necessary.

ksbufflush

kcond ksbufflush(ksbuf* b) flushes a buffer to its assigned channel, emptying it and readying it for another write cycle. a buffer should almost always be flushed before it goes out of scope or is deallocated.
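the make, put, flush cycle described for ksbuf can be sketched with standard C alone. the sketch_ names are hypothetical, a FILE* stands in for kiochan, an int stands in for kcond, and the sketch ignores null terminators when computing free space. it illustrates the control flow only and makes no claim about how libk implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* hypothetical stand-in for struct ksbuf */
typedef struct sketch_buf {
	size_t run;     /* capacity of buf in bytes */
	FILE*  channel; /* stand-in for a kiochan */
	char*  cur;     /* next free byte in buf */
	char   buf[];   /* storage follows the header in the same region */
} sketch_buf;

/* initializes a buffer inside caller-provided memory and returns it;
 * where must be suitably aligned and at least sizeof(sketch_buf) + run */
sketch_buf* sketch_bufmk(void* where, FILE* channel, size_t run) {
	assert(where != NULL);
	sketch_buf* b = where;
	b->run = run;
	b->channel = channel;
	b->cur = b->buf;
	return b;
}

/* writes the pending bytes to the channel and empties the buffer */
int sketch_bufflush(sketch_buf* b) {
	size_t used = (size_t)(b->cur - b->buf);
	size_t wrote = fwrite(b->buf, 1, used, b->channel);
	b->cur = b->buf;
	return wrote == used ? 0 : -1;
}

/* appends len bytes, flushing whenever the buffer fills up */
int sketch_bufput(sketch_buf* b, const char* str, size_t len) {
	while (len > 0) {
		size_t room = b->run - (size_t)(b->cur - b->buf);
		if (room == 0) {
			if (sketch_bufflush(b) != 0) return -1;
			continue;
		}
		size_t n = len < room ? len : room;
		memcpy(b->cur, str, n);
		b->cur += n; str += n; len -= n;
	}
	return 0;
}
```

with a run of 8, putting the 11 bytes of "hello world" triggers one flush along the way and leaves 3 bytes pending, which is why a final flush is needed before the memory goes away.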

kscomp

char* kscomp(size_t ct, ksraw chain[], kmbuf* buf) is a string composition function. it serves as an efficient, generalized replacement for functions like strcat and strdup.

to use kscomp, create an array of kstr and fill it with the strings you wish to concatenate. for example, to programmatically generate an HTML link tag, you might use the following code.

char mem[512];
kmptr text = <...>;
char* src = <...>;
kmbuf buf = { sizeof mem, &mem, kmkind_none };
kstr chain[] = {
	Kstr("<a href=\""), { 0, src }, Kstr("\">"),
		ksref(text),
	Kstr("</a>")
};
char* html = kscomp(Kmsz(chain), chain, &buf);

kscomp will only calculate the length of individual strings if they are not already known. when it needs to calculate the length of a string, it will store that length in the original array so repeated calls can be made without needing to repeatedly calculate the lengths. this is not always desirable, so the variant kscompc exists, which is exactly the same as kscomp in every respect except that chain is not altered in any way.
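the difference between the caching and non-caching variants can be sketched in standard C. sketch_raw and sketch_comp are hypothetical stand-ins, the cache flag is an invention of this sketch to show both behaviors in one function, and a plain char array with a capacity replaces kmbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for ksraw; size 0 means "length not yet known" */
typedef struct { size_t size; const char* ptr; } sketch_raw;

/* concatenates ct strings into out (capacity cap, NUL included).
 * unknown lengths are measured with strlen; if cache is nonzero they
 * are written back into chain, as described for kscomp, otherwise
 * chain is left untouched, as described for kscompc.
 * returns out, or NULL if out is too small. */
char* sketch_comp(size_t ct, sketch_raw chain[], char* out, size_t cap, int cache) {
	assert(chain != NULL && out != NULL);
	size_t total = 0;
	for (size_t i = 0; i < ct; ++i) {
		size_t n = chain[i].size ? chain[i].size : strlen(chain[i].ptr);
		if (cache) chain[i].size = n; /* remember the measured length */
		total += n;
	}
	if (total + 1 > cap) return NULL;

	char* cur = out;
	for (size_t i = 0; i < ct; ++i) {
		size_t n = chain[i].size ? chain[i].size : strlen(chain[i].ptr);
		memcpy(cur, chain[i].ptr, n);
		cur += n;
	}
	*cur = '\0';
	return out;
}
```

calling the sketch with cache set to 0 leaves every zero length in chain as it was, while a call with cache set to 1 fills them in so that later calls skip the strlen.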

ksemit

ksemit(sz len, ksraw* array, kiochan channel) takes a len-length array of ksraws and prints them to a channel. a buffer will be allocated based on the total length of the strings to avoid unnecessary write calls.

ksemitc

ksemitc(const char** array, sz bufsz, kiochan channel) takes a null-terminated array of NUL-terminated strings and buffer-prints them to a channel. bufsz controls the size of the buffer used, and should be as close as possible to the size of the strings emitted. the buffer will be kept on the stack, so no memory management or cleanup is necessary.
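a rough sketch of the ksemitc approach, using a FILE* in place of kiochan. sketch_emitc is a hypothetical name, the byte-count return value is an assumption made so the result can be checked, and the stack buffer is a variable-length array, which is an optional feature in C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* hypothetical stand-in for ksemitc(): prints a NULL-terminated array
 * of NUL-terminated strings through a stack buffer of bufsz bytes.
 * returns the number of bytes written (assumption of this sketch). */
size_t sketch_emitc(const char** array, size_t bufsz, FILE* channel) {
	assert(array != NULL && bufsz > 0);
	char buf[bufsz]; /* lives on the stack, so nothing needs freeing */
	size_t used = 0, total = 0;

	for (; *array != NULL; ++array) {
		for (const char* c = *array; *c != '\0'; ++c) {
			if (used == bufsz) { /* buffer full: write it out */
				total += fwrite(buf, 1, used, channel);
				used = 0;
			}
			buf[used++] = *c;
		}
	}
	total += fwrite(buf, 1, used, channel); /* write whatever remains */
	return total;
}
```

the closer bufsz is to the combined length of the strings, the fewer writes occur: a bufsz at least as large as the total output results in a single fwrite call that carries data.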

macros

if KFclean is not set when <k/str.h> is included, the following macros are defined.

  • Kstr(string) - the compile-time equivalent to kstr(). Kstr takes a literal string and inserts the text { sizeof (string) - 1, string } into the source, suitable for initializing a kstr.
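the expansion can be tried out with a stand-in type. sketch_str and SKETCH_KSTR are hypothetical names, and the pointer field is a plain const char* here rather than the kmptr that the real kstr uses.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical stand-in for struct kstr, with a raw pointer field */
typedef struct { size_t size; const char* ptr; } sketch_str;

/* same expansion as described for Kstr(): sizeof applied to a string
 * literal counts the terminating NUL, so subtracting 1 gives the length */
#define SKETCH_KSTR(string) { sizeof (string) - 1, string }

/* the length is computed entirely at compile time */
static_assert(sizeof "hello" - 1 == 5, "sizeof counts the NUL");

/* the braces form an initializer list, so the macro is only usable
 * where an initializer is allowed */
static const sketch_str sketch_greeting = SKETCH_KSTR("hello");
```

because the length comes from sizeof, the argument has to be a literal or an array. passing a char* would yield the size of the pointer minus one instead of the string length.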