Boron 2.1.0
Datatype String

Strings are stored using the single word per character Latin-1 and UCS-2 encodings (UR_ENC_LATIN1/UR_ENC_UCS2).

Macros

#define ur_cstr(strC, bin)
 Make a null-terminated UTF-8 string in a binary buffer.
 
#define ur_strFree   ur_arrFree
 A string is a simple array.
 

Functions

UIndex ur_makeString (UThread *ut, int enc, int size)
 Generate and initialize a single string buffer.
 
UBuffer * ur_makeStringCell (UThread *ut, int enc, int size, UCell *cell)
 Generate a single string and set cell to reference it.
 
UIndex ur_makeStringLatin1 (UThread *ut, const uint8_t *it, const uint8_t *end)
 Generate and initialize a single string buffer from memory holding a Latin-1 string.
 
void ur_strInitUtf8 (UBuffer *buf, const uint8_t *it, const uint8_t *end)
 Initialize a single string buffer from memory holding a UTF-8 string.
 
UIndex ur_makeStringUtf8 (UThread *ut, const uint8_t *it, const uint8_t *end)
 Generate and initialize a single string buffer from memory holding a UTF-8 string.
 
void ur_strInit (UBuffer *buf, int enc, int size)
 Initialize buffer to type UT_STRING.
 
void ur_strAppendChar (UBuffer *str, int uc)
 Append a single UCS2 character to a string.
 
void ur_strAppendCStr (UBuffer *str, const char *cstr)
 Append a null-terminated UTF-8 string to a string buffer.
 
void ur_strAppendInt (UBuffer *str, int32_t n)
 Append an integer to a string.
 
void ur_strAppendInt64 (UBuffer *str, int64_t n)
 Append a 64-bit integer to a string.
 
void ur_strAppendHex (UBuffer *str, uint32_t n, uint32_t hi)
 Append a hexadecimal integer to a string.
 
void ur_strAppendDouble (UBuffer *str, double n)
 Append a double to a string.
 
void ur_strAppendFloat (UBuffer *str, float n)
 Append a float to a string.
 
void ur_strAppendIndent (UBuffer *str, int depth)
 Append tabs to a string.
 
void ur_strAppend (UBuffer *str, const UBuffer *strB, UIndex itB, UIndex endB)
 Append another string buffer to this string.
 
void ur_strAppendBinary (UBuffer *str, const uint8_t *it, const uint8_t *end, enum UrlanBinaryEncoding enc)
 Append binary data as text of the specified encoding.
 
void ur_strTermNull (UBuffer *str)
 Terminate with null character so buffer can be used as a C string.
 
int ur_strIsAscii (const UBuffer *str)
 Test if all characters are ASCII.
 
void ur_strFlatten (UBuffer *str)
 Convert a UTF-8 or UCS-2 string buffer to Latin-1 if possible.
 
void ur_strLowercase (UBuffer *buf, UIndex start, UIndex send)
 Convert characters of string slice to lowercase.
 
void ur_strUppercase (UBuffer *buf, UIndex start, UIndex send)
 Convert characters of string slice to uppercase.
 
UIndex ur_strFindChar (const UBuffer *str, UIndex start, UIndex end, int ch, int opt)
 Find the first instance of a character in a string.
 
UIndex ur_strFindChars (const UBuffer *str, UIndex start, UIndex end, const uint8_t *charSet, int len)
 Find the first character of a set in a string.
 
UIndex ur_strFindCharsRev (const UBuffer *str, UIndex start, UIndex end, const uint8_t *charSet, int len)
 Find the last character of a set in a string.
 
UIndex ur_strFind (const USeriesIter *ai, const USeriesIter *bi, int matchCase)
 Find string in another string or binary series.
 
UIndex ur_strFindRev (const USeriesIter *ai, const USeriesIter *bi, int matchCase)
 Find the last occurrence of a string in another string or binary series.
 
UIndex ur_strMatch (const USeriesIter *ai, const USeriesIter *bi, int matchCase)
 Compare characters in two string or binary series.
 
int ur_strChar (const UBuffer *str, UIndex pos)
 Return the character at a given position.
 
char * ur_cstring (const UBuffer *str, UBuffer *bin, UIndex start, UIndex end)
 Make a null-terminated UTF-8 string in a binary buffer.
 

Detailed Description

Strings are stored using the single word per character Latin-1 and UCS-2 encodings (UR_ENC_LATIN1/UR_ENC_UCS2).

UTF-8 (UR_ENC_UTF8) is only handled by ur_strAppend() and ur_makeStringUtf8() in order to bring UTF-8 strings into or out of the datatype system.
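
To illustrate what bringing UTF-8 into a one-word-per-character representation involves, here is a minimal standalone sketch. It does not use Boron headers; the function name sketch_utf8_to_ucs2 and its fallback behavior are assumptions made for illustration, not Boron's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Boron: decode the UTF-8 bytes in
 * [it, end) into one 16-bit code unit per character. Only 1- to 3-byte
 * sequences (the UCS-2 range) are decoded; any other byte, including a
 * truncated sequence or a stray continuation byte, is written as '?'.
 * Returns the number of characters written to out (at most cap). */
static size_t sketch_utf8_to_ucs2(const uint8_t *it, const uint8_t *end,
                                  uint16_t *out, size_t cap)
{
    size_t n = 0;
    assert(it <= end);
    while (it != end && n < cap) {
        uint8_t b = *it++;
        uint16_t ch;
        if (b < 0x80) {
            ch = b;                                  /* ASCII */
        } else if ((b & 0xE0) == 0xC0 && it != end) {
            ch = (uint16_t)(((b & 0x1F) << 6) | (*it++ & 0x3F));
        } else if ((b & 0xF0) == 0xE0 && end - it >= 2) {
            ch = (uint16_t)(((b & 0x0F) << 12) |
                            ((it[0] & 0x3F) << 6) |
                            (it[1] & 0x3F));
            it += 2;
        } else {
            ch = '?';                                /* unsupported byte */
        }
        out[n++] = ch;
    }
    return n;
}
```

After decoding, every character occupies exactly one array element, so indexing by character position is a plain array access.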

Macro Definition Documentation

◆ ur_cstr

#define ur_cstr ( strC,
bin )
Value:
ur_cstring(ur_bufferSer(strC), bin, strC->series.it, strC->series.end)

Make a null-terminated UTF-8 string in a binary buffer.

This calls ur_cstring().

Parameters
strC    Valid UT_STRING or UT_FILE cell.
bin     Initialized binary buffer to use.

Function Documentation

◆ ur_cstring()

char * ur_cstring ( const UBuffer * str,
UBuffer * bin,
UIndex start,
UIndex end )

Make a null-terminated UTF-8 string in a binary buffer.

Parameters
str     Valid string buffer.
bin     Initialized binary buffer. The contents are replaced with the C string.
start   Start position in str.
end     End position in str. A negative number is the same as str->used.
Returns
Pointer to C string in bin.
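
The slice-to-C-string conversion can be pictured with the standalone sketch below. It works on a plain uint16_t array instead of a UBuffer; sketch_cstring, its output-capacity parameter, and the clamping of end are assumptions for illustration and do not reflect Boron's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Boron code: write characters [start, end) of a
 * UCS-2 array as null-terminated UTF-8 into out. A negative end means
 * "up to used", mirroring the end parameter described above. Encoding
 * stops early if out cannot hold another character plus the terminator.
 * Returns the byte length excluding the terminator. */
static size_t sketch_cstring(const uint16_t *str, int used, int start, int end,
                             char *out, size_t cap)
{
    size_t n = 0;
    assert(cap > 0 && start >= 0);
    if (end < 0 || end > used)
        end = used;
    for (int i = start; i < end && n + 4 <= cap; ++i) {
        unsigned ch = str[i];
        if (ch < 0x80) {
            out[n++] = (char)ch;
        } else if (ch < 0x800) {
            out[n++] = (char)(0xC0 | (ch >> 6));
            out[n++] = (char)(0x80 | (ch & 0x3F));
        } else {
            out[n++] = (char)(0xE0 | (ch >> 12));
            out[n++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (ch & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Because the result lives in a separate output buffer, the source string is left untouched, which matches the role of the bin parameter above.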

◆ ur_makeString()

UIndex ur_makeString ( UThread * ut,
int enc,
int size )

Generate and initialize a single string buffer.

If you need multiple buffers then ur_genBuffers() should be used.

The caller must create a UCell for this string in a held block before the next ur_recycle() or else it will be garbage collected.

Parameters
enc     Encoding type.
size    Number of characters to reserve.
Returns
Buffer id of string.

◆ ur_makeStringCell()

UBuffer * ur_makeStringCell ( UThread * ut,
int enc,
int size,
UCell * cell )

Generate a single string and set cell to reference it.

If you need multiple buffers then ur_genBuffers() should be used.

Parameters
enc     Encoding type.
size    Number of characters to reserve.
cell    Cell to initialize.
Returns
Pointer to string buffer.

◆ ur_makeStringLatin1()

UIndex ur_makeStringLatin1 ( UThread * ut,
const uint8_t * it,
const uint8_t * end )

Generate and initialize a single string buffer from memory holding a Latin-1 string.

Caret escape sequences are converted to individual characters. This calls ur_makeString() internally.

Parameters
it      Start of Latin-1 data.
end     End of Latin-1 data.
Returns
Buffer id of string.

◆ ur_makeStringUtf8()

UIndex ur_makeStringUtf8 ( UThread * ut,
const uint8_t * it,
const uint8_t * end )

Generate and initialize a single string buffer from memory holding a UTF-8 string.

Caret escape sequences are converted to individual characters.

Parameters
it      Start of UTF-8 data.
end     End of UTF-8 data.
Returns
Buffer id of string.

◆ ur_strAppend()

void ur_strAppend ( UBuffer * str,
const UBuffer * strB,
UIndex itB,
UIndex endB )

Append another string buffer to this string.

Parameters
str     Destination string.
strB    String to append.
itB     Start character of strB.
endB    End character of strB.

◆ ur_strAppendFloat()

void ur_strAppendFloat ( UBuffer * str,
float n )

Append a float to a string.

This emits fewer significant digits than ur_strAppendDouble().
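
The practical effect of printing fewer significant digits can be seen with the standard library alone. The sketch below is not Boron code: the helper names are hypothetical, and the precisions FLT_DIG and DBL_DIG are assumed values picked for illustration, since the source does not state which precisions Boron uses.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a float with FLT_DIG significant digits
 * (an assumed precision, not taken from Boron). */
static int sketch_format_float(char *out, size_t cap, float n)
{
    assert(cap > 0);
    return snprintf(out, cap, "%.*g", FLT_DIG, (double)n);
}

/* Hypothetical helper: format a double with DBL_DIG significant digits
 * (also an assumed precision). */
static int sketch_format_double(char *out, size_t cap, double n)
{
    assert(cap > 0);
    return snprintf(out, cap, "%.*g", DBL_DIG, n);
}
```

With these assumed precisions, the value 0.1f formats as "0.1" through the float path, while the same value widened to double and sent through the double path exposes the float rounding error as extra digits.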

◆ ur_strAppendIndent()

void ur_strAppendIndent ( UBuffer * str,
int depth )

Append tabs to a string.

Parameters
depth   Number of tabs to append.

◆ ur_strChar()

int ur_strChar ( const UBuffer * str,
UIndex pos )

Return the character at a given position.

If str->form is UR_ENC_UTF8, the return value is the byte at pos, not the decoded UCS-2 character.

Parameters
str     Valid string buffer.
pos     Character index. Pass negative numbers to index from the end (e.g. -1 will return the last character).
Returns
UCS2 value or -1 if pos is out of range.
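
The negative-index convention can be sketched on a plain array as follows. The function sketch_char_at is hypothetical and stands in for the behavior described above; it is not Boron's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Boron code: return the character at pos, where
 * a negative pos counts back from the end (-1 is the last character).
 * Returns -1 when the resolved position is out of range. */
static int sketch_char_at(const uint16_t *str, int used, int pos)
{
    assert(used >= 0);
    if (pos < 0)
        pos += used;            /* -1 becomes used - 1 */
    if (pos < 0 || pos >= used)
        return -1;
    return str[pos];
}
```

Returning int lets -1 serve as an out-of-range marker without colliding with any 16-bit character value.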

◆ ur_strFind()

UIndex ur_strFind ( const USeriesIter * ai,
const USeriesIter * bi,
int matchCase )

Find string in another string or binary series.

Parameters
ai          String/binary to search.
bi          Pattern to look for.
matchCase   If non-zero, the comparison is case-sensitive.
Returns
Index of pattern in string or -1 if not found.
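
As a rough picture of the search contract (index of the first match, or -1), here is a naive standalone sketch over byte strings. It takes pointer and length pairs instead of USeriesIter values, folds case with tolower() only, and is named sketch_find purely for illustration; Boron's actual algorithm may differ.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper, not Boron code: return the index of the first
 * occurrence of b in a, or -1 if there is none. When matchCase is zero,
 * characters are compared after tolower(). */
static int sketch_find(const char *a, int aLen, const char *b, int bLen,
                       int matchCase)
{
    assert(aLen >= 0 && bLen >= 0);
    for (int i = 0; i + bLen <= aLen; ++i) {
        int j = 0;
        while (j < bLen) {
            unsigned char ca = (unsigned char)a[i + j];
            unsigned char cb = (unsigned char)b[j];
            if (matchCase ? ca != cb : tolower(ca) != tolower(cb))
                break;
            ++j;
        }
        if (j == bLen)
            return i;           /* whole pattern matched at i */
    }
    return -1;
}
```

In this sketch an empty pattern matches at index 0; whether ur_strFind() behaves the same way is not stated in the source.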

◆ ur_strFindChar()

UIndex ur_strFindChar ( const UBuffer * str,
UIndex start,
UIndex end,
int ch,
int opt )

Find the first instance of a character in a string.

Parameters
str     Valid string/binary buffer.
start   Start index in str.
end     Ending index in str.
ch      Character to look for.
opt     Mask of UrlanFindOption (UR_FIND_CASE, UR_FIND_LAST).
Returns
First index where character is found or -1 if not found.

◆ ur_strFindChars()

UIndex ur_strFindChars ( const UBuffer * str,
UIndex start,
UIndex end,
const uint8_t * charSet,
int len )

Find the first character of a set in a string.

Parameters
str       Valid string buffer.
start     Start index in str.
end       Ending index in str.
charSet   Bitset of characters to look for.
len       Byte length of charSet.
Returns
First index where any characters in charSet are found or -1 if none are found.
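
A bitset-driven scan of this kind can be sketched as below. The bit layout (bit c & 7 of byte c >> 3) is an assumption, as are the helper names; the source only says that charSet is a bitset with a byte length.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: character c is in the set when bit (c & 7) of byte
 * (c >> 3) is set. Characters beyond the bitset are never members. */
static int sketch_bitset_has(const uint8_t *charSet, int len, unsigned c)
{
    return (int)(c >> 3) < len && (charSet[c >> 3] & (1u << (c & 7))) != 0;
}

/* Hypothetical helper for building a set under the same assumed layout. */
static void sketch_bitset_add(uint8_t *charSet, unsigned c)
{
    charSet[c >> 3] |= (uint8_t)(1u << (c & 7));
}

/* Hypothetical helper, not Boron code: return the first index in
 * [start, end) whose character is in charSet, or -1 if none is. */
static int sketch_find_chars(const uint8_t *str, int start, int end,
                             const uint8_t *charSet, int len)
{
    assert(start >= 0);
    for (int i = start; i < end; ++i)
        if (sketch_bitset_has(charSet, len, str[i]))
            return i;
    return -1;
}
```

A 32-byte set covers every Latin-1 character, and each membership test is a single shift and mask regardless of how many characters the set contains.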

◆ ur_strFindCharsRev()

UIndex ur_strFindCharsRev ( const UBuffer * str,
UIndex start,
UIndex end,
const uint8_t * charSet,
int len )

Find the last character of a set in a string.

Parameters
str       Valid string buffer.
start     Start index in str.
end       Ending index in str.
charSet   Bitset of characters to look for.
len       Byte length of charSet.
Returns
Last index where any characters in charSet are found or -1 if none are found.

◆ ur_strFindRev()

UIndex ur_strFindRev ( const USeriesIter * ai,
const USeriesIter * bi,
int matchCase )

Find the last occurrence of a string in another string or binary series.

Parameters
ai          String/binary to search.
bi          Pattern to look for.
matchCase   If non-zero, the comparison is case-sensitive.
Returns
Index of pattern in string or -1 if not found.

◆ ur_strFlatten()

void ur_strFlatten ( UBuffer * str)

Convert a UTF-8 or UCS-2 string buffer to Latin-1 if possible.

Parameters
str     Valid string buffer.
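
The "if possible" condition can be read as "every character fits in one byte". The sketch below shows that check for the UCS-2 case on plain arrays; sketch_flatten is a hypothetical name and the copy-to-a-second-array design is an assumption, since ur_strFlatten() operates on the buffer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Boron code: narrow n UCS-2 characters to
 * Latin-1 bytes in dst. Returns 1 on success, or 0 (leaving dst
 * untouched) if any character is above 0xFF and cannot be narrowed. */
static int sketch_flatten(const uint16_t *src, size_t n, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        if (src[i] > 0xFF)
            return 0;
    for (size_t i = 0; i < n; ++i)
        dst[i] = (uint8_t)src[i];
    return 1;
}
```

Checking every character before writing anything keeps the operation all-or-nothing, so a string with a single wide character is left in its original encoding.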

◆ ur_strInit()

void ur_strInit ( UBuffer * buf,
int enc,
int size )

Initialize buffer to type UT_STRING.

Parameters
buf     Uninitialized buffer.
enc     Encoding type.
size    Number of characters to reserve.

◆ ur_strInitUtf8()

void ur_strInitUtf8 ( UBuffer * buf,
const uint8_t * it,
const uint8_t * end )

Initialize a single string buffer from memory holding a UTF-8 string.

Caret escape sequences are converted to individual characters. This calls ur_strInit() internally.

Parameters
buf     Uninitialized buffer.
it      Start of UTF-8 data.
end     End of UTF-8 data.

◆ ur_strIsAscii()

int ur_strIsAscii ( const UBuffer * str)

Test if all characters are ASCII.

Parameters
str     Valid string buffer.
Returns
Non-zero if all characters are ASCII.

◆ ur_strLowercase()

void ur_strLowercase ( UBuffer * buf,
UIndex start,
UIndex send )

Convert characters of string slice to lowercase.

Parameters
buf     Pointer to valid string buffer.
start   Start position.
send    Slice end position.

◆ ur_strMatch()

UIndex ur_strMatch ( const USeriesIter * ai,
const USeriesIter * bi,
int matchCase )

Compare characters in two string or binary series.

Parameters
ai          String/binary slice A.
bi          String/binary slice B.
matchCase   If non-zero, the comparison is case-sensitive.
Returns
Number of characters which match in strings.
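
One plausible reading of the return value is the length of the common prefix of the two slices. The standalone sketch below implements that reading over byte strings; sketch_match is a hypothetical name, and the prefix interpretation is an assumption rather than something the source spells out.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper, not Boron code: count how many leading characters
 * of a and b are equal, stopping at the first difference or at the end
 * of the shorter slice. When matchCase is zero, tolower() is applied
 * to both sides before comparing. */
static int sketch_match(const char *a, int aLen, const char *b, int bLen,
                        int matchCase)
{
    int n = aLen < bLen ? aLen : bLen;
    int i = 0;
    assert(aLen >= 0 && bLen >= 0);
    for (; i < n; ++i) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (matchCase ? ca != cb : tolower(ca) != tolower(cb))
            break;
    }
    return i;
}
```

Under this reading, comparing "Boron" with "BORAX" yields 3 when case is ignored and 1 when it is not.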

◆ ur_strTermNull()

void ur_strTermNull ( UBuffer * str)

Terminate with null character so buffer can be used as a C string.

str->used is not changed.
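
The idea of a terminator that sits just past the counted characters can be sketched with a small stand-in buffer type. SketchBuf, its fields, and sketch_term_null are hypothetical; they only mimic the used/capacity split and are not the UBuffer layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a byte string buffer (not UBuffer). */
typedef struct {
    char *data;   /* heap storage */
    int   used;   /* number of characters in the string */
    int   avail;  /* number of bytes allocated */
} SketchBuf;

/* Hypothetical helper: make sure one spare byte exists after the last
 * character and store '\0' there. used is deliberately left alone, so
 * the terminator is not counted as part of the string.
 * Returns 1 on success, 0 if memory could not be reserved. */
static int sketch_term_null(SketchBuf *b)
{
    assert(b->used >= 0 && b->used <= b->avail);
    if (b->used == b->avail) {
        char *p = realloc(b->data, (size_t)b->avail + 1);
        if (p == NULL)
            return 0;
        b->data = p;
        b->avail += 1;
    }
    b->data[b->used] = '\0';
    return 1;
}
```

Since used does not move, a later append in this sketch would overwrite the terminator, so the buffer needs to be terminated again before it is next handed to C string functions.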

◆ ur_strUppercase()

void ur_strUppercase ( UBuffer * buf,
UIndex start,
UIndex send )

Convert characters of string slice to uppercase.

Parameters
buf     Pointer to valid string buffer.
start   Start position.
send    Slice end position.