19.0.0 released Nov 08, 2023
Split string

Purpose: Split a string into pieces based on a delimiter.

split-string \
    ( <string> with <delimiter> to [ define ] <result> \
    | delete <result> )

split-string will find all instances of <delimiter> in <string> and then split it into pieces. The <result> is a pointer to a variable of type "vely_split_str" and it has two members: integer variable "num_pieces" which is the number of pieces <string> was broken into, and string array "pieces", which holds the actual pieces, each null-terminated. If variable <result> does not exist, you can create it with define clause. A delimiter within double quotes ("..") is not counted, i.e. it is skipped.

Note that <string> is altered by placement of null-terminators and will not hold the same value (rather it will hold only the leading portion of what it did before split-string took place). Each element of "pieces" array points to memory occupied by <string>. Hence, split-string does not copy any data and is very fast in performing the kind of parsing described here. You can copy string beforehand if you don't want it altered (see copy-string).

All pieces produced will be trimmed both on left and right. If a piece is double quoted, then double quotes are removed. For instance:
char clist[] = "a , b, \"c , d\" , e"
split-string clist with "," to define res

After this, the variable "res" will be an array of strings with these values:
res->num_pieces is 4
res->pieces[0] points to "a"
res->pieces[1] points to "b"
res->pieces[2] points to "c , d"
res->pieces[3] points to "e"

Also, since <string> is altered, it cannot be a constant - rather it must always be a variable, for example, if you do this with the intention to split this string based on "," as a delimiter:
char *str = "string,to,split";

your program will report an error (SIGSEGV most likely, or segmentation fault). You should do:
char str[] = "string,to,split";

split-string is useful for parsing CSV (Comma Separated Values) or any other kind of separated values, where separator can be any string of any length, for example if you're parsing an encoded URL-string, then "&amp;" may be a separator, as in the example below.
Allocated internals
<result> is allocated memory along with additional internal memory, which can be released if "delete" clause is used on a <result> from a previously executed split-string. See memory-handling for more on when (not) to delete memory explicitly like this; the same rules apply as for delete-mem.
Examples
The following will parse a string containing name/value pairs (such as "name=value") separated by string "&amp;":
// Data to parse - data/value pairs delimited by "&amp;" string, and data and value delimited by equal sign:
char instr[]="x=23&amp;y=good&amp;z=hello_world";

// Split string first into pieces based on "amp;"
split-string instr with "&amp;" to define assignment

// For each of name=value pairs, split it with equal sign
num i;
for (i = 0; i < assignment->num_pieces; i++) {
    split-string assignment->pieces[i] with "=" to define data
    pf-out "Variable %s has value %s\n", data->pieces[0], data->pieces[1]
}

The result is:
Variable x has value 23
Variable y has value good
Variable z has value hello world

See also
Strings
copy-string  
count-substring  
lower-string  
num-string  
split-string  
trim-string  
upper-string  
write-string    
See all
documentation


You are free to copy, redistribute and adapt this web page (even commercially), as long as you give credit and provide a dofollow link back to this page - see full license at CC-BY-4.0. Copyright (c) 2019-2023 Dasoftver LLC. Vely and elephant logo are trademarks of Dasoftver LLC. The software and information on this web site are provided "AS IS" and without any warranties or guarantees of any kind. Icons from table-icons.io copyright Paweł Kuna, licensed under MIT license.