19.0.0 released Nov 08, 2023
Using trees for faster in-memory data access


A tree is a data structure that organizes key/value pairs into branches based on some ordering criteria. Each node then branches further down into lower nodes, and the entire structure resembles a tree, hence the name. There are many different kinds of trees, such as AVL, B, B++, Red-Black tree etc. They generally differ in a way the tree is maintained and the way data in it is connected, but most of those widely used today generally provide O(log(N)) access. That means in a tree holding N nodes, you need to access about log(N) of them to find any given key. So for example, if you have 1,000,000 keys/value pairs, it will take about 20 key comparisons to find any of them (since 2^20 is approximately 1M).

As an example of a tree structure, consider this:

Vely

In this tree, keys "3", "4", "5", "6", "7", "8" and "9" are stored. An arrow to the left points to a lesser key, while an arrow to the right points to a greater one. You can see how it takes at most 3 key comparisons to find any of the keys, since the tree is 3-level deep. Also, this is a binary tree, as each node has 2 children nodes branching from it (left and right). A tree can have multiple keys per node (typically B-variations), and those are generally well suited for storing data on disk, where access to any data is done in large blocks; such trees work well in-memory as well. However, binary trees are also well suited for in-memory access, if not even more so. Vely uses a modified AVL binary tree that also provides some benefits of various "B" trees, specifically fast range searches; additionally such benefits are optional so you have more options in order to make a tree structure fit your exact needs.

A tree above is a balanced one, which means that the heights of each node's children do not differ by more than one (in this case they are exactly the same). A balanced tree, just like its name suggests, looks "balanced" when shown in its graphical form like it is here. In order to achieve this kind of balance, a tree needs to be maintained with each insert and delete. For instance, if you were to insert keys above in an increasing order, your tree may initially look like this, simply because each insertion goes to the right because each next key is greater:

Vely

This binary tree effectively degenerates into a sorted list, as each node is followed by another to the right, due to each being greater than the previous. Searching a tree like this is O(N), meaning you may have to traverse all nodes in worst cases to reach a key/value pair. To balance a tree, connections between the nodes are re-arranged. This is a relatively fast operation, since it doesn't involve copying keys nor values; instead only the direction where a left or right connection goes from a node is changed. For instance:

Vely

In this case the tree was rotated to the left around the element with the key of "6". Now, if you right-rotate the left branch around key "4", and if you left-rotate the right branch around key "8", you will get the original balanced tree.

In practicality, these rotations happen when each element is inserted or deleted, and not after you already have a number of key/value pairs in the tree. The above is just to illustrate what could happen if rotations like the above weren't applied, and the effect they have in order to balance the tree. Note that trees can be balanced in different ways; in this case what you're balancing is the height of the tree from any given node.

AVL tree is a binary tree where the difference between a sub-tree on the right and on the left is at most 1. Vely's tree adds (optionally) a linked list that connects all nodes in order from minimum to maxium key; this capability is impotant because it allows very fast range searches; this means starting from a given key and moving to the next lower or greater key until some condition is met. Without this, each next range key would take O(log(N)) access time; with this feature, it's O(1), i.e. the fastest possible access with just a single hop. That's why a Vely tree is "hybrid" AVL/B tree, because its features are found in both varieties; it also makes it high-performance and well-suited for in-memory database or cache.
AVL/B tree cache server application
In this example you will create a cache server that uses fast tree-based data access in order to insert, update and search key/value pairs.

Create new "tree" application first, in a new directory (you can name it anything you like):
mkdir -p tree
cd tree

The vf command is a Vely program manager and it will create a new application (see how-vely-works) named "tree":
sudo vf -i -u $(whoami) tree

To get vim highlighting of Vely syntax:
vv -m

Create a source code file "treesrv.vely":
vi treesrv.vely

and copy and paste this to it:
%% /treesrv
    out-header default

    do-once
    new-tree define t process-scope
    end-do-once

    // Get input parameters
    task-param op
    input-param key
    input-param data

    if-task "add" // Add data to tree
        // Make a copy of key,data so they are Vely-allocated
        copy-string key to define c_key
        copy-string data to define c_data
        write-tree t key c_key value c_data status define st
        if (st == VV_ERR_EXIST) {
            delete-mem c_key
            delete-mem c_data
        }
        @Added [<<p-out key>>]
    else-task "delete" // Delete data and obtain the value deleted
        delete-tree t key (key) value define val old-key define okey status define st
        if (st == VV_ERR_EXIST) {
            @Not found [<<p-out key>>]
        } else {
            // If found, then delete key and value
            @Deleted [<<p-out val>>]
            delete-mem val
            delete-mem okey
        }
    else-task "query" // Query tree based on key value
        read-tree t key (key) value define val status define st
        if (st == VV_ERR_EXIST) {
            @Not found, queried [<<p-out key>>]
        } else {
            @Value [<<p-out val>>]
        }
    end-task
%%

Note that the source file name ("treesrv.vely") should match the request name, which is "treesrv". If you're using hyphens (which is useful for web applications), just substitute with underscore. The fact that a request is implemented in a file with the same name helps keep your applications neat, tidy and easy to peruse.

In do-once statement, a new-tree is created with a process scope; that means all the data in this tree will be kept for as long as the process lives. Normally all memory used by a request in Vely is released at the end of the request. The memory for a new tree here will remain throughout all requests this process will serve. The reason for do-once statement is because you only need to create the tree once for the life of the process.

There are 3 input parameters: "op", "key" and "data". Note that "op" is declared as a task-param (which is a special kind of input-param) because it determines what task (i.e. "op"eration) is being done. The other two, "key" and "data", are used as input parameter values for these tasks.

If the task is "add", a new key/data pair is added to the tree using write-tree statement. Note that copies of "key" and "data" input parameters are created because write-tree expects allocated memory (and the memory from input-param isn't).

If the task is "delete", a tree node with "key" is deleted.

If the task is "query", a value from the node with "key" is returned.

This code is a tree-based cache-server that maintains key/value pairs in memory, and allows adding, deleting and querying.
Make an executable program
Now, make a native executable:
vv -q

Run as application server
"Application server" means a daemon, a resident server process that remains in memory and can serve many requests very fast. Here's how you do that:
vf -w 1 tree

The above will start an application server process to serve incoming requests.
Using server from command line
Testing your server from command line is easy:
vv -r --req="/treesrv/op/add/key/1/data/one" --exec --server --silent-header --app="/tree"

Note the "--server" option. It says to contact a server and execute the same request as before. A process you started is staying resident in memory and serving the incoming requests. This way, your server can serve a large number of concurrent requests because it handles each operation very fast.

Again, you can see what's going on behind scenes by omitting "--exec":
vv -r --req="/treesrv/op/add/key/1/data/one" --server --silent-header --app="/tree"

The result being:
export CONTENT_TYPE=
export CONTENT_LENGTH=
export VV_SILENT_HEADER=yes
export REQUEST_METHOD=GET
export SCRIPT_NAME="/tree"
export PATH_INFO="/treesrv/op/add/key/1/data/one"
export QUERY_STRING=""
cgi-fcgi -connect /var/lib/vv/tree/sock/sock  /treesrv

Here a "cgi-fcgi" client program will contact the server you started, get a response, and print it out. You can make your own client application by using FastCGI-API; this way you can do whatever you want with the response, and you can do so in a multi-threaded application, since Vely's FastCGI API is MT-safe.
Test with 100 keys
This is a bash test script to insert 100 keys into your cache server, query them to make sure they are there, delete them, and the query again to verify they're all gone. Create "test_tree" file:
vi test_tree

And copy/paste the following:
#Restart tree server for this test
vf -m restart tree

#Add 100 key/data pairs. Key value is 0,1,2... and data values are "data_0", "data_1", "data_2" etc.
for i in {1..100}; do
   if [ "$(curl -s "http://localhost/tree/treesrv/op/add/key/$i/data/data_$i")" != "Added [$i]" ]; then
       echo "Error adding key [$i]"
       exit -1
   fi
done

echo "Keys added"

#Query all 100 keys and check that values retrieved are the correct ones.
for i in {1..100}; do
   if [ "$(curl -s "http://localhost/tree/treesrv/op/query/key/$i")" != "Value [data_$i]" ]; then
       echo "Error querying key [$i]"
       exit -1
   fi
done

echo "Keys queried"

#Delete all 100 keys
ERR="0"
for i in {1..100}; do
   if [ "$(curl -s "http://localhost/tree/treesrv/op/delete/key/$i")" != "Deleted [data_$i]" ]; then
       echo "Error deleting key [$i]"
       exit -1
   fi
done

echo "Keys deleted"

#Query all 100 keys and check that values retrieved are the correct ones.
for i in {1..100}; do
   if [ "$(curl -s "http://localhost/tree/treesrv/op/query/key/$i")" != "Not found, queried [$i]" ]; then
       echo "Error querying key [$i] after deletion."
       exit -1
   fi
done

echo "Keys queried"

echo "Tree server test successful"

Make sure it's executable:
chmod +x test_tree

and now run it:
./test_tree

The result is this:
Keys added
Keys queried
Keys deleted
Keys queried
Tree server test successful

Run as web application
Of course, your application server will probably serve web requests. Check out connect-apache-unix-socket on how to connect Apache web server to your application server, or the same for Nginx: connect-nginx-unix-socket. FastCGI is supported widely among web servers, so you can use pretty much any web server of your choice.

Here's a brief intro for Apache:
Setup web server
This shows how to connect your application listening on a Unix socket (started with vf) to Apache web server.

- Step 1:
To setup Apache as a reverse proxy and connect your application to it, you need to enable FastCGI proxy support, which generally means "proxy" and "proxy_fcgi" modules - this is done only once:
- Step 2:
Edit the Apache configuration file:
Add this to the end of file ("/tree" is the application path (see request-URL) and "tree" is your application name):
ProxyPass "/tree" unix:///var/lib/vv/tree/sock/sock|fcgi://localhost/tree

- Step 3:
Finally, restart Apache. On Debian systems (like Ubuntu) or OpenSUSE:
sudo systemctl restart apache2

On Fedora systems (like RedHat) and Arch Linux:
sudo systemctl restart httpd

Note: you must not have any other URL resource that starts with "/tree" (such as for example "/tree.html" or "/tree_something" etc.) as the web server will attempt to pass them as a reverse proxy request, and they will likely not work. If you need to, you can change the application path to be different from "/tree", see request-URL.

Access application server from the browser
Use the following URL(s) to access your application server from a client like browser (see request-URL). Use actual IP or web address instead of 127.0.0.1 if different.
# Call your application server to add a key 
http://127.0.0.1/tree/treesrv/op/add/key/1/data/one

# Call your application server to query a key 
http://127.0.0.1/tree/treesrv/op/query/key/1

# Call your application server to delete a key 
http://127.0.0.1/tree/treesrv/op/delete/key/1

Note: if your server is on the Internet and it has a firewall, you may need to allow HTTP traffic - see ufw, firewall-cmd etc.
The result is:

Vely


Vely


Vely

See also
Examples
example-client-API  
example-cookies  
example-create-table  
example-develop-web-applications-in-C-programming-language  
example-distributed-servers  
example-docker  
example-encryption  
example-file-manager  
example-form  
example-hash-server  
example-hello-world  
example-how-to-design-application  
example-how-to-use-regex  
example-json  
example-multitenant-SaaS  
example-postgres-transactions  
examples  
example-sendmail  
example-shopping  
example-stock  
example-uploading-files  
example-using-mariadb-mysql  
example-using-trees-for-in-memory-queries  
example-utility  
example-write-report    
See all
documentation


You are free to copy, redistribute and adapt this web page (even commercially), as long as you give credit and provide a dofollow link back to this page - see full license at CC-BY-4.0. Copyright (c) 2019-2023 Dasoftver LLC. Vely and elephant logo are trademarks of Dasoftver LLC. The software and information on this web site are provided "AS IS" and without any warranties or guarantees of any kind. Icons from table-icons.io copyright Paweł Kuna, licensed under MIT license.