19.0.0 released Nov 08, 2023
|
Using trees for faster in-memory data access
A tree is a data structure that organizes key/value pairs into branches based on some ordering criteria. Each node then branches further down into lower nodes, and the entire structure resembles a tree, hence the name. There are many different kinds of trees, such as AVL, B, B++, Red-Black tree etc. They generally differ in a way the tree is maintained and the way data in it is connected, but most of those widely used today generally provide O(log(N)) access. That means in a tree holding N nodes, you need to access about log(N) of them to find any given key. So for example, if you have 1,000,000 keys/value pairs, it will take about 20 key comparisons to find any of them (since 2^20 is approximately 1M).
As an example of a tree structure, consider this:
In this tree, keys "3", "4", "5", "6", "7", "8" and "9" are stored. An arrow to the left points to a lesser key, while an arrow to the right points to a greater one. You can see how it takes at most 3 key comparisons to find any of the keys, since the tree is 3-level deep. Also, this is a binary tree, as each node has 2 children nodes branching from it (left and right). A tree can have multiple keys per node (typically B-variations), and those are generally well suited for storing data on disk, where access to any data is done in large blocks; such trees work well in-memory as well. However, binary trees are also well suited for in-memory access, if not even more so. Vely uses a modified AVL binary tree that also provides some benefits of various "B" trees, specifically fast range searches; additionally such benefits are optional so you have more options in order to make a tree structure fit your exact needs.
A tree above is a balanced one, which means that the heights of each node's children do not differ by more than one (in this case they are exactly the same). A balanced tree, just like its name suggests, looks "balanced" when shown in its graphical form like it is here. In order to achieve this kind of balance, a tree needs to be maintained with each insert and delete. For instance, if you were to insert keys above in an increasing order, your tree may initially look like this, simply because each insertion goes to the right because each next key is greater:
This binary tree effectively degenerates into a sorted list, as each node is followed by another to the right, due to each being greater than the previous. Searching a tree like this is O(N), meaning you may have to traverse all nodes in worst cases to reach a key/value pair. To balance a tree, connections between the nodes are re-arranged. This is a relatively fast operation, since it doesn't involve copying keys nor values; instead only the direction where a left or right connection goes from a node is changed. For instance:
In this case the tree was rotated to the left around the element with the key of "6". Now, if you right-rotate the left branch around key "4", and if you left-rotate the right branch around key "8", you will get the original balanced tree.
In practicality, these rotations happen when each element is inserted or deleted, and not after you already have a number of key/value pairs in the tree. The above is just to illustrate what could happen if rotations like the above weren't applied, and the effect they have in order to balance the tree. Note that trees can be balanced in different ways; in this case what you're balancing is the height of the tree from any given node.
AVL tree is a binary tree where the difference between a sub-tree on the right and on the left is at most 1. Vely's tree adds (optionally) a linked list that connects all nodes in order from minimum to maxium key; this capability is impotant because it allows very fast range searches; this means starting from a given key and moving to the next lower or greater key until some condition is met. Without this, each next range key would take O(log(N)) access time; with this feature, it's O(1), i.e. the fastest possible access with just a single hop. That's why a Vely tree is "hybrid" AVL/B tree, because its features are found in both varieties; it also makes it high-performance and well-suited for in-memory database or cache.
AVL/B tree cache server application
In this example you will create a cache server that uses fast tree-based data access in order to insert, update and search key/value pairs.
Create new "tree" application first, in a new directory (you can name it anything you like):
The vf command is a Vely program manager and it will create a new application (see how-vely-works) named "tree":
sudo vf -i -u $(whoami) tree
To get vim highlighting of Vely syntax:
Create a source code file "treesrv.vely":
and copy and paste this to it:
%% /treesrv
out-header default
do-once
new-tree define t process-scope
end-do-once
task-param op
input-param key
input-param data
if-task "add"
copy-string key to define c_key
copy-string data to define c_data
write-tree t key c_key value c_data status define st
if (st == VV_ERR_EXIST) {
delete-mem c_key
delete-mem c_data
}
@Added [<<p-out key>>]
else-task "delete"
delete-tree t key (key) value define val old-key define okey status define st
if (st == VV_ERR_EXIST) {
@Not found [<<p-out key>>]
} else {
@Deleted [<<p-out val>>]
delete-mem val
delete-mem okey
}
else-task "query"
read-tree t key (key) value define val status define st
if (st == VV_ERR_EXIST) {
@Not found, queried [<<p-out key>>]
} else {
@Value [<<p-out val>>]
}
end-task
%%
Note that the source file name ("treesrv.vely") should match the request name, which is "treesrv". If you're using hyphens (which is useful for web applications), just substitute with underscore. The fact that a request is implemented in a file with the same name helps keep your applications neat, tidy and easy to peruse.
In do-once statement, a new-tree is created with a process scope; that means all the data in this tree will be kept for as long as the process lives. Normally all memory used by a request in Vely is released at the end of the request. The memory for a new tree here will remain throughout all requests this process will serve. The reason for do-once statement is because you only need to create the tree once for the life of the process.
There are 3 input parameters: "op", "key" and "data". Note that "op" is declared as a task-param (which is a special kind of input-param) because it determines what task (i.e. "op"eration) is being done. The other two, "key" and "data", are used as input parameter values for these tasks.
If the task is "add", a new key/data pair is added to the tree using write-tree statement. Note that copies of "key" and "data" input parameters are created because write-tree expects allocated memory (and the memory from input-param isn't).
If the task is "delete", a tree node with "key" is deleted.
If the task is "query", a value from the node with "key" is returned.
This code is a tree-based cache-server that maintains key/value pairs in memory, and allows adding, deleting and querying.
Make an executable program
Now, make a native executable:
Run as application server
"Application server" means a daemon, a resident server process that remains in memory and can serve many requests very fast. Here's how you do that:
The above will start an application server process to serve incoming requests.
Using server from command line
Testing your server from command line is easy:
vv -r --req="/treesrv/op/add/key/1/data/one" --exec --server --silent-header --app="/tree"
Note the "--server" option. It says to contact a server and execute the same request as before. A process you started is staying resident in memory and serving the incoming requests. This way, your server can serve a large number of concurrent requests because it handles each operation very fast.
Again, you can see what's going on behind scenes by omitting "--exec":
vv -r --req="/treesrv/op/add/key/1/data/one" --server --silent-header --app="/tree"
The result being:
export CONTENT_TYPE=
export CONTENT_LENGTH=
export VV_SILENT_HEADER=yes
export REQUEST_METHOD=GET
export SCRIPT_NAME="/tree"
export PATH_INFO="/treesrv/op/add/key/1/data/one"
export QUERY_STRING=""
cgi-fcgi -connect /var/lib/vv/tree/sock/sock /treesrv
Here a "cgi-fcgi" client program will contact the server you started, get a response, and print it out. You can make your own client application by using FastCGI-API; this way you can do whatever you want with the response, and you can do so in a multi-threaded application, since Vely's FastCGI API is MT-safe.
This is a bash test script to insert 100 keys into your cache server, query them to make sure they are there, delete them, and the query again to verify they're all gone. Create "test_tree" file:
And copy/paste the following:
Make sure it's executable:
and now run it:
The result is this:
Keys added
Keys queried
Keys deleted
Keys queried
Tree server test successful
Of course, your application server will probably serve web requests. Check out connect-apache-unix-socket on how to connect Apache web server to your application server, or the same for Nginx: connect-nginx-unix-socket. FastCGI is supported widely among web servers, so you can use pretty much any web server of your choice.
Here's a brief intro for Apache:
This shows how to connect your application listening on a Unix socket (started with vf) to Apache web server.
- Step 1:
To setup Apache as a reverse proxy and connect your application to it, you need to enable FastCGI proxy support, which generally means "proxy" and "proxy_fcgi" modules - this is done only once:
- For Debian (like Ubuntu) and OpenSUSE systems you need to enable proxy and proxy_fcgi modules:
sudo a2enmod proxy
sudo a2enmod proxy_fcgi
- For Fedora systems (or others like Archlinux) enable proxy and proxy_fcgi modules by adding (or uncommenting) LoadModule directives in the Apache configuration file - the default location of this file on Linux depends on the distribution. For Fedora (such as RedHat), Archlinux:
sudo vi /etc/httpd/conf/httpd.conf
For OpenSUSE:
sudo vi /etc/apache2/httpd.conf
Add this to the end of the file:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_fcgi_module modules/mod_proxy_fcgi.so
- Step 2:
Edit the Apache configuration file:
- For Debian (such as Ubuntu):
sudo vi /etc/apache2/apache2.conf
- for Fedora (such as RedHat), Archlinux:
sudo vi /etc/httpd/conf/httpd.conf
- and for OpenSUSE:
sudo vi /etc/apache2/httpd.conf
Add this to the end of file ("/tree" is the application path (see request-URL) and "tree" is your application name):
ProxyPass "/tree" unix:///var/lib/vv/tree/sock/sock|fcgi://localhost/tree
- Step 3:
Finally, restart Apache. On Debian systems (like Ubuntu) or OpenSUSE:
sudo systemctl restart apache2
On Fedora systems (like RedHat) and Arch Linux:
sudo systemctl restart httpd
Note: you must not have any other URL resource that starts with "/tree" (such as for example "/tree.html" or "/tree_something" etc.) as the web server will attempt to pass them as a reverse proxy request, and they will likely not work. If you need to, you can change the application path to be different from "/tree", see request-URL.
Access application server from the browser
Use the following URL(s) to access your application server from a client like browser (see request-URL). Use actual IP or web address instead of 127.0.0.1 if different.
# Call your application server to add a key
http://127.0.0.1/tree/treesrv/op/add/key/1/data/one
# Call your application server to query a key
http://127.0.0.1/tree/treesrv/op/query/key/1
# Call your application server to delete a key
http://127.0.0.1/tree/treesrv/op/delete/key/1
Note: if your server is on the Internet and it has a firewall, you may need to allow HTTP traffic - see ufw, firewall-cmd etc.
The result is:
Examples
example-client-API
example-cookies
example-create-table
example-develop-web-applications-in-C-programming-language
example-distributed-servers
example-docker
example-encryption
example-file-manager
example-form
example-hash-server
example-hello-world
example-how-to-design-application
example-how-to-use-regex
example-json
example-multitenant-SaaS
example-postgres-transactions
examples
example-sendmail
example-shopping
example-stock
example-uploading-files
example-using-mariadb-mysql
example-using-trees-for-in-memory-queries
example-utility
example-write-report
See all
documentation
You are free to copy, redistribute and adapt this web page (even commercially), as long as you give credit and provide a dofollow link back to this page - see full license at
CC-BY-4.0. Copyright (c) 2019-2023 Dasoftver LLC. Vely and elephant logo are trademarks of Dasoftver LLC. The software and information on this web site are provided "AS IS" and without any warranties or guarantees of any kind. Icons from
table-icons.io copyright Paweł Kuna, licensed under
MIT license.