{"id":1284,"date":"2024-02-24T18:33:19","date_gmt":"2024-02-24T18:33:19","guid":{"rendered":"https:\/\/haxed.me.uk\/?p=1284"},"modified":"2024-02-24T18:40:16","modified_gmt":"2024-02-24T18:40:16","slug":"creating-a-small-lab-environment-for-galera-mariadb-cluster-centos-stream-9","status":"publish","type":"post","link":"https:\/\/haxed.me.uk\/index.php\/2024\/02\/24\/creating-a-small-lab-environment-for-galera-mariadb-cluster-centos-stream-9\/","title":{"rendered":"Creating a small Lab Environment for Galera (mariadb) Cluster, CentOS stream 9"},"content":{"rendered":"\n<p>Hey, so it has been a little while since I&#8217;ve ventured into libvirt and I have noticed a few things, like support for kimchi on modern versions of python is kind of lacking, so instead of firing up a small wok-kimchi html5 lab I took a different route to what I would normally prefer, to create the lab from the ground up without any helper framework or GUI. <br><br>This involved a few things, mainly to install libvirt and kvm-qemu, and then to create a virsh install script utilising cloud init user-data and meta-data for the hostnames and adding the ssh key of my cloud user. <br><br>The Box I use is a remote HP Z440 with vtx extensions for virtualisation, these are required and if you are trying to achieve it will need to enable virtualisation support in the BIOS, which oddly in my case could be found in the &#8220;Security&#8221; Section of the HP Bios. <\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Install Libvirt<\/h1>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ndnf -y install qemu-kvm libvirt virt-install\nsystemctl enable --now libvirtd\nsystemctl status libvirtd\n<\/pre><\/div>\n\n\n<p>Once virt is installed we can concentrate on creating a cloud init files, which will be used by virsh-install during VM creation.<\/p>\n\n\n\n<p>I also retrieved the latest cloud image provided by Redhat\/CentOS &#8216;CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2&#8217;. For a list of CentOS cloud ready images you can look at <a href=\"https:\/\/cloud.centos.org\/centos\/9-stream\/x86_64\/images\/\">https:\/\/cloud.centos.org\/centos\/9-stream\/x86_64\/images\/<\/a>. Remember the older images (older than CentOS 8\/9 are probably not safe to run. Always use an up to date image.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Cloud Init<\/h1>\n\n\n\n<p>In order to use these images, unlike back in the day there is no default local user only password to login via KVM. You will need to use the more established way of adding your password or SSH-key via cloud-init automation.<\/p>\n\n\n\n<p>First we will place our qcow2 image in the \/var\/lib\/libvirt\/images repoistory. 
\n\n\n\n<p>Once libvirt is installed we can concentrate on creating the cloud-init files, which will be used by virt-install during VM creation.<\/p>\n\n\n\n<p>I also retrieved the latest cloud image provided by Redhat\/CentOS, &#8216;CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2&#8217;. For a list of CentOS cloud-ready images you can look at <a href=\"https:\/\/cloud.centos.org\/centos\/9-stream\/x86_64\/images\/\">https:\/\/cloud.centos.org\/centos\/9-stream\/x86_64\/images\/<\/a>. Remember, the older images (older than CentOS 8\/9) are probably not safe to run. Always use an up-to-date image.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Cloud Init<\/h1>\n\n\n\n<p>In order to use these images: unlike back in the day, there is no default local user and password to log in via KVM. You will need to use the more established way of adding your password or SSH key via cloud-init automation.<\/p>\n\n\n\n<p>First we will place our qcow2 image in the \/var\/lib\/libvirt\/images repository. It is important to organise your images as best you can, and I recommend against keeping images in non-standard directories; it may upset selinux and permissions and make for a confusing lab environment;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nmv \/home\/adam\/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2 \/var\/lib\/libvirt\/images\n\n# lets create a cloud init directory and write our user-data file there\nmkdir \/var\/lib\/libvirt\/images\/cloudinit\ncd \/var\/lib\/libvirt\/images\/cloudinit\nvim cloud-init.yaml\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">The Cloud Init YAML File structure<\/h2>\n\n\n\n<p>The cloud-init file structure allows you to set a variety of different things, including running scripts, assigning hostnames, pre-installing packages, setting passwords and installing ssh public keys for secure login after first boot. Cloud-init only runs once, so it&#8217;s important to understand that any script you want to run frequently should be installed into crontab by cloud-init, so that it runs independently after the first run. In our case we want to give the adam user passwordless sudo and a \/bin\/bash shell, with an authorized public key from my test lab.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n#cloud-config\n\nusers:\n  - name: adam\n    ssh_authorized_keys:\n      - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDB1D9CP4jqWxKw4ug+lR2zol7W6oJZ7wuliMt8aqlYekUBk7Pi7apQakc7rqjYD+b4iUDig0\/4Zk4u6DC8WPgVr6o60fV7sdpoj0GBuxL+voGE0YV84zmorHoM8TCfLeMN3AdM0EMcT2NI8V\/dmZ7uILYLYaXB+RRLv1QoMiL6zLGhLOfhdVKdvmbNqNcrvAEonnzQCVhFjRied2CfhnuH9tNXzGT5Y8wz0E9I8gQQp6GCyU7HnCHW8CLWpymZIrt2y7\/Bi4XlKAbvaUFZJ9XLNsAK3gBC\/VygIVQkWp9o3Y+KOmOsmsS51xJsigfDI0UMRdehdNEN+6vm7Eft9QZYHOg1xoTyJkgiFs9yCRFSRuXvFSsFLXUq5TFLv73qquKE6e\/STORKobF2V7LaOuvbw1BIt2zo4v4c4toyaB5hshojO7bpORzhH8K43vEs0VW2ou9Zo8L3DwmZv6qFAy88BDCAIHoElgc3fmddlZJfCvcN4ZWDuISEP\/j2oVuDT40= adam@localhost.localdomain\n    sudo: &#x5B;&quot;ALL=(ALL) NOPASSWD:ALL&quot;]\n    # wheel is the admin group on CentOS (there is no 'sudo' group)\n    groups: wheel\n    shell: \/bin\/bash\n\n<\/pre><\/div>\n\n\n<p>If you want to create your own key you will need to replace the &#8216;- ssh-rsa &#8230;&#8217; entry with your own generated rsa key (or an equivalent key type like ecdsa, ed25519, etc.);<\/p>
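\n\n\n\n<p>If the host has a reasonably recent cloud-init package installed, you can lint the user-data before baking it into a VM; a handy, strictly optional sanity check (older cloud-init releases expose this as &#8216;cloud-init devel schema&#8217; instead):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# validate the #cloud-config syntax before any VM ever boots with it\ncloud-init schema --config-file \/var\/lib\/libvirt\/images\/cloudinit\/cloud-init.yaml\n<\/pre><\/div>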
\n\n\n\n<h2 class=\"wp-block-heading\">Creating the Meta Data File<\/h2>\n\n\n\n<p>There are two files: the above is the &#8216;user-data&#8217; component of cloud-init; we also want to provide a hostname to our machine, so that at build time its system hostname is set and it is clearer to us which machine we are logging into. Maybe for a small setup such things aren&#8217;t important, but in a cluster or large environment, having context on the command line of the machine you are working on could be the difference between life and death; you don&#8217;t want to accidentally restart the primary node in a cluster instead of one of the &#8216;slave nodes&#8217;. So labeling is important, and it&#8217;s really easy to do;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;root@localhost cloudinit]# vim galera1-metadata.yaml\n&#x5B;root@localhost cloudinit]# cat galera1-metadata.yaml\ninstance-id: galera-1\nlocal-hostname: galera-1\n<\/pre><\/div>\n\n\n<p>In my case I will create two more meta-data files for my other 2 Galera nodes, <strong>&#8216;galera2-metadata.yaml&#8217;<\/strong> and <strong>&#8216;galera3-metadata.yaml&#8217;<\/strong>, with different node numbers (&#8216;galera-2&#8217; and so on). It&#8217;s useful to keep these files somewhere safe and obvious like<strong> \/var\/lib\/libvirt\/images\/cloudinit<\/strong> or similar. You will reference them later using the --cloud-init option and its <strong>user-data and meta-data suboptions<\/strong> when creating the virtual machines.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Generating a Key to Login to Cloud init<\/h1>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# create a (by default) rsa key for your logged in user\nssh-keygen\n# output the safe-to-share public key to add to cloud init for login later\n&#x5B;root@localhost cloudinit]# cat ~adam\/.ssh\/id_rsa.pub\nssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDB1D9CP4jqWxKw4ug+lR2zol7W6oJZ7wuliMt8aqlYekUBk7Pi7apQakc7rqjYD+b4iUDig0\/4Zk4u6DC8WPgVr6o60fV7sdpoj0GBuxL+voGE0YV84zmorHoM8TCfLeMN3AdM0EMcT2NI8V\/dmZ7uILYLYaXB+RRLv1QoMiL6zLGhLOfhdVKdvmbNqNcrvAEonnzQCVhFjRied2CfhnuH9tNXzGT5Y8wz0E9I8gQQp6GCyU7HnCHW8CLWpymZIrt2y7\/Bi4XlKAbvaUFZJ9XLNsAK3gBC\/VygIVQkWp9o3Y+KOmOsmsS51xJsigfDI0UMRdehdNEN+6vm7Eft9QZYHOg1xoTyJkgiFs9yCRFSRuXvFSsFLXUq5TFLv73qquKE6e\/STORKobF2V7LaOuvbw1BIt2zo4v4c4toyaB5hshojO7bpORzhH8K43vEs0VW2ou9Zo8L3DwmZv6qFAy88BDCAIHoElgc3fmddlZJfCvcN4ZWDuISEP\/j2oVuDT40= adam@localhost.localdomain\n\n\n<\/pre><\/div>\n\n\n<p>As you can see it is a very simple process to generate the key for your user. In my case the adam user is located on the same remote machine on which I am running the libvirtd hypervisor. However, if the network routes are bridged to the local adapter on the hypervisor, I could log in from anywhere on the adjoining network. For convenience and security purposes I keep everything exposed locally to the remote hypervisor only, and for VNC sessions I use a dynamic tunnel from my desktop, which is frequently used to safely connect to unencrypted local VNC sessions on the remote hypervisor. 
I use a SOCKS5 dynamic tunnel to achieve this on Linux and Windows machines, as illustrated below;<\/p>\n\n\n\n<p><strong>Linux<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nssh -f -N -D 8080 adam@server1.example.com\n<\/pre><\/div>\n\n\n<p><strong>Windows Putty<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty.png\"><img loading=\"lazy\" decoding=\"async\" width=\"452\" height=\"443\" src=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty.png\" alt=\"\" class=\"wp-image-1289\" srcset=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty.png 452w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty-300x294.png 300w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty-65x65.png 65w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/dynamic_tunnel_putty-306x300.png 306w\" sizes=\"auto, (max-width: 452px) 100vw, 452px\" \/><\/a><\/figure>\n\n\n\n<p>We&#8217;ll see why this can be important and useful later. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Creating the Virtual Machines<\/h2>\n\n\n\n<p>Now we have all the necessary cloud-init files created, an SSH user and key for accessing the new VM IP addresses via SSH login, and the ability to VNC to the console via an encrypted tunnel. Let&#8217;s go ahead and create the 3 virtual machines that will form part of the Galera Cluster lab environment.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nvirt-install --name galera1 --memory 2048 --vcpus 4 --disk=size=10,backing_store=\/var\/lib\/libvirt\/images\/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2 --cloud-init user-data=.\/cloud-init.yaml,meta-data=.\/galera1-metadata.yaml,disable=on --network bridge=virbr0 --osinfo=centos-stream9 --noautoconsole\n\nvirt-install --name galera2 --memory 2048 --vcpus 4 --disk=size=10,backing_store=\/var\/lib\/libvirt\/images\/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2 --cloud-init user-data=.\/cloud-init.yaml,meta-data=.\/galera2-metadata.yaml,disable=on --network bridge=virbr0 --osinfo=centos-stream9 --noautoconsole\n\nvirt-install --name galera3 --memory 2048 --vcpus 4 --disk=size=10,backing_store=\/var\/lib\/libvirt\/images\/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2 --cloud-init user-data=.\/cloud-init.yaml,meta-data=.\/galera3-metadata.yaml,disable=on --network bridge=virbr0 --osinfo=centos-stream9 --noautoconsole\n\n<\/pre><\/div>\n\n\n<p>In my case I create a little script called &#8220;make-galera-cluster.sh&#8221; for this so I can automate the creation of the galera cluster; a sketch of it follows below.<\/p>\n\n\n\n<p>I also create a tear-down script called &#8220;destroy-galera-cluster.sh&#8221;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nvirsh destroy galera1\nvirsh destroy galera2\nvirsh destroy galera3\n<\/pre><\/div>\n\n\n<p>Then creating the cluster is simple:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# building out the cluster nodes\n.\/make-galera-cluster.sh\n# tearing down the cluster nodes\n.\/destroy-galera-cluster.sh\n\n<\/pre><\/div>
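\n\n\n<p>The build script itself never made it into the post, so here is a minimal sketch of what make-galera-cluster.sh could look like, assuming the image and cloud-init paths used above; treat it as a starting point rather than the definitive script:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n#!\/bin\/bash\n# loop over the three node numbers, running the same virt-install\n# invocation shown above and varying only the name and meta-data file\nIMG=\/var\/lib\/libvirt\/images\/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2\nCI=\/var\/lib\/libvirt\/images\/cloudinit\n\nfor n in 1 2 3; do\n  virt-install --name galera${n} --memory 2048 --vcpus 4 --disk=size=10,backing_store=${IMG} --cloud-init user-data=${CI}\/cloud-init.yaml,meta-data=${CI}\/galera${n}-metadata.yaml,disable=on --network bridge=virbr0 --osinfo=centos-stream9 --noautoconsole\ndone\n<\/pre><\/div>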
\n\n\n\n<p>A little bit about the virt-install parameters: --name is the label of the virtual machine; --memory and --vcpus are self-explanatory; --disk creates a 10GB qcow2 image based on the &#8216;golden image&#8217; (master image reference) of CentOS-Stream-9; and --cloud-init provides our user-data and meta-data for the cloud-init service to set the hostname and add our ssh key respectively. The disable=on suboption disables cloud-init after the first run, which I recommend; at least in my case each VM otherwise obtains 2 dhcp leases for the same hostname, which makes for confusion. Unless you really need to run cloud-init more than once, leave it that way. --noautoconsole prevents you from being attached to the virtual machine console, and --network bridge defines the local nic interface to use for the dhcp network lease subnet of the virtual machine. Simple, right?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Configuring the Galera Nodes<\/h2>\n\n\n\n<p>We should by now have 3 nodes running, each with a dhcp lease on the virbr0 bridge interface.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;root@localhost cloudinit]# virsh net-dhcp-leases default\n Expiry Time           MAC address         Protocol   IP address           Hostname   Client ID or DUID\n------------------------------------------------------------------------------------------------------------\n 2024-02-24 02:18:19   52:54:00:1b:49:65   ipv4       192.168.122.219\/24   galera-1   01:52:54:00:1b:49:65\n 2024-02-24 02:19:41   52:54:00:6c:00:83   ipv4       192.168.122.247\/24   galera-3   01:52:54:00:6c:00:83\n 2024-02-24 02:19:40   52:54:00:c6:17:7b   ipv4       192.168.122.199\/24   galera-2   01:52:54:00:c6:17:7b\n\n<\/pre><\/div>\n\n\n<p>In my case cloud-init actually creates 6 leases for the 3 VMs, with one dead IP per VM left over from the cloud-init boot. I find this irritating, so I wipe the status file and restart my VMs so I don&#8217;t have to try two IPs until I get one that is alive;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nvirsh shutdown galera1; virsh shutdown galera2; virsh shutdown galera3\ncat \/dev\/null &gt; \/var\/lib\/libvirt\/dnsmasq\/virbr0.status\nvirsh start galera1; virsh start galera2; virsh start galera3\n<\/pre><\/div>\n\n\n<p>Much as I resent the way dnsmasq works here, clearing the stale leases certainly makes it easier to log in to our first node to configure the packages for galera. <\/p>
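\n\n\n\n<p>If you&#8217;d rather not wipe dnsmasq state at all, libvirt can also be asked directly which address each domain currently holds; a small sketch (the awk field assumes the stock four-column domifaddr output):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# query the lease libvirt knows about for each domain\nfor vm in galera1 galera2 galera3; do\n  echo -n &quot;${vm}: &quot;\n  virsh domifaddr ${vm} --source lease | awk '\/ipv4\/ {print $4}'\ndone\n<\/pre><\/div>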
\n\n\n\n<p>From earlier, when we set up our tunnel for VNC on dynamic SOCKS port 8080 on localhost: we may want to access the VNC sessions of our created VMs in an emergency. Let&#8217;s test this works OK.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;root@localhost cloudinit]# virsh list\n Id   Name      State\n-------------------------\n 21   galera1   running\n 22   galera2   running\n 23   galera3   running\n\n&#x5B;root@localhost cloudinit]# virsh vncdisplay galera1\n127.0.0.1:0\n\n&#x5B;root@localhost cloudinit]# virsh vncdisplay galera2\n127.0.0.1:1\n\n&#x5B;root@localhost cloudinit]# virsh vncdisplay galera3\n127.0.0.1:2\n\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image.png\"><img loading=\"lazy\" decoding=\"async\" width=\"406\" height=\"275\" src=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image.png\" alt=\"\" class=\"wp-image-1295\" srcset=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image.png 406w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-300x203.png 300w\" sizes=\"auto, (max-width: 406px) 100vw, 406px\" \/><\/a><\/figure>\n\n\n\n<p>Remember, if you get a message like this you <strong>can safely ignore it. <\/strong>Why? <strong>Your SSH tunnel effectively makes the session transport encrypted<\/strong> between the two endpoints, so there is no opportunity for other parties sitting in the middle to intercept the plaintext session; it&#8217;s effectively a private tunnel, like a VPN, between you and the hypervisor, so the man-in-the-middle attack it is warning you about cannot take place. The reason is that SOCKS5 is a &#8216;transparent&#8217; proxy, so VNC doesn&#8217;t know about, or check for, the encryption encapsulating its traffic inside our dynamic ssh tunnel.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"183\" src=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1-1024x183.png\" alt=\"\" class=\"wp-image-1296\" srcset=\"https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1-1024x183.png 1024w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1-300x54.png 300w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1-768x137.png 768w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1-500x89.png 500w, https:\/\/haxed.me.uk\/wp-content\/uploads\/2024\/02\/image-1.png 1281w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Testing first, Automation Second, To Ansible or Not to Ansible?<\/h2>\n\n\n\n<p>Of course the <em>next step after manually testing<\/em> our cluster will be to consider automating these package installations utilising the user-data cloud-init file, which can install many different packages and repos on our machines. We can include additional configuration files and system package installation variables for our virtual machines as meta-data, and these processes can be carried out by cloud-init. <\/p>\n\n\n\n<p>For a home lab setup the best (and probably most commercially flexible) approach is to let virt-install and cloud-init handle the building of the VM and perhaps the installation of packages, while ansible takes care of any advanced packages and configuration changes to the cluster; cloud-init isn&#8217;t cluster-aware, but ansible playbooks very much can be written to be.<\/p>
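\n\n\n\n<p>As a taste of that division of labour, once the VMs are up you can drive all three nodes from the hypervisor with an ad-hoc ansible command; the inventory below is a hypothetical example using the leased addresses from earlier:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# throwaway inventory listing the three nodes\ncat &gt; galera-inventory.ini &lt;&lt;'EOF'\n&#x5B;galera]\n192.168.122.219\n192.168.122.199\n192.168.122.247\nEOF\n\n# install the galera package on every node in one shot\nansible galera -i galera-inventory.ini -u adam -b -m dnf -a 'name=mariadb-server-galera state=present'\n<\/pre><\/div>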
\n\n\n\n<p>I think ansible will be useful later for my lab, as changing the nodes of the cluster to different consistency settings or performing upgrades isn&#8217;t something you could handle through cloud-init anyway; except for brave and foolish lost souls, perhaps! \ud83d\ude00<\/p>\n\n\n\n<p>In order to create an effective playbook, testing manually is necessary first, unless you are fortunate enough to find an ansible playbook that does the same and has already been thoroughly tested by someone else. In any case you will probably want to get some hands-on experience to understand what the playbook automation actually does; otherwise you&#8217;re sort of firing blind into the wind. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Installing Galera on the 3 Node Cluster<\/h2>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n sudo dnf -y install mariadb-server-galera\n<\/pre><\/div>\n\n\n<p>I have seen some versions of debian actually ship mariadb under an aliased repo package, but the CentOS package is named &#8216;mariadb-server-galera&#8217;. We are using 10.11.6, as you can see below;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-2 ~]$ dnf info mariadb-server-galera\nLast metadata expiration check: 0:00:58 ago on Sat 24 Feb 2024 11:19:55 AM EST.\nInstalled Packages\nName         : mariadb-server-galera\nEpoch        : 3\nVersion      : 10.11.6\nRelease      : 1.module_el9+853+7b957b9b\nArchitecture : x86_64\nSize         : 39 k\nSource       : mariadb-10.11.6-1.module_el9+853+7b957b9b.src.rpm\nRepository   : @System\nFrom repo    : appstream\nSummary      : The configuration files and scripts for galera replication\nURL          : http:\/\/mariadb.org\nLicense      : GPLv2 and LGPLv2\nDescription  : ... &#x5B;shortened]\n\n\n<\/pre><\/div>\n\n\n<p>The first important thing is to check which repo you&#8217;re getting this from, and the support for the version in terms of updates or upgrades later. If we look at the vendor page <a href=\"https:\/\/mariadb.com\/kb\/en\/release-notes\/#:~:text=This%20section%20contains%20the%20release,development%20series%20is%20MariaDB%2011.4\">https:\/\/mariadb.com\/kb\/en\/release-notes\/#:~:text=This%20section%20contains%20the%20release,development%20series%20is%20MariaDB%2011.4<\/a>, we can see the stable CentOS 9 stream build was released on 13 Nov 2023, and is a few months old. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">About Releases, Stability, Support, Development Cycles<\/h2>\n\n\n\n<p>Generally we don&#8217;t live too close to the cutting edge [unless there is a good reason, like support, to do so]. The advantage of stable CentOS 9 stream packages is the stability and future supportability; most repos will have patched the issues with earlier releases that are normally found in the first few months after release. According to the vendor 10.11 is the latest stable release, and considering the development cycle is important for Long Term Support (aka LTS). <\/p>\n\n\n\n<p>It actually is important to understand that a General Availability (GA) release may have much less future support, and will not be maintained for security updates or package updates in the stream repo; a veritable nightmare that a good technician will spare his client.<\/p>\n\n\n\n<p>Simply using a stable LTS is the best approach, since you will get support for 5 years. So in our case we will expect support to end 5 years from the release date, which is sometime between 2028-2029. 
Probably long enough, which is a big difference compared to 2025 or 2026; otherwise upgrades are likely to be a real headache at times, as releases can break libraries and cross-dependencies in larger, more complex software stacks and development environments. Most companies don&#8217;t spend much time on such matters, but it&#8217;s <strong>really important<\/strong>; in my experience working in managed support at Rackspace for 3 years, I saw a lot of businesses who didn&#8217;t spend much time on this matter really wish that they had, when their application came tumbling down in a forced upgrade gone wrong that they could have avoided by using LTS. Last of all, one other important consideration is whether to install the latest stable from the CentOS 9 stream repository, or to use a specific version from the vendor repo. Again, this depends on the client&#8217;s objectives and is an important consideration when carrying out an implementation in a commercial environment.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Configure Galera Node<\/strong><\/h2>\n\n\n\n<p>Before we begin we need to remove the anonymous mysql user and test data and &#8216;make mysql secure&#8217;. This can be done with the simple command below. Not much more to say, as this has been standard practice since forever;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ mysql_secure_installation\n\n<\/pre><\/div>\n\n\n<p>Let&#8217;s configure our first Galera node on the galera-1 virtual machine. We need to reference the addresses of our nodes (wsrep_cluster_address); we need to name the node for legibility (wsrep_node_name, and it&#8217;s not a uuid, so be warned); and we need to consider which wsrep_sst_method we are using. Finally we&#8217;ll need to set wsrep_on=1, before stopping our service and ensuring that \/var\/lib\/mysql is empty and does not contain any other data that is either important or will prevent the cluster creating the db structure for the cluster. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n# This file contains wsrep-related mysqld options. It should be included\n# in the main MySQL configuration file.\n#\n# Options that need to be customized:\n#  - wsrep_provider\n#  - wsrep_cluster_address\n#  - wsrep_sst_auth\n# The rest of defaults should work out of the box.\n\n##\n## mysqld options _MANDATORY_ for correct operation of the cluster\n##\n&#x5B;mysqld]\n\n# (This must be substituted by wsrep_format)\nbinlog_format=ROW\n\n# Currently only InnoDB storage engine is supported\ndefault-storage-engine=innodb\n\n# to avoid issues with 'bulk mode inserts' using autoinc\ninnodb_autoinc_lock_mode=2\n\n# Override bind-address\n# In some systems bind-address defaults to 127.0.0.1, and with mysqldump SST\n# it will have (most likely) disastrous consequences on donor node\nbind-address=0.0.0.0\n\n##\n## WSREP options\n##\n\n# Enable wsrep\nwsrep_on=1\n\n# Full path to wsrep provider library or 'none'\nwsrep_provider=\/usr\/lib64\/galera\/libgalera_smm.so\n\n# Provider specific configuration options\n#wsrep_provider_options=\n\n# Logical cluster name. Should be the same for all nodes.\nwsrep_cluster_name=&quot;galeracluster&quot;\n\n# Group communication system handle\nwsrep_cluster_address=&quot;gcomm:\/\/192.168.122.219,192.168.122.247,192.168.122.199&quot;\n\n# Human-readable node name (non-unique). Hostname by default.\nwsrep_node_name=galera-1\n\n# Base replication &lt;address|hostname&gt;&#x5B;:port] of the node.\n# The values supplied will be used as defaults for state transfer receiving,\n# listening ports and so on. Default: address of the first network interface.\n#wsrep_node_address=\n\n# Address for incoming client connections. Autodetect by default.\n#wsrep_node_incoming_address=\n\n# How many threads will process writesets from other nodes\nwsrep_slave_threads=1\n\n# DBUG options for wsrep provider\n#wsrep_dbug_option\n\n# Generate fake primary keys for non-PK tables (required for multi-master\n# and parallel applying operation)\nwsrep_certify_nonPK=1\n\n# Maximum number of rows in write set\nwsrep_max_ws_rows=0\n\n# Maximum size of write set\nwsrep_max_ws_size=2147483647\n\n# to enable debug level logging, set this to 1\nwsrep_debug=0\n\n# convert locking sessions into transactions\nwsrep_convert_LOCK_to_trx=0\n\n# how many times to retry deadlocked autocommits\nwsrep_retry_autocommit=1\n\n# change auto_increment_increment and auto_increment_offset automatically\nwsrep_auto_increment_control=1\n\n# retry autoinc insert, which failed for duplicate key error\nwsrep_drupal_282555_workaround=0\n\n# enable &quot;strictly synchronous&quot; semantics for read operations\nwsrep_causal_reads=0\n\n# Command to call when node status or cluster membership changes.\n# Will be passed all or some of the following options:\n# --status  - new status of this node\n# --uuid    - UUID of the cluster\n# --primary - whether the component is primary or not (&quot;yes&quot;\/&quot;no&quot;)\n# --members - comma-separated list of members\n# --index   - index of this node in the list\nwsrep_notify_cmd=\n\n##\n## WSREP State Transfer options\n##\n\n# State Snapshot Transfer method\nwsrep_sst_method=rsync\n\n# Address which donor should send State Snapshot to.\n# Should be the address of THIS node. DON'T SET IT TO DONOR ADDRESS!!!\n# (SST method dependent. Defaults to the first IP of the first interface)\n#wsrep_sst_receive_address=\n\n# SST authentication string. This will be used to send SST to joining nodes.\n# Depends on SST method. For mysqldump method it is root:&lt;root password&gt;\nwsrep_sst_auth=root:\n\n# Desired SST donor name.\n#wsrep_sst_donor=\n\n# Reject client queries when donating SST (false)\n#wsrep_sst_donor_rejects_queries=0\n\n# Protocol version to use\n# wsrep_protocol_version=\n\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Creating the Cluster DB Filestructure<\/h2>\n\n\n\n<p>Once you have configured your primary or &#8216;first&#8217; node (in this case, as this is a set of 3 &#8216;masters&#8217;, all can be written to and will sync with each other), you will need to run the shell command galera_new_cluster with root privileges.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 my.cnf.d]$ galera_new_cluster\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\"><strong>Configure the other nodes in the cluster with the same configuration file<\/strong><\/h2>\n\n\n\n<p>Now we will need to configure the other 2 nodes, &#8216;galera-2&#8217; and &#8216;galera-3&#8217; respectively, making sure to give each a human-readable name via &#8216;wsrep_node_name=&#8217;. You don&#8217;t have to do it, but please do it. You&#8217;re crazy to run a big cluster without it, and you can see why once you start the cluster.<\/p>
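\n\n\n\n<p>One way to do that without hand-editing each node: run a small helper from galera-1 that copies its config across and rewrites only the node name. This is a hypothetical sketch; it assumes the config lives at \/etc\/my.cnf.d\/galera.cnf on every node and uses the lease addresses from earlier:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nfor pair in galera-2=192.168.122.199 galera-3=192.168.122.247; do\n  name=${pair%%=*}; ip=${pair##*=}\n  # stage the file remotely, then fix the node name and move it into place\n  scp \/etc\/my.cnf.d\/galera.cnf adam@${ip}:\/tmp\/galera.cnf\n  ssh adam@${ip} &quot;sudo sed -i 's\/^wsrep_node_name=.*\/wsrep_node_name=${name}\/' \/tmp\/galera.cnf; sudo mv \/tmp\/galera.cnf \/etc\/my.cnf.d\/galera.cnf&quot;\ndone\n<\/pre><\/div>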
\n\n\n\n<h2 class=\"wp-block-heading\">Starting the Cluster <\/h2>\n\n\n\n<p>It&#8217;s important to start the cluster only after the initial cluster has been created on a given node; i.e. running galera_new_cluster defines &#8216;I am the copy to sync to the other nodes&#8217;, an initialisation state. In this configuration, once this initial stage is completed, all the nodes will sync with each other as data is written to any of them.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 my.cnf.d]$ sudo systemctl start mariadb\n<\/pre><\/div>\n\n\n<p>And we will start mariadb on the second and third node now;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-2 ~]$ sudo systemctl start mariadb\n&#x5B;adam@galera-3 ~]$ sudo systemctl start mariadb\n<\/pre><\/div>\n\n\n<p>We will check the status of the nodes using systemctl<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 my.cnf.d]$ sudo systemctl status mariadb\n\u25cf mariadb.service - MariaDB 10.11 database server\n     Loaded: loaded (\/usr\/lib\/systemd\/system\/mariadb.service; enabled; preset: disabled)\n     Active: active (running) since Sat 2024-02-24 11:46:05 EST; 9s ago\n       Docs: man:mariadbd(8)\n             https:\/\/mariadb.com\/kb\/en\/library\/systemd\/\n    Process: 8105 ExecStartPre=\/usr\/libexec\/mariadb-check-socket (code=exited, status=0\/SUCCESS)\n    Process: 8127 ExecStartPre=\/usr\/libexec\/mariadb-prepare-db-dir mariadb.service (code=exited, status=0\/SUCCESS)\n    Process: 8364 ExecStartPost=\/usr\/libexec\/mariadb-check-upgrade (code=exited, status=0\/SUCCESS)\n   Main PID: 8162 (mariadbd)\n     Status: &quot;Taking your SQL requests now...&quot;\n      Tasks: 17 (limit: 10856)\n     Memory: 179.1M\n        CPU: 903ms\n     CGroup: \/system.slice\/mariadb.service\n             \u2514\u25008162 \/usr\/libexec\/mariadbd --basedir=\/usr\n\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8323]: sent 2005 bytes  received 416901 bytes  total size 408076\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8329]: connect from galera-2 (192.168.122.199)\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8329]: rsync allowed access on module rsync_sst from galera-2 (192.168.122.199)\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8324]: sent 48 bytes  received 386 bytes  total size 65\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8322]: sent 48 bytes  received 383 bytes  total size 67\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8329]: rsync to rsync_sst\/ from galera-2 (192.168.122.199)\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8329]: receiving file list\nFeb 24 11:46:04 galera-1 rsyncd&#x5B;8329]: sent 48 bytes  received 185 bytes  total size 41\nFeb 24 11:46:05 galera-1 rsyncd&#x5B;8294]: sent 0 bytes  received 0 bytes  total size 0\nFeb 24 11:46:05 galera-1 systemd&#x5B;1]: Started MariaDB 10.11 database server.\n\n<\/pre><\/div>\n\n\n<p>We can see that our first node is receiving from galera-2.<\/p>\n\n\n\n<p>We can also see that our 3rd node is communicating with galera-1<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;root@galera-3 my.cnf.d]# systemctl status mariadb\n\u25cf mariadb.service - MariaDB 10.11 database server\n     Loaded: loaded (\/usr\/lib\/systemd\/system\/mariadb.service; enabled; preset: disabled)\n     Active: active (running) since Fri 2024-02-23 21:08:45 EST; 14h ago\n       Docs: man:mariadbd(8)\n             https:\/\/mariadb.com\/kb\/en\/library\/systemd\/\n   Main PID: 14265 (mariadbd)\n     Status: &quot;Taking your SQL requests now...&quot;\n      Tasks: 17 (limit: 10857)\n     Memory: 257.0M\n        CPU: 37.386s\n     CGroup: \/system.slice\/mariadb.service\n             \u2514\u250014265 \/usr\/libexec\/mariadbd --basedir=\/usr\n\nFeb 23 21:08:43 galera-3 rsyncd&#x5B;14426]: sent 2005 bytes  received 416977 bytes  total size 408076\nFeb 23 21:08:43 galera-3 rsyncd&#x5B;14424]: sent 1853 bytes  received 4052436 bytes  total size 4045370\nFeb 23 21:08:43 galera-3 rsyncd&#x5B;14430]: connect from galera-1 (192.168.122.219)\nFeb 23 21:08:43 galera-3 rsyncd&#x5B;14430]: rsync allowed access on module rsync_sst from galera-1 (192.168.122.219)\nFeb 23 21:08:43 galera-3 rsyncd&#x5B;14425]: sent 48 bytes  received 380 bytes  total size 67\nFeb 23 21:08:44 galera-3 rsyncd&#x5B;14430]: rsync to rsync_sst\/ from galera-1 (192.168.122.219)\nFeb 23 21:08:44 galera-3 rsyncd&#x5B;14430]: receiving file list\nFeb 23 21:08:44 galera-3 rsyncd&#x5B;14430]: sent 48 bytes  received 186 bytes  total size 41\nFeb 23 21:08:44 galera-3 rsyncd&#x5B;14397]: sent 0 bytes  received 0 bytes  total size 0\nFeb 23 21:08:45 galera-3 systemd&#x5B;1]: Started MariaDB 10.11 database server.\n\n<\/pre><\/div>\n\n\n<p>Let&#8217;s install sysbench to run some performance tests of our new cluster against the nodes. This is pretty important and will reveal any bottlenecks or serious misconfiguration. Naturally the size and type of queries matter too, but that is often down to application design rather than the physical set-up of the application; though depending on the application, tuning can be done and the tests repeated with the application in staging, or utilising a specific sysbench query with a given query size\/test data etc. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# sysbench is in the EPEL repo so lets enable it and install\ndnf install epel-release\ndnf install sysbench\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Tuning Sysctl, threading, tcp_wait maxconn and more<\/h2>\n\n\n\n<p>So, although this is a virtualised cluster, each machine is bound by the same filesystem limits set by the hypervisor, so the performance gain from clustering is minimal here: most of the writing is made to the same disk, and a shared disk means that having your cloud nodes on the same hypervisor for a production galera cluster is very, very bad indeed.<\/p>
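\n\n\n\n<p>One step the transcript below glosses over: the oltp_read_write test expects its test database and table to exist before you can &#8216;run&#8217; anything. A sketch of the preparation, assuming the same test database and credentials used in the runs below:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# create the test schema, then let sysbench build its 1M-row table\nmysql -u root -p -e 'CREATE DATABASE IF NOT EXISTS test;'\nsysbench oltp_read_write --table-size=1000000 --db-driver=mysql --mysql-db=test --mysql-user=root --mysql-password=test --mysql-host=192.168.122.219 prepare\n<\/pre><\/div>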
\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 my.cnf.d]$ sysbench oltp_read_write --table-size=1000000 --db-driver=mysql --mysql-db=test --mysql-user=root --mysql-password=test --threads=2  --mysql-host=192.168.122.219,192.168.122.247,192.168.122.199 run\nsysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)\n\nRunning the test with following options:\nNumber of threads: 2\nInitializing random number generator from current time\n\n\nInitializing worker threads...\n\nThreads started!\n\nSQL statistics:\n    queries performed:\n        read:                            31360\n        write:                           7999\n        other:                           5438\n        total:                           44797\n    transactions:                        2237   (223.52 per sec.)\n    queries:                             44797  (4476.08 per sec.)\n    ignored errors:                      3      (0.30 per sec.)\n    reconnects:                          0      (0.00 per sec.)\n\nGeneral statistics:\n    total time:                          10.0063s\n    total number of events:              2237\n\nLatency (ms):\n         min:                                    3.67\n         avg:                                    8.94\n         max:                                   24.25\n         95th percentile:                       13.95\n         sum:                                20002.45\n\nThreads fairness:\n    events (avg\/stddev):           1118.5000\/60.50\n    execution time (avg\/stddev):   10.0012\/0.00\n\n\n<\/pre><\/div>\n\n\n<p><br>In reality, a galera cluster could be 3 bare metal servers, or 3 cloud instances on separate hypervisors, making full use of the IOPS available (writes\/reads) on each device attached to each node. So although we won&#8217;t see much difference in our sysbench marks because of the synthetic virtual lab I have created, I still think it is really important to cover sysctl tuning et al.<\/p>\n\n\n\n<p>So let&#8217;s first increase the max connections that MariaDB will accept to 3000, and increase the size of the innodb log from 100M to 2000M. Whilst we&#8217;re at it, let&#8217;s also increase the number of threads for the galera cluster slaves from 1 to 32. This is pretty important, as it is effectively the number of parallel applier threads working through replicated write-sets, from my understanding. 
Let&#8217;s do it!<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ninnodb_log_file_size = 2000M\nmax_connections = 3000\nwsrep_slave_threads=32\n<\/pre><\/div>\n\n\n<p>Let&#8217;s restart our mariadb instances again;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sudo systemctl restart mariadb\n&#x5B;root@galera-2 ~]# systemctl restart mariadb\n&#x5B;root@galera-3 my.cnf.d]# systemctl restart mariadb\n\n<\/pre><\/div>\n\n\n<p>Let&#8217;s verify the settings took hold<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nMariaDB &#x5B;(none)]&gt; SHOW GLOBAL VARIABLES    LIKE 'max_connections';\n+-----------------+-------+\n| Variable_name   | Value |\n+-----------------+-------+\n| max_connections | 3000  |\n+-----------------+-------+\n1 row in set (0.001 sec)\n\n<\/pre><\/div>\n\n\n<p>Let&#8217;s take a look at the mysql database of the cluster and inspect the wsrep_cluster_members table, verifying that the members really are working correctly with each node&#8217;s incoming_address as it should be; this is important for the distribution of load across the cluster.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nMariaDB &#x5B;mysql]&gt; use mysql;\nMariaDB &#x5B;mysql]&gt; select * from wsrep_cluster_members;\n+--------------------------------------+--------------------------------------+-----------+-----------------------+\n| node_uuid                            | cluster_uuid                         | node_name | node_incoming_address |\n+--------------------------------------+--------------------------------------+-----------+-----------------------+\n| 089e1f9a-d340-11ee-88bd-f286854158f3 | 75204503-d2b9-11ee-b391-3aa84a29b9dd | galera-2  | 192.168.122.199:0     |\n| 0aaaabe2-d340-11ee-89bb-536d4bd359ef | 75204503-d2b9-11ee-b391-3aa84a29b9dd | galera-3  | 192.168.122.247:0     |\n| 0e050c50-d340-11ee-b43d-376d616a87b2 | 75204503-d2b9-11ee-b391-3aa84a29b9dd | galera-1  | 192.168.122.219:0     |\n+--------------------------------------+--------------------------------------+-----------+-----------------------+\n3 rows in set (0.001 sec)\n\n<\/pre><\/div>\n\n\n<p>Perfect, so everything is there, but I&#8217;d expect to see a little better performance, so let&#8217;s set some sysctl defaults, because the defaults on these VMs are really low; a tcp_max_syn_backlog of 128 is really minuscule<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sudo sysctl net.ipv4.tcp_max_syn_backlog\nnet.ipv4.tcp_max_syn_backlog = 128\n&#x5B;adam@galera-1 ~]$ sudo sysctl net.core.somaxconn\nnet.core.somaxconn = 4096\n&#x5B;adam@galera-1 ~]$ cat \/proc\/sys\/fs\/file-max\n4096\n\n\n<\/pre><\/div>\n\n\n<p>These values really need to be a lot higher for us to hit the real bottleneck; the true &#8216;hard&#8217; limits are a lot higher than these &#8216;soft&#8217; defaults imposed in the VM.<\/p>
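\n\n\n\n<p>A side note before the kernel tunables: on a systemd distro, mariadbd&#8217;s own open-file ceiling is also governed per-unit, so raising fs.file-max alone may not lift the limit the daemon actually sees. A hedged sketch of raising it with a systemd drop-in (the 65535 figure is an assumption; size it to your workload):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsudo mkdir -p \/etc\/systemd\/system\/mariadb.service.d\nsudo tee \/etc\/systemd\/system\/mariadb.service.d\/limits.conf &lt;&lt;'EOF'\n&#x5B;Service]\nLimitNOFILE=65535\nEOF\nsudo systemctl daemon-reload\nsudo systemctl restart mariadb\n<\/pre><\/div>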
\n\n\n\n<p>So let&#8217;s go ahead and put in some new sysctl tunables on all of our VMs as follows;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nvim \/etc\/sysctl.conf\n\n&#x5B;adam@galera-1 ~]$ sudo vim \/etc\/sysctl.conf\n&#x5B;adam@galera-1 ~]$ cat \/etc\/sysctl.conf\n# sysctl settings are defined through files in\n# \/usr\/lib\/sysctl.d\/, \/run\/sysctl.d\/, and \/etc\/sysctl.d\/.\n#\n# Vendors settings live in \/usr\/lib\/sysctl.d\/.\n# To override a whole file, create a new file with the same name in\n# \/etc\/sysctl.d\/ and put new settings there. To override\n# only specific settings, add a file with a lexically later\n# name in \/etc\/sysctl.d\/ and put new settings there.\n#\n# For more information, see sysctl.conf(5) and sysctl.d(5).\n#\n#\n# Increase the maximum number of memory map areas a process may have\n# This can help prevent out-of-memory errors in large applications\nvm.max_map_count=262144\n\n# Increase the maximum number of file handles and inode cache for large file transfers\n# This can improve performance when dealing with a large number of files\nfs.file-max = 3261780\n\n# Increase the maximum buffer size for TCP\nnet.core.rmem_max = 16777216\nnet.core.wmem_max = 16777216\n\n# Increase Linux autotuning TCP buffer limits\nnet.ipv4.tcp_rmem = 4096 12582912 16777216\nnet.ipv4.tcp_wmem = 4096 12582912 16777216\n\n# Disable caching of ssthresh from previous TCP connection\nnet.ipv4.tcp_no_metrics_save = 1\n\n# Reduce the kernel's tendency to swap\nvm.swappiness = 1\n\n# Set the default queueing discipline for network devices\nnet.core.default_qdisc = fq_codel\n\n# Enable TCP BBR congestion control\nnet.ipv4.tcp_congestion_control=bbr\n\n# Enable TCP MTU probing\nnet.ipv4.tcp_mtu_probing=1\n\n# Increase the maximum input queue length of a network device\nnet.core.netdev_max_backlog = 32768\n\n# Increase the maximum accept queue limit\nnet.core.somaxconn = 65535\n\n# Reduce the number of SYN and SYN+ACK retries before packet expires\nnet.ipv4.tcp_syn_retries = 1\nnet.ipv4.tcp_synack_retries = 1\n\n# Reduce the timeout to close client connections in TIME_WAIT state\nnet.ipv4.tcp_fin_timeout = 30\n\n# Disable SYN cookie flood protection\nnet.ipv4.tcp_syncookies = 0\n\n# Increase the local port range used by TCP and UDP\nnet.ipv4.ip_local_port_range = 1024 61000\n\n# Additional Galera Cluster optimizations\n# (fs.file-max is already raised above; do not set it a second time,\n# as the later, lower value would silently win)\n\n# Increase the maximum number of concurrent async I\/O requests, which\n# InnoDB relies on. Adjust according to the needs of your Galera Cluster\nfs.aio-max-nr = 1000000\n\n<\/pre><\/div>\n\n\n<p>And be sure to run this afterwards so that sysctl loads the new configuration<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsysctl -p\n<\/pre><\/div>\n\n\n<p>Let&#8217;s rerun our test with more threads, which was failing before with timeouts [due to too many threads overwhelming the limits in the shell we&#8217;re testing from].<\/p>\n\n\n\n<p>The error we were getting:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sysbench oltp_read_write --table-size=1000000 --db-driver=mysql --mysql-db=test --mysql-user=root --mysql-password=test --threads=16  --mysql-host=192.168.122.219,192.168.122.247,192.168.122.199 run\nsysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)\n\nRunning the test with following options:\nNumber of threads: 16\nInitializing random number generator from current time\n\n\nInitializing worker threads...\n\nFATAL: Worker threads failed to initialize within 30 seconds!\n<\/pre><\/div>\n\n\n<p>After we have increased the limits on mariadb galera and sysctl, we can rerun the test to get maximum throughput;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sysbench oltp_read_write --table-size=1000000 --db-driver=mysql --mysql-db=test --mysql-user=root --mysql-password=test --threads=16  --mysql-host=192.168.122.219,192.168.122.247,192.168.122.199 run\nsysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)\n\nRunning the test with following options:\nNumber of threads: 16\nInitializing random number generator from current time\n\n\nInitializing worker threads...\n\nThreads started!\n\nSQL statistics:\n    queries performed:\n        read:                            204624\n        write:                           52222\n        other:                           35396\n        total:                           292242\n    transactions:                        14555  (1453.33 per sec.)\n    queries:                             292242 (29180.70 per sec.)\n    ignored errors:                      61     (6.09 per sec.)\n    reconnects:                          0      (0.00 per sec.)\n\nGeneral statistics:\n    total time:                          10.0131s\n    total number of events:              14555\n\nLatency (ms):\n         min:                                    3.62\n         avg:                                   11.00\n         max:                                  123.99\n         95th percentile:                       15.83\n         sum:                               160056.66\n\nThreads fairness:\n    events (avg\/stddev):           909.6875\/105.22\n    execution time (avg\/stddev):   10.0035\/0.01\n\n\n<\/pre><\/div>\n\n\n<p>Wow! What a difference; we have gone from roughly 4,500 to over 29,000 queries per second by taking all of these steps. I am really impressed with the results. Though this hypervisor is utilising NVMe, so I would ideally expect to see writes in the 1GB\/s to 6GB\/s range at least! I think, though, that it depends on how the queries are bulked together, and the delay in each transaction sent to and from the cluster and disk. Also, we have to bear in mind that these aren&#8217;t baremetal benchmarks either. Let&#8217;s run hdparm and see what libvirt can do anyway.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sudo fdisk -l\nDisk \/dev\/vda: 10 GiB, 10737418240 bytes, 20971520 sectors\nUnits: sectors of 1 * 512 = 512 bytes\nSector size (logical\/physical): 512 bytes \/ 512 bytes\nI\/O size (minimum\/optimal): 512 bytes \/ 512 bytes\nDisklabel type: dos\nDisk identifier: 0xdce436c4\n\nDevice     Boot Start      End  Sectors Size Id Type\n\/dev\/vda1  *     2048 20971486 20969439  10G 83 Linux\n&#x5B;adam@galera-1 ~]$ hdparm -Tt \/dev\/vda1\n\/dev\/vda1: Permission denied\n&#x5B;adam@galera-1 ~]$ sudo hdparm -Tt \/dev\/vda1\n\n\/dev\/vda1:\n Timing cached reads:   21602 MB in  1.99 seconds = 10856.33 MB\/sec\n Timing buffered disk reads: 6226 MB in  3.00 seconds = 2074.76 MB\/sec\n\n<\/pre><\/div>\n\n\n<p>Woof. I really love my HP Z440. With those 16 cores and 32 threads and a 10GB\/s PCI bus speed, it leaves not much more to be desired.<\/p>
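\n\n\n\n<p>One caveat: hdparm only exercises reads, while the write figures I speculated about above would need a write benchmark. fio wasn&#8217;t part of this lab run, but as an aside, something like the following sketch would give a rough sequential write number on the virtio disk:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsudo dnf -y install fio\n# write a 1G file with direct I\/O, bypassing the page cache\nfio --name=seqwrite --filename=\/tmp\/fio.test --size=1G --bs=1M --rw=write --direct=1 --ioengine=libaio --numjobs=1\nrm -f \/tmp\/fio.test\n<\/pre><\/div>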
\n\n\n\n<p>I think that with some more effort, perhaps by increasing the number of CPU cores and RAM of the galera cluster, we could get this far beyond 15k qps, which is pretty impressive for a single server machine. I am going to run some benchmarks against mariadb on the hypervisor&#8217;s bare metal and see how much efficiency is really lost in splitting it between these 3 instances. However, if this was 3 dedicated machines, I am confident the results would be pretty amazing. <br><br>I think I could probably loosen the consistency constraints so that the other nodes worked as slaves rather than synchronous masters, and get a lot more than 15k qps. <br><br>With 32 threads we see even better performance. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;adam@galera-1 ~]$ sysbench oltp_read_write --table-size=1000000 --db-driver=mysql --mysql-db=test --mysql-user=root --mysql-password=test --threads=32  --mysql-host=192.168.122.219,192.168.122.247,192.168.122.199 run\nsysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)\n\nRunning the test with following options:\nNumber of threads: 32\nInitializing random number generator from current time\n\n\nInitializing worker threads...\n\nThreads started!\n\nSQL statistics:\n    queries performed:\n        read:                            275352\n        write:                           70394\n        other:                           47423\n        total:                           393169\n    transactions:                        19513  (1948.12 per sec.)\n    queries:                             393169 (39252.79 per sec.)\n    ignored errors:                      155    (15.47 per sec.)\n    reconnects:                          0      (0.00 per sec.)\n\nGeneral statistics:\n    total time:                          10.0145s\n    total number of events:              19513\n\nLatency (ms):\n         min:                                    4.75\n         avg:                                   16.41\n         max:                                  121.10\n         95th percentile:                       26.68\n         sum:                               320136.02\n\nThreads fairness:\n    events (avg\/stddev):           609.7812\/98.11\n    execution time (avg\/stddev):   10.0043\/0.00\n\n<\/pre><\/div>\n\n\n<p>Likely there are a lot more things I could do to increase performance and make use of the NVMe-backed virtio disk; however, <strong>because all the instances are on the same hypervisor in this lab, the<\/strong> <strong>qps is lower than it would otherwise be in a real production setting, which is important to consider.<\/strong><br><br>Anyways, that&#8217;s all folks! Hopefully I will get time to automate this with ansible under the local hypervisor&#8217;s adam user, and build a little automation script that puts together the libvirt VM creation and all the configuration described above into a single run. And maybe as an added bonus we could add in some sysbench tests and return codes from the playbook, which might, with some adaptation, be really handy for testing out database implementations on different cloud providers. <br><br>Last but not least! 
Let&#8217;s make it so that mariadb [galera] runs on boot; do this on all machines;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nsystemctl enable mariadb\n<\/pre><\/div>","protected":false},"excerpt":{"rendered":"<p>Hey, so it has been a little while since I&#8217;ve ventured into libvirt and I have noticed a few things, like support for kimchi on modern versions of python is kind of lacking, so instead of firing up a small &hellip; <a href=\"https:\/\/haxed.me.uk\/index.php\/2024\/02\/24\/creating-a-small-lab-environment-for-galera-mariadb-cluster-centos-stream-9\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1284","post","type-post","status-publish","format-standard","hentry","category-uncategorised"],"_links":{"self":[{"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1284","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/comments?post=1284"}],"version-history":[{"count":19,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1284\/revisions"}],"predecessor-version":[{"id":1309,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1284\/revisions\/1309"}],"wp:attachment":[{"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/media?parent=1284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/categories?post=1284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haxed.me.uk\/index.php\/wp-json\/wp\/v2\/tags?post=1284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}