Checking Load Balancer Connectivity & Automating it in some interesting ways

So, in a dream last night, I woke up realising I had forgotten to write my automated load balancer connectivity checker.

Basically, sometimes a customer will complain their site is down because their ‘load balancer is broken’! In many cases, this is actually due to a firewall on one of the nodes behind the load balancer, or an issue with the webserver application listening on the port. So, I wrote a little piece of automation in the form of a BASH script that accepts a Load Balancer ID, uses the API to pull the server nodes behind that Load Balancer, including the ports being used to communicate, and then uses either netcat or nmap to check each port for connectivity. There were a few ways to achieve this, but the below is what I was happiest with.

#!/bin/bash

# Username used to login to control panel
USERNAME='mycloudusernamegoeshere'

# Find the APIKey in the 'account settings' part of the menu of the control panel
APIKEY="apikeygoeshere"

# Your Rackspace account number (the number that is in the URL of the control panel after logging in)
ACCOUNT=100101010

# Your Rackspace loadbalancerID
LOADBALANCERID=157089

# Rackspace LoadBalancer Endpoint
ENDPOINT="https://lon.loadbalancers.api.rackspacecloud.com/v1.0"

# This section simply retrieves and sets the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`

#   (UNUSED) METHOD 1: Extract IP addresses (currently assuming port 80 only)
#curl -H "X-Auth-Token: $TOKEN" -H "Accept: application/json" -X GET "$ENDPOINT/$ACCOUNT/loadbalancers/$LOADBALANCERID/nodes" | jq .nodes[].address | xargs -i nmap -p 80 {}
#   (UNUSED) Extract ports
# curl -H "X-Auth-Token: $TOKEN" -H "Accept: application/json" -X GET "$ENDPOINT/$ACCOUNT/loadbalancers/$LOADBALANCERID/nodes" | jq .nodes[].port | xargs -i nmap -p 80 {}


# I opted for using this method to extract the important detail
curl -H "X-Auth-Token: $TOKEN" -H "Accept: application/json" -X GET "$ENDPOINT/$ACCOUNT/loadbalancers/$LOADBALANCERID/nodes" | jq .nodes[].address | sed 's/"//g' > address.txt
curl -H "X-Auth-Token: $TOKEN" -H "Accept: application/json" -X GET "$ENDPOINT/$ACCOUNT/loadbalancers/$LOADBALANCERID/nodes" | jq .nodes[].port > port.txt

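# Loop through both output files in step: line N of address.txt pairs with line N of port.txt
# WARNING: the script does not strip whitespace or blank lines from these files

while read -r addressfile1 <&3 && read -r portfile2 <&4; do
   ncat "$addressfile1" "$portfile2"
done 3<address.txt 4<port.txt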

Output looks a bit like this:

# ./lbtest.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 5143 100 5028 100 115 4731 108 0:00:01 0:00:01 --:--:-- 4734
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 225 100 225 0 0 488 0 --:--:-- --:--:-- --:--:-- 488
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 225 100 225 0 0 679 0 --:--:-- --:--:-- --:--:-- 681
Ncat: No route to host.
Ncat: Connection timed out.

I plan to add some additional support that will check the load balancer is up, AND the servicenet connection between the cloud servers.

Please note that this script must be run on a machine with access to the servicenet network, in the same Rackspace datacenter as the nodes, to be able to check servicenet connectivity of servers. The script can give false positives if strict firewall rules are set up on the cloud server nodes behind the load balancer, since a node may accept traffic from the load balancer but not from the machine running the check. It's kind of alpha-draft but I thought I would share it as a proof of concept.
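A firewalled node tends to drop packets silently, so a plain ncat call can sit there until the kernel gives up. Here is a minimal sketch of a bounded check, assuming bash's /dev/tcp pseudo-device and the coreutils timeout command are available. The names check_port and report_port and the 3 second limit are my own illustrative choices, not part of the script above.

```shell
#!/bin/bash
# check_port ADDRESS PORT (hypothetical helper)
# Returns 0 if a TCP connection succeeds within 3 seconds, non-zero otherwise.
check_port() {
    local address=$1 port=$2
    # A child bash opens the connection through the /dev/tcp pseudo-device;
    # timeout(1) kills it if a firewall silently drops the packets.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$address" "$port" 2>/dev/null
}

# report_port ADDRESS PORT (hypothetical helper)
# Prints one line in the form ADDRESS:PORT STATUS.
report_port() {
    if check_port "$1" "$2"; then
        echo "$1:$2 open"
    else
        echo "$1:$2 unreachable"
    fi
}
```

Inside the loop, report_port "$addressfile1" "$portfile2" could stand in for the ncat line.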

You will need to download and install jq to use it. To download jq, please see: https://stedolan.github.io/jq/download/
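Since jq is already a requirement, the token and the node list can each be pulled out with one jq call rather than through grep and cut. Taking address and port from a single API response also means the two can never fall out of step. This is a sketch only: it assumes the identity response carries the token at .access.token.id (the usual Keystone v2.0 layout) and that each node has address and port fields, as the jq filters in the script imply. extract_token and extract_nodes are made-up names.

```shell
#!/bin/bash
# Both helpers read JSON on stdin; the function names are hypothetical.

# Print the token id from an identity v2.0 response
# (assumes the token lives at .access.token.id).
extract_token() {
    jq -r '.access.token.id'
}

# Print one "address port" pair per line from a load balancer nodes response.
extract_nodes() {
    jq -r '.nodes[] | "\(.address) \(.port)"'
}

# Example wiring (not run here), reusing the variables from the script above:
#   curl -s -H "X-Auth-Token: $TOKEN" -H "Accept: application/json" \
#       "$ENDPOINT/$ACCOUNT/loadbalancers/$LOADBALANCERID/nodes" \
#       | extract_nodes \
#       | while read -r address port; do nmap -p "$port" "$address"; done
```

The example uses nmap rather than ncat because ncat reads standard input and would swallow the remaining lines of the pipe.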

Windows Password reset for Rackspace Cloud Servers

In the previous articles, Using API and BASH to validate changing conditions and Reset windows administrator password using rescue mode without nova-agent, I explained how to reset the password of a Windows VM instance by modifying the SAM file from a Linux ‘rescue’ image in the cloud, and how to automate checks in BASH through the API. The checks specifically waited until the server entered rescue, and then lifted the IPv4 address, connecting only when the rescue server had finished building.

That way the automation handles the delay, as well as setting and lifting the access credentials and IP address to use each time. Here is the complete script. Please note that backticks are deprecated but I’m a bit ‘oldskool’. This is a rough alpha, but it works really nicely. In testing it has consistently allowed us, or our customers, to reset a Windows Cloud Server password in the case that a customer loses access to the server and cannot use other Rackspace services to do the reset. This effectively turns a useless server back into a usable one and saves a lot of time.

#!/bin/bash
# Adam Bull, Rackspace UK
# This script automates the resetting of windows passwords
# The values are hard-coded further down for now; the intended arguments are:
# Arguments $1 == username
# Arguments $2 == apikey
# Arguments $3 == ddi
# Arguments $4 == instanceid

echo "Rackspace windows cloud server Password Reset"
echo "written by Adam Bull, Rackspace UK"
sleep 2
PASSWORD=39fdfgk4d3fdovszc932456j2oZ

# Provide an instance uuid to rescue and reset windows password

USERNAME=mycloudusernamehere
APIKEY=myapikeyhere
# DDI is the 'customer ID', if you don't know this login to the control panel and check the number in the URL
DDI=10010101
# The instance uuid you want to rescue
INSTANCE=ca371a8b-748e-46da-9e6d-8c594691f71c

# INITIATE RESCUE PROCESS

nova  --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure rescue --password "$PASSWORD" --image 7fade26a-0cca-415f-a988-49c021768fca $INSTANCE

# LOOP UNTIL STATE DETECTED AS RESCUED

STATE=0
until [[ $STATE == rescued ]]; do
echo "start rescue check"
STATE=`nova --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure show $INSTANCE | grep rescued | awk '{print $4}'`

echo "STATE =" $STATE
echo "sleeping.."
sleep 5
done

# EXTRACT PUBLIC ipv4 FROM INSTANCE

IP=`nova --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure show $INSTANCE | grep public | awk '{print $5}' | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])'`
echo "IP = $IP"

# UPDATE AND INSTALL RESCUE TOOLS AND RESET WINDOWS PASS
# Set environment locally
yum install sshpass -y

# Execute environment remotely
echo "Performing Rescue..."
sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no root@"$IP" 'yum update -y; yum install ntfs-3g -y; mount /dev/xvdb1 /mnt; curl li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm -o /root/nux.rpm; rpm -Uvh /root/nux.rpm; yum install chntpw -y; cd /mnt/Windows/System32/config; echo -e "1\ny\n" | chntpw -u "Administrator" SAM'

echo "Unrescuing in 100 seconds..."
sleep 100
nova  --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure unrescue $INSTANCE

Thanks again to my friend Cory who gave me the instructions, I simply automated the process to make it easier and learned something in the process 😉
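The same long run of nova flags appears four times in the script. One way to cut the repetition, sketched here with a hypothetical rs_nova function, is to wrap the shared flags once and pass only the subcommand. The flags are exactly those used above; nothing else is assumed about the nova client.

```shell
#!/bin/bash
# rs_nova is a hypothetical wrapper around the nova client: it supplies the
# authentication flags used throughout the script and forwards everything else.
rs_nova() {
    nova --os-username "$USERNAME" --os-auth-system=rackspace \
         --os-tenant-name "$DDI" \
         --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ \
         --os-password "$APIKEY" --insecure "$@"
}

# Example calls (not run here):
#   rs_nova rescue --password "$PASSWORD" --image 7fade26a-0cca-415f-a988-49c021768fca "$INSTANCE"
#   rs_nova show "$INSTANCE"
#   rs_nova unrescue "$INSTANCE"
```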

HOWTO: Rackspace Automation, Using BASH with API (to validate conditions to perform conditional tasks)

In the previous article, I showed how to wipe clean the Windows password from a broken Virtual Machine that you were locked out of by rescuing it with a Linux image. In this article I explain how you would automate this with a bash script that looks at the STATE of the server and accepts command-line arguments.

It’s quite a simple script:

#!/bin/bash
# Adam Bull
# April 28 2016
# This script automates the resetting of windows passwords
# Arguments $1 == username
# Arguments $2 == apikey
# Arguments $3 == ddi (customer / tenant id)
# Arguments $4 == instanceuuid

PASSWORD=mypassword

# Provide an instance uuid to rescue and reset windows password

USERNAME=$1
APIKEY=$2
DDI=$3
INSTANCE=$4


nova  --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure rescue --password "$PASSWORD" --image 7fade26a-0cca-415f-a988-49c021768fca $INSTANCE

The above script takes the arguments I give the executable script on the command line: the first argument passed, $1, is the Rackspace mycloud username, the second is the API key, and so on. This basically puts the server into rescue. But what if we wanted to run some automation AFTER it rescued? We don’t want the automation to ssh to the box and start too early, so we can use nova show to find whether the VM state has changed to ‘rescued’. Whilst it’s initiating, the state will be ‘rescuing’. So we have the option of looping until the state is no longer ‘rescuing’, or until it equals ‘rescued’. Let’s use until equal to ‘rescued’ in our validation loop.

This loop will continue until the task state changes to the desired value. Here is how we achieve it:

#!/bin/bash
# Initialize Variable
STATE=0
# Validate $STATE variable, looping UNTIL $STATE == rescued
until [[ $STATE == rescued ]]; do
echo "start rescue check"
# 'show' the servers data, and grep for rescued and extract only the correct value if it is found
STATE=`nova --os-username $USERNAME --os-auth-system=rackspace  --os-tenant-name $DDI --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password $APIKEY --insecure show $INSTANCE | grep rescued | awk '{print $4}'`

# For debugging
echo "STATE =" $STATE
echo "sleeping.."

# For API Limit control
sleep 5
# Exit the loop once the until condition is satisfied
done

# Post Rescue
echo "If you read this, it means that the program detected a rescued state"

It’s quite a simple script to use. We just provide the arguments $1, $2, $3 and $4.

 
./rescue.sh mycloudusername mycloudapikey 10010101 e744af0f-6643-44f4-a63f-d99db1588c94

Where 10010101 is the tenant id and e744af0f-6643-44f4-a63f-d99db1588c94 is the UUID of your server.
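One caveat with the until loop: if the server never reaches ‘rescued’ (say it drops into an error state), the loop never ends. A sketch of a bounded version follows. wait_until, its arguments and the attempt limit are illustrative names and values of my own, and the status command is passed in so the loop itself stays generic.

```shell
#!/bin/bash
# wait_until WANTED MAX_ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND repeatedly until its output equals WANTED.
# Returns 0 on a match, 1 if MAX_ATTEMPTS polls pass without one.
wait_until() {
    local wanted=$1 max=$2 delay=$3 attempt=1 state
    shift 3
    while [ "$attempt" -le "$max" ]; do
        state=$("$@")
        echo "attempt $attempt: STATE = $state" >&2
        if [ "$state" = "$wanted" ]; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"   # for API limit control, as in the loop above
    done
    return 1
}

# Example (not run here): wrap the nova show pipeline from the script in a
# function, then give up after 60 polls, 5 seconds apart.
#   get_state() { nova <auth flags as above> show "$INSTANCE" | grep rescued | awk '{print $4}'; }
#   wait_until rescued 60 5 get_state || { echo "never reached rescued" >&2; exit 1; }
```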

It’s really quite simple to do! But this is not enough; we want to go a step further. Let’s copy rescue.sh into /bin:

# WARNING: /bin is not a playground; this is only to demonstrate how to 'install' a system-wide command
cp rescue.sh /bin/rescue
chmod +x /bin/rescue

Now you can call the command ‘rescue’.

rescue mycloudusername mycloudapikey mycustomerid mycloudserveruuidgoeshere

Nice, and quite simple too. Obviously, in the ‘post rescue’ part of the script I can upload a script via ssh to the server, and then execute it remotely to perform the password reset.
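Because the script takes four positional arguments, a missing one would hand an empty value to nova. A small guard, sketched here with a hypothetical check_args function, fails early instead:

```shell
#!/bin/bash
# check_args is a hypothetical guard: it succeeds only when exactly four
# non-empty arguments (username, apikey, ddi, instance uuid) are supplied.
check_args() {
    if [ "$#" -ne 4 ]; then
        echo "usage: rescue USERNAME APIKEY DDI INSTANCEUUID" >&2
        return 1
    fi
    local arg
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "error: empty argument" >&2
            return 1
        fi
    done
    return 0
}

# At the top of rescue.sh one might then write (not run here):
#   check_args "$@" || exit 1
```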

Backing up a MySQL Database remotely

So, you might want to back up a MySQL database remotely, like one of our customers did today. This is relatively simple using the inbuilt mysqldump facility. This customer in particular was running varnish in front of his apache2 webserver, so setting up phpmyadmin wasn’t entirely straightforward for this non-technical customer. It’s easily achievable with something like:

Specific database

ssh -l user 1.1.1.1 "mysqldump -mysqldumpoptions databasenamegoeshere | gzip -3 -c" > /localpath/localfile.sql.gz 

All databases

mysqldump -uroot -ppassword -h162.13.137.249 --all-databases > backup.sql

For a single database over a direct connection, the formatting of the command should look like

mysqldump -u root -p[root_password] -h [hostname] [database_name] > dumpfilename.sql
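A dump that was cut off mid-transfer still leaves a file behind, so it is worth testing the archive before trusting it. The sketch below assumes the gzip-compressed form from the first example. backup_name and verify_backup are hypothetical helpers.

```shell
#!/bin/bash
# backup_name DATABASE (hypothetical helper)
# Prints a dated file name such as mydb-20160428.sql.gz
backup_name() {
    echo "$1-$(date +%Y%m%d).sql.gz"
}

# verify_backup FILE (hypothetical helper)
# Succeeds only if FILE is non-empty and gzip can read the whole compressed
# stream (gzip -t tests integrity without extracting anything).
verify_backup() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Example (not run here), reusing the placeholders from the commands above:
#   out=/localpath/$(backup_name databasenamegoeshere)
#   ssh -l user 1.1.1.1 "mysqldump -mysqldumpoptions databasenamegoeshere | gzip -3 -c" > "$out"
#   verify_backup "$out" || echo "backup $out is incomplete" >&2
```

This only proves the compressed stream arrived whole; it does not prove mysqldump itself finished cleanly on the remote side.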

Preparing a Github/Gitlab Development Bastion Server

So you are looking to use github / gitlab to manage your infrastructure and development. To do this effectively you will need to prepare your environment. Here is an example.

This is for our ansible playbook.

Install Required Dependencies

yum update -y
yum install -y vim git ansible tree fail2ban

Add user for repo

useradd -m -G wheel osan
passwd osan

Secure SSH by disabling root login and changing SSH port

sed 's/#PermitRootLogin yes/PermitRootLogin no/g;s/#Port 22/Port 222/g' -i /etc/ssh/sshd_config
firewall-cmd --add-port=222/tcp --permanent
firewall-cmd --reload
systemctl restart sshd.service
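Editing sshd_config in place and restarting sshd on a remote box is a good way to lock yourself out, so it can help to apply the same sed expressions to a copy first and look at the result. A sketch, with a hypothetical harden_sshd_config function and the port passed in as a parameter:

```shell
#!/bin/bash
# harden_sshd_config FILE PORT (hypothetical helper)
# Applies the same two substitutions as the sed command above to FILE:
# disables root login and moves sshd to PORT.
harden_sshd_config() {
    local file=$1 port=$2
    sed "s/#PermitRootLogin yes/PermitRootLogin no/g;s/#Port 22/Port $port/g" \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Example (not run here): try it on a copy, then check the result with
# sshd's own test mode before touching the live file.
#   cp /etc/ssh/sshd_config /tmp/sshd_config.test
#   harden_sshd_config /tmp/sshd_config.test 222
#   sshd -t -f /tmp/sshd_config.test
```

Whatever port is chosen, the firewall rule has to open that same port before sshd is restarted.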

Generate key for osan user

su - osan
ssh-keygen -f ~/.ssh/id_rsa -t rsa -N ''

Output the key you generated

cat ~/.ssh/id_rsa.pub

The next step is adding the SSH key above to the ‘profiles’ section of your gitlab/github user. Find this in your profile, under ‘SSH KEYS’.

Set Git Variables

USERNAME=yourgitlabusername
git config --global user.name $USERNAME
git config --global user.email "[email protected]"

Clone Project

git clone [email protected]:$USERNAME/projectname.git

Delete All Cloud Backup from Cloud Files

Please note that by performing the below commands the effect can be destructive.

TAKE CAUTION WHEN USING THIS COMMAND IT CAN DELETE EVERYTHING IF YOU DO SOMETHING WRONG!!!!

# swiftly --verbose --eventlet --concurrency=100 for "" --prefix z_DO_NOT_DELETE --output-names do delete "" --recursive --until-empty

This particular command *should* only remove the cloud files directories starting with z_DO_NOT_DELETE. I have tested it and it appears to work correctly.
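The danger here is the prefix: if it ever ends up empty, the command matches everything. A sketch of a guard follows. delete_by_prefix is a hypothetical wrapper that refuses an empty or very short prefix and, by default, only prints the command so it can be reviewed first. The swiftly flags are copied from the command above unchanged.

```shell
#!/bin/bash
# delete_by_prefix PREFIX [run]
# Hypothetical wrapper: refuses prefixes shorter than 5 characters (an
# arbitrary safety margin) and only prints the swiftly command unless the
# second argument is the literal word "run".
delete_by_prefix() {
    local prefix=$1 mode=${2:-print}
    if [ "${#prefix}" -lt 5 ]; then
        echo "refusing: prefix '$prefix' is too short" >&2
        return 1
    fi
    if [ "$mode" = "run" ]; then
        swiftly --verbose --eventlet --concurrency=100 for "" --prefix "$prefix" \
            --output-names do delete "" --recursive --until-empty
    else
        echo "swiftly --verbose --eventlet --concurrency=100 for \"\" --prefix $prefix --output-names do delete \"\" --recursive --until-empty"
    fi
}
```

Calling delete_by_prefix z_DO_NOT_DELETE prints the command; adding run as a second argument executes it.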

Creating 200 cloud servers using openstack Nova

Had a question on how to do this from a customer today.
It is possible to create a great many cloud servers quickly with something like:

#!/bin/sh
for i in `seq 1 200`;
do
nova boot --image someimageidhere --flavor '2GB Standard Instance' "Server-$i"
sleep 5
done

So simple, but it could build out many servers (a small farm) in just an hour or so :D

Update

So my colleague tells me that backticks are bad, i.e. deprecated. Which they are, and I expected to hear this from someone, as my knowledge is somewhat old school. Here is what my friend recommends.

# Brace expansion needs bash rather than plain sh
for i in {1..200}; do
nova boot --image someimageidhere --flavor '2GB Standard Instance' "Server-$i"
sleep 5
done
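With a couple of hundred builds, some will probably fail (API limits, quota), and a bare loop carries on silently. Here is a sketch that counts failures, using a hypothetical boot_many function. The build command is passed in as arguments so the loop can be tried with a harmless stand-in before pointing it at nova.

```shell
#!/bin/bash
# boot_many COUNT DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND ARGS... "Server-N" for N = 1..COUNT,
# sleeping DELAY seconds between builds, and reports how many calls failed.
boot_many() {
    local count=$1 delay=$2 i=1 failed=0
    shift 2
    while [ "$i" -le "$count" ]; do
        if ! "$@" "Server-$i"; then
            echo "build of Server-$i failed" >&2
            failed=$((failed + 1))
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$failed of $count builds failed"
    [ "$failed" -eq 0 ]
}

# Example (not run here):
#   boot_many 200 5 nova boot --image someimageidhere --flavor '2GB Standard Instance'
```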

Using CBS boot from volume with Rackspace HEAT Orchestration

So, a customer reached out to us today concerning ways to use HEAT to build CBS boot-from-volume servers. They were using a template along these lines:

  blk_server:
    type: "Rackspace::Cloud::Server"
    properties:
      flavor: 15 GB Memory v1
      image: { get_param: image }
      name: "blk"
      user_data:
...

The problem is that using this format they get an error:

ERROR: Image Ubuntu 14.04 LTS (Trusty Tahr) (PVHVM) requires 20 GB minimum disk space. Flavor 15 GB Memory v1 has only 0 GB.

This is happening because the memory flavor doesn’t use the hypervisor instance store, and instead uses Cloud Block Storage, hence ‘0 GB’.
Thanks to my friend Aaron I have dug out the documentation for building CBS boot from volume server flavors. Here is how it would be done.

parameters:
  nodesize:
    type: number
    label: Nodes Disk Size
    description: Size of each node's primary disk.
    default: 50
    constraints:
      - range: { min: 50, max: 1024 }
        description: must be between 50 and 1024 GB.

  nodeimage:
    label: Operating system
    description: |
      Server image. Defaults to 'CentOS 7 (PVHVM)'.
    type: string
    default: CentOS 7 (PVHVM)
    constraints:
    - allowed_values:
      - CentOS 7 (PVHVM)
      - Red Hat Enterprise Linux 7 (PVHVM)
      description: Must be a supported operating system.

resources:
  elk_server:
    type: "Rackspace::Cloud::Server"
    properties:
      flavor: 15 GB Memory v1
      block_device_mapping: [{ device_name: "vda", volume_id : { get_resource : cinder_volume }, delete_on_termination : "true" }]
      name: "elk"
      user_data:

  cinder_volume:
    type: OS::Cinder::Volume
    properties:
      size: { get_param: nodesize }
      image: { get_param: nodeimage }
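The nodesize constraint means HEAT will reject a stack whose disk size falls outside 50 to 1024. Checking the value in the shell before submitting saves a round trip. The following is a rough sketch: valid_nodesize is a hypothetical helper, and the commented example assumes the legacy heat command from python-heatclient and a template saved as cbs-server.yaml, neither of which comes from the template above.

```shell
#!/bin/bash
# valid_nodesize SIZE (hypothetical helper)
# Mirrors the template's range constraint: a whole number, 50 to 1024 inclusive.
valid_nodesize() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or not a plain whole number
    esac
    [ "$1" -ge 50 ] && [ "$1" -le 1024 ]
}

# Example (not run here; the heat CLI, file name and stack name are assumptions):
#   size=100
#   valid_nodesize "$size" && heat stack-create -f cbs-server.yaml -P "nodesize=$size" elk-stack
```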