Site keeps on going down because of spiders

So a Rackspace customer was consistently having an issue with their site going down, even after the number of workers was increased. It looked like, in this customer's case, they were being hit really hard by Yahoo Slurp, Googlebot, AhrefsBot, and many, many others.

So I checked the hour the customer was affected and found that, over that hour, Yahoo Slurp and Googlebot alone accounted for 415 requests. That made up around 25% of all the requests to the site, so it was certainly possible the max workers were being reached due to spikes in bot traffic, in parallel with potential spikes in regular visitors.

[root@www logs]#  grep '01/Mar/2017:10:' access_log | egrep -i 'www.google.com/bot.html|http://help.yahoo.com/help/us/ysearch/slurp' |  wc -l
415

It wasn't a complete theory, but it was the best available with the information I had, since everything else had been checked; the only thing that remained was checking the number of retransmits for that machine. All in all it was a victory, and this was so awesome that I'm now thinking of making a tool that will do this in a more automated way.

I don't know if this is the best way to find the Googlebot and Yahoo Slurp spiders, but it seems like a good method to start with.
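As a first stab at that automated tool, here's a rough sketch of my own (not battle-tested) that ranks the busiest user agents for a given hour, so bot spikes stand out; the log path and date-format arguments are assumptions based on the grep above, and it assumes a combined-format access log where the user agent is the last quoted field:

#!/bin/bash
# usage: ./topagents.sh access_log '01/Mar/2017:10:'
LOG="$1"
HOUR="$2"

# pull the user-agent field (the last quoted field in a combined-format log),
# then count occurrences and print the top 10
grep "$HOUR" "$LOG" \
  | awk -F'"' '{print $(NF-1)}' \
  | sort | uniq -c | sort -rn | head -10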

Using omconfig to add a RAID 1 device for a Dell PERC 6/i RAID Controller

So, I've been provisioning disks and suchlike recently... this is how I did it on a Dell. Quite an easy thing to do!

omconfig storage controller action=createvdisk controller=0 raid=r1 size=max readpolicy=ara pdisk=0:0:2,0:0:3
Command successful!

In this case the two disks newly added were 0:0:2 and 0:0:3 on the SAS ‘bus’.
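If you want to sanity-check which physical disks are actually free before creating the vdisk, omreport (part of the same OMSA tooling as omconfig) can list them; something like:

# list the physical disks attached to controller 0, with their states
omreport storage pdisk controller=0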

An additional primary partition, sdb1, was then created on the new device.
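The partitioning step itself wasn't captured above, but a minimal sketch with a reasonably modern parted, assuming the new vdisk appeared as /dev/sdb, would be:

# label the disk and create one primary partition spanning the whole device
parted -s /dev/sdb mklabel msdos mkpart primary ext3 0% 100%

With the partition in place, a filesystem of the same kind (ext3) as the system disk was created: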

 mkfs.ext3 /dev/sdb1

mke2fs 1.39 (29-May-2006)
....
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

You will naturally need to mount the partition and create an fstab entry to make this permanent:

mount /dev/sdb1 /mnt/backup

echo "/dev/sdb1               /mnt/backup             ext3    defaults        1 1" >> /etc/fstab

You may wish to consider adding the above to fstab manually instead; using echo for it isn't a great idea in case you make a mistake ;-D
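If you do edit fstab by hand, you can check the new entry without a reboot:

# unmount, then let fstab drive the mount; errors here mean a bad entry
umount /mnt/backup
mount -a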

Cheers &
Best wishes,
Adam

Block all the IPs from a country

So, I wrote a nice little one-liner for one of our customers who wanted to blanket-ban Russia (even though I said it wasn't a good idea, and only marginally effective at stopping attacks). It might help with spam and other stuff though, and anyway, the customer is always 'wrong'; it's up to us to make sure that they do it wrongly right. ;-D

curl http://www.ipdeny.com/ipblocks/data/countries/ru.zone -o russia_ips_all.txt; cat russia_ips_all.txt | xargs -i echo /sbin/iptables -I INPUT -s {} -j DROP

Here is how I achieved it above. Note that, as written, the echo only prints the iptables rules so you can review them first; pipe the output to sh (or drop the echo) to actually apply them. This bans all the IPs from Russia. But, if you aren't very equal-opportunities :(, you can ban all kinds of countries:

http://www.ipdeny.com/ipblocks/

Just take a look at that list and change the URL accordingly. It doesn't matter what the filenames say (even if they say russia; just change the URL directly after curl). For instance:

curl http://www.ipdeny.com/ipblocks/data/countries/pl.zone -o ips_all.txt; cat ips_all.txt | xargs -i echo /sbin/iptables -I INPUT -s {} -j DROP

I was really quite happy with this little one-liner. 😀
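If you want it reusable, here's a hedged little wrapper (the script name and layout are my own) that takes the two-letter country code as an argument; the echo is deliberately kept, so pipe the output to sh once you've reviewed the rules:

#!/bin/bash
# usage: ./banctry.sh ru      (any two-letter code from ipdeny.com)
CC="$1"

# grab the country's zone file and print the iptables commands for review
curl -s "http://www.ipdeny.com/ipblocks/data/countries/${CC}.zone" -o "${CC}_ips_all.txt"
cat "${CC}_ips_all.txt" | xargs -i echo /sbin/iptables -I INPUT -s {} -j DROP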

Cheers &
Best wishes,
Adam

Help! I can't log in to my cloud-server even though I've reset my root password

The most common cause of this is that PermitRootLogin is set to no in /etc/ssh/sshd_config, although there might be other causes, like a really broken sshd_config, rather than just that one variable. The procedure for looking into it is pretty much the same regardless of the breakage that has occurred.

Here's the full procedure:

1) Put the server into rescue mode.
2) Log in to the cloud-server over SSH; note that rescue mode gives you a new temporary root password, allowing you to reset the password for SSH on the 'original disk'.
3) Once logged in, mount the /dev/xvdb device. This may be /dev/xvdb1 or /dev/xvdb2, but is usually /dev/xvdb1. Then chroot (change root) to the 'original disk'.

# Mount old disk
mount /dev/xvdb1 /mnt

# Change to the ‘old disk’
chroot /mnt

# Set the new password for root on the old disk:

passwd
# enter the new password when prompted

and specifically ensure that /etc/ssh/sshd_config has this line:

PermitRootLogin no

changed to:

PermitRootLogin yes
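If you'd rather script that edit while inside the chroot, a sed one-liner along these lines should do it (a sketch; it assumes the directive sits on its own line, commented out or not):

sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config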

Your developer or sysadmin won't be able to log in until you have reset the root password here, and if you do not know a username you can su to root from, it is absolutely critical to perform this step; otherwise you won't be able to access the server.

Also, once you have allowed root login and changed the password to something you recognise, you can exit the chroot, take the server out of rescue mode through the control panel, and log in to the machine as normal.

For more detail about how to do this (although pretty much all the steps are here), please see:

https://support.rackspace.com/how-to/rackspace-cloud-essentials-rescue-mode-on-linux-cloud-servers/

I hope this helps you folks out some,

Enabling Automatic Security Updates in CentOS 6, 7 and RHEL 6 and RHEL 7 (and Debian and Ubuntu too)

yum -y install yum-cron

This can also be done on Debian and Ubuntu systems if you are feeling left out:

apt-get -y install unattended-upgrades
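On the Debian/Ubuntu side, you'll usually also want to generate the periodic-apt settings; as far as I know the stock way is:

# writes /etc/apt/apt.conf.d/20auto-upgrades to switch the updates on
dpkg-reconfigure -plow unattended-upgrades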

Configuration on the CentOS/RHEL side is:

da da da da da da da! Actually, installing the package is nearly all you need to do to enable it, though there are a lot of things you can customise in /etc/yum/yum-cron.conf (the full stock file is below).
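One thing worth doing first: make sure the yum-cron service itself is running. A quick sketch, assuming SysV init on CentOS/RHEL 6 and systemd on 7:

# CentOS/RHEL 6
chkconfig yum-cron on && service yum-cron start

# CentOS/RHEL 7
systemctl enable yum-cron && systemctl start yum-cron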

[commands]
#  What kind of update to use:
# default                            = yum upgrade
# security                           = yum --security upgrade
# security-severity:Critical         = yum --sec-severity=Critical upgrade
# minimal                            = yum --bugfix update-minimal
# minimal-security                   = yum --security update-minimal
# minimal-security-severity:Critical = yum --sec-severity=Critical update-minimal
update_cmd = default

# Whether a message should be emitted when updates are available,
# were downloaded, or applied.
update_messages = yes

# Whether updates should be downloaded when they are available.
download_updates = yes

# Whether updates should be applied when they are available.  Note
# that download_updates must also be yes for the update to be applied.
apply_updates = no

# Maximum amount of time to randomly sleep, in minutes.  The program
# will sleep for a random amount of time between 0 and random_sleep
# minutes before running.  This is useful for e.g. staggering the
# times that multiple systems will access update servers.  If
# random_sleep is 0 or negative, the program will run immediately.
# 6*60 = 360
random_sleep = 360


[emitters]
# Name to use for this system in messages that are emitted.  If
# system_name is None, the hostname will be used.
system_name = None

# How to send messages.  Valid options are stdio and email.  If
# emit_via includes stdio, messages will be sent to stdout; this is useful
# to have cron send the messages.  If emit_via includes email, this
# program will send email itself according to the configured options.
# If emit_via is None or left blank, no messages will be sent.
emit_via = stdio

# The width, in characters, that messages that are emitted should be
# formatted to.
output_width = 80


[email]
# The address to send email messages from.
email_from = root@localhost

# List of addresses to send messages to.
email_to = root

# Name of the host to connect to to send email messages.
email_host = localhost


[groups]
# NOTE: This only works when group_command != objects, which is now the default
# List of groups to update
group_list = None

# The types of group packages to install
group_package_types = mandatory, default

[base]
# This section overrides yum.conf

# Use this to filter Yum core messages
# -4: critical
# -3: critical+errors
# -2: critical+errors+warnings (default)
debuglevel = -2

# skip_broken = True
mdpolicy = group:main

# Uncomment to auto-import new gpg keys (dangerous)
# assumeyes = True

Creating basic Log Trees for Rackspace CDN

Well, this was a little bit of a hack, but I'm sharing it because it's quite cool.

#!/bin/bash

# Simple script that is designed to process Rackspace .CDN_ACCESS_LOGS recursively
# once they are downloaded with swiftly

# to download the CDN logs use
# swiftly --verbose --eventlet --concurrency=100 get .CDN_ACCESS_LOGS --all-objects -o ./

# list only the container directories, rather than parsing ls -al output
alldirs=$(ls -1d */)

echo "$alldirs" > alldirs.txt

for item in $(cat alldirs.txt)
do
        echo " --- CONTAINER $item START ---"
        tree -L 2 "$item"
        echo " --- CONTAINER $item END --- "
        printf "\n\n"
done
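Usage is nothing fancy; run it from the directory the swiftly download landed in and capture the output (the script name here is hypothetical):

cd /path/to/downloaded/cdn/logs
./cdn-logtree.sh > cdn_log_tree.txt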

DISASTER RECOVERY! Exporting a Broken Cloud-server image VHD from Rackspace and attempting to recover data

Thanks to my colleague Marcin for the guestmount tools protip.

I wrote a previous guide which explains how to download/export a Cloud Server image VHD from Rackspace Cloud when the server is failing to build. It might allow you to perform data recovery even if the image can't be booted. I'm guessing someone is going to run into this sooner or later and will be pleased to see this article; it will at least give you your best shot at reading the VHD and recovering the data, since, as you might know already, just because the boot process or kernel is broken doesn't mean that the data isn't there!

# A better article to use if you want to download via commandline
https://community.rackspace.com/products/f/25/t/3583

# My article doing this thru a web-browser which might be useful too for some customers
https://community.rackspace.com/products/f/25/t/7089

Once the image has been downloaded to your new cloud instance, you can use the 'libguestfs-tools' package (same name on Ubuntu and CentOS), which contains the tools necessary for mounting .vhd image files.

The command would be (read-only mode):

guestmount -a {image-name}.vhd -i --ro {mount-point}
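A hypothetical end-to-end run, where the image name and mount point are placeholders:

# install the tooling and mount the VHD read-only
yum -y install libguestfs-tools      # apt-get install libguestfs-tools on Ubuntu
mkdir -p /mnt/recover
guestmount -a exported-server.vhd -i --ro /mnt/recover

# the old filesystem should now be browsable for data recovery
ls /mnt/recover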

Taking strace output from stderr and piping to other utilities

Well, this is a strange thing to do, but say you want to know how fast an application is processing data. How to tell? Enter strace, and... a bit of wc -l, with the assistance of tee and proper 2>&1 redirection.

strace -p 9653 2>&1 | tee >(wc -l)

where 9653 is the process id (pid) and wc -l is the command you want to pipe to!

read(4, "2.74.2.34 - - [26/05/2015:15:15"..., 8192) = 8192
read(4, "o) Version/7.1.6 Safari/537.85.1"..., 8192) = 8192
read(4, "ident/6.0)\"\n91.233.154.8 - - [26"..., 8192) = 8192
^C1290

1290 lines in the output... perfect, that's what I wanted to know: roughly how quickly this log parser is going through my logs 😀
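If you'd rather have a count over a fixed window than hitting Ctrl+C, a hedged variation is to let coreutils timeout kill strace for you, tracing only the read calls:

# roughly how many read() syscalls does PID 9653 make in ten seconds?
timeout 10 strace -p 9653 -e trace=read 2>&1 | wc -l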

Adding nodes and Updating nodes behind a Cloud Load Balancer

I have succeeded in putting together a basic script documenting exactly how the API works for adding node(s), listing the nodes behind the LB, and updating nodes (to DRAINING, DISABLED, or ENABLED).

Use update node to set one of your nodes to gracefully drain (stop accepting new connections and wait for present connections to finish). Naturally, you will want to put the secondary server behind the load balancer first, with addnode.sh.

Once the new node is added as ENABLED, set the old node to DRAINING. This will gracefully switch over to the new server.

# List Load Balancers

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='10017858'

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`



curl -v -H "X-Auth-Token: $TOKEN" -H "content-type: application/json" -X GET "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID"

#

# Add Node(s) addnode.sh

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='10017858'

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`

# Add Node
curl -v -H "X-Auth-Token: $TOKEN" -d @addnode.json -H "content-type: application/json" -X POST "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID/nodes"



## 

For the addnode script you require a file called addnode.json;
that file must contain the ServiceNet (snet) IPs you wish to add.

#
# addnode.json

{"nodes": [
        {
            "address": "10.0.0.1",
            "port": 80,
            "condition": "ENABLED",
            "type":"PRIMARY"
        }
    ]
}

##

##

# updatenode.sh

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='100101010'
NODE_ID=719425

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`

# Update Node

curl -v -H "X-Auth-Token: $TOKEN" -d @updatenode.json -H "content-type: application/json" -X PUT "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID/nodes/$NODE_ID"

##

##

## updatenode.json

{"node":{
            "condition": "DISABLED",
            "type":"PRIMARY"
        }
}

Naturally, you can change the condition to ENABLED, DISABLED, or DRAINING.

I recommend using DRAINING, since it gracefully removes the cloud server: any existing connections are waited on before the server is removed from the LB.
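For example, following the same format as updatenode.json above, draining the old node is just:

{"node":{
            "condition": "DRAINING",
            "type":"PRIMARY"
        }
}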

HTTPS to HTTP redirect for Rackspace Cloud Load Balancer

Hello,

So a customer reached out to us today asking how to configure HTTPS to HTTP redirects. This is actually really easy to do.

Think of it like this:

When you enable both HTTP and HTTPS (allow insecure and secure traffic), the protocol hitting the server is always HTTP.

When you 'only allow secure traffic', the client-facing side is always HTTPS; but if the load balancer has certificates on it and supports both protocols (i.e. terminates SSL at the load balancer), then the requests hitting the server will still always be HTTP.

This is why X-Forwarded-Proto becomes important: it is what lets your server determine which traffic arriving from the load balancer originated over HTTPS and which originated over HTTP. That is what allows you to do the redirection effectively on the cloud-server side.

I hope this helps!

So the rewrite rule on the server needs to use X-Forwarded-Proto to detect the protocol instead of the usual 'https' checks; in other words, the rule uses the header rather than the regular incoming protocol to decide whether to redirect:

    RewriteEngine On
    RewriteCond %{HTTP:X-Forwarded-Proto} !=http
    RewriteCond %{REQUEST_URI} !^/some/path.*
    RewriteRule (.*) http://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Quite simple, but it took me the better part of half an hour to get my head properly around this 😀
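If your backend runs nginx rather than Apache, the same idea translates to something like this (a sketch of mine, not from the original setup), placed inside the relevant server block:

# redirect to plain HTTP when the LB says the request originally arrived over HTTPS
if ($http_x_forwarded_proto = "https") {
    return 301 http://$host$request_uri;
}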

You might be interested in achieving this on Windows systems with IIS or similar, so check out this link, which explains how to do that step by step on Windows:

https://support.rackspace.com/how-to/configuring-load-balanced-sites-with-ssl-offloading-using-iis/