so, you want to check your Rackspace DNS API rate Limits?
Here is how you do it.
#!/bin/bash
# Username used to login to control panel
USERNAME='mycloudusernamegoeshere'
# Find the APIKey in the 'account settings' part of the menu of the control panel
APIKEY='mycloudapikeygoeshere'
# Customer ID of the account (the numbers that are in the URL when you login to mycloud control panel)
CUSTOMER_ID="100101010"
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
# Retrieve Rate Limit detail for the account
curl "https://dns.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/limits" -H "X-Auth-Token: $token" | python -m json.tool
A customer came to us with an issue with their CDN which was strange and odd. I wanted to document this so that it is understood why this happened.
The customer is using two TLS origins, and HTTP2. Why that is a problem will become evidently clear. This is a general method of troubleshooting in terms of replicating behaviour of the CDN and origin with host headers. This can be applied no matter what the problem, to understand the HTTP code given by the origin, which at least half of the time turns out to be the cause. The origin being the cloud-server your CDN is backed by.
Question
Hi,
We are currently experiencing some issues with the Cloud CDN. We are using this for our CSS and images and now everything is getting a HTTP/503 SERVICE UNAVAILABLE. If you want to test, you may test this url:
https://cdn.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css
This is supposed to deliver this file:
https://origin.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css
Is something mis-configured or are there some issues on the appliance?
Answer
First we confirm the origin is UP
# curl -I https://originserver.customerdomain.com/static/version1476169182/adminhtml/Magento/backend/nb_NO/extjs/resources/css/ext-all.min.css
HTTP/1.1 200 OK
Date: Tue, 11 Oct 2016 08:45:42 GMT
Server: Apache
Last-Modified: Tue, 11 Oct 2016 06:57:53 GMT
ETag: "ed26-53e91653c61d0"
Accept-Ranges: bytes
Content-Length: 60710
Vary: Accept-Encoding
Cache-Control: max-age=31536000, public
Expires: Wed, 11 Oct 2017 08:45:42 GMT
Access-Control-Allow-Origin: *
X-Frame-Options: SAMEORIGIN
Content-Type: text/css
The origin is the cloud-server where the CDN pulls from. As we can see the site is up. So what is causing the issue? The way CDN works is it provides a host header for the domain, so the site has to have a host for both domains. The reason is that the CDN uses CNAME hostnames to identify which CDN is which. I.e. which path like /media/ directs to which static origins subdomain.
The best way to look further at the situation now is to check the origin (the subdomain that you’ve associated with your CDN subdomain that raxspace gives you, when sending the host header for the CDN url we get:
Looking at the definition for HTTP 2, this issue was caused by different TLS configurations for your domains and mod http2 trying to reuse the same connection, which will not work if the TLS configurations are not the same on the origin cloud-server side.
You just need to disable HTTP2, or configure the TLS configurations to be the same on the apache2 side. I hope that this clarifies and makes sense to you, of course if you have additional questions, comments or concerns please don't hesitate to reach out to us, we are here to help!
As you can see the importance of debugging CDN by sending host header to the origin that the CDN uses, to replicate the issue the customer was experiencing, which was essentially, the CDN edgenodes (the machines around the world that pull from the origin for content distribution really worldwide), weren't able to retrieve the files from the origin with the host header domain that is defined in the control panel.
This customer needed to in this case adjust their Apache2 configuration. The problem was likely caused by updating Apache2 or similar.
So a customer came with the question today on the process of uploading an ISO to a cloud-server. It’s a relatively easy process. RDP allows you to use resource sharing, that effectively mounts on the server you RDP to, your remote filesystem of your choice. This is the manner in which you can upload the file easily.
Once you have the ISO on the server you simply need to mount the disk with a software to present it as an ordinary CD/DVD disk. I personally use Elaborate Bytes’ Virtual Clone Drive.
So, a customer is experiencing slowness/sluggishness in their app. You know there is not issue with the hypervisor from instinct, but instinct isn’t enough. Using tools like xentop, sar, bwm-ng are critical parts of live and historical troubleshooting.
Sar can tell you a story, if you can ask the storyteller the write questions, or even better, pick up the book and read it properly. You’ll understand what the plot, scenario, situation and exactly how to proceed with troubleshooting by paying attention to these data and knowing which things to check under certain circumstances.
This article doesn’t go in depth to that, but it gives you a good reference of a variety of tests, the most important being, cpu usage, io usage, network usage, and load averages.
CPU Usage of all processors
# Grab details live
sar -u 1 3
# Use historical binary sar file
# sa10 means '10th day' of current month.
sar -u -f /var/log/sa/sa10
CPU Usage of a particular Processor
sar -P ALL 1 1
‘-P 1’ means check only the 2nd Core. (Core numbers start from 0).
sar -P 1 1 5
The above command displays real time CPU usage for core number 1, every 1 second for 5 times.
Observing Changes in Memory over time
sar -r 1 3
The above command provides memory stats every 1 second for a total of 3 times.
Observing Swap usage over time
sar -S 1 5
The above command reports swap statistics every 1 seconds, a total 3 times.
Overall I/O activity
sar -b 1 3
The above command checks every 1 seconds, 3 times.
Individual Block Device I/O Activities
This is a useful check for LUN , block devices and other specific mounts
sar -d 1 1
sar -p d
DEV – indicates block device, i.e. sda, sda1, sdb1 etc.
Total Number processors created a second / Context switches
sar -w 1 3
Run Queue and Load Average
sar -q 1 3
This reports the run queue size and load average of last 1 minute, 5 minutes, and 15 minutes. “1 3” reports for every 1 seconds a total of 3 times.
Report Network Statistics
sar -n KEYWORD
KEYWORDS Available;
DEV – Displays network devices vital statistics for eth0, eth1, etc.,
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.
sar -n DEV 1 1
Specify Start Time
sar -q -f /var/log/sa/sa11 -s 11:00:00
sar -q -f /var/log/sa/sa11 -s 11:00:00 | head -n 10
Today, a customer approached us after a Host Server Down complaining that, although the server is up again their website and application were down & not working. Even though the server was online and functioning correctly.
The customer discovered that the source of the issue was that there /etc/resolv.conf is blank, this means that they will not be able to resolve DNS A/PTR/CNAME record hostnames into a resolved IP. This is called hostname to IP resolution. Its means that if /etc/resolv.conf is blank and the customer uses hostnames in their calls, such a failure will break the connectivity due to failure to resolve to IP to communicate on the TCP stack.
There is actually a very simple way to prevent the /etc/resolv.conf file from being changed. But first, it’s important to understand why /etc/resolv.conf is being reset.
On All Rackspace cloud-servers there is a process called nova-agent, and when the server starts up, the /etc/resolv.conf file will be reset along with the networking configuration. This happens each time your server is restarted and is used to set new networking details, specifically if you take an image and build server on a new ip address or if your server is live-migrated to a new host, it makes sure on the next reboot it comes up with correct networking detail transparently. However this can cause some issues, such as in this case with the /etc/resolv.conf file. Fortunately there are some novel ways of preventing your /etc/resolv.conf being modified after you have added the correct nameservers you desire to it.
You can use the chattr immutable file setting to stop processes from modifying it after you have made the changes to your resolv.conf that are desired;
Set Immutable File
chattr +i /etc/resolv.conf
Un-set Immutable File
chattr -i /etc/resolv.conf
This /etc/resolv.conf issue is a common problem, however using immutable file flag and chattr should prevent it from being changed ever again.
So, I was testing with curl today and I know that it’s possible to direct to /dev/null to suppress the page. But that’s not very handy if you are checking whether html page loads, so I came up with some better body checks to use.
A Basic body check using wc -l to count the lines of the site
time curl https://www.google.com/ > 1; echo "non zero indicates server up and served content of n lines"; cat 1 | wc -l
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 167k 0 167k 0 0 79771 0 --:--:-- 0:00:02 --:--:-- 79756
real 0m2.162s
user 0m0.042s
sys 0m0.126s
non zero indicates server up and served content of n lines
2134
A body check for Google analytics
$ time curl https://www.groundworkjobs.com/ > 1; echo "Checking for google analytics html elements string"; cat 1 | grep "www.google-analytics.com/analytics.js"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 167k 0 167k 0 0 76143 0 --:--:-- 0:00:02 --:--:-- 76152
real 0m2.265s
user 0m0.042s
sys 0m0.133s
Checking for google analytics html elements string
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
Such commands might be useful when troubleshooting a cluster for instance, where one server shows more up to date versions, (different number of lines). There’s probably better way to do this with ls and awk and use the html filesize, since number of lines wouldn’t be so accurate.
Check Filesize from request
$ time curl https://www.groundworkjobs.com/ > 1; var=$(ls -al 1 | awk '{print $5}') ; echo "Page size is: $var kB"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 167k 0 167k 0 0 79467 0 --:--:-- 0:00:02 --:--:-- 79461
real 0m2.170s
user 0m0.048s
sys 0m0.111s
Page size is: 171876 kB
Pretty simple.. but you could take the oneliner even further… populate a variable called $var with the filesize using ls and awk , and then use an if statement to check that var is not 0, indicating the page is answering positively, or alternatively not answering at all.
Check Filesize and populate a variable with the filesize, then validate variable
$ time curl https://www.groundworkjobs.com/ > 1; var=$(ls -al 1 | awk '{print $5}') ; echo "Page size is: $var kB"; if [ "$var" -gt 0 ] ; then echo "The filesize was greater than 0, which indicates box is up but may be giving an error page"; fi
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 167k 0 167k 0 0 78915 0 --:--:-- 0:00:02 --:--:-- 78950
real 0m2.185s
user 0m0.041s
sys 0m0.132s
Page size is: 171876 kB
The filesize was greater than 0, which indicates box is up but may be giving an error page
The second exercise is not particularly useful or practical as a means of testing, since if the site was timing out the script would take ages to reply and make the whole test pointless, but as a learning exercise being able to assemble one liners on the fly like this is an enjoyable, rewarding and useful investment of time and effort. Understanding such things are the fundamentals of automating tasks. In this case with output filtering, variable creation, and subsequent validation logic. It’s a simple test, but the concept is exactly the same for any advanced automation procedure too.
So, as you may already be aware, I am working on a lightweight backup script called obscene redundancy’. An redundant backup software capable of 18 replicas of data to Rackspace Cloud Files API service. It’s so redundant… it’s obscene redundancy.
Today, I was discussing with my colleague, that it was all very well uploading your tar to cloud files, but, wouldn’t you really like to know if the file you uploaded is completely identical number of bits, and order? Enter, Cloud Files ‘HEAD’and Etag. Our MD5 friend.
What I did to improve the obscene redundancy script was quite simple here:
# We define a variable that takes the 'Etag' (MD5Sum) value for the cloud files archive
cfmd5sum=$(swiftly --conf swiftly-configs/swiftly-${SHORT_REGION,,}.conf head
"${BACKUP_DEST}/${FILE}" | grep -i Etag | awk '{print $2}')
# We Define a variable that generates an 'MD5Sum' for the local file archive
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE")
echo "Checking Data integrity of Cloud Files upload to $REGION"
echo "Cloud Files Archive MD5: $cfmd5sum ....... Local File Archive MD5: $localmd5sum"
# If these values
if [[ "$cfmd5sum" -ne "$localmd5sum" ]];
then
echo "VALUES NOT EQUAL"
echo "$REGION CRC OK..."
else
echo "VALEUS EQUAL
echo "$REGION CRC missing, in error, or NOT OK..."
fi
After all this I found that the script wasn’t working properly… so I did some debugging about this to check, at least, first of all , the length of each variable.
if [[ "$cfmd5sum" == "$localmd5sum" ]]; then
echo "VALUES EQUAL, (local md5sum length given first)"
echo "$localmd5sum"| wc -L
echo "$cfmd5sum"| wc -L
echo "$REGION CRC OK..."
else
echo "VALUES NOT EQUAL"
echo "$localmd5sum"|wc -L
echo "$cfmd5sum"|wc -L
echo "$REGION CRC missing, in error, or NOT OK..."
fi
The output shown me that the variable length was different. At this stage I’ve no idea why, but will add updates here. I’m going to commit this to obsceneredundancy because proof of concept is working and valid, as shown by the output of the script. (i.e. the method is fine, it’s just the way the string is compared in the if, statement, I suspect it is to do with special character or \n characters as I had before. So, when I made this addition to the multi-dc-backup.sh script.. the output now looks like:
Creating Container in LON for obsceneredundancy
LON: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://LON/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_LON
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_LON CRC missing, in error, or NOT OK...
lon: COMPLETED OK 15504796/15504796
ORD: Not backing up ...
Creating Container in IAD for obsceneredundancy
IAD: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://IAD/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_IAD
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_IAD CRC missing, in error, or NOT OK...
iad: COMPLETED OK 15504796/15504796
DFW: Not backing up ...
As we can see the 107 (localmd5size) and the 32 (cloudfilesmd5size) are different! I’ve no idea why, since when echoing the variables they look the same. I suspect gremlins and Trolls. A fresh head tomorrow will probably solve this in a few minutes!
This one is worth a mention because it causes some of our customers alarm. If your seeing this ‘warning’ in your Cloud-server control panel, don’t fret!
When you build a cloud-server and select the two tick boxes at the bottom of the build server page (scroll right down), this instructs rackspace automation to attempt to install the Rackspace monitoring & Rackspace Backup agent.
When the server finishes building, these are usually applied by the automation, but sometimes it may have an issue logging into the server and doesn’t run as expected.
Since this warning only indicates the monitoring and cloud backup auto-install failed, these can still be installed by yourselves manually at the below location (please note these links may become out of date use docs.rackspace.com and support.rackspace.com for more detail):
To summarise and clarify, the notification ‘”Server build complete. Installing & configuring software.’ indicates that your server environment built OK, and that the server is waiting for automation to install the additional 2 Rackspace products.
If you see this notification again, it is safe to ignore in terms of the functioning of the cloud-server, and is intended as a warning so you know Rackspace monitoring and cloud backup were not additionally installed by the automation. I have reset the state of your server and you can consider this situation resolved.
If this is causing you concern, it’s actually possible to correct this yourself by installing supernova and novaclient. Please take special care when using admin resources such as nova and API. It’d be difficult to break something if you don’t follow these instructions, but still…take care!
# Install using Python pip the supernova nova wrapper and the rackspace-novaclient
pip install supernova rackspace-novaclient
# Remove the 'rax_service_level_automation' metadata from this server
supernova customer meta serveruuidgoeshere delete rax_service_level_automation
Simples fix. Please note that you will need to configure supernova. This is explained in the supernova category of this blog, and also at:
Hey folks. So, recently I have been doing a bit of work on the Rackspace community, specifically trying to document and make as easy as possible the importing and exporting of cloud server VHD’s between Rackspace regions. This might be really useful if you are designing some HA or multi-region and/or load balancing solution that might be utilizing autoscale, and other kinds of redundancy too, but moving your ‘golden image’ between regions might be quite difficult if doing the entire process manually or step by step as I have documented in the below two articles:
Exporting Cloud server images from a Rackspace Region
https://community.rackspace.com/products/f/25/t/7089
Importing Cloud Server Images to a Rackspace Region
https://community.rackspace.com/products/f/25/t/7186
In this article I completely finish writing the ‘automation demo’ of how to specifically move images, without changing much at all, apart from one ‘serverID’ variable, and the source and destination. The script isn’t finished yet, however the last time I posted this on my blog I was so excited, I actually forgot to include the import function. (which is kind of important!) sorry about that.
echo "Exporting VHD to Cloud Files"
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
echo "IMAGEID detected as $IMAGEID"
# This section requests the Glance API to copy the cloud server image uuid to a cloud files container called export
# > export-cloudfiles
echo "THE IMAGE ID IS: $IMAGEID"
IMAGEID=${IMAGEID%$'\r'}
curl -v "https://lon.images.api.rackspacecloud.com/v2/$TENANT/tasks" -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" -d '{"type": "export", "input": {"image_uuid": "'$IMAGEID'" , "receiving_swift_container": "export"}}' -o export-cloudfiles
echo "Export looks like"
echo "Waiting for Task to complete..."
## WAIT FOR TASKID EXPORT TO COMPLETE TO CLOUD FILES
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
# This section requests the Glance API to copy the cloud server image uuid to a cloud files container called export
curl "https://lon.images.api.rackspacecloud.com/v2/1000000/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
while [ "$EXPORT_STATUS" = "processing" ]; do
sleep 15
curl "https://lon.images.api.rackspacecloud.com/v2/1000000/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
done
# SET CORRECT CLOUD FILES NAME
CLOUD_FILES_NAME=$(cat export-cloudfiles | python -mjson.tool | grep image_uuid | awk '{print $2}' | sed 's/,//g' | sed 's/"//g')
## Download VHD Cloud from Cloud Files to this server
As You can probably see my code is still rather rough, but it’s just so darn exciting that this script works from start to finish, nicely I just HAD to share it a bit earlier! The plan now is to add commandline function so that you can specify ./moveregion {SOURCE_REGION} {DEST_REGION} {SERVER_ID} {TENANT_ID} . Then a customer or a racker would only need these 4 variables to import and export images in an automated way.
I can rewrite the script in such a way that it would accept a .txt file of a couple of hundred cloud server UUID’s, and it would take the server UUID of each, use that uuid to create an image of each server, export to cloud files, import to cloud files, and then import to glance image store for the second region destination. Which naturally, would save hundreds of hours of human time doing this manually.. which is … nice 😀
I would really like to make a UI frontend, using something like Django, and utilize some form of ‘light’ database, that keeps track of all the API import/exports, and even provides estimated time for completion, but my UI skills are really limited to xhtml, css php and mysql.. I need a python or django guy to help out with some of this. If anyone is interested, please reach out to me.