Rackspace Cloud Server not coming up after building it

So! Perhaps you’ve read my article on nova-agent, the common cause of this issue? If you haven’t, you should, since it covers the importance of nova-agent well.

However, nova-agent itself also comes unstuck if the machine it is installed on is missing the xe-linux-distribution service. This service is provided by the xe-guest-utilities package, which you can install yourself if installing nova-agent and ensuring it starts on boot does not fix your issue.

Specifically, if your nova-agent log gives you the message below, you know you need to install xe-guest-utilities. Simples!

Problem

# cat /var/log/nova-agent.log

2016-10-06 18:58:14,696 [ERROR] [EXC] Traceback (most recent call last):
2016-10-06 18:58:14,697 [ERROR] [EXC]   File "/usr/share/nova-agent/nova-agent.py", line 40, in 
2016-10-06 18:58:14,697 [ERROR] [EXC]     xs = plugins.XSComm()
2016-10-06 18:58:14,697 [ERROR] [EXC]   File "/usr/share/nova-agent/1.39.0/plugins/xscomm.py", line 43, in __init__
2016-10-06 18:58:14,697 [ERROR] [EXC]     self.xs_handle = pyxenstore.Handle()
2016-10-06 18:58:14,700 [ERROR] [EXC] PyXenStoreError: Couldn't open connection to the xenstore: No such file or directory
2016-10-06 18:58:14,701 [ERROR] failed to parse config file '/usr/share/nova-agent/nova-agent.py'

Solution

# Redhat and CentOS systems
yum install xe-guest-utilities

# Debian, Ubuntu and other apt based systems
apt-get install xe-guest-utilities
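
If you want to tell the two causes apart quickly, you can grep the agent log for the xenstore error before installing anything. This is only a minimal sketch: the function name needs_xe_guest_utilities is my own invention (it is not part of nova-agent), and the default log path is simply the one from the example above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with nova-agent: succeeds when the log
# contains the xenstore connection error shown in the traceback above.
needs_xe_guest_utilities() {
    log="${1:-/var/log/nova-agent.log}"
    grep -q "PyXenStoreError: Couldn't open connection to the xenstore" "$log" 2>/dev/null
}

if needs_xe_guest_utilities; then
    echo "xenstore error found: install xe-guest-utilities"
else
    echo "no xenstore error in the nova-agent log"
fi
```

Pass a different path as the first argument if your log lives somewhere else.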

I hope that this is of some assistance. Here is some more background information.

More details about nova-agent and xe-guest-utilities in Xen

Provided that you have definitely enabled nova-agent, and ensured that it is running (after restarting the original server), with ps auxfwww | grep nova-agent

then you should be good to re-image the original server, and then rebuild the second.

The reason your server doesn’t appear to come up in the new build is that, for some reason, the nova-agent service got disabled at boot time. As a result, nova-agent, which is responsible for swapping out the network configuration of your cloud server, wasn’t started when the server was built, and the automatic IP configuration change didn’t occur. This explains the behaviour you’ve been seeing, and the error code I found in the backend seems to confirm that nova-agent wasn’t running.

Provided that you’ve definitely installed nova-agent and confirmed it is running, as well as made sure it starts at boot time, as in the article I wrote, you should be good.

I hope that this explanation and clarification helps.

I can see that you’ve recently posted about an additional issue with xe-linux-distribution (the cause of the PyXenStoreError). This secondary cause can be fixed by ensuring that xe-linux-distribution is installed:

apt-get update;
apt-get install xe-guest-utilities

This should install the Xen guest tools required by nova-agent. The guest tools are what let nova-agent retrieve the networking data, whereas nova-agent itself applies the change, so both need to be installed and running for this to work properly!

I really hope that this is of some assistance. Of course, if you have additional questions, comments or concerns, please don’t hesitate to write back, and we can escalate this issue further for you. These instructions should fix your issues though! I hope this helps.

Cool Little script for downloading stuff


#!/bin/sh
# just use uuid's instead of sequential numbers hehe

for a in `seq 10000000 90000000`;
do
for b in `seq 1 10`;
do
file="$a"_"$b".user
#echo "http://cdn.anonymous.com/$file"
wget "http://cdn.anonymous.com/$file"
#curl "http://cdn.anonymous.com/$file" -o "$file"

filesize=`ls -al "$file" | awk '{print $5}'`
echo "FILESIZE= $filesize"

# A 49-byte file is the fake "HTTP 200" placeholder served instead of a real 404
if [ "$filesize" = "49" ]
then
echo "404: Empty fakefile HTTP 200 detected! The end of this hidden usergroup was detected"
echo "Cleaning up.."
rm "$file"
break;

else
echo "200: Continuing "
fi

sleep 4

done
done

TCPDUMP command packet capture Usage

So, it’s been a little while since my last update. We’ve been quite busy recently, but for those interested in learning more about tcpdump and capturing packets, here are some usage examples.

List Interfaces that can be tcp dumped

tcpdump -D

Listen on Interface eth0

tcpdump -i eth0

Listen to Xenserver domain 16 on public net

tcpdump -i vif16.0 

Listen on any interface

tcpdump -i any

Super duper High verbosity tcpdump

tcpdump -vvvv -i eth0 

Be verbose and print data of each packet in both hex and ASCII

tcpdump -v -X -i eth0

Be less verbose

tcpdump -q 

Limit the capture of packets to 100

tcpdump -c 100 -i eth0 

Display IP addresses and port numbers instead of domain and service names when capturing packets (note: on some systems you need to specify -nn to display port numbers):

tcpdump -n

Capture any packets where the destination host is 192.168.1.1. Display IP addresses and port numbers:

tcpdump -n dst host 192.168.1.1

Capture any packets where the source host is 192.168.1.1. Display IP addresses and port numbers:

tcpdump -n src host 192.168.1.1

http://www.rationallyparanoid.com/articles/tcpdump.html

Finding Stuff quick and dirty way

Hey. So my good friend, who is a support engineer, was asking me how he could find a mail log that wasn’t in the traditional location, and he was scratching his head.
So I put this together. It is really bad, though not in a harmful way; it could just be more elegant. But since he is still learning, this seemed like a good time to introduce him to xargs.

find / | grep mail | grep log | xargs -I {} ls -al {}

Nice and simple though, and pretty much straight to the point, if the grep pipes are forgiven (and I wouldn’t blame you if they were not 🙂).
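
For the unforgiving, the same search can probably be done by find on its own, which drops the grep pipes and copes with spaces in file names. A rough sketch only: the function name find_mail_logs and the optional starting-directory argument are my own additions, and I use ls -ald so that a matching directory is listed itself rather than having its contents dumped.

```shell
#!/bin/sh
# Hypothetical helper: list every path under $1 (default /) whose full path
# contains both "mail" and "log", the same filter as the two greps above.
find_mail_logs() {
    find "${1:-/}" -path '*mail*' -path '*log*' -exec ls -ald {} + 2>/dev/null
}
```

Run it as find_mail_logs to search the whole filesystem, or as find_mail_logs /var to narrow it down.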

Building 50 Cloud Servers BASH/API Automation

So, a good friend of mine, who is a cloud mentor at Rackspace, reached out to me about an easier way of deploying cloud images without having to unroll the image into a CBS volume each time a cloud server is spun up. His customer simply wanted a ‘primary master’ CBS volume, a template of their site if you will: the equivalent of a ‘golden image’, the only difference being that it was a CBS volume. So I set about making this work. It would still take a few hours at least to provision 50 to 200 servers, but it was much faster than the alternatives. Here is how I did it. I actually have some ideas for how to improve this, but I’ve not yet implemented them. That goody is to come in later scripts.

#!/bin/bash

USERNAME='mycloudusername'
APIKEY='mycloudapikey'
ACCOUNT_NUMBER=100101011
API_ENDPOINT="https://lon.blockstorage.api.rackspacecloud.com/v1/$ACCOUNT_NUMBER/volumes"
MASTER_CBS_VOL_ID="d8a67ad1-8037-46bc-8790-efca2cb6e5bd"


TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`


# Populate CBS (raise the upper bound of seq for more clones, e.g. 50)
for i in `seq 1 2`;
do

echo "Generating CBS Clone #$i"
curl -s \
-X POST "$API_ENDPOINT" \
-H "X-Auth-Token: $TOKEN"  \
-H "X-Project-Id: $ACCOUNT_NUMBER" \
-H "Accept: application/json"  \
-H "Content-Type: application/json" -d '{"volume": {"source_volid": "'$MASTER_CBS_VOL_ID'", "size": 50, "display_name": "win-'$i'", "volume_type": "SSD"}}'  | jq -r .volume.id >> cbs.created

done

echo "Giving CBS 2 hour grace time for 50 CBS clone"
#sleep 7200

echo "Listing all CBS Volume ID's created"
cat cbs.created
echo ""


# Populate Nova
count=1
echo "Populating Nova servers with CBS disk"
while read n; do
echo "Build Task $n Started:"
nova --insecure --os-username "$USERNAME" --os-auth-system=rackspace  --os-tenant-name "$ACCOUNT_NUMBER" --os-auth-url https://lon.identity.api.rackspacecloud.com/v2.0/ --os-password "$APIKEY" boot --flavor general1-1 --block-device-mapping vda="$n":::1 Auto-win-"$count"
((count=count+1))

done < cbs.created

# Move cbs.created out of the way so the next run starts with a fresh list
mv -f cbs.created cbs.created.old

Requirements are nova and jq.
https://stedolan.github.io/jq/
https://developer.rackspace.com/blog/getting-started-using-python-novaclient-to-manage-cloud-servers/
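
One improvement I have in mind is to replace the fixed two-hour grace sleep with polling until each clone is actually ready. The following is only a sketch under assumptions: wait_for_status and get_volume_status are hypothetical helpers of mine, and I am assuming that a GET on "$API_ENDPOINT/<volume id>" returns JSON with a .volume.status field that reads "available" once the clone is done. Check that against the API docs before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: run a status command repeatedly until it prints the
# wanted value. Usage: wait_for_status WANT TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_status() {
    local want=$1 tries=$2 delay=$3
    shift 3
    local i=0 status
    while [ "$i" -lt "$tries" ]; do
        status=$("$@")
        [ "$status" = "$want" ] && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Assumed shape of the volume-details call (not taken from the script above);
# TOKEN and API_ENDPOINT are the variables the build script already sets.
get_volume_status() {
    curl -s -H "X-Auth-Token: $TOKEN" "$API_ENDPOINT/$1" | jq -r .volume.status
}
```

In the Nova loop you could then call wait_for_status available 240 30 get_volume_status "$n" before each boot, which gives every volume up to two hours but moves on as soon as it is ready.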

Checking for Network packet Retransmission, troubleshooting network card & switches

So, you might want to test whether the NIC of your box is ‘bad’. One way to do this is to look at the retransmissions.

netstat -s | grep retransmits
   3535665 fast retransmits
   3920918 forward retransmits
   122319 retransmits in slow start
   3652 sack retransmits failed
netstat -s | grep transmit
    10512472 segments retransmited
    733 times recovered from packet loss due to fast retransmit
    Detected reordering 73 times using reno fast retransmit
    TCPLostRetransmit: 196
    400 timeouts after reno fast retransmit
    3535665 fast retransmits
    3920918 forward retransmits
    122319 retransmits in slow start
    13652 sack retransmits failed

This isn’t much use on its own though, because you need to see how many packets came in overall to put those counts in proportion:

netstat -s | grep total
    23799703342 total packets received

It’s possible to get the full details with netstat -s, naturally.
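
To put the two numbers side by side, something like the following works as a rough gauge. It is a sketch, not a proper metric: retrans_pct is a name I made up, and dividing retransmitted TCP segments by total packets received mixes a send-side counter with a receive-side one, so treat the result as a ballpark figure only. The "segments send out" line of netstat -s is arguably the better denominator.

```shell
#!/bin/sh
# Hypothetical helper: retransmitted segments as a percentage of a total.
# $1 = retransmitted count, $2 = total count
retrans_pct() {
    awk -v r="$1" -v t="$2" 'BEGIN {
        if (t > 0) printf "%.4f\n", (r / t) * 100
        else print "n/a"
    }'
}

# Figures from the netstat output above
retrans_pct 10512472 23799703342
```

With the figures above this prints 0.0442, i.e. roughly 0.04% of the packet total.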

Checking Forward and Reverse Connectivity of a Linux Server

A good friend of mine is to thank for this excellent pair of one-liners. One is to be executed on the source, and the other on the destination.

Testing forward route


# Machine Source
root@iup2-web01:/mnt/www# dd if=/dev/zero bs=1024K count=1024 | nc -v 10.181.164.100 23
Connection to 10.181.164.100 23 port [tcp/telnet] succeeded!
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 11.0009 s, 97.6 MB/s
---

# Machine Destination
root@iup2-nfs:~# nc -v -l 23 > /dev/null
Listening on [0.0.0.0] (family 0, port 23)
Connection from [10.181.162.15] port 23 [tcp/telnet] accepted (family 2, sport 54373)

However, when we try this in reverse, we see a major degradation in the network speed. Here is the test with a 20MB transfer instead of 1024MB:

Testing Reverse Route

---

# Machine Source
root@iup2-nfs:~# dd if=/dev/zero bs=1024K count=20 | nc -v 10.181.162.15 23
Connection to 10.181.162.15 23 port [tcp/telnet] succeeded!
20+0 records in
20+0 records out
20971520 bytes (21 MB) copied, 144.327 s, 145 kB/s
---
# Machine Destination
root@iup2-web01:/mnt/www#  nc -v -l 23 > /dev/null 
Listening on [0.0.0.0] (family 0, port 23)
Connection from [10.181.164.100] port 23 [tcp/telnet] accepted (family 2, sport 56072)

As we can see, one of the machines has some difficulty. The issue at hand was that there were some problems with the virtual switch daemon on the hypervisor. Thanks to my friend Gospodin for documenting this one and sharing with me how he tested it.

Deploying your own cloud API using Keystone Openstack

Just a quick one. There are a lot of things that aren’t complete, but this is mostly for my reference and to make writing an Ansible playbook massively easier of course!

For the full guide you will want the link at the bottom of the page.

Outlay

[image: openstack-101-update-25-638]

Operation

[image: SCH_5002_V00_NUAC-Keystone]

Deployment

# EPEL Not Needed for CENTOS 7 on RS Cloud, included for detail
yum install http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm -y

# Install Openstack Liberty repo
yum install centos-release-openstack-liberty -y

# Upgrade dist packages
yum upgrade -y

# Install openstack client
yum install python-openstackclient -y

# Update selinux policies for Openstack
yum install openstack-selinux -y

# Configure SQL
yum install mariadb mariadb-server MySQL-python -y

# Configure and enable mariadb bind and utf8 settings etc
vi /etc/my.cnf.d/mariadb_openstack.cnf

systemctl enable mariadb.service
systemctl start mariadb.service

# Prepare database privileges ____________TODO_______
# mysql_secure_installation _____TODO______


# Prepare mongodb (NoSQL): set controller address, then enable and start

yum install mongodb-server mongodb -y
vi /etc/mongod.conf


systemctl enable mongod.service
systemctl start mongod.service

# Queuing: install, enable and start rabbitmq, add user and set permissions for openstack user
yum install rabbitmq-server -y
systemctl enable rabbitmq-server.service
systemctl start rabbitmq-server.service
rabbitmqctl add_user openstack somepasswordhere
rabbitmqctl set_permissions openstack ".*" ".*" ".*"

# Generate admin_token
openssl rand -hex 15

# Install openstack keystone, httpd and memcached, set to start, enable
yum install openstack-keystone httpd mod_wsgi memcached python-memcached -y
systemctl enable memcached.service
systemctl start memcached.service

# Complete Keystone [Default], [database] connection, [memcache] servers, [token] provider and driver = memcache [revoke] driver = sql [default] verbose = True
vi /etc/keystone/keystone.conf

# Populate the keystone database
su -s /bin/sh -c "keystone-manage db_sync" keystone

# (re)configure httpd
vi /etc/httpd/conf.d/wsgi-keystone.conf
systemctl enable httpd.service
systemctl start httpd.service

# Update environment variable exports for OS_TOKEN=admintoken, OS_URL=http://snetip:35357/v3 OS_IDENTITY_API_VERSION=3 and source it

vi .bash_profile
source .bash_profile

# Create Service entity and API endpoints
openstack service create   --name keystone --description "OpenStack Identity" identity



# API Endpoints
openstack endpoint create --region RegionOne identity public http://10.179.1.188:5000/v2.0
openstack endpoint create --region RegionOne identity internal http://10.179.1.188:5000/v2.0
openstack endpoint create --region RegionOne identity admin http://10.179.1.188:35357/v2.0

# Create project; admin
openstack project create --domain default   --description "Admin Project" admin

# Create admin user for project
openstack user create --domain default   --password-prompt admin


# Create admin's role
openstack role create admin

# Add admin role to admin project & its admin user
openstack role add --project admin --user admin admin

# Create Service Project

openstack project create --domain default   --description "Service Project" service

# Create demo project
openstack project create --domain default   --description "Demo Project" demo

# Create the demo user
openstack user create --domain default   --password-prompt demo

# and user role for demo user
openstack role create user

# Add the user role to the demo project and user
openstack role add --project demo --user demo user

# SKIPPED remove keystone-dist-paste.ini

# Unset the OS_TOKEN and OS_URL environment variables
unset OS_TOKEN OS_URL

# Request token (shown here for the demo user; swap in the admin project and user to test admin)
openstack --os-auth-url http://10.179.1.188:35357/v3  --os-project-domain-id default --os-user-domain-id default   --os-project-name demo --os-username demo --os-auth-type password   token issue

# Verify operation (TODO: add check verification status function)
touch demo-openrc.sh
touch admin-openrc.sh
cat /etc/keystone/keystone.conf | grep admin_token

# Test admin api credentials
source admin-openrc.sh
openstack token issue

# Test demo api credentials
source demo-openrc.sh
openstack token issue
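
Note that the two openrc files created with touch above start out empty, so sourcing them does nothing until they are filled in. Here is a sketch of what they could contain, based on the variable names the Liberty install guide uses. The passwords ADMIN_PASS and DEMO_PASS are placeholders, write_openrc is a helper name of my own, and I am assuming the same 10.179.1.188 address as the endpoints above.

```shell
#!/bin/sh
# Hypothetical helper: print an openrc file for one user.
# $1 = project and user name, $2 = password, $3 = auth URL
write_openrc() {
    cat <<EOF
export OS_PROJECT_DOMAIN_ID=default
export OS_USER_DOMAIN_ID=default
export OS_PROJECT_NAME=$1
export OS_TENANT_NAME=$1
export OS_USERNAME=$1
export OS_PASSWORD=$2
export OS_AUTH_URL=$3
export OS_IDENTITY_API_VERSION=3
EOF
}

# Placeholder passwords: replace with the ones set at the password prompts
write_openrc admin ADMIN_PASS http://10.179.1.188:35357/v3 > admin-openrc.sh
write_openrc demo DEMO_PASS http://10.179.1.188:5000/v3 > demo-openrc.sh
```

After that, source admin-openrc.sh followed by openstack token issue should hand back a token without any extra command-line options.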
http://docs.openstack.org/liberty/install-guide-rdo/environment.html