The most common cause of this is PermitRootLogin being set to no in sshd_config, although there can be other causes, such as a badly broken sshd_config rather than just one wrong setting. The procedure for investigating is much the same whatever the breakage.
Here’s the full procedure:
1) Put the server into rescue mode.
2) Log in to the cloud server over SSH. Note that rescue mode gives you a new temporary root password, which lets you reset the root password on the ‘original disk’.
3) Once logged in, mount the original disk and chroot into it (change root to the ‘original disk’). The device is usually /dev/xvdb1, but may be /dev/xvdb2.
# Mount old disk
mount /dev/xvdb1 /mnt
# Change to the ‘old disk’
chroot /mnt
# Set the new password for root on the old disk:
passwd
# enter the new password when prompted
While still inside the chroot, make sure that /etc/ssh/sshd_config has this line:
PermitRootLogin no
changed to:
PermitRootLogin yes
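If you would rather not edit the file by hand, the change can be sketched as a small shell helper. This is only a minimal sketch: the function name enable_root_login is made up for illustration, it assumes GNU sed, and it is demonstrated on a scratch file rather than the real config.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): make sure the sshd_config
# given as $1 contains an active "PermitRootLogin yes" line.
enable_root_login() {
    config="$1"
    if grep -Eq '^[[:space:]]*PermitRootLogin[[:space:]]' "$config"; then
        # An active directive exists: rewrite it in place (GNU sed syntax).
        sed -i -E 's/^[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$config"
    else
        # The directive is missing or only commented out: append an active one.
        echo 'PermitRootLogin yes' >> "$config"
    fi
}

# Demonstration on a scratch file, so nothing real is modified.
demo_config=$(mktemp)
printf 'Port 22\nPermitRootLogin no\n' > "$demo_config"
enable_root_login "$demo_config"
grep PermitRootLogin "$demo_config"
```

Inside the chroot you would point it at /etc/ssh/sshd_config; from the rescue system without the chroot, the same file lives at /mnt/etc/ssh/sshd_config.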
Your developer or sysadmin won’t be able to log in until you reset the root password here. If you do not know of another user account from which you can su to root, this step is critical, because otherwise you won’t be able to access the server at all.
Once you have allowed root login and changed the password to something you know, you can exit rescue mode through the control panel and log in to the machine as normal.
For more detail about how to do this (although pretty much all the steps are here), please see:
Thanks to my colleague Marcin for this guestmount tools protip.
I wrote a previous guide which explains how to download/export a Cloud Server image VHD from Rackspace Cloud. If an image is failing to build, this might allow you to perform data recovery even though the image can’t be booted. Someone is going to run into this sooner or later, and this article should at least give you a best shot at reading the VHD and recovering its contents. As you might know already, just because the boot loader or kernel is broken doesn’t mean that the data isn’t there!
Once the image has been downloaded to your new cloud instance, you can use the ‘libguestfs-tools’ package (same name on Ubuntu and CentOS), which contains the tools necessary for mounting .vhd image files.
The command would be (read-only mode):
guestmount -a {image-name}.vhd -i --ro {mount-point}
So, a customer is experiencing slowness/sluggishness in their app. Instinct tells you there is no issue with the hypervisor, but instinct isn’t enough. Tools like xentop, sar and bwm-ng are critical parts of live and historical troubleshooting.
Sar can tell you a story, if you ask the storyteller the right questions or, even better, pick up the book and read it properly. By paying attention to this data, and knowing which things to check under which circumstances, you’ll understand the plot, the scenario and the situation, and exactly how to proceed with troubleshooting.
This article doesn’t go into that in depth, but it gives you a good reference for a variety of checks, the most important being CPU usage, I/O usage, network usage and load averages.
CPU Usage of all processors
# Grab details live
sar -u 1 3
# Use historical binary sar file
# sa10 means '10th day' of current month.
sar -u -f /var/log/sa/sa10
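Since the file name is just ‘sa’ plus the two-digit day of the month, picking the right file can be scripted. Here is a rough sketch: the function name sa_file_for_day is made up, /var/log/sa is simply the directory used above (other distributions may keep the files elsewhere), and the date call assumes GNU date.

```shell
#!/bin/bash
# Hypothetical helper: print the sar data file path for a given day of the month.
sa_file_for_day() {
    day=${1#0}    # drop one leading zero so printf does not read "08" as octal
    printf '/var/log/sa/sa%02d\n' "$day"
}

yesterday=$(date -d yesterday +%d)    # GNU date syntax
sa_file=$(sa_file_for_day "$yesterday")
echo "Yesterday's data file: $sa_file"

# Only call sar when it is installed and the file is readable.
if command -v sar >/dev/null 2>&1 && [ -r "$sa_file" ]; then
    sar -u -f "$sa_file"
fi
```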
CPU Usage of a particular Processor
sar -P ALL 1 1
‘-P 1’ means check only the 2nd Core. (Core numbers start from 0).
sar -P 1 1 5
The above command displays real time CPU usage for core number 1, every 1 second for 5 times.
Observing Changes in Memory over time
sar -r 1 3
The above command provides memory stats every 1 second for a total of 3 times.
Observing Swap usage over time
sar -S 1 5
The above command reports swap statistics every 1 second, a total of 5 times.
Overall I/O activity
sar -b 1 3
The above command checks every 1 second, 3 times.
Individual Block Device I/O Activities
This is a useful check for LUNs, block devices and other specific mounts
sar -d 1 1
sar -p -d 1 1
DEV – indicates block device, i.e. sda, sda1, sdb1 etc.
Total number of processes created per second / Context switches
sar -w 1 3
Run Queue and Load Average
sar -q 1 3
This reports the run queue size and the load average of the last 1 minute, 5 minutes and 15 minutes. “1 3” reports every 1 second, a total of 3 times.
Report Network Statistics
sar -n KEYWORD
Keywords available:
DEV – Displays network devices vital statistics for eth0, eth1, etc.,
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.
sar -n DEV 1 1
Specify Start Time
sar -q -f /var/log/sa/sa11 -s 11:00:00
sar -q -f /var/log/sa/sa11 -s 11:00:00 | head -n 10
So, as you may already be aware, I am working on a lightweight backup script called ‘obscene redundancy’: redundant backup software capable of keeping 18 replicas of data in the Rackspace Cloud Files API service. It’s so redundant… it’s obscene redundancy.
Today I was discussing with my colleague that it is all very well uploading your tar to Cloud Files, but wouldn’t you really like to know whether the file you uploaded is bit-for-bit identical to the original? Enter the Cloud Files ‘HEAD’ request and the Etag, our MD5 friend.
What I did to improve the obscene redundancy script was quite simple here:
# We define a variable that takes the 'Etag' (MD5Sum) value for the cloud files archive
cfmd5sum=$(swiftly --conf swiftly-configs/swiftly-${SHORT_REGION,,}.conf head \
"${BACKUP_DEST}/${FILE}" | grep -i Etag | awk '{print $2}')
# We define a variable that generates an 'MD5Sum' for the local file archive
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE")
echo "Checking Data integrity of Cloud Files upload to $REGION"
echo "Cloud Files Archive MD5: $cfmd5sum ....... Local File Archive MD5: $localmd5sum"
# If these values differ, the upload cannot be trusted
if [[ "$cfmd5sum" != "$localmd5sum" ]];
then
echo "VALUES NOT EQUAL"
echo "$REGION CRC missing, in error, or NOT OK..."
else
echo "VALUES EQUAL"
echo "$REGION CRC OK..."
fi
After all this I found that the script wasn’t working properly, so I did some debugging, starting with the length of each variable.
if [[ "$cfmd5sum" == "$localmd5sum" ]]; then
echo "VALUES EQUAL, (local md5sum length given first)"
echo "$localmd5sum"| wc -L
echo "$cfmd5sum"| wc -L
echo "$REGION CRC OK..."
else
echo "VALUES NOT EQUAL"
echo "$localmd5sum"|wc -L
echo "$cfmd5sum"|wc -L
echo "$REGION CRC missing, in error, or NOT OK..."
fi
The output showed me that the variable lengths were different. At this stage I’ve no idea why, but I will add updates here. I’m going to commit this to obsceneredundancy because the proof of concept is working and valid, as shown by the output of the script (i.e. the method is fine, it’s just the way the strings are compared in the if statement; I suspect it is to do with special characters or \n characters, as I had before). So, with this addition made to the multi-dc-backup.sh script, the output now looks like:
Creating Container in LON for obsceneredundancy
LON: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://LON/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_LON
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_LON CRC missing, in error, or NOT OK...
lon: COMPLETED OK 15504796/15504796
ORD: Not backing up ...
Creating Container in IAD for obsceneredundancy
IAD: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://IAD/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_IAD
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_IAD CRC missing, in error, or NOT OK...
iad: COMPLETED OK 15504796/15504796
DFW: Not backing up ...
As we can see the 107 (localmd5size) and the 32 (cloudfilesmd5size) are different! I’ve no idea why, since when echoing the variables they look the same. I suspect gremlins and Trolls. A fresh head tomorrow will probably solve this in a few minutes!
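Looking at the output again, the local value that gets printed has the archive path after the hash, because md5sum prints the hash followed by two spaces and the file name. The numbers add up too: 32 characters of hash, 2 spaces and a 73 character path make 107. Below is a minimal sketch of a comparison that keeps only the hash, assuming that is the only difference. The function name md5_matches_etag is made up for illustration, and a scratch file stands in for the real archive.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the MD5 of local file $1 equals Etag $2.
md5_matches_etag() {
    # md5sum prints "<hash>  <filename>"; keep only the first field.
    local_hash=$(md5sum "$1" | awk '{print $1}')
    # Strip quotes, carriage returns and whitespace a header value may carry.
    remote_hash=$(printf '%s' "$2" | tr -d '"\r \n')
    [ "$local_hash" = "$remote_hash" ]
}

# Demonstration with a scratch file standing in for the backup archive.
demo_file=$(mktemp)
echo "example archive contents" > "$demo_file"

raw=$(md5sum "$demo_file")
hash_only=$(md5sum "$demo_file" | awk '{print $1}')
echo "raw md5sum output length: ${#raw}"
echo "hash-only length: ${#hash_only}"

if md5_matches_etag "$demo_file" "$hash_only"; then
    echo "VALUES EQUAL"
else
    echo "VALUES NOT EQUAL"
fi
```

In the backup script the equivalent change would be to pipe the local md5sum through awk '{print $1}' before comparing it with the Etag.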
Hey folks. So, recently I have been doing a bit of work on the Rackspace community, specifically trying to document, and make as easy as possible, the importing and exporting of cloud server VHDs between Rackspace regions. This might be really useful if you are designing an HA, multi-region and/or load balancing solution that utilizes autoscale or other kinds of redundancy. Moving your ‘golden image’ between regions can be quite difficult if you do the entire process manually, step by step, as I have documented in the two articles below:
Exporting Cloud server images from a Rackspace Region
https://community.rackspace.com/products/f/25/t/7089
Importing Cloud Server Images to a Rackspace Region
https://community.rackspace.com/products/f/25/t/7186
In this article I finish off the ‘automation demo’ of how to move images without changing much at all, apart from one ‘serverID’ variable and the source and destination. The script isn’t polished yet; however, the last time I posted this on my blog I was so excited that I actually forgot to include the import function (which is kind of important!). Sorry about that.
echo "Exporting VHD to Cloud Files"
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
echo "IMAGEID detected as $IMAGEID"
# This section requests the Glance API to copy the cloud server image uuid to a cloud files container called export
# > export-cloudfiles
echo "THE IMAGE ID IS: $IMAGEID"
IMAGEID=${IMAGEID%$'\r'}
curl -v "https://lon.images.api.rackspacecloud.com/v2/$TENANT/tasks" -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" -d '{"type": "export", "input": {"image_uuid": "'$IMAGEID'" , "receiving_swift_container": "export"}}' -o export-cloudfiles
echo "Export looks like"
echo "Waiting for Task to complete..."
## WAIT FOR TASKID EXPORT TO COMPLETE TO CLOUD FILES
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
# This section asks the Glance API for the current status of the export task
curl "https://lon.images.api.rackspacecloud.com/v2/$TENANT/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
while [ "$EXPORT_STATUS" = "processing" ]; do
sleep 15
curl "https://lon.images.api.rackspacecloud.com/v2/$TENANT/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
done
# SET CORRECT CLOUD FILES NAME
CLOUD_FILES_NAME=$(cat export-cloudfiles | python -mjson.tool | grep image_uuid | awk '{print $2}' | sed 's/,//g' | sed 's/"//g')
## Download VHD Cloud from Cloud Files to this server
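The excerpt above uses $TASKID_EXPORT without showing where it is set. One way it could be pulled out of the saved task response is sketched below, in the same grep/awk style as the rest of the script. This is an assumption rather than the original code: the helper name json_string_field is made up, and the sample response with its "id" and "status" fields is only illustrative of what a pretty-printed task response might look like.

```shell
#!/bin/bash
# Hypothetical helper: print the first string value stored under key $1 in the
# pretty-printed JSON file $2 (one "key": "value" pair per line).
json_string_field() {
    grep "\"$1\":" "$2" | head -n 1 | awk -F'"' '{print $4}'
}

# Sample task response; the field names here are assumptions for illustration.
sample_response=$(mktemp)
cat > "$sample_response" <<'EOF'
{
    "id": "00000000-0000-0000-0000-000000000000",
    "status": "processing",
    "type": "export"
}
EOF

TASKID_EXPORT=$(json_string_field id "$sample_response")
EXPORT_STATUS=$(json_string_field status "$sample_response")
echo "task: $TASKID_EXPORT status: $EXPORT_STATUS"
```

In the real script the input would be the export-cloudfiles file after it has been passed through python -mjson.tool, since the helper relies on one key per line.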
As you can probably see, my code is still rather rough, but it’s just so darn exciting that this script works nicely from start to finish that I just HAD to share it a bit earlier! The plan now is to add command-line arguments so that you can specify ./moveregion {SOURCE_REGION} {DEST_REGION} {SERVER_ID} {TENANT_ID}. Then a customer or a Racker would only need these 4 variables to export and import images in an automated way.
I can rewrite the script in such a way that it would accept a .txt file of a couple of hundred cloud server UUIDs. It would use each UUID to create an image of that server, export it to Cloud Files, import it to Cloud Files, and then import it into the Glance image store of the second (destination) region. That would naturally save hundreds of hours of human time doing this manually, which is… nice 😀
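The batch idea could start from a loop like the one below. It is only a sketch of the file-reading part: move_image is a made-up stand-in that just prints what it would do, the UUIDs in the demo list are placeholders, and the real export and import calls are left out entirely.

```shell
#!/bin/bash
# Hypothetical stand-in for the export/import steps; it only reports what it
# would do.
move_image() {
    echo "would move server $1 from $2 to $3"
}

# Read one server UUID per line, skipping blank lines and stripping stray
# carriage returns (the same problem the IMAGEID line in the script guards
# against).
move_all() {
    list="$1"; source_region="$2"; dest_region="$3"
    tr -d '\r' < "$list" | while IFS= read -r uuid; do
        [ -n "$uuid" ] || continue
        move_image "$uuid" "$source_region" "$dest_region"
    done
}

# Demonstration with a scratch list containing a blank line and a CRLF ending.
demo_list=$(mktemp)
printf 'aaaa-1111\r\n\nbbbb-2222\n' > "$demo_list"
move_all "$demo_list" lon iad
```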
I would really like to make a UI frontend, using something like Django, and utilize some form of ‘light’ database that keeps track of all the API imports/exports and even provides an estimated time for completion, but my UI skills are really limited to XHTML, CSS, PHP and MySQL. I need a Python or Django guy to help out with some of this. If anyone is interested, please reach out to me.
So, you may have noticed that over the past weeks and months I have been a little bit quieter about the articles I have been writing. That is mainly because I’ve been working on a new GitHub project which, although simple and lightweight, is actually rather outrageously powerful.
https://github.com/aziouk/obsceneredundancy
Imagine being able to take 15+ redundant replica copies of your files, across 5 or 6 different datacentres. Rackspace Cloud Files API powered, but also with a lot of the flexibility of Bourne Again Shell (BASH).
This was actually quite a neat achievement and I am pleased with the results. There are still some limitations of this redundant replica application, and there are a few bugs, but it is a great proof of concept which shows what you can do with the API both quickly and cheaply (ish). Using filesystems as a service will be the future with some further innovation on the world wide network infrastructure, and it would only take a small breakthrough to rapidly alter the way that OS and machines boot/backup.
If you want to see the project and read the source code before I lay out and describe/explain the entire process of writing this software as well as how to deploy it with cron on linux, then you need wait no longer. Revision 1 alpha is now tested, ready and working in 5 different datacentres.
You can also toggle which datacentres you wish to utilize, so it is slightly flexible. The only important consideration here is to understand that there are some limitations, such as a lack of de-duping, and that this uses tars and swiftly instead of directly querying the API. Since directly uploading a tar file through the API is relatively simple, I will probably implement it like that, as I have before, and get rid of swiftly in future iterations. However, such a project is really ideal for learning more about BASH, CRON, the API and programmatic automation of sequential filesystems, utilizing functional programming and division of labour between workers.
https://github.com/aziouk/obsceneredundancy
Test it (please note it will be a little bit buggy on different environments and there are no instructions yet)
So, you have some really important data, so much so that 99.99% redundancy is not enough for you. One solution to this is to keep multiple copies in multiple datacentres. Most enterprise backup setups will have an on-site, an off-site and an archival copy. What I’m going to show here is how to make 4 different copies of your data, in 4 different datacentres around the world. This will provide very high storage redundancy and greatly reduce the likelihood of data loss.
Although it costs a bit more, this kind of solution may be suitable for many small, medium and large businesses, depending, naturally, on the size of the data and the importance of redundancy. You might not have many files to back up, perhaps a small CD’s worth, in which case it will be very inexpensive. However, due to the way that Cloud Files is billed, copying data costs money in bandwidth when writing from a server in London to Cloud Files in Sydney, Chicago or Dallas, for instance. It is therefore very important to consider the impact of bandwidth costs when utilizing an additional 3 Cloud Files endpoints that are not in the local datacentre region, which is essentially what we are doing in this guide.
Create your swiftly environments (setting the name for each file)
==> /root/.swiftly-dfw.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = dfw
==> /root/.swiftly-iad.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = iad
==> /root/.swiftly-ord.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = ord
==> /root/.swiftly-syd.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = syd
Create your Script
#!/bin/bash
# Adam Bull, Rackspace UK
# May 17, 2016
# This can be sequential or it can be parallel (not sure which is better yet); use & for parallel
# This backs up /documents file and puts it in the 'managed_backup' cloud files container at the following 4 datacentres ,DFW, IAD, ORD and SYD
swiftly --verbose --conf ~/.swiftly-dfw.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-iad.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-ord.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-syd.conf --concurrency 100 put -i /documents /managed_backup
Because the other 3 endpoints are in different datacentres, we can't use servicenet, so we defined --no-snet option for swiftly as above.
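The four nearly identical lines could also be generated in a loop, adding --no-snet only for the remote regions. The following is a dry-run sketch that prints the commands instead of running them. The function name build_backup_cmd is made up, and it assumes the server lives in DFW, since that is the one command above without --no-snet.

```shell
#!/bin/bash
# Hypothetical helper: print the swiftly command for one region. $1 is the
# target region, $2 the region the server itself lives in.
build_backup_cmd() {
    region="$1"; local_region="$2"
    snet_flag="--no-snet"
    if [ "$region" = "$local_region" ]; then
        snet_flag=""    # ServiceNet is only usable inside the local region
    fi
    # $snet_flag is deliberately unquoted so that an empty value disappears.
    echo swiftly --verbose $snet_flag --conf "$HOME/.swiftly-$region.conf" \
        --concurrency 100 put -i /documents /managed_backup
}

# Dry run: print the four commands instead of executing them.
for region in dfw iad ord syd; do
    build_backup_cmd "$region" dfw
done
```

Once the printed commands look right, the echo could be dropped so that each command actually runs.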
Execute your script
chmod +x multibackup.sh
./multibackup.sh
This is obviously a basic script for taking backups, and it is not for production use (yet). This is an alpha project I started today. The cool thing is that it works, and quite nicely, although it is far from finished as a workable script.
Once the script is made, you can simply add it with crontab -e as you usually would. Make sure the user you execute with cron has access to the .conf files in their home directory!
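A quick way to check that last point before trusting cron with the job is a pre-flight test along these lines. It is a sketch only: the function name unreadable_confs is made up, and the demonstration runs against a scratch directory rather than a real home directory.

```shell
#!/bin/bash
# Hypothetical pre-flight check: print every swiftly config file under
# directory $1 that the current user cannot read. The remaining arguments
# are region names.
unreadable_confs() {
    conf_dir="$1"; shift
    for region in "$@"; do
        conf="$conf_dir/.swiftly-$region.conf"
        [ -r "$conf" ] || echo "$conf"
    done
}

# Demonstration in a scratch directory with only two of the four files present.
demo_home=$(mktemp -d)
touch "$demo_home/.swiftly-dfw.conf" "$demo_home/.swiftly-iad.conf"
unreadable_confs "$demo_home" dfw iad ord syd
```

Run as the cron user against "$HOME", an empty result means all four config files are readable.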
So, you have lost your Windows Administrator password for your Rackspace cloud server? No problem. Simply put the Windows VM into rescue mode using a Linux image (yup!).
I’d like to thank my friend Cory for providing the link details for how to do this.
Put Windows VM into Rescue mode using Linux image
# Initiate rescue using the CentOS 7 image for the server uuid 0b67faf7-bc56-4844-ad0b-16e39f289ef6
$ nova rescue --password mypasswordforrescuemodehere --image 7fade26a-0cca-415f-a988-49c021768fca 0b67faf7-bc56-4844-ad0b-16e39f289ef6
If you’ve broken your Rackspace server and you don’t know how to perform the above step, send a ticket to Rackspace support and they should be able to put your server into rescue mode so you can reset the password of your Windows machine!
SSH to rescue server
ssh root@myserveriphere
Check which disk is Windows NTFS
# fdisk -l
Disk /dev/xvdc: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0003e9b3
Device Boot Start End Blocks Id System
/dev/xvdc1 2048 4194303 2096128 83 Linux
Disk /dev/xvdb: 85.9 GB, 85899345920 bytes, 167772160 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xfcb073fc
Device Boot Start End Blocks Id System
/dev/xvdb1 * 2048 167770111 83884032 7 HPFS/NTFS/exFAT
Disk /dev/xvda: 85.9 GB, 85899345920 bytes, 167772160 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00070dc0
Here we can see that the partition we want is /dev/xvdb1, since this is the HPFS/NTFS/exFAT partition format used by Windows. Rescue mode builds a new server and disk, attaching your old disk as the ‘b’ disk, xvdb. Let’s mount the disk and install the application we need to wipe the password for the box.
Mount the disk
yum update -y
yum install ntfs-3g -y
mount /dev/xvdb1 /mnt
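The session below starts from a prompt that is already inside the Windows config directory, which is where the SAM hive lives (the hive header reports \SystemRoot\System32\Config\SAM). A rough sketch for locating that directory under the mount point follows. The function name find_sam_hive is made up, -ipath assumes GNU find, and the demonstration uses a scratch directory instead of the real /mnt.

```shell
#!/bin/bash
# Hypothetical helper: print the path of the SAM hive below mount point $1.
# The match is case-insensitive because the directory case can differ between
# Windows installs (-ipath is a GNU find option).
find_sam_hive() {
    find "$1" -maxdepth 4 -type f -ipath '*/windows/system32/config/sam' | head -n 1
}

# Demonstration against a scratch directory laid out like a Windows volume.
demo_mount=$(mktemp -d)
mkdir -p "$demo_mount/Windows/System32/config"
touch "$demo_mount/Windows/System32/config/SAM"
find_sam_hive "$demo_mount"
```

On the rescue server you would run find_sam_hive /mnt and cd into the directory it reports before starting chntpw. chntpw itself also has to be installed first; which repository provides it on the rescue image is not covered here.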
[root@RESCUE-test config]# chntpw -u "Administrator" SAM
chntpw version 0.99.6 110511 , (c) Petter N Hagen
Hive name (from header): <\SystemRoot\System32\Config\SAM>
ROOT KEY at offset: 0x001020 * Subkey indexing type is: 666c
File size 262144 [40000] bytes, containing 6 pages (+ 1 headerpage)
Used for data: 255/20712 blocks/bytes, unused: 13/3672 blocks/bytes.
* SAM policy limits:
Failed logins before lockout is: 0
Minimum password length : 0
Password history count : 0
| RID -|---------- Username ------------| Admin? |- Lock? --|
| 01f4 | Administrator | ADMIN | |
| 01f5 | Guest | | dis/lock |
---------------------> SYSKEY CHECK <-----------------------
SYSTEM SecureBoot : -1 -> Not Set (not installed, good!)
SAM Account\F : 0 -> off
SECURITY PolSecretEncryptionKey: -1 -> Not Set (OK if this is NT4)
Syskey not installed!
RID : 0500 [01f4]
Username: Administrator
fullname:
comment : Built-in account for administering the computer/domain
homedir :
User is member of 1 groups:
00000220 = Administrators (which has 1 members)
Account bits: 0x0010 =
[ ] Disabled | [ ] Homedir req. | [ ] Passwd not req. |
[ ] Temp. duplicate | [X] Normal account | [ ] NMS account |
[ ] Domain trust ac | [ ] Wks trust act. | [ ] Srv trust act |
[ ] Pwd don't expir | [ ] Auto lockout | [ ] (unknown 0x08) |
[ ] (unknown 0x10) | [ ] (unknown 0x20) | [ ] (unknown 0x40) |
Failed login count: 0, while max tries is: 0
Total login count: 15
- - - - User Edit Menu:
1 - Clear (blank) user password
2 - Edit (set new) user password (careful with this on XP or Vista)
3 - Promote user (make user an administrator)
(4 - Unlock and enable user account) [seems unlocked already]
q - Quit editing user, back to user select
Select: [q] > 1
Password cleared!
Hives that have changed:
# Name
0
Write hive files? (y/n) [n] : y
0 - OK
It’s been done, yay!
Unrescue the cloud server, either from control panel or using nova
abull-mb:~ adam$ supernova me unrescue 0b67faf7-bc56-4844-ad0b-16e39f289ef6
Yay! We now automatically bypass the ordinary login screen so we can get into the server to reconfigure it properly again.
You might have some questions about… setting up nova.
Setting up Nova
# Nova configuration
#export OS_AUTH_URL=https://lon.identity.api.rackspacecloud.com/v2.0/
#export OS_AUTH_SYSTEM=rackspace_uk
#export OS_REGION_NAME=LON
#export OS_USERNAME=mycloudusernamehere
# Tenant Name is customer number shown in url of mycloud control panel
##export OS_TENANT_NAME=10101010
#export NOVA_RAX_AUTH=1
#export OS_PASSWORD=mycloudapikeyhere
# Project ID is customer number shown in url of mycloud control panel
#export OS_PROJECT_ID=100101010
#export OS_NO_CACHE=1
These ‘environment variables’ should be put in a file like your .bash_profile. Then you will want to source it before using nova:
source .bash_profile
or
. .bash_profile
This just sets the variables on the commandline so they can be used by nova. It is possible to provide all of the credentials on the nova commandline as described in previous articles on this blog concerning nova.
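If nova complains about credentials, it can help to confirm that the variables really made it into the environment. The check below is a small sketch: the function name missing_nova_vars is made up, and the four names it checks are just a subset picked from the list above for illustration, on the assumption that the export lines have been uncommented.

```shell
#!/bin/bash
# Hypothetical check: print the names of the listed variables that are not
# exported in the current environment.
missing_nova_vars() {
    for name in OS_AUTH_URL OS_REGION_NAME OS_USERNAME OS_PASSWORD; do
        if [ -z "$(printenv "$name")" ]; then
            echo "$name"
        fi
    done
}

missing=$(missing_nova_vars)
if [ -n "$missing" ]; then
    echo "Not set:" $missing
else
    echo "All checked variables are set"
fi
```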
Using nova without .bash_profile or environment variables
For more details about how to install the Python-based nova client used in this article, please see:
https://support.rackspace.com/how-to/installing-python-novaclient-on-linux-and-mac-os/