Automating Backups in Public Cloud using Cloud Files

Hey folks, I know it’s been a little while since I put an article together. However I have been putting together a really article explaining how to write bespoke backup systems for the Rackspace Community. It’s a proof of concept/demonstration/tutorial as opposed to a production application. However people looking to create custom cloud backup scripts may benefit from the experience of reading thru it.

You can see the article at the below URL:

https://community.rackspace.com/products/f/25/t/7857

Checking File integrity with Cloud Files, post upload file

So, as you may already be aware, I am working on a lightweight backup script called obscene redundancy’. An redundant backup software capable of 18 replicas of data to Rackspace Cloud Files API service. It’s so redundant… it’s obscene redundancy.

For more details visit the project URL:
https://github.com/aziouk/obsceneredundancy/

Today, I was discussing with my colleague, that it was all very well uploading your tar to cloud files, but, wouldn’t you really like to know if the file you uploaded is completely identical number of bits, and order? Enter, Cloud Files ‘HEAD’and Etag. Our MD5 friend.

What I did to improve the obscene redundancy script was quite simple here:

# We define a variable that takes the 'Etag' (MD5Sum) value for the cloud files archive
cfmd5sum=$(swiftly --conf swiftly-configs/swiftly-${SHORT_REGION,,}.conf head
"${BACKUP_DEST}/${FILE}" | grep -i Etag | awk '{print $2}')

# We Define a variable that generates an 'MD5Sum' for the local file archive
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE")

echo "Checking Data integrity of Cloud Files upload to $REGION"
echo "Cloud Files Archive MD5:  $cfmd5sum  ....... Local File Archive MD5: $localmd5sum"

# If these values
if [[ "$cfmd5sum" -ne "$localmd5sum" ]];
then
echo "VALUES NOT EQUAL"
echo "$REGION CRC OK..."
else
echo "VALEUS EQUAL
echo "$REGION CRC missing, in error, or NOT OK..."
fi

After all this I found that the script wasn’t working properly… so I did some debugging about this to check, at least, first of all , the length of each variable.

   if [[ "$cfmd5sum" == "$localmd5sum" ]]; then
                        echo "VALUES EQUAL, (local md5sum length given first)"
                        echo "$localmd5sum"| wc -L
                        echo "$cfmd5sum"| wc -L


                        echo "$REGION CRC OK..."
                else
                        echo "VALUES NOT EQUAL"
                        echo "$localmd5sum"|wc -L
                        echo "$cfmd5sum"|wc -L
                        echo "$REGION CRC missing, in error, or NOT OK..."
                fi

The output shown me that the variable length was different. At this stage I’ve no idea why, but will add updates here. I’m going to commit this to obsceneredundancy because proof of concept is working and valid, as shown by the output of the script. (i.e. the method is fine, it’s just the way the string is compared in the if, statement, I suspect it is to do with special character or \n characters as I had before. So, when I made this addition to the multi-dc-backup.sh script.. the output now looks like:

Creating Container in LON for obsceneredundancy

LON: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://LON/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz

Checking Data integrity of Cloud Files upload to BACKUP_TO_LON
Cloud Files Archive MD5:  65147eb66f8bbeff03a229570b0a1be7  ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7  /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_LON CRC missing, in error, or NOT OK...
lon: COMPLETED OK 15504796/15504796
ORD: Not backing up ...



Creating Container in IAD for obsceneredundancy

IAD: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://IAD/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz

Checking Data integrity of Cloud Files upload to BACKUP_TO_IAD
Cloud Files Archive MD5:  65147eb66f8bbeff03a229570b0a1be7  ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7  /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_IAD CRC missing, in error, or NOT OK...
iad: COMPLETED OK 15504796/15504796
DFW: Not backing up ...

As we can see the 107 (localmd5size) and the 32 (cloudfilesmd5size) are different! I’ve no idea why, since when echoing the variables they look the same. I suspect gremlins and Trolls. A fresh head tomorrow will probably solve this in a few minutes!

Cheers &
Best wishes,
Adam