Checking File integrity with Cloud Files, post upload file

So, as you may already be aware, I am working on a lightweight backup script called obscene redundancy’. An redundant backup software capable of 18 replicas of data to Rackspace Cloud Files API service. It’s so redundant… it’s obscene redundancy.

For more details visit the project URL:
https://github.com/aziouk/obsceneredundancy/

Today, I was discussing with my colleague, that it was all very well uploading your tar to cloud files, but, wouldn’t you really like to know if the file you uploaded is completely identical number of bits, and order? Enter, Cloud Files ‘HEAD’and Etag. Our MD5 friend.

What I did to improve the obscene redundancy script was quite simple here:

# We define a variable that takes the 'Etag' (MD5Sum) value for the cloud files archive
cfmd5sum=$(swiftly --conf swiftly-configs/swiftly-${SHORT_REGION,,}.conf head
"${BACKUP_DEST}/${FILE}" | grep -i Etag | awk '{print $2}')

# We Define a variable that generates an 'MD5Sum' for the local file archive
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE")

echo "Checking Data integrity of Cloud Files upload to $REGION"
echo "Cloud Files Archive MD5:  $cfmd5sum  ....... Local File Archive MD5: $localmd5sum"

# If these values
if [[ "$cfmd5sum" -ne "$localmd5sum" ]];
then
echo "VALUES NOT EQUAL"
echo "$REGION CRC OK..."
else
echo "VALEUS EQUAL
echo "$REGION CRC missing, in error, or NOT OK..."
fi

After all this I found that the script wasn’t working properly… so I did some debugging about this to check, at least, first of all , the length of each variable.

   if [[ "$cfmd5sum" == "$localmd5sum" ]]; then
                        echo "VALUES EQUAL, (local md5sum length given first)"
                        echo "$localmd5sum"| wc -L
                        echo "$cfmd5sum"| wc -L


                        echo "$REGION CRC OK..."
                else
                        echo "VALUES NOT EQUAL"
                        echo "$localmd5sum"|wc -L
                        echo "$cfmd5sum"|wc -L
                        echo "$REGION CRC missing, in error, or NOT OK..."
                fi

The output shown me that the variable length was different. At this stage I’ve no idea why, but will add updates here. I’m going to commit this to obsceneredundancy because proof of concept is working and valid, as shown by the output of the script. (i.e. the method is fine, it’s just the way the string is compared in the if, statement, I suspect it is to do with special character or \n characters as I had before. So, when I made this addition to the multi-dc-backup.sh script.. the output now looks like:

Creating Container in LON for obsceneredundancy

LON: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://LON/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz

Checking Data integrity of Cloud Files upload to BACKUP_TO_LON
Cloud Files Archive MD5:  65147eb66f8bbeff03a229570b0a1be7  ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7  /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_LON CRC missing, in error, or NOT OK...
lon: COMPLETED OK 15504796/15504796
ORD: Not backing up ...



Creating Container in IAD for obsceneredundancy

IAD: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://IAD/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz

Checking Data integrity of Cloud Files upload to BACKUP_TO_IAD
Cloud Files Archive MD5:  65147eb66f8bbeff03a229570b0a1be7  ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7  /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_IAD CRC missing, in error, or NOT OK...
iad: COMPLETED OK 15504796/15504796
DFW: Not backing up ...

As we can see the 107 (localmd5size) and the 32 (cloudfilesmd5size) are different! I’ve no idea why, since when echoing the variables they look the same. I suspect gremlins and Trolls. A fresh head tomorrow will probably solve this in a few minutes!

Cheers &
Best wishes,
Adam

Obscene Redundancy utilizing Rackspace Cloud Files

So, you may have noticed over the past weeks and months I have been a little bit quieter about the articles I have been writing. Mainly because I’ve been working on a new github project, which, although simple, and lightweight is actually really rather outrageously powerful.

https://github.com/aziouk/obsceneredundancy

Imagine being able to take 15+ redundant replica copies of your files, across 5 or 6 different datacentres. Rackspace Cloud Files API powered, but also with a lot of the flexibility of Bourne Again Shell (BASH).

This was actually quite a neat achievement and I am pleased with the results. There are still some limitations of this redundant replica application, and there are a few bugs, but it is a great proof of concept which shows what you can do with the API both quickly and cheaply (ish). Using filesystems as a service will be the future with some further innovation on the world wide network infrastructure, and it would only take a small breakthrough to rapidly alter the way that OS and machines boot/backup.

If you want to see the project and read the source code before I lay out and describe/explain the entire process of writing this software as well as how to deploy it with cron on linux, then you need wait no longer. Revision 1 alpha is now tested, ready and working in 5 different datacentres.

You can actually toggle which datacentres you wish to utilize as well, it is slightly flexible. The only important consideration here is to understand that there are some limitations such as a lack of de-duping, and this uses tar’s and swiftly, instead of directly querying the API. Since directly uploading thru the API a tar file is relatively simple, I will probably implement it like that as I have before and get rid of swiftly in future iterations, however such a project is really ideal for learning more about BASH , CRON, API and programmatic automation of and sequential filesystems utilizing functional programming and division of labour between workers,

https://github.com/aziouk/obsceneredundancy

Test it (please note it will be a little bit buggy on different environments and there is no instructions yet)

git clone https://github.com/aziouk/obsceneredundancy

Cheers &

Best wishes,
Adam

Preparing a Github/Gitlab Development Bastion Server

So you are looking to use github / gitlab to manage your infrastructure and development. To do this effectively you will need to prepare your environment. Here is an example.

This is for our ansible playbook.

Install Required Dependencies

yum update -y
yum install -y vim git ansible tree fail2ban

Add user for repo

useradd -m -G wheel osan
passwd osan

Secure SSH by disabling root login and changing SSH port

sed 's/#PermitRootLogin yes/PermitRootLogin no/g;s/#Port 22/Port 222/g' -i /etc/ssh/sshd_config
firewall-cmd --add-port=666/tcp --permanent
firewall-cmd --reload
systemctl restart sshd.service

Generate key for osan user

su - osan
ssh-keygen -f ~/.ssh/id_rsa -t rsa -N ''

Output the key you generated

cat ~/.ssh/id_rsa.pub

The next step is adding your SSH key above to the ‘profiles’ section of your gitlab/github user. Find this in my profile, under ‘SSH KEYS’.

Screen Shot 2016-04-25 at 10.13.03 AM

Screen Shot 2016-04-25 at 10.13.19 AM

Set Git Variables

USERNAME=yourgitlabusername
git config --global user.name $USERNAME
git config --global user.email "my.username@mydomain.com"

Clone Project

git clone user@domainname.com:$USERNAME/projectname.git