Rackspace Cloud Files CDN-enabled containers and the Rackspace CDN product can occasionally have an issue. In such cases, troubleshooting is a lot easier if you use the DEBUG headers: add -D and the following -H headers to your curl request so they appear in the output.
Naturally, for this to work you need a test URL. You can get one from Cloud Files in the Rackspace Control Panel, or from whatever origin is configured with the CDN. Just log in to the origin, find a file path such as httpdocs/somefolder/somefile.png, then get the raxcdn.com URL from the Rackspace CDN product page, or from the Rackspace Cloud Files ‘show all links’ page on the cog icon next to the Cloud Files container, and append the path relative to your document root.
So, starting with the CDN URL rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com, take the file on the origin (in the document root of the website configured with the CDN) and simply append its ‘local’ document path within httpdocs or your www folder.
i.e. rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com becomes rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com/somefolder/somefile.png
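As a quick sketch (the hostname and path are just the placeholders from above), you can dump the headers the CDN edge returns and discard the body, appending whichever of the debug -H request headers you need:
# Print the CDN response headers to stdout; -o /dev/null discards the body itself
curl -sD - -o /dev/null "http://rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com/somefolder/somefile.png"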
So, a customer today was having some issues with their CDN. They said that their SSL CDN was presenting a different image than the HTTP CDN. I thought the best way to begin any troubleshooting process would be to first try to recreate the issue. To do that, I needed a way to compare the files programmatically: enter md5sum, a handy little shell utility installed by default on most Linux distributions.
[user@cbast3 ~]$ curl https://3485asd3jjc839c9d3-08e84cacaacfcebda9281e3a9724b749.ssl.cf3.rackcdn.com/companies/5825cb13f2e6c9632807d103/header.jpeg -o file ; cat file | md5sum
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 382k 100 382k 0 0 1726k 0 --:--:-- --:--:-- --:--:-- 1732k
e917a67bbe34d4eb2d4fe5a87ce90de0 -
[user@cbast3 ~]$ curl http://3485asd3jjc839c9d3-08e84cacaacfcebda9281e3a9724b749.r45.cf3.rackcdn.com/companies/5825cb13f2e6c9632807d103/header.jpeg -o file2 ; cat file2 | md5sum
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 382k 100 382k 0 0 2071k 0 --:--:-- --:--:-- --:--:-- 2081k
e917a67bbe34d4eb2d4fe5a87ce90de0 -
As we can see from both outputs, the md5sum (the hash) of the two files is the same, which means there is a statistically overwhelming chance that the content is exactly the same. MD5 produces a fixed 128-bit digest, so for two different files to return the same digest by accident would require an astronomically unlikely collision.
In this case I was able to disprove the customer's claim. Not because I wanted to, but because I wanted to solve their issue. These results show me that if the issue is with the CDN at all, it must be with an edge node local to the customer having the issue. Since I am unable to recreate it from my location, it is not unreasonable to assume it is either a client-side issue or a failure on our CDN edge node local to the customer. That's how I troubleshot this, and I'm quite happy with this one! It took about two minutes to do, and a few minutes to come up with. A quick and useful check indeed, which considerably reduces the number of possibilities when tracing down the issue.
Cheers &
Best wishes,
Adam
Please note the real CDN location has been altered for privacy reasons
Hey folks, I know it's been a little while since I put an article together. However, I have been putting together a rather detailed article for the Rackspace Community explaining how to write bespoke backup systems. It's a proof of concept/demonstration/tutorial as opposed to a production application, but people looking to create custom cloud backup scripts may benefit from reading through it.
So, as you may already be aware, I am working on a lightweight backup script called ‘obscene redundancy’: redundant backup software capable of making 18 replicas of your data via the Rackspace Cloud Files API. It's so redundant… it's obscene redundancy.
Today, I was discussing with my colleague that it is all very well uploading your tar to Cloud Files, but wouldn't you really like to know that the file you uploaded is completely identical, bit for bit and in the same order? Enter the Cloud Files ‘HEAD’ request and its Etag: our MD5 friend.
What I did to improve the obscene redundancy script was quite simple:
# We define a variable that takes the 'Etag' (MD5Sum) value for the cloud files archive
cfmd5sum=$(swiftly --conf swiftly-configs/swiftly-${SHORT_REGION,,}.conf head \
"${BACKUP_DEST}/${FILE}" | grep -i Etag | awk '{print $2}')
# We Define a variable that generates an 'MD5Sum' for the local file archive
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE")
echo "Checking Data integrity of Cloud Files upload to $REGION"
echo "Cloud Files Archive MD5: $cfmd5sum ....... Local File Archive MD5: $localmd5sum"
# If these values are not equal, the upload is suspect; otherwise it is intact
if [[ "$cfmd5sum" != "$localmd5sum" ]];
then
echo "VALUES NOT EQUAL"
echo "$REGION CRC missing, in error, or NOT OK..."
else
echo "VALUES EQUAL"
echo "$REGION CRC OK..."
fi
After all this I found that the script wasn't working properly… so I did some debugging, checking, first of all, the length of each variable.
if [[ "$cfmd5sum" == "$localmd5sum" ]]; then
echo "VALUES EQUAL, (local md5sum length given first)"
echo "$localmd5sum"| wc -L
echo "$cfmd5sum"| wc -L
echo "$REGION CRC OK..."
else
echo "VALUES NOT EQUAL"
echo "$localmd5sum"|wc -L
echo "$cfmd5sum"|wc -L
echo "$REGION CRC missing, in error, or NOT OK..."
fi
The output showed me that the variable lengths were different. At this stage I've no idea why, but I will add updates here. I'm going to commit this to obsceneredundancy anyway, because the proof of concept is working and valid, as shown by the output of the script (i.e. the method is fine; it's just the way the string is compared in the if statement, and I suspect it is to do with special characters or \n characters, as I've had before). So, when I made this addition to the multi-dc-backup.sh script, the output looked like this:
Creating Container in LON for obsceneredundancy
LON: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://LON/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_LON
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_LON CRC missing, in error, or NOT OK...
lon: COMPLETED OK 15504796/15504796
ORD: Not backing up ...
Creating Container in IAD for obsceneredundancy
IAD: Backing up ...
Source: /var/www/ ---> Dest: cloudfiles://IAD/obsceneredundancy/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
Checking Data integrity of Cloud Files upload to BACKUP_TO_IAD
Cloud Files Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 ....... Local File Archive MD5: 65147eb66f8bbeff03a229570b0a1be7 /var/backup/varwww-2016-07-06-6bd657e9-d268-4883-9f40-3859f690aadb.tar.gz
VALUES NOT EQUAL
107
32
BACKUP_TO_IAD CRC missing, in error, or NOT OK...
iad: COMPLETED OK 15504796/15504796
DFW: Not backing up ...
As we can see, the 107 (localmd5sum length) and the 32 (Cloud Files md5sum length) are different! I've no idea why, since when echoing the variables they look the same. I suspect gremlins and trolls. A fresh head tomorrow will probably solve this in a few minutes!
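Looking back at the output above, a likely explanation (just my reading of it, not yet confirmed in the script) is that md5sum prints both the digest and the file path, so $localmd5sum is the whole "<digest> /var/backup/<archive>" string of 107 characters, while the Etag from the HEAD request is only the 32-character digest, so the two can never compare equal. A minimal sketch of the fix, using the same variable names as the script above:
# Keep only the first field (the 32-character digest) and drop the file path
localmd5sum=$(md5sum "$BACKUP_DIR"/"$FILE" | awk '{print $1}')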
Hey folks. So, recently I have been doing a bit of work on the Rackspace Community, specifically trying to document, and make as easy as possible, the importing and exporting of cloud server VHDs between Rackspace regions. This might be really useful if you are designing an HA, multi-region and/or load-balancing solution that utilizes autoscale or other kinds of redundancy, since moving your ‘golden image’ between regions can be quite difficult if you do the entire process manually or step by step, as I have documented in the two articles below:
Exporting Cloud server images from a Rackspace Region
https://community.rackspace.com/products/f/25/t/7089
Importing Cloud Server Images to a Rackspace Region
https://community.rackspace.com/products/f/25/t/7186
In this article I finish writing the ‘automation demo’ of how to move images between regions without changing much at all, apart from one ‘serverID’ variable and the source and destination. The script isn't finished yet; however, the last time I posted this on my blog I was so excited that I actually forgot to include the import function (which is kind of important!). Sorry about that.
echo "Exporting VHD to Cloud Files"
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
echo "IMAGEID detected as $IMAGEID"
# This section requests the Glance API to copy the cloud server image uuid to a cloud files container called export
# > export-cloudfiles
echo "THE IMAGE ID IS: $IMAGEID"
IMAGEID=${IMAGEID%$'\r'}
curl -v "https://lon.images.api.rackspacecloud.com/v2/$TENANT/tasks" -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" -d '{"type": "export", "input": {"image_uuid": "'$IMAGEID'" , "receiving_swift_container": "export"}}' -o export-cloudfiles
echo "Export looks like"
echo "Waiting for Task to complete..."
## WAIT FOR TASKID EXPORT TO COMPLETE TO CLOUD FILES
# This section simply retrieves the TOKEN
TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" | python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`
# This section requests the Glance API to copy the cloud server image uuid to a cloud files container called export
curl "https://lon.images.api.rackspacecloud.com/v2/1000000/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
while [ "$EXPORT_STATUS" = "processing" ]; do
sleep 15
curl "https://lon.images.api.rackspacecloud.com/v2/1000000/tasks/$TASKID_EXPORT" -X GET -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" | python -mjson.tool > export-status
EXPORT_STATUS=$(cat export-status | grep status | awk '{print $2}' | sed 's/"//g' | sed 's/,//g')
done
# SET CORRECT CLOUD FILES NAME
CLOUD_FILES_NAME=$(cat export-cloudfiles | python -mjson.tool | grep image_uuid | awk '{print $2}' | sed 's/,//g' | sed 's/"//g')
## Download VHD Cloud from Cloud Files to this server
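# (Not part of the original excerpt: a hedged sketch of what this download step might
# look like, assuming the exported object is named after the image UUID with a .vhd
# suffix and that a swiftly config for the source region exists at the path shown)
swiftly --conf swiftly-configs/swiftly-lon.conf get "export/$CLOUD_FILES_NAME.vhd" > "$CLOUD_FILES_NAME.vhd"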
As you can probably see, my code is still rather rough, but it's just so darn exciting that this script works from start to finish that I just HAD to share it a bit early! The plan now is to add command-line handling so that you can specify ./moveregion {SOURCE_REGION} {DEST_REGION} {SERVER_ID} {TENANT_ID}. Then a customer or a Racker would only need these four variables to import and export images in an automated way.
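Just to illustrate the idea (this is not part of the script yet; it's only a sketch of the argument handling), the front of moveregion could look something like this:
#!/bin/bash
# Usage: ./moveregion SOURCE_REGION DEST_REGION SERVER_ID TENANT_ID
if [ "$#" -ne 4 ]; then
  echo "Usage: $0 SOURCE_REGION DEST_REGION SERVER_ID TENANT_ID" >&2
  exit 1
fi
SOURCE_REGION="$1"
DEST_REGION="$2"
SERVER_ID="$3"
TENANT="$4"
echo "Moving an image of server $SERVER_ID from $SOURCE_REGION to $DEST_REGION for tenant $TENANT"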
I could rewrite the script in such a way that it would accept a .txt file with a couple of hundred cloud server UUIDs; it would take each server UUID, use it to create an image of that server, export the image to Cloud Files, copy it to the destination region's Cloud Files, and then import it into the Glance image store in the second region. Which, naturally, would save hundreds of hours of human time doing this manually… which is… nice 😀
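As a rough sketch of that batch idea (servers.txt and the four-argument moveregion interface from the previous sketch are assumptions, not existing pieces):
# Loop over a list of server UUIDs, one per line, and move an image of each
while read -r SERVER_ID; do
  [ -z "$SERVER_ID" ] && continue
  ./moveregion LON IAD "$SERVER_ID" "$TENANT"
done < servers.txt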
I would really like to make a UI frontend using something like Django, and utilize some form of ‘light’ database that keeps track of all the API imports/exports and even provides an estimated time for completion, but my UI skills are really limited to XHTML, CSS, PHP and MySQL. I need a Python or Django guy to help out with some of this. If anyone is interested, please reach out to me.
So, you may have noticed over the past weeks and months that I have been a little bit quieter about the articles I have been writing. That's mainly because I've been working on a new GitHub project which, although simple and lightweight, is actually rather outrageously powerful.
https://github.com/aziouk/obsceneredundancy
Imagine being able to take 15+ redundant replica copies of your files, across 5 or 6 different datacentres: powered by the Rackspace Cloud Files API, but also with a lot of the flexibility of the Bourne Again Shell (BASH).
This was actually quite a neat achievement and I am pleased with the results. There are still some limitations to this redundant replica application, and there are a few bugs, but it is a great proof of concept which shows what you can do with the API both quickly and cheaply (ish). Using filesystems as a service will be the future, given some further innovation in worldwide network infrastructure, and it would only take a small breakthrough to rapidly alter the way that operating systems and machines boot and back up.
If you want to see the project and read the source code before I lay out and explain the entire process of writing this software, as well as how to deploy it with cron on Linux, then you need wait no longer. Revision 1 alpha is now tested, ready and working in 5 different datacentres.
You can actually toggle which datacentres you wish to utilize as well; it is somewhat flexible. The important consideration here is that there are some limitations, such as a lack of de-duplication, and that it uses tars and swiftly instead of querying the API directly. Since uploading a tar file directly through the API is relatively simple, I will probably implement it that way, as I have before, and get rid of swiftly in future iterations. However, such a project is really ideal for learning more about BASH, cron, APIs and the programmatic automation of sequential filesystems, using functional programming and a division of labour between workers.
https://github.com/aziouk/obsceneredundancy
Test it (please note it will be a little bit buggy in different environments and there are no instructions yet).
So, you have some really important data, so much so that 99.99% redundancy is not enough for you. One solution is to keep multiple copies in multiple datacentres. Most enterprise backup schemes will have an on-site copy, an off-site copy, and an archival copy. What I'm going to show here is how to make four different copies of your data, in four different datacentres around the world. This provides very highly redundant storage and greatly reduces the likelihood of data loss. Although it costs a bit more, this kind of solution may be suitable for many small, medium and large businesses, naturally depending on the size of the data and the importance of redundancy. You might not have many files to back up, perhaps a small CD's worth, in which case it will be very inexpensive. However, because of the way Cloud Files is billed, copying data costs bandwidth when writing from a server in London to Cloud Files in Sydney, Chicago or Dallas, for instance, so it is very important to consider the impact of bandwidth costs when utilizing three additional Cloud Files endpoints that are not in the local datacentre region, which is essentially what we are doing in this guide.
Create your swiftly environments (setting the name for each file)
==> /root/.swiftly-dfw.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = dfw
==> /root/.swiftly-iad.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = iad
==> /root/.swiftly-ord.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = ord
==> /root/.swiftly-syd.conf <==
[swiftly]
auth_user = myusername
auth_key = censored
auth_url = https://identity.api.rackspacecloud.com/v2.0
region = syd
Create your Script
#!/bin/bash
# Adam Bull, Rackspace UK
# May 17, 2016
# This can be run sequentially, or in parallel by appending & to each command; not sure which is better yet
# This backs up the /documents folder and puts it in the 'managed_backup' Cloud Files container in the following 4 datacentres: DFW, IAD, ORD and SYD
swiftly --verbose --conf ~/.swiftly-dfw.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-iad.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-ord.conf --concurrency 100 put -i /documents /managed_backup
swiftly --verbose --no-snet --conf ~/.swiftly-syd.conf --concurrency 100 put -i /documents /managed_backup
Because the other 3 endpoints are in different datacentres we can't use ServiceNet for them, so we pass the --no-snet option to swiftly, as above.
Execute your script
chmod +x multibackup.sh
./multibackup.sh
This is obviously a basic system and script for taking backups, and it is not for production use (yet); it is an alpha project I started today. The cool thing is that it works, and quite nicely, although it is far from finished as a workable script.
Once the script is made, you can simply add it to crontab -e as you would usually. Make sure the user you execute with cron has access to the .conf files in their home directory!
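For example, a nightly run might look something like the entry below (the time and log path are just an illustration):
# Run the multi-datacentre backup at 02:00 every night and keep a log of the output
0 2 * * * /root/multibackup.sh >> /var/log/multibackup.log 2>&1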
Please note that performing the command below can be destructive.
TAKE CAUTION WHEN USING THIS COMMAND IT CAN DELETE EVERYTHING IF YOU DO SOMETHING WRONG!!!!
# swiftly --verbose --eventlet --concurrency=100 for "" --prefix z_DO_NOT_DELETE --output-names do delete "" --recursive --until-empty
This particular command *should* only remove the Cloud Files directories starting with z_DO_NOT_DELETE. I have tested it and it appears to work correctly.