Block all the IPs from a country

So, I wrote a nice little one-liner for one of our customers who wanted to blanket ban Russia (even though I said it wasn’t a good idea, and only marginally effective at stopping attacks). It might help with spam or other stuff though, and anyway, the customer is always ‘wrong’; it’s up to us to make sure that they do it wrongly right. ;-D

curl http://www.ipdeny.com/ipblocks/data/countries/ru.zone -o russia_ips_all.txt; cat russia_ips_all.txt | xargs -I {} echo /sbin/iptables -I INPUT -s {} -j DROP

Above is how I achieved it. This bans all the IPs from Russia. Note that because of the echo, the command only prints the iptables rules; drop the echo (or pipe the output to sh) to actually apply them. But, if you aren’t very equal opportunities :(, you can ban all kinds of countries:

http://www.ipdeny.com/ipblocks/

Just take a look at that list and change the URL accordingly. The output file name doesn’t matter (even if it says russia); just change the URL directly after curl. For instance:

curl http://www.ipdeny.com/ipblocks/data/countries/pl.zone -o ips_all.txt; cat ips_all.txt | xargs -I {} echo /sbin/iptables -I INPUT -s {} -j DROP
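
If you would rather review the rules before applying anything, the same idea can be sketched as a small function. This is only a sketch: the name zone_to_rules and the CIDR sanity check are my own additions and not part of the one-liner. It reads a zone file on stdin, skips anything that doesn't look like an IPv4 network, and prints one iptables command per line without running it.

```shell
# zone_to_rules (hypothetical helper): read CIDR blocks on stdin and
# print one iptables DROP command per valid-looking IPv4 network.
# Nothing is executed; the output is meant to be reviewed first.
zone_to_rules() {
    # Keep only lines shaped like a.b.c.d/nn; blanks and junk are dropped.
    grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}$' |
    while IFS= read -r cidr; do
        echo "/sbin/iptables -I INPUT -s $cidr -j DROP"
    done
}
```

For example, `curl -s http://www.ipdeny.com/ipblocks/data/countries/ru.zone | zone_to_rules > rules.sh` writes the rules to a file (rules.sh is just an example name) that you can read through and then run as root with `sh rules.sh`.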

I was really quite happy with this little one-liner. 😀

Cheers &
Best wishes,
Adam

All About NOVA and Xen Tools in Rackspace Cloud – why can’t I connect to my Windows server?

Why can’t I connect to my Rackspace Windows cloud-server, you ask? There are 2 important questions.

1. Is it a new build?
2. Is it using a custom image (a non-Rackspace base image)?

(The Rackspace base images all have the correct nova-agent and Xen tools, so they get networking information OK. But customer images don’t necessarily!) If the tests below show that nova-agent or xe-guest-utilities is not running (or not installed), you will need to install them.

Checking for the nova-agent and xe-guest-utilities

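# Is the agent process running?
ps auxfwww | grep nova-agent
# Are the packages installed? (RPM-based systems)
rpm -q xe-guest-utilities nova-agent
# Are the packages installed? (Debian/Ubuntu)
dpkg -l xe-guest-utilities nova-agent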

Explanation and solution

Thanks for reaching out to us with your inquiry today. I’m glad to convey to you that I understand what the problem is with your cloud-server not being contactable.

Main reasons for breakage

The main reason why this is not working is most likely that some important pieces of software are missing. There is a piece of software called nova-agent, which is responsible for setting your cloud-server’s IPv4 address, network subnet/mask, and IP routes when it is first built. This is important, since the server image you built the server from has different network details.

The Rackspace build process that gives networking details to the VM is completely dependent on xe-guest-utilities and nova-agent.

What has happened in this case is that, because nova-agent wasn’t running on the cloud-server, the hypervisor software Rackspace uses to automate cloud-server builds wasn’t able to contact it, and therefore nova-agent wasn’t able to update the networking information. Hence, you’re not able to connect to the server on the IPv4 address you were given at build time.

The steps to resolution: installing nova-agent and xen guest utilities
As such, nova-agent needs to be installed on the cloud-server you take the image from. It can be installed as follows:

https://community.rackspace.com/products/f/25/t/5694

Nova-agent also uses another important piece of software called xe-guest-utilities (or Xen Tools, for your Windows servers). These are important ‘PV’ (paravirtualization) tools, responsible for seamless management of cloud-servers. Sorry that in this case it’s not working out seamlessly, but this can happen with images taken of servers which have had nova-agent disabled, uninstalled, or similar.

The tools that nova-agent depends upon can be upgraded by following the instructions at the following location:

https://support.rackspace.com/how-to/upgrade-citrix-xen-server-tools-for-windows-cloud-servers/

# Options of how to do this / Summary of Solution Steps

Naturally, you might be wondering how to make these changes if you cannot RDP to the server. This is quite understandable. There are two ways to get this working:

Option 1) Manually install nova-agent on the current server you cannot access, then manually install the Xen Tools in the same way. This fixes the OS on the server itself, but not the original image you built the server from. So it is important to create a new cloud-server image after performing these steps and after we have verified that the tools and nova-agent are installed correctly.

Option 2) Manually install nova-agent on the source server you initially took the image from, install Xen Tools, then re-image that server and re-deploy. This should work seamlessly on each build with that image, provided the tools are installed. You will not need to repeat the fix on every new server, since you’re fixing the problem on the source cloud-server that the original image was taken from.

I appreciate that these things are not 100% simple to get your head around and can be confusing for customers. I hope my explanation and summary make this a little more painless to fix. Of course, if you have additional questions, comments or concerns, or don’t understand something I’ve said, please don’t hesitate to reach out to us. We are here to help!

Adding nodes and Updating nodes behind a Cloud Load Balancer

I have succeeded in putting together some basic scripts documenting exactly how the API works for adding node(s), listing the nodes behind the LB, and updating the nodes (setting them to DRAINING, DISABLED or ENABLED).

Use update node to set one of your nodes to gracefully drain (not accept new connections, wait for present connections to die). Naturally, you will want to put the secondary server in behind the load balancer first, with addnode.sh.

Once the new node is added as enabled, set the old node to ‘DRAINING’. This will gracefully switch over the server.

# List Load Balancers

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='10017858'

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`



curl -v -H "X-Auth-Token: $TOKEN" -H "content-type: application/json" -X GET "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID"

#

# Add Node(s) addnode.sh

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='10017858'

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`

# Add Node
curl -v -H "X-Auth-Token: $TOKEN" -d @addnode.json -H "content-type: application/json" -X POST "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID/nodes"



## 

For the addnode script you require a file called addnode.json.
That file must contain the ServiceNet (snet) IPs you wish to add.

#
# addnode.json

{"nodes": [
        {
            "address": "10.0.0.1",
            "port": 80,
            "condition": "ENABLED",
            "type":"PRIMARY"
        }
    ]
}
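
If you have several ServiceNet IPs to add, writing the JSON by hand gets tedious. Here is a rough sketch of a generator; the function name make_addnode_json is my own invention, and it assumes every node listens on the same port and should be added as ENABLED and PRIMARY, exactly like the example file above.

```shell
# make_addnode_json (hypothetical helper): print an addnode.json body
# for the ServiceNet IPs given as arguments, all on the same port.
# Usage: make_addnode_json PORT IP [IP...]
make_addnode_json() {
    port=$1
    shift
    sep=''
    printf '{"nodes": ['
    for ip in "$@"; do
        # Each node gets the same condition and type as the example file.
        printf '%s{"address": "%s", "port": %s, "condition": "ENABLED", "type": "PRIMARY"}' "$sep" "$ip" "$port"
        sep=', '
    done
    printf ']}\n'
}
```

Running `make_addnode_json 80 10.0.0.1 10.0.0.2 > addnode.json` before calling addnode.sh would then add both nodes in one request.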

##

##

# updatenode.sh

#!/bin/bash

USERNAME='yourmycloudusernamegoeshere'
APIKEY='apikeygoeshere'
LB_ID='157089'
CUSTOMER_ID='100101010'
NODE_ID=719425

TOKEN=`curl https://identity.api.rackspacecloud.com/v2.0/tokens -X POST -d '{ "auth":{"RAX-KSKEY:apiKeyCredentials": { "username":"'$USERNAME'", "apiKey": "'$APIKEY'" }} }' -H "Content-type: application/json" |  python -mjson.tool | grep -A5 token | grep id | cut -d '"' -f4`

# Update Node

curl -v -H "X-Auth-Token: $TOKEN" -d @updatenode.json -H "content-type: application/json" -X PUT "https://lon.loadbalancers.api.rackspacecloud.com/v1.0/$CUSTOMER_ID/loadbalancers/$LB_ID/nodes/$NODE_ID"

##

##

## updatenode.json

{"node":{
            "condition": "DISABLED",
            "type":"PRIMARY"
        }
}

Naturally, you will be able to change condition to ENABLED, DISABLED, or DRAINING.

I recommend using DRAINING, since it will gracefully remove the cloud-server: any existing connections will be waited on before the server is removed from the LB.
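
To avoid typos in the condition when switching a node over, the JSON body can be generated rather than edited by hand. This is a minimal sketch: node_condition_json is a made-up helper name, and it assumes the node stays PRIMARY as in the updatenode.json shown above.

```shell
# node_condition_json (hypothetical helper): print an updatenode.json
# body for one of the three conditions mentioned above, and refuse
# anything else.
node_condition_json() {
    case $1 in
        ENABLED|DISABLED|DRAINING)
            printf '{"node": {"condition": "%s", "type": "PRIMARY"}}\n' "$1"
            ;;
        *)
            echo "unknown condition: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `node_condition_json DRAINING > updatenode.json && ./updatenode.sh` would drain the node whose NODE_ID is set in updatenode.sh.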

HTTPS to HTTP redirect for Rackspace Cloud Load Balancer

Hello,

So a customer reached out to us today asking about how to configure HTTPS to HTTP redirects. This is actually really easy to do.

Think of it like this;

When you enable HTTP and HTTPS (allow insecure and secure traffic), the protocol is always HTTP hitting the server.

When you ‘only allow secure traffic’ it will always be HTTPS. Think of it like, if the load balancer has certificates on it and supports both protocols (i.e. terminates SSL at the load balancer), then the requests hitting the server will always be HTTP.

This is why the X-Forwarded-Proto header becomes important: it lets your server determine which traffic arriving from the load balancer originated as HTTPS and which originated as HTTP. This is what allows you to effectively do the redirection on the cloud-server side.

I hope this helps!

So the rewrite rule on the server, which would usually test the ‘HTTPS’ variable, needs to be replaced with a rule that uses the X-Forwarded-Proto header instead of the regular incoming protocol to decide whether to redirect.

    RewriteEngine On
    RewriteCond %{HTTP:X-Forwarded-Proto} !=http
    RewriteCond %{REQUEST_URI} !^/some/path.*
    RewriteRule (.*) http://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
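
One way to check the rule without going through the load balancer is to send the header yourself and look at where the server redirects you. The helper below is only a sketch (redirect_target is my own name for it): it reads HTTP response headers on stdin and prints the Location value, if any.

```shell
# redirect_target (hypothetical helper): read HTTP response headers on
# stdin and print the value of the Location header, if there is one.
redirect_target() {
    # Header names are case-insensitive; strip the trailing carriage
    # return that HTTP puts at the end of every header line.
    awk 'tolower($1) == "location:" { sub(/\r$/, "", $2); print $2 }'
}
```

With the rule above in place, something like `curl -sI -H 'X-Forwarded-Proto: https' http://10.0.0.1/ | redirect_target` (10.0.0.1 being a placeholder for your server's address) should print an http:// URL, while sending `X-Forwarded-Proto: http` should print nothing.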

Quite simple, but took me the better part of half an hour to get my head properly around this 😀

You might be interested in achieving this on Windows systems with IIS or similar, so check out this link, which explains how to do that step by step on Windows:

https://support.rackspace.com/how-to/configuring-load-balanced-sites-with-ssl-offloading-using-iis/

Could not resolve mirror.rackspace.com

So a customer came to me with an issue about not being able to install chkrootkit, due to DNS failure. The customer wasn’t technical and didn’t understand how to fix it. It’s easy to fix by wiping the /etc/resolv.conf nameserver file and rebuilding it. It takes a few seconds, and you should be able to resolve domains again afterwards.


# Delete resolv.conf

rm /etc/resolv.conf

# Recreate the DNS file with different DNS servers
touch /etc/resolv.conf
echo "nameserver 4.2.2.4" >> /etc/resolv.conf
echo "nameserver 8.8.8.8" >> /etc/resolv.conf

Alternatively you could just open it up for editing with vi or nano.

vi /etc/resolv.conf
nano /etc/resolv.conf 
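
If you do this often, it may be worth keeping a copy of the old file rather than deleting it outright. A minimal sketch follows; write_resolv is a hypothetical helper name, and the .bak backup is my own addition rather than part of the steps above.

```shell
# write_resolv (hypothetical helper): replace FILE with one
# "nameserver" line per address, keeping a .bak copy of the old file.
# Usage: write_resolv FILE NAMESERVER [NAMESERVER...]
write_resolv() {
    file=$1
    shift
    # Keep the previous contents in case the change needs undoing.
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak"
    fi
    # Truncate the file, then append the new nameservers in order.
    : > "$file"
    for ns in "$@"; do
        echo "nameserver $ns" >> "$file"
    done
}
```

Run as root, `write_resolv /etc/resolv.conf 4.2.2.4 8.8.8.8` gives the same result as the commands above and leaves the old file in /etc/resolv.conf.bak.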

Testing Rackspace Cloud-server Service-net Connectivity and creating an alarm

So, over the last few weeks my colleagues and I have been noticing a couple of issues with the cloud-servers’ ServiceNet interface. Unfortunately, for some customers utilizing DBaaS instances this means that their cloud-server will often be unable to communicate with its database backend.

The solution is a custom monitoring script that my colleague Marcin has kindly put together for another customer of his own.

The bash script that goes on the server:

Create file:

vi /usr/lib/rackspace-monitoring-agent/plugins/servicenet.sh

Paste into file:

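#!/bin/bash
#
# Ping the target once over ServiceNet (eth1), waiting at most 1 second
ping="/usr/bin/ping -W 1 -c 1 -I eth1 -q"

# No target IP given as the first argument: report CRITICAL
if [ -z "$1" ]; then
   echo -e "status CRITICAL\nmetric ping_check uint32 1"
   exit 1
else
   $ping "$1" &>/dev/null
   if [ "$?" -eq 0 ]; then
      # Ping succeeded: metric 0 means OK
      echo -e "status OK\nmetric ping_check uint32 0"
      exit 0
   else
      # Ping failed: metric 1 means CRITICAL
      echo -e "status CRITICAL\nmetric ping_check uint32 1"
      exit 1
   fi
fi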

Create an alarm that utilizes the metric below:

if (metric["ping_check"] == 1) {
    return new AlarmStatus(CRITICAL, 'what?');
}
if (metric["ping_check"] == 0) {
    return new AlarmStatus(OK, 'eee?');
}
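
Before wiring up the alarm it is worth running the plugin by hand and checking that it prints what the alarm expects. The function below is a rough sketch of such a check; check_plugin_output is a hypothetical name, and it only looks for the two kinds of line that the script above prints (a status line and the ping_check metric).

```shell
# check_plugin_output (hypothetical helper): read plugin output on stdin
# and succeed only if it has a "status" line and a "ping_check" metric.
check_plugin_output() {
    awk '
        $1 == "status" && NF >= 2 { status = 1 }
        $1 == "metric" && $2 == "ping_check" && NF >= 4 { metric = 1 }
        END { exit !(status && metric) }
    '
}
```

For example, `/usr/lib/rackspace-monitoring-agent/plugins/servicenet.sh 10.0.0.1 | check_plugin_output && echo "output looks sane"`, where 10.0.0.1 is a placeholder for the ServiceNet IP you want to ping. As far as I know the plugin file also needs to be executable (chmod +x) for the agent to run it.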

Of course, for this to work the primary requirement is a Rackspace cloud-server with the Rackspace Cloud Monitoring agent already installed on it.

Thanks again Marcin, for this golden nugget.

Using Sar to tell a story

So, a customer is experiencing slowness/sluggishness in their app. You know from instinct that there is no issue with the hypervisor, but instinct isn’t enough. Tools like xentop, sar and bwm-ng are critical parts of live and historical troubleshooting.

Sar can tell you a story, if you can ask the storyteller the right questions, or even better, pick up the book and read it properly. You’ll understand the plot, the scenario and the situation, and exactly how to proceed with troubleshooting, by paying attention to this data and knowing which things to check under certain circumstances.

This article doesn’t go into depth on that, but it gives you a good reference for a variety of tests, the most important being CPU usage, I/O usage, network usage, and load averages.

CPU Usage of all processors

# Grab details live
sar -u 1 3

# Use historical binary sar file
# sa10 means '10th day' of current month.
sar -u -f /var/log/sa/sa10 
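
Since the file name is just the day of the month padded to two digits, picking the right file can be scripted. A small sketch follows; sa_file is a made-up helper name, and it assumes the /var/log/sa layout used above (other distributions may keep these files elsewhere).

```shell
# sa_file (hypothetical helper): print the sar data file for a given
# day of the month. Assumes the /var/log/sa layout used above.
sa_file() {
    # Strip a leading zero so "08" is not parsed as octal by printf.
    day=${1#0}
    printf '/var/log/sa/sa%02d\n' "$day"
}
```

For example, `sar -u -f "$(sa_file 10)"` is the same as the command above, and with GNU date `sar -u -f "$(sa_file "$(date -d yesterday +%d)")"` reads yesterday's data.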

CPU Usage of a particular Processor

sar -P ALL 1 1

‘-P 1’ means check only the 2nd Core. (Core numbers start from 0).

sar -P 1 1 5

The above command displays real time CPU usage for core number 1, every 1 second for 5 times.

Observing Changes in Memory over time

sar -r 1 3

The above command provides memory stats every 1 second for a total of 3 times.

Observing Swap usage over time

sar -S 1 5

The above command reports swap statistics every 1 second, a total of 5 times.

Overall I/O activity

sar -b 1 3 

The above command checks every 1 second, 3 times.

Individual Block Device I/O Activities

This is a useful check for LUNs, block devices and other specific mounts.

sar -d 1 1 
sar -p -d

DEV – indicates block device, i.e. sda, sda1, sdb1 etc.

Total number of processes created per second / Context switches

sar -w 1 3

Run Queue and Load Average

sar -q 1 3 

This reports the run queue size and the load average over the last 1 minute, 5 minutes, and 15 minutes. “1 3” reports every 1 second, a total of 3 times.

Report Network Statistics

sar -n KEYWORD

KEYWORDS Available;

DEV – Displays network devices vital statistics for eth0, eth1, etc.,
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.

sar -n DEV 1 1

Specify Start Time

sar -q -f /var/log/sa/sa11 -s 11:00:00
sar -q -f /var/log/sa/sa11 -s 11:00:00 | head -n 10

Grabbing network activity from server without network utility

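So, is it possible to look at a network interface’s activity without bwm-ng, iptraf, or other tools? Yes.

INTERFACE=eth0   # interface to watch; eth0 is an assumption, change to suit
# Take a first reading so the first difference is meaningful
RX2=`cat /sys/class/net/${INTERFACE}/statistics/rx_bytes`
TX2=`cat /sys/class/net/${INTERFACE}/statistics/tx_bytes`
while true; do
sleep 1
RX1=`cat /sys/class/net/${INTERFACE}/statistics/rx_bytes`
TX1=`cat /sys/class/net/${INTERFACE}/statistics/tx_bytes`
# Bytes received/sent since the previous reading (about 1 second ago)
DOWN=$(($RX1-$RX2))
UP=$(($TX1-$TX2))
DOWN_Bits=$(($DOWN * 8 ))
UP_Bits=$(($UP * 8 ))
# Shift right by 20 bits to convert bits to (whole) megabits
DOWNmbps=$(( $DOWN_Bits >> 20 ))
UPmbps=$(($UP_Bits >> 20 ))
echo -e "RX:${DOWN}\tTX:${UP} B/s | RX:${DOWNmbps}\tTX:${UPmbps} Mb/s"
RX2=$RX1; TX2=$TX1
done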
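
One thing to be aware of is that the bit shift in the loop only gives whole megabits, so anything under roughly 1 Mb/s shows up as 0. If you want fractions, the conversion can be done with awk instead. This is just a sketch: rate_mbit is a made-up helper name, and it uses decimal megabits (1,000,000 bits) rather than the 2^20 used in the loop.

```shell
# rate_mbit (hypothetical helper): convert a byte count observed over a
# number of seconds into megabits per second, with two decimals.
# Usage: rate_mbit BYTES SECONDS
rate_mbit() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", (b * 8) / (s * 1000000) }'
}
```

Inside the loop you could then print `$(rate_mbit $DOWN 1)` instead of `${DOWNmbps}`.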

I found this little gem yesterday, but couldn’t understand why they had not used clear. I guess they wanted to log activity or something… still this was a really nice find. I can’t remember where I found it yesterday but googling part of it should lead you to the original source 😀

Using TCP to ping test your Cloud server connectivity

So, you have probably heard that there are a variety of reasons why you shouldn’t use ICMP to test that your service is operating normally, mainly because of the way that ICMP is handled by routers. If you really want a representative view of how TCP traffic such as HTTP and HTTPS is performing in terms of packet loss (that is to say, packets which do not arrive at their destination), then hping is your friend.

You might be pinging a cloud-server that is not replying. You might think it’s down. But what if the firewall is simply dropping incoming ICMP echo requests? Indeed.

Enter hping.

# hping -S -p 80 google.com
HPING google.com (eth0 74.125.136.102): S set, 40 headers + 0 data bytes
len=46 ip=74.125.136.102 ttl=46 id=23970 sport=80 flags=SA seq=0 win=42780 rtt=13.8 ms
len=46 ip=74.125.136.102 ttl=47 id=37443 sport=80 flags=SA seq=1 win=42780 rtt=12.6 ms
len=46 ip=74.125.136.102 ttl=47 id=43654 sport=80 flags=SA seq=2 win=42780 rtt=12.0 ms
len=46 ip=74.125.136.102 ttl=47 id=37877 sport=80 flags=SA seq=3 win=42780 rtt=11.4 ms
len=46 ip=74.125.136.102 ttl=47 id=62433 sport=80 flags=SA seq=4 win=42780 rtt=13.3 ms
^C
--- google.com hping statistic ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 11.4/12.6/13.8 ms

In this case I tested with google.com. I’m actually surprised that more people don’t use hping, because, hping is awesome. It also makes quite a decent port scanner, were it not for the fact that the machine I tried to test that feature with buffer overflowed 😉 It’s a nice way to test a firewalled box, but more than that, it’s a more reliable test in my opinion.
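
If hping isn't installed on the box you are testing from, a cruder TCP check is still possible with bash alone. The sketch below is an assumption-laden stand-in, not a replacement: tcp_check is a made-up name, it relies on bash's /dev/tcp redirection and the coreutils timeout command, and unlike hping -S it completes a full TCP connection and reports no timings or packet loss.

```shell
# tcp_check (hypothetical helper): succeed if a TCP connection to
# HOST:PORT can be opened within 3 seconds.
# Usage: tcp_check HOST PORT
tcp_check() {
    # bash opens a socket when asked to redirect to /dev/tcp/HOST/PORT;
    # timeout gives up if a firewall silently drops the packets.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, `tcp_check google.com 80 && echo open || echo "closed or filtered"`.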

Preventing /etc/resolv.conf reset on startup/boot

Today, a customer approached us after a Host Server Down, complaining that their website and application were down and not working, even though the server was online again and functioning correctly.

The customer discovered that the source of the issue was that their /etc/resolv.conf was blank, which means the server cannot resolve DNS records (A/PTR/CNAME), and in particular cannot turn hostnames into IPs. This is called hostname to IP resolution. It means that if /etc/resolv.conf is blank and the customer uses hostnames in their calls, connectivity will break, because the hostname cannot be resolved to an IP to communicate with over TCP.

There is actually a very simple way to prevent the /etc/resolv.conf file from being changed. But first, it’s important to understand why /etc/resolv.conf is being reset.

On all Rackspace cloud-servers there is a process called nova-agent. When the server starts up, it resets the /etc/resolv.conf file along with the networking configuration. This happens each time your server is restarted and is used to set new networking details. Specifically, if you take an image and build a server on a new IP address, or if your server is live-migrated to a new host, it makes sure that on the next reboot the server comes up with the correct networking details transparently. However, this can cause some issues, such as in this case with the /etc/resolv.conf file. Fortunately there are some novel ways of preventing your /etc/resolv.conf from being modified after you have added the nameservers you want to it.

You can use the chattr immutable file attribute to stop processes from modifying resolv.conf after you have made the changes you want:

Set Immutable File

chattr +i /etc/resolv.conf 

Un-set Immutable File

chattr -i /etc/resolv.conf 
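
To see whether the flag is currently set, lsattr shows the attributes that chattr changes. A minimal sketch of a check follows; is_immutable is a hypothetical helper name, and it simply looks for an "i" in the attribute column that lsattr prints.

```shell
# is_immutable (hypothetical helper): succeed if FILE has the
# immutable (i) attribute set, according to lsattr.
# Usage: is_immutable FILE
is_immutable() {
    # lsattr prints "<attributes> <filename>"; only the first column
    # is inspected, so an "i" in the file name cannot cause a match.
    lsattr -d "$1" 2>/dev/null | awk '{ print $1 }' | grep -q 'i'
}
```

For example, `is_immutable /etc/resolv.conf && echo "locked" || echo "not locked"`.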

This /etc/resolv.conf issue is a common problem; however, setting the immutable flag with chattr should prevent the file from being changed until you remove the flag again.