Calculating the Average Hits per minute en masse for thousands of sites

So, I had a customer with some major MySQL woes, and I wanted to know whether the MySQL issues were query related, as in caused by the frequency of queries alone or by the size of the database, or whether they were driven by the number of visitors coming into Apache, which would increase the frequency of MySQL hits and explain the higher CPU usage.

The best way to achieve this is to inspect the access logs under /var/log/httpd (a quick ls -al there shows you which site logs you have to work with).

First we take a sample of all of the requests coming into Apache, as in all of them. Provided the customer has used proper naming conventions for their vhost logs this isn't a nightmare; Apache's default log setup is designed to make this easy for you, hurrah!

[root@box-DB1 logparser]# time tail -f /var/log/httpd/*access_log > allhitsnow
^C

real	0m44.560s
user	0m0.006s
sys	0m0.031s

Prefixing the command with time tells you how long you let the capture run before hitting Ctrl+C, roughly 45 seconds here.

[root@box-DB1 logparser]# cat allhitsnow | wc -l
1590

The above command shows the number of lines in the allhitsnow file, which collected every new request arriving across all the site log files during the sample. Simples! 1590 requests in roughly 45 seconds, a little over 2,000 a minute, is quite a lot.
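
Here is a slightly tidier sketch of the same idea, assuming the same /var/log/httpd/*access_log naming as above: sample for exactly 60 seconds with timeout (so the total is already a per-minute figure), use -n0 so the pre-existing last 10 lines of each log don't inflate the count, and then break the hits down per vhost using the ==> filename <== headers tail prints when following multiple files.

timeout 60 tail -n0 -f /var/log/httpd/*access_log > allhitsnow

# Total new requests captured in the 60 second window
grep -v '^==>' allhitsnow | grep -c .

# Per-site breakdown, busiest first, using the ==> /path <== headers tail emits
awk '/^==> .* <==$/ {site=$2; next} NF {count[site]++}
     END {for (s in count) printf "%8d  %s\n", count[s], s}' allhitsnow | sort -rn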

Lsyncd fails with Error: Terminating since out of inotify watches.

This is a remarkably easy problem to solve.

From man inotify:

       The inotify API provides a mechanism for monitoring filesystem
       events.  Inotify can be used to monitor individual files, or to
       monitor directories.  When a directory is monitored, inotify will
       return events for the directory itself, and for files inside the
       directory.

The problem is a kernel tunable that limits how many inotify watches, and therefore how much kernel memory, processes such as lsyncd can use. Let's just double check what value it is set to presently:

Check Inotify max_user_watches value presently

[root@server php]# cat /proc/sys/fs/inotify/max_user_watches
100000

Check the amount of available memory on the server to ensure it can deal with the increase

[root@server-new php]# free -m
             total       used       free     shared    buffers     cached
Mem:          7358       6577        781          1        376       2427
-/+ buffers/cache:       3773       3584
Swap:            0          0          0

The cached figure can effectively be counted towards free memory, since Linux hands it back as soon as applications need it; it just uses otherwise idle memory for caching to improve efficiency.

Each inotify watch uses roughly 1 kB of kernel memory on 64-bit systems, and about half that on 32-bit, so even 200,000 watches only costs around 200 MB at full saturation, which this server can comfortably absorb.

Increase Inotify watches to a higher value

sysctl fs.inotify.max_user_watches=200000
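
To make the change stick across reboots, also persist it. The sysctl.conf path below is the classic location and is an assumption; some distributions prefer a drop-in file under /etc/sysctl.d/ instead.

echo "fs.inotify.max_user_watches=200000" >> /etc/sysctl.conf
sysctl -p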

Less Ghetto Log Parser for Website Hitcount/Downtime Analysis

Yesterday I created a proof of concept script, which basically goes off and identifies the hitcounts of a website, and can give a technician, within minutes instead of hours, an exact breakdown of where the hits are coming from and when.

This is kind of a tradeoff, between a script that is automated, and one that is flexible.

The end goal is to provide a hitcount vs memory commit metric value. A NEW TYPE OF METRIC! HURRAH! (This is needed by the industry IMO).

It would also be nice to generate graphs and summary figures such as the mean and range, so it can provide output a bit like the 'stat' tool does. Here is how I have progressed:

#!/bin/bash
#
# Author: 	Adam Bull, Cirrus Infrastructure, Rackspace LTD
# Date: 	March 20 2017
# Use:		This script automates the analysis of webserver logs hitcounts and
# 		provides a breakdown to indicate whether outages are caused by website visits
#		In correlation to memory and load avg figures


# Settings

# What logfile to get stats for
logfile="/var/log/httpd/google.com-access.log"

# What year month and day are we scanning for minute/hour hits
year=2017
month=Mar
day=9

echo "Total HITS: MARCH"
grep "/$month/$year" "$logfile" | wc -l;

# Hours
for i in 0{0..9} {10..23};

do echo "      > 9th March 2017, hits this $i hour";
grep "$day/$month/$year:$i" "$logfile" | wc -l;

        # break down the minutes in a nested visual way thats AWsome

# Minutes
for j in 0{0..9} {10..59};
do echo "                  >>hits at $i:$j";
grep "$day/$month/$year:$i:$j" "$logfile" | wc -l;
done

done

Thing is, after I wrote this, I wasn’t really happy, so I refactored it a bit more;

#!/bin/bash
#
# Author: 	Adam Bull, Cirrus Infrastructure, Rackspace LTD
# Date: 	March 20 2017
# Use:		This script automates the analysis of webserver logs hitcounts and
# 		provides a breakdown to indicate whether outages are caused by website visits
#		In correlation to memory and load avg figures


# Settings

# What logfile to get stats for
logfile="/var/log/httpd/someweb.biz-access.log"

# What year month and day are we scanning for minute/hour hits
year=2017
month=Mar
day=9

echo "Total HITS: $month"
grep "/$month/$year" "$logfile" | wc -l;

# Hours
for i in 0{0..9} {10..23};

do
hitsperhour=$(grep "$day/$month/$year:$i" "$logfile" | wc -l)
echo "    > $day $month $year, hits this $i hour: $hitsperhour"

        # break down the minutes in a nested visual way thats AWsome

# Minutes
for j in 0{0..9} {10..59};
do
hitsperminute=$(grep "$day/$month/$year:$i:$j" "$logfile" | wc -l);
echo "                  >>hits at $i:$j  $hitsperminute";
done

done

Now it’s pretty leet.. well, simple, but functional. Here is the output of the more nicely refined script; I’m really satisfied with the tabulation.

[root@822616-db1 automation]# ./list-visits.sh
Total HITS: Mar
6019301
    > 9 Mar 2017, hits this  hour: 28793
                  >>hits at 01:01  416
                  >>hits at 01:02  380
                  >>hits at 01:03  417
                  >>hits at 01:04  408
                  >>hits at 01:05  385
^C
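
The hitcount vs memory commit metric mentioned above isn't in the script yet, but here is a minimal sketch of how it could look: every 60 seconds it counts the requests that arrived in the last minute and prints them next to Committed_AS from /proc/meminfo and the 1-minute load average. The log path is the same hypothetical one as in the script, and the loop doesn't handle log rotation.

#!/bin/bash
# Sketch: hits-per-minute vs memory commit and load average

logfile="/var/log/httpd/someweb.biz-access.log"

while true; do
  before=$(wc -l < "$logfile")
  sleep 60
  after=$(wc -l < "$logfile")
  hits=$(( after - before ))                                  # requests in the last minute
  commit=$(awk '/^Committed_AS:/ {print $2}' /proc/meminfo)   # kB of committed memory
  load1=$(awk '{print $1}' /proc/loadavg)                     # 1 minute load average
  echo "$(date '+%d/%b/%Y:%H:%M')  hits/min=$hits  Committed_AS=${commit}kB  load1=$load1"
done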

PHP5 newrelic agent not collecting data

So today I had a New Relic customer who was having issues after installing the New Relic PHP plugin. He couldn’t understand why it wasn’t collecting data. For it to collect data, you need to make sure the newrelic-daemon process is running, which you can check with ps auxfwww | grep newrelic-daemon.

We check whether the daemon process is running

[root@rtd-production-1 ~]# ps -ef | grep newrelic-daemon
root     26007 18914  0 09:59 pts/0    00:00:00 grep newrelic-daemon

We check the status of the daemon process

[root@rtd-production-1 ~]# service newrelic-daemon status
newrelic-daemon is stopped...

Copy the basic New Relic configuration template to the correct location

[root@rtd-production-1 ~]# cp /etc/newrelic/newrelic.cfg.template /etc/newrelic/newrelic.cfg

Start the daemon

[root@rtd-production-1 ~]# service newrelic-daemon start
Starting newrelic-daemon:                                  [  OK  ]
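
After starting the daemon it's worth confirming it stayed up and then restarting the web server, since the PHP agent typically only reports data once PHP has been restarted and some traffic has hit the site. The log path below is the package default, so treat it as an assumption if you've customised newrelic.cfg.

# Confirm the daemon is now running and see what it logged on startup
ps -ef | grep [n]ewrelic-daemon
tail -n 20 /var/log/newrelic/newrelic-daemon.log

# Restart Apache (or php-fpm) so the PHP agent reconnects, then generate some traffic
service httpd restart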

How To Enable GZIP Compression/mod_deflate in Apache

This is something quite easy to do.

Check that the deflate module is installed

# apachectl -M | grep deflate
 deflate_module (shared)
Syntax OK

Check whether content encoding is already enabled by using curl on the site; note that there is no Content-Encoding: gzip header in the response below, so compression is not yet active

# curl -I -H 'Accept-Encoding: gzip,deflate' http://somewebsite.com/
HTTP/1.1 200 OK
Date: Fri, 23 Jan 2017 11:02:16 GMT
Server: Apache
X-Pingback: http://somewebsite.com/somesite.php
X-Powered-By: rabbits
Connection: close
Content-Type: text/html; charset=UTF-8

Create a file called /etc/httpd/conf.d/deflate.conf

<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript application/javascript
    DeflateCompressionLevel 8
</IfModule>
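
Before reloading, it doesn't hurt to confirm that the new config parses cleanly:

# apachectl -t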

Reload Apache

/etc/init.d/httpd reload
Reloading httpd:

Retest the Content-Encoding with curl, again sending the Accept-Encoding: gzip,deflate header.


# curl -I -H 'Accept-Encoding: gzip,deflate' http://somesite.com/
HTTP/1.1 200 OK
Date: Fri, 25 Sep 2015 00:04:26 GMT
Server: Apache
X-Pingback: http://somesite.com/somesite.php
X-Powered-By: muppets
Vary: Accept-Encoding
Content-Encoding: gzip    <---- this line indicates that compression is active
Content-Length: 20
Connection: close
Content-Type: text/html; charset=UTF-8
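
If you want to quantify the saving rather than just spotting the header, curl can report how many bytes actually came over the wire with and without compression (the URL is the same placeholder as above):

# Bytes downloaded without compression
curl -so /dev/null -w '%{size_download}\n' http://somesite.com/

# Bytes downloaded with gzip negotiated
curl -so /dev/null -w '%{size_download}\n' -H 'Accept-Encoding: gzip,deflate' http://somesite.com/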

For this, thanks to odin.com

Checking Webserver Logs and generating hit counts

So, I’ve been meaning to do this for a while. I’m sure you’re all familiar with the crappy one-liner where you check the hits against a webserver log with wc -l for a given hour, and use for i in $(seq ...) or similar to get the last 10 or 59 minutes of data. Getting whole hours and days is a bit harder, though: you need some additional nesting, but it isn’t difficult to do at all!

for i in 0{1..9}; do echo "16/Jan/2017:06:$i"; grep "16/Jan/2017:06:$i" /var/log/httpd/somesite.com-access_log | wc -l ; done

I improved this one quite significantly by adding an additional for j loop for the hour, and putting a 0 in front of {1..9} so it properly matches the Apache log format; that lets me increment through all the hours of the day up to the present ;-D All that is missing is some error checking for what the last date in the file is; I’m thinking a tail and a grep for the timestamp at the end of the log should be sufficient.

Here is the proud oneliner I made for this!

for j in 0{0..9} {10..23}; do for i in 0{0..9} {10..59}; do echo "16/Jan/2017:$j:$i"; grep "16/Jan/2017:$j:$i" /var/log/httpd/website.com-access_log | wc -l ; done; done

Tracing Down Network and Process Traffic Using Netfilter

Every now and then at Rackspace, as with any hosting provider, we do occasionally have issues where customers have left themselves open to attack. In such cases customers sometimes find their server is sending spam email, or is hosting other malware that affects the Rackspace network.

Due to the AUP and other obligations, this can become a critical issue for both the uptime and the reputation of your site. In many cases customers do not have forensic experience and will struggle very hard to remove the malware. In some cases the malware keeps coming back, or, as in my customer’s case, you can still see lots of extra network traffic with tcpdump locally on the box.

Enter netfilter, part of the Linux kernel. With a SystemTap probe on its hooks it can, if you ask it, track down where packets are coming from at the process level. This is really handy if you have an active malware or spam process on your system, since you can find out exactly which process is responsible before doing more investigation. It also lets you trace down any potential false positives, since the packet’s destination address is always included, so you get a really nice overview.

To give you an idea, I needed to install a kernel debuginfo package just to do this troubleshooting; whether that is necessary depends on your distribution.

Updating your Kernel may be necessary to use netfilter debug

$ yum history info 18

Transaction performed with:
    Installed     rpm-4.11.3-21.el7.x86_64                               @base
    Installed     yum-3.4.3-150.el7.centos.noarch                        @base
    Installed     yum-plugin-auto-update-debug-info-1.1.31-40.el7.noarch @base
    Installed     yum-plugin-fastestmirror-1.1.31-40.el7.noarch          @base
Packages Altered:
    Updated kernel-debuginfo-4.4.40-202.el7.centos.x86_64               @base-debuginfo
    Update                   4.4.42-202.el7.centos.x86_64               @base-debuginfo
    Updated kernel-debuginfo-common-x86_64-4.4.40-202.el7.centos.x86_64 @base-debuginfo
    Update                                 4.4.42-202.el7.centos.x86_64 @base-debuginfo

You could use a similar process using netfilter.ip.local_in, I suspect.

The Script

#! /usr/bin/env stap

# Print a trace of threads sending IP packets (UDP or TCP) to a given
# destination port and/or address.  Default is unfiltered.

global the_dport = 0    # override with -G the_dport=53
global the_daddr = ""   # override with -G the_daddr=127.0.0.1

probe netfilter.ip.local_out {
    if ((the_dport == 0 || the_dport == dport) &&
        (the_daddr == "" || the_daddr == daddr))
            printf("%s[%d] sent packet to %s:%d\n", execname(), tid(), daddr, dport)
}

Executing the Script

[root@pirax-test-new hacked]# chmod +x dns_probe.sh
[root@pirax-test-new hacked]# ./dns_probe.sh
Missing separate debuginfos, use: debuginfo-install kernel-3.10.0-514.2.2.el7.x86_64
swapper/3[0] sent packet to 78.136.44.6:0
sshd[25421] sent packet to 134.1.1.1:55336
sshd[25421] sent packet to 134.1.1.1:55336
swapper/3[0] sent packet to 78.136.44.6:0

I was a little bit concerned about the above output; it looks like swapper/3, the kernel’s idle task for CPU 3, is sending packets, which it wouldn’t normally do. Upon further inspection, though, we find it is just the outgoing cloud monitoring call:

# nslookup 78.136.44.6
Server:		83.138.151.81
Address:	83.138.151.81#53

Non-authoritative answer:
6.44.136.78.in-addr.arpa	name = collector-lon-78-136-44-6.monitoring.rackspacecloud.com.

Authoritative answers can be found from:
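
To narrow a future run down to just the suspicious traffic, the globals at the top of the script can be overridden from the command line with SystemTap's -G option; for example, watching only outbound SMTP (usually what a spam process is busy with), or only one suspect destination address. Both values below are examples, not from the incident above.

# Only report packets destined for port 25 (outbound mail)
stap -G the_dport=25 dns_probe.sh

# Only report packets headed to one suspect address
stap -G the_daddr=198.51.100.23 dns_probe.sh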

Troubleshooting Akamai CDN using Pragma Headers

Rackspace Cloud Files CDN-enabled containers and the Rackspace CDN product can occasionally have issues. In such cases you can troubleshoot a lot more easily by using Akamai’s debug Pragma headers. To do this, add -D - and the following -H header to your curl request:

curl -I http://rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com/common/test/generator.png -D - -H "Pragma: akamai-x-get-client-ip, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-check-cacheable, akamai-x-get-cache-key, akamai-x-get-extracted-values, akamai-x-get-nonces, akamai-x-get-ssl-client-session-id, akamai-x-get-true-cache-key, akamai-x-serial-no, akamai-x-feo-trace, akamai-x-get-request-id" -L

Naturally, for this to work you need a test URL. You can get one from Cloud Files in the Rackspace Control Panel, or from whatever origin is configured behind the CDN. Just log in to the origin, find a file path like httpdocs/somefolder/somefile.png, then take the rackcdn.com URL from the Rackspace CDN product page, or from the Cloud Files ‘show all links’ option on the cog icon next to the container, and append the path relative to your documentroot.

So, for the CDN URL rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com, you just add the ‘local’ document path of the file within httpdocs or your www folder on the origin.

i.e. rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com becomes rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com/somefolder/somefile.png
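
Once you have that URL, the useful part of the response is the set of extra debug headers the edge sends back when it honours the Pragma values, such as X-Cache, X-Check-Cacheable and X-Cache-Key, so it can help to filter for just those. The header names here follow Akamai's usual debug convention and may vary, so treat the grep pattern as an assumption.

curl -s -I -L \
 -H "Pragma: akamai-x-cache-on, akamai-x-check-cacheable, akamai-x-get-cache-key, akamai-x-get-true-cache-key" \
 http://rackspacecdnurlgoeshereprivatecensored.r17.cf3.rackcdn.com/somefolder/somefile.png \
 | grep -iE '^x-(cache|check|true)'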

Creating a proper Method of Retrieving, Sorting, and Parsing Rackspace CDN Access Logs

So, this has been rather a bane on the life which is lived as Adam Bull. Basically, a large customer of ours had 50+ CDNs and literally hundreds of gigabytes of log files. They were all in Rackspace Cloud Files, and the big question was ‘how do I know how busy my CDN is?’.


This is a remarkably good question, because actually not many tools are provided here, and the customer will, much like on many other CDN services, have to download those logs and then process them. But that isn’t easy either, and I spent a good few weeks (albeit when I had time) trying to figure out the best way to do this. I dabbled with using tree to display the most commonly used logs, I played with piwik, awstats, and many others such as goaccess, all to no avail, and even used a sophisticated AWK script from our good friends in Operations. No luck, nothing, do not pass go, or collect $200. So, I was forced to write something to try and achieve this, from start to finish. There are 3 problems to solve:

1) how to easily obtain .CDN_ACCESS_LOGS from Rackspace Cloud Files to Cloud Server (or remote).
2) how to easily process these logs, in which format.
3) how to easily present these logs, using which application.

The first challenge was actually retrieving the files.

swiftly --verbose --eventlet --concurrency=100 get .CDN_ACCESS_LOGS --all-objects -o ./

Naturally, to perform the step above you will need a working swiftly environment already set up. If you don’t know what swiftly is, or how to set up a swiftly environment, please see the article I wrote on deleting all files with swiftly (the howto explains the environment setup first; just don’t follow that article to the end, and continue from here once you’ve set up and installed swiftly).

For more info see:
https://community.rackspace.com/products/f/25/t/7190

Processing the Rackspace CDN Logs that we’ve downloaded, and organising them for further log processing
This required a lot more effort, and thought

The below script sits in the same folder as all of the containers

# ls -al 
total 196
drwxrwxr-x 36 root root  4096 Nov  7 12:33 .
drwxr-xr-x  6 root root  4096 Nov  7 12:06 ..
# used by my script
-rw-rw-r--  1 root root  1128 Nov  7 12:06 alldirs.txt

# CDN Log File containers as we downloaded them from swiftly Rackspace Cloud Files (.CDN_ACCESS_LOGS)
drwxrwxr-x  3 root root  4096 Oct 19 11:22 dev.demo.video.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:22 europe.assets.lon.tv
drwxrwxr-x  5 root root  4096 Oct 19 11:22 files.lon.cdn.lon.com
drwxrwxr-x  3 root root  4096 Oct 19 11:23 files.blah.cdn..com
drwxrwxr-x  5 root root  4096 Oct 19 11:24 files.demo.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:25 files.invesco.cdn..com
drwxrwxr-x  3 root root  4096 Oct 19 11:25 files.test.cdn..com
-rw-r--r--  1 root root   561 Nov  7 12:02 generate-report.sh
-rwxr-xr-x  1 root root  1414 Nov  7 12:15 logparser.sh

# Used by my script
drwxr-xr-x  2 root root  4096 Nov  7 12:06 parsed
drwxr-xr-x  2 root root  4096 Nov  7 12:33 parsed-combined

The logparser.sh script itself:

#!/bin/bash

# Author : Adam Bull
# Title: Rackspace CDN Log Parser
# Date: November 7th 2016

echo "Deleting previous jobs"
rm -rf parsed;
rm -rf parsed-combined

ls -ld */ | awk '{print $9}' | grep -v parsed > alldirs.txt


# Create Location for Combined File Listing for CDN LOGS
mkdir parsed

# Create Location for combined CDN or ACCESS LOGS
mkdir parsed-combined

# This just builds a list of the CDN Access Logs
echo "Building list of Downloaded .CDN_ACCESS_LOG Files"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
echo $folder
        echo "$m" | xargs -i find ./{} -type f -print > "parsed/$folder.log"
done < alldirs.txt

# This part cats the files and uses xargs to produce all the log output, before cut processing and redirecting to parsed-combined/$folder
echo "Combining .CDN_ACCESS_LOG Files for bulk processing and converting into NCSA format"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
cat "parsed/$folder.log" | xargs -i zcat {} | cut -d' ' -f1-10  > "parsed-combined/$folder"
done < alldirs.txt


# This part processes the Log files with Goaccess, generating HTML reports
echo "Generating Goaccess HTML Logs"
sleep 3
while read m; do
folder=$(echo "$m" | sed 's@/@@g')
goaccess -f "parsed-combined/$folder" -a -o "/var/www/html/$folder.html"
done < alldirs.txt

How to easily present these logs

I kind of deceived you with the last step, because the script above already does the presentation. You will, though, naturally need an httpd installed with a documentroot at /var/www/html for the HTML reports to land in, so make sure you install Apache:

yum install httpd awstats
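
The script also relies on goaccess itself being installed; on CentOS that normally comes from EPEL rather than the base repos, so the below is an assumption depending on your repo setup.

yum install epel-release
yum install goaccess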

De de de de de de da! da da!

(screenshot: the resulting goaccess HTML report)

Some little caveats:

Generating a master index.html file of all the sites


[root@cdn-log-parser-mother html]# pwd
/var/www/html
[root@cdn-log-parser-mother html]# ls -al | awk '{print $9}' | xargs -i echo "<a href=\"{}\">{}</a><br>" > index.html

I will expand the script to generate this automatically soon, but for now I’m leaving it like this due to time constraints.

Taking strace output from stderr and piping to other utilities

Well, this is a strange thing to do, but say you want to know how fast an application is processing data. How do you tell? Enter strace, plus a bit of wc -l, with the assistance of tee and some proper 2>&1 redirection.

strace -p 9653 2>&1 | tee >(wc -l)

where 9653 is the process id (pid) and wc -l is the command you want to pipe to!

read(4, "2.74.2.34 - - [26/05/2015:15:15"..., 8192) = 8192
read(4, "o) Version/7.1.6 Safari/537.85.1"..., 8192) = 8192
read(4, "ident/6.0)\"\n91.233.154.8 - - [26"..., 8192) = 8192
^C1290

1290 lines in the output.. perfect, that’s what I wanted to know: roughly how quickly this log parser is going through my logs 😀
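
If you want an actual rate rather than a rough feel, you can bound the capture with timeout and count only the read() calls, which gives a per-minute figure for the same placeholder pid, give or take the attach/detach lines strace prints. Running strace -c -p 9653 over the same window would give a per-syscall summary instead.

# read() syscalls made in a fixed 60 second window, i.e. reads per minute
timeout 60 strace -p 9653 -e trace=read 2>&1 | wc -l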