Rapid Troubleshooting mail not arriving at its destination

Today I had a customer whose mail was not arriving at it’s destination. IN my customers case, they knew that it was arriving at the destination but going into the spam folder. This is likely due to blacklisting, however some ISP or clients dont know it’s reaching the spam folder at the other end, and since the most common cause of this happening is due to a missing SPF (Sending policy framework record), MX record or DKIM record it is possibly to rapidly check the DNS of each using dig, if the sender domain is known.

To check for the IP Blacklistings on a mailserver use it’s ip in one of the many spam a checker;

https://mxtoolbox.com/SuperTool.aspx?action=blacklist

In other cases it might be being caused by email bounces, due to the PTR, MX or DKIM records, and not even getting into the inbox, you can see that on the sending mail server using a simple grep command;

cat /var/log/maillog | grep -i status=bounced

You probably want to save the file as well

cat /var/log/maillog | grep -i status=bounced > bouncedemail.txt 

If you wanted to know which domains bounced email, if you’ve ensured all sending domains are correctly configured via DNS..

[root@api ~]# cat /var/log/maillog | grep -i status=bounced | awk '{print $7}'
to=<[email protected]>,
to=<[email protected]>,

You could use sed to extract which domains failed..

Then you could use whois against the domains to reach out to the email contact with some automation that explains that ‘weve checked DKIM, MX and SPF and all are configured correctly and believe this is an error on your behalf, etc, blah blah’..

Such a thing would be a good idea to implement for some large email providers, and I’m sure you could automate the DNS checking as well. You could likely automate the whole thing, just by watching the logs and DNS records, and some intelligent grep and awking.

Not bad.

Calculating the Average Hits per minute en-mass for thousands of sites

So, I had a customer having some major MySQL woes, and I wanted to know whether the MySQL issues were query related, as in due to the frequency of queries alone, or the size of the database. VS it being caused by the number of visitors coming into apache, therefore causing more frequency of MySQL hits, and explaining the higher CPU usage.

The best way to achieve this is to inspect /var/log/httpd with ls -al,

First we take a sample of all of the requests coming into apache2, as in all of them.. provided the customer has used proper naming conventions this isn’t a nightmare. Apache is designed to make this easy for you by the way it is setup by default, hurrah!

[root@box-DB1 logparser]# time tail -f /var/log/httpd/*access_log > allhitsnow
^C

real	0m44.560s
user	0m0.006s
sys	0m0.031s

Time command prefixed here, will tell you how long you ran it for.

[root@box-DB1 logparser]# cat allhitsnow | wc -l
1590

The above command shows you the number of lines in allhitsnow file, which was written to with all the new requests coming into sites from all the site log files. Simples! 1590 queries a minute is quite a lot.

Lysncd fails with Error: Terminating since out of inotify watches.

This is a remarkably easy problem to solve.

From man:

The inotify API provides a mechanism for monitoring filesystem
       events.  Inotify can be used to monitor individual files, or to
       monitor directories.  When a directory is monitored, inotify will
       return events for the directory itself, and for files inside the
       directory.

The problem is a kernel based value, that determines how much memory is used by processes such as lsyncd. Lets just double check what value it is set to presently:

Check Inotify max_user_watches value presently

[root@server php]# cat /proc/sys/fs/inotify/max_user_watches
100000

Check the amount of available memory on the server to ensure it can deal with the increase

[root@server-new php]# free -m
             total       used       free     shared    buffers     cached
Mem:          7358       6577        781          1        376       2427
-/+ buffers/cache:       3773       3584
Swap:            0          0          0

The cache can be added to free, as a maximum saturation. Linux uses lots of memory it doesn’t really need for efficiency.

Each used inotify watch uses 1 kB RAM in 64 bit systems, for 32bit its half.

Increase Inotify watches to a higher value

sysctl fs.inotify.max_user_watches=200000