So, a customer is experiencing slowness/sluggishness in their app. Instinct tells you there is no issue with the hypervisor, but instinct isn’t enough. Tools like xentop, sar, and bwm-ng are critical for both live and historical troubleshooting.
Sar can tell you a story if you ask the storyteller the right questions or, even better, pick up the book and read it properly. By paying attention to this data and knowing which things to check under which circumstances, you’ll understand the plot, the scenario, the situation, and exactly how to proceed with troubleshooting.
This article doesn’t go into that in depth, but it gives you a good reference for a variety of checks, the most important being CPU usage, I/O usage, network usage, and load averages.
CPU Usage of all processors
# Grab details live
sar -u 1 3
# Use a historical binary sar file
# sa10 means the 10th day of the current month.
sar -u -f /var/log/sa/sa10
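As a small convenience, the historical file path can be built from a day number. This is a minimal sketch: `sa_file` is a hypothetical helper name, and it assumes the `/var/log/sa/saDD` layout shown above (some distributions keep these files elsewhere, for example under `/var/log/sysstat`).

```shell
# Hypothetical helper: print the sar data file path for a day of the month.
# Assumes the /var/log/sa/saDD layout used in this article.
sa_file() {
    # Strip one leading zero so "08" is not parsed as an octal number.
    printf '/var/log/sa/sa%02d\n' "${1#0}"
}

# Usage (requires sysstat data for that day):
#   sar -u -f "$(sa_file 10)"
```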
CPU Usage of a particular Processor
sar -P ALL 1 1
‘-P ALL’ reports every core individually. ‘-P 1’ means check only the 2nd core (core numbers start from 0).
sar -P 1 1 5
The above command displays real-time CPU usage for core number 1 every 1 second, 5 times.
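When there are many cores, it can help to pick out the one with the least idle time. The following is a sketch only: `busiest_core` is a hypothetical function name, and it assumes the report has `CPU` and `%idle` header columns, which it locates by name because column layouts vary between sysstat versions.

```shell
# Hypothetical helper: read `sar -P ALL` output on stdin and print the
# core with the lowest %idle value.
busiest_core() {
    awk '
        # Ignore the trailing "Average:" summary block.
        $1 == "Average:" { next }
        # Header line: remember where the CPU and %idle columns are.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") c = i
                if ($i == "%idle") d = i
            }
            next
        }
        # Data rows for individual cores (skip the aggregate "all" row).
        c && d && NF >= d && $c != "all" {
            if (!seen || $d + 0 < min) { min = $d + 0; core = $c; seen = 1 }
        }
        END { if (seen) printf "core %s idle %.2f\n", core, min }
    '
}

# Usage (requires sysstat):
#   LC_ALL=C sar -P ALL 1 1 | busiest_core
```

Setting `LC_ALL=C` keeps the decimal separator as a dot so that awk compares the values numerically.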
Observing Changes in Memory over time
sar -r 1 3
The above command provides memory stats every 1 second for a total of 3 times.
Observing Swap usage over time
sar -S 1 5
The above command reports swap statistics every 1 second, a total of 5 times.
Overall I/O activity
sar -b 1 3
The above command reports overall I/O and transfer rates (tps, rtps, wtps, bread/s, bwrtn/s) every 1 second, 3 times.
Individual Block Device I/O Activities
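This is a useful check for LUNs, block devices, and other specific mounts.
sar -d 1 1
sar -p -d 1 1
DEV – indicates the block device. With -p the names are pretty-printed, e.g. sda, sda1, sdb1; without it, many sysstat versions show them as devM-n (major and minor numbers).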
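To spot the most heavily used device in a long report, the `%util` column can be scanned. This is a sketch under assumptions: `busiest_device` is a hypothetical name, and the columns `DEV` and `%util` are found by header name because their position differs between sysstat versions.

```shell
# Hypothetical helper: read `sar -d` output on stdin and print the
# device with the highest %util value.
busiest_device() {
    awk '
        # Ignore the trailing "Average:" summary block.
        $1 == "Average:" { next }
        # Header line: locate the DEV and %util columns by name.
        /%util/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "DEV") n = i
                if ($i == "%util") u = i
            }
            next
        }
        # Data rows: keep the device with the largest %util seen so far.
        n && u && NF >= n && NF >= u {
            if (!seen || $u + 0 > max) { max = $u + 0; dev = $n; seen = 1 }
        }
        END { if (seen) printf "%s %.2f\n", dev, max }
    '
}

# Usage (requires sysstat):
#   LC_ALL=C sar -p -d 1 3 | busiest_device
```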
Processes created per second / Context switches
sar -w 1 3
Run Queue and Load Average
sar -q 1 3
This reports the run queue size and the load average over the last 1 minute, 5 minutes, and 15 minutes. “1 3” reports every 1 second, a total of 3 times.
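A common rule of thumb is to compare the 1-minute load average with the number of CPUs: a load that stays above the CPU count suggests tasks are queueing (on Linux this includes tasks blocked on I/O). The sketch below flags such samples; `overloaded_samples` is a hypothetical name, and the threshold is only a heuristic.

```shell
# Hypothetical helper: read `sar -q` output on stdin and print the
# timestamp and ldavg-1 of every sample whose 1-minute load average
# exceeds the CPU count given as $1.
overloaded_samples() {
    awk -v ncpu="$1" '
        # Ignore the trailing "Average:" summary block.
        $1 == "Average:" { next }
        # Header line: locate the ldavg-1 column by name.
        /ldavg-1/ {
            for (i = 1; i <= NF; i++) if ($i == "ldavg-1") col = i
            next
        }
        # Data rows: print those above the threshold.
        col && NF >= col && $col + 0 > ncpu + 0 { print $1, $col }
    '
}

# Usage (requires sysstat; nproc prints the CPU count):
#   LC_ALL=C sar -q -f /var/log/sa/sa11 | overloaded_samples "$(nproc)"
```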
Report Network Statistics
sar -n KEYWORD
DEV – Displays vital statistics for network devices such as eth0, eth1, etc.
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.
sar -n DEV 1 1
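Link speeds are usually quoted in megabits per second, so it can be handy to convert the per-interface throughput. This is a sketch under assumptions: `iface_mbits` is a hypothetical name, the report is assumed to carry `rxkB/s` and `txkB/s` columns (older sysstat releases use different column names), and a kB is taken to be 1024 bytes.

```shell
# Hypothetical helper: read `sar -n DEV` output on stdin and print the
# receive and transmit rate of each interface in Mbit/s.
iface_mbits() {
    awk '
        # Ignore the trailing "Average:" summary block.
        $1 == "Average:" { next }
        # Header line: locate the columns we need by name.
        /IFACE/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "IFACE") n = i
                if ($i == "rxkB/s") r = i
                if ($i == "txkB/s") t = i
            }
            next
        }
        # Data rows: kB/s * 1024 bytes * 8 bits / 1,000,000 = Mbit/s.
        n && r && t && NF >= n && NF >= r && NF >= t {
            printf "%s rx %.2f tx %.2f\n", $n, $r * 8192 / 1000000, $t * 8192 / 1000000
        }
    '
}

# Usage (requires sysstat):
#   LC_ALL=C sar -n DEV 1 1 | iface_mbits
```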
Specify Start Time
sar -q -f /var/log/sa/sa11 -s 11:00:00
sar -q -f /var/log/sa/sa11 -s 11:00:00 | head -n 10
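sar also accepts -e to set an end time, so a report can be limited to a window around an incident. The sketch below builds both arguments from an incident time and a margin in minutes; `sar_window_args` is a hypothetical helper name, and the window is clamped to a single day.

```shell
# Hypothetical helper: print "-s START -e END" covering $2 minutes on
# either side of the incident time $1 (HH:MM:SS).
sar_window_args() {
    echo "$1 $2" | awk '{
        split($1, t, ":")
        mid = t[1] * 3600 + t[2] * 60 + t[3]
        # Clamp the window to 00:00:00 .. 23:59:59.
        s = mid - $2 * 60; if (s < 0) s = 0
        e = mid + $2 * 60; if (e > 86399) e = 86399
        printf "-s %02d:%02d:%02d -e %02d:%02d:%02d\n",
            int(s / 3600), int(s % 3600 / 60), s % 60,
            int(e / 3600), int(e % 3600 / 60), e % 60
    }'
}

# Usage (requires sysstat; left unquoted on purpose so the arguments split):
#   sar -q -f /var/log/sa/sa11 $(sar_window_args 11:00:00 15)
```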