Server Monitoring With vmstat – Security Engineer Notes

You will learn the basics of vmstat server monitoring, scripting and redirection in this article.

What is vmstat, and why would you want to use it for server monitoring?
How to use vmstat for server monitoring
How to timestamp vmstat output (and other commands besides vmstat too)
How to create a script that is easier to type than complicated commands
How to run a shell script
How to save data into a file
Showing Memory Pages (useless info)
Understanding vmstat
Chart that explains the columns

1. What is vmstat?

vmstat is a tool that tells you quite a bit about the actual performance of your machine: disk I/O, CPU, memory, wait states and more. It is a good measure of how your Linux machine is handling load. Specifically, it helps diagnose “why” a computer is slow by identifying “what” is bogged down. Usually you run vmstat for more than a few iterations and ignore the first one, because the first line reports averages since the last boot rather than current activity. Check Point has most of the really good tools removed, but this is one that they let us have. It’s on every other flavor of Linux I’ve tested too, so it’s a good idea to know how to use it.

2. How to use vmstat for server monitoring:

You run it by typing vmstat followed by the delay, in seconds, between updates:


[Expert@firewall]# vmstat 1
procs                      memory      swap          io     system         cpu
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
1  0      0 2750352 101940 773080    0    0     0     1    0     1  0  0  0  0
0  0      0 2750352 101940 773080    0    0     0     0 3986    43  0  0  0 100
0  0      0 2750352 101940 773080    0    0     0    56 3980    57  0  0  0 100
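
The first data row reports averages since boot, so when post-processing the output it can help to drop it. A minimal sketch, assuming the two header lines come first as in the output above; drop_boot_sample is a hypothetical helper name, not part of vmstat:

```shell
# Hypothetical helper: remove the first data row (line 3 of the output),
# which holds averages since boot rather than a live sample.
drop_boot_sample() {
    awk 'NR != 3'
}

# Usage, assuming vmstat is installed (1-second delay, 5 samples, then exit):
#   vmstat 1 5 | drop_boot_sample
```

The second number is a sample count: vmstat stops after that many updates instead of running until you press Ctrl-C.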

3. How to Timestamp vmstat (or other commands):

However, this doesn’t do much good if you need to see the exact time of each sample. You can use awk and strftime to timestamp each line with the following command (strftime is a gawk extension, so it may be missing from other awk implementations):


[Expert@firewall]# vmstat 1 |awk '{now=strftime("%Y-%m-%d %T "); print now $0}'
2010-04-23 09:59:53 procs                      memory      swap          io     system         cpu
2010-04-23 09:59:53  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
2010-04-23 09:59:53  1  0      0 2750344 101940 773080    0    0     0     1    1     1  0  0  0  0
2010-04-23 09:59:54  0  0      0 2750344 101940 773080    0    0     0    68 3930    64  0  0  0 100
2010-04-23 09:59:55  0  0      0 2750344 101940 773080    0    0     0     0 4354    37  0  0  0 100
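
If the awk on your system lacks strftime, a plain shell loop around date gives the same kind of output. A minimal sketch; stamp_lines is a hypothetical helper name:

```shell
# Hypothetical helper: prefix every line read from stdin with the current time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Usage, assuming vmstat is installed:
#   vmstat 1 | stamp_lines
```

This starts one date process per line, which is fine at one line per second but slower than the awk version for high-volume output.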

4. An Easy way to Remember Difficult Commands:

But that’s kind of awkward to remember, so put it in a script:


[Expert@firewall]# cat logvmstat.sh
#!/bin/bash
vmstat 1 | awk '{now=strftime("%Y-%m-%d %T "); print now $0}'
[Expert@firewall]#

5. How to Run a Script:

Now you can use sh to run it:


[Expert@firewall]# sh logvmstat.sh
2010-04-23 10:00:04 procs                      memory      swap          io     system         cpu
2010-04-23 10:00:04  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy wa id
2010-04-23 10:00:04  1  0      0 2750344 101940 773080    0    0     0     1    0     1  0  0  0  0
2010-04-23 10:00:05  0  0      0 2750336 101940 773080    0    0     0     0 4303    45  0  0  0 100
2010-04-23 10:00:06  0  0      0 2750336 101940 773080    0    0     0     0 3800    48  0  0  0 100
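
Because the script starts with a #!/bin/bash line, you can also mark it executable once and then run it by path instead of passing it to sh. The sketch below uses a throwaway stand-in script (hello.sh in a temporary directory, both invented for illustration) so it runs anywhere; with the real script the two key commands would be chmod +x logvmstat.sh and ./logvmstat.sh.

```shell
# Stand-in for logvmstat.sh so the example does not need vmstat.
demo_dir=$(mktemp -d)
cat > "$demo_dir/hello.sh" <<'EOF'
#!/bin/sh
echo "script ran"
EOF

chmod +x "$demo_dir/hello.sh"        # one-time: set the executable bit
run_output=$("$demo_dir/hello.sh")   # run by path; the #! line picks the shell
echo "$run_output"

rm -rf "$demo_dir"                   # clean up the temporary directory
```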

6. How to Save Data to a File

This is somewhat basic, but it is handy when you want to keep a written log of what your test is showing you. Append the output to a file with >>, using either the full command or the script:


[Expert@firewall]# vmstat 1 |awk '{now=strftime("%Y-%m-%d %T "); print now $0}' >> somelog.log


[Expert@firewall]# sh logvmstat.sh >> somelog.log

Now you can review it after your testing is done.
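
If you want to watch the samples on screen while they are being logged, tee -a appends to the file and still prints to the terminal. A minimal sketch; log_and_show is a hypothetical helper name and the log path is whatever you pass in:

```shell
# Hypothetical helper: append stdin to the file given as $1 and echo it too.
log_and_show() {
    tee -a "$1"
}

# Usage, assuming logvmstat.sh is in the current directory:
#   sh logvmstat.sh | log_and_show somelog.log
```

Both >> and tee -a append, so repeated test runs accumulate in the same file; a single > would overwrite it instead.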

7. Show Memory Pages … useless

Want to review memory and forgot about free? The -a option shows inactive and active memory in place of buff and cache. I only include this so you know what people are talking about when they mention the other fields (swpd, free, inact, active).


[Expert@firewall]# vmstat -a 3
procs                      memory      swap          io     system         cpu
r  b   swpd   free  inact active   si   so    bi    bo   in    cs us sy wa id
0  0      0 2750592 281160 665664    0    0     0     1    0     1  0  0  0  0
0  0      0 2750592 281160 665664    0    0     0    25 4720    80  0  0  0 100
0  0      0 2750592 281160 665664    0    0     0     6 4117    41  0  0  0 99
0  0      0 2750592 281160 665664    0    0     0     0 4154    40  0  0  0 100

8. But what does it all mean?

Googling “vmstat tutorial” gave me a handy chart that I put at the bottom. But here is a summary of what I’d look for on a slow machine:

CPU slow1:: r has numbers in it constantly: threads/tasks are waiting to be processed by your overworked CPU
CPU slow2:: in is high: you are handling too many interrupts (likely from disk or network activity, but it could be a bad driver)
Processes:: us or sy is high? Some process is being a CPU hog. Use top -n 1 to find it, and kill -9 the PID if needed
Disk Subsystem Overloaded:: wa is high? If you are waiting for I/O, your disk subsystem is the bottleneck and may need an upgrade
Not Enough RAM:: si and so are high: the machine is swapping to disk too much. You really shouldn’t swap at all for high performance. If these are high, in will be high too. Upgrade your RAM.
Context Switching:: cs is high? The kernel is constantly switching the CPU between tasks. This is not a direct sign of low memory (swapping shows up in si and so). It usually points to too many busy processes or threads, a heavy interrupt load, or other issues such as damaged hardware or pitiful software.
Out of Memory:: I ignore free, inact, active because they are not as useful as understanding the actual reasons. I.e., if you are out of memory, you’ll know that, but unless you look at si, so, etc. you won’t know why. So it’s redundant.
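
The checks above can be sketched as a small filter over plain (not timestamped) vmstat output. It assumes the 16-column layout shown in this article (r in column 1, si and so in columns 7 and 8, wa in column 15); other vmstat versions order the cpu columns differently. flag_slow and its thresholds are hypothetical, chosen only for illustration:

```shell
# Hypothetical helper: print a warning for each sample that looks unhealthy.
#   $1 = largest acceptable run queue (default 4, an arbitrary example value)
#   $2 = largest acceptable I/O wait percentage (default 20, also arbitrary)
flag_slow() {
    awk -v max_r="${1:-4}" -v max_wa="${2:-20}" '
        $1 !~ /^[0-9]+$/ { next }          # skip the two header lines
        {
            msg = ""
            if ($1 + 0 > max_r + 0)    msg = msg " run-queue"
            if ($7 > 0 || $8 > 0)      msg = msg " swapping"
            if ($15 + 0 > max_wa + 0)  msg = msg " io-wait"
            if (msg != "") print "WARN:" msg " | " $0
        }'
}

# Usage, assuming vmstat is installed:
#   vmstat 1 | flag_slow 4 20
```

A reasonable starting point for the run-queue threshold is the number of CPU cores, since a queue that stays above it means tasks are waiting for a CPU.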

9. Here is the chart that explains all of the columns.

I don’t use them except for what I listed above though.
The ‘procs’ field has 2 columns:

r – The number of processes waiting for run time.
b – The number of processes in uninterruptible sleep (blocked processes).

The ‘memory’ field has 4 columns (inact and active appear with vmstat -a; the default view shows buff and cache instead):

swpd – The amount of swap space (virtual memory) used.
free – The amount of idle memory (free RAM).
inact – The amount of inactive memory.
active – The amount of active memory.

The ‘swap’ field has 2 columns:

si – Amount of memory swapped in from disk (/s).
so – Amount of memory swapped to disk (/s).

The ‘io’ field has 2 columns:

bi – Blocks received from a block device (blocks in).
bo – Blocks sent to a block device (blocks out).

The ‘system’ field has 2 columns:

in – The number of interrupts per second, including the clock (System interrupts).
cs – The number of context switches per second (Process context switches).

The ‘cpu’ field has 4 columns (percentages of total CPU time):

us – Time spent running non-kernel code (user time, including nice time).
sy – Time spent running kernel code (system time).
id – Time spent idle.
wa – Time spent waiting for I/O.
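
Since column order differs between vmstat versions, one option is to pick columns by name from the header row rather than by position. A minimal sketch; pick_cols is a hypothetical helper, and it expects plain vmstat output whose column-name row starts with r:

```shell
# Hypothetical helper: print the named columns as comma-separated values.
#   $1 = comma-separated column names, for example "r,us,sy,wa,id"
pick_cols() {
    awk -v want="$1" '
        BEGIN { n = split(want, names, ",") }
        $1 == "r" {                        # column-name header row
            for (i = 1; i <= NF; i++) idx[$i] = i
            next
        }
        $1 ~ /^[0-9]+$/ {                  # data rows only
            line = ""
            for (j = 1; j <= n; j++) {
                # unknown column names are reported as NA
                val = (names[j] in idx) ? $(idx[names[j]]) : "NA"
                line = line (j > 1 ? "," : "") val
            }
            print line
        }'
}

# Usage, assuming vmstat is installed:
#   vmstat 1 5 | pick_cols "r,us,sy,wa,id"
```

The comma-separated output can be redirected to a file and opened in a spreadsheet for graphing.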

2 replies on “Server Monitoring With vmstat”

Cool! writeup on vmstat… i liked it.. Now i know what to look for if my system is slow… 🙂
I am looking for a shell script (Bourne shell sh) that will collect vmstat logs from App,DB and Web servers and download that into a xls. xls should automatically plot CPU utilization graphs on each server.. I am not good at scripting.. so please help me if possible..
-Lambzee

This is only the tip of the iceberg. Ideally you would run other tools once you find out what is causing problems. vmstat won’t give all of the details you need.