Introduction to iostat, vmstat and netstat

This document is written primarily with reference to Solaris performance monitoring and tuning, but these tools are available in other Unix variants as well, with slight syntax differences.

iostat, vmstat and netstat are the three most commonly used tools for performance monitoring. They come built in with the operating system and are easy to use. iostat stands for input/output statistics and reports statistics for I/O devices such as disk drives. vmstat gives statistics for virtual memory, and netstat gives network statistics.

The following paragraphs describe these tools and their usage for performance monitoring. If you need more information, there are some very good Solaris performance monitoring books available at www.besttechbooks.com.

Input Output statistics ( iostat )

iostat reports terminal and disk I/O activity and CPU utilization. The first line of output covers the period since boot, and each subsequent line covers the prior interval. The kernel maintains a number of counters to keep track of these values.

iostat's activity class options default to tdc (terminal, disk, and CPU). If any other options are specified, this default is completely overridden; for example, iostat -d reports statistics about the disks only.

Basic syntax is iostat <options> interval count

option - lets you specify the device class for which information is needed, such as disk, CPU or terminal (-d, -c, -t or -tdc). The x option gives extended statistics.
interval - is the time period in seconds between two samples. iostat 4 gives data at 4-second intervals.
count - is the number of times the data is needed. iostat 4 5 gives data at 4-second intervals, 5 times.
Example

    $ iostat -xtc 5 2
                     extended disk statistics             tty        cpu
    disk   r/s   w/s  Kr/s  Kw/s  wait  actv  svc_t  %w  %b  tin tout us sy wt id
    sd0    2.6   3.0  20.7  22.7   0.1   0.2   59.2   6  19    0   84  3 85 11  0
    sd1    4.2   1.0  33.5   8.0   0.0   0.2   47.2   2  23
    sd2    0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
    sd3   10.2   1.6  51.4  12.8   0.1   0.3   31.2   3  31

The fields have the following meanings:

    disk    name of the disk
    r/s     reads per second
    w/s     writes per second
    Kr/s    kilobytes read per second
    Kw/s    kilobytes written per second
    wait    average number of transactions waiting for service (queue length)
    actv    average number of transactions actively being serviced (removed from the queue but not yet completed)
    svc_t   average service time, in milliseconds
    %w      percent of time there are transactions waiting for service (queue non-empty)
    %b      percent of time the disk is busy (transactions in progress)
The values to look at in the iostat output are:
Percentage busy (%b)
Average service time (svc_t)
If a disk shows consistently high reads/writes, its percentage busy (%b) is greater than 5 percent, and its average service time (svc_t) is greater than 30 milliseconds, then one of the following actions needs to be taken:
1.) Tune the application to use disk I/O more efficiently by modifying the disk queries and using the cache facilities available in application servers.
2.) Spread the file system across two or more disks using the disk striping feature of a volume manager, DiskSuite, etc.
3.) Increase the inode cache system parameter, ufs_ninode, which is the number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file-system basis.
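
The %b and svc_t thresholds above can be checked mechanically. The following Python sketch is only an illustration: the function name find_busy_disks and the embedded sample text are hypothetical, and it assumes each disk row has the ten-column layout of the iostat -x example (any tty and cpu columns that trail the first row are ignored). The text gives no number for "consistently high reads/writes", so the sketch tests only %b and svc_t.

```python
# Thresholds quoted in the text above.
BUSY_PCT_LIMIT = 5.0     # %b, percent of time the disk is busy
SVC_T_LIMIT_MS = 30.0    # svc_t, average service time in milliseconds

# Disk rows copied from the example iostat -xtc output.
SAMPLE = """\
disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b
sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31
"""


def find_busy_disks(text):
    """Return the names of disks whose %b and svc_t both exceed the limits."""
    busy = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # too short to be a disk row
        try:
            svc_t = float(fields[7])      # column 8: svc_t
            pct_busy = float(fields[9])   # column 10: %b
        except ValueError:
            continue  # header line, not numeric data
        if pct_busy > BUSY_PCT_LIMIT and svc_t > SVC_T_LIMIT_MS:
            busy.append(fields[0])
    return busy


if __name__ == "__main__":
    print(find_busy_disks(SAMPLE))
```

On the sample rows this flags sd0, sd1 and sd3. A single sample is not enough to act on, since the text asks for values that are consistently high; in practice the check would be repeated over several intervals.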
Virtual Memory Statistics ( vmstat )
vmstat reports statistics on process, virtual memory, disk, trap, and CPU activity.
On multi-CPU systems, vmstat averages the number of CPUs into the output. Without options, vmstat displays a one-line summary of the virtual memory activity since the system was booted.
Syntax:
Basic syntax is vmstat <options> interval count
option - lets you specify the type of information needed, such as paging (-p), cache (-c), interrupts (-i), etc.
If no option is specified, information about processes, memory, paging, disk, interrupts and CPU is displayed.
interval - is the time period in seconds between two samples. vmstat 4 gives data at 4-second intervals.
Example
The following command displays a summary of what the system is doing every five seconds.

    example% vmstat 5
     procs     memory            page             disk        faults       cpu
     r b w   swap  free  re  mf  pi  po  fr  de  sr  s0 s1 s2 s3   in   sy   cs  us sy id
     0 0 0  11456  4120   1  41  19   1   3   0   2   0  4  0  0   48  112  130   4 14 82
     0 0 1  10132  4280   0   4  44   0   0   0   0   0 23  0  0  211  230  144   3 35 62
     0 0 1  10132  4616   0   0  20   0   0   0   0   0 19  0  0  150  172  146   3 33 64
     0 0 1  10132  5292   0   0   9   0   0   0   0   0 21  0  0  165  105  130   1 21 78
A. CPU issues:
The following columns have to be watched to determine whether there is a CPU issue:

     procs      cpu
     r b w   us sy id
     0 0 0    4 14 82
     0 0 1    3 35 62
     0 0 1    3 33 64
     0 0 1    1 21 78
If the number of processes in the run queue (procs r) is consistently greater than the number of CPUs on the system, the system will slow down, because there are more runnable processes than available CPUs.
If the idle time (cpu id) is consistently 0 and the system time (cpu sy) is double the user time (cpu us), the system is facing a shortage of CPU resources.
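
The two CPU symptoms can be expressed as a small check over several vmstat samples. This is a minimal Python sketch under stated assumptions: the function name cpu_symptoms, the dictionary layout of a sample, and the second "overloaded" data set are all hypothetical, and "consistently" is read here as "true for every sample supplied".

```python
def cpu_symptoms(samples, num_cpus):
    """Report which CPU symptoms hold for every sample.

    samples  -- list of dicts with integer keys 'r', 'us', 'sy', 'id'
                (run queue, user %, system %, idle % from vmstat)
    num_cpus -- number of CPUs on the system (supplied by the caller)
    """
    symptoms = []
    if not samples:
        return symptoms
    # Symptom 1: run queue consistently longer than the number of CPUs.
    if all(s["r"] > num_cpus for s in samples):
        symptoms.append("run queue exceeds CPU count")
    # Symptom 2: no idle time and system time at least double user time.
    if all(s["id"] == 0 and s["sy"] >= 2 * s["us"] for s in samples):
        symptoms.append("CPU shortage")
    return symptoms


# The r / us / sy / id values from the vmstat example in this document.
DOC_SAMPLES = [
    {"r": 0, "us": 4, "sy": 14, "id": 82},
    {"r": 0, "us": 3, "sy": 35, "id": 62},
    {"r": 0, "us": 3, "sy": 33, "id": 64},
    {"r": 0, "us": 1, "sy": 21, "id": 78},
]

# Hypothetical values for an overloaded two-CPU machine (not from the document).
OVERLOADED = [
    {"r": 5, "us": 30, "sy": 70, "id": 0},
    {"r": 6, "us": 25, "sy": 75, "id": 0},
]

if __name__ == "__main__":
    print(cpu_symptoms(DOC_SAMPLES, num_cpus=1))
    print(cpu_symptoms(OVERLOADED, num_cpus=2))
```

The samples from the example show neither symptom: the run queue is empty and the CPU is mostly idle.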
B. Memory issues:
Memory bottlenecks are determined by the scan rate (sr). The scan rate is the number of pages scanned by the clock algorithm per second. If the scan rate (sr) is continuously over 200 pages per second, then there is a memory shortage.
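
The scan rate test is easy to automate. The Python sketch below is illustrative only: the names scan_rates and memory_shortage are hypothetical, the sr column position is assumed to match the default vmstat layout shown in the example (r b w swap free re mf pi po fr de sr ...), and "continuously" is taken to mean "in every sample supplied".

```python
SCAN_RATE_LIMIT = 200  # pages per second, the threshold quoted in the text
SR_COLUMN = 11         # zero-based position of sr in the assumed layout

# Data rows copied from the vmstat 5 example in this document.
DOC_ROWS = [
    "0 0 0 11456 4120 1 41 19 1 3 0 2 0 4 0 0 48 112 130 4 14 82",
    "0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62",
    "0 0 1 10132 4616 0 0 20 0 0 0 0 0 19 0 0 150 172 146 3 33 64",
    "0 0 1 10132 5292 0 0 9 0 0 0 0 0 21 0 0 165 105 130 1 21 78",
]


def scan_rates(rows):
    """Extract the sr value from each vmstat data row."""
    return [int(row.split()[SR_COLUMN]) for row in rows]


def memory_shortage(sr_samples, limit=SCAN_RATE_LIMIT):
    """Return True only if every sampled scan rate is over the limit."""
    return bool(sr_samples) and all(sr > limit for sr in sr_samples)


if __name__ == "__main__":
    rates = scan_rates(DOC_ROWS)
    print(rates, memory_shortage(rates))
```

For the example rows the scan rates are 2, 0, 0 and 0, far below the limit, so no memory shortage is reported.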
Network Statistics ( netstat )
netstat displays the contents of various network-related data structures, depending on the options selected.
Syntax:
netstat <option/s>
Multiple options can be given at one time.
Options
interval - the interval, in seconds, for a continuous display of statistics.
Example

    $ netstat -rn
    Routing Table: IPv4
      Destination           Gateway           Flags  Ref   Use   Interface
    -------------------- -------------------- ----- ----- ------ ---------
    192.168.1.0          192.168.1.11          U        1   1444  le0
    224.0.0.0            192.168.1.11          U        1      0  le0
    default              192.168.1.1           UG       1  68276
    127.0.0.1            127.0.0.1             UH       1  10497  lo0

This shows the output on a Solaris machine whose IP address is 192.168.1.11, with a default router at 192.168.1.1.
A.) Network availability
The command above is mostly useful in troubleshooting network accessibility issues. When the outside network is not accessible from a machine, check the following:
1. Whether the default router IP address is correct.
2. Whether you can ping it from your machine.
3. If the router address is incorrect, it can be changed with the route add command. See man route for more information.
If the router address is correct but you still can't ping it, there may be a network cable, hub or switch problem, and you have to try to eliminate the faulty component.
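
The first check, finding which default router the machine is actually using, can be scripted. The Python sketch below is a hedged illustration: default_gateway is a hypothetical helper name, and it assumes the routing table format of the netstat -rn example, where the default route appears on a line whose first field is the word "default" and whose second field is the gateway.

```python
# Routing table rows copied from the netstat -rn example in this document.
SAMPLE_ROUTES = """\
Routing Table: IPv4
Destination Gateway Flags Ref Use Interface
192.168.1.0 192.168.1.11 U 1 1444 le0
224.0.0.0 192.168.1.11 U 1 0 le0
default 192.168.1.1 UG 1 68276
127.0.0.1 127.0.0.1 UH 1 10497 lo0
"""


def default_gateway(netstat_rn_text):
    """Return the gateway of the default route, or None if there is none."""
    for line in netstat_rn_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "default":
            return fields[1]
    return None


if __name__ == "__main__":
    gateway = default_gateway(SAMPLE_ROUTES)
    if gateway is None:
        print("no default route configured")
    else:
        print("default router:", gateway, "(next step: ping it)")
```

If the function returns None there is no default route at all, which by itself explains why the outside network is unreachable.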
B.) Network Response
    $ netstat -i
    Name  Mtu   Net/Dest  Address    Ipkts     Ierrs  Opkts    Oerrs  Collis  Queue
    lo0   8232  loopback  localhost  77814     0      77814    0      0       0
    hme0  1500  server1   server1    10658566  3      4832511  0      279257  0
This option is used to diagnose network problems when connectivity exists but response is slow.
Values to look at:
Collisions (Collis)
Output packets (Opkts)
Input errors (Ierrs)
Input packets (Ipkts)
The above values give the information needed to work out:
i. Network collision rate, as follows:
Network collision rate = Output collision counts / Output packets
A network-wide collision rate greater than 10 percent indicates a network problem.
ii. Input packet error rate, as follows:
Input packet error rate = Ierrs / Ipkts
If the input error rate is high (over 0.25 percent), the host is dropping packets. Hubs, switches, cables, etc. need to be checked for potential problems.
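
As a worked example, the two formulas can be applied to the hme0 counters from the netstat -i output. This Python sketch is illustrative: the function names percent and interface_health are hypothetical, and the 10 percent and 0.25 percent limits are the ones quoted in the text.

```python
COLLISION_LIMIT_PCT = 10.0     # collision rate threshold from the text
INPUT_ERROR_LIMIT_PCT = 0.25   # input error rate threshold from the text


def percent(part, whole):
    """Return part / whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def interface_health(ipkts, ierrs, opkts, collis):
    """Compute both rates and compare them with the limits."""
    collision_rate = percent(collis, opkts)
    input_error_rate = percent(ierrs, ipkts)
    return {
        "collision_rate_pct": collision_rate,
        "input_error_rate_pct": input_error_rate,
        "collisions_high": collision_rate > COLLISION_LIMIT_PCT,
        "input_errors_high": input_error_rate > INPUT_ERROR_LIMIT_PCT,
    }


if __name__ == "__main__":
    # Counters for hme0 from the netstat -i example in this document.
    report = interface_health(ipkts=10658566, ierrs=3,
                              opkts=4832511, collis=279257)
    print("collision rate:   %.2f%%" % report["collision_rate_pct"])
    print("input error rate: %.6f%%" % report["input_error_rate_pct"])
    print("collisions high:", report["collisions_high"])
    print("input errors high:", report["input_errors_high"])
```

For hme0 the collision rate works out to roughly 5.8 percent and the input error rate is far below 0.25 percent, so neither limit is exceeded in this sample.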
C. Network socket & TCP connection state
netstat gives important information about network socket and TCP state. This is very useful in finding out the open, closed and waiting TCP connections.
The network states returned by netstat are the following: