Unix Technical Forum

TOP iowait vs iostat; 100% reading accurate?

Forum: comp.unix.solaris (Solaris Operating System category)


Old 01-11-2008, 02:51 PM
Joe D.
 

Good morning gentlemen;

I've seen several threads regarding I/O wait stats from the top
utility, and read some of the links that are referenced regarding how
I/O wait is calculated. Unless I've totally mis-read/mis-understood the
gist of the threads, I/O wait is not the "raise the flag; stop the
presses" issue that one might think it is at first glance, but rather
an indicator of CPU idle time on a healthy machine. I'm sure if I've
gotten this wrong, I will be corrected.

I would appreciate the collective's thoughts on the following scenario:

We have a V280R with 2 processors and 6 GB of RAM (recently upgraded
from 2 GB). This is a BEA WebLogic webserver front-ending a remote
Sybase DB. It has no local storage other than the OS disk; everything
else is mounted via NFS from a Network Appliance filer.

We have a Compuware web-based utility (called VantageView) running on
the server keeping track of performance, with what looks like the top
utility. Ever since I upgraded the memory to 6 GB, this monitoring tool
will periodically show I/O wait at 100%. At the same time, the system
idle time will also be at 100%. So again, if I read the aforementioned
threads correctly, and I/O wait and system idle time are intertwined,
then this kind of makes sense at first blush.

Unfortunately, I cannot reproduce the same results running iostat. I
set a script to poll the system every 30 seconds with /bin/iostat
-czxn. I then parse the CPU stats from the output file and graph
them myself in Excel. At no time do the wio numbers even remotely
approach the 100% mark as reported by this Compuware utility. The
normally running sar stats also do not show any real issues.
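
For anyone wanting to repeat the comparison, the parsing step can be sketched roughly as follows. This is not the script used above; it assumes the captured iostat output contains a CPU header line reading `us sy wt id` followed by one line of integer percentages, and the function name `parse_cpu_samples` is made up for illustration.

```python
def parse_cpu_samples(text):
    """Extract (us, sy, wt, id) tuples from captured iostat output.

    Assumes each CPU report is a header line 'us sy wt id' followed
    immediately by a line of four integer percentages.
    """
    samples = []
    lines = text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.split() == ["us", "sy", "wt", "id"]:
            fields = lines[i + 1].split()
            # Skip anything that does not look like four integer columns.
            if len(fields) >= 4 and all(f.isdigit() for f in fields[:4]):
                samples.append(tuple(int(f) for f in fields[:4]))
    return samples


def max_iowait(samples):
    """Return the highest wt value seen, or None if there are no samples."""
    return max((wt for _, _, wt, _ in samples), default=None)
```

Feeding the whole output file to `parse_cpu_samples` and then calling `max_iowait` gives a quick answer to "did wt ever get near 100%" before graphing anything.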

The question I have is: has anyone else ever logged 100% WIO in top,
and seen such a discrepancy with the normal OS-based monitoring tools
such as iostat?

Any insight would be appreciated, as well as any suggestions for other
things to look at.

Joe D.

Old 01-11-2008, 02:51 PM
Dexthor
 

Are you collecting the data at the same intervals and over the same time
period as top is reporting? If yes, and sar and iostat are not showing
any signs of I/O wait, then you have to suspect your build of top and the
Compuware tool.

I have seen <80% of I/O wait on a single-CPU 500 MHz webserver (heavily
loaded).

When these I/O percentages are pegged, how are your %busy, reads/sec and
writes/sec fields looking?

-Dexthor.

Old 01-11-2008, 02:51 PM
Darren Dunham
 

Joe D. <newbie_from_newbie@yahoo.com> wrote:
> Good morning gentlemen;


> I've seen several threads regarding I/O wait stats from the top
> utility, and read some of the links that are referenced regarding how
> I/O wait is calculated. Unless I've totally mis-read/mis-understood the
> gist of the threads, I/O wait is not the "raise the flag; stop the
> presses" issue that one might think it is at first glance, but rather
> an indicator of CPU idle time on a healthy machine. I'm sure if I've
> gotten this wrong, I will be corrected.


It's certainly not an immediate warning. It is an indicator of CPU idle
time. Whether or not the machine is healthy, iowait cannot tell you by
itself.

> I would appreciate the collective's thoughts on the following scenario:


> We have a V280R with 2 processors and 6 GB of RAM (recently upgraded
> from 2 GB). This is a Bea Weblogic webserver front-ending a remote
> Sybase DB, with no local storage other than the OS, but rather
> everything is mounted via NFS from a Network Appliance filer.


> We have a Compuware web-based utility (called VantageView) running on
> the server keeping track of performance, with what looks like the top
> utility. Ever since I upgraded the memory to 6 GB, this monitoring tool
> will periodically show I/O wait at 100%. At the same time, the system
> idle time will also be at 100%. So again, if I read the aforementioned
> threads correctly, and I/O wait and system idle time are intertwined,
> then this kind of makes sense at first blush.


Do you have one tool showing 100% in two columns? Top shouldn't do
that. If a tool shows both iowait and idle, they should be disjoint
components of CPU idle.
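
A rough way to test the "disjoint components" point against logged samples is sketched below. It assumes each sample has already been reduced to four integer percentages (user, system, iowait, idle); the function name and the 2-point rounding tolerance are illustrative choices, not anything taken from top or iostat.

```python
def check_cpu_sample(us, sy, wt, idle, tolerance=2):
    """Return a list of problems found in one CPU-percentage sample.

    If iowait and idle are disjoint slices of the same interval, the four
    columns should add up to about 100 (allowing for integer rounding).
    """
    problems = []
    total = us + sy + wt + idle
    if abs(total - 100) > tolerance:
        problems.append(f"columns sum to {total}, expected about 100")
    if wt + idle > 100 + tolerance:
        problems.append("iowait and idle overlap")
    return problems
```

A reading of 100% iowait together with 100% idle fails both checks, which suggests the tool is either counting iowait inside idle or computing the figures in some way of its own.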

> Unfortunately, I cannot reproduce the same results running iostat; I
> set a script to poll the system every 30 seconds with /bin/iostat
> -czxn. I then parse the cpu stats from the output file, and I graph
> them myself in excel. At no time do the wio numbers even remotely
> approach the 100% mark as reported by this Compuware utility. The
> normally running sar stats also do not show any real issues.


Perhaps the Compuware utility is confused or calculating something its
own way. When I talk about the behavior of iowait, I'm referring only
to that calculated by the kernel and reported by 'top' and 'iostat'.
Your utility may mean something different.

> The question I have is; has anyone else ever logged 100% WIO in top,
> and seen such a discrepancy with the normal os based monitoring tools
> such as iostat?


I haven't, except with some older bugs. (I've seen 'top' read the wrong
values after a system patch changed some kernel variables.)

Remember, iowait is a subset of idle CPU. If the CPU isn't idle, you
shouldn't have iowait. Try running a CPU-intensive job (like a stupid
perl or shell loop). If the iowait doesn't decrease below 100% by at
least the amount of user time, then something is wrong.
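
As a sketch of that test, written in Python here rather than perl or shell: `burn_cpu` spins to generate user time, and `iowait_is_plausible` applies the rule that iowait should not exceed what is left after user time is subtracted. Both names and the 2-point tolerance are made up for illustration.

```python
import time


def burn_cpu(seconds):
    """Spin in a tight loop for roughly `seconds` to generate user CPU time."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


def iowait_is_plausible(us, wt, tolerance=2):
    """True if the reported iowait fits in the CPU time not spent in user mode.

    `us` and `wt` are the user and iowait percentages read from the
    monitoring tool while the busy loop is running.
    """
    return wt <= 100 - us + tolerance
```

Run something like `burn_cpu(120)` in one session, read the user and iowait figures from the tool in another, and pass them to `iowait_is_plausible`. On a two-processor machine a single loop can keep only one CPU busy, so user time may top out near 50% if the tool averages across CPUs.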

--
Darren Dunham ddunham@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
All times are GMT.

