TOP iowait vs iostat; 100% reading accurate? (comp.unix.solaris)
Good morning gentlemen;

I've seen several threads regarding I/O wait stats from the top utility, and read some of the links that are referenced regarding how I/O wait is calculated. Unless I've totally misread or misunderstood the gist of the threads, I/O wait is not the "raise the flag; stop the presses" issue that one might think it is at first glance, but rather an indicator of CPU idle time on a healthy machine. I'm sure if I've gotten this wrong, I will be corrected.

I would appreciate the collective's thoughts on the following scenario:

We have a V280R with 2 processors and 6 GB of RAM (recently upgraded from 2 GB). This is a BEA WebLogic webserver front-ending a remote Sybase DB. It has no local storage other than the OS; everything else is mounted via NFS from a Network Appliance filer.

We have a Compuware web-based utility (called VantageView) running on the server keeping track of performance, with what looks like the top utility. Ever since I upgraded the memory to 6 GB, this monitoring tool will periodically show I/O wait at 100%. At the same time, the system idle time will also be at 100%. So again, if I read the aforementioned threads correctly, and I/O wait and system idle time are intertwined, then this kind of makes sense at first blush.

Unfortunately, I cannot reproduce the same results running iostat. I set a script to poll the system every 30 seconds with /bin/iostat -czxn. I then parse the cpu stats from the output file and graph them myself in Excel. At no time do the wio numbers even remotely approach the 100% mark reported by this Compuware utility. The normally running sar stats also do not show any real issues.

My question is: has anyone else ever logged 100% WIO in top and seen such a discrepancy with the normal OS-based monitoring tools such as iostat?

Any insight would be appreciated, as well as any suggestions for other things to look at.

Joe D.
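The parsing step described in the post above (pulling the cpu stats out of saved iostat output) could look roughly like the sketch below. It assumes the CPU section is a `us sy wt id` header line followed by one line of four integers. The sample text, the device line in it, and the name `parse_cpu_samples` are illustrative assumptions, not taken from Joe's actual script, and the exact layout may differ between Solaris releases.

```python
import re

# Hypothetical sample of saved `iostat -czxn` output (values are made up).
SAMPLE = """\
     cpu
 us sy wt id
  3  1  0 96
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.5    1.2    4.0    9.6  0.0  0.0    0.1    2.3   0   1 c1t0d0
     cpu
 us sy wt id
  5  2 12 81
"""

# Matches the column header that precedes each line of CPU percentages.
CPU_HEADER = re.compile(r"^\s*us\s+sy\s+wt\s+id\s*$")


def parse_cpu_samples(text):
    """Return one dict {us, sy, wt, id} per CPU report found in text."""
    samples = []
    lines = text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if CPU_HEADER.match(line):
            fields = lines[i + 1].split()
            # Skip anything that is not exactly four integer columns.
            if len(fields) == 4 and all(f.isdigit() for f in fields):
                us, sy, wt, idle = map(int, fields)
                samples.append({"us": us, "sy": sy, "wt": wt, "id": idle})
    return samples


if __name__ == "__main__":
    for sample in parse_cpu_samples(SAMPLE):
        print(sample)
```

The resulting list can be written out as CSV for graphing. Timestamps are not part of this output, so the polling script would have to record them itself if the samples are to be lined up against another tool's readings.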
Are you collecting the data at the same intervals and over the same time period as top is reporting? If so, and sar and iostat are not showing any signs of I/O wait, then you have to suspect your build of top and the Compuware tool.

I have seen <80% of I/O wait on a single-CPU 500 MHz webserver (heavily loaded). When these I/O percentages are pegged, how are your %busy, reads/sec and wr/sec fields looking?

-Dexthor.
Joe D. <newbie_from_newbie@yahoo.com> wrote:
> Good morning gentlemen;
> I've seen several threads regarding I/O wait stats from the top
> utility, and read some of the links that are referenced regarding how
> I/O wait is calculated. Unless I've totally mis-read/mis-understood the
> gist of the threads, I/O wait is not the "raise the flag; stop the
> presses" issue that one might think it is at first glance, but rather
> an indicator of CPU idle time on a healthy machine. I'm sure if I've
> gotten this wrong, I will be corrected.

It's certainly not an immediate warning. It is an indicator of CPU idle time. By itself, iowait cannot tell you whether or not the machine is healthy.

> I would appreciate the collective's thoughts on the following scenario:
> We have a V280R with 2 processors and 6 GB of RAM (recently upgraded
> from 2 GB). This is a Bea Weblogic webserver front-ending a remote
> Sybase DB, with no local storage other than the OS, but rather
> everything is mounted via NFS from a Network Appliance filer.
> We have a Compuware web-based utility (called VantageView) running on
> the server keeping track of performance, with what looks like the top
> utility. Ever since I upgraded the memory to 6 GB periodically, this
> monitoring tool will show I/O wait at 100%. At the same time, the
> system idle time will also be at 100%. So again, if I read the
> aforementioned threads correctly, and I/O wait and system idle time are
> inter-twined, then this kind of makes sense at first blush.

Do you have one tool showing 100% in two columns? Top shouldn't do that. If a tool shows both iowait and idle, they should be disjoint components of CPU idle.

> Unfortunately, I cannot reproduce the same results running iostat; I
> set a script to poll the system every 30 seconds with /bin/iostat
> -czxn. I then parse the cpu stats from the output file, and I graph
> them myself in excel. At no time do the wio numbers even remotely
> approach the 100% mark as reported by this Compuware utility. The
> normally running sar stats also do not show any real issues.

Perhaps the Compuware utility is confused or calculating something its own way. When I talk about the behavior of iowait, I'm referring only to that calculated by the kernel and reported by 'top' and 'iostat'. Your utility may mean something different.

> The question I have is; has anyone else ever logged 100% WIO in top,
> and seen such a discrepancy with the normal os based monitoring tools
> such as iostat?

I haven't, except with some older bugs. (I've seen 'top' read the wrong values after a system patch changed some kernel variables.)

Remember, iowait is a subset of idle CPU. If the CPU isn't idle, you shouldn't have iowait. Try running a CPU-intensive job (like a stupid perl or shell loop). If the iowait doesn't decrease below 100% by at least the amount of user time, then something is wrong.

-- 
Darren Dunham ddunham@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
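The consistency rule in the reply above (iowait and idle are disjoint parts of idle CPU, so iowait cannot exceed what is left after user and system time) can be expressed as a small check over one sample of percentages. This is only a sketch: the function names, the rounding `tolerance`, and the Python busy loop standing in for the suggested perl or shell loop are all assumptions made for illustration.

```python
import time


def check_cpu_sample(us, sy, wt, idle, tolerance=2):
    """Return a list of problems in one CPU sample; empty means consistent."""
    problems = []
    total = us + sy + wt + idle
    # The four columns partition CPU time, so they should sum to about 100.
    if abs(total - 100) > tolerance:
        problems.append(f"columns sum to {total}, expected about 100")
    # iowait is carved out of idle time, so it is bounded by 100 - us - sy.
    if wt > 100 - us - sy + tolerance:
        problems.append(f"wt={wt} exceeds non-busy share {100 - us - sy}")
    return problems


def burn_cpu(seconds):
    """Spin one CPU for roughly `seconds`; returns the iteration count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


if __name__ == "__main__":
    # A tool reporting 100% iowait and 100% idle at once fails the sum check.
    print(check_cpu_sample(0, 0, 100, 100))
    # 100% iowait while 40% user time is in use fails both checks.
    print(check_cpu_sample(40, 10, 100, 0))
    # A plausible sample passes.
    print(check_cpu_sample(5, 2, 12, 81))
```

To follow the suggested experiment, one would run `burn_cpu` for a minute or so while watching the monitoring tool. A single busy loop keeps only one CPU occupied, so on a two-processor machine the reported user time would be expected to rise by roughly half, not all, of the total.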