Unix Technical Forum

Sar questions

#1
02-15-2008, 12:38 PM
Chalawal Maliwan
Sar questions

Below is my sar output. How can we find out what is causing the CPU to
run at 100% all the time?

#sar

00:00:00    %usr    %sys    %wio   %idle (-u)
01:00:00      25      75       0       0
02:00:00      25      75       0       0
03:00:00      24      76       0       0
04:00:00      24      76       0       0
05:00:00      24      76       0       0
06:00:00      24      76       0       0
..
..
(%idle = 0 all the time)

#sar 1 5

23:30:19    %usr    %sys    %wio   %idle (-u)
23:30:20      24      76       0       0
23:30:21      24      76       0       0
23:30:22      16      84       0       0
23:30:23      14      86       0       0
23:30:24      20      80       0       0

Average       19      81       0       0



Thanks,

chalawal
#2
02-15-2008, 12:38 PM
Jeff Liebermann
Re: Sar questions

On 24 Dec 2003 08:34:35 -0800, chalawal@hotmail.com (Chalawal Maliwan)
wrote:

>Below is my sar output. How can we find out what is causing the CPU to
>run at 100% all the time?
>
>#sar
>
>00:00:00 %usr %sys %wio %idle (-u)
>01:00:00 25 75 0 0
>02:00:00 25 75 0 0
>03:00:00 24 76 0 0
>04:00:00 24 76 0 0
>05:00:00 24 76 0 0
>06:00:00 24 76 0 0


Download cpuhog, iohog, and memhog.
http://www.caldera.com/skunkware/sysadmin/
Cpuhog should identify the culprit.

You can also do it manually with the ps command:
http://docsrv.sco.com:507/en/man/html.C/ps.C.html
The pcpu column will show the percentage of CPU use for each process.

Also, watch out for situations where more than one process is hogging
the CPU cycles. I've seen it happen once or twice.

--
Jeff Liebermann 150 Felker St #D Santa Cruz CA 95060
(831)421-6491 pgr (831)336-2558 home
http://www.LearnByDestroying.com AE6KS
jeffl@comix.santa-cruz.ca.us jeffl@cruzio.com
#3
02-15-2008, 12:38 PM
Tony Lawrence
Re: Sar questions

Chalawal Maliwan <chalawal@hotmail.com> wrote:
>Below is my sar output. How can we find out what is causing the CPU to
>run at 100% all the time?


>#sar


>00:00:00 %usr %sys %wio %idle (-u)
>01:00:00 25 75 0 0
>02:00:00 25 75 0 0
>03:00:00 24 76 0 0
>04:00:00 24 76 0 0
>05:00:00 24 76 0 0
>06:00:00 24 76 0 0


Usually it's fairly simple. From http://aplawrence.com/Unixart/slow.html:

If it is the CPU that is pegged busy, it *may* be a runaway process
that is eating CPU cycles. Do this:

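for x in 1 2 3 4 5
do
    # sort on the third field (TIME), largest first; where sort no longer
    # accepts the old "+2" form, "sort -r -k 3" is the equivalent
    ps -e | sort -r +2 | head -5
    echo "==="
    sleep 5
done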

Look for a process whose TIME column has gone up by 3 to 5 seconds each
time. If you have something like that, that's your problem, and you need
to kill it. The TIME column is time on the CPU. Normally a process doesn't
spend a great deal of time actually running: it's waiting for the disk,
waiting for you to type something, etc. Most processes spend most of
their time sleeping, waiting for something else to happen, so something
that gains 3 seconds or more in 5 seconds of wall time is usually suspect.

If you watch it over a few minutes, the time it gains divided by
the elapsed wall clock time is the fraction of your CPU this process
is taking for itself (3 seconds gained in 5 is 60%). A short-lived process
can take a lot of the CPU to print, or to redraw an X screen etc., so you
have to use some good judgement here. But 3 seconds out of 5 is very
likely a real problem.
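
The arithmetic above can be sketched in shell. This is only an illustration: the function names to_seconds and cpu_share are made up for this example, it assumes the TIME column looks like [[dd-]hh:]mm:ss, and it assumes an awk that accepts -v and a bracket expression for -F, which a very old system may not provide.

```shell
# to_seconds TIME: convert a ps TIME value ([[dd-]hh:]mm:ss) to seconds.
# (hypothetical helper, not part of any standard tool)
to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        # walk the fields from the right: seconds, minutes, hours, days
        s = $n
        if (n >= 2) s += $(n-1) * 60
        if (n >= 3) s += $(n-2) * 3600
        if (n >= 4) s += $(n-3) * 86400
        print s + 0
    }'
}

# cpu_share FIRST SECOND ELAPSED: whole-number percentage of one CPU used
# between two TIME readings taken ELAPSED wall-clock seconds apart.
cpu_share() {
    a=`to_seconds "$1"`
    b=`to_seconds "$2"`
    awk -v a="$a" -v b="$b" -v e="$3" 'BEGIN { printf "%d\n", (b - a) * 100 / e }'
}
```

For example, "cpu_share 00:00:42 00:00:45 5" describes a process that gained 3 seconds of CPU time over 5 seconds of wall time, the case called suspect above.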

Of course you need to understand what you are killing: you probably
wouldn't want to kill the main Oracle database, for example.

If you kill the errant process and another copy of it pops right back
to the top of the list, then you need to track down its parent:

# for example, if process 15246 is the problem
ps -p 15246 -o ppid

Of course, it may go further up the chain. Here's a script that traces
back to init:

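# This works on SCO or Linux; just pass a process ID as an argument.
MYPROC=$1
NEXTPROC=$MYPROC
# stop at PID 0, or when ps returns nothing because the process is gone
while [ -n "$NEXTPROC" ] && [ "$NEXTPROC" != 0 ]
do
    ps -lp $NEXTPROC
    MYPROC=$NEXTPROC
    # "ppid=" suppresses the header; tr strips the column padding
    NEXTPROC=`ps -p $MYPROC -o "ppid=" | tr -d ' '`
done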

Sometimes you'll have a badly written network program that starts sucking
resources when its client dies. If you can't get the supplier to fix it,
you may want to write a script to track down and kill these things. One
clue that might help: the difference between a good "xyz" process and a
bad one might just be whether or not it has an attached tty. So, if you
see this:

5821 ? 00:00:42 xyz
6689 ttyp0 00:00:08 xyz
7654 ttyp1 00:00:12 xyz

It's probably the one with a "?" that will start accumulating time. So
a script that watched for and killed those might look like this:

# turn off filename expansion so the "?" is not treated as a wildcard
set -f
ps -e | grep "xyz$" | while read line
do
    # split the line into $1 = PID, $2 = TTY, and so on
    set -- $line
    [ "$2" = "?" ] && kill -9 $1
done
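
Before letting a script like this send kill -9, it may be worth running a version that only lists the candidates. The sketch below is one way to do that. The function name detached_pids is invented for the example, and it assumes the four-column "ps -e" layout shown above (PID, TTY, TIME, command). Because awk compares the command name exactly, it also avoids matching some other program whose name merely ends in "xyz".

```shell
# detached_pids NAME: read "ps -e" output on stdin and print the PID of
# every process whose command is exactly NAME and whose TTY is "?".
# (hypothetical helper; it prints candidates and kills nothing)
detached_pids() {
    awk -v name="$1" '$2 == "?" && $NF == name { print $1 }'
}

# Typical use:
#   ps -e | detached_pids xyz
```

Once the list looks right, the PIDs it prints can be handed to kill.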

If you can't do it that way, you have to get more clever, and watch for
changing time:

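set -f
# -p: don't complain if the directory is left over from an earlier run
mkdir -p /tmp/mystuff
ps -e | grep "xyz$" | while read line
do
    set -- $line
    ps -p $1 > /tmp/mystuff/first
    # adjust sleep as necessary
    sleep 5
    ps -p $1 > /tmp/mystuff/second
    # diff succeeds only when nothing changed; any change triggers the kill
    diff /tmp/mystuff/first /tmp/mystuff/second || kill -9 $1
done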

And even that may not be clever enough for your particular situation,
so test and tread carefully. You may even need to do math on the time
field to see what has really happened.
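
One case where a plain diff is not clever enough: if the process exits between the two snapshots, the second file holds only the header, the files differ, and kill is sent to a PID that is gone or has been reused. A minimal sketch of a stricter check follows. The function name time_gained is made up, and it assumes "ps -p" prints a header line followed by one process line with TIME in the third column.

```shell
# time_gained FIRST SECOND: succeed only if both files are "ps -p"
# snapshots that contain a process line and whose TIME column differs.
# (hypothetical helper; assumes TIME is the third field of line 2)
time_gained() {
    t1=`awk 'NR == 2 { print $3 }' "$1"`
    t2=`awk 'NR == 2 { print $3 }' "$2"`
    [ -n "$t1" ] && [ -n "$t2" ] && [ "$t1" != "$t2" ]
}
```

It could stand in for the diff test, for example: time_gained /tmp/mystuff/first /tmp/mystuff/second && kill -9 $1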

Bela Lubkin made an interesting post about an apparently slow CPU2 on
an SMP system. Read it at http://aplawrence.com//Bofcusm/1695.html.

Another thing you may see is a process that has used a lot of time
but isn't gaining time right now. I've seen that many times where the
process is "deliver", MMDF's mail delivery agent on SCO systems that
aren't running sendmail. What happens is that for whatever reason
(a root.lock file from a crash in /usr/spool/mail or a missing "sys"
home directory), there are thousands of undelivered messages in the
subdirectories of /usr/spool/mmdf/lock/home.

The fix for that is simple if you don't care about the messages: rm -r
all those directories and recreate them empty with the same ownership
and permissions:

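cd /usr/spool/mmdf/lock/home
/etc/rc2.d/P86mmdf stop
# remember the names first (assumes everything here is a queue directory)
DIRS=`ls`
rm -r *
# recreate them empty with the same ownership and permissions
mkdir $DIRS
chown mmdf:mmdf $DIRS
chmod 777 $DIRS
cd /usr/spool/mail
rm *.lock
/etc/rc2.d/P86mmdf start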

You'd then want to verify that mail is working normally and that whatever
caused the problem isn't still happening. For example, if /usr/sys is
missing, this problem will come right back again very quickly.
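
Before removing anything, it may help to confirm that the queue really is holding thousands of messages, and to repeat the check afterwards to see whether they come back. The following is a rough sketch only. The function name count_queue is invented here, and it simply counts regular files under each subdirectory of whatever directory it is given.

```shell
# count_queue DIR: print "<files> <subdirectory>" for each subdirectory
# of DIR. (hypothetical helper; it only counts, it changes nothing)
count_queue() {
    for d in "$1"/*/
    do
        [ -d "$d" ] || continue
        n=`find "$d" -type f | wc -l | tr -d ' '`
        name=`basename "$d"`
        echo "$n $name"
    done
}

# Typical use:
#   count_queue /usr/spool/mmdf/lock/home
```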

Another possibility is a program that is rapidly spawning off other
programs. You should be able to see that in "ps -e". First, is the
number of processes growing?

ps -e | wc -l
sleep 5
ps -e | wc -l

Or, are there new processes briefly showing up at the end of the listing?

ps -e | tail
sleep 5
ps -e | tail

In either case, you need to track down the parent and kill it.
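
One way to find that parent, sketched under assumptions: the function name busiest_parents is made up, and the suggested use relies on a ps that supports -o "ppid=", which the older SCO releases mentioned in this thread may not. It counts how many processes share each parent PID and prints the five parents with the most children.

```shell
# busiest_parents: read one parent PID per line on stdin and print
# "<children> <ppid>" for the five parents with the most children.
# (hypothetical helper, shown only as an illustration)
busiest_parents() {
    awk 'NF { count[$1]++ } END { for (p in count) print count[p], p }' |
        sort -rn | head -5
}

# Typical use:
#   ps -e -o "ppid=" | busiest_parents
```

A parent whose count keeps climbing between runs is the likely spawner.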


--
tony@aplawrence.com Unix/Linux/Mac OS X resources: http://aplawrence.com
Get paid for writing about tech: http://aplawrence.com/publish.html
#4
02-15-2008, 12:38 PM
Jeff Liebermann
Re: Sar questions

On Wed, 24 Dec 2003 09:48:03 -0800, Jeff Liebermann
<jeffl@comix.santa-cruz.ca.us> wrote:

>You can also do it manually with the ps command:
> http://docsrv.sco.com:507/en/man/html.C/ps.C.html
>The pcpu column will show the percentage of CPU use for each process.


I found this on the man page for ps:

Display in decreasing order, the IDs and percentage CPU usage of all
processes where usage is more than 5% of CPU time:

ps -A -o "pid=" -o "pcpu=" | awk '$2 > 5 {print $1" "$2}' | sort -r +1

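Methinks it might be handy. Note that +1 tells sort to skip the first
field, so this already sorts on the pcpu percentage rather than the
process ID. The comparison is textual, though, so 9.5 can sort above
10.2; adding -n makes it numeric:

ps -A -o "pid=" -o "pcpu=" | awk '$2 > 5 {print $1" "$2}' | sort -rn +1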

(Note: I didn't try this because my ancient 3.2v4.2 ps command doesn't
support the -o option).
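
Where sort no longer accepts the old +N form, the same filter can be written with -k. The sketch below wraps it in a function so that the threshold is a parameter. The name cpu_hogs is invented for the example, and it assumes the input is "pid pcpu" pairs such as ps -A -o "pid=" -o "pcpu=" produces on systems that support -o.

```shell
# cpu_hogs LIMIT: read "pid pcpu" pairs on stdin, keep those above LIMIT
# percent, and print them with the highest pcpu first.
# (hypothetical helper; "-k 2,2nr" sorts numerically on the second field)
cpu_hogs() {
    awk -v limit="$1" '$2 + 0 > limit + 0 { print $1, $2 }' | sort -k 2,2nr
}

# Typical use:
#   ps -A -o "pid=" -o "pcpu=" | cpu_hogs 5
```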


#5
02-15-2008, 12:38 PM
Jeff Liebermann
Re: Sar questions

On Wed, 24 Dec 2003 18:13:47 +0000 (UTC), Tony Lawrence
<apl@shell01.TheWorld.com> wrote:
(...)

We both forgot about the "top" program, which will also display the
top CPU hogs:
http://www.caldera.com/skunkware/sysadmin/

>In either case, you need to track down the parent and kill it.


That's what happens with child abuse.

#6
02-15-2008, 12:38 PM
Tony Lawrence
Re: Sar questions

Jeff Liebermann <jeffl@comix.santa-cruz.ca.us> wrote:
>On Wed, 24 Dec 2003 18:13:47 +0000 (UTC), Tony Lawrence
><apl@shell01.TheWorld.com> wrote:
>(...)


>We both forgot about the "top" program.
> http://www.caldera.com/skunkware/sysadmin/
>which will also display the top cpu hogs.


I didn't forget. Although it's extraordinarily popular, I think its
usefulness is overblown and its popularity undeserved.

>>In either case, you need to track down the parent and kill it.


>That's what happens with child abuse.


:-)

#7
02-15-2008, 12:39 PM
Chalawal Maliwan
Re: Sar questions

>
> Download cpuhog, iohog, and memhog.
> http://www.caldera.com/skunkware/sysadmin/
> Cpuhog should identify the culprit.
>
> You can also do it manually with the ps command:
> http://docsrv.sco.com:507/en/man/html.C/ps.C.html
> The pcpu column will show the percentage of CPU use for each process.
>
> Also, watch out for situations where more than one process is hogging
> the CPU cycles. I've seen it happen once or twice.


I tried cpuhog and saw a few particular processes that were consuming the CPU time.


Thanks to all who answered my questions.

chalawal