View Single Post

   
  #4 (permalink)  
Old 04-19-2008, 10:12 AM
Benjamin Minshall
 
Posts: n/a
Default Re: stats collector process high CPU utilization

Tom Lane wrote:
> Benjamin Minshall <minshall@intellicon.biz> writes:
>> Tom Lane wrote:
>>> Can you gather some info about what it's doing?
>>> strace'ing the stats collector might prove interesting, also if you have
>>> built it with --enable-debug then oprofile results would be helpful.

>
>> I will gather some strace info later today when I have a chance to
>> shutdown the server.

>
> I don't see why you'd need to shut anything down. Just run
> strace -p stats-process-ID
> for a few seconds or minutes (enough to gather maybe a few thousand
> lines of output).
>


Seems the problem may be related to a huge global/pgstat.stat file.
Under 8.1.5 it was about 1 MB; now it's 90 MB in 8.2.3.

I ran strace for 60 seconds:

% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
95.71 1.119004 48652 23 rename
4.29 0.050128 0 508599 write
0.00 0.000019 0 249 22 poll
0.00 0.000000 0 23 open
0.00 0.000000 0 23 close
0.00 0.000000 0 34 getppid
0.00 0.000000 0 23 munmap
0.00 0.000000 0 23 setitimer
0.00 0.000000 0 23 22 sigreturn
0.00 0.000000 0 23 mmap2
0.00 0.000000 0 23 fstat64
0.00 0.000000 0 216 recv
------ ----------- ----------- --------- --------- ----------------
100.00 1.169151 509282 44 total

I attached an excerpt of the full strace with the many thousands of
write calls filtered.

--
Benjamin Minshall <minshall@intellicon.biz>
Senior Developer -- Intellicon, Inc.
http://www.intellicon.biz

..
..
..
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0T \274\355"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0/O\227\27\230"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 3363) = 3363
close(3) = 0
munmap(0xacdfc000, 4096) = 0
rename("global/pgstat.tmp", "global/pgstat.stat") = 0
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0 \0\0\301"..., 1000, 0) = 976
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 500}}, NULL) = 0
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0 \0\0\322"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0H\3\0\0:\204\30\0\f\0\0\0\1\0\0\0\0\0\0\0 \301\204"..., 1000, 0) = 840
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\1\0\0\0\0\0 \0\0\301"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\320\3\0\0:\204\30\0\16\0\0\0\0\0\0\0\0\0 \0\000"..., 1000, 0) = 976
poll([{fd=5, events=POLLIN|POLLERR, revents=POLLIN}], 1, 2000) = 1
recv(5, "\1\0\0\0\240\0\0\0:\204\30\0\2\0\0\0\0\0\0\0\0\0\ 0\0\303"..., 1000, 0) = 160
poll([{fd=5, events=POLLIN|POLLERR}], 1, 2000) = -1 EINTR (Interrupted system call)
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
getppid() = 22447
open("global/pgstat.tmp", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
fstat64(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xacdfc000
write(3, "\226\274\245\1D\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0\0\0"..., 4096) = 4096
write(3, "Z\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0\0\0"..., 4096) = 4096
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0\0\0"..., 4096) = 4096
..
..
..

Reply With Quote