Re: Hanging queries on dual CPU windows (Pgsql Performance forum, PostgreSQL category)
> > > Could it be they broke it when they did that????
> >
> > In theory, yes, but it still seems a bit far fetched :-(
>
> Well, I rolled back SP1 and am running my test again. Looking
> much better, hasn't locked up in 45mins now, whereas before
> it would lock up within 5mins.
>
> So I think they broke something.

Wow. I guess I was lucky that I didn't say it was impossible :-)

But what is really happening? What other thread is actually holding the
critical section at this point, causing us to block? The only place it
gets held is while looping over the signal queue, but it is released while
calling the signal function itself...

But they obviously *have* been messing with critical sections, so maybe
they accidentally changed something else as well...

What bothers me is that nobody else has reported this. It could be that
this was exposed by the changes to the signal handling done for 8.1, and
the people with this level of concurrency are either still on 8.0 or just
not on SP1 for their Windows boxes yet... Do you have any other software
installed on the machine that might possibly interfere in some way?

But let's have it run for a bit longer to confirm this does help. If so,
we could perhaps recode that part using a Mutex instead of a critical
section - since it's not a performance critical path, the difference
shouldn't be large. If I code up a patch for that, can you re-apply SP1
and test it? Or is this a production system you can't really touch?

//Magnus

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
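For readers following along, the locking pattern described above (lock held while scanning the signal queue, released around the handler call) can be sketched roughly as below. This is only an illustration, not the PostgreSQL source: every `demo_*` name is hypothetical, and a pthread mutex stands in for the lock so the sketch builds outside Windows. On Win32 the same shape would use either `EnterCriticalSection`/`LeaveCriticalSection` or a mutex handle with `WaitForSingleObject`/`ReleaseMutex`.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_SIGNAL_COUNT 32

typedef void (*demo_handler)(int signo);

/* Hypothetical state; a pthread mutex stands in for the critical section. */
static pthread_mutex_t demo_queue_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int demo_pending = 0;      /* bit i set => signal i is queued */
static demo_handler demo_handlers[DEMO_SIGNAL_COUNT];
static int demo_calls[DEMO_SIGNAL_COUNT];  /* per-signal handler call counts */

/* Register a handler; returns 0 on success, -1 if signo is out of range. */
int demo_set_handler(int signo, demo_handler fn)
{
    if (signo < 0 || signo >= DEMO_SIGNAL_COUNT)
        return -1;
    pthread_mutex_lock(&demo_queue_lock);
    demo_handlers[signo] = fn;
    pthread_mutex_unlock(&demo_queue_lock);
    return 0;
}

/* Queue a signal; safe to call from another thread or from a handler. */
int demo_queue_signal(int signo)
{
    if (signo < 0 || signo >= DEMO_SIGNAL_COUNT)
        return -1;
    pthread_mutex_lock(&demo_queue_lock);
    demo_pending |= 1u << signo;
    pthread_mutex_unlock(&demo_queue_lock);
    return 0;
}

/* Example handler: just counts the call. */
void demo_count_handler(int signo)
{
    demo_calls[signo]++;
}

/* Example handler: counts the call, then queues the next signal number.
 * This only works because the dispatcher drops the lock before calling us. */
void demo_requeue_handler(int signo)
{
    demo_calls[signo]++;
    demo_queue_signal(signo + 1);
}

/* Run handlers for all queued signals; returns how many handlers ran. */
int demo_dispatch_queued_signals(void)
{
    int called = 0;

    pthread_mutex_lock(&demo_queue_lock);
    while (demo_pending != 0)
    {
        for (int i = 0; i < DEMO_SIGNAL_COUNT; i++)
        {
            if (demo_pending & (1u << i))
            {
                demo_handler fn = demo_handlers[i];

                demo_pending &= ~(1u << i);   /* dequeue while still locked */
                if (fn != NULL)
                {
                    /* Release the lock for the duration of the handler. */
                    pthread_mutex_unlock(&demo_queue_lock);
                    fn(i);
                    called++;
                    pthread_mutex_lock(&demo_queue_lock);
                    break;  /* rescan: the queue may have changed meanwhile */
                }
            }
        }
    }
    pthread_mutex_unlock(&demo_queue_lock);
    return called;
}
```

With this shape the lock is only ever held for the short scan, so a thread blocking on it for long would suggest that something outside this code path is misbehaving. Swapping the lock type, as proposed above, would change only the lock and unlock calls.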
On Friday 10 March 2006 13:25, Magnus Hagander wrote:
> > > > Could it be they broke it when they did that????
> > >
> > > In theory, yes, but it still seems a bit far fetched :-(
> >
> > Well, I rolled back SP1 and am running my test again. Looking
> > much better, hasn't locked up in 45mins now, whereas before
> > it would lock up within 5mins.
> >
> > So I think they broke something.
>
> Wow. I guess I was lucky that I didn't say it was impossible :-)
>
> But what is really happening? What other thread is actually holding the
> critical section at this point, causing us to block? The only place it
> gets held is while looping over the signal queue, but it is released while
> calling the signal function itself...
>
> But they obviously *have* been messing with critical sections, so maybe
> they accidentally changed something else as well...
>
> What bothers me is that nobody else has reported this. It could be that
> this was exposed by the changes to the signal handling done for 8.1, and
> the people with this level of concurrency are either still on 8.0 or just
> not on SP1 for their Windows boxes yet... Do you have any other software
> installed on the machine that might possibly interfere in some way?

Just a JDK, JBoss, cygwin (running sshd), and a VNC Server. I don't think
that interferes.

> But let's have it run for a bit longer to confirm this does help.

I turned it off after 2.5hr. The longest I had to wait before, with less
load, was 1.45hr.

> If so,
> we could perhaps recode that part using a Mutex instead of a critical
> section - since it's not a performance critical path, the difference
> shouldn't be large. If I code up a patch for that, can you re-apply SP1
> and test it? Or is this a production system you can't really touch?

I can do whatever the hell I want with it, so if you could cook up a patch
that would be great.

As a BTW: I reinstalled SP1 and turned stats collection off. That also
seems to work, but is not really a solution since we want to use
autovacuuming.

> //Magnus

jan

--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------
On Friday 10 March 2006 14:27, Jan de Visser wrote:
> As a BTW: I reinstalled SP1 and turned stats collection off. That also
> seems to work, but is not really a solution since we want to use
> autovacuuming.

I lied. It hangs now. Just takes a lot longer...

jan