RE: DBMS-servers die after E_CL2529_CS_SEM_OWNER_DEAD messages
Hello Marty,
When we implemented the installation on our current production machine, we started with 58 servers and 58 databases (besides 1 server for iidbdb and imadb). This was a one-to-one migration from Ingres 6.4 databases on 4 VAX-VMS machines.
This worked well. But after some months we had to move 11 other databases (4 of them quite big) from another Alpha Server to our current production machine, so we decided to set up 23 servers for the 69 databases, which has since been reduced to 14 servers for the 69 databases . . .
Our current production machine has 3 processors and 7168 MB of memory. Normally about 60% of the memory is in use. (We need room for still more databases.)
The dbms-servers will accept between 200 connections (for the smaller databases) and 400 connections (for the bigger ones).
The system is set up for 2050 user processes, of which about 50% are normally in use.
The dbms-servers take about 20 MB of memory, but can take more if enough memory is available.
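As a rough cross-check of the figures above, here is a minimal Python sketch of the arithmetic. It assumes the 20 MB figure is per dbms-server and that all 14 servers sit at that baseline; the function and key names are only illustrative and are not part of Ingres.

```python
# Figures quoted in this message; everything else is an assumption.
SERVERS = 14                  # dbms-servers for the 69 databases
MIN_CONN, MAX_CONN = 200, 400 # connection limit per server (small / big databases)
USER_PROCESS_LIMIT = 2050     # configured user processes
IN_USE_FRACTION = 0.5         # "about 50%" normally in use
SERVER_MEM_MB = 20            # assumed to be per server, at baseline
TOTAL_MEM_MB = 7168


def capacity_summary():
    """Return a dict with connection and memory headroom estimates."""
    return {
        # Bounds if every server had the smallest or the largest limit.
        "min_connection_slots": SERVERS * MIN_CONN,
        "max_connection_slots": SERVERS * MAX_CONN,
        # User processes typically active at any time.
        "typical_user_processes": int(USER_PROCESS_LIMIT * IN_USE_FRACTION),
        # Baseline memory taken by all dbms-servers together.
        "baseline_server_mem_mb": SERVERS * SERVER_MEM_MB,
        "baseline_server_mem_pct": 100.0 * SERVERS * SERVER_MEM_MB / TOTAL_MEM_MB,
    }


if __name__ == "__main__":
    for name, value in capacity_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the servers together offer more connection slots than the configured number of user processes, and their baseline memory is a small share of the machine.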
Greetings
Willem
-----Original Message-----
From: Martin Bowes [mailto:bowes@bucket.its.unimelb.edu.au]
Sent: Monday, 5 April 2004 3:11
To: Zande, van de Willem
Cc: ingres newsgroup
Subject: Re: DBMS-servers die after E_CL2529_CS_SEM_OWNER_DEAD messages
Hi Willem,
23 Servers! Reduced to 15!!!
Exactly how much memory do you have in this machine? How many CPUs?
How big (in memory terms) is each server?
How many connected/active sessions in each?
I am agog!
Marty
>
> Hello Marty
>
> Thanks for your suggestion. I asked our OS-experts to take a look at HP case ah667388.
> Besides that, CA gave us information on how to enable evidence sets.
> This will generate information when a specified error code occurs, in our case the
> E_CL2529_CS_SEM_OWNER_DEAD error.
> Hopefully this will give useful information.
>
> I also decreased the number of dbms-servers from 23 to 15.
> This reduced the mutex-messages in the errlog.
> I had thousands of them every day (sometimes even more than 1,000,000) and now fewer than 100...
>
> Greetings,
> Willem
>
> -----Original Message-----
> From: Martin Bowes [mailto:bowes@bucket.its.unimelb.edu.au]
> Sent: Wednesday, 31 March 2004 0:56
> To: Zande, van de Willem
> Cc: ingres newsgroup
> Subject: Re: DBMS-servers die after E_CL2529_CS_SEM_OWNER_DEAD messages
>
>
> >
> > Hello Marty,
> >
> > My first reaction was that this problem was caused by the combination of OS and hardware.
> > Previously we had our production system on an Alpha Server GS40. We never had problems like this.
> > About a month after we moved to an HP Alpha Server ES40 (same OS version as on the GS40) in February, the problem started.
> > But our hardware- and VMS-engineers don't think the OS or the hardware is the problem.
>
> They Never do!
>
> > There are logging facilities on VMS. During the last crash (FRI MAR 26) we made a crashdump and sent it to HP. I hope this will give us some more information.
>
> If it's of any use to you, here are the case numbers we had when we had
> problems with DUNIX 5.1b. It might be worth waving these under your
> engineers' noses.
>
> HP case ah667388 (new id ah697000) : System crash/kernel panic
>
> Marty
> >
> > Regards
> > Willem
> >
> > -----Original Message-----
> > From: Martin Bowes [mailto:bowes@bucket.its.unimelb.edu.au]
> > Sent: Monday, 29 March 2004 2:48
> > To: Zande, van de Willem
> > Cc: ingres newsgroup
> > Subject: Re: DBMS-servers die after E_CL2529_CS_SEM_OWNER_DEAD messages
> >
> >
> > Hi Willem,
> > >
> > > Hello Martin
> > >
> > > Thanks for your reply.
> > > Logfile was 0%, no force abort or logfull.
> >
> > Cool!
> >
> > > No messages like E_DM9815_ARCH_SHUTDOWN.
> > > Last message in the archiver log ACP: Archive Cycle Completed Successfully. No alarming messages, just normal ones.
> >
> > That's a worry!
> > It really seems to have lost touch with the archiver, which could
> > explain the LG_MUTEX errors as well.
> >
> > This sounds more and more like a problem in the OS causing Ingres to go
> > belly up. I've seen situations in UNIX (axp.osf DUNIX 5.1b) where the
> > installation would collapse for no good reason due to kernel problems.
> >
> > Have you recently had an OS upgrade or a patch to the OS installed?
> > I'm not familiar with VMS, but in UNIX there are many logging features
> > that can be turned on - at different debugging levels - to spill the guts
> > on system issues. It might be worthwhile to talk to your systems people
> > about this and get as much turned on as possible!
> >
> > Marty
> >
> > > No startup in the archiver log, just one archiver running. . . .
> > > We even had crashes without any activity: no users, no batch jobs. The installation seems to crash at random.
> > > A very weird problem.
> > >
> > > Regards
> > > Willem
> > >
> > > -----Original Message-----
> > > From: Martin Bowes [mailto:bowes@bucket.its.unimelb.edu.au]
> > > Sent: Friday, 26 March 2004 1:25
> > > To: Zande, van de Willem
> > > Cc: ingres newsgroup
> > > Subject: Re: DBMS-servers die after E_CL2529_CS_SEM_OWNER_DEAD messages
> > >
> > >
> > > Hi Willem,
> > >
> > > How full was the log file at the time of the problems - had you gone into
> > > a Log Full or Force Abort case?
> > >
> > > What was in the errlog.log at the time of the shutdown. Did you see a
> > > message like:
> > > ::[II_ACP , 0000000140298080]: Thu Nov 27 18:10:09 2003 E_DM9815_ARCH_SHUTDOWN Archiver was told to shut down.
> > > What was in the archiver log at this time?
> > >
> > > Another possibility...
> > > I suspect that you may have had two Archivers running at once in the same
> > > installation. Do you have any process monitoring records that may show
> > > what ingres was running just before your problems started?
> > >
> > > Have you checked in the Archiver log file for any information? BTW. Starts
> > > and stops of the archiver are recorded there. You might like to trawl that
> > > to see if a startup occurred just before the problems.
> > >
> > > Martin Bowes
> > > >
> > > > We have Ingres Version II 2.0/0308 (axm.vms/00) Patch 9771 on OpenVMS
> > > > V7.2-1H1
> > > > on a 3 cpu AlphaServer ES40.
> > > > Over the last few weeks we have had a lot of crashes of our Ingres
> > > > installation. The situation after all crashes (except one or two) as
> > > > seen on the system was:
> > > > 1. Only the II_GCN, II_IUSV_xxx, DMFACP and the II_GCC_xx processes
> > > > were present. All the dbms-servers had died by themselves.
> > > > 2. I could bring down the installation with INGSTOP without errors,
> > > > but after INGSTOP terminated the DMFACP process was still there.
> > > > 3. Inspecting the DMFACP process with SHOW PROCESS /CONTINUOUS showed no
> > > > activity at all, so I think DMFACP lost its connection with Ingres
> > > > during the crash. Maybe the hanging DMFACP even caused the crash.
> > > > 4. After deleting DMFACP with STOP/ID I could successfully restart the
> > > > installation with INGSTART.
> > > > 5. In the errorlog I saw (after lots of EV_SCB events and "LG LGD
> > > > status mutex" and " LG Local LFB curr mutex" messages):
> > > >
> > > > E_DMA42E_LG_MUTEX
> > > > E_DM014A_CHECK_DEAD
> > > > E_SC0322_CHECK_DEAD_EXIT
> > > > E_SC0241_VITAL_TASK_FAILURE
> > > > E_SC0127_SERVER_TERMINATE
> > > > E_PS0501_SESSION_OPEN
> > > > E_SC0127_SERVER_TERMINATE
> > > > I think these were caused by the dying DBMS-servers, because after that
> > > > the E_GC0139_GCN_NO_DBMS messages were generated, followed by the
> > > > E_DM1051_JSP_NO_INSTALL message.
> > > > After running INGSTOP I did see the "E_GC2002_SHUTDOWN
> > > > Communication Server normal shutdown" and
> > > > "E_GC0152_GCN_SHUTDOWN Name Server normal shutdown" messages but no
> > > > sign of the DMFACP process.
> > > >
> > > > Anyone any idea?
> > > > Of course we had several contacts with CA technical support, but
> > > > without any result so far.
> > > > Thanks,
> > > > Willem
> > > >