How often should I reboot Solaris and LynxOS (Sun Solaris Administration forum, Solaris Operating System category)
| Hi, Would someone please give me some pointers on a trend analysis on resource leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, I need statistics to determine how often I need to restart these machines to avoid unplanned failures. Thanks. ekl |
| |||
| On Thu, 24 Jul 2003 11:18:08 -0400, EKL <En-Kuang_Lung@raytheon.com>, in <QMSTa.3816$c6.3317@bos-service2.ext.raytheon.com> wrote: +> Would someone please give me some pointers on a trend analysis on resource +> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, +> I need statistics to determine how often I need to restart these machines to +> avoid unplanned failures. Thanks. Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen with uptimes on the order of 4-6 months. If they weren't connected to the commodity internet, and the power stayed on, and I wasn't so zealous in applying the recommended and security patches, I could probably get uptimes in the *years*. What you have to worry about are buggy applications doing bad things to your system resources. James -- Consulting Minister for Consultants, DNRC I can please only one person per day. Today is not your day. Tomorrow isn't looking good, either. I am BOFH. Resistance is futile. Your network will be assimilated. |
| |||
| > > +> Would someone please give me some pointers on a trend analysis on resource > +> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, > +> I need statistics to determine how often I need to restart these machines to > +> avoid unplanned failures. Thanks. > > Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen > with uptimes on the order of 4-6 months. If they weren't connected to > the commodity internet, and the power stayed on, and I wasn't so > zealous in applying the recommended and security patches, I could > probably get uptimes in the *years*. > > What you have to worry about are buggy applications doing bad things > to your system resources. > Thanks James. Even though you may not need to reboot for a long time, do you notice any performance degradation over time, or unexplained resource losses or usage? Thanks. |
| |||
| On Thu, 24 Jul 2003, EKL wrote: > Would someone please give me some pointers on a trend analysis on resource > leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, > I need statistics to determine how often I need to restart these machines to > avoid unplanned failures. Thanks. I don't know about LynxOS, but you reboot a Solaris machine when you add hardware that isn't hot swappable, or when you apply one or more patches that need the machine to be rebooted to take effect (e.g., the kernel jumbo patch). Depending on your paranoia level, you should probably look at applying the latest recommended patch cluster roughly quarterly. Rebooting Solaris machines "just because" is neither necessary nor desirable. It's a habit from people who look after Windoze machines. -- Rich Teer, SCNA, SCSA President, Rite Online Inc. Voice: +1 (250) 979-1638 URL: http://www.rite-online.net |
| |||
| In article <Y%STa.3818$c6.3326@bos-service2.ext.raytheon.com>, "EKL" <En-Kuang_Lung@raytheon.com> wrote: > > +> Would someone please give me some pointers on a trend analysis on resource > > +> leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, > > +> I need statistics to determine how often I need to restart these machines to > > +> avoid unplanned failures. Thanks. > > > > Ummm...you don't? I can't speak to LynxOS, but I've had Solaris boxen > > with uptimes on the order of 4-6 months. If they weren't connected to > > the commodity internet, and the power stayed on, and I wasn't so > > zealous in applying the recommended and security patches, I could > > probably get uptimes in the *years*. > > > > What you have to worry about are buggy applications doing bad things > > to your system resources. > > Thanks James. Even though you may not need to reboot for a long time, do > you notice any performance degradation over time, or unexplained resource > losses or usage? Thanks. That only happens when you let evil developers use your systems. They can write the worst stuff that screws up your filesystems and not clean up after themselves. Then they want hourly backups of all their files in case they screw up. If you just run without users, your servers can go for a _very long_ time without rebooting. If the applications and system are designed properly, the day-to-day procedures to run the system are established and maintained, and the environment is OK (e.g. datacenter with A/C, UPS, and backup generator), you shouldn't need to reboot your systems for anything less than a hardware failure. Obviously, that's not possible. You ask an overly-broad, marketing-type question without specifics of your situation. You must be responding to something a PHB asked you to find out. The "general rule" of UNIX is "don't reboot unless you have to (or you're lazy and don't want to figure out how to fix your problem)".
Sysadmins are forever quoting the longest "uptime" on their systems as a matter of pride. In common practice, scheduling downtime at least once a month gets the users used to the idea of a "downtime" and gives you headroom to plan projects that require an outage--hardware and software upgrades, testing, or whatever. If you can't do this, buy lots of hardware and build a fault-tolerant system. Bring your checkbook. -- DeeDee, don't press that button! DeeDee! NO! Dee... |
| |||
| EKL wrote: > Hi, > > Would someone please give me some pointers on a trend analysis on resource > leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, > I need statistics to determine how often I need to restart these machines to > avoid unplanned failures. Thanks. > > ekl > > > Dunno Solaris 9 specifically, but if there are no application leaks, uptime is determined more by the need to perform hardware and/or software maintenance/upgrades than by any inherent need. I have seen servers with uptimes in excess of a year and doubt that is a record....hopefully not starting a "longest uptime" subthread. |
| |||
| "Michael Vilain <vilain@spamcop.net>" wrote in news:news-0A1283.10123524072003@news.tdl.com: > If you just run without users Yup, it's them darned users that are the problem! |
| |||
| In article <QMSTa.3816$c6.3317@bos-service2.ext.raytheon.com>, En-Kuang_Lung@raytheon.com says... > Hi, > > Would someone please give me some pointers on a trend analysis on resource > leaks for Solaris 9 and LynxOS (or in general on UNIX machines). Basically, > I need statistics to determine how often I need to restart these machines to > avoid unplanned failures. Thanks. For Solaris: Never. (Unless there's some application software that's always running and leaking memory.) At my last job, we had an old Sparc 20 that had been running continuously for four years before we shut it down and replaced it with an Ultra 10. I mean literally continuously -- the "uptime" command said something like 1481 days. The Cisco router on the same subnet had been running continuously for even longer. 'Course, that Sparc 20 wasn't running Solaris 9, obviously. It was Solaris 2.5.1. I presume that Solaris 9 is just as stable as 2.5.1 is/was. |
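For the leaking-application case raised in this thread, one rough way to do the trend analysis the original poster asked about is to sample a process's memory footprint at intervals and fit a straight line through the samples. The sketch below is only an illustration under stated assumptions: the function names, the (hours, kilobytes) sample format, and the limit are all hypothetical, and collecting the samples (for example from periodic `ps` output) is left out. It also assumes the growth is roughly linear, which a real leak may not be.

```python
def leak_trend(samples):
    """Least-squares fit of memory against time.

    samples: list of (hours_since_start, resident_kb) pairs (hypothetical format).
    Returns (slope_kb_per_hour, intercept_kb).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    intercept = mean_m - slope * mean_t
    return slope, intercept


def hours_until_limit(samples, limit_kb):
    """Estimate hours from the last sample until the fitted line reaches limit_kb.

    Returns None when the fitted trend is flat or shrinking (no leak visible).
    """
    slope, intercept = leak_trend(samples)
    if slope <= 0:
        return None
    last_t = samples[-1][0]
    return max(0.0, (limit_kb - intercept) / slope - last_t)


if __name__ == "__main__":
    # Made-up readings: a process growing by about 100 KB per hour.
    readings = [(0, 1000), (1, 1100), (2, 1200), (3, 1300)]
    print(hours_until_limit(readings, limit_kb=2000))
```

A flat or negative slope is the "never reboot" case described above; a steady positive slope points at the application to fix or restart, rather than at the operating system.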
| |||
| In article <news-0A1283.10123524072003@news.tdl.com>, "Michael Vilain <vilain@spamcop.net>" says... > > The "general rule" of UNIX is "don't reboot unless you have to (or > you're lazy and don't want to figure out how to fix your problem)". Hey, you'd be surprised how often a reboot will fix a particularly pesky problem. Just ask your friendly neighborhood MCSE. > Sysadmins are forever quoting the longest "uptime" on their systems as a > matter of pride. Heh.. 1,481 days. Read it and weep. > In common practice, scheduling downtime at least once a month gets the > users used to the idea of a "downtime" and gives you headroom to plan > projects that require an outage--hardware and software upgrades, > testing, or whatever. And it gives you a chance to install the latest set of recommended and security patches, after which a reboot is usually a good idea. |
| ||||
| In article <Xns93C2A0A186616davetsccorpcom@199.45.49.11>, dave@tsc-corp.com says... > "Michael Vilain <vilain@spamcop.net>" wrote in news:news-0A1283.10123524072003@news.tdl.com: > > > If you just run without users > > Yup, it's them darned users that are the problem! Just "kill -SIGSTOP" all their processes, except for their shells. (Don't use SIGTERM or SIGKILL. That just pisses the users off. SIGSTOP is just as effective in terms of preventing the processes from hosing the system, but it's less psychologically traumatic to the users.) |
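The difference between stopping and killing a process can be seen in a small sketch. It assumes a POSIX system and uses a throwaway child process rather than anyone's real jobs; the function name `pause_and_resume` is hypothetical. SIGSTOP suspends the process without ending it, and SIGCONT lets it carry on.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume(seconds=30):
    """Suspend a throwaway child with SIGSTOP, confirm it stopped, then resume it.

    Returns True if the kernel reported the child as stopped.
    """
    # Child that just sleeps; stands in for a misbehaving user process.
    child = subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)  # suspend, do not terminate
        # WUNTRACED makes waitpid report stopped children as well as exited ones.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        was_stopped = os.WIFSTOPPED(status)
        os.kill(child.pid, signal.SIGCONT)  # let it run again
    finally:
        # Clean up the demo child so nothing is left behind.
        child.kill()
        child.wait()
    return was_stopped


if __name__ == "__main__":
    print(pause_and_resume(5))
```

Unlike SIGTERM, SIGSTOP cannot be caught or ignored by the target process, and a stopped process keeps its memory and open files, so this pauses a problem rather than freeing the resources it holds.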