This is a discussion on network card stops responding within the Sco Unix forums, part of the Unix Operating Systems category.
"Larry Rosenman" <ler@lerctr.org> wrote in message
news:bumspd$p23@library1.airnews.net...
>
> >> The computer has worked fine for a year.
> My standard next question:
>
> What changed?
>

Only adding and removing users on the server and, to my knowledge, 2 computers on the company-wide network at least two months prior to the first incident. The only computer having difficulty is the SCO server ... and a hardware reset of the server restores operation. But then, sysadmins do not always tell the truth ...

Ron
| |||
"Tony Lawrence" <apl@shell01.TheWorld.com> wrote in message
news:bun258$mmq$2@pcls4.std.com...
> Ronald J Marchand <rojomar@covad.net> wrote:
> >> - Not current driver (see
> >> ftp://ftp.sco.com/pub/openserver5/drivers/
> >>
> >Driver is the one as supplied on the 5.0.6 install disk.
> >
> Wrong driver. Get the right one from the ftp site.
>
> >The computer has worked fine for a year.
> >
> So??
>

So I find it strange that it takes a year to act up, and I have others that have not locked up. I will, however, take your advice and find out for sure.

Thanks to all
Ron
| |||
In article <e87d9$400fe29b$42a6716f$11135@msgid.meganewsservers.com>,
Ronald J Marchand <rojomar@covad.net> wrote:
>"Larry Rosenman" <ler@lerctr.org> wrote in message
>news:bumspd$p23@library1.airnews.net...
>>
>> The computer has worked fine for a year.
>> My standard next question:
>>
>> What changed?
>>
>
>Only adding and removing users on the server and to my knowledge 2 computers
>on the company wide network at least two months prior to the first incident.
>The only computer having difficulty is the SCO server ... and a hardware
>reset to the server restores operation. But then, sysadmins do not always
>tell the truth ...

I know that feeling. I'm not an OpenServer guy, so I'm at a loss here. Sorry....

LER

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
| |||
Ronald J Marchand wrote:
> "Tony Lawrence" <apl@shell01.TheWorld.com> wrote in message
> news:bun258$mmq$2@pcls4.std.com...
>
>>Ronald J Marchand <rojomar@covad.net> wrote:
>>
>>>>- Not current driver (see
>>>>ftp://ftp.sco.com/pub/openserver5/drivers/
>>>>
>>>
>>>Driver is the one as supplied on the 5.0.6 install disk.
>>
>>
>>Wrong driver. Get the right one from the ftp site.
>>
>>
>>>The computer has worked fine for a year.
>>
>>
>>So??
>>
>
> So I find it strange that it takes a year to act up and I have others that
> have not locked up. I will, however, take your advice and find out for
> sure.
>
> Thanks to all
> Ron
>

[ 2nd try - not sure if the 1st actually made it ]

Ron,

first of all, make sure the above system is patched with the latest/greatest patches available from SCO:

http://www.sco.com/support/ftplists/osr5list.html

Next, when the problem arises, are you able to stop/restart the network device by using the "nd(ADM)" command?

Did you notice anything unusual in /usr/adm/syslog?

What does "ndstat -l" report?

Best,
Roberto

--
Roberto Zini - Technical Support Manager - email:r.zini<AT>strhold.it
Technical Support Manager -- Strhold Evolution Division R.E. (ITALY)
---------------------------------------------------------------------
"Has anybody around here seen an aircraft carrier?"
(Pete "Maverick" Mitchell - Top Gun)
| |||
In article <e87d9$400fe29b$42a6716f$11135@msgid.meganewsservers.com>,
Ronald J Marchand <rojomar@covad.net> wrote:
>"Larry Rosenman" <ler@lerctr.org> wrote in message
>news:bumspd$p23@library1.airnews.net...
>>
>> The computer has worked fine for a year.
>> My standard next question:
>> What changed?

>Only adding and removing users on the server and to my knowledge
>2 computers on the company wide network at least two months prior
>to the first incident. The only computer having difficulty is
>the SCO server ... and a hardware reset to the server restores
>operation. But then, sysadmins do not always tell the truth ...

I've seen this when other equipment was added - two firewalls were being swapped hot - because one needed to tunnel to a protected site, and then they'd swap the other one back in so their remote user could get in. [Hardware was spec'ed by a company on the other side of the continent and the advice given was less than stellar]

In your first message you say it 'freezes' - but in actuality you have only lost the ability to communicate on the network.

Since the IP number is used only to get the MAC address, and the computers then communicate MAC to MAC, I found that in this instance the SCO was trying to find the MAC address associated with the IP.

I don't know the timeout on the system. I do know that on some big routers I had to flush the arp cache to relearn new MAC/address associations, as the default on those was 4 hours.

So try just stopping and restarting the TCP services. Also make sure that any/all routes are added after everything else. If routes are placed in the standard S85tcp script, I've found they have a tendency to disappear.

If stopping/restarting TCP/IP does not fix it, then consider the possibility that the NIC has become a bit flaky.

Bill

--
Bill Vermillion - bv @ wjv . com
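Editor's sketch: the stale-ARP scenario Bill describes can be illustrated with a toy cache. This is a minimal model, not how OpenServer or any router implements ARP. The class name `ArpCache`, the addresses, and the fake clock are all assumptions made for the example; only the 4-hour lifetime comes from Bill's router anecdote.

```python
import time


class ArpCache:
    """Toy IP-to-MAC cache with a fixed entry lifetime (illustration only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so the demo can fake the passage of time
        self.entries = {}    # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac):
        """Record (or overwrite) the MAC address seen for an IP."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if unknown or older than the TTL."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self.clock() - learned_at >= self.ttl_seconds:
            del self.entries[ip]
            return None
        return mac

    def flush(self):
        """Forget everything, forcing addresses to be relearned."""
        self.entries.clear()


# Hypothetical addresses (documentation ranges), not taken from the thread.
FIREWALL_IP = "192.0.2.1"
OLD_MAC = "00:00:5e:00:53:01"
NEW_MAC = "00:00:5e:00:53:02"

fake_now = [0.0]
cache = ArpCache(ttl_seconds=4 * 3600, clock=lambda: fake_now[0])

cache.learn(FIREWALL_IP, OLD_MAC)        # first firewall answers for the IP
fake_now[0] += 3600                      # an hour later the hardware is swapped
stale = cache.lookup(FIREWALL_IP)        # cache still hands out the old MAC
cache.flush()                            # the manual fix Bill mentions
after_flush = cache.lookup(FIREWALL_IP)  # nothing cached, must be relearned
cache.learn(FIREWALL_IP, NEW_MAC)
relearned = cache.lookup(FIREWALL_IP)
fake_now[0] += 4 * 3600                  # a full lifetime passes with no refresh
expired = cache.lookup(FIREWALL_IP)
```

In this model, traffic sent to `stale` goes to hardware that is no longer there, which from the sender's side looks like the network "freezing" until the entry expires or is flushed.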
| |||
My first reply to you bounced.

"Roberto Zini" <rob@robnothere.com> wrote in message
news:buovk7$uo$1@newsread.albacom.net...
> [ 2nd try - not sure if the 1st actually made it ]
>
> Ron,
>
> first of all make sure the above system is patched with the
> latest/greatest patches available from SCO:
>
> http://www.sco.com/support/ftplists/osr5list.html

It is fully patched.

>
> Next, when the problem arises, are you able to stop/restart the network
> device by using the "nd(ADM)" command ?
>

This is a command that I was not familiar with. I am now. Thanks

> Did you notice anything unusual under /usr/adm/syslog ?

Just this, but this was during the time of the shutdown (init 6):

Jan 20 15:41:09 dpp WARNING: NFS server dixie.uucp not responding, still trying
Jan 20 15:44:46 dpp lockd[553]: Unable to kill UDP server 554 : errno No such process
Jan 20 15:44:46 dpp lockd[553]: Unable to kill UDP server 554 : errno No such process
Jan 20 15:47:09 dpp syslogd: restart

>
> What does "ndstat -l" report ?
>

Currently, the only thing with a large number is this:

Underruns/Overruns 106

But I will attempt to capture it next time.

Could bad memory be at fault? The machine did issue a trap 0xE once.

Ron
| |||
On Wed, 21 Jan 2004 23:30:53 +0000 (UTC), Tony Lawrence
<apl@shell01.TheWorld.com> wrote:

[...]

>>Openserver becomes very unstable when attempting to
>>handle moderate volumes of network traffic (we experienced severe IO
>>errors and disk corruption when attempting to use OS as a mail relay
>>for an office of 23 people).
>
>Disk corruption blamed on a network card?

Yes I'm afraid so. Apparently there's a common problem with the OS not allocating IRQs correctly. I assume it's due to the lack of an in-house development team keeping their products up to date. I'm sure that poor old Bela does his best, but having to work in a closet since the rest of the company buildings are filled with lawyers and cold-callers threatening people over the phone cannot be the best way to write code.

Nice article in the Salt Lake Tribune today, BTW, detailing the history of SCO and its current disgusting, illegal, greed-driven antics.

--
FyRE
< "War: The way Americans learn geography" >
| |||
Ronald J Marchand wrote:
> My first reply to you bounced.
>

OK, no problem.

> "Roberto Zini" <rob@robnothere.com> wrote in message
> news:buovk7$uo$1@newsread.albacom.net...
>
>>[ 2nd try - not sure if the 1st actually made it ]
>>
>>Ron,
>>
>>first of all make sure the above system is patched with the
>>latest/greatest patches available from SCO:
>>
>> http://www.sco.com/support/ftplists/osr5list.html
>
>
> It is fully patched.

Good.

>
>>Next, when the problem arises, are you able to stop/restart the network
>>device by using the "nd(ADM)" command ?
>>
>
> This is a command that I was not familiar with. I am now. Thanks
>

You're welcome.

>
>>Did you notice anything unusual under /usr/adm/syslog ?
>
>
> Just this, but this was during the time of the shutdown (init 6)
> Jan 20 15:41:09 dpp WARNING: NFS server dixie.uucp not responding, still trying
> Jan 20 15:44:46 dpp lockd[553]: Unable to kill UDP server 554 : errno No such process
> Jan 20 15:44:46 dpp lockd[553]: Unable to kill UDP server 554 : errno No such process
> Jan 20 15:47:09 dpp syslogd: restart
>

Uhm ... the above WARNING message is likely to appear when you change the hostname under SCO OS5; in fact, when you ONLY change it, you end up having 2 names (in /etc/hosts) which point to the same IP address. Since the resolver routines search the file "linearly" (i.e., they stop at the first matching line starting from the beginning of the file), some services (such as NFS) go berserk.

Please make sure your /etc/hosts file reflects the current configuration of your server (even if I'm not sure this is the cause of your problems - assuming this is __NOT__ an NFS client/server. If in doubt, disable NFS services on this box).

>>What does "ndstat -l" report ?
>>
>
> Currently, the only thing with a large number is this:
> Underruns/Overruns 106
> But I will attempt to capture it next time.
>
> Could bad memory be at fault? the machine did issue a trap 0xE once.
>

Can't tell. Sometimes, a 0xE panic could be caused by a "spike" on the power line. If you get repeated panic messages, then you should start investigating by analyzing the memory dump on the swap filesystem.

What about the "arp -a" command (even executed on remote boxes)?

Best,
Roberto

--
Roberto Zini - Technical Support Manager - email:r.zini<AT>strhold.it
Technical Support Manager -- Strhold Evolution Division R.E. (ITALY)
---------------------------------------------------------------------
"Has anybody around here seen an aircraft carrier?"
(Pete "Maverick" Mitchell - Top Gun)
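Editor's sketch: Roberto's point about the resolver reading /etc/hosts "linearly" can be shown with a small first-match lookup. This is a simplified model of that behaviour, not the actual OpenServer resolver code. The function names, host names, and addresses below are hypothetical.

```python
def parse_hosts(text):
    """Parse hosts-file text into a list of (address, [names]) in file order."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        entries.append((fields[0], fields[1:]))
    return entries


def name_for_address(entries, address):
    """Return the first name on the first line matching the address."""
    for ip, names in entries:
        if ip == address and names:
            return names[0]
    return None


def address_for_name(entries, name):
    """Return the address on the first line that lists the name."""
    for ip, names in entries:
        if name in names:
            return ip
    return None


# Hypothetical file after a rename where the old line was left in place.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
192.0.2.10  oldname oldname.example.com   # stale line from before the rename
192.0.2.10  newname newname.example.com
"""

entries = parse_hosts(SAMPLE_HOSTS)
reverse = name_for_address(entries, "192.0.2.10")   # the stale name wins
forward = address_for_name(entries, "newname")
```

Here the machine believes it is `newname`, but anything that maps its address back to a name gets `oldname`, which is the kind of mismatch Roberto suggests can upset services such as NFS.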
| |||
"Roberto Zini" <rob@robnothere.com> wrote in message
news:buqn6n$c7h$1@newsread.albacom.net...
<<snip>>
> >>Next, when the problem arises, are you able to stop/restart the network
> >>device by using the "nd(ADM)" command ?
> >>

The tcp traffic died again while I was there and I did the following:

# nd stop
# nd start
dlpid: Unable to open network adapter driver (/dev/mdi/e3H0)
dlpid: No such file or directory.
#

Currently the directory contains:

# pwd
/dev/mdi
# l
total 0
crw------- 1 root root 119, 0 Dec 3 2002 e3H0
crw------- 1 root root 119, 1 Dec 3 2002 e3H1
crw------- 1 root root 119, 2 Dec 3 2002 e3H2
crw------- 1 root root 119, 3 Dec 3 2002 e3H3
#

Several hours later the machine did issue a panic trap 0xE.

Any additional suggestions appreciated.

Ron
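Editor's sketch: Ron's listing shows the device nodes present even though dlpid reports "No such file or directory". A check like the one below separates "the node is missing" from "the node exists but something else is wrong". It is a generic Unix illustration, with the function name `describe_device_node` made up for the example, and it assumes a Python interpreter is available, which may well not be the case on an OpenServer 5.0.6 box (the `l` listing above gives the same information there).

```python
import os
import stat


def describe_device_node(path):
    """Report whether a path is a character device, and its major/minor numbers."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "permission denied"
    if stat.S_ISCHR(info.st_mode):
        major = os.major(info.st_rdev)
        minor = os.minor(info.st_rdev)
        return "character device %d, %d" % (major, minor)
    return "exists but is not a character device"


# The path Ron's error message names; on most other machines this prints "missing".
print(describe_device_node("/dev/mdi/e3H0"))
```

If the node is reported as a character device with the expected major and minor numbers, the failure to open it lies below the filesystem entry, for example in the driver or the adapter itself, rather than in a deleted file.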
| ||||
On Sat, 24 Jan 2004 04:40:19 -0600, "Ronald J Marchand"
<rojomar@covad.net> wrote:

>"Roberto Zini" <rob@robnothere.com> wrote in message
>news:buqn6n$c7h$1@newsread.albacom.net...
><<snip>>
>> >>Next, when the problem arises, are you able to stop/restart the network
>> >>device by using the "nd(ADM)" command ?
>> >>
>
>The tcp traffic died again while I was there and I did the following:
># nd stop
># nd start
>dlpid: Unable to open network adapter driver (/dev/mdi/e3H0)
>dlpid: No such file or directory.
>#
>currently the directory contains:
># pwd
>/dev/mdi
>

Check the standard problems of SCO.

1.) Reduce swap space to 2GB if it's more.
2.) Check IRQs for devices; avoid IRQ sharing.

Stefan
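Editor's sketch: Stefan's second check amounts to looking for any IRQ number claimed by more than one device. The snippet below does that for a hand-entered list. The device names and IRQ numbers are invented for the example, and how the real assignments are obtained on a given system is left open.

```python
from collections import defaultdict


def shared_irqs(assignments):
    """Return {irq: [devices]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    for device, irq in assignments:
        by_irq[irq].append(device)
    return {irq: devices for irq, devices in by_irq.items() if len(devices) > 1}


# Hypothetical assignments, typed in by hand from whatever the system reports.
ASSIGNMENTS = [
    ("serial0", 4),
    ("net0", 11),
    ("scsi0", 11),
    ("ide0", 14),
]

shared = shared_irqs(ASSIGNMENTS)
for irq, devices in sorted(shared.items()):
    print("IRQ %d is shared by: %s" % (irq, ", ".join(devices)))
```

With the sample data this flags IRQ 11, shared by the network and SCSI entries, which is the kind of overlap Stefan advises removing.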