Curious pings on SCO 5.0.4/6 (Sco Unix forums, Unix Operating Systems category)

Hello,

sometimes we have had very slow network connections on many SCO PCs running 5.0.4 and 5.0.6.

PING kasseob4 (10.22.136.54): 56 data bytes
64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=4 ttl=62 time=2240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=5 ttl=62 time=1240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=6 ttl=62 time=240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=7 ttl=62 time=2400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=8 ttl=62 time=1400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=9 ttl=62 time=400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=13 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=10 ttl=62 time=3080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=11 ttl=62 time=2080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=12 ttl=62 time=1090 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=14 ttl=62 time=820 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=21 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=15 ttl=62 time=6110 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=16 ttl=62 time=5110 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=17 ttl=62 time=4120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=18 ttl=62 time=3120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=19 ttl=62 time=2120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=20 ttl=62 time=1120 ms

A PC with the same hardware, same OS, and same NIC works fine when connected to the same hub.
But sometimes another PC that has worked fine for a few weeks shows this error.
Solution so far: reboot.
The ping times look like a sine curve, running in a loop.
The RS 50x is installed.
NIC: SMC EtherPower II Driver (ver 2.0.5)
HW: SMC EtherPower II 9432BFTX 10/100Mbps - PCI Bus# 0,Device# 4,Funct

Any ideas?

Stefan

In article <14biqvckqcobdh1h3mis4undq2tjhof16o@4ax.com>,
Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:

>Hello,
>
>sometimes we had on many SCO PC's with 5.0.4 and 5.0.6 very slow
>network connection.
>
<snip>
>
>A PC, same HW, same OS, same NIC works fine connected at the same Hub.
>But sometimes another PC has this error which works fine a few weeks.
>Solution: Reboot
>The ping looks like a sine curve, running in a loop.
>The RS 50x is installed.
>Nic: SMC EtherPower II Driver (ver 2.0.5)
> HW SMC EtherPower II 9432BFTX 10/100Mbps - PCI Bus# 0,Device#
>4,Funct

>Any ideas ?

Some auto-negotiation could be failing. And if something goes into FDX mode, the hubs are the culprit. Switches are cheap enough to toss all the hubs into the trash.

As to the large times going down to low times, that is typical of not being able to make the connection. When it opens up, the time for the first packet sent is high, and then each succeeding one is lower, as they are all answered sequentially in real time.

Those times are very typical of an intermittent connection - and also of something doing HDX on an FDX link and generating collisions - which FDX doesn't have.

You will be best served by fixing the port speeds on all NICs.

-- 
Bill Vermillion - bv @ wjv . com

Bill Vermillion wrote:

> In article <14biqvckqcobdh1h3mis4undq2tjhof16o@4ax.com>,
> Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>
> >sometimes we had on many SCO PC's with 5.0.4 and 5.0.6 very slow
> >network connection.
>
> <snip>
>
> >Any ideas ?
>
> Some auto-negotiation could be failing. And if something goes into
> FDX mode, the hubs are the culprit. Switches are cheap enough to
> toss all the hubs into the trash.
>
> As to the large times going down to low times, that is typical of
> not being able to make the connection. When it opens up, the time
> for the first packet sent is high, and then each succeeding one is
> lower, as they are all answered sequentially in real time.
>
> Those times are very typical of an intermittent connection - and
> also of something doing HDX on an FDX link and generating
> collisions - which FDX doesn't have.
>
> You will be best served by fixing the port speeds on all NICs.

Actually, those ping times are typical of DNS lookup failures. Or not failures, but timeouts.

ping does its sending semi-autonomously, in the background, while it processes received packets in the foreground. It receives a packet, then does a reverse DNS lookup to convert its IP address to a name. If that reverse lookup takes 5 seconds, no other received packets are processed during that time, but more packets are still _sent_ once a second by the background processing. Then the DNS lookup completes. ping reports the correct time for that first packet. The second packet was sent early in the DNS wait, but didn't get read until much later (after the DNS lookup completed), so its round-trip time is misreported. Each subsequent packet looks like it took 1 second less, because each was _sent_ one second later than the previous one.

Now, the _cause_ of the DNS timeouts might be something like what you're talking about...

>Bela<

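The mechanism Bela describes can be sketched as a toy model. This is not OSR5 ping's source code; the function name `reported_rtts_ms`, the single 3-second lookup stall, and the 40 ms true round trip are all assumptions chosen for illustration. The model assumes one reader that handles replies in arrival order and is blocked while a lookup is outstanding.

```python
def reported_rtts_ms(lookup_delays_ms, interval_ms=1000, true_rtt_ms=40):
    """Toy model: one probe is sent every interval_ms; each reply arrives
    true_rtt_ms later, but the reader only notices it once it is no longer
    blocked in the (hypothetical) reverse lookup of an earlier reply."""
    reader_free_at = 0          # time at which the reader can read again
    reported = []
    for seq, lookup_ms in enumerate(lookup_delays_ms):
        sent = seq * interval_ms
        arrived = sent + true_rtt_ms
        read_at = max(arrived, reader_free_at)   # reply waits if reader is busy
        reported.append(read_at - sent)          # what ping would print
        reader_free_at = read_at + lookup_ms     # reader blocks for the lookup
    return reported


if __name__ == "__main__":
    # Only the lookup after the first reply stalls (3 s); the rest are instant.
    print(reported_rtts_ms([3000, 0, 0, 0, 0, 0]))
    # [40, 2040, 1040, 40, 40, 40]
```

In this model the replies are still reported in sequence order, and the first reply of each stall shows the true 40 ms; only the replies that waited behind the lookup show inflated times that step down by one send interval.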
In article <20031105193824.GT14056@sco.com>,
Bela Lubkin <belal@sco.com> wrote:
>Bill Vermillion wrote:

<snip>

>Actually, those ping times are typical of DNS lookup failures. Or not
>failures, but timeouts.

I'd agree if the long numbers settled down to the lower numbers and stayed there - e.g. going from 3000 down to 40 - but going from 3080 to 2080 to 1090 to 40 to 2240 to 1240 to 240 doesn't fit with any DNS behavior I've seen. Once you have the DNS info, all the times should settle down to a number that is similar except for delays.

Also the packets are coming back out of sequence.

I see packet order of 3,0,1,2,4,5,6,7,8,9,13,10,11,12,14,21 ...

That surely does not indicate DNS to me. If I'm overlooking something obvious in DNS please do let me know.

>ping does its sending semi-autonomously, in the background,
>while it processes received packets in the foreground. It
>receives a packet, does a reverse DNS lookup to convert its
>IP address to a name. If that reverse lookup takes 5 seconds,
>no other received packets are processed during that time, but
>more packets are still _sent_ once a second by the background
>processing. Then the DNS lookup completes. ping reports the
>correct time for that first packet. The second packet was sent
>early in the DNS wait, but didn't get read until much later
>(after the DNS lookup completed), so its round-trip time is
>misreported. Each subsequent packet looks like it took 1 second
>less, because each was _sent_ one second later than the previous
>one.

And then the packet time would remain relatively even after the huge numbers had counted down. I didn't explain the large numbers going to small numbers as well as you did.

>Now, the _cause_ of the DNS timeouts might be something like
>what you're talking about...

Looking back at those sequence numbers and packets not being returned in order, do you still feel that way?

It's almost as if some packets are taking a different route back. It definitely is screwy.

Bill
-- 
Bill Vermillion - bv @ wjv . com

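The out-of-sequence observation can be checked mechanically. The sketch below is only an illustration: `SEQ_ORDER` is the list of icmp_seq values in the order they were printed in the original post, and `overtaken` is a hypothetical helper name. It lists every reply that was printed after a reply with a higher sequence number.

```python
# icmp_seq values in the order the replies were printed in the original post
SEQ_ORDER = [3, 0, 1, 2, 4, 5, 6, 7, 8, 9, 13, 10, 11, 12,
             14, 21, 15, 16, 17, 18, 19, 20]


def overtaken(order):
    """Return the sequence numbers that were reported after a higher one."""
    highest_seen = -1
    late = []
    for seq in order:
        if seq < highest_seen:
            late.append(seq)      # a later-sent probe was reported first
        else:
            highest_seen = seq
    return late


if __name__ == "__main__":
    print(overtaken(SEQ_ORDER))
    # [0, 1, 2, 10, 11, 12, 15, 16, 17, 18, 19, 20]
```

More than half of the 22 replies were reported behind a later-sent probe, which is the pattern being questioned here.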
Bill Vermillion wrote:

[regarding the ping output in the original post]

<snip>

Bela>> Actually, those ping times are typical of DNS lookup failures. Or not
Bela>> failures, but timeouts.

Bill> I'd agree if the long numbers settled down to the lower numbers
Bill> and stayed there - e.g. going from 3000 down to 40 - but going
Bill> from 3080 to 2080 to 1090 to 40 to 2240 to 1240 to 240 doesn't
Bill> fit with any DNS behavior I've seen. Once you have the DNS info,
Bill> all the times should settle down to a number that is similar
Bill> except for delays.

For whatever reason (I'll speculate in a moment), OSR5 `ping` does a reverse DNS lookup of _every_ packet it receives. It doesn't try to cache IP-to-name information. This is probably so that if you had a long-running ping and one day someone changed that address's name, ping would suddenly start reporting the new name. This _could_ have been implemented with a cache, some knowledge of DNS record timeouts, etc., but it wasn't.

The data above is _almost_ characteristic of OpenServer ping's handling of DNS timeouts. But after a closer look I think I agree that something different is going on.

Look at packets number 0 1 2 3. Because ping is sending them monotonically at 1 second intervals, they were sent at times like 1000000.23, 1000001.23, 1000002.23, 1000003.23. Now, if they had been received in sequence, here's what I would believe had happened: the first packet came back 40ms after it was sent. ping read the packet, noted that interval, did an RDNS lookup. The lookup took several seconds. While it was waiting, its background thread continued to send more pings, and their replies also came back in ~40ms each. But ping didn't read them until its foreground reader thread came back from the RDNS lookup, so _as far as it could tell_ they had taken much longer. The reply to the packet sent at 1000000.23 was received at 1000000.27, 40ms later, and read immediately. The reply to the 1000001.23 packet was received at 1000001.27, but ping didn't _read_ it until 1000004.30, so it reported that as a ~3-second turnaround.

But:

Bill> Also the packets are coming back out of sequence.
Bill>
Bill> I see packet order of 3,0,1,2,4,5,6,7,8,9,13,10,11,12,14,21 ...
Bill>
Bill> That surely does not indicate DNS to me. If I'm overlooking
Bill> something obvious in DNS please do let me know.

I think you're right, because of the out-of-order receipt. That means that there was blockage somewhere along the way. Some router between the two machines was holding either the outgoing packets or the replies -- _not_ losing them, just holding them and eventually letting them all fly at once. During this holding period they got out of order (which is fairly normal; routers do not guarantee in-order delivery). When ping finally received them back, it reported them as having taken various times about 1.0 second apart, because they all arrived at the same time but were _sent_ 1 second apart.

<snip>

Bela>> Now, the _cause_ of the DNS timeouts might be something like
Bela>> what you're talking about...

Bill> Looking back at those sequence numbers and packets not being
Bill> returned in order, do you still feel that way?
Bill>
Bill> It's almost as if some packets are taking a different route back.
Bill> It definitely is screwy.

A router along the way is going into a mode where it collects but does not forward packets, then waking back up and forwarding several seconds' worth of collected packets. The down time is plausible for e.g. an ISDN link to go through the stages of: link drops; router notices link is down; router redials; link comes back up.

>Bela<

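The "all arrived at the same time but were sent 1 second apart" argument can be worked through on the posted numbers. The sketch below is an illustration under stated assumptions: probes are taken to be sent exactly 1000 ms apart starting at time 0, so a reply was read at roughly `seq * 1000 + time`. The names `PING_LINES` and `arrival_bursts` and the 100 ms grouping tolerance are hypothetical. The arithmetic cannot tell whether replies were delayed in the network or simply read late by ping; it only shows how the read times cluster.

```python
# (icmp_seq, reported time in ms), in the order printed in the original post
PING_LINES = [
    (3, 40), (0, 3080), (1, 2080), (2, 1090), (4, 2240), (5, 1240),
    (6, 240), (7, 2400), (8, 1400), (9, 400), (13, 40), (10, 3080),
    (11, 2080), (12, 1090), (14, 820), (21, 40), (15, 6110), (16, 5110),
    (17, 4120), (18, 3120), (19, 2120), (20, 1120),
]


def arrival_bursts(lines, interval_ms=1000, tolerance_ms=100):
    """Group consecutive replies whose estimated read time
    (send time + reported RTT) falls within tolerance_ms of the
    first reply in the current group."""
    bursts = []
    for seq, rtt_ms in lines:
        arrival = seq * interval_ms + rtt_ms   # ms after the first probe
        if bursts and abs(arrival - bursts[-1]["arrival_ms"]) <= tolerance_ms:
            bursts[-1]["seqs"].append(seq)
        else:
            bursts.append({"arrival_ms": arrival, "seqs": [seq]})
    return bursts


if __name__ == "__main__":
    for burst in arrival_bursts(PING_LINES):
        print(burst["arrival_ms"], burst["seqs"])
```

Under these assumptions the 22 replies collapse into six bursts, read at about 3.0 s, 6.2 s, 9.4 s, 13.0 s, 14.8 s and 21.0 s after the first probe. In three of the bursts the most recently sent probe is reported first with a time of 40 ms.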
On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion) wrote:

>You will be best served by fixing the port speeds on all NICs.

How can I see from a remote machine whether it's half or full duplex?

Device       MAC address in use    Factory MAC Address
------       ------------------    -------------------
/dev/net1    00:04:e2:0a:84:10     00:04:e2:0a:84:10

Multicast address table
-----------------------
01:00:5e:00:00:01

FRAMES
      Unicast     Multicast  Broadcast  Error   Octets       Queue Length
      ----------  ---------  ---------  ------  -----------  ------------
In:   18597       0          4481       0       2725265      0
Out:  18692       0          2          0       3830565      0

DLPI Module Info: 2 SAPs open, 18 SAPs maximum
                  473 frames received destined for an unbound SAP

MAC Driver Info: Media_type: Ethernet
                 Min_SDU: 14, Max_SDU: 1514, Address length: 6
                 Interface speed: 100 Mbits/sec

DLPI Restarts Info: Last queue size: 0
                    Last send time: 5352483
                    Restart in progress: 0

Interface Version: MDI 100

ETHERNET SPECIFIC STATISTICS

Collision Table - The number of frames successfully transmitted,
but involved in at least one collision:

                 Frames                     Frames
                 -------                    -------
1 collision      0          9 collisions    0
2 collisions     0          10 collisions   0
3 collisions     0          11 collisions   0
4 collisions     0          12 collisions   0
5 collisions     0          13 collisions   0
6 collisions     0          14 collisions   0
7 collisions     0          15 collisions   0
8 collisions     0          16 collisions   0

0 collisions = Switch ?

Stefan

In article <20031106003805.GW14056@sco.com>,
Bela Lubkin <belal@sco.com> wrote:
>Bill Vermillion wrote:

<snip>

>For whatever reason (I'll speculate in a moment), OSR5 `ping` does a
>reverse DNS lookup of _every_ packet it receives. It doesn't try to
>cache IP-to-name information. This is probably so that if you had a
>long-running ping and one day someone changed that address's name, ping
>would suddenly start reporting the new name. This _could_ have been
>implemented with a cache, some knowledge of DNS record timeouts, etc.,
>but it wasn't.

That seems a rather bizarre way to do things. That's just from my way of thinking about things. Most things based on names look up an IP for a given name. There is a chance that someone would change the name on an IP, but that would typically be seen only on a local network, would it not? Anything outside is going to rely on someone else's DNS, and when the name/IP resolution is made upstream, even if it has to go to the root servers to get that IP initially, the next level up will cache that name/IP resolution as long as the TTL is still valid. That's my impression, but I've never looked at the source code.

>The data above is _almost_ characteristic of OpenServer ping's
>handling of DNS timeouts. But after a closer look I think I
>agree that something different is going on.

Many readers on this list are fairly recent - e.g. since the new internet came up in the early to mid 1990s. But I've been reading your posts for 17 to 18 years - going back to the old Dr. Dobbs on Compuserve - and I dropped C'serve in '86 when I brought up my own usenet node.

This is the first time I've ever had a single question about anything you posted, and I think I must finally be starting to understand all this mess. I save a good deal of your posts, and if I went back and searched the floppy archives I have, I think I'd even find messages from your mother.

<snip>

>I think you're right, because of the out of order receipt. That means
>that there was blockage somewhere along the way. Some router between
>the two machines was holding either the outgoing packets or the replies
>-- _not_ losing them, just holding them and eventually letting them all
>fly at once. During this holding period they got out of order (which is
>fairly normal, routers do not guarantee in-order delivery). When ping
>finally received them back, it reported them as having taken various
>times about 1.0 second apart, because they all arrived at the same time
>but were _sent_ 1 second apart.

I can envision that something somewhere is waiting until it gets a packet big enough to send a minimum amount of data, or waits a pre-determined interval before returning it. But why, I have no idea.

What we don't know is just how far away the pinged IP is. The original poster had munged the original IP and gave no clue as to what/where it was.

Long delays would/could be indicative of a typical land/satellite link. The sent data goes via land line, and the return data is intercepted, as I recall at level 3 of the ISO stack, and diverted to an uplink and then down to the end user. That would almost guarantee a minimum of about 700ms. And I would think you really would want to aggregate the data and send it in bigger chunks.

I've read about problems on what are called 'elephants' [LFN - Long Fat Networks - very high speed distant links where they make the packets HUGE and have large windows, otherwise the data is slowed by the handshake/protocols/etc. of small packets and few outstanding]. It probably has nothing to do with this, but it reminded me of a discussion I'd recently seen.

<snip>

>A router along the way is going into a mode where it collects
>but does not forward packets; then waking back up and forwarding
>several seconds worth of collected packets. The down time is
>plausible for e.g. an ISDN link to go through the stages of:
>link drops; router notices link is down; router redials; link
>comes back up.

That had not crossed my mind, as I've not worked with ISDN in quite a while - though at one ISP we had several customers using it - they thought it would be cheaper. We proposed a dedicated T1 from Florida to Ohio, but they looked at the ISDN cost and did not realize there was a connect charge on each connection. Running a remote mail kiosk, they figured it would be cheaper than the $1500 for the PtP T1. When they got their first phone bill of over $5000 they realized that we did know what we were talking about.

So an ISDN link could be it - and IF the line is used for voice and IP - then the data channel will drop back to 64K when the voice channel is in use.

However, now that many places are out-sourcing their dialup systems, getting a bondable ISDN is almost impossible, as to bond the channels you have to come in on the same PRI. At one place their 'modem' bank occupied about 15 feet of rack space. Each rack had at least 5 of the Lucent/Max - each with a DS3 and each handling about 600 lines. So with about 30,000 lines available it would be only by extreme chance that you could get two links into the same PRI. That's why the only way to get a bonded system is direct from the telco, if that's still possible.

This is an interesting problem, and if any of our theories are correct, then it will probably be solved only by an onsite person who knows networking intimately.

Bill
-- 
Bill Vermillion - bv @ wjv . com

In article <hgokqvomnv9dpneo4oipor8l2cc9roiko8@4ax.com>,
Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion)
>wrote:
>
>>You will be best served by fixing the port speeds on all NICs.
>
>How can i see whether it's half or full duplex from remote ?
>
>Device MAC address in use Factory MAC Address
>------ ------------------ -------------------
>/dev/net1 00:04:e2:0a:84:10 00:04:e2:0a:84:10
>
<snip>
>MAC Driver Info: Media_type: Ethernet
> Min_SDU: 14, Max_SDU: 1514, Address length: 6
> Interface speed: 100 Mbits/sec
>

SCO OpenServer 5.0.5 by default sets the NICs to auto-negotiate. I quote from /etc/conf/pack.d/e3H/space.h:

"Media type may be overridden. Default is to let the NIC determine the speed and duplex mode."

For a good description of this, go to Tony's site at:
http://aplawrence.com/SCOFAQ/scotec4.html#duplexspeed

Dave

| On Thu, 06 Nov 2003 16:02:00 +0100, Stefan Marquardt
<erase-this.stefan.marquardt@hagebau.de> wrote:

>How can i see whether it's half or full duplex from remote ?

I couldn't find any incantation that displays this. You may want to run:

    ndstat -l

and see if there are any obvious errors. However, I suspect you won't find any. If your target is local, see if you can create errors by using a ping flood:

    ping -f target_IP

Hopefully, this will give ndstat something to chew upon.

My guess(tm) is that you have a bad switch, a bad cable, a miswired cable, or a 100baseT to 10baseT transition where the internal buffer in the switch is losing packets. If the switch is a managed switch with an IP address, try pinging the switch and see if the problem persists. Also see if you can extract SNMP statistics from the switch. Also try pinging other machines on the network. The idea is to isolate the common network segment that's causing the problem.

I've also successfully induced a similar problem with some creative wiring on the ethernet cable. I had the polarity of one of the data wires reversed. Everything sorta functioned, but I had lots of delays that did NOT show up on the server diagnostic output. The switch and card were apparently spending their time doing almost continuous NWAY negotiations. The only clue was that the lights on the switch port would sometimes do a weird dance when there was no traffic. It was difficult to see, as it sorta looked like normal traffic. If you have any home made cables, I suggest you check them.

Since the target machine is on the local LAN, I can safely assume that all packets are coming and going directly to the target and are NOT being routed through some circuitous route. Just to be sure, run:

    netstat -rn

and see if the routing table looks sane.

There are two different chips used on this card. The old one uses a DEC Tulip chip. The current version uses an "Epic" 83C170 chip. The Linux Epic driver code mentions that some chips have a hardware multicast filter flaw. That should not affect ping, which is unicast.
http://www.scyld.com/network/epic100.html

--
Jeff Liebermann 150 Felker St #D Santa Cruz CA 95060
(831)421-6491 pgr (831)336-2558 home
http://www.LearnByDestroying.com AE6KS
jeffl@comix.santa-cruz.ca.us jeffl@cruzio.com |
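The "isolate the common network segment" step can be worked out on paper, but here is a minimal sketch of the bookkeeping, for what it's worth. Everything in it is hypothetical: the cable and port names are invented, and it assumes you already know which segments each ping path crosses and which targets showed the delays. It simply keeps the segments shared by every slow path and drops any segment that a healthy path also uses.

```python
def suspect_segments(paths, slow_targets):
    """Return the segments common to all slow paths and unused by any good path.

    paths: dict mapping target name -> set of segment names on the way there
    slow_targets: set of target names whose pings showed the delay
    """
    slow = [segs for target, segs in paths.items() if target in slow_targets]
    good = [segs for target, segs in paths.items() if target not in slow_targets]
    if not slow:
        return set()
    common = set.intersection(*slow)
    for segs in good:
        common -= segs  # a segment that carries healthy traffic is unlikely to be at fault
    return common

# Hypothetical wiring: all names below are made up for illustration.
paths = {
    "switch":   {"cable-A", "switch-port-3"},
    "other-pc": {"cable-A", "switch-port-3", "switch-port-5", "cable-C"},
    "kasseob4": {"cable-A", "switch-port-3", "switch-port-7", "cable-B"},
}

print(sorted(suspect_segments(paths, {"kasseob4"})))
```

With this made-up layout, pinging the switch and another PC cleanly while kasseob4 is slow leaves only the port and cable leading to kasseob4 as suspects. If every target were slow, the result would instead be the segments they all share, here the local cable and switch port.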
| ||||
| In article <3faa79d1$0$41289$a1866201@newsreader.visi.com>,
Dave Gresham <gresham@visi.com> wrote:
>In article <hgokqvomnv9dpneo4oipor8l2cc9roiko8@4ax.com>,
>Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>>On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion)
>>wrote:
>>
>>>You will be best served by fixing the port speeds on all NICs.
>>
>>How can i see whether it's half or full duplex from remote ?
>>
>>Device       MAC address in use    Factory MAC Address
>>------       ------------------    -------------------
>>/dev/net1    00:04:e2:0a:84:10     00:04:e2:0a:84:10
>>
><snip>
>
>>MAC Driver Info: Media_type: Ethernet
>>                 Min_SDU: 14, Max_SDU: 1514, Address length: 6
>>                 Interface speed: 100 Mbits/sec
>>
>
>Sco Openserver 5.0.5 by default sets the Nic Cards to Auto negotiate.
>i quote from space.h in /etc/conf/pack.d/e3H/space.h
>"Media type may be overridden. Default is to let the NIC determine
> the speed and duplex mode."
>For a good description of this, go to Tony's site at:
>http://aplawrence.com/SCOFAQ/scotec4.html#duplexspeed

And for a very good description of what happens when things are mismatched, see
http://www.cisco.com/warp/public/473/46.html

Though it was written about a Cisco switch, it documents how manufacturers adding their own enhancements make the problem of automatically determining the duplex mode and the transfer speed impossible in some circumstances. A chart shows what happens in each of the instances.

--
Bill Vermillion - bv @ wjv . com |
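For readers without the chart to hand, the usual outcomes can be sketched in a few lines. This is not taken from the Cisco page; it is a simplified model of standard Ethernet behaviour, where a port that autonegotiates against a partner with negotiation turned off can sense the speed but not the duplex, and so falls back to half duplex. The names Port, AUTO and link_outcome are invented here, and the model assumes both ends are capable of 100 Mbit/s full duplex and ignores vendor quirks.

```python
from typing import NamedTuple, Optional

class Port(NamedTuple):
    speed: Optional[int]   # Mbit/s when forced, None when autonegotiating
    duplex: Optional[str]  # "half" or "full" when forced, None when autonegotiating

AUTO = Port(None, None)

def link_outcome(a, b, best=(100, "full")):
    """Return (settings used by a, settings used by b, verdict)."""
    def settle(me, peer):
        if me.speed is not None:      # forced port keeps its own settings
            return (me.speed, me.duplex)
        if peer.speed is None:        # both ends negotiate: best common mode
            return best
        return (peer.speed, "half")   # speed is sensed, duplex defaults to half

    ra, rb = settle(a, b), settle(b, a)
    if ra[0] != rb[0]:
        verdict = "no link"
    elif ra[1] != rb[1]:
        verdict = "duplex mismatch"
    else:
        verdict = "ok"
    return ra, rb, verdict

cases = [
    (AUTO, AUTO),
    (Port(100, "full"), AUTO),
    (Port(100, "half"), AUTO),
    (Port(100, "full"), Port(100, "half")),
    (Port(100, "full"), Port(10, "full")),
]
for nic, switch in cases:
    print(nic, switch, "->", link_outcome(nic, switch))
```

The case that usually bites is the second one: forcing the NIC to 100/full while the switch port is left on auto produces a link that comes up and passes traffic, but with one side at full and the other at half duplex. That is why fixing the speed on the NIC only helps if the switch port is fixed to the same values.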