Unix Technical Forum

Curious pings on SCO 5.0.4/6

  #1 (permalink)  
Old 02-15-2008, 12:10 PM
Stefan Marquardt
 
Posts: n/a
Default Curious pings on SCO 5.0.4/6

Hello,

Sometimes many of our SCO PCs running 5.0.4 and 5.0.6 have a very slow
network connection.

PING kasseob4 (10.22.136.54): 56 data bytes
64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=4 ttl=62 time=2240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=5 ttl=62 time=1240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=6 ttl=62 time=240 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=7 ttl=62 time=2400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=8 ttl=62 time=1400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=9 ttl=62 time=400 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=13 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=10 ttl=62 time=3080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=11 ttl=62 time=2080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=12 ttl=62 time=1090 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=14 ttl=62 time=820 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=21 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=15 ttl=62 time=6110 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=16 ttl=62 time=5110 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=17 ttl=62 time=4120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=18 ttl=62 time=3120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=19 ttl=62 time=2120 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=20 ttl=62 time=1120 ms


Another PC with the same HW, same OS, and same NIC works fine connected to the same hub.
But sometimes a different PC that had worked fine for a few weeks shows this error.

Solution: Reboot

The ping times look like a sine curve, running in a loop.
The RS 50x is installed.
NIC: SMC EtherPower II Driver (ver 2.0.5)
HW: SMC EtherPower II 9432BFTX 10/100Mbps - PCI Bus# 0,Device# 4,Funct


Any ideas ?

Stefan
  #2 (permalink)  
Old 02-15-2008, 12:10 PM
Bill Vermillion
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

In article <14biqvckqcobdh1h3mis4undq2tjhof16o@4ax.com>,
Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>Any ideas ?


Some auto-negotiation could be failing. And if something goes into
FDX mode, the hubs are the culprit. Switches are cheap enough to
toss all the hubs into the trash.

As to the large times counting down to a small time, that is typical
of not being able to make the connection; when it opens up, the time
for the first packet sent is high, and each succeeding one is lower,
as they are all answered sequentially in real time.

Those times are very typical of an intermittent connection, and
also of something doing HDX against FDX and generating collisions,
which FDX doesn't have.

You will be best served by fixing the port speeds on all NICs
(setting them explicitly rather than relying on auto-negotiation).




--
Bill Vermillion - bv @ wjv . com
  #3 (permalink)  
Old 02-15-2008, 12:10 PM
Bela Lubkin
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

Bill Vermillion wrote:

> Some auto-negotiation could be failing. And if something goes into
> FDX mode, the hubs are the culprit. Switches are cheap enough to
> toss all the hubs into the trash.
>
> As to the large times counting down to a small time, that is typical
> of not being able to make the connection; when it opens up, the time
> for the first packet sent is high, and each succeeding one is lower,
> as they are all answered sequentially in real time.
>
> Those times are very typical of an intermittent connection, and
> also of something doing HDX against FDX and generating collisions,
> which FDX doesn't have.
>
> You will be best served by fixing the port speeds on all NICs
> (setting them explicitly rather than relying on auto-negotiation).


Actually, those ping times are typical of DNS lookup failures. Or not
failures, but timeouts.

ping does its sending semi-autonomously, in the background, while it
processes received packets in the foreground. It receives a packet,
does a reverse DNS lookup to convert its IP address to a name. If that
reverse lookup takes 5 seconds, no other received packets are processed
during that time, but more packets are still _sent_ once a second by the
background processing. Then the DNS lookup completes. ping reports the
correct time for that first packet. The second packet was sent early in
the DNS wait, but didn't get read until much later (after the DNS lookup
completed), so its round-trip time is misreported. Each subsequent
packet looks like it took 1 second less, because each was _sent_ one
second later than the previous one.
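
The effect described above can be reproduced with a small simulation. This is only a sketch under assumptions that do not come from the OSR5 ping source: a fixed 40 ms network round trip, one packet sent per second, and a single reader that is blocked for the whole duration of a reverse lookup. The function name `simulate_blocking_reader` and its parameters are made up for illustration.

```python
def simulate_blocking_reader(n_packets, rtt=0.04, interval=1.0, lookup_delays=None):
    """Return [(seq, reported_ms)] for a ping whose reader blocks in a lookup.

    lookup_delays maps seq -> seconds the (hypothetical) reverse lookup
    blocks the reader after that reply has been read and timed.
    """
    lookup_delays = lookup_delays or {}
    reader_free_at = 0.0          # earliest time the reader can read again
    reported = []
    for seq in range(n_packets):
        sent = seq * interval     # sender keeps its one-per-second schedule
        arrived = sent + rtt      # the reply really is back after ~40 ms
        read = max(arrived, reader_free_at)   # but may sit unread for a while
        reported.append((seq, round((read - sent) * 1000)))
        reader_free_at = read + lookup_delays.get(seq, 0.0)
    return reported


if __name__ == "__main__":
    # Assume only the lookup after the first reply stalls, for 5 seconds.
    for seq, ms in simulate_blocking_reader(7, lookup_delays={0: 5.0}):
        print(f"icmp_seq={seq} time={ms} ms")
```

With a 5-second stall after the first reply, the reported times are 40, 4040, 3040, 2040, 1040, 40 and 40 ms: the first packet is timed correctly and each following one appears to take one second less. In this model the replies are always reported in sequence order.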

Now, the _cause_ of the DNS timeouts might be something like what you're
talking about...

>Bela<

  #4 (permalink)  
Old 02-15-2008, 12:10 PM
Bill Vermillion
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

In article <20031105193824.GT14056@sco.com>,
Bela Lubkin <belal@sco.com> wrote:
>Actually, those ping times are typical of DNS lookup failures. Or not
>failures, but timeouts.


I'd agree if the long numbers settled down to the lower numbers just
once, e.g. going from 3000 down to 40, but going from 3080 to 2080
to 1090 to 40 to 2240 to 1240 to 240 doesn't fit with any DNS
I've seen. Once you have the DNS info, all the times should settle
down to a number that is similar except for delays.

Also the packets are coming back out of sequence.

I see packet order of 3,0,1,2,4,5,6,7,8,9,13,10,11,12,14,21 ...

That surely does not indicate DNS to me. If I'm overlooking
something obvious in DNS please do let me know.

>ping does its sending semi-autonomously, in the background,
>while it processes received packets in the foreground. It
>receives a packet, does a reverse DNS lookup to convert its
>IP address to a name. If that reverse lookup takes 5 seconds,
>no other received packets are processed during that time, but
>more packets are still _sent_ once a second by the background
>processing. Then the DNS lookup completes. ping reports the
>correct time for that first packet. The second packet was sent
>early in the DNS wait, but didn't get read until much later
>(after the DNS lookup completed), so its round-trip time is
>misreported. Each subsequent packet looks like it took 1 second
>less, because each was _sent_ one second later than the previous
>one.


And then the packet time would remain relatively even after the
huge numbers decremented. I didn't explain large numbers to small
numbers as well as you.

>Now, the _cause_ of the DNS timeouts might be something like
>what you're talking about...


Looking back at those sequence numbers and packets not being
returned in order, do you still feel that way?

It's almost as if some packets are taking a different route back.
It definitely is screwy.

Bill
--
Bill Vermillion - bv @ wjv . com
  #5 (permalink)  
Old 02-15-2008, 12:10 PM
Bela Lubkin
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

Bill Vermillion wrote:

[regarding:]

> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=4 ttl=62 time=2240 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=5 ttl=62 time=1240 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=6 ttl=62 time=240 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=7 ttl=62 time=2400 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=8 ttl=62 time=1400 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=9 ttl=62 time=400 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=13 ttl=62 time=40 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=10 ttl=62 time=3080 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=11 ttl=62 time=2080 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=12 ttl=62 time=1090 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=14 ttl=62 time=820 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=21 ttl=62 time=40 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=15 ttl=62 time=6110 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=16 ttl=62 time=5110 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=17 ttl=62 time=4120 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=18 ttl=62 time=3120 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=19 ttl=62 time=2120 ms
> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=20 ttl=62 time=1120 ms


Bela>> Actually, those ping times are typical of DNS lookup failures. Or not
Bela>> failures, but timeouts.

Bill> I'd agree if the long numbers settled down to the lower numbers just
Bill> once, e.g. going from 3000 down to 40, but going from 3080 to 2080
Bill> to 1090 to 40 to 2240 to 1240 to 240 doesn't fit with any DNS
Bill> I've seen. Once you have the DNS info, all the times should settle
Bill> down to a number that is similar except for delays.

For whatever reason (I'll speculate in a moment), OSR5 `ping` does a
reverse DNS lookup of _every_ packet it receives. It doesn't try to
cache IP-to-name information. This is probably so that if you had a
long-running ping and one day someone changed that address's name, ping
would suddenly start reporting the new name. This _could_ have been
implemented with a cache, some knowledge of DNS record timeouts, etc.,
but it wasn't.

The data above is _almost_ characteristic of OpenServer ping's handling
of DNS timeouts. But after a closer look I think I agree that something
different is going on.

Look at packets number 0 1 2 3. Because ping is sending them
monotonically at 1 second intervals, they were sent at times like
1000000.23, 1000001.23, 1000002.23, 1000003.23. Now, if they had been
received in sequence, here's what I would believe had happened: the
first packet came back 40ms after it was sent. ping read the packet,
noted that interval, did an RDNS lookup. The lookup took several
seconds. While it was waiting, its background thread continued to send
more pings, and their replies also came back in ~40ms each. But ping
didn't read them until its foreground reader thread came back from the
RDNS lookup, so _as far as it could tell_ they had taken much longer.
The reply to the packet sent at 1000000.23 was received at 1000000.27,
40ms later, and read immediately. The reply to the 1000001.23 packet
was received at 1000001.27, but ping didn't _read_ it until 1000004.30,
so it reported that as a ~3-second turnaround.

But:

Bill> Also the packets are coming back out of sequence.
Bill>
Bill> I see packet order of 3,0,1,2,4,5,6,7,8,9,13,10,11,12,14,21 ...
Bill>
Bill> That surely does not indicate DNS to me. If I'm overlooking
Bill> something obvious in DNS please do let me know.

I think you're right, because of the out of order receipt. That means
that there was blockage somewhere along the way. Some router between
the two machines was holding either the outgoing packets or the replies
-- _not_ losing them, just holding them and eventually letting them all
fly at once. During this holding period they got out of order (which is
fairly normal; routers do not guarantee in-order delivery). When ping
finally received them back, it reported them as having taken various
times about 1.0 second apart, because they all arrived at the same time
but were _sent_ 1 second apart.
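
The "arrived at the same time" reading can be checked against the posted log with a little arithmetic. A minimal sketch, assuming ping sent one packet per second starting at t=0 (the default interval; the log itself does not state it): adding each reported time to its send time gives the moment the reply was read. The helper names `arrival_times` and `bursts` are illustrative only.

```python
# (icmp_seq, reported time in ms), in the order ping printed them.
REPLIES = [
    (3, 40), (0, 3080), (1, 2080), (2, 1090),
    (4, 2240), (5, 1240), (6, 240),
    (7, 2400), (8, 1400), (9, 400),
    (13, 40), (10, 3080), (11, 2080), (12, 1090),
    (14, 820),
    (21, 40), (15, 6110), (16, 5110), (17, 4120),
    (18, 3120), (19, 2120), (20, 1120),
]

SEND_INTERVAL_S = 1.0  # assumption: one echo request per second


def arrival_times(replies, interval=SEND_INTERVAL_S):
    """Map seq -> seconds after the first send at which the reply was read."""
    return {seq: seq * interval + ms / 1000.0 for seq, ms in replies}


def bursts(arrivals, tolerance=0.1):
    """Group sequence numbers read within `tolerance` s of a burst's first reply."""
    groups = []
    for seq, t in sorted(arrivals.items(), key=lambda item: item[1]):
        if groups and t - groups[-1]["t"] <= tolerance:
            groups[-1]["seqs"].append(seq)
        else:
            groups.append({"t": t, "seqs": [seq]})
    return groups


if __name__ == "__main__":
    for group in bursts(arrival_times(REPLIES)):
        print(f"read at ~{group['t']:.2f}s: seqs {sorted(group['seqs'])}")
```

For this log the replies fall into six bursts, read at roughly 3.0, 6.2, 9.4, 13.0, 14.8 and 21.0 seconds. Within each burst the reported times differ by whole seconds only because the packets were sent a second apart.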

Bill> Looking back at those sequence numbers and packets not being
Bill> returned in order, do you still feel that way?
Bill>
Bill> It's almost as if some packets are taking a different route back.
Bill> It definitely is screwy.

A router along the way is going into a mode where it collects but does
not forward packets; then waking back up and forwarding several seconds
worth of collected packets. The down time is plausible for e.g. an ISDN
link to go through the stages of: link drops; router notices link is
down; router redials; link comes back up.
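
As a rough forward model of that behaviour, here is a sketch of a link that holds packets while it is down and releases them when it comes back. Everything in it is an assumption for illustration: the function name `hold_and_release`, the 40 ms base round trip, the one-second send interval, and the outage window. It does not model the reordering seen in the real log.

```python
def hold_and_release(n_packets, outages, rtt=0.04, interval=1.0):
    """Return [(seq, reported_ms)] for a path that buffers packets during outages.

    outages is a list of (down_at, up_at) times in seconds. A packet sent
    while the link is down is held and forwarded when the link comes back up.
    """
    reported = []
    for seq in range(n_packets):
        sent = seq * interval
        # Time the packet actually leaves the router: now, or at link-up.
        release = next((up for down, up in outages if down <= sent < up), sent)
        reported.append((seq, round((release + rtt - sent) * 1000)))
    return reported


if __name__ == "__main__":
    # Hypothetical 3-second outage starting as the first packet is sent.
    for seq, ms in hold_and_release(5, [(0.0, 3.0)]):
        print(f"icmp_seq={seq} time={ms} ms")
```

With the link down for the first three seconds, this reports 3040, 2040, 1040, 40 and 40 ms, which is close to the 3080, 2080, 1090 and 40 ms in the first burst of the posted output.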

>Bela<

  #6 (permalink)  
Old 02-15-2008, 12:10 PM
Stefan Marquardt
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion)
wrote:

>You will be best served by fixing the port speeds on all NICs.


How can I see remotely whether it's half or full duplex?

Device       MAC address in use    Factory MAC Address
------       ------------------    -------------------
/dev/net1    00:04:e2:0a:84:10     00:04:e2:0a:84:10

Multicast address table
-----------------------
01:00:5e:00:00:01

                                FRAMES
           Unicast  Multicast  Broadcast   Error       Octets  Queue Length
        ----------  ---------  ---------  ------  -----------  ------------
In:          18597          0       4481       0      2725265             0
Out:         18692          0          2       0      3830565             0

DLPI Module Info: 2 SAPs open, 18 SAPs maximum
                  473 frames received destined for an unbound SAP

MAC Driver Info: Media_type: Ethernet
                 Min_SDU: 14, Max_SDU: 1514, Address length: 6
                 Interface speed: 100 Mbits/sec

DLPI Restarts Info: Last queue size: 0
                    Last send time: 5352483
                    Restart in progress: 0
Interface Version: MDI 100

ETHERNET SPECIFIC STATISTICS

Collision Table - The number of frames successfully transmitted,
                  but involved in at least one collision:

               Frames                   Frames
              -------                  -------
 1 collision        0     9 collisions       0
 2 collisions       0    10 collisions       0
 3 collisions       0    11 collisions       0
 4 collisions       0    12 collisions       0
 5 collisions       0    13 collisions       0
 6 collisions       0    14 collisions       0
 7 collisions       0    15 collisions       0
 8 collisions       0    16 collisions       0



0 collisions = Switch ?



Stefan

  #7 (permalink)  
Old 02-15-2008, 12:10 PM
Bill Vermillion
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

In article <20031106003805.GW14056@sco.com>,
Bela Lubkin <belal@sco.com> wrote:
>Bela>> Actually, those ping times are typical of DNS lookup failures. Or not
>Bela>> failures, but timeouts.


>Bill> I'd agree if the long numbers settled down to the lower numbers just
>Bill> once, e.g. going from 3000 down to 40, but going from 3080 to 2080
>Bill> to 1090 to 40 to 2240 to 1240 to 240 doesn't fit with any DNS
>Bill> I've seen. Once you have the DNS info, all the times should settle
>Bill> down to a number that is similar except for delays.


>For whatever reason (I'll speculate in a moment), OSR5 `ping` does a
>reverse DNS lookup of _every_ packet it receives. It doesn't try to
>cache IP-to-name information. This is probably so that if you had a
>long-running ping and one day someone changed that address's name, ping
>would suddenly start reporting the new name. This _could_ have been
>implemented with a cache, some knowledge of DNS record timeouts, etc.,
>but it wasn't.


That seems a rather bizarre way to do things, at least to my
way of thinking. Most things based on names look up an IP for a
given name. The chance that someone would change the name on an
IP would typically matter only on a local network, would it not?
Anything outside is going to rely on someone else's DNS, and once
the name/IP resolution is made upstream, even if it has to go to
the root servers to get that IP initially, the next level up will
cache that name/IP resolution as long as the TTL is still valid.
That's my impression, but I've never looked at the source code.

>The data above is _almost_ characteristic of OpenServer ping's
>handling of DNS timeouts. But after a closer look I think I
>agree that something different is going on.


Many readers on this list are fairly recent - e.g. since the new
internet came up in the early-mid 1990s. But I've been reading
your posts for 17 to 18 years - going back to the old Dr. Dobb's
on CompuServe - and I dropped C'serve in '86 when I brought up
my own usenet node. This is the first time I've ever had a single
question about anything you posted, and I think I must finally be
starting to understand all this mess. I save a good deal of your
posts, and if I went back and searched the floppy archives I have,
I think I'd even find messages from your mother.

>Look at packets number 0 1 2 3. Because ping is sending them
>monotonically at 1 second intervals, they were sent at times like
>1000000.23, 1000001.23, 1000002.23, 1000003.23. Now, if they had been
>received in sequence, here's what I would believe had happened: the
>first packet came back 40ms after it was sent. ping read the packet,
>noted that interval, did an RDNS lookup. The lookup took several
>seconds. While it was waiting, its background thread continued to send
>more pings, and their replies also came back in ~40ms each. But ping
>didn't read them until its foreground reader thread came back from the
>RDNS lookup, so _as far as it could tell_ they had taken much longer.
>The reply to the packet sent at 1000000.23 was received at 1000000.27,
>40ms later, and read immediately. The reply to the 1000001.23 packet
>was received at 1000001.27, but ping didn't _read_ it until 1000004.30,
>so it reported that as a ~3-second turnaround.
>
>But:
>
>Bill> Also the packets are coming back out of sequence.
>Bill>
>Bill> I see packet order of 3,0,1,2,4,5,6,7,8,9,13,10,11,12,14,21 ...
>Bill>
>Bill> That surely does not indicate DNS to me. If I'm overlooking
>Bill> something obvious in DNS please do let me know.


>I think you're right, because of the out of order receipt. That means
>that there was blockage somewhere along the way. Some router between
>the two machines was holding either the outgoing packets or the replies
>-- _not_ losing them, just holding them and eventually letting them all
>fly at once. During this holding period they got out of order (which is
>fairly normal; routers do not guarantee in-order delivery). When ping
>finally received them back, it reported them as having taken various
>times about 1.0 second apart, because they all arrived at the same time
>but were _sent_ 1 second apart.


I can envision that something somewhere is waiting until it gets a
packet big enough to send a minimum amount of data, or waits a
pre-determined interval to return that. But why I have no idea.

But what we don't know is just how far away the pinged IP is.
The original poster had munged the original IP and gave no clue
as to what/where it was. Long delays would/could be indicative
of a typical land/satellite link. The sent data goes via land
line, and the return data is intercepted, as I recall, at layer
3 of the OSI stack and diverted to an uplink and then down to the
end user. That would almost guarantee a minimum of about 700ms.
And I would think you really would want to aggregate the data
and send in bigger chunks.

I've read about problems on what are called 'elephants' [LFN -
Long Fat Networks - very high speed distant links where
they make the packets HUGE and have large windows, otherwise
the data is slowed by the handshake/protocols/etc of small packets
and few outstanding]. Probably has nothing to do with this, but it
reminded me of a discussion I'd recently seen.

>A router along the way is going into a mode where it collects
>but does not forward packets; then waking back up and forwarding
>several seconds worth of collected packets. The down time is
>plausible for e.g. an ISDN link to go through the stages of:
>link drops; router notices link is down; router redials; link
>comes back up.
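
The store-and-forward theory quoted above can be sketched the same way.
This is a toy model, not a description of any particular router: it
assumes every packet sent while the link is down is held and released
the moment the link returns, and the names and outage times are
hypothetical.

```python
def simulate_link_outage(count, true_rtt, down_at, up_at, interval=1.0):
    """Model a router that queues packets while a link is down.

    Packets sent in [down_at, up_at) are held and forwarded at up_at.
    Returns a list of (seq, rtt_seconds).
    """
    results = []
    for seq in range(count):
        sent = seq * interval
        # Time spent sitting in the router's queue, if the link was down.
        held = up_at - sent if down_at <= sent < up_at else 0.0
        results.append((seq, held + true_rtt))
    return results


if __name__ == "__main__":
    # Hypothetical outage: link down from t=0 until t=3 s, 40 ms path.
    for seq, rtt in simulate_link_outage(6, 0.040, 0.0, 3.0):
        print(f"icmp_seq={seq} time={rtt * 1000:.0f} ms")
```

The held packets come out at roughly 3040, 2040 and 1040 ms, followed by
normal 40 ms replies, which has the same stair-step shape as the 3080,
2080, 1090, 40 ms run in the original post. In this model even the first
delayed packet shows the long time, which is one way it differs from the
DNS-stall explanation.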


That had not crossed my mind as I've not worked with ISDN in quite
a while - though at one ISP we had several using it - they thought
it would be cheaper. We proposed a dedicated T1 from Florida
to Ohio, but they looked at the ISDN cost, and did not realize
there was a connect charge on each connection, and running a
remote mail kiosk they figured it would be cheaper than the $1500
for the PtP T1. When they got their first phone bill of over
$5000 they realized that we did know what we were talking about.

So an ISDN link could be it - and IF the line is used as voice and IP
- then the data channel will drop back to 64K when the vox is in
use. However, now that many places are out-sourcing their dialup
systems, getting a bondable ISDN is almost impossible, as to
bond them you have to come in on the same PRI. At one place
their 'modem' bank occupied about 15 feet of rack space. Each
rack had at least 5 of the Lucent/Max - each with a DS3 and each
handling about 600 lines. So with about 30,000 lines available
it would be only by extreme chance you could get two links into the
same PRI, so that's why the only way to get a bonded system is
direct from telco, if that's still possible.
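
To put a rough number on "extreme chance": a back-of-envelope sketch,
assuming a North American PRI with 23 B channels and assuming calls are
spread uniformly at random over all the lines (real hunt groups are not
random, so treat this as an order-of-magnitude estimate only). The
function name is made up for illustration.

```python
def same_pri_probability(total_lines, channels_per_pri=23):
    """Chance that a second call lands on the same PRI as the first,
    assuming calls are assigned uniformly at random over all lines."""
    # After the first call takes one channel, (channels_per_pri - 1) of the
    # remaining (total_lines - 1) lines share its PRI.
    return (channels_per_pri - 1) / (total_lines - 1)


if __name__ == "__main__":
    p = same_pri_probability(30_000)
    print(f"chance of landing on the same PRI: {p:.4%}")
```

For 30,000 lines this works out to 22 in 29,999, which is less than one
chance in a thousand.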

This is an interesting problem, and if any of our theories are
correct - then it will probably be solved only by an onsite person
who knows networking intimately.

Bill

--
Bill Vermillion - bv @ wjv . com
  #8 (permalink)  
Old 02-15-2008, 12:10 PM
Dave Gresham
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

In article <hgokqvomnv9dpneo4oipor8l2cc9roiko8@4ax.com>,
Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion)
>wrote:
>
>>You will be best served by fixing the port speeds on all NICs.

>
>How can i see whether it's half or full duplex from remote ?
>
>Device MAC address in use Factory MAC Address
>------ ------------------ -------------------
>/dev/net1 00:04:e2:0a:84:10 00:04:e2:0a:84:10
>

<snip>

>MAC Driver Info: Media_type: Ethernet
> Min_SDU: 14, Max_SDU: 1514, Address length: 6
> Interface speed: 100 Mbits/sec
>


SCO OpenServer 5.0.5 by default sets the NICs to auto-negotiate.
I quote from space.h in /etc/conf/pack.d/e3H/space.h

"Media type may be overridden. Default is to let the NIC determine
the speed and duplex mode."

For a good description of this, go to Tony's site at:

http://aplawrence.com/SCOFAQ/scotec4.html#duplexspeed

Dave
  #9 (permalink)  
Old 02-15-2008, 12:10 PM
Jeff Liebermann
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

On Thu, 06 Nov 2003 16:02:00 +0100, Stefan Marquardt
<erase-this.stefan.marquardt@hagebau.de> wrote:

>How can i see whether it's half or full duplex from remote ?


I couldn't find any incantation that displays this.

You may wanna run:
ndstat -l
and see if there are any obvious errors. However, I suspect you won't
find any.

If your target is local, see if you can create errors by using ping
flood.
ping -f target_IP
Hopefully, this will give ndstat something to chew upon.

My guess(tm) is that you have a bad switch, bad cable, miswired cable,
or 100baseT to 10baseT transition where the internal buffer in the
switch is losing packets. If the switch is a managed switch with an
IP address, try pinging the switch and see if the problem persists.
Also see if you can extract SNMP statistics from the switch if it's a
managed switch. Also try pinging other machines on the network. The
idea is to isolate the common network segment that's causing a
problem.
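
One way to compare targets when isolating the bad segment is to save
the ping output for each one and summarize it. The sketch below assumes
reply lines in the format shown in the original post (icmp_seq=N ttl=N
time=N ms); other ping implementations print differently. The function
name and the 1000 ms "slow" threshold are arbitrary choices for
illustration.

```python
import re

# Matches reply lines such as:
# 64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
PING_LINE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=(\d+(?:\.\d+)?) ms")

SAMPLE = """\
64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms
64 bytes from kasseob4 (10.22.136.54): icmp_seq=4 ttl=62 time=2240 ms
"""


def summarize_ping(output, slow_ms=1000.0):
    """Count replies, slow replies and out-of-order replies in ping output."""
    replies = [(int(m.group(1)), float(m.group(2)))
               for m in PING_LINE.finditer(output)]
    highest_seq = -1
    out_of_order = 0
    for seq, _ in replies:
        if seq < highest_seq:       # printed after a later sequence number
            out_of_order += 1
        highest_seq = max(highest_seq, seq)
    return {
        "replies": len(replies),
        "slow": sum(1 for _, ms in replies if ms >= slow_ms),
        "out_of_order": out_of_order,
        "worst_ms": max((ms for _, ms in replies), default=0.0),
    }


if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
```

Running the same summary on output captured for the switch, for other
machines on the segment and for the original target makes it easier to
see which paths share the slow or out-of-order replies.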

I've also successfully induced a similar problem with some creative
wiring on the ethernet cable. I had the polarity of one of the data
wires reversed. Everything sorta functioned but I had lots of delays
that did NOT show up on the server diagnostic output. The switch and
card were apparently spending their time doing almost continuous NWAY
negotiations. The only clue was that the lights on the switch port
would sometimes do a weird dance when there was no traffic. It was
difficult to see as it sorta looked like normal traffic. If you have
any home made cables, I suggest you check them.

Since the target machine is on the local LAN, I can safely assume that
all packets are coming and going directly to the target and are NOT
being routed through some circuitous route. Just to be sure, run:
netstat -rn
and see if the routing table looks sane.

There are two different chips used on this card. The old one uses a
DEC Tulip chip. The current version uses an "Epic" 83C170 chip. The
Linux Epic driver code mentions that some chips have a hardware
multicast filter flaw. That should not affect ping which is unicast.
http://www.scyld.com/network/epic100.html



--
Jeff Liebermann 150 Felker St #D Santa Cruz CA 95060
(831)421-6491 pgr (831)336-2558 home
http://www.LearnByDestroying.com AE6KS
jeffl@comix.santa-cruz.ca.us jeffl@cruzio.com
  #10 (permalink)  
Old 02-15-2008, 12:10 PM
Bill Vermillion
 
Posts: n/a
Default Re: Curious pings on SCO 5.0.4/6

In article <3faa79d1$0$41289$a1866201@newsreader.visi.com>,
Dave Gresham <gresham@visi.com> wrote:
>In article <hgokqvomnv9dpneo4oipor8l2cc9roiko8@4ax.com>,
>Stefan Marquardt <erase-this.stefan.marquardt@hagebau.de> wrote:
>>On Wed, 05 Nov 2003 18:05:05 GMT, bv@wjv.comREMOVE (Bill Vermillion)
>>wrote:
>>
>>>You will be best served by fixing the port speeds on all NICs.

>>
>>How can i see whether it's half or full duplex from remote ?
>>
>>Device MAC address in use Factory MAC Address
>>------ ------------------ -------------------
>>/dev/net1 00:04:e2:0a:84:10 00:04:e2:0a:84:10
>>

><snip>
>
>>MAC Driver Info: Media_type: Ethernet
>> Min_SDU: 14, Max_SDU: 1514, Address length: 6
>> Interface speed: 100 Mbits/sec
>>

>
>Sco Openserver 5.0.5 by default sets the Nic Cards to Auto negotiate.
>i quote from space.h in /etc/conf/pack.d/e3H/space.h


>"Media type may be overridden. Default is to let the NIC determine
> the speed and duplex mode."


>For a good description of this, go to Tony's site at:


>http://aplawrence.com/SCOFAQ/scotec4.html#duplexspeed


And for a very good description on what happens when things are
mis-matched see http://www.cisco.com/warp/public/473/46.html

Though it was written about a Cisco switch it documents how
manufacturers adding their own enhancements make the
problem of automatically determining duplex mode and transfer
speed impossible to solve in some circumstances. A chart shows what
happens in each of the instances.
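
The basic cases behind such a chart can be sketched in a few lines.
This is a simplified model of standard auto-negotiation only: it
assumes both NICs can do 100 Mbit full duplex, and that a side set to
auto facing a forced partner can sense the speed but not the duplex and
so falls back to half duplex. Vendor-specific quirks, which are the
point of the Cisco note, are not modelled, and all names here are
hypothetical.

```python
AUTO = "auto"
BEST = (100, "full")  # assumed best mode that both ends support


def resolve(side_a, side_b):
    """Each side is AUTO or a forced (speed, duplex) tuple.

    Returns the (speed, duplex) each side ends up using, or None when
    both sides are forced to different speeds and no link comes up.
    """
    if side_a == AUTO and side_b == AUTO:
        return BEST, BEST
    if side_a == AUTO:
        # A senses B's speed but cannot learn its duplex: half duplex.
        return (side_b[0], "half"), side_b
    if side_b == AUTO:
        return side_a, (side_a[0], "half")
    if side_a[0] != side_b[0]:
        return None
    return side_a, side_b


def is_duplex_mismatch(result):
    """True when a link comes up but the two ends disagree on duplex."""
    return result is not None and result[0][1] != result[1][1]


if __name__ == "__main__":
    cases = [(AUTO, AUTO), (AUTO, (100, "full")), (AUTO, (100, "half")),
             ((100, "full"), (100, "full")), ((100, "full"), (10, "full"))]
    for a, b in cases:
        result = resolve(a, b)
        print(a, b, "->", result, "MISMATCH" if is_duplex_mismatch(result) else "")
```

The case to watch is one side forced to full duplex while the other is
left on auto: the link comes up, but the auto side runs half duplex, and
the result is collisions and retransmissions under load instead of an
outright failure.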


--
Bill Vermillion - bv @ wjv . com
