Unix Technical Forum

Xeon hyperthreading problem

#11  01-18-2008, 08:30 AM
Jean-David Beyer
Re: Xeon hyperthreading problem LONG

Michel Bardiaux wrote:
> Jean-David Beyer wrote:
>
>> Michel Bardiaux wrote:
>>
>>> I've compared autoconf.h and I think this is the significant difference;
>>> yours:
>>>
>>> autoconf.h:#define CONFIG_X86_GOOD_APIC 1
>>> autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
>>> autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
>>> autoconf.h:#define CONFIG_X86_IO_APIC 1
>>> autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
>>> autoconf.h:#define CONFIG_X86_NUMA 1
>>>
>>>
>>> Mine:
>>>
>>> autoconf.h:#define CONFIG_X86_GOOD_APIC 1
>>> autoconf.h:#define CONFIG_X86_IO_APIC 1
>>> autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
>>> autoconf.h:#undef CONFIG_X86_NUMA

>
>
> We have built a new 2.4.28 with NUMA support selected so as to have the
> same autoconf.h, but still see the same problem of all interrupts going to
> the same CPU. It seems to me now that the critical line in your dmesg output is this one:
>
>> xAPIC support is present

>
>
> I don't have it; moreover, this string appears nowhere in the stock 2.4.28
> sources! From web searches it seems this might be part of a Red Hat
> patch. So my best hope now is to compare my kernel sources with yours.
> Where did you get your kernel 2.4.21-27.0.2.ELsmp ? Did you apply any
> additional patches?


Red Hat; the up2date daemon runs and notifies me whenever updates are
available. I click on an icon and it downloads the RPMs and installs them. I
applied no patches. I am now running:

$ uname -r
2.4.21-27.0.4.ELsmp
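
For comparing kernel configurations like the two autoconf.h excerpts quoted above, here is a minimal sketch in Python. The function names parse_config and diff_config are illustrative only, and treating a missing option the same as an #undef is an assumption made for brevity. The sample lines are the ones from the earlier post.

```python
import re

# Matches "#define CONFIG_FOO 1" and "#undef CONFIG_FOO", with or without
# a grep-style "autoconf.h:" prefix in front.
_LINE = re.compile(r"#\s*(define|undef)\s+(CONFIG_\w+)(?:\s+(\S+))?")

REDHAT_LINES = """\
autoconf.h:#define CONFIG_X86_GOOD_APIC 1
autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
autoconf.h:#define CONFIG_X86_IO_APIC 1
autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
autoconf.h:#define CONFIG_X86_NUMA 1
"""

STOCK_LINES = """\
autoconf.h:#define CONFIG_X86_GOOD_APIC 1
autoconf.h:#define CONFIG_X86_IO_APIC 1
autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
autoconf.h:#undef CONFIG_X86_NUMA
"""


def parse_config(text):
    """Return {option: value}; an #undef option maps to None."""
    options = {}
    for line in text.splitlines():
        m = _LINE.search(line)
        if not m:
            continue
        kind, name, value = m.groups()
        options[name] = value if kind == "define" else None
    return options


def diff_config(a, b):
    """Return (option, value_in_a, value_in_b) for every option that differs.

    Assumption: an option absent from one side counts as None, like #undef.
    """
    names = set(a) | set(b)
    rows = [(n, a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)]
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for row in diff_config(parse_config(REDHAT_LINES), parse_config(STOCK_LINES)):
        print(row)
```

On the quoted excerpts this reports CONFIG_X86_CLUSTERED_APIC and CONFIG_X86_NUMA as the only differences.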
>
> And hope is what I need now. Another machine having failed, the Dell
> PowerEdge 1800 in question had to take over some production tasks, and
> because of these interrupt storms, we have to run without APIC, which
> means without SMP, which means without HT; in other words, *slow*!
>

I do not see how to run without APIC as there are four PCI busses on my
machine and I do not see how the chipset and processors could deal with the
interrupts without it.

--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 11:30:00 up 2 days, 5:07, 3 users, load average: 4.44, 4.28, 4.21
#12  01-18-2008, 08:30 AM
Michel Bardiaux
Re: Xeon hyperthreading problem LONG

Jean-David Beyer wrote:
>
>>And hope is what I need now. Another machine having failed, the Dell
>>PowerEdge 1800 in question had to take over some production tasks, and
>>because of these interrupt storms, we have to run without APIC, which
>>means without SMP, which means without HT; in other words, *slow*!
>>

>
> I do not see how to run without APIC as there are four PCI busses on my
> machine and I do not see how the chipset and processors could deal with the
> interrupts without it.
>

In /etc/lilo.conf uncomment the append line and change it to:

append="nosmp noapic"
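
Changes to /etc/lilo.conf take effect only after /sbin/lilo has been rerun and the machine rebooted. As a minimal sketch, assuming Python is available on the machine, the following checks /proc/cmdline to confirm that both options actually reached the kernel. The function name missing_boot_options is illustrative only.

```python
def missing_boot_options(cmdline, wanted=("nosmp", "noapic")):
    """Return the wanted options that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [opt for opt in wanted if opt not in present]


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            missing = missing_boot_options(f.read())
    except OSError as exc:
        # /proc may be unavailable, e.g. on a non-Linux system.
        print("could not read /proc/cmdline:", exc)
    else:
        print("missing options:", ", ".join(missing) if missing else "none")
```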

Now /proc/interrupts shows:

           CPU0
  0:   10834001          XT-PIC  timer
  1:         10          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  usb-uhci
 10:         63          XT-PIC  aic7xxx, usb-uhci
 11:   38368049          XT-PIC  aacraid, usb-uhci, eth0
 14:          2          XT-PIC  ide0
NMI:          0
LOC:          0
ERR:          3
MIS:          0

The system seems a bit less susceptible to interrupt storms, but they
still occur under heavy combined local and network IO, probably because
we have the AACRAID and a Gigabit Ethernet on the same interrupt. Is
there a way to reconfigure so that say eth0 goes to IRQ 3?
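
The sharing described here can be read straight off the /proc/interrupts output above. A minimal sketch in Python that lists every IRQ line with more than one device attached follows. The function name shared_irqs is illustrative, and the parser assumes a single CPU column, as in this uniprocessor output. An SMP system has one count column per CPU and would need a different split.

```python
SAMPLE = """\
CPU0
0: 10834001 XT-PIC timer
1: 10 XT-PIC keyboard
2: 0 XT-PIC cascade
5: 0 XT-PIC usb-uhci
10: 63 XT-PIC aic7xxx, usb-uhci
11: 38368049 XT-PIC aacraid, usb-uhci, eth0
14: 2 XT-PIC ide0
NMI: 0
LOC: 0
ERR: 3
MIS: 0
"""


def shared_irqs(text):
    """Return {irq_number: [devices]} for IRQs used by more than one device.

    Assumes one CPU column: "<irq>: <count> <controller> <dev>, <dev>, ...".
    """
    shared = {}
    for line in text.splitlines():
        fields = line.split(None, 3)
        # Skip the header and the NMI/LOC/ERR/MIS summary lines.
        if len(fields) < 4 or not fields[0].rstrip(":").isdigit():
            continue
        devices = [d.strip() for d in fields[3].split(",")]
        if len(devices) > 1:
            shared[int(fields[0].rstrip(":"))] = devices
    return shared


if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, devices)
```

On the sample this flags IRQ 10 (aic7xxx, usb-uhci) and IRQ 11 (aacraid, usb-uhci, eth0).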

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41
#13  01-18-2008, 08:30 AM
Michel Bardiaux
Re: Xeon hyperthreading problem LONG

prg wrote:
>
> This is a complete shot in the dark -- you sound like you're nearing
> the desperate edge.


Yep. This was to be a test machine plus home to developers; but a
production server has gone on the fritz (multiple disc failures in a
RAID) and the PowerEdge 1800 had to be stabilized at any cost to take
over some production tasks. Replacement machines are being ordered but
they are likely to be recent Dells too, so there...

>
> [q]
> I have also given into the peer pressure and enabled SMP with this
> release.
>
> The debian-dell-2.4.29.iso (md5sum: dcd375d887f18159dd01ba47f513db23)
> should support the following Dell products:
>
> PowerEdge 400SC
> PowerEdge 420SC
> PowerEdge 600SC (thanks Nat)
> PowerEdge 650
> PowerEdge 700 (thanks Eric Busick)
> PowerEdge 750 (thanks Carsten Buchenau)
> PowerEdge 800 (thanks Jean-Christophe Montigny)
> PowerEdge 1300 (thanks Lukasz Andersson)
> PowerEdge 1400 (thanks Matt Griffin)
> PowerEdge 1550
> PowerEdge 1650
> PowerEdge 1600SC (thanks Stephane)
> PowerEdge 1655C
> PowerEdge 1750
> PowerEdge 1800 (thanks weasel)
> PowerEdge 1850 (thanks James L. Morton)
> (and many more ...)
> [eq]
>
> http://wiki.osuosl.org/display/LNX/D...=true#comments


This looks promising. Thanks for the tip.

>
> You may already be aware of it.
>
> RH based kernels (kernel 2.4.21-27.0.2.ELsmp ) here or near?
> ftp://ftp.linux.ncsu.edu/pub/centos/...86/RedHat/RPMS
> ftp://ftp.linux.ncsu.edu/pub/centos/...64/RedHat/RPMS


We had found these but not tried them yet. We also found some patches for
xAPIC support, but they are for 2.4.21 or 2.4.22 and do not apply to 2.4.28.

>
> good luck,
> prg
>



--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41
#14  01-18-2008, 08:30 AM
Jean-David Beyer
Re: Xeon hyperthreading problem LONG

Michel Bardiaux wrote:
> Jean-David Beyer wrote:
>
>>
>>> And hope is what I need now. Another machine having failed, the Dell
>>> PowerEdge 1800 in question had to take over some production tasks, and
>>> because of these interrupt storms, we have to run without APIC, which
>>> means without SMP, which means without HT; in other words, *slow*!
>>>

>>
>> I do not see how to run without APIC as there are four PCI busses on my
>> machine and I do not see how the chipset and processors could deal with
>> the interrupts without it.
>>

> In /etc/lilo.conf uncomment the append line and change it to:
>
> append="nosmp noapic"


I know how to do that, but I do not dare. I do not see how all the
interrupts from the E7501 chipset (an ICH3-S, two P64H2s, and an MCH)
would work without APIC enabled.


> Now /proc/interrupts shows:
>
>            CPU0
>   0:   10834001          XT-PIC  timer
>   1:         10          XT-PIC  keyboard
>   2:          0          XT-PIC  cascade
>   5:          0          XT-PIC  usb-uhci
>  10:         63          XT-PIC  aic7xxx, usb-uhci
>  11:   38368049          XT-PIC  aacraid, usb-uhci, eth0
>  14:          2          XT-PIC  ide0
> NMI:          0
> LOC:          0
> ERR:          3
> MIS:          0
>
> The system seems a bit less susceptible to interrupt storms, but they
> still occur under heavy combined local and network IO, probably because
> we have the AACRAID and a Gigabit Ethernet on the same interrupt. Is
> there a way to reconfigure so that say eth0 goes to IRQ 3?
>



--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 11:45:00 up 3 days, 5:22, 4 users, load average: 4.41, 4.25, 4.29
#15  01-18-2008, 08:30 AM
Michel Bardiaux
Re: Xeon hyperthreading problem LONG

Jean-David Beyer wrote:
>>In /etc/lilo.conf uncomment the append line and change it to:
>>
>>append="nosmp noapic"

>
>
> I know how to do that, but I do not dare. I do not see how all the
> interrupts from the E7501 chipset (an ICH3-S, two P64H2s, and an MCH)
> would work without APIC enabled.
>

All I can tell you is that our system works and is a bit more stable that
way than with HT and SMP and APIC. From reading

http://www.intel.com/design/chipsets/embedded/e7501.htm

I would guess the "legacy I/O" of the ICH3 includes working with XT-PIC,
and in uniprocessor non-HT mode the MCH does not need APIC either.

(Isn't the web great? One minute ago I had no clue what an E7501 was!)

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41

All times are GMT.