This is a discussion on Xeon hyperthreading problem within the Linux Operating System forums, part of the Unix Operating Systems category.
Michel Bardiaux wrote:
> Jean-David Beyer wrote:
>
>> Michel Bardiaux wrote:
>>
>>> I've compared autoconf.h and I think this is the significant difference;
>>> yours:
>>>
>>> autoconf.h:#define CONFIG_X86_GOOD_APIC 1
>>> autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
>>> autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
>>> autoconf.h:#define CONFIG_X86_IO_APIC 1
>>> autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
>>> autoconf.h:#define CONFIG_X86_NUMA 1
>>>
>>> Mine:
>>>
>>> autoconf.h:#define CONFIG_X86_GOOD_APIC 1
>>> autoconf.h:#define CONFIG_X86_IO_APIC 1
>>> autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
>>> autoconf.h:#undef CONFIG_X86_NUMA
>
> We have made a new 2.4.28 with NUMA support selected to have the same
> autoconf.h, still the same problem of all interrupts to the same CPU. It
> seems to me now that the critical line in your dmesg output is this one:
>
>> xAPIC support is present
>
> I don't have it, moreover this string appears nowhere in the stock 2.4.28
> sources! From web searches it seems this might be part of a RedHat
> patch. So my best hope now is to compare my kernel sources with yours.
> Where did you get your kernel 2.4.21-27.0.2.ELsmp ? Did you apply any
> additional patches?

Red Hat; the up2date daemon runs and notifies me whenever updates are
available. I click on an icon and it downloads the RPMs and installs them. I
made no patches. (I am now running

$ uname -r
2.4.21-27.0.4.ELsmp
)

>
> And hope is what I need now. Another machine having failed, the Dell
> PowerEdge 1800 in question had to take over some production tasks, and
> because of these interrupt storms, we have to run without APIC, which
> means without SMP, which means without HT; in other words, *slow*!
>

I do not see how to run without APIC as there are four PCI busses on my
machine and I do not see how the chipset and processors could deal with the
interrupts without it.

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 11:30:00 up 2 days, 5:07, 3 users, load average: 4.44, 4.28, 4.21

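The autoconf.h comparison quoted in this post can be automated. What follows is only a minimal sketch, not anything the posters actually ran: it assumes the input is grep-style output like the lines quoted above, and the names `parse_config`, `diff_config`, `YOURS` and `MINE` are made up for illustration. A symbol that is `#undef` and a symbol that is missing altogether are treated as the same state.

```python
import re

# Matches "#define CONFIG_FOO 1" or "#undef CONFIG_FOO", with or without a
# leading "autoconf.h:" prefix as produced by grep.
_LINE = re.compile(r"#\s*(define|undef)\s+(CONFIG_\w+)(?:\s+(\S+))?")


def parse_config(text):
    """Map each CONFIG_* symbol to its value; None means #undef."""
    config = {}
    for line in text.splitlines():
        m = _LINE.search(line)
        if m is None:
            continue
        kind, name, value = m.groups()
        config[name] = (value or "") if kind == "define" else None
    return config


def diff_config(a, b):
    """Return {symbol: (value_in_a, value_in_b)} for symbols that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Sample data copied from the quoted post (hypothetical variable names).
YOURS = """
autoconf.h:#define CONFIG_X86_GOOD_APIC 1
autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
autoconf.h:#define CONFIG_X86_CLUSTERED_APIC 1
autoconf.h:#define CONFIG_X86_IO_APIC 1
autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
autoconf.h:#define CONFIG_X86_NUMA 1
"""

MINE = """
autoconf.h:#define CONFIG_X86_GOOD_APIC 1
autoconf.h:#define CONFIG_X86_IO_APIC 1
autoconf.h:#define CONFIG_X86_LOCAL_APIC 1
autoconf.h:#undef CONFIG_X86_NUMA
"""

if __name__ == "__main__":
    delta = diff_config(parse_config(YOURS), parse_config(MINE))
    for name, (yours, mine) in delta.items():
        print(f"{name}: yours={yours!r} mine={mine!r}")
```

On the quoted lines this reports only CONFIG_X86_CLUSTERED_APIC and CONFIG_X86_NUMA as differing, which matches the difference Michel pointed out by eye.
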
Jean-David Beyer wrote:
>
>> And hope is what I need now. Another machine having failed, the Dell
>> PowerEdge 1800 in question had to take over some production tasks, and
>> because of these interrupt storms, we have to run without APIC, which
>> means without SMP, which means without HT; in other words, *slow*!
>>
>
> I do not see how to run without APIC as there are four PCI busses on my
> machine and I do not see how the chipset and processors could deal with the
> interrupts without it.
>

In /etc/lilo.conf uncomment the append line and change it to:

append="nosmp noapic"

Now /proc/interrupts shows:

           CPU0
  0:   10834001          XT-PIC  timer
  1:         10          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  usb-uhci
 10:         63          XT-PIC  aic7xxx, usb-uhci
 11:   38368049          XT-PIC  aacraid, usb-uhci, eth0
 14:          2          XT-PIC  ide0
NMI:          0
LOC:          0
ERR:          3
MIS:          0

The system seems a bit less susceptible to interrupt storms, but they
still occur under heavy combined local and network IO, probably because
we have the AACRAID and a Gigabit Ethernet on the same interrupt. Is
there a way to reconfigure so that, say, eth0 goes to IRQ 3?

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41

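The shared-interrupt suspicion in this post can be checked mechanically from the /proc/interrupts listing. The following is a rough sketch under stated assumptions: it expects the 2.4-style layout shown above (a CPU header row, then per-CPU counts, a controller name, and a comma-separated device list), and the helper names `parse_interrupts`, `shared_irqs` and `busiest_irq` are hypothetical. Newer kernels print extra columns that this parser does not handle.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq: (counts, controller, devices)}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 [CPU1 ...]
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = [int(c) for c in fields[:ncpus]]
        controller = fields[ncpus] if len(fields) > ncpus else ""
        names = " ".join(fields[ncpus + 1:]).split(",")
        devices = [d.strip() for d in names if d.strip()]
        table[label.strip()] = (counts, controller, devices)
    return table


def shared_irqs(table):
    """Return {irq: devices} for every line with more than one device."""
    return {irq: devs for irq, (_, _, devs) in table.items() if len(devs) > 1}


def busiest_irq(table):
    """Return the label of the line with the highest total count."""
    return max(table, key=lambda irq: sum(table[irq][0]))


# Listing copied from the post above.
SAMPLE = """
           CPU0
  0:   10834001          XT-PIC  timer
  1:         10          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  usb-uhci
 10:         63          XT-PIC  aic7xxx, usb-uhci
 11:   38368049          XT-PIC  aacraid, usb-uhci, eth0
 14:          2          XT-PIC  ide0
NMI:          0
LOC:          0
ERR:          3
MIS:          0
"""

if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    print("busiest:", busiest_irq(table))
    for irq, devices in shared_irqs(table).items():
        print(f"IRQ {irq} shared by: {', '.join(devices)}")
```

On this listing the busiest line is IRQ 11, which is also the one shared by aacraid, usb-uhci and eth0. On an SMP boot the same parser yields one count per CPU, so it could also be used to see whether all interrupts land on a single CPU.
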
prg wrote:
>
> This is a complete shot in the dark -- you sound like you're nearing
> the desperate edge.

Yep. This was to be a test machine plus home to developers; but a
production server has gone on the fritz (multiple disc failures in a
RAID) and the PowerEdge 1800 had to be stabilized at any cost to take
over some production tasks. Replacement machines are being ordered but
they are likely to be recent Dells too, so there...

>
> [q]
> I have also given into the peer pressure and enabled SMP with this
> release.
>
> The debian-dell-2.4.29.iso (md5sum: dcd375d887f18159dd01ba47f513db23)
> should support the following Dell products:
>
> PowerEdge 400SC
> PowerEdge 420SC
> PowerEdge 600SC (thanks Nat)
> PowerEdge 650
> PowerEdge 700 (thanks Eric Busick)
> PowerEdge 750 (thanks Carsten Buchenau)
> PowerEdge 800 (thanks Jean-Christophe Montigny)
> PowerEdge 1300 (thanks Lukasz Andersson)
> PowerEdge 1400 (thanks Matt Griffin)
> PowerEdge 1550
> PowerEdge 1650
> PowerEdge 1600SC (thanks Stephane)
> PowerEdge 1655C
> PowerEdge 1750
> PowerEdge 1800 (thanks weasel)
> PowerEdge 1850 (thanks James L. Morton)
> (and many more ...)
> [eq]
>
> http://wiki.osuosl.org/display/LNX/D...=true#comments

This looks promising. Thanks for the tip.

>
> You may already be aware of it.
>
> RH based kernels (kernel 2.4.21-27.0.2.ELsmp ) here or near?
> ftp://ftp.linux.ncsu.edu/pub/centos/...86/RedHat/RPMS
> ftp://ftp.linux.ncsu.edu/pub/centos/...64/RedHat/RPMS

We had found these but not tried them yet. We also found some patches for
xAPIC support, but they are for 2.4.21 or 2.4.22 and do not apply to
2.4.28.

>
> good luck,
> prg
>

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41

Michel Bardiaux wrote:
> Jean-David Beyer wrote:
>
>>
>>> And hope is what I need now. Another machine having failed, the Dell
>>> PowerEdge 1800 in question had to take over some production tasks, and
>>> because of these interrupt storms, we have to run without APIC, which
>>> means without SMP, which means without HT; in other words, *slow*!
>>>
>>
>> I do not see how to run without APIC as there are four PCI busses on my
>> machine and I do not see how the chipset and processors could deal
>> with the interrupts without it.
>>
> In /etc/lilo.conf uncomment the append line and change it to:
>
> append="nosmp noapic"

I know how to do that, but I do not dare. I do not see how all the
interrupts from the E7501 chipset (including E7501-ICH3-S, two
E7501-P64H2.pdf, and a E7501_MCH.pdf) would work without APIC enabled.

> Now /proc/interrupts shows:
>
>            CPU0
>   0:   10834001          XT-PIC  timer
>   1:         10          XT-PIC  keyboard
>   2:          0          XT-PIC  cascade
>   5:          0          XT-PIC  usb-uhci
>  10:         63          XT-PIC  aic7xxx, usb-uhci
>  11:   38368049          XT-PIC  aacraid, usb-uhci, eth0
>  14:          2          XT-PIC  ide0
> NMI:          0
> LOC:          0
> ERR:          3
> MIS:          0
>
> The system seems a bit less susceptible to interrupt storms, but they
> still occur under heavy combined local and network IO, probably because
> we have the AACRAID and a Gigabit Ethernet on the same interrupt. Is
> there a way to reconfigure so that say eth0 goes to IRQ 3?
>

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 11:45:00 up 3 days, 5:22, 4 users, load average: 4.41, 4.25, 4.29

Jean-David Beyer wrote:
>> In /etc/lilo.conf uncomment the append line and change it to:
>>
>> append="nosmp noapic"
>
> I know how to do that, but I do not dare. I do not see how all the
> interrupts from the E7501 chipset (including E7501-ICH3-S, two
> E7501-P64H2.pdf, and a E7501_MCH.pdf) would work without APIC enabled.
>

All I can tell you is that our system works, and is a bit more stable
that way than with HT, SMP and APIC enabled.

Reading http://www.intel.com/design/chipsets/embedded/e7501.htm I would
guess that the "legacy I/O" of the ICH3 includes working with the XT-PIC,
and that in uniprocessor non-HT mode the MCH does not need the APIC
either.

(Isn't the web great? 1 minute ago I had not a clue what an E7501 was!)

--
Michel Bardiaux
Peaktime Belgium S.A. Bd. du Souverain, 191 B-1160 Bruxelles
Tel : +32 2 790.29.41