Manuel Bouyer | 16 Apr 2012 23:05

support for more than 32 CPUs

Hello,
the attached patch, based on a patch sent by Mindaugas Rasiukevicius
on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
amd64, and should easily allow up to 64 for Xen/amd64. I tested it
on an x86 with 64 AMD cores (lightly, as this box has no known drive yet - some
driver hacking is needed), 8 Intel cores and with a Xen domU with 4 cores.
I didn't notice regressions so far.

Comments before I commit ?

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

Thor Lancelot Simon | 16 Apr 2012 23:22

Re: support for more than 32 CPUs

On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> Hello,
> the attached patch, based on a patch sent by Mindaugas Rasiukevicius
> on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> amd64, and should easily allow up to 64 for Xen/amd64. I tested it
> on an x86 with 64 AMD cores (lightly, as this box has no known drive yet - some
> driver hacking is needed), 8 Intel cores and with a Xen domU with 4 cores.
> I didn't notice regressions so far.
> 
> Comments before I commit ?

Only that for even a compilation workload, we seem to lose performance
after about 15 CPUs, so though we'll probe more CPUs this way... there
may be less benefit than one would expect.

Thor

Mindaugas Rasiukevicius | 16 Apr 2012 23:54

Re: support for more than 32 CPUs

Thor Lancelot Simon <tls <at> panix.com> wrote:
> On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> > Hello,
> > the attached patch, based on a patch sent by Mindaugas Rasiukevicius
> > on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> > amd64, and should easily allow up to 64 for Xen/amd64. I tested it
> > on an x86 with 64 AMD cores (lightly, as this box has no known drive yet
> > - some driver hacking is needed), 8 Intel cores and with a Xen domU
> > with 4 cores. I didn't notice regressions so far.
> > 
> > Comments before I commit ?
> 
> Only that for even a compilation workload, we seem to lose performance
> after about 15 CPUs, so though we'll probe more CPUs this way... there
> may be less benefit than one would expect.

It depends on the workload.  The ~16 CPUs threshold you are talking about
applies when there is high pressure on UVM and the global page queue locks
become contended.  For some other workloads, we scale better.

In any case, this is a step in the right direction.

> 
> Thor

-- 
Mindaugas

Manuel Bouyer | 17 Apr 2012 10:15

Re: support for more than 32 CPUs

On Mon, Apr 16, 2012 at 10:54:44PM +0100, Mindaugas Rasiukevicius wrote:
> Thor Lancelot Simon <tls <at> panix.com> wrote:
> > On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> > > Hello,
> > > the attached patch, based on a patch sent by Mindaugas Rasiukevicius
> > > on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> > > amd64, and should easily allow up to 64 for Xen/amd64. I tested it
> > > on an x86 with 64 AMD cores (lightly, as this box has no known drive yet
> > > - some driver hacking is needed), 8 Intel cores and with a Xen domU
> > > with 4 cores. I didn't notice regressions so far.
> > > 
> > > Comments before I commit ?
> > 
> > Only that for even a compilation workload, we seem to lose performance
> > after about 15 CPUs, so though we'll probe more CPUs this way... there
> > may be less benefit than one would expect.
> 
> It depends on the workload.  The ~16 CPUs threshold you are talking about
> applies when there is high pressure on UVM and the global page queue locks
> become contended.  For some other workloads, we scale better.
> 
> In any case, this is a step in the right direction.

It was not clear in my original email, but without this patch NetBSD
doesn't boot on such a machine. Even if performance is not there yet,
at least we can run benchmarks now :)
Or we can turn it into a Xen dom0 ...

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>

Roger Pau Monné | 16 Apr 2012 23:46

Re: support for more than 32 CPUs

On Mon, Apr 16, 2012 at 10:05 PM, Manuel Bouyer <bouyer <at> antioche.eu.org> wrote:
> Hello,
> the attached patch, based on a patch sent by Mindaugas Rasiukevicius

Maybe it's my problem, but I didn't receive any patch with this email,
could you check it please? I would like to try it.

Thanks, Roger.

> on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> amd64, and should easily allow up to 64 for Xen/amd64. I tested it
> on an x86 with 64 AMD cores (lightly, as this box has no known drive yet - some
> driver hacking is needed), 8 Intel cores and with a Xen domU with 4 cores.
> I didn't notice regressions so far.
>
> Comments before I commit ?
>
> --
> Manuel Bouyer <bouyer <at> antioche.eu.org>
>     NetBSD: 26 ans d'experience feront toujours la difference
> --

Manuel Bouyer | 17 Apr 2012 10:12

Re: support for more than 32 CPUs

On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> Hello,
> the attached patch,

Which I forgot to attach, as pointed out by several of you. Here it is.

> based on a patch sent by Mindaugas Rasiukevicius
> on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> amd64, and should easily allow up to 64 for Xen/amd64. I tested it
> on an x86 with 64 AMD cores (lightly, as this box has no known drive yet - some
> driver hacking is needed), 8 Intel cores and with a Xen domU with 4 cores.
> I didn't notice regressions so far.
> 
> Comments before I commit ?
> 
> -- 
> Manuel Bouyer <bouyer <at> antioche.eu.org>
>      NetBSD: 26 ans d'experience feront toujours la difference
> --

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--
Index: arch/amd64/amd64/genassym.cf
===================================================================
RCS file: /cvsroot/src/sys/arch/amd64/amd64/genassym.cf,v
retrieving revision 1.49

Manuel Bouyer | 17 Apr 2012 20:21

Re: support for more than 32 CPUs

On Tue, Apr 17, 2012 at 10:12:26AM +0200, Manuel Bouyer wrote:
> On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> > Hello,
> > the attached patch,
> 
> Which I forgot to attach, as pointed out by several of you. Here it is.

And it looks like it didn't get to the lists, maybe because it's too
large. You can find it at: http://www.netbsd.org/~bouyer/x8664cpu.diff

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

Mindaugas Rasiukevicius | 18 Apr 2012 00:08

Re: support for more than 32 CPUs

Manuel Bouyer <bouyer <at> antioche.eu.org> wrote:
> > > Hello,
> > > the attached patch,
> > 
> > Which I forgot to attach, as pointed out by several of you. Here it is.
> 
> And it looks like it didn't get to the lists, maybe because it's too
> large. You can find it at: http://www.netbsd.org/~bouyer/x8664cpu.diff

Cool!  The patch seems the same as the original, but with my silly bugs fixed. :)
Did you try it on multiple SMP machines?  My concern is early boot, when,
due to our messy MD initialisation code, bugs like a missing TLB flush can
happen.  While on some machines it fails immediately, on other machines
it might get pretty lucky (and once booted, it is handled correctly).

From pmap_tlb_intr():

+	if (!kcpuset_isset(tm->tm_pending, cid)) {
+		return;
+	}

I kept this pending mask to make the code more defensive, i.e. it would
handle spurious IPIs.  However, AFAIK, that should not happen unless
the hardware fails.  If so, tm_pendcount and tm_gen are enough and the
pending mask can be removed.  Do you see/know any corner case here?
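
To illustrate the guard quoted above, here is a minimal, hypothetical sketch
of a pending-CPU set wider than one 32-bit word, with a handler that ignores
an IPI whose bit is not set.  This is not the NetBSD kcpuset(9) API and not
the real pmap_tlb_intr(); the names cpuset_sketch_t and
handle_shootdown_ipi(), and the fixed size of 256 CPUs, are assumptions
chosen for illustration only.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes: 256 CPUs spread over an array of 32-bit words. */
#define SKETCH_MAX_CPUS  256u
#define SKETCH_WORD_BITS ((unsigned)(sizeof(uint32_t) * CHAR_BIT))
#define SKETCH_WORDS     (SKETCH_MAX_CPUS / SKETCH_WORD_BITS)

/* A CPU set that is no longer limited to the width of a single word. */
typedef struct {
	uint32_t bits[SKETCH_WORDS];
} cpuset_sketch_t;

/* Mark CPU 'cpu' as a member of the set. */
static void
cpuset_sketch_set(cpuset_sketch_t *s, unsigned cpu)
{
	s->bits[cpu / SKETCH_WORD_BITS] |= UINT32_C(1) << (cpu % SKETCH_WORD_BITS);
}

/* Remove CPU 'cpu' from the set. */
static void
cpuset_sketch_clear(cpuset_sketch_t *s, unsigned cpu)
{
	s->bits[cpu / SKETCH_WORD_BITS] &= ~(UINT32_C(1) << (cpu % SKETCH_WORD_BITS));
}

/* Test whether CPU 'cpu' is a member of the set. */
static bool
cpuset_sketch_isset(const cpuset_sketch_t *s, unsigned cpu)
{
	return (s->bits[cpu / SKETCH_WORD_BITS] >> (cpu % SKETCH_WORD_BITS)) & 1u;
}

/*
 * Defensive handler: returns false for a spurious IPI (no request pending
 * for this CPU) and true once a pending request has been acknowledged.
 * The actual TLB invalidation is deliberately left out of the sketch.
 */
static bool
handle_shootdown_ipi(cpuset_sketch_t *pending, unsigned cid)
{
	if (!cpuset_sketch_isset(pending, cid)) {
		return false;	/* spurious: nothing was requested */
	}
	/* ... a real handler would flush TLB entries here ... */
	cpuset_sketch_clear(pending, cid);
	return true;
}
```

With a set like this, a sender would set the bit of each target CPU before
sending the IPI; a second IPI arriving for the same CPU after the bit has
been cleared is then simply ignored.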

Thanks a lot for working on this!  Do you want me to commit the patch,
or would you like to do it yourself?


Manuel Bouyer | 18 Apr 2012 13:25

Re: support for more than 32 CPUs

On Tue, Apr 17, 2012 at 11:08:17PM +0100, Mindaugas Rasiukevicius wrote:
> Manuel Bouyer <bouyer <at> antioche.eu.org> wrote:
> > > > Hello,
> > > > the attached patch,
> > > 
> > > Which I forgot to attach, as pointed out by several of you. Here it is.
> > 
> > And it looks like it didn't get to the lists, maybe because it's too
> > large. You can find it at: http://www.netbsd.org/~bouyer/x8664cpu.diff
> 
> Cool!  The patch seems the same as the original, but with my silly bugs fixed. :)
> Did you try it on multiple SMP machines?

I booted the 64-core AMD system (of course) and also a 4-core hyperthreaded
Intel Xeon system. I'll also test a dual-core hyperthreaded Intel i5 desktop.

> My concern is early boot, when,
> due to our messy MD initialisation code, bugs like a missing TLB flush can
> happen.  While on some machines it fails immediately, on other machines
> it might get pretty lucky (and once booted, it is handled correctly).
> 
> From pmap_tlb_intr():
> 
> +	if (!kcpuset_isset(tm->tm_pending, cid)) {
> +		return;
> +	}
> 
> I kept this pending mask to make the code more defensive, i.e. it would
> handle spurious IPIs.  However, AFAIK, that should not happen unless
> the hardware fails.  If so, tm_pendcount and tm_gen are enough and that

Christoph Egger | 18 Apr 2012 22:17

Re: support for more than 32 CPUs

On 18.04.12 13:25, Manuel Bouyer wrote:
> On Tue, Apr 17, 2012 at 11:08:17PM +0100, Mindaugas Rasiukevicius wrote:
>> Manuel Bouyer <bouyer <at> antioche.eu.org> wrote:
>>>>> Hello,
>>>>> the attached patch,
>>>>
>>>> Which I forgot to attach, as pointed out by several of you. Here it is.
>>>
>>> And it looks like it didn't get to the lists, maybe because it's too
>>> large. You can find it at: http://www.netbsd.org/~bouyer/x8664cpu.diff
>>
>> Cool!  The patch seems the same as the original, but with my silly bugs fixed. :)
>> Did you try it on multiple SMP machines?
> 
> I booted the 64-core AMD system (of course) and also a 4-core hyperthreaded
> Intel Xeon system. I'll also test a dual-core hyperthreaded Intel i5 desktop.
> 

It is possible to boot NetBSD as a Xen HVM guest. This way you can test
with up to 128 CPUs.

Christoph

Hubert Feyrer | 18 Apr 2012 22:32

Re: support for more than 32 CPUs

On Wed, 18 Apr 2012, Christoph Egger wrote:
> It is possible to boot NetBSD as a Xen HVM guest. This way you can test
> with up to 128 CPUs.

post dmesg :)

  - Hubert

Manuel Bouyer | 20 Apr 2012 16:12

Re: support for more than 32 CPUs

On Wed, Apr 18, 2012 at 10:32:13PM +0200, Hubert Feyrer wrote:
> On Wed, 18 Apr 2012, Christoph Egger wrote:
> > It is possible to boot NetBSD as a Xen HVM guest. This way you can test
> > with up to 128 CPUs.
> 
> post dmesg :)

Here it is !

Not much tested yet, but it boots multiuser.
I'll try a build.sh -j<a lot> later ...

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 6.99.4 (GENERIC) #5: Thu Apr 19 23:56:17 CEST 2012
	bouyer <at> houla:/dsk/l1/misc/bouyer/tmp/amd64/obj/dsk/l1/misc/bouyer/HEAD/src/sys/arch/amd64/compile/GENERIC
total memory = 8191 MB
avail memory = 7938 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100

Hubert Feyrer | 20 Apr 2012 21:55

Re: support for more than 32 CPUs

On Fri, 20 Apr 2012, Manuel Bouyer wrote:
> Here it is !

Cool!
BTW, apparently you can test this with Qemu, too.
When I last tried this in 2005[1] it was very slow, though, and
apparently that hasn't improved. Still, recent qemu has a bunch of knobs
to play with:

-smp n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]
                 set the number of CPUs to 'n' [default=1]
                 maxcpus= maximum number of total cpus, including
                 offline CPUs for hotplug, etc
                 cores= number of CPU cores on one socket
                 threads= number of threads on one CPU core
                 sockets= number of discrete sockets in the system

... resulting in NetBSD probing this as:

cpu0 at mainbus0 apid 0: QEMU Virtual CPU version 0.14.1, id 0x633
cpu1 at mainbus0 apid 1: QEMU Virtual CPU version 0.14.1, id 0x633
...

(I don't have -current or so here, to see how high I can get)

  - Hubert

[1] http://www.feyrer.de/NetBSD/blog.html/nb_20051222_0659.html


Joerg Sonnenberger | 18 Apr 2012 13:38

Re: support for more than 32 CPUs

On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> the attached patch, based on a patch sent by Mindaugas Rasiukevicius
> on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> amd64, and should easily allow up to 64 for Xen/amd64.

There are some unrelated changes e.g. to the halt logic in mptramp. Is
that intentional?

Joerg

Mindaugas Rasiukevicius | 18 Apr 2012 22:52

Re: support for more than 32 CPUs

Joerg Sonnenberger <joerg <at> britannica.bec.de> wrote:
> On Mon, Apr 16, 2012 at 11:05:04PM +0200, Manuel Bouyer wrote:
> > the attached patch, based on a patch sent by Mindaugas Rasiukevicius
> > on tech-kern <at>  some time ago, bumps the max number of CPUs to 256 for
> > amd64, and should easily allow up to 64 for Xen/amd64.
> 
> There are some unrelated changes e.g. to the halt logic in mptramp. Is
> that intentional?

You mean the jump to idle_loop()?  The changes are not directly related.
A while ago I added the setting of kcpuset_running in idle_loop() (committed
separately), and this change came out of inspecting the code of the callers.
Calling it from C is just more readable.

> 
> Joerg

-- 
Mindaugas

