Low-level details of SGI multiprocessing?


Post by cesss » Mon Jan 04, 2016 4:41 am

Are there any known details on programming the ASICs responsible for managing the CPUs in multiprocessor SGIs? I mean details from an OS-development point of view (e.g. the ASIC commands needed to start a new process on a given CPU, or to schedule each CPU).

I don't mind if the details are for Challenge, for the Octane, or for NUMA or later MP boxes.

I'm asking because I'm curious about the command protocol SGI used to control MP systems. I'm starting a pet project that needs (simple) MP, and maybe I could get some inspiration from older SGI designs...

Post by robespierre » Mon Jan 04, 2016 10:51 am

The O2000 and Onyx2 (and by extension the Octane and Tezro) are based on an academic computer design from Stanford called DASH. There are many papers about it on CiteSeer. This design uses directory memories to keep caches coherent across a loosely coupled mesh of processing nodes (each of which has several CPUs in a traditional SMP arrangement).
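
To make the "directory memories" idea concrete, here is a minimal toy model in Python of a directory that tracks, per memory block, which nodes hold a cached copy. It is only a sketch under stated assumptions: the three states loosely follow the uncached/shared/dirty scheme described in the DASH papers, and every class, method, and message name below is hypothetical. It is not the real DASH or SGI hub protocol.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    UNCACHED = "uncached"  # no node caches the block
    SHARED = "shared"      # one or more nodes hold read-only copies
    DIRTY = "dirty"        # exactly one node holds a modified copy


@dataclass
class DirectoryEntry:
    state: State = State.UNCACHED
    holders: set = field(default_factory=set)  # ids of nodes caching the block


class Directory:
    """Toy directory for the memory blocks homed on one node (hypothetical)."""

    def __init__(self):
        self.entries = {}

    def _entry(self, block):
        return self.entries.setdefault(block, DirectoryEntry())

    def read(self, node, block):
        """Handle a read miss; return the coherence messages it triggers."""
        entry = self._entry(block)
        messages = []
        if entry.state is State.DIRTY:
            (owner,) = entry.holders
            if owner == node:
                return []  # the owner already has the only valid copy
            # The dirty copy must be written back before anyone else reads it.
            messages.append(("writeback", owner))
        entry.holders.add(node)
        entry.state = State.SHARED
        return messages

    def write(self, node, block):
        """Handle a write; every other cached copy must be removed first."""
        entry = self._entry(block)
        others = sorted(entry.holders - {node})
        kind = "writeback_invalidate" if entry.state is State.DIRTY else "invalidate"
        messages = [(kind, n) for n in others]
        entry.holders = {node}
        entry.state = State.DIRTY
        return messages
```

The point of the design is that a write only sends messages to the nodes listed in the directory entry, instead of broadcasting on a shared bus; that is what lets this kind of machine scale beyond a single SMP bus.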

Post by GL1zdA » Mon Jan 04, 2016 1:23 pm

This is the DASH book: Scalable Shared-Memory Multiprocessing. About 30% of it (100 pages) is "Experience with DASH". I got it some time ago for less than $10.

Post by robespierre » Tue Jan 05, 2016 9:33 am

I think that the Octane and Desktop Tezro only have a single node, either with 2 CPUs (for Octane) or 4 CPUs (for Tezro). You would think that the Rackmount Tezro is really an O350 so it would support multiple nodes, but the second brick has graphics and I/O only, no CPUs.

Post by GL1zdA » Tue Jan 05, 2016 1:36 pm

robespierre wrote: I think that the Octane and Desktop Tezro only have a single node, either with 2 CPUs (for Octane) or 4 CPUs (for Tezro). You would think that the Rackmount Tezro is really an O350 so it would support multiple nodes, but the second brick has graphics and I/O only, no CPUs.

You're right about the Octane and Desktop Tezro, but the Rackmount Tezro was also available as 2 nodes with 2 CPUs each, using the Workstation Expansion Module (since you couldn't use a 4x 1 GHz CPU nodeboard with the V12). I've tried to compile the available options here, but I don't think anyone has experimented with extreme (unsupported) Tezro configs like the ones in my questions. There were, however, successful attempts to change the various personalities (Origin 350/Onyx 350/Tezro) of the Chimera.

Post by cesss » Fri Jan 08, 2016 6:27 am

ivelegacy wrote: Good question. I do not have an answer, but I can say OpenBSD and Linux have SMP for IP30.

Thanks, I forgot this. I'll take a look at the source. Thanks a lot too to everybody who contributed pointers to the DASH documentation.

Post by vishnu » Fri Jan 08, 2016 9:38 pm

Most of the Linux SMP code came out of SGI, if you're of a mind to look at any of that. Probably you're not... :roll:

Post by kramlq » Fri Jan 22, 2016 10:57 am

cesss wrote: Are there any known details on programming the ASICs responsible for managing the CPUs in multiprocessor SGIs? I mean details from an OS-development point of view (e.g. the ASIC commands needed to start a new process on a given CPU, or to schedule each CPU).

I don't mind if the details are for Challenge, for the Octane, or for NUMA or later MP boxes.

I'm asking because I'm curious about the command protocol SGI used to control MP systems. I'm starting a pet project that needs (simple) MP, and maybe I could get some inspiration from older SGI designs...


If you are interested in experimenting a bit with multiprocessors, inter-processor interrupts etc., then Stanford's SimOS/MIPS may also be of interest. It's related to the FLASH architecture (http://mprc.pku.edu.cn/mentors/training ... kuskin.pdf), which many Stanford DASH people (and future VMware people) were involved in. It's only a partial implementation, though, and probably about 50% of its commands and registers are no-ops. The documentation isn't great, but you do have the source code to see how the 'hardware' works, which makes it possible to understand enough to do CPU startup/shutdown, interrupt setup, interprocessor interrupts, timers etc. It doesn't take a lot of code either: maybe 10% of what an x86/Intel MPS multiprocessor from that same era would have taken. If learning basic SMP operations is your goal, it's probably a good starting point. For example, it would be a good exercise to try to port an existing OS like Linux to it.

Linux SMP isn't that hard to understand (though admittedly I haven't kept up with it for many years now, and it tends to grow in complexity each year). Even though they were obsolete, I found it easier to look at simpler early SMP architectures first (e.g. SPARC32, Alpha): PROM and hardware handled many of the really low-level details, so you can concentrate on understanding how things like tracking and managing CPUs, IPIs, and cache operations work. I certainly think diving straight into something like a high-end NUMA architecture is going to be a more difficult approach.

Also, I'd recommend you get a really good grounding in cache/TLB coherency, atomic locking, memory ordering, and interrupts on multiprocessor systems first. Otherwise you will probably have to redesign all your code when it doesn't work as you expect on real hardware :-)
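
As an illustration of the bookkeeping mentioned above (tracking which CPUs are online and delivering IPIs), here is a small single-threaded toy model in Python. It is only a sketch: the class, the message bits, and the method names are all hypothetical and are not taken from SimOS, IRIX, or Linux. It assumes a common software pattern in which each CPU has a word of pending message bits that the sender ORs into before raising the hardware interrupt.

```python
class ToySMP:
    """Single-threaded model of CPU tracking and IPI delivery (hypothetical)."""

    # Hypothetical IPI message types, one bit each.
    IPI_RESCHEDULE = 1 << 0
    IPI_CALL_FUNC = 1 << 1

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.online_mask = 1          # bit n set => CPU n is online; CPU 0 boots first
        self.pending = [0] * ncpus    # per-CPU word of pending IPI message bits

    def bring_up(self, cpu):
        """Mark a secondary CPU online, as boot code would after starting it."""
        self.online_mask |= 1 << cpu

    def send_ipi(self, target_mask, message):
        """Post a message to every online CPU in target_mask; return those CPUs."""
        signalled = []
        for cpu in range(self.ncpus):
            bit = 1 << cpu
            if target_mask & bit and self.online_mask & bit:
                # On real hardware this OR has to be atomic, and it is followed
                # by a write to the interrupt controller to raise the interrupt.
                self.pending[cpu] |= message
                signalled.append(cpu)
        return signalled

    def handle_ipi(self, cpu):
        """What the target's interrupt handler does: fetch and clear pending bits."""
        messages, self.pending[cpu] = self.pending[cpu], 0
        return messages
```

On real hardware the fetch-and-clear in `handle_ipi` also has to be atomic with respect to concurrent senders, which is one reason atomic operations and memory ordering deserve attention before writing this kind of code for a real machine.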

Post by GL1zdA » Mon Jan 25, 2016 5:47 am

The Onyx is actually a very classic SMP design - a fast bus integrating all components. Onyx2 is where it becomes interesting, when they try to find the right balance between tightly coupled SMP and loosely coupled cluster.

Post by vishnu » Mon Jan 25, 2016 11:25 pm

Linus Torvalds on 13 November 2012:

SGI in particular worked a lot on scaling past a few hundred CPUs. Their initial patches could just not be merged. There was no way we could take the work they did and use it on a regular PC because they added all this infrastructure to work on thousands of CPUs. That was way too expensive to do when you had only a couple.

I was afraid for the longest time that we would have the high-performance kernel for the big machines, and the source code would be separate from the normal kernel. People worked a lot on just making sure that we had a clean code base where you can say at compile time that, hey, I want the kernel that works for 4,000 CPUs, and it generates the code for that, and at the same time, if you say no, I want the kernel that works on 2 CPUs, the same source code compiles.

It was something that in retrospect is really important because it actually made the source code much better. All the effort that SGI and others spent on unifying the source code, actually a lot of it was clean-up – this doesn't work for a hundred CPUs, so we need to clean it up so that it works. And it actually made the kernel more maintainable. Now on the desktop, 8 and 16 CPUs are almost common; it used to be that we had trouble scaling to 8, now it's like child's play.


Link to the original interview

