From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: What will come after the E10k?
Date: 12 Apr 2000 22:42:30 GMT

(Somebody said my name was being invoked, so I checked...)

(1) My label, in 1996, was Director, Systems Technology, in Corporate
R&D reporting to Forest Baskett.  [I.e., a nice "do whatever" job :-)].
I was not a VP (I am now, but not then).

(2) I certainly had nothing to do with the Cray acquisition (I was on
sabbatical at the time, and the first I heard of it was a New York Times
friend calling me at home, before it was announced, wanting to get a comment.
Me: "John, leave me alone, I'm on sabbatical and know nothing.")

(3) Having done a bunch of the Cray competitive analysis for years,
including financials, *I* wouldn't have done the acquisition ... but
I'm not a CEO either.

Not being privy to any of the conversations, I don't know who did what,
but *obviously* Ed McCracken, as CEO, had to have been willing to do this.
Maybe Tom Jermoluk was as well; I have zero inside info [if I did,
I would say so, but not be able to say what I knew, but I don't,
so I can just say I don't know.]

(4) The Cray merger was not well-executed [personal opinion]. The MIPS
acquisition worked very well, at least at first, but this may have made
people over-confident, as a multi-geography acquisition of a company
you've been competing with is much more difficult than acquiring a supplier
located a few miles away, and where there are many existing personal
relationships [for example, I managed the SYS V Port that turned into
both MIPS RISC/os and SGI MIP/IRIX, using a team split between MIPS & SGI;
Mayfield was lead VC for both companies; Hennessy, Baskett, Clark still
knew each other :-); SGI had a fair number of ex-Convergent people I knew, etc.]

(5) However, once a decision has been made, either you get behind it and
try to make it work, or you go somewhere else.
Some people may not be in the chain of command,
but are noisy enough that lots of people do listen to them,
and hence have some responsibility for what they say.
(McCalpin & I both fit all that).

However, none of this is as simple as it looks from outside;
I merely sampled the discussion, but (as is all too often the case),
there were so many irrelevant posts it was hard to find the relevant ones.
I point out some issues that I didn't see getting mentioned.

(a) Recall that SGI did not go looking for Cray; Cray was being shopped
around, i.e., somebody was going to buy Cray.
Now, it happens that most Cray vector systems have had many SGI workstations
around them somewhere.  SGI was expanding into HPC, viewing it as a natural
and critical part of its technical business.

Many people in the industry (some inside Cray, but especially in IBM & SGI)
believed that the vector HPC market would be eroded by parallel microprocessor
systems [which happened].  People believed that whoever bought Cray would
naturally maintain the relationships with Cray customers (which go very
deep), and as those customers shifted to micros, they'd shift to the
Cray buyer's micros, not somebody else's.  Hence, it was believed, inside
SGI (and rightly or wrongly, but all of this came up fairly quickly),
that there was a serious danger to SGI's business if the *wrong* somebody-else
bought Cray and did this.

(People may recall that SGI was still a bit
sensitive over what had happened with SoftImage (which had always produced
software only on SGI, but then got bought by Microsoft in 1994), and
shifted a lot of work to NT.)

(Again, you might say SGI paid too much, you might say it was a bad idea,
you might say there were better strategies,
(and if The Innovator's Dilemma had been published in 1994 instead of
1997, people might have decided differently, or maybe not),
you might say that anyone else might have had acquisition trouble as well,
but there were at least some plausible reasons for doing it.
Monday morning quarterbacks are *always* better than the quarterbacks on
the field who have to decide what to do in the couple seconds before the
300-lb defensive tackles crush them.  It is no accident that lots of
CEOs burn out, get divorced, etc.; they do have to make the calls on
the available information.  One more time: Cray was being shopped, and
*was* going to be sold; this was not SGI running around looking.)


(b) Now, personally, I am *always* scared of computer systems company
mergers, as they must be executed superbly to have any chance, and it
wasn't clear that we had that as a normal competence (say, as compared to
Cisco, which does acquisitions all the time), and I was certainly worried
about the complexity/conflicts of product line being acquired.


(c) On the other hand, there were clear positive things about the
merger, which I said at the time, and still believe:

	(1) We in fact acquired and kept quite a few Cray customers,
	and converted many to our microprocessor systems.  This was good,
	and very much in the line of business.

	(2) We got a large number of *really good* employees, who
	were culturally more compatible with SGI than one might expect.
	We got a midwest R&D base, very useful, since not everyone
	wants to move to the Valley.
	There are great folks in Chippewa Falls & Eagan that I enjoy
	working with, who do great technology, and that I learn from.
	They are working on IRIX, Linux, Origins, Origin followons, and
	are crucial to our efforts.

On the other hand, the acquisition was not executed very well [but that's all
I'll say about that], but the lesson is:
	Any decision is likely to be a mixed bag, but the outcome often
	depends primarily on execution more than whether the original
	idea is any good or not.

(6) Regarding the UE10000, that, too, is more complex than it looks.
Personally, I would have preferred that it just disappear, since SGI/Cray
could never really take advantage of it [hmm, how up-to-date would Solaris
source releases come from Sun to SGI?] and I wasn't close enough to this to know,
but there were issues:
	(a) For sure, a bunch of UE10000 folks would have been laid-off,
	i.e., severance payments, $$. [Recall that McCracken was from HP,
	with HP-style ethics, and the {"we're offering you Oregon folks jobs
	in Silicon Valley" trick to get people to quit} game wouldn't play.]

	(b) Although I never knew the details, I heard there were issues
	of existing CS6400 customers who had contracts that basically
	required the UE10000, or at least were heavily promised them.
	Aside from the dubious ethics of pulling the plug without at least
	attempting to find a buyer that would produce the product,
	there is also the issue that when you do that sort of thing,
	you are probably writing those customers off forever, because
	they get very peeved ... and there were some serious customers.
	[I've seen that happen before at other companies.]

	(c) Now, while Sun has made a lot of money on them, I believe that
	most *10000s have gone into customer applications that SGI would
	not have gotten anyway, due to application emphasis. The *10000s
	just are not much of a factor in HPC [TOP500 list notwithstanding:
	if you look carefully at the list, you'll find that most HPC10000s
	are doing DBMS work, i.e., I think 10000s took market from HP & IBM
	more than from SGI.]

(7) So, in summary:
	There were some good reasons for doing the acquisition,
	there were some dangers. I wouldn't have done it, because I'm
	a chicken when it comes to mergers, but there were enough upside
	possibilities to be worth trying to make it work.  There were
	also non-obvious strategic issues whose importance/reality still
	eludes me, because I doubt if anybody knows what would have happened
	if somebody else had bought Cray, and whether or not this was
	a good idea depends more on that [i.e., it is clear that the whole
	process caused SGI/Cray a lot of trouble, but it is not clear if,
	for example, Cray-as-part-of-{IBM, Sun, or DEC} would have been
	great for SGI, or a disaster.]

	Anyway, my personal opinion is that this decision was driven by
	at most a few people, and I'm not surprised that McCalpin
	can't find those particular people :-)
	That's all, folks....


--
-john mashey EMAIL:  mash@sgi.com  DDD: 650-933-3090 FAX: 650-933-2663
USPS:   SGI 1600 Amphitheatre Pkwy, ms 005, Mountain View, CA 94043-1351


From: mccalpin@gmp246.austin.ibm.com (McCalpin)
Newsgroups: comp.sys.super,comp.arch
Subject: Re: Climate, US, Japan & supers query
Date: 30 Apr 2001 16:15:15 GMT

In article <3AE9EFFA.18727167@execpc.com>,
Paul Foster  <pfoster@execpc.com> wrote:
>McCalpin wrote:
>>
>> In article <0pGF6.3654$AU4.293372@bgtnsc04-news.ops.worldnet.att.net>,
>> Stephen Fuld <s.fuld@worldnet.att.net> wrote:
>> >"McCalpin" <mccalpin@gmp246.austin.ibm.com> wrote in message
>> >news:9c6js0$l3m$1@ausnews.austin.ibm.com...
>> >
>>
>> Cray might have had a much more successful series of T3D/T3E machines
>> if they had recognized the value of large caches, and had worked
>> harder to make the machine cost-effective for smaller processor
>> counts.  That machine alone might have saved them, though it would
>> not have satisfied all their legacy customers.
>>
>
>And if SGI hadn't decided the T3E was competing in their customer base,
>CRAY would have continued the T3 line of computers.

That was not an easy call, especially since the T3E was Cray's
most economically successful computer during the merged years.

Unfortunately, because the T3E had no off-chip caches, it was
regularly beaten by the Origin2000 in benchmarks.  The typical
ratio was that the Origin was twice as fast as the T3E per cpu,
and although the T3E often scaled slightly better, it meant that
you had to be looking at a 512 cpu T3E to significantly outperform
a 256 processor Origin2000, but at about 4 times the price (2x for
the processor count, and 2x for the higher price per cpu of the T3E).
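
As a rough check on that arithmetic, here is a minimal Python sketch.
It assumes ideal linear scaling on both machines (the T3E's slightly
better scaling is deliberately ignored), and the function names and
normalized units are illustrative only, not anything from SGI or Cray.

```python
def relative_throughput(cpus, per_cpu_speed):
    """Aggregate speed in normalized units, assuming ideal linear scaling."""
    return cpus * per_cpu_speed


def relative_price(cpus, price_per_cpu):
    """Total system price in normalized units."""
    return cpus * price_per_cpu


# Normalized so that one Origin2000 cpu has speed 1.0 and price 1.0.
origin_speed = relative_throughput(256, 1.0)
t3e_speed = relative_throughput(512, 0.5)   # Origin is ~2x faster per cpu
origin_price = relative_price(256, 1.0)
t3e_price = relative_price(512, 2.0)        # T3E is ~2x the price per cpu

speed_ratio = t3e_speed / origin_speed      # T3E speed relative to Origin
price_ratio = t3e_price / origin_price      # T3E price relative to Origin
print(speed_ratio, price_ratio)
```

Under these assumptions the 512 cpu T3E only reaches parity with the
256 processor Origin2000; any real advantage has to come from the
T3E's better scaling, while the price ratio stays at about 4x.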

This was not a particularly appealing basis for a business model.
An EV6-based "T3F" could have been a very nice box.

Anyway, the bottom line was that Chippewa Falls was allowed to
continue with one project, and they decided on the SV2.  We
are still waiting to hear how that one turns out....
--
John D. McCalpin, Ph.D.           mccalpin@austin.ibm.com
Senior Scientist           IBM POWER Microprocessor Development
    "I am willing to make mistakes as long as
     someone else is willing to learn from them."


From: mccalpin@gmp246.austin.ibm.com (McCalpin)
Newsgroups: comp.sys.super,comp.arch
Subject: Re: Climate, US, Japan & supers query
Date: 30 Apr 2001 18:07:41 GMT

In article <Pine.SOL.4.33.0104301752110.18962-100000@holyrood.ed.ac.uk>,
Peter Boyle  <pboyle@holyrood.ed.ac.uk> wrote:
>
>On 30 Apr 2001, McCalpin wrote:
>
>> [...]  The typical
>> ratio was that the Origin was twice as fast as the T3E per cpu,
>
>Is the 2x before or after the stream buffers were fixed?

This was mostly true even after the stream buffers were fixed,
though the comparison changes somewhat (faster Origin processors
for comparison).


>I'm teaching my granny to suck eggs here, but 570MB/s vs 390MB/s (from
>your own web page) to the t3e means the 2x isn't universal.

STREAM is not a "real" application.  The T3E has fine main memory
bandwidth per processor, but the large caches on the Origin meant
that for real customer applications lots of the data was coming
from main memory on the T3E and from cache on the Origin.  The
Origin cache bandwidth was slightly more than 2x the T3E memory
bandwidth, which is not far off of what was seen in application
performance.
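
One way to picture this is a simple time-per-byte model, sketched
below in Python.  It is only an illustration: the function name, the
hit fractions, and the normalized bandwidths (T3E memory = 1.0,
Origin cache = 2.0) are assumptions chosen to match the "about 2x"
statement above, not measured values.

```python
def effective_bandwidth(cache_fraction, cache_bw, memory_bw):
    """Average bandwidth when cache_fraction of the bytes come from cache
    and the remainder come from main memory (simple time-per-byte model)."""
    if not 0.0 <= cache_fraction <= 1.0:
        raise ValueError("cache_fraction must be between 0 and 1")
    time_per_byte = cache_fraction / cache_bw + (1.0 - cache_fraction) / memory_bw
    return 1.0 / time_per_byte


# Limiting case described in the text, in normalized units:
# the T3E streams everything from main memory (bandwidth 1.0), while
# the Origin serves everything from cache (bandwidth 2.0).  The unused
# arguments are placeholders and do not affect these two results.
t3e_bw = effective_bandwidth(0.0, cache_bw=1.0, memory_bw=1.0)
origin_bw = effective_bandwidth(1.0, cache_bw=2.0, memory_bw=1.0)
print(origin_bw / t3e_bw)
```

Real applications sit somewhere between these limits, so the ratio
seen in practice depends on how much of the working set actually
fits in the Origin's cache.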

--
John D. McCalpin, Ph.D.           mccalpin@austin.ibm.com
Senior Scientist           IBM POWER Microprocessor Development
    "I am willing to make mistakes as long as
     someone else is willing to learn from them."


From: mccalpin@gmp246.austin.ibm.com (McCalpin)
Newsgroups: comp.arch
Subject: Re: CA-fest, california
Date: Fri, 18 Apr 2003 14:30:34 +0000 (UTC)
Message-ID: <b7p26a$bi2$1@ausnews.austin.ibm.com>

In article <45022fc8.0304171009.a301b3f@posting.google.com>,
Iain McClatchie <iain-3@truecircuits.com> wrote:
> I think Cray also built their stuff from bipolar
>gate arrays, until they tried CMOS, got acquired by SGI, and,
>well, what the heck happened after that?

Cray had just finished the development of three platforms
when they were purchased by SGI in 1996.

.. The T90 was bipolar.
.. The J90 was a CMOS implementation of the Cray Y/MP, but
	with a reduced memory subsystem.
.. The T3E used Alpha EV5 processors with some CMOS support
	chips designed by Cray.

It would take *way* too long to describe all the things that
happened after the acquisition, but the bottom line is that the
J90 was extended several generations into the Cray SV1 (which
appears to have been a very nice, solid performer, though it was
more of a mid-range machine than a supercomputer), while the T90
and T3E had no direct descendants.

To make a long story short, SGI management told the folks
in charge of development at Cray that they could build one new
system (rather than follow-ons to both the T90 and T3E).  The
system that the Cray folks chose to build had the code name of
SV2, and it has recently started shipping as the Cray X1.  It is
a new design that can be considered a hybrid of vector and MPP
architectures, and it is implemented with IBM CMOS technology.


>- About the SPARC based stuff down in San Diego (ex FPS folks):
>  he was really concerned about the people working there.  He
>  didn't want to just put them out of work, and he considered
>  selling them to Sun to be cheaper than paying layoff benefits.
>  Their project ended up being the Sun E10k, I think.

The decision was made before I joined SGI, but I agreed with it
at the time.  The E10000 was so obviously inferior to the
Origin2000 in its technical merits that we did not see it as a
competitive threat.   In fact, it was never a competitive threat
in SGI's core markets of scientific/technical/engineering
computing, but it certainly made Sun a lot of money in the
broader commercial marketplace.
--
John D. McCalpin, Ph.D.           mccalpin@austin.ibm.com
Senior Technical Staff Member     IBM POWER Microprocessor Development
    "I am willing to make mistakes as long as
     someone else is willing to learn from them."

