EPIC vs. The World

Ian D. Romanick
21-August-1998

[This article was originally written in response to the August 19th edition of Ask Grandmaster B. This article can also be found on Voodoo Extreme]

Disclaimer: Anything written in this article is my opinion and mine alone. As such, these opinions may not represent those held by Sequent and should not be considered Sequent policy.

I finally got around to reading your comments on CISC/RISC/EPIC. There are a couple of things that I wanted to clarify about both the relationship of CISC to RISC and RISC to EPIC, but mostly RISC vs. EPIC.

I don't think that your comments about RISC compiler technology are completely accurate. I realize that it was possible to take the output from most compilers on most platforms and re-optimize it. However, this was the case on both RISC and CISC platforms. There were two main design goals behind the RISC paradigm:

Get rid of instructions and addressing modes that compilers don't or can't use.
Create cores that are easier to pipeline (and, later, to make superscalar).

Most CPUs, such as the VAX, x86, and 68k, had lots of instructions and addressing modes that compilers just were not smart enough to use. I remember in 1993 trying to get three different 680x0 compilers to generate code that used either the postincrement or predecrement addressing modes, and none of them would, even for code like that shown in figure 1. In every single case the compiler would generate some variant of figure 2.

	size_t strlen( const char * s )
	{
	    size_t   len;
	    
	    len = 0;
	    while( *(s++) != '\0' )
	        len++;

	    return len;
	}

Figure 1 - C code that should use postincrement

	strlen:
		moveq	#0,d0
		movea.l	4(sp),a0
		bra	.L1
		
	.L0:	addq.l	#1,a0
		addq.l	#1,d0
	.L1:	tst.b	(a0)
		bne	.L0
		
		rts

Figure 2 - Assembly output for strlen function
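
For comparison, here is a hand-written sketch of a loop shaped so that it could map directly onto the postincrement mode. It is not the output of any compiler mentioned above, and the name strlen_postinc is made up for illustration. The assembly in the comment is likewise a hand-written guess at the inner loop, not a listing from a real tool.

```c
#include <stddef.h>

/*
 * Hypothetical, hand-written variant of the strlen in figure 1.
 * The inner loop is shaped so that it could become two 680x0
 * instructions using postincrement:
 *
 *     .L0:  tst.b   (a0)+     ; test the byte, then advance a0
 *           bne     .L0
 *
 * The length is recovered afterwards from the pointer difference,
 * so no separate counter is incremented inside the loop.
 */
size_t strlen_postinc(const char *s)
{
    const char *p = s;

    while (*p++ != '\0')
        ;   /* all of the work happens in the loop condition */

    /* p now points one past the terminating NUL */
    return (size_t)(p - s) - 1;
}
```

The loop in figure 2 spends two add instructions per character; the shape above needs none, which is the sort of saving the addressing mode was designed to offer.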

The designers of RISC CPUs realized that most people coded in some compiled language. They then realized that it did not make any sense to use valuable chip space for instructions and addressing modes that language compilers were not going to use. It's like selling toilets to people who don't have plumbing.

EPIC, on the other hand, tries to address something very different. Superscalar CPUs have been around for quite some time now. There are two big problems in designing a superscalar CPU. The first problem is in deciding which instructions can be executed in parallel. Huge amounts of chip space in both the Alpha 21264 and the PentiumII are dedicated to this task. The second problem is in finding enough instructions that can be executed in parallel. Most C code results in a branch instruction after an average of five sequential instructions, and this makes it impossible to make most code run in parallel. Digital, Intel, and others have spent huge amounts of chip space trying to solve this problem too. These are the problems that EPIC tries to solve.

Instead of having the CPU guess at what instructions can be executed in parallel, EPIC requires that the compiler explicitly tell the CPU which instructions can execute in parallel. You picked up on this in your original response. However, if this were the extent of it, EPIC wouldn't represent any advancement over the other VLIW (very long instruction word) architectures from the mid 1980s.
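
To give a rough source-level feel for what "instructions that can execute in parallel" means, here is a small sketch in C. It is not IA-64 code, and the function names sum_serial and sum_grouped are made up for illustration. It only contrasts a chain of dependent additions with a loop whose additions are independent of one another, which is the kind of independence a compiler would have to find and mark.

```c
#include <stddef.h>

/* Every addition depends on the result of the previous one,
 * so the additions form a single serial chain. */
long sum_serial(const long *v, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += v[i];

    return total;
}

/* Four separate accumulators: within one iteration, the four
 * additions do not depend on each other, so a compiler could
 * in principle mark them as safe to issue together. */
long sum_grouped(const long *v, size_t n)
{
    long t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        t0 += v[i];
        t1 += v[i + 1];
        t2 += v[i + 2];
        t3 += v[i + 3];
    }

    /* Handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        t0 += v[i];

    return t0 + t1 + t2 + t3;
}
```

Both functions return the same sum; the difference is only in how much independent work is exposed per loop iteration.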

EPIC goes beyond existing VLIW architectures by trying to solve the branching problem. EPIC, and IA-64 specifically, does this by allowing the compiler to explicitly specify that both code paths of an if-statement can be executed in parallel. The "Inside IA-64" article in the June 1998 issue of Byte does a pretty good job of explaining how this works.
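
The idea of running both paths of an if-statement can be imitated, very loosely, in plain C. The sketch below is only an analogy: it is not IA-64 code, the function names are made up for illustration, and real predicated hardware discards the unwanted result rather than multiplying by a flag as done here.

```c
/* Branching form: the CPU has to guess which way the test goes
 * before it can keep fetching instructions. */
int abs_diff_branch(int a, int b)
{
    if (a > b)
        return a - b;
    else
        return b - a;
}

/* "If-converted" form: both arms are computed unconditionally,
 * and a predicate (1 or 0) selects which result is kept.
 * There is no branch left to mispredict. */
int abs_diff_predicated(int a, int b)
{
    int p        = (a > b);   /* predicate: 1 if true, 0 if false */
    int then_val = a - b;     /* result of the "then" path */
    int else_val = b - a;     /* result of the "else" path */

    return p * then_val + (1 - p) * else_val;
}
```

The two subtractions in the second function do not depend on each other or on the comparison, so all three operations could be issued together. The cost is that one of the two results is always thrown away.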

All of this sounds pretty good. EPIC will solve all of our CPU performance problems, make our games and databases run faster, feed starving children in Africa, etc., etc. However, I'm a sceptic, and as such I see a number of problems with both EPIC in general and Intel's implementation of EPIC.

EPIC requires that compiler technology adapt to the new architecture. This is a big departure from RISC. With RISC, the CPU architecture adapted to compiler technology.
EPIC expects the compiler to be able to optimize code that it can't see (such as shared libraries).
IA-64 specifically will require that code be recompiled for each new chip that comes out. The expected performance difference is even larger than with current CISC or RISC architectures.
Most of Intel's hype about the performance of Merced came from it being the first chip produced in a 0.18-micron fab. Since the Merced release has been pushed back, several other chips (including the PentiumII) will beat it to 0.18 micron.
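
The second point in the list above, about code the compiler can't see, can be sketched in C. The names transform_fn, apply_all, and twice_example are hypothetical and exist only for this illustration; the function pointer stands in for a routine that lives in a shared library.

```c
#include <stddef.h>

/* Stand-in for a routine in a shared library: when apply_all is
 * compiled, only this pointer type is known, not the body. */
typedef int (*transform_fn)(int);

void apply_all(int *data, size_t n, transform_fn f)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* Opaque call: the compiler cannot tell what f reads or
         * writes, so it cannot prove the iterations independent
         * or schedule instructions across the call. */
        data[i] = f(data[i]);
    }
}

/* A trivial example callee, used here only to exercise apply_all. */
int twice_example(int x)
{
    return 2 * x;
}
```

Calling apply_all(data, n, twice_example) doubles each element. If the body of the callee were visible, the iterations would be obviously independent; behind an opaque call, a compiler that must state parallelism explicitly has to assume the worst.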

In general, I think that EPIC is a good idea, but I don't think it's the answer to CPU performance problems. I also don't think that Merced will give that much of a performance boost, if any at all, over the other processors that will be on the market when it is unveiled.