Technical Notes

BLITZ

Misc. Technical Notes

Harry H. Porter III, Ph.D.

Computer Science Department

Portland State University

Introduction

This document contains a collection of miscellaneous material that I wanted to include but that didn’t logically fit into other documents. This includes comments on unresolved compiler bugs and comments on some design tradeoffs I considered when designing the BLITZ system.

Students do not need to read this document.

Known Compiler Bug – CONTINUE in FOR Statement

Consider the following example. The CONTINUE statement does not work correctly:

for (i=1; i<10; i=i + 1)
  printIntVar ("i", i)
  if i == 5
    continue
  endIf
  printIntVar ("i", i)
endFor

The reason is that the FOR statement is translated to the following WHILE statement by the parser:

i = 1
while i<10
  printIntVar ("i", i)
  if i == 5
    continue
  endIf
  printIntVar ("i", i)
  i = i + 1
endWhile

The CONTINUE causes a jump to the beginning of the WHILE (to the point before the test), thereby skipping the increment. In this example, i stays at 5 once it reaches that value, so the loop never terminates.
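The effect can be reproduced outside KPL. The following Python sketch is only an illustration and is not compiler code: Python's own `continue` inside a `while` loop also jumps back to the test, so a hand-translated loop behaves the same way. The function names and the `max_steps` guard are assumptions added to keep the demonstration finite.

```python
def desugared_for(max_steps=20):
    """Mimic the WHILE translation; `continue` skips the increment."""
    trace = []
    i = 1
    steps = 0
    while i < 10:
        steps += 1
        if steps > max_steps:      # guard: the real loop would never end
            trace.append("stuck")
            break
        trace.append(i)
        if i == 5:
            continue               # jumps to the test; i = i + 1 is skipped
        trace.append(i)
        i = i + 1
    return trace


def intended_for():
    """What the programmer expects: the increment runs even after CONTINUE."""
    trace = []
    i = 1
    while i < 10:
        trace.append(i)
        if i != 5:                 # CONTINUE only skips the rest of the body
            trace.append(i)
        i = i + 1                  # the increment always runs
    return trace
```

Calling `desugared_for()` yields a trace that repeats 5 until the guard trips, whereas `intended_for()` visits every value from 1 through 9.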

Known Compiler Bug – Really Huge Records

In check.c, overflow can occur in laying out records, objects, and parameter lists. For example:

type BIGREC = record
  a1: array [536870910] of int
  a2: array [536870910] of int
  a3: array [536870910] of int
  a4: array [536870910] of int
  a5: array [536870910] of int
endRecord

This is not detected, but should be.

This is a pretty nasty bug, but not something that students will ever encounter.
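To see why the size of BIGREC overflows, the arithmetic can be sketched in Python. This is a rough model under stated assumptions: each int occupies 4 bytes, any per-array header is ignored, and the layout computation is taken to use signed 32-bit arithmetic. The helper `wrap32` is hypothetical and is not taken from check.c.

```python
INT_SIZE = 4            # assumed size of an int, in bytes
FIELDS = 5              # a1 .. a5
ELEMENTS = 536870910    # elements per array, from the example


def wrap32(n):
    """Reduce n to a signed 32-bit value, discarding overflow."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


true_size = FIELDS * ELEMENTS * INT_SIZE   # exact size in bytes
wrapped_size = wrap32(true_size)           # what 32-bit arithmetic would keep
```

Under these assumptions `true_size` is 10737418200 bytes, which exceeds the 32-bit range, while `wrapped_size` is 2147483608, a plausible-looking but wrong value.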

Known Compiler Bug – Cross-Package Inheritance

In rare programs, the compiler produces the following message:

********************************************************************
*****
***** PROGRAM LOGIC ERROR
*****
***** It appears that this compiler contains a software bug.
***** I apologize for the inconvenience it causes you.
*****
***** Error message: "Should find a method for every method proto"
*****
********************************************************************

The problem arises when a class defined in one package inherits a method from a class defined in another package.

This bug is not encountered by students, but if it were, the work-around is simply to move the subclass into the same package as the superclass.

Fixing this bug requires a major amount of work and is still pending.

BLITZ Floating-Point Architecture – Design Tradeoffs

I considered including special instructions in the BLITZ instruction set to load a floating point register from a constant, in analogy with the “sethi” and “setlo” instructions, which load an integer register. For example, to load an integer register, we use code like this:

sethi 0xXXXX,r3

setlo 0xYYYY,r3

The idea was to have four instructions, each of which would load 2 bytes into a floating-point register. Code to load a floating-point register would then look like this:

fset1 0xWWWW,f3

fset2 0xXXXX,f3

fset3 0xYYYY,f3

fset4 0xZZZZ,f3

I ruled such instructions out. Instead, you must use code like this:

sethi 0xXXXX,r1

setlo 0xYYYY,r1

fload [r1],f3

...

.double 123.456

The code involving fset1, fset2, fset3, and fset4 is shorter (16 bytes) than the code sequence involving fload (20 bytes, counting the 8-byte constant). However, I decided to avoid introducing the fset instructions since they would complicate the instruction set. The “fload” instruction and the “.double” pseudo-op were necessary anyway, so this solution was simpler.
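For illustration, the four 16-bit operands that the rejected fset1 through fset4 instructions would have carried can be computed from a double as follows. This Python sketch assumes a big-endian layout, consistent with the byte diagram later in this document; the function names are hypothetical.

```python
import struct


def split_double(value):
    """Return the four 16-bit pieces of a big-endian IEEE double."""
    return list(struct.unpack(">4H", struct.pack(">d", value)))


def join_double(pieces):
    """Reassemble a double from its four 16-bit pieces."""
    return struct.unpack(">d", struct.pack(">4H", *pieces))[0]


pieces = split_double(123.456)
operands = ["0x%04X" % p for p in pieces]   # text form for the four operands
```

Joining the pieces again with `join_double(pieces)` gives back 123.456 exactly, since no bits are lost in the split.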

The most general design philosophy of the BLITZ architecture is that memory is cheap and execution speed is irrelevant, whereas simplicity is of paramount importance.

Several options were also considered with regard to the floating-point compare instruction (“fcmp”) and the conditional branching instructions.

One option (which was not chosen) was to introduce several new condition code bits in the status register, to reflect the outcome of the “fcmp” instruction. This would require a separate set of branch instructions to test these new bits. Thus, we would have both a “fbe” and a “be” instruction to branch “if equal.” This is obviously more complex and was ruled out.

Another option (also ruled out) was to use the same bits in the Status Register for both integer and floating-point comparisons, but to have a separate set of branch instructions, i.e., to have both “fbe” and “be”.

Double-Precision Floating Point Values

Floating-point numbers are represented using the IEEE “Standard for Binary Floating Point Arithmetic” (ANSI / IEEE Std 754-1985). BLITZ supports only double-precision numbers, not single- or quad-precision numbers. We make no claim that the IEEE standard is supported correctly or completely; much of the implementation is simply inherited from the underlying “C” language implementation on which BLITZ is built.

A “double precision” floating-point number is represented with two words (8 bytes):

 byte 1    byte 2    byte 3    byte 4
==== ==== ==== ==== ==== ==== ==== ====
SEEE EEEE EEEE XXXX XXXX XXXX XXXX XXXX

 byte 5    byte 6    byte 7    byte 8
==== ==== ==== ==== ==== ==== ==== ====
XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

Where

S = 1-bit sign bit

EEEE...EEEE = 11-bit exponent field

XXXX...XXXX = 52-bit fraction field

The sign bit is 0=positive, 1=negative.

The fraction field is 52 bits. With decimal and binary numbers, leading zeros are always insignificant and are often omitted. With a decimal number, the leading non-zero digit can be anything between 1 and 9; with binary numbers, the leading non-zero bit will always be a 1. Therefore, with binary numbers, the leading 1 need not be represented; it may be implicit. In double-precision floating-point numbers, the 52 bits of the fraction field give 53 bits of precision. The fractional part (call it F) is thus:

F = 1.XXXX...XXXX

F is then multiplied by some power of 2, as given by the exponent field.

The exponent is 11 bits and can therefore be interpreted as an unsigned integer between 0 and 2047. If the exponent field is between 1 and 2046, then a normal number is being represented; if the exponent field is 0 or 2047, then it is a special case as discussed below.

When the exponent field is between 1 and 2046, you should subtract 1023 to obtain the actual exponent. That is, after subtracting 1023, you get a number which we can call M.

To get the number being represented, multiply F by 2 raised to the power M. Then, adjust the sign according to the S bit.
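The decoding rule for normal numbers can be checked with a short Python sketch. It assumes the host stores doubles in IEEE 754 format (true of standard Python builds); the function names are hypothetical.

```python
import struct


def decode_double(value):
    """Split a double into (S, exponent field, fraction field)."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction


def rebuild_normal(sign, exponent, fraction):
    """Apply the rule for normal numbers (exponent field 1..2046)."""
    assert 1 <= exponent <= 2046
    f = 1 + fraction / 2**52        # F = 1.XXXX...XXXX
    m = exponent - 1023             # M = exponent field - 1023
    magnitude = f * 2.0**m          # F times 2 to the power M
    return -magnitude if sign else magnitude
```

For example, `decode_double(1.0)` returns `(0, 1023, 0)`, and `rebuild_normal(*decode_double(-6.5))` returns -6.5.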

The range of numbers representable with double precision floating-point is:

Smallest Positive Normal Number: 2.2250738585072014E-308

Largest Number: 1.7976931348623157E+308

Precision: about 15 to 17 decimal digits of accuracy

If the exponent field is 2047 (i.e., all 1's), it signals a special case. Examples with an exponent field of all ones are:

0x7FF00000 00000000 = POSITIVE-INFINITY

0xFFF00000 00000000 = NEGATIVE-INFINITY

0xFFFFFFFF FFFFFFFF = NaN (i.e., “Not-a-Number”)

Positive and negative infinity can result from division by zero. Not-a-number indicates that an error has occurred, such as the square root of a negative number. The operations (add, multiply, etc.) are defined on these special case values in fairly logical ways. Any operation on a not-a-number value will yield a not-a-number result.
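The three bit patterns listed above can be inspected with a small Python sketch. It assumes an IEEE 754 host and builds each value from its two 32-bit words; the helper name is hypothetical.

```python
import math
import struct


def double_from_words(high, low):
    """Build a double from its two 32-bit words (first word, second word)."""
    return struct.unpack(">d", struct.pack(">II", high, low))[0]


pos_inf = double_from_words(0x7FF00000, 0x00000000)
neg_inf = double_from_words(0xFFF00000, 0x00000000)
nan = double_from_words(0xFFFFFFFF, 0xFFFFFFFF)

# Operations on NaN keep producing NaN.
still_nan = nan + 1.0
```

The values are built from bit patterns because Python raises ZeroDivisionError for a floating-point division by zero instead of returning an infinity.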

If the exponent is all zeros, it signals a “subnormal” number. These are small numbers, close to zero. They are represented slightly differently than shown in the formula above. These numbers also have a reduced precision. In particular, the implicit leading 1-bit assumed for normal numbers is no longer assumed. For subnormal numbers, we have

F = 0.XXXX...XXXX
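The text above does not give the scale factor for subnormal numbers. IEEE 754 fixes the exponent at -1022 for them, and the following Python sketch relies on that fact; the function names are hypothetical.

```python
import struct


def subnormal_value(fraction):
    """Value of a positive double whose exponent field is all zeros."""
    f = fraction / 2**52            # F = 0.XXXX...XXXX, no implicit 1
    return f * 2.0**-1022           # IEEE 754 fixes the exponent at -1022


def from_bits(bits):
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

With this rule, `subnormal_value(1)` equals `from_bits(1)`, the smallest positive subnormal number, and a fraction field of zero gives 0.0.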

Exceptions During “fload” and “fstore” /

Page Faults and Race Conditions

The BLITZ architecture document says that if a page-readonly or page-invalid exception will occur during an instruction, then the instruction will have absolutely no effect and the exception will be processed as if the instruction had not been attempted.

This is not strictly true.

Consider an instruction (such as load) which reads a word from memory. Assume that the instruction fetch causes no problems, but that an exception occurs during the reading of the target word. For example, assume the target word causes a page-invalid exception. The instruction will be cancelled and the registers will be unchanged. The PC will not have been advanced, so it will appear that the instruction has not been attempted. However, one change to the system state may occur.

Recall that whenever a page is touched (i.e., read from) then its page table entry will have its “referenced” bit set. In this example, the CPU will set the referenced bit to 1, indicating that the page was used. Normally, during an instruction flow, the previous few instructions will be on the same page as the problematic instruction, so the referenced bit will already have been set. In such a case, there would be no change. But it is possible that the problematic instruction lies in a page that has not been previously referenced. (Perhaps the flow of control has just crossed a page boundary; this will happen on average every 2048 instructions, so it is a fairly common occurrence.)

There could be a subtle race problem here, if the operating system relies on the referenced bit. Perhaps the OS logic goes something like this, “Try to begin instruction execution. After an exception, bring in the necessary pages and re-start instruction execution. If memory frames are in short supply, then it is ok to page this process’s pages out; we’ll just get the same fault again later. However, make sure that we are making positive progress on each process. Make sure we executed at least one instruction. If we have not executed any instructions since the last time slice, then do not page this process’s pages out. Instead, take frames from another process until this process has executed at least one instruction. Check the referenced bit to see if this process has made progress.”

Obviously, the problem is in the last sentence, “Check the referenced bit…” This is an unreliable way to determine if a process has made progress. However, checking the PC is not reliable either, since it may happen to be unchanged, due to a looping process.

Another place where page-table problems can arise is the “fload” instruction. This instruction reads two words from memory, in addition to the instruction fetch. It may be that the first word is fetched okay, but an exception occurs on the second word. This can happen if the doubleword being fetched straddles a page boundary. As before, the “referenced” bit will be set for the frame containing the first word of the doubleword. The exception will then occur for the second word of the doubleword. Note that the floating-point register will be completely unaffected; i.e., it will not be “half” loaded with just the first word.

A similar problem occurs when storing into memory with the “fstore” instruction. This instruction will store two words, and will mark the page table entries for these two words “dirty.” Usually, both words of the doubleword will be within the same page so that the entry will either be marked and the operation completed or will be unmarked and an exception will occur. However, if the doubleword happens to straddle a page boundary and there is a page-invalid or page-readonly fault for the second page, the entry for the first page will be marked dirty, even though no word has been written. In other words, the page table entries are both updated before any writing to memory occurs.

Note that fstore does not do an atomic store. In other words, memory is not locked between the write of the first half of the doubleword and the write of the second half of the doubleword. In a multi-processor implementation, it is possible that another process doing a write to the same doubleword will overlap and the value stored will be half of one value and half of the other value, and thus meaningless. It is the compiler’s responsibility to protect all fstore instructions with some sort of concurrency control if there is any possibility of concurrent access by multiple processes.

This would be a truly subtle, obscure, and hard-to-trace bug. It would result in no more than an apparent loss of less-significant bits. A value would appear to be approximately correct, but the final 32 bits would be incorrect, resulting in nothing more than a loss of accuracy.
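A torn store of this kind can be imitated in Python. The sketch below is an illustration only: it assumes the first word comes from one writer and the second word from another (the reverse interleaving is equally possible), and the function name is hypothetical.

```python
import struct


def torn_store(first, second):
    """Combine the first word of one double with the second word of another."""
    a = struct.pack(">d", first)
    b = struct.pack(">d", second)
    return struct.unpack(">d", a[:4] + b[4:])[0]


x = 123.456
y = 789.012
mixed = torn_store(x, y)

# Relative distance between the torn value and the first writer's value.
relative_error = abs(mixed - x) / x
```

Because the sign, the exponent, and the top 20 fraction bits all come from `x`, `mixed` stays within a relative error of 2 to the power -20 of `x`, which is why the damage looks like a loss of accuracy instead of an obviously wrong value.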

Note that care must be taken in the operating system whenever doubleword values are stored and there is a possibility of concurrency.

Note that we may get into somewhat of a race condition in the OS. The fload and fstore instructions may require as many as 3 pages to be in memory at once: (1) the page containing the instruction, (2) the page containing the first half of the doubleword, and (3) the page containing the second half of the doubleword.
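The page count can be worked out with a small Python sketch. The page size of 8192 bytes is an assumption derived from the figure of 2048 instructions per page at 4 bytes per instruction; the function names are hypothetical.

```python
PAGE_SIZE = 8192   # assumed: 2048 instructions of 4 bytes each


def pages_touched(address, length, page_size=PAGE_SIZE):
    """Return the page numbers covered by `length` bytes at `address`."""
    first = address // page_size
    last = (address + length - 1) // page_size
    return list(range(first, last + 1))


def pages_for_fload(pc, operand_address):
    """Pages that must be resident: the instruction page plus data page(s)."""
    needed = pages_touched(pc, 4) + pages_touched(operand_address, 8)
    return sorted(set(needed))
```

For example, a doubleword at address 8188 covers pages 0 and 1, so `pages_touched(8188, 8)` returns `[0, 1]`. An fload on a third page that reads such a doubleword needs all three pages present at once.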

Overflows in Expression Evaluation

During Assembly and Linking

Consider the following instruction:

sub 0x12345678,r3

The assembler can deduce that this value will not fit into 16 bits and will issue a warning. The assembler will use the least significant 16 bits, and will assemble the program as if the following had been coded:

sub 0x00005678,r3

For instructions requiring a 16-bit sign-extended literal value, the assembler will ensure that when sign-extension from 16 to 32 bits occurs, the value will be unchanged. The assembler will issue a warning (not a fatal error) whenever the value is outside of the range

0xffff8000 through 0x00007fff (inclusive)

In decimal, this range is

-32768 through 32767 (inclusive)
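The range test and the truncation described above can be sketched in Python. The function names are hypothetical, and the sketch models only the arithmetic, not the assembler's actual code.

```python
def fits_in_16_bits(value):
    """True if sign-extending the low 16 bits reproduces the value."""
    return -32768 <= value <= 32767


def low_16_bits(value):
    """The bits that are kept when only a warning is issued."""
    return value & 0xFFFF


def sign_extend_16(bits):
    """Extend a 16-bit pattern to a signed integer."""
    return bits - 0x10000 if bits & 0x8000 else bits
```

Here `fits_in_16_bits(0x12345678)` is False and `low_16_bits(0x12345678)` is 0x5678, matching the example. For any value inside the range, `sign_extend_16(low_16_bits(v))` gives back `v` unchanged.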

During expression evaluation, overflow may occur. Consider this instruction:

sub r3,0x80000004+0x80000005,r6

In decimal, these numbers are:

-2147483644 and -2147483643

so this instruction is equivalent to

sub r3,-2147483644 + -2147483643,r6

All computation is performed in 32-bit arithmetic and any overflow is ignored. The exact (unbounded) result of the addition in decimal is:

-4294967287

This number cannot be represented in 32-bit two’s complement. In hex, the result of the addition is

...FFFF00000009

When truncated to 32 bits, the value becomes

0x00000009

which is incorrect. Since this new value can be represented with only 16 bits, no error or warning will be issued. It will be as if the programmer had coded either of the following equivalent forms:

sub r3,0x0009,r6

sub r3,9,r6
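The wraparound in this example can be reproduced with a short Python sketch. Python integers are unbounded, so the 32-bit truncation is applied explicitly; the function names are hypothetical.

```python
def to_signed32(bits):
    """Interpret a 32-bit pattern as a two's complement integer."""
    bits &= 0xFFFFFFFF
    return bits - 0x100000000 if bits & 0x80000000 else bits


def add32(a, b):
    """Add with overflow ignored, as the text describes."""
    return to_signed32(a + b)


a = to_signed32(0x80000004)
b = to_signed32(0x80000005)
exact = a + b            # unbounded arithmetic
wrapped = add32(a, b)    # 32-bit arithmetic, overflow discarded
```

Here `exact` is -4294967287 and `wrapped` is 9, which fits in 16 bits and therefore triggers no warning.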

Next, consider the following instruction, where “myExternalSymbol” is defined in another file:

.import myExternalSymbol

sub r3,myExternalSymbol,r6

The actual value cannot be known by the assembler, so it is impossible at assembly time to determine whether it will fit into 16 bits or not. When the linker determines the actual value, the linker will issue a warning if the value is not in the range:

0xffff8000 through 0x00007fff (inclusive)

This is a warning and not a fatal error. The linker will simply use the least significant 16 bits and proceed with that value. If this warning is ignored, the program will almost certainly malfunction.

There may be expressions in which one value is known by the assembler and the other is not. For example:

.import myExternalSymbol

sub r3,myExternalSymbol + -2147483643,r6

Assume that the actual value of “myExternalSymbol”, given in some other file, is

.export myExternalSymbol

myExternalSymbol = -2147483644

The linker will perform the addition and the result, which will overflow 32 bits, will be truncated to 32 bits. No error or warning will be issued, since the truncated value can be represented in 16 bits.

Now consider this example:

.import myExternalSymbol2

sub r3,myExternalSymbol2 + 0x11110000,r6

The value 0x11110000 exceeds the 16-bit limit, but the assembler will not issue an error or warning since it cannot determine the final value of the expression.

The linker will perform the addition. As before, the linker will use 32-bit arithmetic (ignoring overflow) and will issue a warning if and only if the resulting 32-bit value cannot be represented in 16 bits.
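The link-time rule can be summarized in a Python sketch. It models only the behavior described in this section and is not the linker's code; the function name and the sample symbol values are hypothetical.

```python
def link_operand(symbol_value, addend):
    """Return (16-bit field, warning flag) for symbol + addend."""
    total = (symbol_value + addend) & 0xFFFFFFFF       # overflow ignored
    signed = total - 0x100000000 if total & 0x80000000 else total
    warn = not (-32768 <= signed <= 32767)             # must fit in 16 bits
    return total & 0xFFFF, warn


# The overflowing sum from the earlier example slips through silently.
silent = link_operand(-2147483644, -2147483643)

# With a (hypothetical) symbol value of 0, the addend alone is too large.
flagged = link_operand(0, 0x11110000)
```

Here `silent` is `(9, False)` and `flagged` is `(0, True)`. A symbol value that happens to cancel the large addend, such as `-0x11110000 + 5`, would give `(5, False)`, which is why the assembler cannot decide on its own.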