BLITZ
Misc. Technical Notes
Harry H. Porter
III, Ph.D.
Computer Science
Department
Portland State
University
This document contains a collection of miscellaneous material that I wanted to include but that didnÕt logically fir into other documents. This includes comments on unresolved compiler bugs and comments on some design tradeoffs I considered when designing the BLITZ system.
Students do not need to read this document.
Consider the following example. The CONTINUE statement does not work correctly:
for (i=1; i<10; i=i + 1)
printIntVar ("i", i)
if i == 5
continue
endIf
printIntVar ("i", i)
endFor
The reason is that the FOR statement is translated to the following WHILE statement by the parser:
i = 1
while i<10
printIntVar ("i", i)
if i == 5
continue
endIf
printIntVar ("i", i)
i = i + 1
endWhile
The CONTINUE causes a jump to the beginning of the while (to the point before the test), thereby missing the increment part.
In check.c, overflow can occur in laying out records, objects, and parameter lists. For example:
type BIGREC = record
a1: array [536870910] of int
a2: array [536870910] of int
a3: array [536870910] of int
a4: array [536870910] of int
a5: array [536870910] of int
endRecord
This is not detected, but should be.
This is a pretty nasty bug, but not something that students
will ever encounter.
In rare programs, this message is produced by the compiler.
********************************************************************
*****
***** PROGRAM LOGIC ERROR
*****
***** It appears that this compiler contains
a software bug.
***** I apologize for the inconvenience it
causes you.
*****
***** Error message: "Should find a
method for every method proto"
*****
********************************************************************
The problem arises when a class defined in one package
inherits a method from a class defined in another package.
This bug is not encountered by students, but if it were, the
work-around is simply to move the subclass into the same package as the
superclass.
Fixing this bug requires a major amount of work and is still
pending.
I considered including special instructions in the BLITZ instruction set to load a floating point register from a constant, in analogy with the ÒsethiÓ and ÒsetloÓ instructions, which load an integer register. For example, to load an integer register, we use code like this:
sethi 0xXXXX,r3
setlo 0xYYYY,r3
The idea was to have four instructions, each of which would load 2 bytes into a floating-point register. Code to load a floating-point register would then look like this:
fset1 0xWWWW,f3
fset2 0xXXXX,f3
fset3 0xYYYY,f3
fset4 0xZZZZ,f3
I ruled such instructions out. Instead, you must use code like this:
sethi 0xXXXX,r1
setlo 0xYYYY,r1
fload [r1],f3
...
.double 123.456
The code involving fset1, fset2, fset3, and fset4 is shorter (16 bytes), than the code sequence involving fload (20 bytes). However, I decided to avoid introducing the fset instructions since they complicated the instruction set. The ÒfloadÓ instruction and the Ò.doubleÓ pseudo-op were necessary anyway, so this solution was simpler.
The most general design philosophy of the BLITZ architecture is that memory is cheap and execution speed is irrelevant, whereas simplicity is of paramount importance.
Several options were also considered with regard to the
floating-point compare instruction (ÒfcmpÓ) and the conditional branching
instructions.
One option (which was not chosen) was to introduce several new condition code bits in the status register, to reflect the outcome of the ÒfcmpÓ instruction. This would require a separate set of branch instructions to test these new bits. Thus, we would have both a ÒfbeÓ and a ÒbeÓ instruction to branch Òif equal.Ó This is obviously more complex and was ruled out.
Another option (also ruled out), was to use the same bits in the Status Register for both integer and floating-point comparisons, but to have a separate set of branch instructions, i.e., to have both ÒfbeÓ and ÒbeÓ.
Floating-point numbers are represented using the IEEE ÒStandard for Binary Floating Point ArithmeticÓ (ANSI / IEEE Std 754-1985). BLITZ supports only double-precision numbers, not single- or quad-precision numbers. We make no claim that the IEEE standard is supported correctly or completely; much of the implementation is simply inherited from the underlying ÒCÓ language implementation on which BLITZ is built.
A Òdouble precisionÓ floating-point number is represented with two words (8 bytes):
byte 1 byte 2 byte 3 byte 4
==== ==== ==== ====
==== ==== ==== ====
SEEE EEEE EEEE XXXX
XXXX XXXX XXXX XXXX
byte 5 byte 6 byte 7 byte 8
==== ==== ==== ====
==== ==== ==== ====
XXXX XXXX XXXX XXXX
XXXX XXXX XXXX XXXX
Where
S
= 1-bit sign bit
EEEE...EEEE =
11-bit exponent field
XXXX...XXXX =
52-bit fraction field
The sign bit is 0=positive, 1=negative.
The fraction field is 52 bits. With decimal and binary numbers, leading zeros are always insignificant and are often omitted. With a decimal number, the leading non-zero digit can be anything between 1 and 9; with binary numbers, the leading non-zero bit will always be a 1. Therefore, with binary numbers, the leading 1 need not be represented; it may be implicit. In double-precision floating-point numbers, the 52-bits of the fraction field give 53 bits of accuracy. The fractional part (call it F) is thus:
F = 1.XXXX...XXXX
F is then raised to some power of 2, as given by the exponent field.
The exponent is 11 bits and can therefore be interpreted as an unsigned integer between 0 and 2047. If the exponent field is between 1 and 2046, then a normal number is being represented; if the exponent field is 0 or 2048, then it is a special case as discussed below.
When the exponent field is between 1 and 2046, you should subtract 1023 to obtain the actual exponent. That is, after subtracting 1023, you get a number which we can call M.
To get the number being represented, take F and multiply it by 2, raised to the power M. Then, adjust the sign, according to the S bit.
The range of numbers representable with double precision floating-point is:
Smallest Number: 2.2250738585072014E-308
Largest Number: 1.7976931348623157E+308
Precision: about 17 decimal
digits of accuracy
If the exponent is 2047 (i.e., all 1's), it signals a special case. Examples with exponent equal all ones are:
0x7FF00000 00000000 = POSITIVE-INFINITY
0xFFF00000 00000000 = NEGATIVE-INFINITY
0xFFFFFFFF FFFFFFFF = NaN (i.e., ÒNot-a-NumberÓ)
Positive and negative infinity can result from division by zero. Not-a-number indicates that an error has occurred, such as the square root of a negative number. The operations (add, multiply, etc.) are defined on these special case values in fairly logical ways. Any operation on a not-a-number value will yield a not-a-number result.
If the exponent is all zeros, it signals a ÒsubnormalÓ number. These are small numbers, close to zero. They are represented slightly differently than shown in the formula above. These numbers also have a reduced precision. In particular, the implicit leading 1-bit assumed for normal numbers is no longer assumed. For subnormal numbers, we have
F = 0.XXXX...XXXX
The BLITZ architecture document says that if a page-readonly or page-invalid exception will occur during an instruction, then the instruction will have absolutely no effect and the exception will be processed as if the instruction had no been attempted.
This is not strictly true.
Consider an instruction (such as load) which reads a word from memory. Assume that the instruction fetch causes no problems, but that an exception occurs during the reading of the target word. For example, assume the target word causes a page-invalid exception. The instruction will be cancelled and the registers will be unchanged. The PC will not have been advanced, so it will appear that the instruction has not been attempted. However, one change to the system state may occur.
Recall that whenever a page is touched (i.e., read from) then its page table entry will have its ÒreferencedÓ bit set. In this example, the CPU will set the referenced bit to 1, indicating that the page was used. Normally, during an instruction flow, the previous few instruction will be on the same page as the problematic instruction, so the referenced bit will already have been set. In such case, there would be no change. But it is possible that the problematic instruction lies in a page that has not been previously referenced. (Perhaps the flow of control has just crossed a page boundary; this will happen on average every 2048 instructions, so it is a fairly common occurrence.)
There could be a subtle race problem here, if the operating system relies on the referenced bit. Perhaps the OS logic goes something like this, ÒTry to begin instruction execution. After an exception, bring in the necessary pages and re-start instruction execution. If memory frames are in short supply, then it is ok to page this processÕs pages out; weÕll just get the same fault again later. However, make sure that we are making positive progress on each process. Make sure we executed at least one instruction. If we have not executed any instructions since the last time slice, then do not page this processÕs pages out. Instead, take frames from another process until this process has executed at least one instruction. Check the referenced bit to see if this process has made progress.Ó
Obviously, the problem is in the last sentence, ÒCheck the referenced bitÉÓ This is an unreliable way to determine if a process has made progress. However, checking the PC is not reliable either, since it may happen to be unchanged, due to a looping process.
Another place that we can have page-table problems is with the ÒfloadÓ instruction. This instruction reads two words from memory, in addition to the instruction fetch. It may be that the first word is fetched okay, but an exception occurred on the second word. This could occur if the doubleword being fetched happens to straddle a page boundary. As before, the ÒreferencedÓ bit will be set for the frame containing the first word of the doubleword. The exception will then occur for the second word of the doubleword. Note that the floating-point register will be completely unaffected; i.e., it will not be ÒhalfÓ loaded with just the first word.
A similar problem occurs with storing into memory, with the Òfstore.Ó This instruction will store two words, and will mark the page table entries for these two words Òdirty.Ó Usually, both words of the doubleword will be within the same page so that the entry will either be marked and the operation completed or will be unmarked and an exception will occur. However, if the doubleword happens to straddle a page boundary and there is a page-invalid or page-readonly fault for the second page, the entry for the first page will be marked dirty, even though no word has been written. In other words, the page table entries are both updated before any writing to memory occurs.
Note that fstore does not do an atomic store. In other words, memory is not locked between the write of the first half of the doubleword and the write of the second half of the double word. In a multi-processor implementation, it is possible that another process doing a write to the same doubleword will overlap and the value stored will be half of one value and half of the other value, and thus meaningless. It is the compilerÕs responsibility to protect all fstore instructions with some sort of concurrency control if there is any possibility of concurrent access by multiple processes.
This would be a truly subtle, obscure, and hard-to-trace bug. It would result in no more than an apparent loss of less-significant bits. A value would appear to be approximately correct, but the final 32 bits would be incorrect, resulting in nothing more than a loss of accuracy.
Note that care must be taken in the operating system whenever doubleword values are stored and there is a possibility of concurrency.
Note that we may get into somewhat of a race condition in the OS. The fload and fstore instructions may require as many as 3 pages to be in memory at once: (1) the page containing the instruction, (2) the page containing the first half of the doubleword, and (3) the page containing the second half of the doubleword.
Consider the following instruction:
sub 0x12345678,r3
The assembler can deduce that this value will not fit into 16 bits and will issue a warning. The assembler will use the least significant 16 bits, and will assemble the program as if the following had been coded:
sub 0x00005678,r3
For instructions requiring a 16-bit sign-extended literal value, the assembler will ensure that when sign-extension from 16 to 32 bits occurs, the value will be unchanged. The assembler will issue a warning (not a fatal error) whenever the value is outside of the range
0xffff8000 through 0x00007fff (inclusive)
In decimal, this range is
-32768 through 32767 (inclusive)
During expression evaluation, overflow may occur. Consider this instruction:
sub r3,0x80000004+0x80000005,r6
In decimal, these numbers are:
-2147483644 and -2147483643
so this instruction is equivalent to
sub r3,-2147483644 + -2147483643,r6
All computation is performed 32-bit arithmetic and any overflow is ignored. The result of the addition in decimal is:
-4294967287
This number cannot be represented in 32-bit twoÕs complement. In hex, the result of the addition is
...FFFF00000009
When truncated to 32 bits, the value becomes
0x00000009
which is incorrect. Since this new value can be represented with only 16 bits, no error or warning will be issued. It will be as if the programmer had coded the following:
sub r3,0x0009,r6
or
sub r3,9,r6
Next, consider the following instruction, where ÒmyExternalSymbolÓ is defined in another file:
.import myExternalSymbol
sub r3,myExternalSymbol,r6
The actual value cannot be known by the assembler, so it is impossible at assembly time to determine whether it will fit into 16 bits or not. When the linker determines the actual value, the linker will issue a warning if the value is not in the range:
0xffff8000 through 0x00007fff (inclusive)
This is a warning and not a fatal error. The linker will simply use the least significant 16 bits and proceed with that value. If this warning is ignored, the program will almost certainly malfunction.
There may be expressions in which one value is known by the assembler and the other is not. For example:
.import myExternalSymbol
sub r3,myExternalSymbol + -2147483643,r6
Assume that the actual value of ÒmyExternalSymbolÓ, given in some other file, is
.export myExternalSymbol
myExternalSymbol = -2147483644
The linker will perform the addition and the result, which will overflow 32 bits, will be truncated to 32 bits. No error or warning will be issued, since the truncated value can be represented in 16 bits.
Now consider this example:
.import
myExternalSymbol2
sub r3,myExternalSymbol2 + 0x11110000,r6
The value 0x11110000 exceeds the 16 bit limit but the assembler will not issue and error or warning since it cannot determine the final value of the expression.
The linker will perform the addition. As before, the linker will use 32 bit arithmetic (ignoring overflow) and will issue a warning if and only if the resulting 32 bit value cannot be represented in 16 bits.