Users and programmers are under the illusion that they communicate with the computer through higher-level languages or assembly languages. In either case, the communication is through symbols. The target system does not understand these symbols; it interprets only bits, the binary symbols 0 and 1. It can, however, interpret strings of bits of various lengths, and these strings result from mapping symbols. This section discusses instructions, their representation, encoding and decoding, and the length of instructions and data.
· Motivation
· Definitions
· Instruction Encoding and Decoding
· Sample Assembler Source Program
· Listing of Sample Assembler Program
· Exercises
· Literature References
Motivation
· The target understands (interprets) solely bits
· Humans express themselves in symbols, not bits
· Communicating through bit strings is too tedious, time-consuming, sickening
· Abstract symbols must be encoded as bit strings
· Bit strings must be decoded by the target hardware
· After decoding the bit strings, the target can execute them
Definitions
Assembler: Software tool that maps abstract assembler text into binary code. Some addresses may remain unresolved in a multi-module program or on a relocatable target.
Binary code: String of bits which, when interpreted according to conventions, results in executable programs and/or corresponding data.
CISC: Complex Instruction Set Computer. An architecture whose instructions vary in length, depending on the number and type of operands. Generally, such an architecture must decode the string of bits constituting an instruction in discrete steps; the result of one partial decode defines the next decoding (and interpretation) step. This generally extends the time perceived as instruction execution time.
Decoding: Process of breaking strings of bits into substrings according to position and value. These substrings can then be interpreted.
Encoding: Process of mapping symbols into strings of bits according to rules and conventions.
Linker: Software tool that combines 1 or more assembler outputs into one binary code object. Typically, a linker resolves external names, includes libraries that were assumed to exist, and passes the last information to the loader regarding address locations that were not resolvable at link time.
Opcode: Bits in an instruction that identify the meaning and purpose of the instruction and of its operands.
RISC: Reduced Instruction Set Computer. An architecture in which the length of instructions is uniform, typically 32 bits. The time to execute any one instruction is generally one time unit. Most operations are interpreted by hardware, not microcode. Instruction length does not need to be decoded; it is known a priori.
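The decoding definition above (breaking a bit string into substrings by position and value) can be sketched in a few lines of Python. The example assumes the x86 ModR/M byte layout (2 bits mod, 3 bits reg, 3 bits rm) and covers only the handful of operations that share the opcode byte F7; the function and table names are made up for this illustration.

```python
# Register numbers as used in the 3-bit rm field for 16-bit operands.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

# With opcode byte F7, the 3-bit reg field selects the operation (subset only).
F7_OPERATIONS = {3: "neg", 4: "mul", 5: "imul", 6: "div", 7: "idiv"}


def split_modrm(byte):
    """Break one byte into the substrings mod (bits 7-6), reg (5-3), rm (2-0)."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


def decode_f7(modrm):
    """Interpret the byte that follows opcode F7; register operands only."""
    mod, reg, rm = split_modrm(modrm)
    if mod != 0b11:
        raise ValueError("this sketch handles register operands only")
    return f"{F7_OPERATIONS[reg]} {REG16[rm]}"


print(decode_f7(0xE1))  # byte pair F7 E1
print(decode_f7(0xF8))  # byte pair F7 F8
```

The position of a substring decides which field it is; its value decides what the field means. That is the two-part rule stated in the definition.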
Instruction Encoding and Decoding
· Assembler and compiler encode instructions and (some) data
· Hardware (the processor's decoding unit) and disassembler decode binary code
· After the decode, the hardware can execute instructions
· Detail of decoding:
  · Find the first instruction of the complete program
  · Determine the length; easy in a RISC architecture; requires interpretation on a CISC architecture
  · Find the next instruction; equivalent to finding the next value of the instruction pointer
  · At each step: fetch sufficient bytes (in one or multiple steps) to decode the complete instruction with all of its operands
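The decoding steps above can be sketched as a loop that finds the length of each instruction and advances the instruction pointer. This is a minimal sketch, not a disassembler: the byte string is the body of the arith procedure of the sample program, transcribed by hand under the assumption that 16-bit operands sit in memory low byte first, and `instruction_length` knows only the few x86 opcodes that occur there.

```python
# Bytes of the sample procedure 'arith' as they would sit in memory.
CODE = bytes.fromhex(
    "B86400" "B96400" "A10400" "8B0E0400"
    "F7E1" "F7E9" "F7D9" "F7F0" "F7F8" "C3"
)


def instruction_length(code, ip):
    """Return the length in bytes of the instruction that starts at ip."""
    opcode = code[ip]
    if 0xB8 <= opcode <= 0xBF:   # mov r16, imm16: opcode + 2 literal bytes
        return 3
    if opcode == 0xA1:           # mov ax, [addr16]: opcode + 2 address bytes
        return 3
    if opcode in (0x8B, 0xF7):   # opcode + ModR/M byte, maybe a displacement
        modrm = code[ip + 1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111
        if opcode == 0xF7 and reg < 2:
            raise NotImplementedError("F7 /0 and /1 are not covered here")
        if mod == 0b11:          # register operand: nothing follows
            return 2
        if mod == 0b00 and rm == 0b110:  # direct 16-bit address follows
            return 4
        raise NotImplementedError("other addressing modes are not covered")
    if opcode == 0xC3:           # ret
        return 1
    raise NotImplementedError(f"opcode {opcode:02X} is not covered")


def instruction_starts(code, ip=0):
    """Walk the code; each step yields the next instruction pointer value."""
    starts = []
    while ip < len(code):
        starts.append(ip)
        ip += instruction_length(code, ip)
    return starts


print([f"{ip:04X}" for ip in instruction_starts(CODE)])
```

Note that the length of the 8B and F7 instructions is known only after the second byte has been inspected. This is the partial, stepwise decode that a variable-length (CISC) encoding requires.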
· Fixed-field encoding reserves a predetermined number and location of bits for the opcode
· For example, a CISC architecture could always dedicate the first (leftmost, lowest-addressed) byte of the 1 or more instruction bytes to the opcode
· Huffman encoding reserves an increasing number of bits for instructions with decreasing frequency of execution
· Thus, Huffman encoding allows the most used (at run time) instruction to be identified by a single bit; all other instructions may then not use that bit value, i.e. they must have this bit set to 0
· Other opcodes in Huffman encoding use further bits for encoding
· Length decoding is simple on a RISC architecture; the length is known a priori
· Opcode identification on RISC is still possible either way, using fixed-field or Huffman encoding
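To make the Huffman idea concrete, the following sketch builds a prefix-free opcode code from execution frequencies with the textbook Huffman construction. The frequency table is purely hypothetical (no real instruction mix is implied), and `huffman_codes` is a name chosen for this illustration.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Map each symbol to a bit string; frequent symbols get shorter strings."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tiebreak), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)    # lightest subtree gets prefix 0
        w1, _, high = heapq.heappop(heap)   # next lightest gets prefix 1
        merged = {sym: "0" + code for sym, code in low.items()}
        merged.update({sym: "1" + code for sym, code in high.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical run-time frequencies in percent; not measured data.
FREQ = {"mov": 55, "add": 15, "cmp": 12, "jmp": 10, "mul": 5, "div": 3}
CODES = huffman_codes(FREQ)

for name in sorted(CODES, key=lambda n: len(CODES[n])):
    print(f"{name:4} {CODES[name]}")
```

With this particular table the most frequent opcode ends up with a one-bit code and every other code starts with the opposite bit value, which is the situation described in the bullets above. A less skewed table would give no opcode a one-bit code.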
Sample Assembler Source Program
· The assembler source below is run through masm
· masm generates an optional listing
· The listing shows the way data are encoded, with initial values, if any
· The listing shows the addresses of data
· It also shows how instructions are encoded
· And it shows the addresses of instructions
· The addresses of two adjacent instructions allow computation of the length of the prior instruction
· The program below defines several simple integer data objects
· Some integers (dw) are initialized
· Others are intentionally left uninitialized (via ?)
· Uninitialized scalars are still set to 0 by masm; since any value is allowed, 0 is permitted too
· Arrays are defined using the dup pseudo-op
· Arrays defined to be left uninitialized are not preset by the Microsoft assembler masm
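The data encoding described above can be modelled in a few lines. The sketch below assumes the data definitions of the sample program (transcribed by hand), treats every dw item as one 16-bit word, and uses `None` as a stand-in for `?`; the helper names `word`, `dup` and `layout` are invented for this illustration and are not masm features.

```python
def word(value):
    """Encode an integer as a 16-bit two's-complement word."""
    return value & 0xFFFF


def dup(count, *items):
    """Expand count dup( items ); an item may itself be a nested dup list."""
    flat = []
    for item in items:
        flat.extend(item if isinstance(item, list) else [item])
    return flat * count


# None stands for '?', i.e. no initial value requested.
DATA = [
    ("d_first", [word(+0)]),
    ("w0", [word(+109)]),
    ("w1", [None]),
    ("a1", dup(1, None)),
    ("w2", [word(-109)]),
    ("a2", dup(5, 0xFFFF)),
    ("a3", dup(5, dup(3, 0xDEAD))),
    ("a4", dup(5, 0xBEEF, 0xDEED, 0xBABE)),
]


def layout(data):
    """Assign each object the next free offset; every word takes 2 bytes."""
    offsets, offset = {}, 0
    for name, words in data:
        offsets[name] = offset
        offset += 2 * len(words)
    return offsets, offset


OFFSETS, SIZE = layout(DATA)
for name, offset in OFFSETS.items():
    print(f"{offset:04X} {name}")
print(f"{SIZE:04X} total bytes")
```

The offsets and the total size computed this way can be compared with the address column and the _DATA segment length in the masm listing.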
; Source file: arith.asm
; Author: Herb Mayer
; Date: 2-8-1997
; Purpose: arithmetic operations and listing

start   macro                   ; no parameters
        mov     ax, @data       ; @data predefined macro
        mov     ds, ax          ; now data segment reg set
        endm                    ; end macro start

Done    macro   ret_code        ; if all o.k.: ret_code will be 0
        mov     ah, 4ch         ; DOS routine to terminate
        mov     al, ret_code    ; we wanna terminate, ah + al
        int     21h             ; terminate finally
        endm                    ; end macro Done

        .model  small           ; assumes stack data code
        .stack  100h            ; assumes name: stack
        .data                   ; assumes name: data
d_first dw      +0
w0      dw      +109
w1      dw      ?
a1      dw      1 dup( ? )
w2      dw      -109
a2      dw      5 dup( 0ffffh )
a3      dw      5 dup( 3 dup( 0deadh ) )
a4      dw      5 dup( 0beefh, 0deedh, 0babeh )

        .code                   ; assumes name: code
arith   proc                    ; include a few arithmetic ops
        mov     ax, 100         ; literal into ax
        mov     cx, 100         ; literal into non ax
        mov     ax, w1          ; reloc into ax
        mov     cx, w1          ; reloc into non ax
        mul     cx
        imul    cx
        neg     cx
        div     ax
        idiv    ax
        ret                     ; back to caller
arith   endp

main    proc
        start
        call    arith
        Done    0
main    endp
        end     main            ; start here!
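As a small illustration of what the assembler does with the first two instructions of arith, the sketch below encodes `mov reg, literal` for 16-bit registers. It assumes the x86 rule that this instruction form uses opcode B8 plus the register number, followed by the 16-bit literal stored low byte first; the function name is made up for the example.

```python
# Register numbers of the 16-bit general registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode_mov_reg_imm16(reg, value):
    """Encode 'mov reg, value' as opcode B8+reg followed by a 16-bit literal."""
    opcode = 0xB8 + REG16[reg]
    return bytes([opcode]) + (value & 0xFFFF).to_bytes(2, "little")


for reg in ("ax", "cx"):
    encoded = encode_mov_reg_imm16(reg, 100)
    print(f"mov {reg}, 100 ->", encoded.hex(" ").upper())
```

A masm listing prints the literal as one word (0064), while in memory the low byte 64 precedes the high byte 00. The register name has disappeared into the opcode byte, which is why the two instructions differ in exactly one bit.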
Listing of Sample Assembler Program

                            ; Source file: arith.asm
                            ; Author: Herb Mayer
                            ; Date: 2-8-1997
                            ; Purpose: arithmetic operations and listing
                            start   macro                   ; no parameters
                                    mov     ax, @data       ; @data predefined macro
                                    mov     ds, ax          ; now data segment reg set
                                    endm                    ; end macro start
                            Done    macro   ret_code        ; if all o.k.: ret_code will be 0
                                    mov     ah, 4ch         ; DOS routine to terminate
                                    mov     al, ret_code    ; we wanna terminate, ah + al
                                    int     21h             ; terminate finally
                                    endm                    ; end macro Done
                                    .model  small           ; assumes stack data code
                                    .stack  100h            ; assumes name: stack
                                    .data                   ; assumes name: data
 0000  0000                 d_first dw      +0
 0002  006D                 w0      dw      +109
 0004  0000                 w1      dw      ?
 0006  0001[                a1      dw      1 dup( ? )
            ???? ]
 0008  FF93                 w2      dw      -109
 000A  0005[                a2      dw      5 dup( 0ffffh )
            FFFF ]
 0014  0005[                a3      dw      5 dup( 3 dup( 0deadh ) )
            0003[
                DEAD ] ]
 0032  0005[                a4      dw      5 dup( 0beefh, 0deedh, 0babeh )
            BEEF
            DEED
            BABE ]
                                    .code                   ; assumes name: code
 0000                       arith   proc                    ; include a few arithmetic ops
 0000  B8 0064                      mov     ax, 100         ; literal into ax
 0003  B9 0064                      mov     cx, 100         ; literal into non ax
 0006  A1 0004 R                    mov     ax, w1          ; reloc into ax
 0009  8B 0E 0004 R                 mov     cx, w1          ; reloc into non ax
 000D  F7 E1                        mul     cx
 000F  F7 E9                        imul    cx
 0011  F7 D9                        neg     cx
 0013  F7 F0                        div     ax
 0015  F7 F8                        idiv    ax
 0017  C3                           ret                     ; back to caller
 0018                       arith   endp
 0018                       main    proc
                                    start
 0018  B8 ---- R         1          mov     ax, @data       ; @data predefined macro
 001B  8E D8             1          mov     ds, ax          ; now data segment reg set
 001D  E8 0000 R                    call    arith
                                    Done    0
 0020  B4 4C             1          mov     ah, 4ch         ; DOS routine to terminate
 0022  B0 00             1          mov     al, 0           ; we wanna terminate, ah + al
 0024  CD 21             1          int     21h             ; terminate finally
 0026                       main    endp
                                    end     main            ; start here!
Macros:

        N a m e                   Lines

DONE . . . . . . . . . . . . . .  3
START  . . . . . . . . . . . . .  2
Segments and Groups:

        N a m e                   Length  Align   Combine  Class

DGROUP . . . . . . . . . . . . .  GROUP
  _DATA  . . . . . . . . . . . .  0050    WORD    PUBLIC   'DATA'
  STACK  . . . . . . . . . . . .  0100    PARA    STACK    'STACK'
_TEXT  . . . . . . . . . . . . .  0026    WORD    PUBLIC   'CODE'
Symbols:

        N a m e                   Type     Value   Attr

A1 . . . . . . . . . . . . . . .  L WORD   0006    _DATA
A2 . . . . . . . . . . . . . . .  L WORD   000A    _DATA   Length = 0005
A3 . . . . . . . . . . . . . . .  L WORD   0014    _DATA   Length = 0005
A4 . . . . . . . . . . . . . . .  L WORD   0032    _DATA   Length = 0005
ARITH  . . . . . . . . . . . . .  N PROC   0000    _TEXT   Length = 0018
D_FIRST  . . . . . . . . . . . .  L WORD   0000    _DATA
MAIN . . . . . . . . . . . . . .  N PROC   0018    _TEXT   Length = 000E
W0 . . . . . . . . . . . . . . .  L WORD   0002    _DATA
W1 . . . . . . . . . . . . . . .  L WORD   0004    _DATA
W2 . . . . . . . . . . . . . . .  L WORD   0008    _DATA
@CODE  . . . . . . . . . . . . .  TEXT     _TEXT
@CODESIZE  . . . . . . . . . . .  TEXT     0
@CPU . . . . . . . . . . . . . .  TEXT     0101h
@DATASIZE  . . . . . . . . . . .  TEXT     0
@FILENAME  . . . . . . . . . . .  TEXT     arith
@VERSION . . . . . . . . . . . .  TEXT     510
53 Source Lines
58 Total Lines
29 Symbols
47700 + 404728 Bytes symbol space free
0 Warning Errors
0 Severe Errors
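As noted earlier, the addresses of two adjacent instructions allow computation of the length of the prior instruction. The sketch below does this for the arith procedure, using the address column of the listing (copied by hand) and the address of `arith endp` as the end marker; the names `ARITH` and `lengths` are chosen for this illustration.

```python
# (address, instruction) pairs taken from the address column of the listing.
ARITH = [
    (0x0000, "mov ax, 100"),
    (0x0003, "mov cx, 100"),
    (0x0006, "mov ax, w1"),
    (0x0009, "mov cx, w1"),
    (0x000D, "mul cx"),
    (0x000F, "imul cx"),
    (0x0011, "neg cx"),
    (0x0013, "div ax"),
    (0x0015, "idiv ax"),
    (0x0017, "ret"),
]
ARITH_END = 0x0018  # address shown next to 'arith endp'


def lengths(entries, end):
    """Length of each instruction = next address minus its own address."""
    addresses = [address for address, _ in entries] + [end]
    return [
        (text, following - address)
        for (address, text), following in zip(entries, addresses[1:])
    ]


for text, size in lengths(ARITH, ARITH_END):
    print(f"{size} byte(s)  {text}")
```

The lengths range from 1 to 4 bytes within this one short procedure, and their sum equals the procedure length reported in the symbol table.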