|
Turbo Pascal 3.0 Compiler / Code Generation Internals
|
|
The following is documentation I created after reverse engineering the
Turbo Pascal 3.01A compiler. While many features, e.g. units and objects,
have been added, today's compiler is still related to the old code.
Before you flame me about stuff that has been fixed, remember that this
is about an OLD version of the compiler.
NEW: This file will generate disassembled, commented source from YOUR
3.01A compiler -> SCG.ZIP
|
|
Compiler Structure |
|
Compilers usually consist of the following functional groups:
- Lexical analysis
- Syntactical analysis (parser)
- Optimizer
- Code generator
- Symbol table manager
- Error handler
TURBO Pascal isn't your usual compiler... The parser is interspersed with
portions of the code generator, and there is no optimizer. Most compilers
need multiple passes to do their work, but TURBO is a (faster)
single pass compiler.
Fortunately, programming languages of the Pascal type are designed with simple
compilation in one pass in mind. All symbols must be defined before they are
used. The compiler can easily determine the type of a constant without
looking ahead:
- symbol
- $7FFF
- 12345
- 12345.5 - This is an exception. TURBO first verifies whether a numeric
constant is an integer or a real constant. Actually, a standard Pascal
compiler doesn't have to do this - standard Pascal requires that the integer
section of a real constant be a valid integer number !
- 09CFH - This notation (used by many assemblers and languages) is a
negative example. Compilers - like people - read from left to right. To
read a number in this notation, it has to be read into a buffer to recognize
its type, then it can be converted. Guess why TURBO uses $ for hex numbers...
I wonder why Niklaus Wirth chose this notation for Modula-2.
|
|
Lexical Analysis |
|
The task of the lexical analysis is to read the source code from memory or
from an include file, to eliminate comments and to recognize compiler
directives, symbols and keywords.
It is called by the parser. On each call a language element (keyword, symbol,
constant...) is read. The starting position is stored. If an error is
recognized the editor will be started and the cursor will point to this
position. |
|
Parsing Program Structures |
|
The task of this part of the compiler is to analyze the structure of the
program to be compiled and to check the syntax. Like most Pascal compilers,
TURBO uses a recursive descent parser. The code generation is included in
the parser.
The compilation of program structures is quite simple. Usually the syntax is
described in Backus-Naur form or by a "railway diagram". As an example
the IF statement will be covered.
IF cond THEN stat1 { ELSE stat2 }
___________________
____ ______ ______ ___________ / \
->_IF_->_expr_->_THEN_->_statement_-<> -->
\______ ___________/
_ELSE_->_statement__
After reading an element, the parser takes the applicable track. If there
isn't any the syntax is incorrect, and an error is reported.
It is possible to have a parser generated automatically by a so called
compiler-compiler, if the Backus-Naur form of the syntax is given.
Unfortunately this doesn't help very much: The really difficult parts of
a compiler - code generation and optimization - must still be written
manually.
How is the IF statement translated ? (The corresponding section in the
compiler is at offset 6C12). The statement procedure reads an IF and calls the
IF procedure. First the condition - actually an arithmetic expression
of type boolean - is evaluated. This is done by calling the expression
procedure. The expression is read until an illegal symbol (THEN) is found.
This terminates the expression, which is checked for type boolean. The
IF procedure inserts a conditional jump to the end of statement 1 here.
The displacement is inserted later - it is not yet known. If the expression
has been terminated by something else than a THEN, an error is reported.
Now the first statement (stat1) is translated. Actually, this is a RECURSIVE
call of the statement procedure (That's why this is a recursive descent
parser). Please note that the syntax definition is recursive, too !
Because of possible nested IF statements the variables of the IF
procedure are saved on the stack. After this statement, an ELSE
may follow. If it does, a jump to the end of stat2 is emitted
and the jump from the beginning of stat1 is patched, then the second statement
is translated and the second jump patched.
The code produced looks like this:
(IF..THEN) (IF..THEN..ELSE)
cond cond
JNZ l1 JNZ l1
JMP l2 JMP l2
l1: stat 1 l1: stat 1
l2: ... JMP l3
l2: stat 2
l3: ...
The long jump at the beginning isn't always necessary. Unfortunately, the
compiler cannot predict how long the statement will be. To improve this, the
jump would have to be replaced by a short one and the subsequent code moved,
which would complicate the compiler quite a bit.
All other program structures are translated in a similar way.
|
|
Parsing Arithmetic Expressions |
|
The evaluation of expressions is somewhat more complex, as the precedence of
the operations has to be taken into account. The solution in TURBO is, however,
quite simple (code starting at 7A70).
Expressions are usually translated to reverse polish notation (as used on
Hewlett-Packard calculators and in the programming language FORTH).
There are five groups of operations:
- negation (highest precedence)
- NOT
- multiplication, division, ...
- addition, subtraction, ...
- comparisons, IN (lowest precedence)
This translates into the following program structure:
PROCEDURE atom; { element }
BEGIN
CASE op OF
CONST:read constant
VAR :read variable { indexing -> recursive }
'(' :read expression { recursive }
')' must follow
func :read parameters { recursive }
emit function call
TYPE :'(' must follow { type conversion, e.g. Integer(TRUE) }
read expression { recursive }
')' must follow
convert type -> type wanted
ELSE syntax error;
END;
END;
PROCEDURE neg; { negation - }
VAR negflag:BOOLEAN;
BEGIN
negflag:=(op=neg);
atom;
IF negflag THEN emit negation;
END;
PROCEDURE NOT; { NOT }
VAR notflag:BOOLEAN;
BEGIN
notflag:=(op=NOT);
neg;
IF notflag THEN emit NOT;
END;
PROCEDURE mult_level; { multiplication ... }
VAR mult_op:operation;
BEGIN
NOT;
WHILE op IN mult_ops DO BEGIN
save the result;
mult_op:=op;
NOT;
emit operation(mult_op);
END;
END;
PROCEDURE add_level; { addition ... }
VAR add_op:operation;
BEGIN
mult_level;
WHILE op IN add_ops DO BEGIN
save the result;
add_op:=op;
mult_level;
emit operation(add_op);
END;
END;
PROCEDURE expression; { comparisons, IN }
VAR cmp_op:operation;
BEGIN
add_level;
IF op IN cmp_ops THEN BEGIN
save the result;
cmp_op:=op;
add_level;
emit operation(cmp_op);
END;
END;
Example 1: Translation of (a+b)=c -> RPN = a , b + c =
curr. char, stack (active procedure), code produced
---
(:expression add_level mult_level not neg atom
a:... expression add_level mult_level not neg atom
+:... expression add_level -> MOV AX,a
b:... expression add_level mult_level not neg atom
):... expression add_level -> ADD AX,b
):expression add_level mult_level not neg atom
=:expression
c:expression add_level mult_level not neg atom
:expression -> CMP AX,c
Please note:
- The parentheses trigger a recursive call of the expression procedure.
- The code production always lags behind the analysis. This improves the
code produced (e.g.
ADD AX,b).
Example 2: Translation of a+b*c -> RPN = a , b , c * +
curr. char, stack (active procedure), code produced
---
a:expression add_level mult_level not neg atom
+:expression add_level -> MOV AX,a
b:expression add_level mult_level not neg atom
*:expression add_level mult_level -> PUSH AX
-> MOV AX,b
c:expression add_level mult_level not neg atom
:expression add_level mult_level -> IMUL c
:expression add_level -> POP CX
-> ADD AX,CX
Please note:
- The content of
a must be stacked, as the AX register
is needed for the multiplication. This is recognized by setting the
flag push_ax. If subsequent code uses the AX register (destroying
its content), it has to emit PUSH AX. Finally, if this
has happened, the register must be restored by POP CX.
- The code produced is rather simple-minded. By transforming the
expression to b*c+a better code could be produced:
MOV AX,b
IMUL c
ADD AX,a
During evaluation, type checking and type conversion (Integer -> Real...) is
also done.
The 8088 instruction set is often not used well. a:=a+1 yields this code (INC
a would be better):
MOV AX,a
ADD AX,#1
MOV a,AX
Expressions usually account for the bulk of the code produced, so their
translation is very important.
|
|
Optimization |
|
The goal of code optimization is reducing the size and/or execution time of
the code produced. It is usually impossible to find an optimal solution, as a
space-time tradeoff has to be made. TURBO Pascal doesn't have an optimizer.
However, to improve the efficiency of your programs by manual optimizations or
by add-on optimizers, it is good to know how common optimizations work.
Optimizations can be local or global: They can cover a single statement or an
entire program. Global optimization is much more difficult and can cause
problems. GOTO's and function or procedure calls can keep the optimizer from
working efficiently.
Side effects can cause errors that are hard to find. Try it - you'll get what
you deserve... An example:
FUNCTION funny:INTEGER;
BEGIN
side_effect:=side_effect+1;
funny:=5;
END;
...
a:=side_effect+funny+side_effect;
The evaluation sequence and thus the result depends on the compiler used.
Variables don't necessarily stay constant between assignments. Consider this:
wait_int:=FALSE;
REPEAT UNTIL wait_int;
This might wait for an interrupt procedure to set a flag. An optimizing
compiler would convert this to an endless loop... Modern C compilers use
the volatile keyword to avoid this.
Use of Register Variables
Many load and store operations can be eliminated by using register variables.
On the 8088 this is rather difficult, as there are few registers, often with
special uses.
Common Subexpressions
c:=(a+b)*d;
e:=g-(a+b);
The subexpression (a+b) can be used twice. Expressions of the form a[i]:=a[i]+
1 also are a good target for optimizations.
Array Indexing
References with constant indices (a[5]) or indices with a constant offset
(a[i+1]) can be optimized. Array indexing in loops can often
be improved considerably, too.
Constant Folding
Programs can be more readable if constants expressions can be written in a
symbolic form. The compiler can evaluate these expressions at compilation time.
Later versions of the compiler do this.
Strength Reduction
This means replacing operations by "cheaper" equivalents, e.g. x*0.2 instead
of x/5 (multiplications are faster than divisions).
Loop Optimization
FOR i:=1 TO 100 DO dest[i]:=a+b;
The subexpression a+b can be evaluated outside the loop, as a and b don't
change in the loop.
Dead Code Elimination
CONST debug=FALSE;
IF debug THEN writeln('Debug');
The IF statement can be left out - the condition is never met. The same thing
can be done with procedures which are never used. There are optimizers that
eliminate all unused procedures from the run-time library of programs
translated by TURBO Pascal.
Later versions of the compiler do this.
Evaluation of Boolean Expressions
IF (a=5) AND (b=6) THEN ... can be changed into
IF (a=5) THEN
IF (b=6) THEN ...
The same thing can be done with OR and NOT. Never expect boolean expressions
to be executed completely ! Later versions of the compiler do this.
Variable Alignment
Variables in the data segment and on the stack should be aligned to even
offsets to improve performance on 16 bit PC's.
|
|
Code Generation |
|
The code generator has the difficult task of translating the elements
recognized by the parser into executable code. If it gets difficult to tell
whether the code has been generated by a human programmer or by a compiler
then it is indeed a good one... Don't expect too much of this from TURBO.
In the following sections the code produced by TURBO will be explained.
Program
run-time library, if not chain file
CALL initmem ;set segments
W mainflag ;see source code
W turbocs,turbods
W cssize,dssize
w heapsize,maxhpsz
w maxfile,stdinsz,stdoutsz
MOV BP,SP ;stack frame
CALL uncrunch ;expand overlays
W link,*+2
definition part
program part = main program
XOR AX,AX ;Halt
CALL progend
Definition Part
The definition part may contain code, therefore it must be skipped over by:
JMP l1
l1:
Structured Constants
Structured constants are stored in the same format as normal variables.
Overlays
The space needed for overlays is not stored in the COM file. It is freed by
the uncrunch procedure. This means moving up the subsequent code. This is
executed at the beginning of program execution and after loading an overlay
procedure.
CALL rdover ;read overlay file
W $ffff ;overlay procedure now in memory = invalid
B 'over.001' ;name of overlay file
In the section read from the overlay file:
CALL uncrunch ;expand overlay
W link,*+2 ;link for uncrunching
overlay procedure / function
W link,* ;for uncrunching
Forward Definitions
For forward definitions a jump to the final definition is produced. The
displacement is inserted when the real definition is made.
JMP defined_proc
External Procedures
The code read from an external file is not changed.
Procedure Definitions
Local variables of procedures and functions are always stored on the stack.
This means that only active procedures take up space on the stack. This also
enables recursive calls. The transfer of parameters and the allocation of
stack space can be quite complicated, thus slowing down procedure calls.
For every procedure a data structure called stack frame or activation record
is built on the stack. The pointer to the stack frame is always stored in the
BP register (the 8088 can't use the stack pointer SP as index register). The
structure of the stack frame is as follows:
BP+.:function result (space allocated by caller)
BP+.:first parameter
BP+4:last parameter
BP+2:return address
BP+0:pointer to caller's stack frame
BP-2:new stack frame
BP-4:local variables
BP-.:stack top
The code for a standard procedure entry looks like this:
PUSH BP ;save old pointer
MOV BP,SP ;set new pointer
PUSH BP ;save new pointer (for display)
definition part ;constants, local procedures
SUB SP,#size ;allocate space for local variables
;1..2 bytes: DEC SP
program part ;the actual procedure
MOV SP,BP ;forget local variables
POP BP ;restore old pointer
RET prmsize ;return, remove parameters from stack
;no parameters: RET
How function results are passed depends on their type. Scalars (integer...)
are returned in AX, for boolean results the flags are set with OR AX,AX. Reals
are on the stack anyway. Strings must be moved such that they occupy only
their effective length:
MOV DX,#pos_on_stack
MOV CL,#max_len
MOV SP,BP
POP BP
JMP retstr ;the normal end is omitted
Unfortunately, things aren't that simple. Consider nested procedure
definitions:
PROCEDURE level1;
VAR
i:INTEGER;
PROCEDURE level2;
BEGIN
i:=0;
level2;
END;
BEGIN
level2;
END;
The inner procedure level2 uses a local variable of level1, but also calls
itself recursively. The stack offset of i depends on the calling order. TURBO
Pascal uses a so called display to resolve this. The display contains pointers
to the stack frames of calling procedures. Each procedure also adds its own
pointer to the display. The display is an extension of the stack frame.
BP+0:old pointer
BP-2:display outermost procedure
BP-.:display
BP-.:display current procedure
BP-.:local variables
BP-.:stack top
This is maintained by the following code:
PUSH BP ;save old pointer
MOV AX,SP ;set new pointer - keep BP
PUSH [BP-nest*2] ;build display
.. ;once for each nesting level
MOV BP,AX ;set new pointer
PUSH BP ;add own pointer to display
definition part
Newer CPU's (186, 286...) have special commands for these operations (ENTER,
LEAVE). Please note that referencing variables via the display is slower than
normal references. If speed is important don't nest procedure definitions. |
Program Structures
|
|
Program Part
statements
l1: JMP l2 ;jump to the end
POP AX ;GOTO, EXIT: clean up the stack
JMP l1
l2:
GOTO's and EXIT's aren't really that simple. Sometimes stack variables (FOR,
WITH) must be removed, which is done at the end of the procedure.
Statement
If the user interrupt directive is set, an INT 3 is emitted for each statement.
This calls a routine which checks for user interrupts. This feature can be
"misused" to trace a program or to profile its execution time. If this isn't
used anywhere in the program, you can also insert breakpoints as INLINE
statements for debugging with DEBUG.
IF
This has been covered above.
WHILE
l1: condition ;evaluate condition
J.. l2 ;:condition met
JMP l3
l2: statement
JMP l1 ;try again
l3: ;end of loop
REPEAT
l1: statement
condition ;evaluate condition
J.. l2 ;condition met: end
JMP l1 ;not met: repeat
l2:
REPEAT loops are faster than WHILE loops.
FOR
The counter (stored on stack) and the control variable are independent:
assignments to the control variable don't change the number of loop executions.
starting value -> AX
PUSH AX
ending value -> AX
POP CX
XCHG CX,AX
SUB CX,AX ;calculate difference
JGE l1 ;(DOWNTO: JNG)
JMP l3 ;don't execute
l1: INC CX ;(DEC CX)
l2: PUSH CX ;save counter
POP CX ;restore counter
DEC CX ;(INC CX)
JZ l3 ;0: done
INC loop_var ;(DEC) update control variable
JMP l2 ;loop
l3: ;end
CASE
CASE .. OF
2,5 : .. ;
7..9: .. ;
ELSE .. ;
END;
;evaluate selection
CMP AX,#2 ;compare
JZ ok1 ;:yes
CMP AX,#5
JZ ok1 ;:yes
JMP test2 ;try next case
ok1: ;ok - execute
JMP endcase ;jump to end
test2: CMP AX,#7 ;check subrange
JL test2no ;:no
CMP AX,#9
JLE ok2 ;:yes
test2no: JMP else ;no: execute ELSE part
ok2: ;ok - execute
JMP endcase
else: ;ELSE part
endcase:
Complicated CASE statements can exceed the range of short jumps. In this case
so called "hips" are emitted:
JMP hip2 ;skip hip
hip: JMP ok ;jump to statement part
hip2: ;normal continuation
Some compilers also use this technique for IF and other statements.
GOTO
A jump is emitted. If the GOTO leaves a WITH or a FOR block, the stack must be
cleaned up. This is recognized and fixed at the end of the program part.
WITH
The compiler has an internal WITH stack. The pointers for indexed WITH's are
stored on the stack:
set pointer to variable
PUSH ES ;store pointer on stack
PUSH DI
statement
ADD SP,#4 ;remove pointer from stack
If the address is known at compilation time this is not necessary.
Procedure Calls
If the directive K+ is set, a stack check is executed:
MOV CX,#space_needed
CALL xchkstk
Then parameters are evaluated and passed:
- Normal parameter:
evaluate expression
optional range check
PUSH DX ;pointer
PUSH AX ;scalar and pointer
- String:
MOV CL,#max_length;string is extended to maximal length
CALL xstrparm ;-> on stack like a local variable
- Set:
MOV CX,#crunch ;set crunch parameter:
;lo = number of bytes
;hi = number empty bytes at beginning
CALL xsetparm ;adapt set
- Real: already on stack
- Structured variable
set pointer to variable
MOV CX,#size
MOV xblkparm ;copy variable onto stack
- VAR parameters: put pointer on stack
set pointer to variable
PUSH ES
PUSH DI
For overlay procedures this must be inserted:
MOV AX,#length/256
MOV DX,#pos in overlay file / 256
Then the procedure is called by
CALL proc.
Function Call
Stack space is allocated for the result (SUB SP,#space_needed), everything
else is the same. On return the result is on stack (real, string) or in AX
(integer, scalar). It would be easy to return structured variables - I don't
understand why this isn't a standard feature. It would make things like
complex arithmetics much easier.
Calling Standard Procedures and Functions
Standard procedures like Read and Write can have any number of parameters of
any type. This means much flexibility is needed. This problem is solved in
TURBO in a very efficient way: The standard procedure table only contains the
address of the corresponding translation routine. This routine emits the code
for reading the parameters (some of them passed in registers !) and for
calling the run-time library. Some functions (Swap, Hi, Lo) don't call a
procedure but create inline code instead.
Assignments and Expressions
- Scalar / pointer: ES and DI are saved, if necessary. This is only
done if the expression does not consist of a constant or a simple variable.
- Normal variable:
set pointer to variable
PUSH ES ;save pointer to destination variable
PUSH DI
evaluate expression
type conversion
store result in destination variable
- Structured variable:
pointer to second variable
MOV CX,#size ;pointer to destination variable on stack
CALL xmovevar ;copy variable
- Type conversions:
CALL xintreal ;Integer -> Real
MOV AH,AL ;Char -> String: char -> second byte
MOV AL,01 ;length: 1 char
PUSH AX ;push as string
CALL xstrch ;String -> Char
Expressions
The algorithms for translation of expressions were explained in section 4.
Arithmetic operations for scalars are emitted by ecalc. For each operation
there's a parameter block (starting at 973E) controlling the code generation.
expr1 - 5 -> SUB AX,#5
expr1 - var -> SUB AX,var
expr1 - expr2 -> XCHG CX,AX (first result in CX, second in AX)
SUB AX,CX
Expressions of the a:=a+1 type aren't translated well. a:=succ(a) is better,
but not optimal.
Set Expressions
Sets are stored in a compressed form and must be expanded to their full size
(32 bytes) for doing set operations. Because of this set operations tend to be
slow. Set constructors are handled in an inefficient way. [5,var1..var2] is
translated like this:
CALL sldempty ;store empty set on stack
MOV AX,#5 ;expression = 5
CALL setincl ;include element in set
MOV AX,var1 ;first expression subrange
PUSH AX ;save
MOV AX,var2 ;second expression subrange
CALL setinrng ;include subrange in set
If the parameters are variable this is all right. For constant sets this is
disastrous.
IF ch IN ['0'..'9','A'..'Z','a'..'z','_'] THEN ...
The conventional solution. Very slow, as the set is always built when
this is executed. Takes much space for complicated sets.
-
CONST setcn:SET OF char = ['0'..'9','A'..'Z','a'..'z','_'];
IF ch IN setcn THEN ...
This takes up somewhat more space for this example (set constant takes up
32 bytes), but is much faster.
-
CASE ch OF
'0'..'9','A'..'Z','a'..'z','_':...;
END;
For simple cases this gives the shortest and fastest code.
Variable References
MEM / MEMW
expression: segment
PUSH AX
expression: offset
XCHG DI,AX ;pointer -> ES:DI
POP ES
Use ABSOLUTE for variables with a constant address.
WITH Indexing
In a WITH block all variable names must be searched first in the scopes of the
active records and then in the regular symbol table. This can take quite some
time. If the base offset is not known at compilation time (WITH rec[var] DO),
a WITH pointer must be calculated and stored on the stack, otherwise this is
done at compilation time.
Array Indexing, String Indexing
If necessary ES and DI must be saved before evaluation. Different code is
produced depending on the index.
Constant index: The index is checked at compilation time, multiplied by the
element size and added to the base offset, no code is emitted.
Variable index with range checking:
SUB AX,#lower_bound (max be DEC AX / nothing)
MOV CX,#upper_bound+1
CALL xindchk ;check index
Variable index without range checking: The subtraction of the lower bound can
be omitted, it is multiplied by the element size and then subtracted from the
base offset.
The index is multiplied by the element size. This is optimized for some
important element sizes:
no code ;size = 1
SHL AX,1 ;size = 2
SHL AX,1 ;size = 4
SHL AX,1
SHL AX,1 ;size = 6
MOV CX,AX
SHL AX,1
ADD AX,CX
MOV CX,#size ;other element sizes
MUL CX
The index is then stored in DI:
XCHG DI,AX
/pre>
or added to the existing index:
ADD DI,AX
Record Indexing
This is very simple: The offset of the record variable is added to the memory
offset of the variable.
Pointer indexing
Pointers are loaded with LES DI,pointer_var.
Use of Addressing Modes
The procedure einstr emits a command using the correct
addressing mode. If necessary a segment prefix (CS: or ES:) is inserted.
- not indexed, not on stack:
MOV AX,var
- indexed:
MOV AX,[DI] ;no offset
MOV AX,[DI]offs8 ;short offset (-128..127)
MOV AX,[DI]offs16 ;long offset (0..65535)
- stack, local variables:
MOV AX,[BP]offs ;not indexed
MOV AX,[BP+DI]offs;indexed
- stack, callers local variables:
MOV BX,[BP]-lev*2 ;read display pointer
SS: ;indexed by BX - need prefix
MOV AX,[BX]offs ;not indexed
MOV AX,[BX+DI]offs;or indexed
Calculate Pointer to Variable
indexing
The offset is read into DI:
MOV DI,#offset ;not indexed
ADD DI,#offset ;indexed: short or long offset
LEA DI,[BP]offs ;stack: load effective address
The segment is handed over on the stack:
PUSH CS/DS/ES/SS
Read Variable
- Scalar:
MOV AX,var ;integer, not indexed, not on stack
MOVB AL,var ;byte, not indexed, not on stack
MOV AX,.. ;other integer (emitted by einstr)
MOVB AL,.. ;other byte
For byte variables XOR AH,AH is inserted - TURBO always uses integer variables
internally.
- Pointer:
LES AX,ptr_var ;AX = offset
MOV DX,ES ;DX = segment
- Real:
set pointer to variable
CALL xldreal
- Set:
set pointer to variable
MOV CX,#set_crunch
CALL xldset
- String:
set pointer to variable
CALL strload
Store Variable
- Scalar: If R+ is set, range checking is done:
MOV CX,#lower_bound
MOV DX,#upper_bound
CALL xrngchk
If the variable is neither indexed nor on the stack, there's a short form
again:
MOV var,AX ;integer
MOVB var,AL ;byte
Otherwise the correct MOV is emitted by einstr.
- Pointer:
MOV dest,AX ;offset
MOV dest+2,DX ;segment
- Real:
set pointer to variable
CALL xstoreal
- String:
set pointer to variable
MOV CL,#max_length
CALL strstore
- Set:
set pointer to variable
MOV CX,#set_crunch
CALL setsto
|
|
Symbol Table |
|
The symbol table manager has to insert and search symbols of any type. The
difficulty is that definitions may be of any complexity and names may have any
length. The structure implemented in TURBO closely represents the definition
and reference sequence, making it easy to "navigate" through complex types.
Symbols are searched beginning with the most recent definition. For new
definitions, the current block (limited by the "fence" set at the beginning of
a procedure definition) and the keyword table are searched for duplicate
definitions. Thus it is possible to override old definitions. At the end of a
procedure local variables may be removed from the symbol table. This reduces
memory usage and search time.
Symbol Table Entry Structure
The symbol table "symtab" is stored in the stack segment (the actual stack
doesn't need much space) and grows down. The entries have this basic structure:
off :next entry
--
off-2:tag word = entry type
off-4:name length
off-5:name
off-.:entry
0:offset to next entry
The symbol table is always searched backwards, looking at the most recent
entries first. A linear search is used. Sample symbol table entries can be
found in the compiler tables (starting from 9277).
tag = 0100: Label
- 1:procedure nesting (to prevent jumps into or out of procedures)
- 2:0=ok, FF=not yet defined
- 4:offset
tag = 0200: Constant
- 1:type of constant
- 3:constant - strings are stored backwards
Structured constants are initialized variables stored in the code segment and
are entered as such in the symbol table.
tag = 0300: Type
- 2:pointer to type definition
tag = 0400: Variable
For subvariables of a record the low byte of the tag word is the number of the
record definition this entry belongs to.
- 2:pointer to type definition
- 4:offset
- 5:0=normal, FF=indirect (VAR)
- 6:segment
FF = DS
FE = CS
FD = ES (via pointer)
.. = SS, the number corresponds to the procedure nesting level
For ABSOLUTE variables of the form $B800:0 a pointer is stored in the code
segment, the symbol table entry id CS indirect.
tag = 0500: Procedure
tag = 0600: Function
- 2:function only - pointer to result type
- 4:function only - offset on stack
- 5:function only - 0 = normal, FF = indirect
- 6:function only - segment: SS
- 8:code offset of procedure / function
- A:position in overlay file / of forward jump
- C:0=ok, FF=forward definition
- E:length of overlay procedure
-10:number of parameter lists
- 2:type pointer |may be repeated
- 3:0=normal, FF=VAR parameter |
- 4:number of parameters of this type |
- 5:names |
The parameters are also listed as local variables.
Structure of Subentries
The normal entries aren't sufficient for the description of complex types.
Unnamed subentries are used for this. For complex definitions this gives a
tree structure. ARRAY[1..15] OF INTEGER is an array having the subrange 1..15
as index type and INTEGER as element type.
tag = 0000, 0800: Subentries
- 2:component size
- 4:lower bound / pointer to type definition
- 6:upper bound / pointer to index type definition
- 7:flag / record number
- 8:component type
1 = array
2 = record
3 = set
4 = pointer
5 = typed file (FILE OF)
6 = text file (TEXT)
7 = untyped file (FILE)
8 = string
9 = real
A = integer
B = boolean
C = char
. = enumeration types (numbered)
The compiler often uses this register assignment:
CL = component type
CH = where's the result ?
0 = constant
1 = variable
2 = in AX / on stack
3 = flags set (jz/jnz)
4 = comparison, use branch opcode brnchop
The use of the subentry fields depends on the component type:
Array
- 2:component size
- 4:type pointer
- 6:pointer to index type
- 8:01
Record
- 2:component size
- 7:record number
- 8:02
Set
- 2:component size
- 4:pointer to index type
- 8:03
Pointer
- 2:component size = 4
- 6:pointer to type / type name
- 7:0=ok, FF=not yet resolved
- 8:04
Pointers can be forward defined. In this case the type name is stored as an
invisible entry. The type pointer is then inserted later.
String
- 2:length + 1
- 8:08
Text file with nonstandard buffer size
- 2:component size
- 8:06
Enumeration type, subrange
- 2:size: 1 or 2 bytes
- 4:lower bound
- 6:upper bound
- 8:number of enumeration type
The elements of an enumeration type are stored as constants.
Symbol Table Search
TURBO uses a linear search to find symbols in the symbol table. This can be
slow. A worst case program can get compilation speed down to 0.022 lines
per second...
There is a better way: Hashing. This is used in later versions of the
compiler.
|
|
Error Handler |
|
Many compilers try to continue compilation after an error has been found. The
difficulty about this is that this shouldn't trigger an avalanche of
meaningless error messages.
TURBO always stops if an error is found. This can also be seen as an
advantage: The programmer is forced to solve problems one at a time.
|
|
Run-time Library |
|
In the run-time library all standard procedures and functions needed for the
execution of programs are stored. TURBO does this in a rather wasteful (but
simple) way: it always inserts all procedures. Later versions of the
compiler only include the procedures that are actually used.
Memory Map
The segments are allocated as follows:
--
stack (SS) (grows down)
--
free
--
heap (grows up)
--
variables (DS)
--
code (CS)
--
Heap
Heap storage is allocated in blocks having a size that is a multiple of 8
bytes. The free blocks are kept track of with this structure:
+0: pointer to next free block -> linked list
+4: length of free block
+8: free block
Hpstrt points to the first free block. The last block in the list - the free
space between stack and heap - is marked by a 0. If there isn't enough space
between heap and stack an error is reported.
Floating Point Arithmetics
Floating point numbers are divided into two parts: The exponent gives the
order of magnitude, the mantissa gives the accuracy needed.
number = mantissa * 2 ^ exponent
The mantissa is a binary fraction with an accuracy of 40 bits. The mantissa
always represents a number between 0.5 and 1, i.e. it is normalized. This
means the most significant bit is always 1. Here the sign is stored. The
examples use decimal numbers for simplicity.
A floating point addition works as follows:
0.95 E+00
+ 0.60 E-01
0.95 E+00 The number with the smaller exponent is
+ 0.06 E+00 adjusted (de-normalized) to match the other.
= 1.01 E+00 The numbers can now be added as usual.
The result is too big, it must be normalized
= 0.10 E+01 and rounded up.
Please note that floating point additions and subtractions can cause large
round off errors, if the exponents differ too much.
It is often claimed that BCD arithmetics cause less errors. This is not
correct - they are not as visible. Some calculators actually do cosmetic
rounding, if the result is near to an integer number. The best way to avoid
round off errors is to calculate money amounts in cents and not in dollars.
BCD multiplications and divisions are slow. However, BCD does have one
advantage: Conversions to and from ASCII are much faster. As business
applications usually consist mainly of additions and subtractions, BCD can
actually be faster.
Floating point multiplication
0.13 E+00 The exponents are added (division: subtracted)
* 0.49 E+03 and the mantissas are multiplied.
0.13 E+00
* 0.49 E+00
* E+03
= 0.0637 The additional digits are cut off and the
* E+03 result is normalized and rounded up.
= 0.64 E+02
Square roots are evaluated using Newton's approximation. This is a good
solution, but not the best: The 8087 does square roots faster than divisions !
Transcendental functions are evaluated using the standard polynomials found in
math books. I have seen better algorithms to do this. Don't expect much speed
and precision from TURBO transcendentals. |
|
Appendix |
|
Bugs |
|
Thanks to the relative simplicity of the algorithms used TURBO Pascal is
almost bug-free. Well, almost.
Set as Procedure Parameter
If the last element is in the range 248..255 and the first above 7, this
doesn't work. Redefine the set or pass it as a VAR parameter.
SizeOf
Sometimes a redundant load is emitted.
UpCase
The argument type is not checked. Try UpCase(15).
WHILE
DO can be omitted. |
|
Compiler Speed |
|
TURBO may be faster than most other compilers, but there's still a wide margin
for improvements:
- symbol table: use hashing
- include files: use larger buffer
- don't copy source lines into another buffer
The editor could be much faster. The screen is re-displayed in an "intelligent"
way (using DelLine and InsLine). On a terminal this may be faster - with
memory mapped video this is a nuisance. Search and replace is slow. |
|
Write Faster Programs Using TURBO Pascal |
|
Avoid the standard string functions.
- Use ord(st[0]) instead of length(st)
- Use move instead of string assignments:
move(st1,st2,max_len+1);
Write real constants with decimal point (10.0 instead of 10). This eliminates
conversions.
GOTO statement considered harmful. So what... If a GOTO is the best way to
express a program structure - use it ! TURBO GOTO's are still better than
BASIC GOTO's: names can be used instead of numbers.
I/O can be improved if larger buffers are used (use multiples of 512 bytes
for
best results). Use BlockRead and BlockWrite. If large blocks are read or
written the MS-DOS and TURBO overhead doesn't matter as much as for small
blocks. |
|