PC Engines Home
  ALIXAPUCompactFlash AdaptersTest ToolsAtticInfoShopSupport

Turbo Pascal 3.0 Compiler / Code Generation Internals

The following is documentation I created after reverse engineering the Turbo Pascal 3.01A compiler. While many features, e.g. units and objects, have been added, today's compiler is still related to the old code.

Before you flame me about stuff that has been fixed, remember that this is about an OLD version of the compiler.

NEW: This file will generate disassembled, commented source from YOUR 3.01A compiler -> SCG.ZIP

Compiler Structure

Compilers usually consist of the following functional groups:

  • Lexical analysis
  • Syntactical analysis (parser)
  • Optimizer
  • Code generator
  • Symbol table manager
  • Error handler

TURBO Pascal isn't your usual compiler... The parser is interspersed with portions of the code generator, and there is no optimizer. Most compilers need multiple passes to do their work, but TURBO is a (faster) single pass compiler.

Fortunately, programming languages of the Pascal type are designed with simple compilation in one pass in mind. All symbols must be defined before they are used. The compiler can easily determine the type of a constant without looking ahead:

  • symbol
  • $7FFF
  • 12345
  • 12345.5 - This is an exception. TURBO first verifies whether a numeric constant is an integer or a real constant. Actually, a standard Pascal compiler doesn't have to do this - standard Pascal requires that the integer section of a real constant be a valid integer number !
  • 09CFH - This notation (used by many assemblers and languages) is a negative example. Compilers - like people - read from left to right. To read a number in this notation, it has to be read into a buffer to recognize its type, then it can be converted. Guess why TURBO uses $ for hex numbers... I wonder why Niklaus Wirth chose this notation for Modula-2.

Lexical Analysis

The task of the lexical analysis is to read the source code from memory or from an include file, to eliminate comments and to recognize compiler directives, symbols and keywords.

It is called by the parser. On each call a language element (keyword, symbol, constant...) is read. The starting position is stored. If an error is recognized the editor will be started and the cursor will point to this position.

Parsing Program Structures

The task of this part of the compiler is to analyze the structure of the program to be compiled and to check the syntax. Like most Pascal compilers, TURBO uses a recursive descent parser. The code generation is included in the parser.

The compilation of program structures is quite simple. Usually the syntax is described in Backus-Naur form or by a "railway diagram". As an example the IF statement will be covered.

	 IF cond THEN stat1 { ELSE stat2 }
	                                        ___________________
	    ____  ______  ______  ___________  /                   \
	  ->_IF_->_expr_->_THEN_->_statement_-<>                    -->
	                                       \______  ___________/
	                                        _ELSE_->_statement__
	

After reading an element, the parser takes the applicable track. If there isn't any the syntax is incorrect, and an error is reported.

It is possible to have a parser generated automatically by a so called compiler-compiler, if the Backus-Naur form of the syntax is given. Unfortunately this doesn't help very much: The really difficult parts of a compiler - code generation and optimization - must still be written manually.

How is the IF statement translated ? (The corresponding section in the compiler is at offset 6C12). The statement procedure reads an IF and calls the IF procedure. First the condition - actually an arithmetic expression of type boolean - is evaluated. This is done by calling the expression procedure. The expression is read until an illegal symbol (THEN) is found. This terminates the expression, which is checked for type boolean. The IF procedure inserts a conditional jump to the end of statement 1 here. The displacement is inserted later - it is not yet known. If the expression has been terminated by something else than a THEN, an error is reported. Now the first statement (stat1) is translated. Actually, this is a RECURSIVE call of the statement procedure (That's why this is a recursive descent parser). Please note that the syntax definition is recursive, too ! Because of possible nested IF statements the variables of the IF procedure are saved on the stack. After this statement, an ELSE may follow. If it does, a jump to the end of stat2 is emitted and the jump from the beginning of stat1 is patched, then the second statement is translated and the second jump patched.

The code produced looks like this:

	    (IF..THEN)              (IF..THEN..ELSE)

	        cond                    cond
	        JNZ l1                  JNZ l1
	        JMP l2                  JMP l2
	    l1: stat 1              l1: stat 1
	    l2: ...                     JMP l3
	                            l2: stat 2
	                            l3: ...
	
The long jump at the beginning isn't always necessary. Unfortunately, the compiler cannot predict how long the statement will be. To improve this, the jump would have to be replaced by a short one and the subsequent code moved, which would complicate the compiler quite a bit.

All other program structures are translated in a similar way.

Parsing Arithmetic Expressions

The evaluation of expressions is somewhat more complex, as the precedence of the operations has to be taken into account. The solution in TURBO is, however, quite simple (code starting at 7A70).

Expressions are usually translated to reverse polish notation (as used on Hewlett-Packard calculators and in the programming language FORTH).

There are five groups of operations:

  • negation (highest precedence)
  • NOT
  • multiplication, division, ...
  • addition, subtraction, ...
  • comparisons, IN (lowest precedence)

This translates into the following program structure:

	  PROCEDURE atom; { element }
	  BEGIN
	    CASE op OF
	      CONST:read constant
	      VAR  :read variable          { indexing -> recursive }
	      '('  :read expression        { recursive             }
	            ')' must follow
	      func :read parameters        { recursive             }
	            emit function call
	      TYPE :'(' must follow        { type conversion, e.g. Integer(TRUE) }
	            read expression        { recursive             }
	            ')' must follow
	            convert type -> type wanted
	      ELSE syntax error;
	    END;
	  END;

	  PROCEDURE neg; { negation - }
	  VAR negflag:BOOLEAN;
	  BEGIN
	    negflag:=(op=neg);
	    atom;
	    IF negflag THEN emit negation;
	  END;

	  PROCEDURE NOT; { NOT }
	  VAR notflag:BOOLEAN;
	  BEGIN
	    notflag:=(op=NOT);
	    neg;
	    IF notflag THEN emit NOT;
	  END;

	  PROCEDURE mult_level; { multiplication ... }
	  VAR mult_op:operation;
	  BEGIN
	    NOT;
	    WHILE op IN mult_ops DO BEGIN
	      save the result;
	      mult_op:=op;
	      NOT;
	      emit operation(mult_op);
	    END;
	  END;

	  PROCEDURE add_level; { addition ... }
	  VAR add_op:operation;
	  BEGIN
	    mult_level;
	    WHILE op IN add_ops DO BEGIN
	      save the result;
	      add_op:=op;
	      mult_level;
	      emit operation(add_op);
	    END;
	  END;

	  PROCEDURE expression; { comparisons, IN }
	  VAR cmp_op:operation;
	  BEGIN
	    add_level;
	    IF op IN cmp_ops THEN BEGIN
	      save the result;
	      cmp_op:=op;
	      add_level;
	      emit operation(cmp_op);
	    END;
	  END;
	

Example 1: Translation of (a+b)=c -> RPN = a , b + c =

	curr. char, stack (active procedure),              code produced
	---
	(:expression add_level mult_level not neg atom
	a:... expression add_level mult_level not neg atom
	+:... expression add_level                         -> MOV  AX,a
	b:... expression add_level mult_level not neg atom
	):... expression add_level                         -> ADD  AX,b
	):expression add_level mult_level not neg atom
	=:expression
	c:expression add_level mult_level not neg atom
	 :expression                                       -> CMP  AX,c
	

Please note:

  • The parentheses trigger a recursive call of the expression procedure.
  • The code production always lags behind the analysis. This improves the code produced (e.g. ADD AX,b).

Example 2: Translation of a+b*c -> RPN = a , b , c * +

	curr. char, stack (active procedure),              code produced
	---
	a:expression add_level mult_level not neg atom
	+:expression add_level                             -> MOV  AX,a
	b:expression add_level mult_level not neg atom
	*:expression add_level mult_level                  -> PUSH AX
	                                                   -> MOV  AX,b
	c:expression add_level mult_level not neg atom
	 :expression add_level mult_level                  -> IMUL c
	 :expression add_level                             -> POP  CX
	                                                   -> ADD  AX,CX
	

Please note:

  • The content of a must be stacked, as the AX register is needed for the multiplication. This is recognized by setting the flag push_ax. If subsequent code uses the AX register (destroying its content), it has to emit PUSH AX. Finally, if this has happened, the register must be restored by POP CX.
  • The code produced is rather simple-minded. By transforming the expression to b*c+a better code could be produced:
    	  MOV  AX,b
    	  IMUL c
    	  ADD  AX,a
    	
During evaluation, type checking and type conversion (Integer -> Real...) is also done.

The 8088 instruction set is often not used well. a:=a+1 yields this code (INC a would be better):

	  MOV  AX,a
	  ADD  AX,#1
	  MOV  a,AX
	
Expressions usually account for the bulk of the code produced, so their translation is very important.

Optimization

The goal of code optimization is reducing the size and/or execution time of the code produced. It is usually impossible to find an optimal solution, as a space-time tradeoff has to be made. TURBO Pascal doesn't have an optimizer. However, to improve the efficiency of your programs by manual optimizations or by add-on optimizers, it is good to know how common optimizations work.

Optimizations can be local or global: They can cover a single statement or an entire program. Global optimization is much more difficult and can cause problems. GOTO's and function or procedure calls can keep the optimizer from working efficiently.

Side effects can cause errors that are hard to find. Try it - you'll get what you deserve... An example:

	  FUNCTION funny:INTEGER;
	  BEGIN
	    side_effect:=side_effect+1;
	    funny:=5;
	  END;

	  ...

	  a:=side_effect+funny+side_effect;
	
The evaluation sequence and thus the result depends on the compiler used.

Variables don't necessarily stay constant between assignments. Consider this:

	  wait_int:=FALSE;
	  REPEAT UNTIL wait_int;
	
This might wait for an interrupt procedure to set a flag. An optimizing compiler would convert this to an endless loop... Modern C compilers use the volatile keyword to avoid this.

Use of Register Variables

Many load and store operations can be eliminated by using register variables. On the 8088 this is rather difficult, as there are few registers, often with special uses.

Common Subexpressions

	  c:=(a+b)*d;
	  e:=g-(a+b);
	
The subexpression (a+b) can be used twice. Expressions of the form a[i]:=a[i]+ 1 also are a good target for optimizations.

Array Indexing

References with constant indices (a[5]) or indices with a constant offset (a[i+1]) can be optimized. Array indexing in loops can often be improved considerably, too.

Constant Folding

Programs can be more readable if constants expressions can be written in a symbolic form. The compiler can evaluate these expressions at compilation time. Later versions of the compiler do this.

Strength Reduction

This means replacing operations by "cheaper" equivalents, e.g. x*0.2 instead of x/5 (multiplications are faster than divisions).

Loop Optimization

	  FOR i:=1 TO 100 DO dest[i]:=a+b;
	
The subexpression a+b can be evaluated outside the loop, as a and b don't change in the loop.

Dead Code Elimination

	  CONST debug=FALSE;

	  IF debug THEN writeln('Debug');
	
The IF statement can be left out - the condition is never met. The same thing can be done with procedures which are never used. There are optimizers that eliminate all unused procedures from the run-time library of programs translated by TURBO Pascal. Later versions of the compiler do this.

Evaluation of Boolean Expressions

	  IF (a=5) AND (b=6) THEN ... can be changed into

	  IF (a=5) THEN
	    IF (b=6) THEN ...
	
The same thing can be done with OR and NOT. Never expect boolean expressions to be executed completely ! Later versions of the compiler do this.

Variable Alignment

Variables in the data segment and on the stack should be aligned to even offsets to improve performance on 16 bit PC's.

Code Generation

The code generator has the difficult task of translating the elements recognized by the parser into executable code. If it gets difficult to tell whether the code has been generated by a human programmer or by a compiler then it is indeed a good one... Don't expect too much of this from TURBO.

In the following sections the code produced by TURBO will be explained.

Program

	  run-time library, if not chain file
	  CALL  initmem       ;set segments
	  W     mainflag      ;see source code
	  W     turbocs,turbods
	  W     cssize,dssize
	  w     heapsize,maxhpsz
	  w     maxfile,stdinsz,stdoutsz
	  MOV   BP,SP         ;stack frame
	  CALL  uncrunch      ;expand overlays
	  W     link,*+2
	  definition part
	  program part = main program
	  XOR   AX,AX         ;Halt
	  CALL  progend
	

Definition Part

The definition part may contain code, therefore it must be skipped over by:

	  JMP   l1
	  
	  l1:

Structured Constants

Structured constants are stored in the same format as normal variables.

Overlays

The space needed for overlays is not stored in the COM file. It is freed by the uncrunch procedure. This means moving up the subsequent code. This is executed at the beginning of program execution and after loading an overlay procedure.

	  CALL  rdover        ;read overlay file
	  W     $ffff         ;overlay procedure now in memory = invalid
	  B     'over.001'    ;name of overlay file

	    In the section read from the overlay file:

	    CALL  uncrunch      ;expand overlay
	    W     link,*+2      ;link for uncrunching
	    overlay procedure / function

	  W     link,*        ;for uncrunching
	

Forward Definitions

For forward definitions a jump to the final definition is produced. The displacement is inserted when the real definition is made.

	  JMP   defined_proc
	

External Procedures

The code read from an external file is not changed.

Procedure Definitions

Local variables of procedures and functions are always stored on the stack. This means that only active procedures take up space on the stack. This also enables recursive calls. The transfer of parameters and the allocation of stack space can be quite complicated, thus slowing down procedure calls.

For every procedure a data structure called stack frame or activation record is built on the stack. The pointer to the stack frame is always stored in the BP register (the 8088 can't use the stack pointer SP as index register). The structure of the stack frame is as follows:

	  BP+.:function result (space allocated by caller)
	  BP+.:first parameter
	  BP+4:last parameter
	  BP+2:return address
	  BP+0:pointer to caller's stack frame
	  BP-2:new stack frame
	  BP-4:local variables
	  BP-.:stack top
	
The code for a standard procedure entry looks like this:
	  PUSH  BP            ;save old pointer
	  MOV   BP,SP         ;set new pointer
	  PUSH  BP            ;save new pointer (for display)
	  definition part     ;constants, local procedures
	  SUB   SP,#size      ;allocate space for local variables
	                      ;1..2 bytes: DEC SP
	  program part        ;the actual procedure
	  MOV   SP,BP         ;forget local variables
	  POP   BP            ;restore old pointer
	  RET   prmsize       ;return, remove parameters from stack
	                      ;no parameters: RET
	
How function results are passed depends on their type. Scalars (integer...) are returned in AX, for boolean results the flags are set with OR AX,AX. Reals are on the stack anyway. Strings must be moved such that they occupy only their effective length:
	  MOV   DX,#pos_on_stack
	  MOV   CL,#max_len
	  MOV   SP,BP
	  POP   BP
	  JMP   retstr        ;the normal end is omitted
	
Unfortunately, things aren't that simple. Consider nested procedure definitions:
	  PROCEDURE level1;
	  VAR
	    i:INTEGER;

	    PROCEDURE level2;
	    BEGIN
	      i:=0;
	      level2;
	    END;

	  BEGIN
	    level2;
	  END;
	
The inner procedure level2 uses a local variable of level1, but also calls itself recursively. The stack offset of i depends on the calling order. TURBO Pascal uses a so called display to resolve this. The display contains pointers to the stack frames of calling procedures. Each procedure also adds its own pointer to the display. The display is an extension of the stack frame.
	  BP+0:old pointer
	  BP-2:display outermost procedure
	  BP-.:display
	  BP-.:display current procedure
	  BP-.:local variables
	  BP-.:stack top
	
This is maintained by the following code:
	  PUSH  BP            ;save old pointer
	  MOV   AX,SP         ;set new pointer - keep BP
	  PUSH  [BP-nest*2]   ;build display
	   ..                 ;once for each nesting level
	  MOV   BP,AX         ;set new pointer
	  PUSH  BP            ;add own pointer to display
	  definition part
	
Newer CPU's (186, 286...) have special commands for these operations (ENTER, LEAVE). Please note that referencing variables via the display is slower than normal references. If speed is important don't nest procedure definitions.

Program Structures

Program Part

	  statements
	  l1: JMP l2          ;jump to the end
	  POP   AX            ;GOTO, EXIT: clean up the stack
	  JMP   l1
	  l2:
	
GOTO's and EXIT's aren't really that simple. Sometimes stack variables (FOR, WITH) must be removed, which is done at the end of the procedure.

Statement

If the user interrupt directive is set, an INT 3 is emitted for each statement. This calls a routine which checks for user interrupts. This feature can be "misused" to trace a program or to profile its execution time. If this isn't used anywhere in the program, you can also insert breakpoints as INLINE statements for debugging with DEBUG.

IF

This has been covered above.

WHILE

	  l1: condition       ;evaluate condition
	  J..   l2            ;:condition met
	  JMP   l3
	  l2: statement
	  JMP   l1            ;try again
	  l3:                 ;end of loop
	

REPEAT

	  l1: statement
	  condition           ;evaluate condition
	  J..   l2            ;condition met: end
	  JMP   l1            ;not met: repeat
	  l2:
	
REPEAT loops are faster than WHILE loops.

FOR

The counter (stored on stack) and the control variable are independent: assignments to the control variable don't change the number of loop executions.

	  starting value -> AX
	  PUSH  AX
	  ending value -> AX
	  POP   CX
	  XCHG  CX,AX
	  SUB   CX,AX         ;calculate difference
	  JGE   l1            ;(DOWNTO: JNG)
	  JMP   l3            ;don't execute
	  l1: INC CX          ;(DEC CX)
	  
	  l2: PUSH CX         ;save counter
	  
	  POP   CX            ;restore counter
	  DEC   CX            ;(INC CX)
	  JZ    l3            ;0: done
	  INC   loop_var      ;(DEC) update control variable
	  JMP   l2            ;loop
	  l3:                 ;end
	

CASE

	  CASE .. OF
	    2,5 : .. ;
	    7..9: .. ;
	    ELSE  .. ;
	  END;

	   ;evaluate selection
	  CMP   AX,#2         ;compare
	  JZ    ok1           ;:yes
	  CMP   AX,#5
	  JZ    ok1           ;:yes
	  JMP   test2         ;try next case
	  ok1:     ;ok - execute
	  JMP   endcase       ;jump to end
	  test2: CMP AX,#7    ;check subrange
	  JL    test2no       ;:no
	  CMP   AX,#9
	  JLE   ok2           ;:yes
	  test2no: JMP else   ;no: execute ELSE part
	  ok2:     ;ok - execute
	  JMP   endcase
	  else:    ;ELSE part
	  endcase:
	
Complicated CASE statements can exceed the range of short jumps. In this case so called "hips" are emitted:
	  JMP   hip2          ;skip hip
	  hip: JMP ok         ;jump to statement part
	  hip2:               ;normal continuation
	
Some compilers also use this technique for IF and other statements.

GOTO

A jump is emitted. If the GOTO leaves a WITH or a FOR block, the stack must be cleaned up. This is recognized and fixed at the end of the program part.

WITH

The compiler has an internal WITH stack. The pointers for indexed WITH's are stored on the stack:

	  set pointer to variable
	  PUSH ES             ;store pointer on stack
	  PUSH DI
	  statement
	  ADD   SP,#4         ;remove pointer from stack
	
If the address is known at compilation time this is not necessary.

Procedure Calls

If the directive K+ is set, a stack check is executed:

	  MOV   CX,#space_needed
	  CALL  xchkstk
	
Then parameters are evaluated and passed:
  • Normal parameter:
    	evaluate expression
    	optional range check
    	PUSH  DX            ;pointer
    	PUSH  AX            ;scalar and pointer
  • String:
    	MOV   CL,#max_length;string is extended to maximal length
    	CALL  xstrparm      ;-> on stack like a local variable
  • Set:
    	MOV   CX,#crunch    ;set crunch parameter:
    	                    ;lo = number of bytes
    	                    ;hi = number empty bytes at beginning
    	CALL  xsetparm      ;adapt set
  • Real: already on stack
  • Structured variable
    	set pointer to variable
    	MOV   CX,#size
    	MOV   xblkparm      ;copy variable onto stack
  • VAR parameters: put pointer on stack
    	set pointer to variable
    	PUSH  ES
    	PUSH  DI
    	

For overlay procedures this must be inserted:

	  MOV   AX,#length/256
	  MOV   DX,#pos in overlay file / 256
	
Then the procedure is called by
	  CALL  proc.
	

Function Call

Stack space is allocated for the result (SUB SP,#space_needed), everything else is the same. On return the result is on stack (real, string) or in AX (integer, scalar). It would be easy to return structured variables - I don't understand why this isn't a standard feature. It would make things like complex arithmetics much easier.

Calling Standard Procedures and Functions

Standard procedures like Read and Write can have any number of parameters of any type. This means much flexibility is needed. This problem is solved in TURBO in a very efficient way: The standard procedure table only contains the address of the corresponding translation routine. This routine emits the code for reading the parameters (some of them passed in registers !) and for calling the run-time library. Some functions (Swap, Hi, Lo) don't call a procedure but create inline code instead.

Assignments and Expressions

  • Scalar / pointer: ES and DI are saved, if necessary. This is only done if the expression does not consist of a constant or a simple variable.
  • Normal variable:
    	  set pointer to variable
    	  PUSH  ES            ;save pointer to destination variable
    	  PUSH  DI
    	  evaluate expression
    	  type conversion
    	  store result in destination variable
    	
  • Structured variable:
    	  pointer to second variable
    	  MOV   CX,#size      ;pointer to destination variable on stack
    	  CALL  xmovevar      ;copy variable
    	
  • Type conversions:
    	  CALL  xintreal      ;Integer -> Real
    
    	  MOV   AH,AL         ;Char -> String: char -> second byte
    	  MOV   AL,01         ;length: 1 char
    	  PUSH  AX            ;push as string
    
    	  CALL  xstrch        ;String -> Char
    	

Expressions

The algorithms for translation of expressions were explained in section 4. Arithmetic operations for scalars are emitted by ecalc. For each operation there's a parameter block (starting at 973E) controlling the code generation.

	  expr1 - 5     -> SUB  AX,#5
	  expr1 - var   -> SUB  AX,var
	  expr1 - expr2 -> XCHG CX,AX (first result in CX, second in AX)
	                   SUB  AX,CX
	
Expressions of the a:=a+1 type aren't translated well. a:=succ(a) is better, but not optimal.

Set Expressions

Sets are stored in a compressed form and must be expanded to their full size (32 bytes) for doing set operations. Because of this set operations tend to be slow. Set constructors are handled in an inefficient way. [5,var1..var2] is translated like this:

	  CALL  sldempty      ;store empty set on stack
	  MOV   AX,#5         ;expression = 5
	  CALL  setincl       ;include element in set
	  MOV   AX,var1       ;first expression subrange
	  PUSH  AX            ;save
	  MOV   AX,var2       ;second expression subrange
	  CALL  setinrng      ;include subrange in set
	
If the parameters are variable this is all right. For constant sets this is disastrous.
  • IF ch IN ['0'..'9','A'..'Z','a'..'z','_'] THEN ...

    The conventional solution. Very slow, as the set is always built when this is executed. Takes much space for complicated sets.

  • CONST setcn:SET OF char = ['0'..'9','A'..'Z','a'..'z','_'];
    	IF ch IN setcn THEN ...
    	
    This takes up somewhat more space for this example (set constant takes up 32 bytes), but is much faster.
  • CASE ch OF
    	  '0'..'9','A'..'Z','a'..'z','_':...;
    	END;
    	
    For simple cases this gives the shortest and fastest code.

Variable References

MEM / MEMW

	  expression: segment
	  PUSH  AX
	  expression: offset
	  XCHG  DI,AX         ;pointer -> ES:DI
	  POP   ES
	

Use ABSOLUTE for variables with a constant address.

WITH Indexing

In a WITH block all variable names must be searched first in the scopes of the active records and then in the regular symbol table. This can take quite some time. If the base offset is not known at compilation time (WITH rec[var] DO), a WITH pointer must be calculated and stored on the stack, otherwise this is done at compilation time.

Array Indexing, String Indexing

If necessary ES and DI must be saved before evaluation. Different code is produced depending on the index.

Constant index: The index is checked at compilation time, multiplied by the element size and added to the base offset, no code is emitted.

Variable index with range checking:

	  SUB   AX,#lower_bound (max be DEC AX / nothing)
	  MOV   CX,#upper_bound+1
	  CALL  xindchk       ;check index
	
Variable index without range checking: The subtraction of the lower bound can be omitted, it is multiplied by the element size and then subtracted from the base offset.

The index is multiplied by the element size. This is optimized for some important element sizes:

	  no code             ;size = 1

	  SHL   AX,1          ;size = 2

	  SHL   AX,1          ;size = 4
	  SHL   AX,1

	  SHL   AX,1          ;size = 6
	  MOV   CX,AX
	  SHL   AX,1
	  ADD   AX,CX

	  MOV   CX,#size      ;other element sizes
	  MUL   CX
	
"> The index is then stored in DI:
	  XCHG  DI,AX
	

or added to the existing index:

	  ADD   DI,AX
	

Record Indexing

This is very simple: The offset of the record variable is added to the memory offset of the variable.

Pointer indexing Pointers are loaded with LES DI,pointer_var.

Use of Addressing Modes

The procedure einstr emits a command using the correct addressing mode. If necessary a segment prefix (CS: or ES:) is inserted.

  • not indexed, not on stack:
      MOV   AX,var
    	
  • indexed:
      MOV   AX,[DI]       ;no offset
    	  MOV   AX,[DI]offs8  ;short offset (-128..127)
    	  MOV   AX,[DI]offs16 ;long offset (0..65535)
    	
  • stack, local variables:
      MOV   AX,[BP]offs   ;not indexed
    	  MOV   AX,[BP+DI]offs;indexed
    	
  • stack, callers local variables:
      MOV   BX,[BP]-lev*2 ;read display pointer
    	  SS:                 ;indexed by BX - need prefix
    	  MOV   AX,[BX]offs   ;not indexed
    	  MOV   AX,[BX+DI]offs;or indexed
    	

Calculate Pointer to Variable

	  indexing
	

The offset is read into DI:

	  MOV   DI,#offset    ;not indexed
	  ADD   DI,#offset    ;indexed: short or long offset
	  LEA   DI,[BP]offs   ;stack: load effective address
	

The segment is handed over on the stack:

	  PUSH  CS/DS/ES/SS
	

Read Variable

  • Scalar:
      MOV   AX,var        ;integer, not indexed, not on stack
    	  MOVB  AL,var        ;byte, not indexed, not on stack
    	  MOV   AX,..         ;other integer (emitted by einstr)
    	  MOVB  AL,..         ;other byte
    	
    For byte variables XOR AH,AH is inserted - TURBO always uses integer variables internally.
  • Pointer:
      LES   AX,ptr_var    ;AX = offset
    	  MOV   DX,ES         ;DX = segment
    	
  • Real:
      set pointer to variable
    	  CALL  xldreal
    	
  • Set:
      set pointer to variable
    	  MOV   CX,#set_crunch
    	  CALL  xldset
    	
  • String:
      set pointer to variable
    	  CALL  strload
    	

Store Variable

  • Scalar: If R+ is set, range checking is done:
      MOV   CX,#lower_bound
    	  MOV   DX,#upper_bound
    	  CALL  xrngchk
    	
    If the variable is neither indexed nor on the stack, there's a short form again:
      MOV   var,AX        ;integer
    	  MOVB  var,AL        ;byte
    	
    Otherwise the correct MOV is emitted by einstr.
  • Pointer:
      MOV   dest,AX       ;offset
    	  MOV   dest+2,DX     ;segment
    	
  • Real:
      set pointer to variable
    	  CALL  xstoreal
    	
  • String:
      set pointer to variable
    	  MOV   CL,#max_length
    	  CALL  strstore
    	
  • Set:
      set pointer to variable
    	  MOV   CX,#set_crunch
    	  CALL  setsto
    	

Symbol Table

The symbol table manager has to insert and search symbols of any type. The difficulty is that definitions may be of any complexity and names may have any length. The structure implemented in TURBO closely represents the definition and reference sequence, making it easy to "navigate" through complex types.

Symbols are searched beginning with the most recent definition. For new definitions, the current block (limited by the "fence" set at the beginning of a procedure definition) and the keyword table are searched for duplicate definitions. Thus it is possible to override old definitions. At the end of a procedure local variables may be removed from the symbol table. This reduces memory usage and search time.

Symbol Table Entry Structure

The symbol table "symtab" is stored in the stack segment (the actual stack doesn't need much space) and grows down. The entries have this basic structure:

	  off  :next entry
	  --
	  off-2:tag word = entry type
	  off-4:name length
	  off-5:name
	  off-.:entry
	      0:offset to next entry
	
The symbol table is always searched backwards, looking at the most recent entries first. A linear search is used. Sample symbol table entries can be found in the compiler tables (starting from 9277).

tag = 0100: Label

	 - 1:procedure nesting (to prevent jumps into or out of procedures)
	 - 2:0=ok, FF=not yet defined
	 - 4:offset
	

tag = 0200: Constant

	 - 1:type of constant
	 - 3:constant - strings are stored backwards
	
Structured constants are initialized variables stored in the code segment and are entered as such in the symbol table.

tag = 0300: Type

	 - 2:pointer to type definition
	

tag = 0400: Variable

For subvariables of a record the low byte of the tag word is the number of the record definition this entry belongs to.

	 - 2:pointer to type definition
	 - 4:offset
	 - 5:0=normal, FF=indirect (VAR)
	 - 6:segment
	     FF = DS
	     FE = CS
	     FD = ES (via pointer)
	     .. = SS, the number corresponds to the procedure nesting level
	
For ABSOLUTE variables of the form $B800:0 a pointer is stored in the code segment, the symbol table entry id CS indirect.

tag = 0500: Procedure

tag = 0600: Function

	 - 2:function only - pointer to result type
	 - 4:function only - offset on stack
	 - 5:function only - 0 = normal, FF = indirect
	 - 6:function only - segment: SS
	 - 8:code offset of procedure / function
	 - A:position in overlay file / of forward jump
	 - C:0=ok, FF=forward definition
	 - E:length of overlay procedure
	 -10:number of parameter lists
	 - 2:type pointer                      |may be repeated
	 - 3:0=normal, FF=VAR parameter        |
	 - 4:number of parameters of this type |
	 - 5:names                             |
	
The parameters are also listed as local variables.

Structure of Subentries

The normal entries aren't sufficient for the description of complex types. Unnamed subentries are used for this. For complex definitions this gives a tree structure. ARRAY[1..15] OF INTEGER is an array having the subrange 1..15 as index type and INTEGER as element type.

tag = 0000, 0800: Subentries

	 - 2:component size
	 - 4:lower bound / pointer to type definition
	 - 6:upper bound / pointer to index type definition
	 - 7:flag / record number
	 - 8:component type
	      1 = array
	      2 = record
	      3 = set
	      4 = pointer
	      5 = typed file   (FILE OF)
	      6 = text file    (TEXT)
	      7 = untyped file (FILE)
	      8 = string
	      9 = real
	      A = integer
	      B = boolean
	      C = char
	      . = enumeration types (numbered)
	
The compiler often uses this register assignment:
	  CL = component type
	  CH = where's the result ?
	        0 = constant
	        1 = variable
	        2 = in AX / on stack
	        3 = flags set (jz/jnz)
	        4 = comparison, use branch opcode brnchop
	
The use of the subentry fields depends on the component type:

Array

	 - 2:component size
	 - 4:type pointer
	 - 6:pointer to index type
	 - 8:01
	
Record
	 - 2:component size
	 - 7:record number
	 - 8:02
	
Set
	 - 2:component size
	 - 4:pointer to index type
	 - 8:03
	
Pointer
	 - 2:component size = 4
	 - 6:pointer to type / type name
	 - 7:0=ok, FF=not yet resolved
	 - 8:04
	
Pointers can be forward defined. In this case the type name is stored as an invisible entry. The type pointer is then inserted later.

String

	 - 2:length + 1
	 - 8:08
	
Text file with nonstandard buffer size
	 - 2:component size
	 - 8:06
	
Enumeration type, subrange
	 - 2:size: 1 or 2 bytes
	 - 4:lower bound
	 - 6:upper bound
	 - 8:number of enumeration type
	
The elements of an enumeration type are stored as constants.

Symbol Table Search

TURBO uses a linear search to find symbols in the symbol table. This can be slow. A worst case program can get compilation speed down to 0.022 lines per second...

There is a better way: Hashing. This is used in later versions of the compiler.

Error Handler

Many compilers try to continue compilation after an error has been found. The difficulty about this is that this shouldn't trigger an avalanche of meaningless error messages.

TURBO always stops if an error is found. This can also be seen as an advantage: The programmer is forced to solve problems one at a time.

Run-time Library

In the run-time library all standard procedures and functions needed for the execution of programs are stored. TURBO does this in a rather wasteful (but simple) way: it always inserts all procedures. Later versions of the compiler only include the procedures that are actually used.

Memory Map

The segments are allocated as follows:

	  --
	   stack     (SS)  (grows down)
	  --
	   free
	  --
	   heap            (grows up)
	  --
	   variables (DS)
	  --
	   code      (CS)
	  --
	

Heap

Heap storage is allocated in blocks having a size that is a multiple of 8 bytes. The free blocks are kept track of with this structure:

	 +0: pointer to next free block -> linked list
	 +4: length of free block
	 +8: free block
	
Hpstrt points to the first free block. The last block in the list - the free space between stack and heap - is marked by a 0. If there isn't enough space between heap and stack an error is reported.

Floating Point Arithmetics

Floating point numbers are divided into two parts: The exponent gives the order of magnitude, the mantissa gives the accuracy needed.

	  number = mantissa * 2 ^ exponent
	
The mantissa is a binary fraction with an accuracy of 40 bits. The mantissa always represents a number between 0.5 and 1, i.e. it is normalized. This means the most significant bit is always 1. Here the sign is stored. The examples use decimal numbers for simplicity.

A floating point addition works as follows:

	   0.95 E+00
	+  0.60 E-01

	   0.95 E+00   The number with the smaller exponent is
	+  0.06 E+00   adjusted (de-normalized) to match the other.
	=  1.01 E+00   The numbers can now be added as usual.
	               The result is too big, it must be normalized
	=  0.10 E+01   and rounded up.
	
Please note that floating point additions and subtractions can cause large round off errors, if the exponents differ too much.

It is often claimed that BCD arithmetics cause less errors. This is not correct - they are not as visible. Some calculators actually do cosmetic rounding, if the result is near to an integer number. The best way to avoid round off errors is to calculate money amounts in cents and not in dollars. BCD multiplications and divisions are slow. However, BCD does have one advantage: Conversions to and from ASCII are much faster. As business applications usually consist mainly of additions and subtractions, BCD can actually be faster.

Floating point multiplication

	   0.13 E+00   The exponents are added (division: subtracted)
	*  0.49 E+03   and the mantissas are multiplied.

	   0.13 E+00
	*  0.49 E+00
	*       E+03

	=  0.0637      The additional digits are cut off and the
	*       E+03   result is normalized and rounded up.

	=  0.64 E+02
	
Square roots are evaluated using Newton's approximation. This is a good solution, but not the best: The 8087 does square roots faster than divisions ! Transcendental functions are evaluated using the standard polynomials found in math books. I have seen better algorithms to do this. Don't expect much speed and precision from TURBO transcendentals.

Appendix

Bugs

Thanks to the relative simplicity of the algorithms used TURBO Pascal is almost bug-free. Well, almost.

Set as Procedure Parameter

If the last element is in the range 248..255 and the first above 7, this doesn't work. Redefine the set or pass it as a VAR parameter.

SizeOf

Sometimes a redundant load is emitted.

UpCase

The argument type is not checked. Try UpCase(15).

WHILE

DO can be omitted.

Compiler Speed

TURBO may be faster than most other compilers, but there's still a wide margin for improvements:

  • symbol table: use hashing
  • include files: use larger buffer
  • don't copy source lines into another buffer
The editor could be much faster. The screen is re-displayed in an "intelligent" way (using DelLine and InsLine). On a terminal this may be faster - with memory mapped video this is a nuisance. Search and replace is slow.

Write Faster Programs Using TURBO Pascal

Avoid the standard string functions.

  • Use ord(st[0]) instead of length(st)
  • Use move instead of string assignments:
    	  move(st1,st2,max_len+1);
    	

Write real constants with decimal point (10.0 instead of 10). This eliminates conversions.

GOTO statement considered harmful. So what... If a GOTO is the best way to express a program structure - use it ! TURBO GOTO's are still better than BASIC GOTO's: names can be used instead of numbers.

I/O can be improved if larger buffers are used (use multiples of 512 bytes for best results). Use BlockRead and BlockWrite. If large blocks are read or written the MS-DOS and TURBO overhead doesn't matter as much as for small blocks.

© 2002-2014 PC Engines GmbH. All rights reserved.