
Compilers

8. Run-time Support

László Böszörményi


Run-Time Environment
A compiler needs an abstract model of the run-time environment of the compiled code, and it must generate code that cooperates with that environment.
The run-time environment communicates with the operating system and possibly with the hardware.

Main tasks
- Storage management
- Handling of run-time errors
- Providing information for symbolic debugging
  - Real-time debugger (RTD) or post-mortem debugger (PMD)
- Hardware extensions
  - Emulated instructions, virtual registers

Design questions for imperative languages


- Are procedures recursive?
- Are functions (procedures returning a value) supported?
- What should happen to the local names (variables) of a procedure after it returns?
- Can non-local names be referenced?
- What kinds of parameter passing modes and return types are supported?
- Can procedures be passed as parameters or returned as a function value?
- Is dynamic memory management desirable?
- Is unused memory to be de-allocated explicitly, or is a garbage collector required?
- ... further questions for languages that are not procedure-oriented

Kinds of Storage
Static storage
- Code (immutable)
- Static instance (or module) variables

Semi-dynamic storage (stack)
- Allocated on procedure activation and de-allocated on return

Dynamic storage (heap)
- Storage allocated explicitly at run time
  - E.g. new in Java, malloc in C
- De-allocated explicitly
  - E.g. free in C
- Or implicitly, by a garbage collector
  - E.g. in Java and C#

[Figure: usual subdivision of run-time memory: Code | Static data | Heap | free | Stack. Heap and stack may be co-managed by a VMM.]

Activation Tree and Control Stack


The activation tree describes the flow of control over the procedure call chains
1. Each call (activation) is represented by a node
2. The root represents the activation of the main procedure
3. Node a is the parent of node b iff the flow of control goes from a to b (a calls b)
4. Node a is to the left of node b iff a terminates before b begins

Control stack
- The set of active procedures of a call chain
- Push at call (activation)
- Pop at return
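The relation between the activation tree and the control stack can be tried out with a small tracer. The following is only a sketch: the `Tracer` class, the 0-based list and the Lomuto-style `partition` are assumptions made for illustration and are not taken from the slides.

```python
class Tracer:
    """Records activation-tree edges and control-stack snapshots."""

    def __init__(self):
        self.stack = []      # control stack: active calls, innermost last
        self.edges = []      # (parent, child) edges of the activation tree
        self.snapshots = []  # control stack at the moment of each activation

    def call(self, label, body):
        parent = self.stack[-1] if self.stack else None
        self.edges.append((parent, label))
        self.stack.append(label)                  # push at call
        self.snapshots.append(tuple(self.stack))
        try:
            return body()
        finally:
            self.stack.pop()                      # pop at return


def partition(arr, m, n):
    """Hypothetical Lomuto partition around the separator value arr[n]."""
    pivot, p = arr[n], m
    for k in range(m, n):
        if arr[k] < pivot:
            arr[k], arr[p] = arr[p], arr[k]
            p += 1
    arr[p], arr[n] = arr[n], arr[p]
    return p


def quicksort(t, arr, m, n):
    if n > m:
        i = t.call(f"p({m},{n})", lambda: partition(arr, m, n))
        t.call(f"q({m},{i - 1})", lambda: quicksort(t, arr, m, i - 1))
        t.call(f"q({i + 1},{n})", lambda: quicksort(t, arr, i + 1, n))


def run(values):
    t, arr = Tracer(), list(values)
    last = len(arr) - 1

    def main():
        t.call(f"q(0,{last})", lambda: quicksort(t, arr, 0, last))

    t.call("main", main)
    return arr, t


arr, t = run([3, 1, 2])
for snapshot in t.snapshots:
    print(" > ".join(snapshot))   # one root-to-node path of the tree per line
```

Each printed line is the control stack at one activation, i.e. the path from the root of the activation tree to the node that has just been entered.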


Sketch of Quicksort
    class sort {
        int Arr[11];                    // Array to be sorted: Arr[1] .. Arr[9]

        void readarray() {              // Reads 9 integers into Arr[i]; 1 ≤ i ≤ 9
            int i;                      // Let us assume: -9999 < Arr[i] < 9999
            ...
        }

        int partition(int m, int n) {   // Partitions Arr[m..n] over a separator value V:
            ...                         // Arr[m..p-1] < V and Arr[p+1..n] ≥ V; returns p
        }

        void quicksort(int m, int n) {
            int i;
            if (n > m) {                // As long as not sorted (left and right partitions distinct)
                i = partition(m, n);    // Partition
                quicksort(m, i-1);      // Call quicksort recursively on the left
                quicksort(i+1, n);      // and on the right partition
            }
        }

        main() {
            readarray();                        // Initialize Arr
            Arr[0] = -9999; Arr[10] = 9999;     // Sentinel values (accelerate tests) (-∞ and +∞)
            quicksort(1, 9);                    // Initial call of quicksort on 9 elements
        }
    } // class sort

Activation tree and control stack for quicksort

[Figure: activation tree for quicksort, and the control stack at q(2,3)]

Activation record of a procedure


At call: push activation record
1. Reserve place for return values (set by the callee)
   - If not returned in a register
2. Push actual parameters
3. Push actual status (registers)
4. Push return address
   - PC (program counter)
5. Set access (static) link
   - Points to the surrounding scope
6. Set control (dynamic) link
   - Points to the caller's local data
7. Local and temporary variables
   - Temporary variables hold temporary results of an expression
   - Variable-length data: indirection, i.e. a pointer to the stack or heap

On return: pop in similar steps (in reverse order)

[Figure: layout of an activation record, partly pushed by the caller and partly by the callee: Returned values | Actual parameters | Saved status | Return address | Static link | Dynamic link | Local variables | Temp. variables]

Stack of activation records - Example


Control stack at q(2,3), from bottom to top:

  Static storage
    Code for readarray
    Code for partition
    Code for quicksort
    Code for main
    Arr[0] .. Arr[10]
  Activation record of q(1,9)
    parameter m: 1
    parameter n: 9
    status
    local variable i
  Activation record of q(1,3)
    parameter m: 1
    parameter n: 3
    status
    local variable i
  Activation record of q(2,3)
    parameter m: 2
    parameter n: 3
    status
    local variable i

Scoping
Names are valid in a certain scope

Static scoping
- The validity area is defined by the place in the program text
- A nested block can access outer names (via the access link)

Dynamic scoping
- Validity is defined by the actual state of the variables
- Dynamic binding of a variable to a type, e.g. class membership
  - E.g. ((Student)person).matrNum is valid if person is an instance of Student

Mapping of names of variables to storage areas
- The compiler generates relative addresses (offsets)
- The run-time environment maps these to storage addresses
- During execution, values are assigned to variables

  Name   Offset   Storage   Value
  XYZ    38       100038    25       (int XYZ := 25)
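The two-step mapping (offset at compile time, storage address at run time) can be sketched as follows. The base address 100000 is inferred from the table (100038 - 38); the declaration list, the sizes and the function names are hypothetical.

```python
def assign_offsets(declarations):
    """Compile time: give every (name, size) pair a relative offset."""
    offsets, next_offset = {}, 0
    for name, size in declarations:
        offsets[name] = next_offset
        next_offset += size
    return offsets


def storage_address(base, offsets, name):
    """Run time: the environment adds the base address of the data area."""
    return base + offsets[name]


# Hypothetical data area: 38 bytes of other variables precede XYZ.
offsets = assign_offsets([("other_variables", 38), ("XYZ", 4)])
memory = {}
memory[storage_address(100000, offsets, "XYZ")] = 25   # int XYZ := 25

print(offsets["XYZ"], storage_address(100000, offsets, "XYZ"), memory[100038])
```

The generated code only ever contains the offset 38; the base address can differ from run to run without recompiling.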

Static (lexical) scoping

n: nesting level

    program sort(input, output);                            { n = 0 }
        var Arr : array [1 .. 10] of integer;
            x : integer;
        procedure readarray;                                { n = 1 }
            var i : integer;
            begin ... Arr ... end {readarray};
        procedure exchange(i, j : integer);                 { n = 1 }
            begin
                x := Arr[i]; Arr[i] := Arr[j]; Arr[j] := x
            end {exchange};
        procedure quicksort(m, n : integer);                { n = 1 }
            var k, v : integer;
            function partition(y, z : integer) : integer;   { n = 2 }
                var i, j : integer;
                begin ... k, v ... exchange(i, j); ... end {partition};
            begin ... end {quicksort};
    begin ... end {sort}.

Which k and v? The access link points to the valid set of q's locals.

Stack at e(1, 3), from bottom to top:

  sort       Arr[1] .. Arr[10], x
  q(1, 9)    access link; k, v
  q(1, 3)    access link; k, v
  p(1, 3)    access link; i, j
  e(1, 3)    access link
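How a non-local name is found through access links can be sketched as below. The `Frame` class, the `resolve` helper and the concrete values of k and x are assumptions for illustration; the frames mirror the stack sort, q(1,9), q(1,3), p(1,3), e(1,3).

```python
class Frame:
    """A simplified activation record."""

    def __init__(self, name, level, access_link, dynamic_link, local_vars):
        self.name = name
        self.level = level                # static nesting level n
        self.access_link = access_link    # frame of the textually enclosing procedure
        self.dynamic_link = dynamic_link  # frame of the caller
        self.local_vars = local_vars


def resolve(frame, declaring_level):
    """Follow (frame.level - declaring_level) access links.

    The hop count is known at compile time because both nesting levels are."""
    for _ in range(frame.level - declaring_level):
        frame = frame.access_link
    return frame


sort = Frame("sort", 0, None, None, {"x": 0})
q19 = Frame("q(1,9)", 1, sort, sort, {"k": 4, "v": 7})   # hypothetical values
q13 = Frame("q(1,3)", 1, sort, q19, {"k": 2, "v": 5})    # hypothetical values
p13 = Frame("p(1,3)", 2, q13, q13, {"i": 1, "j": 3})
e13 = Frame("e(1,3)", 1, sort, p13, {})

# partition (level 2) uses k, declared in quicksort (level 1): one hop.
print(resolve(p13, 1).name, resolve(p13, 1).local_vars["k"])
# exchange (level 1) uses x, declared in sort (level 0): one hop.
print(resolve(e13, 0).name)
```

Note that for e(1,3) the access link (to sort) and the dynamic link (to p(1,3)) differ, which is why both are stored.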

Fast static scoping (Display)


Display
- d[i] points to the current activation record at nesting level i
- The nesting level is known at compile time
- Fast, but d has a predefined length

At call of a procedure at nesting level i
- Store d[i] in the activation record
- Let d[i] point to the new activation record

Before returning
- Restore d[i]

Stack at e(1, 3), from bottom to top (display entries d[0], d[1], d[2]):

  sort       Arr[1] .. Arr[10], x
  q(1, 9)    saved d[1]; k, v
  q(1, 3)    saved d[1]; k, v
  p(1, 3)    saved d[2]; i, j
  e(1, 3)    saved d[1]
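The save and restore discipline of the display can be sketched like this. The `Frame` class, the `enter` and `leave` helpers and the values of k are hypothetical; the call sequence follows the stack sort, q(1,9), q(1,3), p(1,3), e(1,3).

```python
class Frame:
    def __init__(self, name, level, local_vars):
        self.name = name
        self.level = level
        self.local_vars = local_vars
        self.saved_display = None   # old d[level], stored in the activation record


def enter(display, name, level, local_vars):
    """At call: store d[level] in the new record, then let d[level] point to it."""
    frame = Frame(name, level, local_vars)
    frame.saved_display = display[level]
    display[level] = frame
    return frame


def leave(display, frame):
    """Before returning: restore d[level]."""
    display[frame.level] = frame.saved_display


d = [None, None, None]             # predefined length: nesting levels 0..2
sort = enter(d, "sort", 0, {"x": 0})
q19 = enter(d, "q(1,9)", 1, {"k": 4})
q13 = enter(d, "q(1,3)", 1, {"k": 2})
p13 = enter(d, "p(1,3)", 2, {"i": 1, "j": 3})
k_seen_by_partition = d[1].local_vars["k"]   # one indexed access, no link chain
e13 = enter(d, "e(1,3)", 1, {})
level1_during_exchange = d[1].name
leave(d, e13)
level1_after_exchange = d[1].name
print(k_seen_by_partition, level1_during_exchange, level1_after_exchange)
```

A non-local access costs one array lookup regardless of the nesting distance, at the price of the save and restore on every call.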

Parameter passing modes


Call by value
1. The value of the actual parameter is computed
2. The value is assigned to the formal parameter
   - The formal parameter acts as an initialized local variable

Call by reference
- The address of the actual parameter is passed
- Assignment to the formal parameter affects the actual parameter

Call by name (textual replacement, as with a macro)
- Outdated mode

Copy restore (used in remote procedure calls)
- Call by value and use locally
- Before return, assign the formal parameter to the actual parameter

Parameter passing - Examples


    VAR i: INTEGER; a: ARRAY [1 .. 2] OF INTEGER;

    PROCEDURE P (x: INTEGER);      (* for call by reference: P (VAR x: INTEGER); *)
    BEGIN
      i := i + 1;
      x := x + 2;
    END P;
    ...
    BEGIN
      a[1] := 10; a[2] := 20; i := 1;
      P(a[i])

Before the call: a[1]: 10, a[2]: 20

After the call
  Call by value:       a = (10, 20)
  Call by reference:   a = (12, 20)
  Call by name:        a = (10, 22)
  Copy restore:        a = (12, 20)    (same as by reference, but not always!)
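The four results can be reproduced by simulating each mode by hand. This is a sketch: Python itself supports none of these modes directly for integers, so the globals i and a are kept in a hypothetical `state` dictionary and each function spells out what the respective mode does for the call P(a[i]).

```python
def p_by_value(state):
    x = state["a"][state["i"]]    # value of a[i] copied into the local x
    state["i"] += 1
    x = x + 2                     # changes only the local copy


def p_by_reference(state):
    index = state["i"]            # address of a[i] is fixed at call time
    state["i"] += 1
    state["a"][index] += 2        # x := x + 2 writes through the address


def p_by_name(state):
    state["i"] += 1
    # every use of x is textually a[i], evaluated with the current i
    state["a"][state["i"]] = state["a"][state["i"]] + 2


def p_copy_restore(state):
    index = state["i"]            # location of a[i] is fixed at call time
    x = state["a"][index]         # copy in
    state["i"] += 1
    x = x + 2
    state["a"][index] = x         # copy out before return


def run(mode):
    state = {"i": 1, "a": {1: 10, 2: 20}}   # a[1] := 10; a[2] := 20; i := 1
    mode(state)
    return (state["a"][1], state["a"][2])


for mode in (p_by_value, p_by_reference, p_by_name, p_copy_restore):
    print(mode.__name__, run(mode))
```

Copy restore and call by reference agree here; they can differ when the procedure also accesses the actual parameter under another name while it runs.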

Dynamic Storage Allocation


Explicit allocation of variable-length blocks
- External (global) fragmentation
- Allocation methods
  - First-fit (fast), best-fit, worst-fit

Explicit allocation of fixed-length blocks
- Contiguous runs of blocks for large areas
- Internal fragmentation

Compromise: buddy algorithm
- The variable block lengths are limited to powers of 2

Virtual memory management solves fragmentation
- Fixed-length blocks, linked together with hardware support
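First-fit, best-fit and the rounding done by a buddy allocator can be compared on a toy free list. This is a minimal sketch: the free list is a hypothetical list of (start, length) holes, and coalescing of freed blocks is left out.

```python
def first_fit(free_list, size):
    """Index of the first hole that is large enough, or None."""
    for index, (_start, length) in enumerate(free_list):
        if length >= size:
            return index
    return None


def best_fit(free_list, size):
    """Index of the smallest hole that is large enough, or None."""
    fitting = [(length, index)
               for index, (_start, length) in enumerate(free_list)
               if length >= size]
    return min(fitting)[1] if fitting else None


def allocate(free_list, size, choose):
    """Carve `size` units out of the chosen hole; returns the start address."""
    index = choose(free_list, size)
    if index is None:
        return None
    start, length = free_list[index]
    if length == size:
        del free_list[index]
    else:
        free_list[index] = (start + size, length - size)
    return start


def buddy_block_size(size):
    """Buddy algorithm: round the request up to the next power of 2."""
    block = 1
    while block < size:
        block *= 2
    return block


holes = [(0, 50), (100, 20), (200, 30)]
print(allocate(list(holes), 20, first_fit))   # takes the first hole
print(allocate(list(holes), 20, best_fit))    # takes the exactly fitting hole
print(allocate(list(holes), 60, first_fit))   # fails: external fragmentation
print(buddy_block_size(20) - 20)              # wasted units: internal fragmentation
```

The request for 60 units fails although 100 units are free in total, which is the external fragmentation named above.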

Garbage Collection
Explicit de-allocation is error-prone
- Memory leaks
  - Memory is never released
  - Bad in server code
- Dangling references
  - Pointing at released memory: very bad!

Java, C#, Modula-3, ML, Prolog use g.c.

Difficulties
- Careful memory usage
- Avoid too long pauses

[Figure: heap with pointers p, q and r; a tree that is in use and a list that is unused]

Reachability of Data
Root set
- Data accessed directly, without a pointer
- E.g. in Java the static field members + the stack
- The root set is always reachable

Transitive (recursive) reachability
- Data that can be reached via the reachable set is reachable

The set of reachable data changes at
- Object allocations
- Parameter passing and return values
- Reference assignments
- Procedure returns

Mark and Sweep


Mark all reachable data

    function DFS(x)
        if x is a pointer pointing to the heap
            if record x is not marked
                mark x
                for each field fi of record x
                    DFS(x.fi)

Sweep: release all unmarked data

    p := first address in heap
    while p < last address in heap
        if record p is marked
            unmark p
        else
            let f1 be the first word of p    (is unused, because free!)
            p.f1 := freeList
            freeList := p
        p := p + (size of record p)
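The two phases can be tried out on a toy heap. The sketch below models records as Python objects instead of raw addresses, so the free list is an ordinary list rather than being threaded through the first word of each record; `Obj` and `collect` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.fields = []      # pointers to other records (or None)
        self.marked = False


def mark(x):
    """Depth-first marking, as in DFS(x)."""
    if x is not None and not x.marked:
        x.marked = True
        for field in x.fields:
            mark(field)


def sweep(heap):
    """Linear pass over the whole heap: unmark survivors, free the rest."""
    survivors, free_list = [], []
    for record in heap:
        if record.marked:
            record.marked = False
            survivors.append(record)
        else:
            free_list.append(record)
    return survivors, free_list


def collect(roots, heap):
    for root in roots:
        mark(root)
    return sweep(heap)


a, b, c, d, e = (Obj(n) for n in "abcde")
a.fields = [b]
b.fields = [c]
c.fields = [a]        # reachable cycle
d.fields = [e]
e.fields = [d]        # unreachable cycle
survivors, free_list = collect([a], [a, b, c, d, e])
print([o.name for o in survivors], [o.name for o in free_list])
```

Unlike reference counting, the unreachable cycle d, e is reclaimed, because only reachability from the roots matters.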

Price for mark and sweep


Let
- H: heap size
- R: size of all reachable blocks
- C1, C2: number of instructions required for mark and sweep, respectively

Cost = (C1*R + C2*H) / (H - R)    (usually C1 > C2)

This is the cost per unit of reclaimed memory: marking touches the R reachable units, sweeping touches all H units, and H - R units are freed.
- If H >> R (which is desirable): Cost ~ C2
- If H >> R does not hold
  - Try to get more memory from the operating system

Problem of the recursive algorithm
- The depth of recursion could be H in the worst case!
- The stack should rather be built explicitly on the heap itself
  - A traversed record points back to its predecessor (pointer reversal)
  - On return, the reversed pointers have to be reset (reversed again)
  - Needs only a few additional variables for the management
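The cost formula, read as (C1*R + C2*H) divided by the H - R units that a collection frees, can be evaluated for a few heap sizes. The instruction counts C1 = 10 and C2 = 3 and the sizes below are hypothetical values chosen only for illustration.

```python
def gc_cost(heap_size, reachable, c1, c2):
    """Instructions spent per unit of memory reclaimed by one mark and sweep run."""
    if heap_size <= reachable:
        raise ValueError("nothing can be reclaimed: H must exceed R")
    return (c1 * reachable + c2 * heap_size) / (heap_size - reachable)


C1, C2, R = 10, 3, 500
for H in (600, 1000, 10_000, 100_000):
    print(H, round(gc_cost(H, R, C1, C2), 2))
```

With R fixed, the cost falls from 68 per unit at H = 600 towards C2 = 3 as H grows, which is why a nearly full heap is a reason to request more memory.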

Mark and Sweep - Example


Reference counting
Each memory block has a reference counter
- If a new reference to the block is set: increment
- If a reference is deleted: decrement
- If reference counter == 0, the block is garbage

Assignments must be tracked
- p := q
  - p_prev.refC--; q.refC++    (p_prev: the block p pointed to before the assignment)
  - Makes code slow

With data flow analysis
- Counter operations can be reduced
- Complex task in the compiler

Cycles remain undetected

[Figure: effect of p := q on the reference counters: the counter of p's previous block drops from 2 to 1, the counter of q's block rises from 3 to 4]
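The bookkeeping behind an assignment p := q, and the undetected cycle, can be sketched as follows. `Block`, `assign` and the `freed` log are hypothetical names; reference holders (root variables or record fields) are modelled as dictionaries.

```python
class Block:
    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.fields = {}      # outgoing references of this block


freed = []                    # names of reclaimed blocks, in order


def increment(block):
    if block is not None:
        block.ref_count += 1


def decrement(block):
    if block is None:
        return
    block.ref_count -= 1
    if block.ref_count == 0:                 # the block is garbage
        freed.append(block.name)
        children = list(block.fields.values())
        block.fields.clear()
        for child in children:               # its references die with it
            decrement(child)


def assign(holder, key, target):
    """holder[key] := target, with the counter updates this implies."""
    previous = holder.get(key)
    increment(target)          # increment first, so that p := p stays safe
    holder[key] = target
    decrement(previous)


roots = {}
a, b, c, d = Block("a"), Block("b"), Block("c"), Block("d")
assign(roots, "p", a)
assign(roots, "q", b)
assign(roots, "p", b)          # p := q, block a loses its last reference

assign(roots, "r", c)
assign(c.fields, "next", d)
assign(d.fields, "next", c)    # c and d now form a cycle
assign(roots, "r", None)       # the cycle becomes unreachable

print(freed, b.ref_count, c.ref_count, d.ref_count)
```

After the last assignment c and d are unreachable, yet each still has a counter of 1, so neither is ever freed.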

Copying collectors
Memory is partitioned into 2 semispaces A and B
- Memory is allocated in A
- If the end of A is reached
  - Used blocks are copied into B
  - A is now fully free
- The roles of A and B swap after each run

Disadvantages
- Half of the memory remains unused
- Addresses must be changed at run time
  - Especially bad if a memory address is used as a hash value (e.g. in legacy C code)
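Copying the used blocks into the other semispace, including the address changes this forces, can be sketched as follows. The slides do not name a particular algorithm; the breadth-first scan with forwarding pointers used here is one common choice (Cheney's algorithm), and `Obj` and `copy_collect` are hypothetical names.

```python
class Obj:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields if fields is not None else []
        self.forward = None    # forwarding pointer, set once the block is copied


def copy_collect(roots):
    """Copy everything reachable from `roots` into a fresh to-space."""
    to_space = []

    def forward(obj):
        if obj is None:
            return None
        if obj.forward is None:                    # not copied yet
            clone = Obj(obj.name, list(obj.fields))
            obj.forward = clone                    # leave the new address behind
            to_space.append(clone)
        return obj.forward

    new_roots = [forward(root) for root in roots]
    scan = 0
    while scan < len(to_space):                    # blocks behind `scan` are done
        clone = to_space[scan]
        clone.fields = [forward(field) for field in clone.fields]
        scan += 1
    return new_roots, to_space


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.fields = [b, c]
b.fields = [c]
c.fields = [a]         # cycle back to a
d.fields = [a]         # d itself is unreachable
new_roots, to_space = copy_collect([a])
print([o.name for o in to_space])
```

Every surviving block gets a new identity (address), and all references to it, including the roots, must be rewritten, which is the disadvantage noted above. Garbage such as d is never touched.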

Short-Pause Garbage Collection


Simple g.c. stops the world
- Especially bad for real-time applications
- E.g. while watching a movie

Partial collection
- We collect only a little bit at a time
- Generational garbage collection
  - Memory is divided into several generations
  - Usually only the youngest generation is collected
  - Very efficient if there are not too many cross-generation references

Incremental collection
- The reachability analysis is broken into small pieces
- The collector may overlook garbage
  - But it must never collect non-garbage!
- Typically runs in its own thread in the background
