
SOEN 6431 SOFTWARE MAINTENANCE AND EVOLUTION

Dr. Juergen Rilling

Notes #7: Program Slicing

Program comprehension
Program comprehension is the study of how software engineers understand programs. Program comprehension is needed for:

- Debugging
- Code inspection
- Test case design
- Re-documentation
- Design recovery
- Code revisions

Program comprehension process


Involves the use of existing knowledge to acquire new knowledge about a program.

Existing knowledge:
- Programming languages
- Computing environment
- Programming principles
- Architectural models
- Possible algorithms and solution approaches
- Domain-specific information
- Any previous knowledge about the code

New knowledge:
- Code functionality
- Architecture
- Algorithm implementation details
- Control flow
- Data flow

Comprehension techniques
Reading by step-wise abstraction

Determine the function of critical subroutines, then work up through the program hierarchy until the function of the whole program is determined.

Checklist-based reading

Readers are given a checklist to focus their attention on particular issues within the document. Different readers are given different checklists, so each reader concentrates on different aspects of the document.

Defect-based reading

Defects are categorized and characterized (e.g., data type inconsistency, incorrect functionality, missing functionality). A set of steps (a scenario) is then developed for each defect class to guide the reader in finding those defects.

Perspective-based reading

Similar to defect-based reading, but instead of different defect classes, readers have different roles (tester, designer, and user) to guide them in reading.

Sources of variation
Aside from the issue of how comprehension occurs, comprehension performance and effectiveness are affected by many factors:

- Maintainer characteristics
- Program characteristics
- Task characteristics


Maintainer characteristics
- Familiarity with code base
- Application domain knowledge
- Programming language knowledge
- Programming expertise
- Tool expertise
- Individual differences


Program characteristics
- Application domain
- Programming domain
- Quality of problem to be understood
- Program size and complexity
- Availability and accuracy of documentation


Task characteristics
- Task type
  - Experimental: recall, modification
  - Perfective, corrective, adaptive, reuse, extension
- Task size and complexity
- Time constraints
- Environmental factors


Models
Mental models

Internal working representation of the software under consideration.

Cognitive models

Theories of the processes by which software engineers arrive at a mental model.

[Figure: Program, Cognitive Model, Mental Model.]

Mental models
Static elements
- Text structure knowledge
- Microstructure
- Chunks (macrostructure)
- Plans (objects)
- Hypotheses

Dynamic elements
- Strategies (chunking and cross-referencing)

Supporting elements
- Beacons
- Rules of discourse

Text structure
The program text and its structure:
- Control structure: iterations, sequences, conditional constructs
- Variable definitions
- Calling hierarchies
- Parameter definitions

Microstructure: the actual program statements and their relationships.


Chunks
Contain various levels of text structure abstractions. Also called macrostructure. Can be identified by a descriptive label. Can be composed into higher level chunks.


Plans (objects)
Knowledge elements for developing and validating expectations, interpretations, and inferences. Include causal knowledge about information flow and relationships between parts of a program.

Programming plans
- Based on programming concepts.
- Low level: iteration and conditional code segments.
- Intermediate level: searching, sorting, summing algorithms; linked lists and trees.
- High level

Domain plans
- All knowledge about the problem area.
- Examples: problem domain objects, system environment, domain-specific solutions and architectures.


Hypotheses
Conjectures that result from comprehension activities, which can take seconds or minutes to occur. Three types:

- Why: hypothesize the purpose/rationale of a function or design choice.
- How: hypothesize the method for accomplishing a certain goal.
- What: hypothesize classification.

Hypotheses are drivers of cognition. They help to define the direction of further investigation. Code cognition formulates hypotheses, checks whether they are true or false, and revises them when necessary. Hypotheses fail for several reasons:

- Can't find code to support a hypothesis.
- Confusion due to one piece of code satisfying different hypotheses.
- Code cannot be explained.


Supporting elements
Beacons

Cues that index into existing knowledge. A swap routine can be a beacon for a sorting function. Experienced programmers recognize beacons much faster than novice programmers. Used commonly in top-down comprehension.

Rules of discourse

Rules that specify programming conventions. Examples: coding standards, algorithm implementations, expected use of data structures.


Mental models: dynamic elements


Strategies

Sequences of actions that lead to a particular goal.

Actions

Classify programmer activities, implicitly and explicitly, during a maintenance task.

Episodes

Sequences of actions.

Processes

Aggregations of episodes.


Strategies
Guide the sequence of actions while following a plan to reach a goal. Match programming plans to code.

- Shallow reasoning: does not perform in-depth analysis; stops upon recognition of familiar idioms and programming plans.
- Deep reasoning: performs detailed analysis.

Mechanisms for understanding


- Chunking
- Cross-referencing


Chunking
Creates new, higher-level abstraction structures. Labels replace the detail of the lower-level chunks.


Cross-referencing
Map program parts to functional descriptions:

    temp = a; a = b; b = temp;                 => swap

    for (i = 0; i < size; i++)
        if (array[i] == target) return true;   => sequential search


Cognitive models
- Letovsky
- Shneiderman and Mayer
- Brooks
- Soloway, Adelson and Ehrlich
- Pennington
- von Mayrhauser and Vans (Integrated)


Letovsky model

Shneiderman model


Brooks model


Soloway model


Pennington model


Integrated model


Distributed cognition
Traditional cognitive models deal with the cognitive processes inside one person's brain. On real projects, software developers:

- Work in teams
- Can ask people questions
- Can surf the web for answers

How do these affect the cognitive process?


Program Comprehension - support

Program Slicing


Can we learn from other domains?

We might not want or can digest a whole


Solution?


What is Program Slicing?



More descriptively, program slicing is a decomposition technique that extracts from a program the statements relevant to a particular computation.

Slicing criterion: <s, v>, where s is a statement (program point) and v is a variable.

Program slices as originally introduced by Weiser [1] are known as executable backward static slices.


Basic Idea

Given:
(1) A program
(2) A variable v at some point P in the program

Goal: Find the part of the program that is responsible for the computation of variable v at point P.


Why Program Slicing?



- Program debugging: that's how slicing was discovered!
- Testing: reduce the cost of regression testing after modifications (run only those tests that are needed)
- Parallelization
- Integration: merging two programs A and B that both resulted from modifications to BASE
- Reverse engineering: comprehending the design by abstracting the design decisions out of the source code
- Software maintenance: changing source code without unwanted side effects
- Software quality assurance: validating interactions between safety-critical components

Types of Slicing (Executable)


Static Backward Program Slicing



Static backward program slicing was originally introduced by Weiser in 1982. A static program slice consists of those parts of a program P that could potentially affect the value of a variable v at a point of interest.
[Figure: Program P and its static slice; for all possible program inputs (executions), the value of v in P equals the value of v in the slice.]

Slicing Properties:

Static Slicing

- Uses statically available information only
- No assumptions are made about the input
- The computed slice is in general not the minimal (fully accurate) slice
- Computing the minimal slice is undecidable (reduction to the halting problem)
- Current static methods can only compute approximations
- The result may not be useful


Creating a PDG
 1  input (n, a);
 2  max := a[1];
 3  min := a[1];
 4  i := 2;
 5  s := 0;
 6  while i <= n do begin
 7    if max < a[i] then begin
 8      max := a[i];
 9      s := max;
      end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min;
      end;
13    output (s);
14    i := i + 2;
    end;
15  output (max);
16  output (min);

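Data Dependence:
Represents a data flow (definition-use chain) from a statement that defines a variable to a statement that uses it.
=> Data dependence between 2 and 7 (7 uses the value of max defined at 2), but not between 2 and 8 (8 does not use max).

Control Dependence:
The execution of a node depends on the outcome of a predicate node.
=> Control dependence between node 6 and 8 (node 8 lies in the loop body, nested in the if at node 7), but not between 6 and 15 (node 15 follows the loop).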
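The data dependence relation described above can be derived mechanically with a reaching-definitions analysis. The following is a minimal Python sketch under stated assumptions: the tables `DEF_USE` and `SUCC` are a hypothetical hand encoding of the 16-statement sample program, the loop condition is taken to be `i <= n`, and the array `a` is treated as a single variable. It is not the algorithm used in the lecture, only one common way to obtain definition-use edges.

```python
from collections import defaultdict

# Hypothetical encoding of the sample program:
# statement -> (variables defined, variables used). Arrays are treated whole.
DEF_USE = {
    1: ({"n", "a"}, set()),           # input(n, a)
    2: ({"max"}, {"a"}),              # max := a[1]
    3: ({"min"}, {"a"}),              # min := a[1]
    4: ({"i"}, set()),                # i := 2
    5: ({"s"}, set()),                # s := 0
    6: (set(), {"i", "n"}),           # while i <= n
    7: (set(), {"max", "a", "i"}),    # if max < a[i]
    8: ({"max"}, {"a", "i"}),         # max := a[i]
    9: ({"s"}, {"max"}),              # s := max
    10: (set(), {"min", "a", "i"}),   # if min > a[i]
    11: ({"min"}, {"a", "i"}),        # min := a[i]
    12: ({"s"}, {"min"}),             # s := min
    13: (set(), {"s"}),               # output(s)
    14: ({"i"}, {"i"}),               # i := i + 2
    15: (set(), {"max"}),             # output(max)
    16: (set(), {"min"}),             # output(min)
}

# Control-flow successors of each statement.
SUCC = {
    1: [2], 2: [3], 3: [4], 4: [5], 5: [6], 6: [7, 15], 7: [8, 10],
    8: [9], 9: [10], 10: [11, 13], 11: [12], 12: [13], 13: [14],
    14: [6], 15: [16], 16: [],
}


def data_dependences(def_use, succ):
    """Return the set of (definition site, use site) edges."""
    preds = defaultdict(list)
    for node, targets in succ.items():
        for target in targets:
            preds[target].append(node)

    reach_in = {node: set() for node in succ}
    reach_out = {node: set() for node in succ}
    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for node in succ:
            defs = def_use[node][0]
            incoming = set()
            for pred in preds[node]:
                incoming |= reach_out[pred]
            # A new definition of a variable kills the older ones.
            survivors = {(var, site) for var, site in incoming if var not in defs}
            outgoing = survivors | {(var, node) for var in defs}
            if incoming != reach_in[node] or outgoing != reach_out[node]:
                reach_in[node], reach_out[node] = incoming, outgoing
                changed = True

    return {
        (site, node)
        for node in succ
        for var, site in reach_in[node]
        if var in def_use[node][1]
    }


EDGES = data_dependences(DEF_USE, SUCC)
print((2, 7) in EDGES, (2, 8) in EDGES)  # True False
```

The result agrees with the slide: the definition of max at statement 2 reaches the use at 7, while statement 8 does not use max at all.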


Program Dependence Graph (PDG)


A program dependence graph is formed by combining data and control dependencies between nodes.

[Figure: PDG of the 16-statement sample program, with data dependency and control dependency edges.]

Any problems within this PDG?

Slicing Example

 1  main( )
 2  {
 3    int i, sum;
 4    sum = 0;
 5    i = 1;
 6    while (i <= 10)
 7    {
 8      sum = sum + 1;
 9      ++i;
10    }
11    cout << sum;
12    cout << i;
13  }

An Example Program & its slice w.r.t. <12, i>
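Once a PDG exists, a backward slice is simply the set of nodes reachable backwards from the slicing criterion. The sketch below assumes a hand-built dependence table `DEPENDS_ON` for the example program (a hypothetical encoding; the declaration on line 3 and the braces are left out), and `backward_slice` is an illustrative helper name rather than anything defined in the lecture.

```python
# Hypothetical hand-built PDG for the example program; each statement maps to
# the statements it depends on (data and control edges combined).
DEPENDS_ON = {
    4: set(),        # sum = 0
    5: set(),        # i = 1
    6: {5, 9},       # while (i <= 10): uses i
    8: {4, 8, 6},    # sum = sum + 1: uses sum, controlled by 6
    9: {5, 9, 6},    # ++i: uses i, controlled by 6
    11: {4, 8},      # cout << sum
    12: {5, 9},      # cout << i
}


def backward_slice(depends_on, criterion):
    """Collect every statement reachable backwards from the criterion."""
    in_slice, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt not in in_slice:
            in_slice.add(stmt)
            worklist.extend(depends_on[stmt])
    return in_slice


print(sorted(backward_slice(DEPENDS_ON, 12)))  # [5, 6, 9, 12]
print(sorted(backward_slice(DEPENDS_ON, 11)))  # [4, 5, 6, 8, 9, 11]
```

For the criterion <12, i> the statements on sum (4, 8, 11) drop out, whereas a slice on sum at line 11 still needs the loop counter because the loop controls how often line 8 runs.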


PDG of the Example Program


[Figure: PDG of the example program, showing control dependence edges, data dependence edges, and the slice point.]


Loops
read(n);
i := n;
sum := 0;
product := 1;
while (i > 0) {
  sum := sum + i;
  product := product * i;
  i := i - 1;
}
write(sum);
write(product);

Static Backward slicing example
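As a hedged illustration of what "executable" means for a static backward slice, the Python sketch below transcribes the loop program and a slice taken with respect to product at write(product). The function names are made up for this example, read(n) is replaced by a parameter, and the two writes become return values; the point is only that the slice, run on its own, computes the same product as the full program for every input tried.

```python
def original(n):
    """The full loop program: returns the values written for sum and product."""
    i = n
    total = 0
    product = 1
    while i > 0:
        total = total + i
        product = product * i
        i = i - 1
    return total, product


def slice_on_product(n):
    """Backward static slice with respect to product at write(product)."""
    i = n
    product = 1
    while i > 0:
        product = product * i
        i = i - 1
    return product


# The slice preserves the value of product for each input checked here.
for n in range(0, 8):
    assert original(n)[1] == slice_on_product(n)
print(slice_on_product(5))  # 120
```

The statements on sum are removed because product never depends on them, while the loop counter stays because it controls the loop.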



Forward Slice (static)

Note: It is not necessarily value preserving - meaning the value for the variable in the Slice might not be the same as in the original program.


Slicing Forward Static

Objective: determine which parts of a program are affected by a modification to the variable specified in the slicing criterion.
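A forward static slice can be sketched as forward reachability over the dependence edges. The example below reuses the small sum/i program from the Slicing Example slide; the table `AFFECTS` is a hypothetical hand encoding (statement to the statements that depend on it), and `forward_slice` is an illustrative helper name.

```python
# Hypothetical PDG for the small sum/i example, stored as
# statement -> statements that depend on it (data and control edges).
AFFECTS = {
    4: {8, 11},       # sum = 0 reaches the uses of sum
    5: {6, 9, 12},    # i = 1 reaches the uses of i
    6: {8, 9},        # the loop predicate controls the loop body
    8: {8, 11},       # sum = sum + 1
    9: {6, 9, 12},    # ++i
    11: set(),        # cout << sum
    12: set(),        # cout << i
}


def forward_slice(affects, criterion):
    """Collect every statement reachable forwards from the criterion."""
    in_slice, worklist = set(), [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt not in in_slice:
            in_slice.add(stmt)
            worklist.extend(affects[stmt])
    return in_slice


print(sorted(forward_slice(AFFECTS, 4)))  # [4, 8, 11]
print(sorted(forward_slice(AFFECTS, 5)))  # [5, 6, 8, 9, 11, 12]
```

Changing the initial value of sum touches only the statements on sum, while changing the initial value of i can affect every later statement because i drives the loop predicate.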


Controversial statement:

Forward slicing provides more meaningful insights compared to backward slicing?

Question: Yes or No? Justify your answer.


Slicing classifications
Types of slices
- Static
- Dynamic

Direction of slicing
- Backward
- Forward

Executability of slice
- Executable
- Closure

Levels of slices
- Intraprocedural
- Interprocedural

Executable vs. non-executable slice


Dynamic Program Slicing


Dynamic slicing was originally introduced by Korel and Laski in 1988. A dynamic slice is an executable part of a program P whose behavior is identical, for the same program input, to that of the original program with respect to a variable v at some execution position.
[Figure: Program P and its dynamic slice; for a specific program input (execution), the value of v in P equals the value of v in the slice.]

Slicing Properties

Dynamic Slicing

- Computed for a single input scenario
- Deterministic instead of probabilistic
- Useful for applications that are input driven (debugging, testing)
- Slicing criterion <i, p, v>: program input i, execution position p, variable v


Two Major Dynamic Slicing Categories


Execution trace based algorithms:
Require first the recording of an execution trace and then compute a dynamic slice based on the recorded execution trace.


Non-Execution trace based algorithms:


Compute the dynamic slice during run-time without requiring any major recording of the program execution.


Program Execution Trace


Sample program:
 1  input (n, a);
 2  max := a[1];
 3  min := a[1];
 4  i := 2;
 5  s := 0;
 6  while i <= n do begin
 7    if max < a[i] then begin
 8      max := a[i];
 9      s := max;
      end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min;
      end;
13    output (s);
14    i := i + 2;
    end;
15  output (max);
16  output (min);

Execution trace for n = 2, a = (1, 2)

Notation: X^k denotes statement X executed at position k of the trace.

 1^1    input (n, a);
 2^2    max := a[1];
 3^3    min := a[1];
 4^4    i := 2;
 5^5    s := 0;
 6^6    i <= n
 7^7    max < a[i]
 8^8    max := a[i];
 9^9    s := max;
10^10   min > a[i]
13^11   output (s);
14^12   i := i + 2;
 6^13   i <= n
15^14   output (max);
16^15   output (min);
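Recording such a trace amounts to logging each statement number as it executes. The following Python sketch instruments a transcription of the sample program by hand; `run_with_trace` is a hypothetical name, the loop condition is assumed to be `i <= n`, and the 1-indexed array of the slides is modelled by a list with an unused slot at index 0.

```python
def run_with_trace(n, a):
    """Run the sample program and record the executed statement numbers.

    `a` is 1-indexed in the slides, so index 0 of the list is padding.
    """
    trace, output = [], []
    trace.append(1)                # input(n, a)
    trace.append(2)
    maximum = a[1]
    trace.append(3)
    minimum = a[1]
    trace.append(4)
    i = 2
    trace.append(5)
    s = 0
    while True:
        trace.append(6)            # while i <= n
        if not i <= n:
            break
        trace.append(7)            # if max < a[i]
        if maximum < a[i]:
            trace.append(8)
            maximum = a[i]
            trace.append(9)
            s = maximum
        trace.append(10)           # if min > a[i]
        if minimum > a[i]:
            trace.append(11)
            minimum = a[i]
            trace.append(12)
            s = minimum
        trace.append(13)
        output.append(s)           # output(s)
        trace.append(14)
        i = i + 2
    trace.append(15)
    output.append(maximum)         # output(max)
    trace.append(16)
    output.append(minimum)         # output(min)
    return trace, output


trace, output = run_with_trace(2, [None, 1, 2])
print(trace)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]
print(output)  # [2, 2, 1]
```

The recorded statement sequence matches the trace on the slide: statements 11 and 12 never execute for this input, and statement 6 appears twice.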


Dynamic Dependency Graph

Execution trace based Algorithms


Backward Algorithm
- Original dynamic slicing algorithm presented by Korel and Laski in 1988.
- Based on a recorded execution trace for an input x.
- Walks the execution trace backwards to derive dynamic data and control dependencies.
- Creates an individual node in the PDG for each executed statement.
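The backward walk over a recorded trace can be sketched as follows. This is a simplified dependence-closure illustration, not Korel and Laski's full algorithm for executable slices: it follows, for each trace entry, the most recent earlier definition of every used variable and the most recent execution of the enclosing predicate. The tables `DEF_USE` and `CONTROL_PARENT` are hypothetical encodings of the sample program, arrays are treated as whole variables, and the "most recent predicate" rule is an assumption that only holds for structured code.

```python
# Trace for n = 2, a = (1, 2), copied from the execution-trace slide.
TRACE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 6, 15, 16]

# statement -> (variables defined, variables used); arrays are treated whole.
DEF_USE = {
    1: ({"n", "a"}, set()), 2: ({"max"}, {"a"}), 3: ({"min"}, {"a"}),
    4: ({"i"}, set()), 5: ({"s"}, set()), 6: (set(), {"i", "n"}),
    7: (set(), {"max", "a", "i"}), 8: ({"max"}, {"a", "i"}),
    9: ({"s"}, {"max"}), 10: (set(), {"min", "a", "i"}),
    11: ({"min"}, {"a", "i"}), 12: ({"s"}, {"min"}), 13: (set(), {"s"}),
    14: ({"i"}, {"i"}), 15: (set(), {"max"}), 16: (set(), {"min"}),
}

# Hypothetical table: statement -> predicate it is directly nested under.
CONTROL_PARENT = {7: 6, 10: 6, 13: 6, 14: 6, 8: 7, 9: 7, 11: 10, 12: 10}


def dynamic_slice(trace, def_use, control_parent, position):
    """Statements that the trace entry at `position` (0-based) depends on."""
    needed, worklist = set(), [position]
    while worklist:
        pos = worklist.pop()
        if pos in needed:
            continue
        needed.add(pos)
        stmt = trace[pos]
        # Dynamic data dependence: most recent earlier definition of each use.
        for var in def_use[stmt][1]:
            for earlier in range(pos - 1, -1, -1):
                if var in def_use[trace[earlier]][0]:
                    worklist.append(earlier)
                    break
        # Dynamic control dependence: most recent execution of the predicate.
        parent = control_parent.get(stmt)
        if parent is not None:
            for earlier in range(pos - 1, -1, -1):
                if trace[earlier] == parent:
                    worklist.append(earlier)
                    break
    return sorted({trace[pos] for pos in needed})


# Criterion: output(s), the 11th trace entry (index 10).
print(dynamic_slice(TRACE, DEF_USE, CONTROL_PARENT, 10))
# [1, 2, 4, 6, 7, 8, 9, 13]
```

Because this sketch only closes over dependencies, it can return fewer statements than an executable dynamic slice, which must also keep whatever is needed for the slice to run and terminate on the same input.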


Backward Algorithm
Program execution for n = 2, a = (1, 2), at statement 15

Static and Dynamic slice for variable s


Static Slice:

 1  input (n, a);
 2  max := a[1];
 3  min := a[1];
 4  i := 2;
 5  s := 0;
 6  while i <= n do begin
 7    if max < a[i] then begin
 8      max := a[i];
 9      s := max;
      end;
10    if min > a[i] then begin
11      min := a[i];
12      s := min;
      end;
13    output (s);
14    i := i + 2;
    end;

Dynamic Slice:

 1  input (n, a);
 2  max := a[1];
 4  i := 2;
 5  s := 0;
 6  while i <= n do begin
 7    if max < a[i] then begin
 8      max := a[i];
 9      s := max;
      end;
13    output (s);
14    i := i + 2;
    end;

Dynamic slice for variable s on input n = 2, a = (1, 2).


Another Dynamic Slice Example


Static vs. Dynamic Slice


Any problems?
Q: How many nodes do we have in a Dynamic Dependency Graph? A: ???
Q: How many dynamic slices can we compute? A: ???

Q: Any suggestion on how to reduce the complexity? A: ???


Dynamic Forward Slicing


Algorithm based on removable blocks


- Presented by Korel in 1994 and extended later on.
- Execution trace based.
- Overcomes limitations of dependency-based algorithms with respect to unstructured programs.
- Uses data dependencies.
- Uses removable blocks instead of control dependencies.
- All blocks are initially marked as removable; the algorithm then identifies the blocks that are not removable.


Challenge: Complexity of Dynamic Slicing

Block   Trace    Statement
B1      1^1      input (n, a);
B2      2^2      max := a[1];
B3      3^3      min := a[1];
B4      4^4      i := 2;
B5      5^5      s := 0;
B6      6^6      i <= n
B7      7^7      max < a[i]
B8      8^8      max := a[i];
B9      9^9      s := max;
B10     10^10    min > a[i]
B13     13^11    output (s);
B14     14^12    i := i + 2;
B6      6^13     i <= n
B15     15^14    output (max);

A removable block is informally:

The smallest part of the program text that can be removed during a slice computation without violating the syntactical correctness of the program, e.g., loops, if/then/else statements, assignment statements, goto statements, and break statements.

Sample program

Please note: the variables a[] and n are omitted to reduce the complexity of the table.


Challenges: Slicing unstructured programs



- Explicit control transfer statements (goto, return, exit, break, continue) complicate the construction of the control set.
- A conservative solution: if a goto statement has a nonempty relevant set, include the goto and its target in the slice.
- An alternative approach: look for labeled statements in the slice, then include the goto statements that branch to these labels.

Challenges: Arrays, Records, and Pointers (mainly static slicing)



Arrays:

- Simple approach: treat each array assignment as both a definition and a use.
- Problem: this is too conservative.
- To determine whether a use of a[g(j)] depends on a definition of a[f(i)], we need to test whether f(i) can be equal to g(j).
- This is undecidable in general but can be solved for some expression types.
- The solutions are one-sided: they can determine that f(i) and g(j) cannot be equal, but give no information otherwise.
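One well-known instance of such a one-sided test is the GCD test from dependence analysis; the slides do not name a specific test, so its use here is an assumption. The sketch restricts the subscripts to the affine forms f(i) = a*i + b and g(j) = c*j + d with integer coefficients, ignores loop bounds, and uses the hypothetical helper name `may_alias`.

```python
from math import gcd


def may_alias(a, b, c, d):
    """One-sided test for subscripts f(i) = a*i + b and g(j) = c*j + d.

    Returns False only when a*i + b == c*j + d has no integer solution,
    so the two array accesses can never touch the same element.
    True means "no information": the accesses may or may not overlap.
    """
    divisor = gcd(a, c)
    if divisor == 0:               # both subscripts are constants
        return b == d
    return (d - b) % divisor == 0


print(may_alias(2, 0, 2, 1))  # False: a[2*i] never equals a[2*j + 1]
print(may_alias(2, 0, 4, 2))  # True: possibly the same element
```

A False answer lets the slicer drop the dependence between the definition and the use; a True answer forces it to keep the conservative assumption.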

Records:

Easy: treat record.field as record_field

Pointers:

Hard: requires points-to analysis


Example Procedural