Professional Documents
Culture Documents
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
F O U R T H E D I T I O N
Cast of characters
Programmer needs to develop a working solution. Student might play any or all of these roles someday.
Running time
As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will ariseBy what course of calculation can these results be arrived at by the machine in the shortest time? Charles Babbage (1864)
Analytic Engine
3
client gets poor performance because programmer did not understand performance characteristics
Break down waveform of N samples into periodic components. Applications: DVD, JPEG, MRI, astrophysics, . Brute force: N steps. FFT algorithm: N log N steps, enables new technology.
2
time
64T
quadratic
32T
16T 8T
linearithmic linear
1K 2K 4K 8K
5
size
Simulate gravitational interactions among N bodies. Brute force: N steps. Barnes-Hut algorithm: N log N steps, enables new research.
2
time
64T
quadratic
32T
16T 8T
linearithmic linear
1K 2K 4K 8K
6
size
The challenge
Q. Will my program be able to solve a large practical input?
Observe some feature of the natural world. Hypothesize a model that is consistent with the observations. Predict events using the hypothesis. Verify the predictions by making further observations. Validate by repeating until the hypothesis and observations agree.
Principles.
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
Example: 3-SUM
3-SUM. Given N distinct integers, how many triples sum to exactly zero?
a[k] 10 -10 0 10
sum 0 0 0 0
public class ThreeSum { public static int count(int[] a) { int N = a.length; int count = 0; for (int i = 0; i < N; i++) for (int j = i+1; j < N; j++) for (int k = j+1; k < N; k++) if (a[i] + a[j] + a[k] == 0) count++; return count; } public static void main(String[] args) { int[] a = In.readInts(args[0]); StdOut.println(count(a)); } }
11
4039
12
(part of stdlib.jar )
double elapsedTime()
public static void main(String[] args) { int[] a = In.readInts(args[0]); Stopwatch stopwatch = new Stopwatch(); StdOut.println(ThreeSum.count(a)); double time = stopwatch.elapsedTime(); client code }
13
Empirical analysis
Run the program for various input sizes and measure running time.
14
Empirical analysis
Run the program for various input sizes and measure running time.
15
Data analysis
Standard plot. Plot running time T (N) vs. input size N.
standard plot
50
log-log plot
51.2 25.6
40
12.8
30
lg (T(N ))
1K 2K 4K 8K
20
10
.2 .1
Data analysis
Log-log plot. Plot running time T (N) vs. input size N using log-log scale.
log-log plot
51.2 25.6 12.8
lg (T(N ))
8K
1K
2K
4K
8K
lg N
power law slope
Predictions.
validates hypothesis!
18
Doubling hypothesis
Doubling hypothesis. Quick way to estimate b in a power-law relationship. Run program, doubling the size of the input.
N 250 500 1,000 2,000 4,000 8,000 time (seconds) 0.0 0.0 0.1 0.8 6.4 51.1 4.8 6.9 7.7 8.0 8.0
ratio
Hypothesis. Running time is about a N b with b = lg ratio. Caveat. Cannot identify logarithmic factors with doubling hypothesis.
19
Doubling hypothesis
Doubling hypothesis. Quick way to estimate b in a power-law relationship. Q. How to estimate a (assuming we know b) ? A. Run the program (for a sufficient large value of N) and solve for a.
51.1 = a 80003
51.0 51.1
a = 0.998 10 10
Experimental algorithmics
System independent effects.
Hardware: CPU, memory, cache, Software: compiler, interpreter, garbage collector, System: operating system, network, other apps,
Bad news. Difficult to get precise measurements. Good news. Much easier and cheaper than other sciences.
e.g., can run huge number of experiments
21
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
Need to analyze program to determine set of operations. Cost depends on machine, compiler. Frequency depends on algorithm, input data.
operation integer add integer multiply integer divide floating-point add floating-point multiply floating-point divide sine arctangent ...
nanoseconds 2.1 2.4 5.4 4.6 4.2 13.5 91.3 129.0 ...
25
operation variable declaration assignment statement integer compare array element access array length 1D array allocation 2D array allocation string length substring extraction string concatenation
example int a a = b a < b a[i] a.length new int[N] new int[N][N] s.length() s.substring(N/2, N) s + t
nanoseconds c1 c2 c3 c4 c5 c6 N c7 N c8 c9 c10 N
2
Example: 1-SUM
Q. How many instructions as a function of input size N ?
int count = 0; for (int i = 0; i < N; i++) if (a[i] == 0) count++;
operation variable declaration assignment statement less than compare equal to compare array access increment
frequency
2 2 N+1 N N N to 2 N
27
Example: 2-SUM
Q. How many instructions as a function of input size N ?
int count = 0; for (int i = 0; i < N; i++) for (int j = i+1; j < N; j++) if (a[i] + a[j] == 0) count++;
0 + 1 + 2 + . . . + (N 1) = = 1 N (N 2 N 2 1)
operation variable declaration assignment statement less than compare equal to compare array access increment
frequency
N+2 N+2 (N + 1) (N + 2) N (N 1)
tedious to count exactly
N (N 1) N (N 1) to N (N 1)
28
It is convenient to have a measure of the amount of work involved in a computing process, even though it be a very crude one. We may count up the number of times that various elementary operations are applied in the whole process and then given them various weights. We might, for instance, count the number of additions, subtractions, multiplications, divisions, recording of numbers, and extractions of figures from tables. In the case of computing with matrices most of the work consists of multiplications and writing down numbers, and we shall therefore only attempt to count the number of multiplications and recordings. Alan Turing
29
Downloaded
operation variable declaration assignment statement less than compare equal to compare array access increment
frequency
N+2 N+2 (N + 1) (N + 2) N (N 1) N (N 1) N (N 1) to N (N 1)
cost model = array accesses (we assume compiler/JVM do not optimize array accesses away!)
30
Estimate running time (or memory) as a function of input size N. Ignore lower order terms.
when N is large, terms are negligible when N is small, we don't care Ex 1. Ex 2. Ex 3. N 3 + 20 N + 16
N
3
N 3/6
~ N3 ~ N
3
166,666,667 N 3/6 N 2/2 + N /3 166,167,000 N 1,000 Leading-term approximation
+ 100 N
2
4/3
+ 56
N3 - N
+ N
~ N3
discard lower-order terms (e.g., N = 1000: 500 thousand vs. 166 million)
lim
f (N ) = 1 g( N )
31
Estimate running time (or memory) as a function of input size N. Ignore lower order terms.
when N is large, terms are negligible when N is small, we don't care
operation variable declaration assignment statement less than compare equal to compare array access increment
frequency
tilde notation
N+2 N+2 (N + 1) (N + 2) N (N 1) N (N 1) N (N 1) to N (N 1)
~N ~N ~ N2 ~ N2 ~ N2 ~ N2 to ~ N2
32
Example: 2-SUM
Q. Approximately how many array accesses as a function of input size N ?
int count = 0; for (int i = 0; i < N; i++) for (int j = i+1; j < N; j++) if (a[i] + a[j] == 0) count++;
"inner loop"
0 + 1 + 2 + . . . + (N
1) = =
A. ~ N 2 array accesses.
1 N (N 2 N 2
1)
Bottom line. Use cost model and tilde notation to simplify counts.
33
Example: 3-SUM
Q. Approximately how many array accesses as a function of input size N ?
int count = 0; for (int i = 0; i < N; i++) for (int j = i+1; j < N; j++) for (int k = j+1; k < N; k++) if (a[i] + a[j] + a[k] == 0) count++;
"inner loop"
N 3
A. ~ N 3 array accesses.
N (N 1 3 N 6
1)(N 3!
2)
Bottom line. Use cost model and tilde notation to simplify counts.
34
Ex 1. 1 + 2 + + N.
i=1
i
N
x dx
x=1
1 2 N 2
Ex 2. 1k + 2k + + N k.
i
i=1
k x=1
xk dx
1 N k+1 k+1
1 i
x=1
1 dx = ln N x
x=1
y =x
dz dy dx
z =y
1 3 N 6
35
Formulas can be complicated. Advanced mathematics might be required. Exact models best left for experts.
costs (depend on machine, compiler)
TN = c1 A + c2 B + c3 C + c4 D + c5 E
A= B= C= D= E=
array access integer add integer compare increment variable assignment
frequencies (depend on algorithm, input)
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
time
1, log N, N, N log N, N
2,
3,
and
2N
logarithmic constant
500K
size
log-log plot
exponential
cubi c
tic
dra
hm r it ea lin
qua
64T
time
8T 4T 2T T
logarithmic constant
1K 2K 4K 8K
size
lin
512K
ea
ic
512T
name
description
example
T(2N) / T(N)
constant
a = b + c;
statement
log N
logarithmic
divide in half
binary search
~1
linear
loop
N log N
linearithmic
mergesort
~2
N2
quadratic
for (int i = 0; i < N; i++) for (int j = 0; j < N; j++) { ... } for (int i = 0; i < N; i++) for (int j = 0; j < N; j++) for (int k = 0; k < N; k++) { ... }
double loop
N3
cubic
triple loop
2N
exponential
exhaustive search
T(N)
40
Bottom line. Need linear or linearithmic alg to keep pace with Moore's law.
41
6
0
13
1
14
2
25
3
33
4
43
5
51
6
53
7
64
8
72
9
84
10
93
11
95
12
96
13
97
14
lo
hi
42
First binary search published in 1946; first bug-free one in 1962. Bug in Java's Arrays.binarySearch() discovered in 2006.
public static int binarySearch(int[] a, int key) { int lo = 0, hi = a.length-1; while (lo <= hi) { int mid = lo + (hi - lo) / 2; if (key < a[mid]) hi = mid - 1; else if (key > a[mid]) lo = mid + 1; else return mid; } return -1; }
Invariant. If key appears in the array a[], then a[lo] key a[hi].
43
Pf sketch.
T (N) T (N / 2) + 1 T (N / 4) + 1 + 1 T (N / 8) + 1 + 1 + 1 ... T (N / N) + 1 + 1 + + 1 = 1 + lg N
44
Step 1: Step 2:
5 10 30 40
(-20, -10)
N2
log N.
Step 1: Step 2:
45
Comparing programs
Hypothesis. The sorting-based N 2 log N algorithm for 3-SUM is significantly faster in practice than the brute-force N 3 algorithm.
ThreeSum.java
ThreeSumDeluxe.java
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
Types of analyses
Best case. Lower bound on cost.
Types of analyses
Best case. Lower bound on cost. Worst case. Upper bound on cost. Average case. Expected cost.
Need to understand input to effectively process it. Approach 1: design for the worst case. Approach 2: randomize, depend on probabilistic guarantee.
50
Theory of algorithms
Goals.
Suppress details in analysis: analyze to within a constant factor. Eliminate variability in input model by focusing on the worst case.
Optimal algorithm.
Performance guarantee (to within a constant factor) for any input. No algorithm can provide a better performance guarantee.
51
notation
provides
example
shorthand for
used to
N2
Big Theta asymptotic order of growth
(N2)
10 N2 5 N2 + 22 N log N + 3N 10 N2
classify algorithms
Big Oh
O(N2)
100 N 22 N log N + 3 N N2
Big Omega
(N2)
and larger
(N2)
N5 N3 + 22 N log N + 3 N
52
Establish difficulty of a problem and develop optimal algorithms. Ex. 1-SUM = Is there a 0 in the array?
Upper bound. A specific algorithm.
Ex. Brute-force algorithm for 1-SUM: Look at every array entry. Running time of the optimal algorithm for 1-SUM is O(N ).
Lower bound. Proof that no algorithm can do better.
Ex. Have to examine all N entries (any unexamined one might be 0). Running time of the optimal algorithm for 1-SUM is (N ).
Optimal algorithm.
Lower bound equals upper bound (to within a constant factor). Ex. Brute-force algorithm for 1-SUM is optimal: its running time is (N ).
53
Ex. Brute-force algorithm for 3-SUM. Running time of the optimal algorithm for 3-SUM is O(N
).
54
Ex. Improved algorithm for 3-SUM. Running time of the optimal algorithm for 3-SUM is O(N
Lower bound. Proof that no algorithm can do better.
logN ).
Ex. Have to examine all N entries to solve 3-SUM. Running time of the optimal algorithm for solving 3-SUM is (N ).
Open problems.
Optimal algorithm for 3-SUM? Subquadratic algorithm for 3-SUM? Quadratic lower bound for 3-SUM?
55
Lower the upper bound (discover a new algorithm). Raise the lower bound (more difficult).
Golden Age of Algorithm Design.
1970s-. Steadily decreasing upper bounds for many important problems. Many known optimal algorithms.
Caveats.
Overly pessimistic to focus on worst case? Need better than to within a constant factor to predict performance.
56
Commonly-used notations
notation provides example shorthand for used to provide approximate model
10 N2
Tilde leading term
~ 10 N2
10 N2 + 22 N log N 10 N2 + 2 N + 37 N2
Big Theta
(N2)
10 N2 5 N2 + 22 N log N + 3N 10 N2
classify algorithms
Big Oh
O(N2)
100 N 22 N log N + 3 N N2
Big Omega
(N2)
N5 N3 + 22 N log N + 3 N
Common mistake. Interpreting big-Oh as an approximate model. This course. Focus on approximate models: use Tilde-notation
57
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
Basics
Bit. 0 or 1. Byte. 8 bits. Megabyte (MB). 1 million or 220 bytes. Gigabyte (GB). 1 billion or 230 bytes.
NIST most computer scientists
60
bytes 1 1 2 4 4 8
bytes 2N + 24 4N + 24 8N + 24
type 8 char[][]
for primitive types
int[][] double[][]
24 bytes
public class Integer Padding. { Each object uses a private int x; ... }
multiple of 8 bytes.
object overhead int x value bytes padding of memory.
32 bytes
object overhead day month year padding
4 bytes (int)
values
int
counter object
public class Counter {
32 bytes
62
40 bytes
object overhead value offset count hash padding
reference
int
values
substring example
String genome = "CGCCTGGCGTCTGTAC";
Primitive type: 4 bytes for int, 8 bytes for double, Object reference: 8 bytes. Array: 24 bytes + memory for each array entry. Object: 16 bytes + memory for each instance variable
+ 8 bytes if inner class (for pointer to enclosing class).
Padding:
Shallow memory usage: Don't count referenced objects. Deep memory usage: If array entry or instance variable is a reference, add memory (recursively) for referenced object.
64
Example
Q. How much memory does WeightedQuickUnionUF use as a function of N ? Use tilde notation to simplify your answer.
public class WeightedQuickUnionUF { private int[] id; private int[] sz; private int count; public WeightedQuickUnionUF(int N) { id = new int[N]; sz = new int[N]; for (int i = 0; i < N; i++) id[i] = i; for (int i = 0; i < N; i++) sz[i] = 1; } ... }
16 bytes (object overhead) 8 + (4N + 24) each reference + int[] array 4 bytes (int) 4 bytes (padding)
A. 8 N + 88 ~ 8 N bytes.
65
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
R OBERT S EDGEWICK | K EVIN W AYNE
http://algs4.cs.princeton.edu
Execute program to perform experiments. Assume power law and formulate a hypothesis for running time. Model enables us to make predictions.
Mathematical analysis.
Analyze algorithm to count frequency of operations. Use tilde notation to simplify analysis. Model enables us to explain behavior.
Scientific method.
Algorithms
1.4 A NALYSIS
introduction observations
OF
A LGORITHMS
Algorithms
F O U R T H E D I T I O N