Chapter 17 of The Art of Computer Systems Performance Analysis covers $2^k$
factorial designs. A $2^k$ factorial design determines the effect of $k$ factors
where each factor has two levels or alternatives. The $2^k$ factorial design
is useful at the start of a performance study because it needs only $2^k$
experiments, far fewer than a full factorial design over every level of every
factor. The effect of most factors is unidirectional: performance either keeps
increasing or keeps decreasing as the factor moves from its minimum to its
maximum. A $2^k$ factorial design therefore only examines the min and max values
for each factor. If the range of performance for a factor is small, you can
likely stop with just those 2 levels, saving the time and cost of a full
factorial design.
$2^2$ factorial design
We'll use the example from the book. Performance in millions of instructions per
second (MIPS) is measured for two cache sizes, 1 KB and 2 KB, and two memory
sizes, 4 MB and 16 MB.
| Cache size (KB) | MIPS (memory size = 4 MB) | MIPS (memory size = 16 MB) |
|---|---|---|
| 1 | 15 | 45 |
| 2 | 25 | 75 |
We'll define two variables: $x_A$ is -1 for 4 MB of memory and 1 for 16 MB of
memory; $x_B$ is -1 for 1 KB of cache and 1 for 2 KB of cache.
The performance $y$ in MIPS is regressed on $x_A$ and $x_B$ with a nonlinear
regression:

$$y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$$
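Substituting 1 and -1 for $x_A$ and $x_B$, along with the observed value of $y$,
yields one equation per experiment:

$$
\begin{aligned}
15 &= q_0 - q_A - q_B + q_{AB} \\
45 &= q_0 + q_A - q_B - q_{AB} \\
25 &= q_0 - q_A + q_B - q_{AB} \\
75 &= q_0 + q_A + q_B + q_{AB}
\end{aligned}
$$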
Solving these four equations for the four unknowns gives the regression
equation for $y$:

$$y = 40 + 20 x_A + 10 x_B + 5 x_A x_B$$
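As a quick sanity check, the fitted equation can be evaluated at each of the four design points and compared with the table. The sketch below is illustrative only; the names `observed` and `predict` are made up for this example and do not come from the book.

```python
# Observations from the table, keyed by (x_A, x_B).
# x_A = -1 for 4 MB of memory, +1 for 16 MB.
# x_B = -1 for 1 KB of cache,  +1 for 2 KB.
observed = {
    (-1, -1): 15,
    (+1, -1): 45,
    (-1, +1): 25,
    (+1, +1): 75,
}


def predict(x_a, x_b):
    """Evaluate the fitted model y = 40 + 20*x_A + 10*x_B + 5*x_A*x_B."""
    return 40 + 20 * x_a + 10 * x_b + 5 * x_a * x_b


for (x_a, x_b), y in observed.items():
    print(f"x_A={x_a:+d} x_B={x_b:+d} observed={y} predicted={predict(x_a, x_b)}")
```

With four observations and four coefficients, the model reproduces every measurement exactly, so there is no residual error in this design.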
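In general, the $q_i$'s can be expressed in terms of the four observations
$y_i$. The model equations are:

$$
\begin{aligned}
y_1 &= q_0 - q_A - q_B + q_{AB} \\
y_2 &= q_0 + q_A - q_B - q_{AB} \\
y_3 &= q_0 - q_A + q_B - q_{AB} \\
y_4 &= q_0 + q_A + q_B + q_{AB}
\end{aligned}
$$

Solving them for the coefficients gives:

$$
\begin{aligned}
q_0 &= \tfrac{1}{4}(y_1 + y_2 + y_3 + y_4) \\
q_A &= \tfrac{1}{4}(-y_1 + y_2 - y_3 + y_4) \\
q_B &= \tfrac{1}{4}(-y_1 - y_2 + y_3 + y_4) \\
q_{AB} &= \tfrac{1}{4}(y_1 - y_2 - y_3 + y_4)
\end{aligned}
$$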
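The closed-form expressions above amount to multiplying the observations by a row of signs and dividing by 4. A minimal sketch of that calculation follows; the function name `effects_2x2` and the `signs` dictionary are hypothetical names chosen for this example.

```python
def effects_2x2(y1, y2, y3, y4):
    """Return (q0, qA, qB, qAB) for a 2^2 design.

    Observation order follows the text:
      y1: x_A=-1, x_B=-1    y2: x_A=+1, x_B=-1
      y3: x_A=-1, x_B=+1    y4: x_A=+1, x_B=+1
    """
    ys = (y1, y2, y3, y4)
    # One row of signs per coefficient, taken from the formulas above.
    signs = {
        "q0": (+1, +1, +1, +1),
        "qA": (-1, +1, -1, +1),
        "qB": (-1, -1, +1, +1),
        "qAB": (+1, -1, -1, +1),
    }
    return tuple(
        sum(s * y for s, y in zip(row, ys)) / 4 for row in signs.values()
    )


q0, qA, qB, qAB = effects_2x2(15, 45, 25, 75)
print(q0, qA, qB, qAB)  # 40.0 20.0 10.0 5.0
```

Note that the `qAB` row is the element-wise product of the `qA` and `qB` rows, which is why the interaction term needs no extra experiments.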
Allocation of variation
The sample variance of $y$ is:

$$s_y^2 = \frac{\sum_{i=1}^{2^2} (y_i - \bar{y})^2}{2^2 - 1}$$
The numerator of the fraction is the total variation of $y$, or SST. The
variation consists of 3 parts:

$$
\begin{aligned}
SST &= 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2 \\
SST &= SSA + SSB + SSAB
\end{aligned}
$$
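To see the decomposition at work on the cache and memory example, the sketch below computes SST directly from the observations and compares it with the sum of the three parts. It assumes the effects $q_A = 20$, $q_B = 10$, and $q_{AB} = 5$ from the fitted model; the variable names are illustrative.

```python
ys = [15, 45, 25, 75]        # the four MIPS observations
q_a, q_b, q_ab = 20, 10, 5   # effects from the fitted model

n = len(ys)                  # 2^2 = 4 observations
mean = sum(ys) / n

# Total variation: numerator of the sample variance.
sst = sum((y - mean) ** 2 for y in ys)

# Variation attributed to each factor and to their interaction.
ssa = n * q_a ** 2
ssb = n * q_b ** 2
ssab = n * q_ab ** 2

print(sst, ssa + ssb + ssab)
for name, ss in (("A (memory)", ssa), ("B (cache)", ssb), ("AB (interaction)", ssab)):
    print(f"{name}: {100 * ss / sst:.1f}% of total variation")
```

Both totals come to 2100. Dividing each part by SST gives the share of variation it accounts for: roughly 76% for memory size, 19% for cache size, and 5% for their interaction.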