Chapter One 1.5~1.11
1.5 Technologies for Building Processors and Memory
- transistor
  An on/off switch controlled by an electric signal.
- very large-scale integrated (VLSI) circuit
  A device containing hundreds of thousands to millions of transistors.
- die
  The individual rectangular sections that are cut from a wafer, more informally known as chips.
- yield
  The percentage of good dies from the total number of dies on the wafer.
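The yield definition can be turned into a one-line calculation. This is a minimal sketch; the function name and the wafer numbers are made up for illustration.

```python
def yield_percent(good_dies: int, total_dies: int) -> float:
    """Yield: percentage of good dies out of all dies on the wafer."""
    if total_dies <= 0:
        raise ValueError("total_dies must be positive")
    if not 0 <= good_dies <= total_dies:
        raise ValueError("good_dies must be between 0 and total_dies")
    return 100.0 * good_dies / total_dies


# Hypothetical wafer: 300 dies in total, 240 of them pass testing.
print(yield_percent(240, 300))  # 80.0
```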
1.6 Performance
Keywords:
- response time (execution time)
- throughput (bandwidth)
1.6.1 Defining Performance
[Figure: performance formula]
[Figure: n times]
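The usual textbook definition is that performance is the reciprocal of execution time, and that "X is n times faster than Y" means performance_X / performance_Y = execution_time_Y / execution_time_X = n. The sketch below assumes those two relations; the function names and the timings are illustrative only.

```python
def performance(execution_time_s: float) -> float:
    """Performance as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(time_x_s: float, time_y_s: float) -> float:
    """Return n such that X is n times faster than Y."""
    return time_y_s / time_x_s


# Made-up timings: machine A needs 10 s, machine B needs 15 s for the same program.
print(times_faster(10.0, 15.0))  # 1.5
```

So A is 1.5 times faster than B, which is the same ratio you get by dividing the two performance values.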
1.6.2 Measuring Performance
- clock cycle
  Also called tick, clock tick, clock period, clock, or cycle. The time for one clock period, usually of the processor clock, which runs at a constant rate.
- clock period
  The length of each clock cycle.
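Clock rate is the inverse of the clock period, so the two can be converted back and forth. A small sketch, with hypothetical function names and an example period of 250 ps:

```python
def rate_from_period(clock_period_s: float) -> float:
    """Clock rate in Hz for a given clock period in seconds."""
    return 1.0 / clock_period_s


def period_from_rate(clock_rate_hz: float) -> float:
    """Clock period in seconds for a given clock rate in Hz."""
    return 1.0 / clock_rate_hz


period = 250e-12  # 250 picoseconds
print(f"{rate_from_period(period) / 1e9:.1f} GHz")  # 4.0 GHz
print(f"{period_from_rate(4e9) * 1e12:.0f} ps")     # 250 ps
```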
1.6.3 CPU Performance and Its Factors
[Figure: f2.PNG]
1.6.4 Instruction Performance
- clock cycles per instruction (CPI)
  Average number of clock cycles per instruction for a program or program fragment.
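Because CPI is an average, it can be computed from per-class cycle counts, and the total cycle count is then instruction count times CPI. The sketch below assumes three hypothetical instruction classes A, B and C; the class names, CPIs and counts are invented for illustration.

```python
def average_cpi(class_cpi: dict[str, float], class_counts: dict[str, int]) -> float:
    """Average CPI = total clock cycles / total instructions."""
    total_cycles = sum(class_cpi[name] * count for name, count in class_counts.items())
    total_instructions = sum(class_counts.values())
    return total_cycles / total_instructions


def cpu_clock_cycles(instruction_count: int, cpi: float) -> float:
    """CPU clock cycles = instruction count x average CPI."""
    return instruction_count * cpi


# Hypothetical instruction classes and a short code sequence.
cpi_by_class = {"A": 1, "B": 2, "C": 3}
counts = {"A": 2, "B": 1, "C": 2}

cpi = average_cpi(cpi_by_class, counts)
print(cpi)                       # 2.0
print(cpu_clock_cycles(5, cpi))  # 10.0
```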
1.6.5 The Classic CPU Performance Equation
[Figure: CPU time example]
[Figure: execution time formula]
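The classic CPU performance equation is CPU time = instruction count x CPI x clock cycle time, or equivalently instruction count x CPI / clock rate. A minimal sketch with made-up inputs (the function name and numbers are assumptions, not taken from these notes):

```python
def cpu_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 10 billion instructions, CPI of 2.0, 4 GHz clock.
print(cpu_time_s(10e9, 2.0, 4e9))  # 5.0
```

Halving the CPI or doubling the clock rate each halve the CPU time, as long as the other two factors stay the same.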
Elaboration:
Although you might expect that the minimum CPI is 1.0, as we’ll see in Chapter 4, some processors fetch and execute multiple instructions per clock cycle. To reflect that approach, some designers invert CPI to talk about IPC, or instructions per clock cycle. If a processor executes on average 2 instructions per clock cycle, then it has an IPC of 2 and hence a CPI of 0.5.
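IPC and CPI are simply reciprocals of each other. A tiny sketch with hypothetical helper names:

```python
def cpi_from_ipc(ipc: float) -> float:
    """CPI is the reciprocal of IPC."""
    return 1.0 / ipc


def ipc_from_cpi(cpi: float) -> float:
    """IPC is the reciprocal of CPI."""
    return 1.0 / cpi


print(cpi_from_ipc(2.0))  # 0.5
print(ipc_from_cpi(0.5))  # 2.0
```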
Elaboration:
Although clock cycle time has traditionally been fixed, to save energy or temporarily boost performance, today’s processors can vary their clock rates, so we would need to use the average clock rate for a program. For example, the Intel Core i7 will temporarily increase clock rate by about 10% until the chip gets too warm. Intel calls this Turbo mode.
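One way to get an average clock rate is to weight each rate by the time spent at it, which is total cycles divided by total time. The sketch below assumes the run can be split into phases with a constant rate each; the phase lengths and rates are invented.

```python
def average_clock_rate_hz(phases: list[tuple[float, float]]) -> float:
    """Time-weighted average clock rate over (duration_s, clock_rate_hz) phases."""
    total_time = sum(duration for duration, _ in phases)
    total_cycles = sum(duration * rate for duration, rate in phases)
    return total_cycles / total_time


# Hypothetical run: 8 s at a 3.0 GHz base clock, then 2 s boosted by 10%.
phases = [(8.0, 3.0e9), (2.0, 3.3e9)]
print(f"{average_clock_rate_hz(phases) / 1e9:.2f} GHz")  # 3.06 GHz
```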
1.7 The Power Wall
The dominant technology for integrated circuits is called CMOS (complementary metal oxide semiconductor). For CMOS, the primary source of energy consumption is so-called dynamic energy—that is, energy that is consumed when transistors switch states from 0 to 1 and vice versa.
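These notes do not write out the formula, but the standard textbook approximation is that dynamic power is proportional to 1/2 x capacitive load x voltage squared x switching frequency. The sketch below assumes that relation and shows why lowering voltage pays off so strongly; the scale factors are illustrative.

```python
def dynamic_power(capacitive_load: float, voltage: float, switching_frequency: float) -> float:
    """Approximate dynamic power: 1/2 x C x V^2 x f."""
    return 0.5 * capacitive_load * voltage**2 * switching_frequency


def power_ratio(load_scale: float, voltage_scale: float, frequency_scale: float) -> float:
    """New power / old power when each factor is scaled by the given amount."""
    return load_scale * voltage_scale**2 * frequency_scale


# Hypothetical new chip: 85% of the capacitive load, with voltage and
# frequency each reduced by 15%.
print(f"{power_ratio(0.85, 0.85, 0.85):.2f}")  # 0.52
```

The new chip would use roughly half the dynamic power of the old one, mostly because voltage enters the formula squared.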
Elaboration:
Although dynamic energy is the primary source of energy consumption in CMOS, static energy consumption occurs because of leakage current that flows even when a transistor is off. In servers, leakage is typically responsible for 40% of the energy consumption. Thus, increasing the number of transistors increases power dissipation, even if the transistors are always off. A variety of design techniques and technology innovations are being deployed to control leakage, but it’s hard to lower voltage further.
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
Parallelism has always been critical to performance in computing, but it was often hidden.
To get a real speedup from explicitly parallel programs, we must:
- schedule the sub-tasks
- balance the load evenly to get the desired speedup
- reduce communication and synchronization overhead
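A rough way to see why load balance and overhead matter: assume the parallel run finishes when the slowest worker finishes, plus a fixed coordination overhead. This is a deliberately simple model, not a claim about any real scheduler; the function name and the numbers are assumptions.

```python
def parallel_speedup(worker_loads_s: list[float], overhead_s: float = 0.0) -> float:
    """Speedup over a serial run under a simple max-load-plus-overhead model."""
    serial_time = sum(worker_loads_s)                 # one processor does everything
    parallel_time = max(worker_loads_s) + overhead_s  # slowest worker sets the pace
    return serial_time / parallel_time


# 100 s of work split over four workers.
print(parallel_speedup([25.0, 25.0, 25.0, 25.0]))                    # 4.0
print(parallel_speedup([40.0, 20.0, 20.0, 20.0]))                    # 2.5
print(f"{parallel_speedup([25.0, 25.0, 25.0, 25.0], 10.0):.2f}")     # 2.86
```

An uneven split or 10 s of communication and synchronization overhead both pull the speedup well below the ideal factor of 4.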
1.10 Fallacies and Pitfalls
Pitfall: Expecting the improvement of one aspect of a computer to increase overall performance by an amount proportional to the size of the improvement.
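This pitfall is usually explained with Amdahl's law: execution time after an improvement = (time affected by the improvement / amount of improvement) + time unaffected. A sketch under that assumption, with made-up timings:

```python
def time_after_improvement(affected_s: float, unaffected_s: float, improvement: float) -> float:
    """Amdahl's law: only the affected part of the run time shrinks."""
    return affected_s / improvement + unaffected_s


# Hypothetical program: 100 s total, 80 s of it in the part being improved.
total_s, affected_s = 100.0, 80.0
unaffected_s = total_s - affected_s

new_time = time_after_improvement(affected_s, unaffected_s, 4.0)
print(new_time)            # 40.0
print(total_s / new_time)  # 2.5
```

Making 80% of the program 4 times faster gives only a 2.5 times overall speedup, and no improvement to that part can push the run time below the 20 s that it does not touch.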
Fallacy: Computers at low utilization use little power.
Fallacy: Designing for performance and designing for energy efficiency are unrelated goals.
Pitfall: Using a subset of the performance equation as a performance metric.
MIPS (million instructions per second):
A measurement of program execution speed based on the number of millions of instructions. MIPS is computed as the instruction count divided by the product of the execution time and 10^6.
MIPS = Instruction count / (Execution time × 10^6)
Finally, and most importantly, if a new program executes more instructions but each instruction is faster, MIPS can vary independently from performance!
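A quick calculation shows how a higher MIPS rating can go with a slower program. The two program versions below are hypothetical; only the MIPS formula comes from the definition above.

```python
def mips(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


# Hypothetical versions of the same program on the same machine.
# Version B executes more (simpler) instructions and takes longer overall.
version_a = {"instructions": 10e9, "time_s": 5.0}
version_b = {"instructions": 15e9, "time_s": 6.0}

print(mips(version_a["instructions"], version_a["time_s"]))  # 2000.0
print(mips(version_b["instructions"], version_b["time_s"]))  # 2500.0
```

Version B has the higher MIPS rating, yet it takes 6 s instead of 5 s, so by execution time it is the slower one.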
1.11 Concluding Remarks
[Figure: performance.PNG]
Two of the key ideas are exploiting parallelism in the program, typically today via multiple processors, and exploiting locality of accesses to a memory hierarchy, typically via caches.