Computer Abstractions and Techno
1.6 Performance
To talk about performance, we first need to define it: what is performance?
Although this question might seem simple, an analogy with passenger airplanes shows how subtle the question of performance can be.
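Even in the airplane example, performance has several dimensions, such as passenger capacity, cruising range, and cruising speed.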
Back to computers, two metrics are commonly used:
response time Also called execution time. The total time required for the computer to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, CPU execution time, and so on.
throughput Also called bandwidth. Another measure of performance, it is the number of tasks completed per unit time.
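The difference between the two metrics can be sketched with a small calculation. The `Task` record, the helper names, and the timings below are hypothetical, chosen only to show that throughput can improve while response time gets worse.

```python
from dataclasses import dataclass


@dataclass
class Task:
    start: float  # seconds, when the task was submitted (hypothetical data)
    end: float    # seconds, when the task completed


def mean_response_time(tasks):
    """Average time from submission to completion, in seconds."""
    return sum(t.end - t.start for t in tasks) / len(tasks)


def throughput(tasks):
    """Tasks completed per second over the whole observation window."""
    span = max(t.end for t in tasks) - min(t.start for t in tasks)
    return len(tasks) / span


# Four tasks run one after another, 2 s each.
serial = [Task(0.0, 2.0), Task(2.0, 4.0), Task(4.0, 6.0), Task(6.0, 8.0)]
# Four tasks run side by side; each is slower, but all finish within 4 s.
overlapped = [Task(0.0, 4.0), Task(0.0, 4.0), Task(0.0, 4.0), Task(0.0, 4.0)]

for name, tasks in (("serial", serial), ("overlapped", overlapped)):
    print(name, mean_response_time(tasks), throughput(tasks))
```

In this made-up scenario the overlapped run doubles throughput (0.5 to 1.0 tasks per second) while the response time of each task also doubles (2 s to 4 s).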
In practice, the performance metric that matters depends on the scenario:
Different applications are sensitive to different aspects of the performance of a computer system. Many applications, especially those running on servers, depend as much on I/O performance, which, in turn, relies on both hardware and software. Total elapsed time measured by a wall clock is the measurement of interest. In some application environments, the user may care about throughput, response time, or a complex combination of the two (e.g., maximum throughput with a worst-case response time). To improve the performance of a program, one must have a clear definition of what performance metric matters and then proceed to find performance bottlenecks by measuring program execution and looking for the likely bottlenecks. In the following chapters, we will describe how to search for bottlenecks and improve performance in various parts of the system.
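In the first few chapters we are mainly concerned with response time, so performance is taken to be the reciprocal of execution time.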
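As an equation, for a computer X (the subscript notation is an assumption, following common textbook usage):

```latex
\[
\text{Performance}_X = \frac{1}{\text{Execution time}_X}
\]
```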
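So if X is n times as fast as Y, the ratio of their performances equals n, which is also the inverse ratio of their execution times.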
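Written out, and assuming the usual definition of performance as the reciprocal of execution time, the relation is:

```latex
\[
\frac{\text{Performance}_X}{\text{Performance}_Y}
= \frac{\text{Execution time}_Y}{\text{Execution time}_X}
= n
\]
```

For instance, with illustrative numbers: if X runs a program in 10 seconds and Y takes 15 seconds, then n = 15/10 = 1.5, so X is 1.5 times as fast as Y.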
With response time, however, we also have to ask: response as seen by whom?
The most straightforward definition of time is called wall clock time, response time, or elapsed time. These terms mean the total time to complete a task, including disk accesses, memory accesses, input/output (I/O) activities, operating system overhead—everything.
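The problem is that the time a program takes to run can vary with how busy the CPU and other hardware are, so response time needs to be broken down further: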
CPU execution time or simply CPU time, which recognizes this distinction, is the time the CPU spends computing for this task and does not include time spent waiting for I/O or running other programs. (Remember, though, that the response time experienced by the user will be the elapsed time of the program, not the CPU time.) CPU time can be further divided into the CPU time spent in the program, called user CPU time, and the CPU time spent in the operating system performing tasks on behalf of the program, called system CPU time. Differentiating between system and user CPU time is difficult to do accurately, because it is often hard to assign responsibility for operating system activities to one user program rather than another and because of the functionality differences between operating systems.
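The distinction between elapsed time, user CPU time, and system CPU time can be observed directly. The following is a minimal sketch using Python's standard `time.perf_counter` and `os.times`; the helper `measure` and the workload `busy_then_wait` are hypothetical names introduced here for illustration, and the reported CPU times have coarse resolution on some platforms.

```python
import os
import time


def measure(fn):
    """Run fn() once; return (elapsed, user_cpu, system_cpu) in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    fn()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return (
        wall_end - wall_start,              # wall-clock (elapsed) time
        cpu_end.user - cpu_start.user,      # CPU time spent in the program
        cpu_end.system - cpu_start.system,  # CPU time spent in the OS for it
    )


def busy_then_wait():
    sum(i * i for i in range(200_000))  # CPU-bound work
    time.sleep(0.2)                     # waiting; uses almost no CPU time


elapsed, user_cpu, system_cpu = measure(busy_then_wait)
print(f"elapsed={elapsed:.3f}s user={user_cpu:.3f}s system={system_cpu:.3f}s")
```

Because the sleep counts toward elapsed time but not toward CPU time, the elapsed figure should come out noticeably larger than the sum of the user and system figures.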
Correspondingly:
We will use the term system performance to refer to elapsed time on an unloaded system and CPU performance to refer to user CPU time.
This chapter focuses mainly on CPU performance.
To see how "time" is realized in engineering terms, we need to introduce the following concepts:
clock period The length of each clock cycle. Designers refer to the length of a clock period both as the time for a complete clock cycle (e.g., 250 picoseconds, or 250 ps) and as the clock rate (e.g., 4 gigahertz, or 4 GHz), which is the inverse of the clock period.
clock cycle Also called tick, clock tick, clock period, clock, or cycle. The time for one clock period, usually of the processor clock, which runs at a constant rate.
clock cycles per instruction (CPI) Average number of clock cycles per instruction for a program or program fragment.
instruction count The number of instructions executed by the program.
These definitions lead to a formula for CPU time:
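Assuming the intended relation is the standard one built from the terms defined above (clock cycle time, CPI, and instruction count), it can be written as:

```latex
\[
\text{CPU time}
= \text{CPU clock cycles} \times \text{Clock cycle time}
= \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}
\]
```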
Or, equivalently:
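Since the clock rate is the inverse of the clock cycle time, the same quantity can presumably be expressed with the clock rate in the denominator:

```latex
\[
\text{CPU time}
= \frac{\text{CPU clock cycles}}{\text{Clock rate}}
= \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}
\]
```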
Example:
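As a stand-in illustration, the sketch below applies the relation CPU time = instruction count × CPI / clock rate to two hypothetical machines. The function name `cpu_time` and all numbers are assumptions made up for this illustration; they are not taken from the source.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: the same instruction count on both machines.
INSTRUCTIONS = 10_000_000_000

# Machine A: 4 GHz clock (250 ps cycle), average CPI of 2.0.
time_a = cpu_time(INSTRUCTIONS, cpi=2.0, clock_rate_hz=4_000_000_000)
# Machine B: 2 GHz clock (500 ps cycle), average CPI of 1.5.
time_b = cpu_time(INSTRUCTIONS, cpi=1.5, clock_rate_hz=2_000_000_000)

# How many times as fast A is compared with B.
speedup = time_b / time_a

print(f"A: {time_a} s, B: {time_b} s, A is {speedup} times as fast as B")
```

With these made-up figures, A takes 5.0 s and B takes 7.5 s, so A is 1.5 times as fast as B even though its CPI is higher, because its clock rate is twice as high.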
Decomposing it once more from another angle (much like a DuPont analysis):
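Assuming the decomposition meant here is the usual chain of ratios, in which each factor's units cancel against the next, it can be sketched as:

```latex
\[
\text{Time}
= \frac{\text{Seconds}}{\text{Program}}
= \frac{\text{Instructions}}{\text{Program}}
\times \frac{\text{Clock cycles}}{\text{Instruction}}
\times \frac{\text{Seconds}}{\text{Clock cycle}}
\]
```

The three factors correspond to the instruction count, the CPI, and the clock cycle time defined earlier.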