[Common] Clean Architecture (1)
The goal of software architecture is to minimize the human resources required to build and maintain the required system.
Architects create an architecture that allows those features and functions to be easily developed, easily modified, and easily extended.
PARADIGM
The three paradigms included in this overview chapter are structured programming, object-oriented programming, and functional programming.
- Structured Programming
Dijkstra showed that the use of unrestrained jumps (goto statements) is harmful to program structure. He replaced those jumps with the more familiar if/then/else and do/while/until constructs.
Structured programming imposes discipline on direct transfer of control.
- Object-oriented Programming
Programmers noticed that the function call stack frame in the ALGOL language could be moved to a heap, thereby allowing local variables declared by a function to exist long after the function returned.
The function became a constructor for a class, the local variables became instance variables, and the nested functions became methods.
Object-oriented programming imposes discipline on indirect transfer of control.
- Functional Programming
A foundational notion of functional programming is immutability: the values of symbols do not change. This effectively means that a functional language has no assignment statement. Most functional languages do, in fact, have some means to alter the value of a variable, but only under very strict discipline.
Functional programming imposes discipline upon assignment.
None of them adds new capabilities. Each imposes some kind of extra discipline that is negative in its intent. The paradigms tell us what not to do, more than they tell us what to do.
STRUCTURED PROGRAMMING
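A structured program is built from a few simple, hierarchical control-flow structures: sequence, selection, and repetition.
Sequence is the normal order of execution: once one instruction finishes, the next one runs. Selection handles branching: when the program reaches a point where it cannot simply continue in a straight line, it picks one branch to execute based on a specific condition. Selection comes in single-branch, two-branch, and multi-branch forms. Repetition, also called a loop, executes the same steps over and over.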
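The three constructs can be seen side by side in a small function. This is only an illustrative sketch; the function name `classify_and_total` and the sample data are made up for the example.

```python
def classify_and_total(values):
    """Sum the non-negative values and count the negative ones,
    using only sequence, selection, and repetition (no jumps)."""
    total = 0              # sequence: statements run one after another
    negatives = 0
    for v in values:       # repetition: the body runs once per element
        if v < 0:          # selection: exactly one of the two branches runs
            negatives += 1
        else:
            total += v
    return total, negatives


print(classify_and_total([3, -1, 4, -5, 9]))  # (16, 2)
```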
Software development is not a mathematical endeavor, even though it seems to manipulate mathematical constructs. Rather, software is like a science. We show correctness by failing to prove incorrectness, despite our best efforts.
Such proofs of incorrectness can be applied only to provable programs. A program that is not provable—due to unrestrained use of goto, for example—cannot be deemed correct no matter how many tests are applied to it.
Structured programming forces us to recursively decompose a program into a set of small provable functions. We can then use tests to try to prove those small provable functions incorrect. If such tests fail to prove incorrectness, then we deem the functions to be correct enough for our purposes.
OBJECT-ORIENTED PROGRAMMING
It is difficult to accept that OO depends on strong encapsulation. Indeed, many OO languages have little or no enforced encapsulation.
It’s fair to say that while OO languages did not give us something completely brand new, they did make the masquerading of data structures significantly more convenient. (In other words, OO made inheritance easier; before OO, a similar effect could be achieved without inheritance by explicitly casting one data structure to another.)
To recap: We can award no point to OO for encapsulation, and perhaps a half-point for inheritance. So far, that’s not such a great score.
Polymorphism also existed before OO. For example, when an operating system reads input, the read function that actually gets called depends on the input device, that is, on which function pointer is installed. So it may seem that OO gave us nothing new here either, but that’s not quite correct. OO languages may not have given us polymorphism, but they have made it much safer and much more convenient.
What is OO? There are many opinions and many answers to this question. To the software architect, however, the answer is clear: OO is the ability, through the use of polymorphism, to gain absolute control over every source code dependency in the system. It allows the architect to create a plugin architecture, in which modules that contain high-level policies are independent of modules that contain low-level details. The low-level details are relegated to plugin modules that can be deployed and developed independently from the modules that contain high-level policies. In short, OO uses polymorphism so that the upper layers do not depend directly on the lower layers, and each can be developed independently.
[Figure: Dependency inversion]
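A minimal sketch of the plugin idea described above, under assumptions: the names `MessageSink`, `OverdueNotice`, `ConsoleSink`, and `MemorySink` are hypothetical and do not come from the notes. The high-level policy owns the interface, and the low-level details plug into it.

```python
from abc import ABC, abstractmethod


class MessageSink(ABC):
    """Interface owned by the high-level policy (hypothetical name)."""

    @abstractmethod
    def send(self, text: str) -> None:
        ...


class OverdueNotice:
    """High-level policy: decides what to say and knows nothing about devices."""

    def __init__(self, sink: MessageSink) -> None:
        self._sink = sink

    def notify(self, customer: str, days_late: int) -> None:
        self._sink.send(f"{customer}: payment is {days_late} days late")


class ConsoleSink(MessageSink):
    """Low-level detail; could live in its own separately deployed module."""

    def send(self, text: str) -> None:
        print(text)


class MemorySink(MessageSink):
    """Another plugin that just remembers what was sent."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, text: str) -> None:
        self.sent.append(text)


OverdueNotice(ConsoleSink()).notify("Ada", 3)
```

Swapping `ConsoleSink` for `MemorySink` requires no change to `OverdueNotice`: both source code dependencies point at `MessageSink`, while control flows from the policy into whichever plugin was supplied.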
FUNCTIONAL PROGRAMMING
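Suppose you are asked to print the squares of the first 25 integers. You might write it like this:
[Figure: example programs in Java and Clojure]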
The Java program uses a mutable variable—a variable that changes state during the execution of the program. That variable is i—the loop control variable. No such mutable variable exists in the Clojure program. In the Clojure program, variables like x are initialized, but they are never modified.
This leads us to a surprising statement: Variables in functional languages do not vary.
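The contrast can be approximated in Python. This is a stand-in sketch, not the Java and Clojure listings the text refers to; both function names are invented for the illustration.

```python
def squares_with_mutation(n):
    """Imperative style: `i` and `result` change state while the loop runs."""
    result = []
    i = 0
    while i < n:
        result.append(i * i)
        i += 1  # the loop control variable is reassigned on every pass
    return result


def squares_without_mutation(n):
    """Functional style: `x` is bound once per call and never reassigned."""
    return tuple(map(lambda x: x * x, range(n)))


print(squares_with_mutation(25)[:5])     # [0, 1, 4, 9, 16]
print(squares_without_mutation(25)[:5])  # (0, 1, 4, 9, 16)
```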
All race conditions, deadlock conditions, and concurrent update problems are due to mutable variables. You cannot have a race condition or a concurrent update problem if no variable is ever updated. You cannot have deadlocks without mutable locks.
In other words, all the problems that we face in concurrent applications—all the problems we face in applications that require multiple threads, and multiple processors—cannot happen if there are no mutable variables.
The point is that well-structured applications will be segregated into those components that do not mutate variables and those that do. This kind of segregation is supported by the use of appropriate disciplines to protect those mutated variables.
Architects would be wise to push as much processing as possible into the immutable components, and to drive as much code as possible out of those components that must allow mutation.
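Consider a banking application that handles deposits. How could it be implemented immutably? At the very least the balance has to be mutable. But what if we do not store the balance at all and store only the transactions? We record every single transaction, and when the balance is queried we replay all the transactions from the very first one to compute it.
The drawback is that you need a lot of storage to keep every transaction, and enough computing power to aggregate them. The benefit is that, of the four CRUD operations, only Create and Retrieve remain. With no Update or Delete, concurrent-update problems are avoided, and the application becomes immutable and functional.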
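The idea of storing transactions instead of a balance can be sketched as follows. The names `Transaction`, `record`, and `balance` are hypothetical, and a tuple stands in for durable storage.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Transaction:
    """An immutable record of one deposit (positive) or withdrawal (negative)."""
    account: str
    amount: int  # in cents, to avoid floating-point rounding


def record(log, tx):
    """Create: return a new log with tx appended; the old log is untouched."""
    return log + (tx,)


def balance(log, account):
    """Retrieve: replay every transaction for the account from the beginning."""
    return reduce(
        lambda total, tx: total + tx.amount if tx.account == account else total,
        log,
        0,
    )


log0 = ()
log1 = record(log0, Transaction("alice", 10_000))
log2 = record(log1, Transaction("alice", -2_500))
log3 = record(log2, Transaction("bob", 4_000))
print(balance(log3, "alice"))  # 7500
```

Nothing is ever updated or deleted: every earlier log (`log0`, `log1`, ...) is still valid, which is also why the storage cost grows with the number of transactions.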
DESIGN PRINCIPLES
The goal of the principles is the creation of mid-level software structures that:
• Tolerate change,
• Are easy to understand, and
• Are the basis of components that can be used in many software systems.
SOLID
• SRP: The Single Responsibility Principle. A class should only have a single responsibility, that is, only changes to one part of the software's specification should be able to affect the specification of the class.
• OCP: The Open-Closed Principle. "Software entities ... should be open for extension, but closed for modification."
• LSP: The Liskov Substitution Principle. "Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program."
• ISP: The Interface Segregation Principle. "Many client-specific interfaces are better than one general-purpose interface."
• DIP: The Dependency Inversion Principle. One should "depend upon abstractions, [not] concretions."
SRP
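Each class should have only one reason to change, and that reason is usually to satisfy a customer, so the principle could be restated as "each class should change on behalf of only one customer". But customers are not the only ones who ask for changes, so the SRP is better defined as: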
A module should be responsible to one, and only one, actor.
For example, if our code frequently has to change because of another team's PM, something is wrong.
The simplest definition of module is just a source file. In some cases a module is just a cohesive set of functions and data structures.
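A counterexample: an Employee class with three methods, calculatePay(), reportHours(), and save().
[Figure: SRP counterexample]
The calculatePay() method is specified by the accounting department, which reports to the CFO; the reportHours() method is specified and used by the human resources department, which reports to the COO; the save() method is specified by the database administrators (DBAs), who report to the CTO.
With all three methods in one class, a change the CFO requests to calculatePay() can affect the COO's reportHours(), which violates the SRP.
There is a second problem: if two teams separately handle the CFO's and the COO's requirements, their commits may conflict and need a merge, and a merge can introduce unexpected errors. (When a conflict stems from requirement changes, the conflict itself is a sign that the class does not follow the SRP.)
[Figure: solution]
One fix is to let three classes handle the three responsibilities while sharing the employee data. If three classes feel like too many for callers to deal with, wrap them with the Facade pattern: an EmployeeFacade that delegates to the three concrete classes.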
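A sketch of that split, under assumptions: only `EmployeeFacade` and the three method names come from the notes (the methods are spelled here in Python style); `EmployeeData`, `PayCalculator`, `HourReporter`, and `EmployeeSaver` are hypothetical names, and a dict stands in for the database.

```python
from dataclasses import dataclass


@dataclass
class EmployeeData:
    """Plain data shared by the three classes; it has no behaviour of its own."""
    name: str
    hourly_rate: float
    hours_worked: float


class PayCalculator:
    """Changes only when accounting (CFO) asks for a change."""

    def calculate_pay(self, e: EmployeeData) -> float:
        return e.hourly_rate * e.hours_worked


class HourReporter:
    """Changes only when human resources (COO) asks for a change."""

    def report_hours(self, e: EmployeeData) -> str:
        return f"{e.name}: {e.hours_worked:.1f} h"


class EmployeeSaver:
    """Changes only when the DBAs (CTO) ask for a change."""

    def __init__(self) -> None:
        self.rows = {}  # stand-in for a real database table

    def save(self, e: EmployeeData) -> None:
        self.rows[e.name] = (e.hourly_rate, e.hours_worked)


class EmployeeFacade:
    """Single entry point for callers; it only delegates."""

    def __init__(self) -> None:
        self.pay = PayCalculator()
        self.hours = HourReporter()
        self.saver = EmployeeSaver()

    def calculate_pay(self, e: EmployeeData) -> float:
        return self.pay.calculate_pay(e)

    def report_hours(self, e: EmployeeData) -> str:
        return self.hours.report_hours(e)

    def save(self, e: EmployeeData) -> None:
        self.saver.save(e)


facade = EmployeeFacade()
ada = EmployeeData("Ada", 50.0, 8.0)
print(facade.calculate_pay(ada))  # 400.0
print(facade.report_hours(ada))   # Ada: 8.0 h
```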
OCP
This is how the OCP works at the architectural level. Architects separate functionality based on how, why, and when it changes, and then organize that separated functionality into a hierarchy of components. Higher-level components in that hierarchy are protected from the changes made to lower-level components.
LSP
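A classic LSP counterexample: if Square is a subclass of Rectangle, you cannot substitute a Square everywhere a Rectangle is used, because a Rectangle's width and height may differ while a Square's may not:
[Figure: LSP counterexample]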
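The Square/Rectangle problem can be reproduced in a few lines. This is an illustrative sketch; the client function `stretch` and the setter names are assumptions made for the example.

```python
class Rectangle:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def set_width(self, w):
        self._width = w

    def set_height(self, h):
        self._height = h

    def area(self):
        return self._width * self._height


class Square(Rectangle):
    """Keeps both sides equal, which silently changes what the setters do."""

    def __init__(self, side):
        super().__init__(side, side)

    def set_width(self, w):
        self._width = self._height = w

    def set_height(self, h):
        self._width = self._height = h


def stretch(r):
    """Client written against Rectangle: it expects 5 x 2 = 10."""
    r.set_width(5)
    r.set_height(2)
    return r.area()


print(stretch(Rectangle(1, 1)))  # 10
print(stretch(Square(1)))        # 4, so Square is not substitutable here
```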
In the early years of the object-oriented revolution, we thought of the LSP as a way to guide the use of inheritance, as shown in the previous sections. However, over the years the LSP has morphed into a broader principle of software design that pertains to interfaces and implementations.
The interfaces in question can be of many forms. We might have a Java-style interface, implemented by several classes. Or we might have several Ruby classes that share the same method signatures. Or we might have a set of services that all respond to the same REST interface.
In all of these situations, and more, the LSP is applicable because there are users who depend on well-defined interfaces, and on the substitutability of the implementations of those interfaces.
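Sometimes, when a new requirement arrives, we simply add a parameter to an existing backend API, and the handler then has to branch on that parameter in many places. But what if the processing flow is completely different? Then the if-else is nothing more than an outer wrapper around two unrelated flows. In that case, use a separate interface instead of reusing the old one with an extra parameter.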
ISP
[Figure: ISP counterexample]
This dependence means that a change to the source code of op2 in OPS will force User1 to be recompiled and redeployed, even though nothing that it cared about has actually changed.
[Figure: solution]
Statically typed languages like Java force programmers to create declarations that users must import, or use, or otherwise include. In dynamically typed languages like Ruby and Python, such declarations don’t exist in source code. Instead, they are inferred at runtime. Thus there are no source code dependencies to force recompilation and redeployment. This is the primary reason that dynamically typed languages create systems that are more flexible and less tightly coupled than statically typed languages.
This fact could lead you to conclude that the ISP is a language issue, rather than an architecture issue.
In general, it is harmful to depend on modules that contain more than you need. This is obviously true for source code dependencies that can force unnecessary recompilation and redeployment, but it is also true at a much higher, architectural level. The most obvious problem with depending on parts you do not need is unnecessary recompilation: if library A depends on library B, and B depends on library C, then a change to C forces A to be rebuilt and may also introduce bugs. So, and not only from a compilation standpoint, you should not depend on parts you do not use.
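The shape of the segregated interfaces can be sketched with `typing.Protocol`. Two caveats: Python is dynamically typed, so the recompilation concern above does not arise and the narrow interface matters only to a static type checker and to readers; and the setup is an assumption, namely an `OPS` class offering `op1`, `op2`, and `op3` of which `User1` needs only `op1`. The names `Op1Only` and `FakeOp1` are hypothetical.

```python
from typing import Protocol


class Op1Only(Protocol):
    """The only operation User1 needs (hypothetical interface name)."""

    def op1(self) -> str:
        ...


class OPS:
    """The wide class; it happens to satisfy Op1Only structurally."""

    def op1(self) -> str:
        return "op1 result"

    def op2(self) -> str:
        return "op2 result"

    def op3(self) -> str:
        return "op3 result"


class User1:
    """Declares a dependency on Op1Only, not on OPS as a whole."""

    def __init__(self, ops: Op1Only) -> None:
        self._ops = ops

    def run(self) -> str:
        return self._ops.op1()


class FakeOp1:
    """Anything with an op1 method will do; op2 and op3 are irrelevant to User1."""

    def op1(self) -> str:
        return "fake op1"


print(User1(OPS()).run())      # op1 result
print(User1(FakeOp1()).run())  # fake op1
```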
DIP
The Dependency Inversion Principle (DIP) tells us that the most flexible systems are those in which source code dependencies refer only to abstractions, not to concretions.
Clearly, treating this idea as a rule is unrealistic, because software systems must depend on many concrete facilities. For example, the String class in Java is concrete, and it would be unrealistic to try to force it to be abstract. That is acceptable because the String class rarely changes: we tolerate those concrete dependencies because we know we can rely on them not to change.
It is the volatile concrete elements of our system that we want to avoid depending on. Those are the modules that we are actively developing, and that are undergoing frequent change.
[Figure: Dependency inversion]
The curved line is an architectural boundary. It separates the abstract from the concrete. All source code dependencies cross that curved line pointing in the same direction, toward the abstract side.
Note that the flow of control crosses the curved line in the opposite direction of the source code dependencies. The source code dependencies are inverted against the flow of control, which is why we refer to this principle as Dependency Inversion. In the figure, the arrows on one side of the curve point the opposite way from those on the other side; that is the inversion.
The way the dependencies cross that curved line in one direction, and toward more abstract entities, will become a new rule that we will call the Dependency Rule.
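One common way to keep every source code dependency pointing toward the abstract side is an abstract factory. The sketch below is an assumption about what such a structure could look like, not a transcription of the figure; all class names are hypothetical.

```python
from abc import ABC, abstractmethod


# ---- abstract side of the boundary ----
class Service(ABC):
    @abstractmethod
    def execute(self) -> str:
        ...


class ServiceFactory(ABC):
    @abstractmethod
    def make_service(self) -> Service:
        ...


class Application:
    """Names only abstractions; it never mentions a concrete class."""

    def __init__(self, factory: ServiceFactory) -> None:
        self._factory = factory

    def run(self) -> str:
        service = self._factory.make_service()  # control flows into concrete code
        return service.execute()


# ---- concrete side of the boundary ----
class ConcreteService(Service):
    def execute(self) -> str:
        return "done by ConcreteService"


class ConcreteServiceFactory(ServiceFactory):
    def make_service(self) -> Service:
        return ConcreteService()


# Only the composition root knows about the concrete classes.
print(Application(ConcreteServiceFactory()).run())  # done by ConcreteService
```

At runtime control passes from `Application` into `ConcreteService`, yet the concrete classes are the ones that refer to the abstractions in source code, which is the inversion the text describes.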
COMPONENT PRINCIPLES
If the SOLID principles tell us how to arrange the bricks into walls and rooms, then the component principles tell us how to arrange the rooms into buildings.
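In the very earliest programs, the programmer had to state which memory address the program would be loaded at and run from. If the program depended on a library, the library likewise had to be loaded at a specific address and read from there. As programs grew, memory inevitably ran short, and so relocatable binaries appeared.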
The compiler was changed to output binary code that could be relocated in memory by a smart loader. The loader would be told where to load the relocatable code. The relocatable code was instrumented with flags that told the loader which parts of the loaded data had to be altered to be loaded at the selected address. Usually this just meant adding the starting address to any memory reference addresses in the binary.
The linking loader allowed programmers to divide their programs up into separately compilable and loadable segments. This worked well when relatively small programs were being linked with relatively small libraries.
Eventually, the loading and the linking were separated into two phases. Programmers took the slow part—the part that did that linking—and put it into a separate application called the linker. The output of the linker was a linked relocatable that a relocating loader could load very quickly. This allowed programmers to prepare an executable using the slow linker, but then they could load it quickly, at any time.
These dynamically linked files, which can be plugged together at runtime, are the software components of our architectures. It has taken 50 years, but we have arrived at a place where component plugin architecture can be the casual default as opposed to the herculean effort it once was.
COMPONENT COHESION
Which classes belong in which components? In this chapter we will discuss the three principles of component cohesion:
• REP: The Reuse/Release Equivalence Principle
• CCP: The Common Closure Principle
• CRP: The Common Reuse Principle
REP
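The granule of reuse is the granule of release.
The REP asks us to think about a component's contents from the standpoint of reuse: either all the classes in a component are reusable, or none of them are. For example, in a Java EE project based on Model 2, we package all the DAO classes into one jar so that other projects built on the same database can reuse all of them. That DAO component should not contain any controller-layer (Servlet) or presentation-layer (JSP and the like) classes, because those classes cannot be reused along with the DAOs.
In short, the REP asks us to design components for reuse, so that all the classes in a component can be reused together: there should be no class that cannot be reused, and no class that can be reused only in some other context.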
CCP
Gather into components those classes that change for the same reasons and at the same times. Separate into different components those classes that change at different times and for different reasons. All the classes in a component should be closed together against the same kinds of change. A change that affects a component affects all the classes within it and no other components.
This is the Single Responsibility Principle restated for components. This principle is closely associated with the Open Closed Principle (OCP). We design our classes such that they are closed to the most common kinds of changes that we expect or have experienced. The CCP amplifies this lesson by gathering together into the same component those classes that are closed to the same types of changes.
CRP
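Don’t force users of a component to depend on things they don’t need. The classes in a component should be reused together: if you reuse one class in a component, you reuse them all.
If component A depends on only one class in component B, A may still have to change whenever B changes, even when A does not care about that change. In a large system this lets a change to one class set off a whole chain of changes. So depend only on the components you really need, and avoid depending on a large component of which you use only a small part. Using only a small part suggests that the library you depend on is not cohesive enough and could be split.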
Therefore the CRP tells us more about which classes shouldn’t be together than about which classes should be together. The CRP says that classes that are not tightly bound to each other should not be in the same component.
The CRP is the generic version of the ISP. The ISP advises us not to depend on classes that have methods we don’t use. The CRP advises us not to depend on components that have classes we don’t use.
COMPONENT COUPLING
Acyclic Dependencies Principle (ADP): Allow no cycles in the component dependency graph.
[Figure: acyclic dependency graph]
Notice one more thing: Regardless of which component you begin at, it is impossible to follow the dependency relationships and wind up back at that component. This structure has no cycles. It is a directed acyclic graph (DAG).
Now consider what happens when the team responsible for Presenters makes a new release of their component. It is easy to find out who is affected by this release; you just follow the dependency arrows backward. Thus View and Main will both be affected.
When the developers working on the Presenters component would like to run a test of that component, they just need to build their version of Presenters with the versions of the Interactors and Entities components that they are currently using. None of the other components in the system need be involved.
When it is time to release the whole system, the process proceeds from the bottom up. First the Entities component is compiled, tested, and released. Then the same is done for Database and Interactors. These components are followed by Presenters, View, Controllers, and then Authorizer. Main goes last. This process is very clear and easy to deal with. We know how to build the system because we understand the dependencies between its parts.
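The bottom-up release order and the "follow the arrows backward" rule can both be computed from the dependency graph. The sketch below uses `graphlib` from the standard library (Python 3.9+). The edge list is an assumption chosen to agree with the statements above, since the figure itself is not reproduced here; `build_order` and `affected_by` are hypothetical helper names.

```python
from graphlib import CycleError, TopologicalSorter

# component -> components it depends on (assumed edges, see lead-in)
DEPENDS_ON = {
    "Main": {"View", "Presenters", "Interactors", "Controllers", "Database", "Authorizer"},
    "View": {"Presenters"},
    "Presenters": {"Interactors"},
    "Controllers": {"Interactors"},
    "Authorizer": {"Interactors"},
    "Database": {"Interactors", "Entities"},
    "Interactors": {"Entities"},
    "Entities": set(),
}


def build_order(graph):
    """Return components dependencies-first; raises CycleError on a cycle."""
    return list(TopologicalSorter(graph).static_order())


def affected_by(graph, component):
    """Everything that depends, directly or transitively, on `component`."""
    affected = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for name, deps in graph.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


order = build_order(DEPENDS_ON)
print(order[0], order[-1])                           # Entities Main
print(sorted(affected_by(DEPENDS_ON, "Presenters"))) # ['Main', 'View']

# Making Entities depend on Authorizer closes a cycle; no build order exists.
cyclic = dict(DEPENDS_ON, Entities={"Authorizer"})
try:
    build_order(cyclic)
except CycleError:
    print("cycle detected")
```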
A dependency cycle makes unit testing and releasing very difficult and error prone. In addition, build issues grow geometrically with the number of modules, and it becomes hard to work out the order in which the components should be built.
- Solutions:
(1) Dependency Inversion Principle (DIP)
[Figure: depending on an interface]
(2) Create a new component that both Entities and Authorizer depend on. Move the class(es) that they both depend on into that new component.
[Figure: extracting a new component]
TOP-DOWN DESIGN
The issues we have discussed so far lead to an inescapable conclusion: The component structure cannot be designed from the top down. It is not one of the first things about the system that is designed, but rather evolves as the system grows and changes.
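Design mainly serves building and maintaining the system. At the very beginning there are no classes at all, so talking about component design is meaningless. Only later, as the classes multiply and problems pile up, do we need to solve them, and that is when principles such as the CRP and the ADP come into play.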
THE STABLE DEPENDENCIES PRINCIPLE
Any component that we expect to be volatile should not be depended on by a component that is difficult to change.
Stability is related to the amount of work required to make a change.
One sure way to make a software component difficult to change, is to make lots of other software components depend on it. A component with lots of incoming dependencies is very stable because it requires a great deal of work to reconcile any changes with all the dependent components.
How can we measure the stability of a component? One way is to count the number of dependencies that enter and leave that component. These counts will allow us to calculate the positional stability of the component.
• Fan-in: Incoming dependencies. This metric identifies the number of classes outside this component that depend on classes within the component.
• Fan-out: Outgoing dependencies. This metric identifies the number of classes inside this component that depend on classes outside the component.
• I: Instability: I = Fan-out ÷ (Fan-in + Fan-out). This metric has the range [0, 1]. I = 0 indicates a maximally stable component. I = 1 indicates a maximally unstable component.
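The more a component is depended on, and the fewer outgoing dependencies it has, the more stable it is, because it is harder to change.
The SDP says that the I metric of a component should be larger than the I metrics of the components that it depends on. That is, I metrics should decrease in the direction of dependency. Unstable components should depend on stable ones, not the other way around; otherwise the unstable component can no longer change even when it needs to.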
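The I metric and the SDP check are easy to compute from a dependency graph. In this sketch, fan-in and fan-out are counted per component rather than per class, which is a simplification, and the graph and its component names are made up for the illustration.

```python
def instability(fan_in, fan_out):
    """I = Fan-out / (Fan-in + Fan-out), in the range [0, 1]."""
    total = fan_in + fan_out
    if total == 0:
        raise ValueError("no dependencies in either direction; I is undefined")
    return fan_out / total


def component_instability(depends_on):
    """Compute I for every component of a {component: dependencies} graph."""
    result = {}
    for name, deps in depends_on.items():
        fan_out = len(deps)
        fan_in = sum(1 for other in depends_on.values() if name in other)
        result[name] = instability(fan_in, fan_out)
    return result


def sdp_violations(depends_on):
    """Edges (a, b) where a depends on b although b is less stable than a."""
    i = component_instability(depends_on)
    return sorted(
        (a, b) for a, deps in depends_on.items() for b in deps if i[a] < i[b]
    )


# Hypothetical graph: three clients depend on Stable, which depends on Flexible.
GRAPH = {
    "A": {"Stable"},
    "B": {"Stable"},
    "C": {"Stable"},
    "Stable": {"Flexible"},
    "Flexible": {"X", "Y"},
    "X": set(),
    "Y": set(),
}

print(instability(3, 1))      # 0.25
print(sdp_violations(GRAPH))  # [('Stable', 'Flexible')]
```

Here `Stable` has I = 0.25 but depends on `Flexible` with I = 2/3, so the dependency points toward the less stable component and is reported.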
The changeable components are on top and depend on the stable component at the bottom. If a dependency violates this, the DIP can still fix it: depend on a stable interface, and have the unstable component implement that interface.
STABLE ABSTRACTIONS PRINCIPLE
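The Stable Abstractions Principle (SAP) sets up a relationship between stability and abstractness. On the one hand, it says that a stable component should also be abstract so that its stability does not prevent it from being extended. On the other hand, it says that an unstable component should be concrete, since its instability allows the concrete code within it to be easily changed.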
Thus, if a component is to be stable, it should consist of interfaces and abstract classes so that it can be extended. The combination of the SDP and the SAP deals with components, and allows that a component can be partially abstract and partially stable.
The A metric is a measure of the abstractness of a component. Its value is simply the ratio of interfaces and abstract classes in a component to the total number of classes in the component.
• Nc: The number of classes in the component.
• Na: The number of abstract classes and interfaces in the component.
• A: Abstractness. A = Na ÷ Nc.
[Figure: Zones of exclusion]
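We cannot dictate where every component must sit in this coordinate system, but we can point out where components should not be. A component near (0,0) is concrete and heavily depended on, so it cannot easily be modified: it is very rigid, and we do not want components like that. A database is often in this position, which is why changing it is so painful. Some components, such as the system library's String, are also fully concrete and heavily depended on, but they genuinely rarely change.
So a component that rarely changes may sit near (0,0), but a volatile component should not be in this zone.
Now consider the area near (1,1): a component made up entirely of interfaces that nothing depends on and that has only outgoing dependencies. Such a component is useless, because it is all interfaces and nobody uses them.
The Main Sequence is the line connecting (1,0) and (0,1). A component that sits on the Main Sequence is not “too abstract” for its stability, nor is it “too unstable” for its abstractness.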
The most desirable position for a component is at one of the two endpoints of the Main Sequence.
This leads us to our last metric. If it is desirable for components to be on, or close, to the Main Sequence, then we can create a metric that measures how far away a component is from this ideal.
• D: Distance. D = |A+I–1| . The range of this metric is [0, 1]. A value of 0 indicates that the component is directly on the Main Sequence. A value of 1 indicates that the component is as far away as possible from the Main Sequence.
Any component that has a D value that is not near zero can be reexamined and restructured.
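We can compute the mean of D over all components and, if there is a problem (components that are concrete yet heavily depended on, or very abstract yet hardly depended on), rework the components whose D is too large. We can also compute the mean D for every release and treat a release whose value exceeds a threshold as one that needs rework.
Note, however, that D is not a cure-all; it only offers one method and one way of thinking. Not every component has to have D = 0, and a design in which D = 0 is not necessarily a good one.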
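The A and D metrics, and the per-release average described above, can be sketched as follows. The component names, their class counts, their I values, and the 0.3 threshold are all invented for the illustration.

```python
from statistics import mean


def abstractness(n_abstract, n_classes):
    """A = Na / Nc."""
    if n_classes == 0:
        raise ValueError("component has no classes; A is undefined")
    return n_abstract / n_classes


def distance(a, i):
    """D = |A + I - 1|; 0 means the component is on the Main Sequence."""
    return abs(a + i - 1)


# name: (Na, Nc, I), all hypothetical
COMPONENTS = {
    "Entities": (3, 4, 0.25),   # abstract and stable: on the Main Sequence
    "Database": (0, 10, 0.0),   # concrete and stable: near (0,0)
    "Contracts": (5, 5, 1.0),   # abstract but not depended on: near (1,1)
    "Web": (1, 4, 0.75),        # mostly concrete and unstable: on the Main Sequence
}

D = {
    name: distance(abstractness(na, nc), i)
    for name, (na, nc, i) in COMPONENTS.items()
}

THRESHOLD = 0.3  # assumed cut-off; pick whatever suits the project
print(D)
print(mean(D.values()))                                   # 0.5
print(sorted(n for n, d in D.items() if d > THRESHOLD))   # ['Contracts', 'Database']
```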