Cloud Computing Summary
Cloud Computing Week5
==========================
I finally finished the course, so here are some summary notes based on the final test.
1. Power usage effectiveness (PUE) is a measure of how efficiently a computer data center uses energy; specifically, it compares the total energy drawn by the facility with the energy used by the computing equipment alone (in contrast to cooling and other overhead). (reference)
PUE = Total Facility Energy / IT Equipment Energy
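As a quick worked example of the formula above, here is a minimal sketch in Python. The function name `pue` and the energy figures are made up for illustration; they are not from the course.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical numbers: the facility draws 1500 kWh, of which 1000 kWh reaches IT gear.
print(pue(1500.0, 1000.0))  # 1.5
```

A PUE of 1.0 would mean every unit of energy goes to the computing equipment; the closer the value is to 1.0, the less is spent on cooling and other overhead.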
2. One problem with Hadoop is that, because a job is divided into tasks across many nodes, a few slow nodes can hold up the rest of the program. (One slow node makes the whole job slow. How can this be solved? By introducing speculative execution.)
Tasks may be slow for various reasons, including hardware degradation or software misconfiguration, but the causes may be hard to detect since the tasks still complete successfully, albeit after a longer time than expected.
Hadoop doesn't try to diagnose and fix slow-running tasks; instead, it tries to detect when a task is running slower than expected and launches another, equivalent task as a backup. This is termed speculative execution of tasks. (reference)
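The idea of "detect a slow task, then launch a backup" can be sketched roughly as follows. This is only a toy illustration, not Hadoop's actual heuristic: the `Task` class, the `pick_backups` function, and the rule "slower than half the mean progress rate" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    task_id: str
    progress: float   # fraction complete, 0.0 to 1.0
    elapsed_s: float  # seconds since the task started


def progress_rate(task: Task) -> float:
    """Progress per second; 0.0 for a task that has not run yet."""
    return task.progress / task.elapsed_s if task.elapsed_s > 0 else 0.0


def pick_backups(tasks: List[Task], slowdown: float = 0.5) -> List[str]:
    """Return ids of unfinished tasks whose rate is below slowdown * mean rate.

    The threshold rule is an assumption for illustration only.
    """
    if not tasks:
        return []
    mean_rate = sum(progress_rate(t) for t in tasks) / len(tasks)
    return [
        t.task_id
        for t in tasks
        if t.progress < 1.0 and progress_rate(t) < slowdown * mean_rate
    ]


tasks = [Task("m1", 0.9, 60.0), Task("m2", 0.8, 60.0), Task("m3", 0.1, 60.0)]
print(pick_backups(tasks))  # ['m3']
```

Note that the sketch never asks why `m3` is slow. It only compares progress rates and nominates a backup copy, which matches the "don't diagnose, just duplicate" approach described above.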
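3. Hadoop YARN is a scheduler for distributed systems. It treats each server as a collection of containers, where a container can be thought of as some amount of CPU and memory. YARN consists of the following three components:
- Global Resource Manager (RM): contains a capacity scheduler; its main job is to schedule containers for tasks.
- (Per-server) Node Manager (NM): performs server-level functions, such as reporting to the RM that a container has finished its task.
- (Per-application) Application Master (AM): has two functions: 1) negotiating with the RM and the NMs for the use of containers; 2) detecting task failures.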
4. Two concepts matter for distributed systems:
- Safety: something bad will never happen.
- Liveness: something good is guaranteed to happen eventually.
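5. The gossip-style failure detection protocol has an update rule. Suppose the current time at node p is 123 and p holds the entry (q, 34, 101), where the fields are (address, heartbeat counter, local time). If a new entry (q, 35, 110) arrives, its heartbeat counter for q is larger than the stored one, so p updates its entry to (q, 35, 123): the new counter is kept, but it is stamped with p's own current local time rather than the sender's time.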
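The update rule can be written down as a small merge function. This is a minimal sketch: the `merge` name and the dictionary layout are choices made for this example, and adding previously unknown addresses is an assumption that goes slightly beyond the rule stated above.

```python
from typing import Dict, Tuple

# Membership list: address -> (heartbeat counter, local time of last update)
Membership = Dict[str, Tuple[int, int]]


def merge(local: Membership, incoming: Membership, now: int) -> None:
    """Merge a received gossip message into the local membership list, in place."""
    for addr, (heartbeat, _sender_time) in incoming.items():
        known = local.get(addr)
        # Accept only strictly newer heartbeats, stamped with OUR local time;
        # the sender's timestamp is ignored because clocks are not synchronized.
        if known is None or heartbeat > known[0]:
            local[addr] = (heartbeat, now)


table = {"q": (34, 101)}
merge(table, {"q": (35, 110)}, now=123)
print(table)  # {'q': (35, 123)}
```

An incoming entry with an equal or smaller heartbeat counter leaves the stored entry untouched, so its local time keeps aging until a failure timeout would eventually mark q as failed.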
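6. CAP: Consistency, Availability, and Partition Tolerance.
- Consistency: consistency refers to the atomicity of data. In classic databases this atomicity is guaranteed by transactions: when a transaction finishes, whether it commits or rolls back, the data is left in a consistent state. In a distributed environment, consistency is about whether the data on multiple nodes agrees.
- Availability: the service stays in a usable state at all times; when a user sends a request, the service returns a result within a bounded amount of time.
- Partition Tolerance: a partition here means a network partition. One way to think about it: critical data and services are usually placed in different IDCs (Internet data centers), and the system should keep working even when the network between them is cut.
The CAP theorem states that a distributed system cannot satisfy consistency, availability, and partition tolerance all at once; at most two of the three can hold at the same time.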
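7. BASE comes out of practical experience in Internet e-commerce and evolved gradually from the CAP theorem. Its core idea is that even when strong consistency cannot be achieved, an application can use an approach suited to its own characteristics to reach eventual consistency. BASE is short for Basically Available, Soft state, Eventually consistent, and it extends the C and A of CAP. The meaning of BASE:
- Basically Available: the system is basically available.
- Soft state: soft state / flexible transactions, meaning state may be out of sync for a period of time.
- Eventual consistency: the data becomes consistent eventually.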
Reference:
http://www.cnblogs.com/hustcat/archive/2010/09/07/1820970.html