Cloud Design Patterns

Throttling

2017-06-11  jorgensen

Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources.

Context and problem

The load on a cloud application typically varies over time based on the number of active users or the types of activities they are performing. For example, more users are likely to be active during business hours, or the system might be required to perform computationally expensive analytics at the end of each month. There might also be sudden and unanticipated bursts in activity. If the processing requirements of the system exceed the capacity of the available resources, it will suffer from poor performance and can even fail. If the system has to meet an agreed level of service, such failure could be unacceptable.

There are many strategies available for handling varying load in the cloud, depending on the business goals for the application. One strategy is to use autoscaling to match the provisioned resources to the user needs at any given time. This has the potential to consistently meet user demand, while optimizing running costs. However, while autoscaling can trigger the provisioning of additional resources, this provisioning isn't immediate. If demand grows quickly, there can be a window of time where there's a resource deficit.

Solution

An alternative strategy to autoscaling is to allow applications to use resources only up to a limit, and then throttle them when this limit is reached. The system should monitor how it's using resources so that, when usage exceeds the threshold, it can throttle requests from one or more users. This will enable the system to continue functioning and meet any service level agreements (SLAs) that are in place. For more information on monitoring resource usage, see the Instrumentation and Telemetry Guidance.
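
The monitor-and-throttle idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original guidance: the `UsageThrottle` class, its `limit` expressed in abstract resource units, and the `try_acquire` and `release` method names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class UsageThrottle:
    """Tracks resource use and refuses work that would exceed a fixed limit.

    Hypothetical helper: "resource units" stand in for whatever the
    system actually measures (CPU, memory, bandwidth, and so on).
    """

    limit: float          # threshold of resource use
    in_use: float = 0.0   # resource units currently allocated

    def try_acquire(self, cost: float) -> bool:
        """Admit a request only if it fits under the limit."""
        if self.in_use + cost > self.limit:
            return False  # throttle: the caller should reject or defer
        self.in_use += cost
        return True

    def release(self, cost: float) -> None:
        """Return resources when a request finishes."""
        self.in_use = max(0.0, self.in_use - cost)
```

A caller would check `try_acquire` before starting work and call `release` when the work completes. What happens to a refused request (an error response, a retry hint, a queue) is a separate design decision that this sketch leaves open.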

The system could implement several throttling strategies, including:

The figure shows an area graph for resource use (a combination of memory, CPU, bandwidth, and other factors) against time for applications that are making use of three features. A feature is an area of functionality, such as a component that performs a specific set of tasks, a piece of code that performs a complex calculation, or an element that provides a service such as an in-memory cache. These features are labeled A, B, and C.

The area immediately below the line for a feature indicates the resources that are used by applications when they invoke this feature. For example, the area below the line for Feature A shows the resources used by applications that are making use of Feature A, and the area between the lines for Feature A and Feature B indicates the resources used by applications invoking Feature B. Aggregating the areas for each feature shows the total resource use of the system.

The previous figure illustrates the effects of deferring operations. Just prior to time T1, the total resources allocated to all applications using these features reach a threshold (the limit of resource use). At this point, the applications are in danger of exhausting the resources available. In this system, Feature B is less critical than Feature A or Feature C, so it's temporarily disabled and the resources that it was using are released. Between times T1 and T2, the applications using Feature A and Feature C continue running as normal. Eventually, the resource use of these two features diminishes to the point where, at time T2, there is sufficient capacity to enable Feature B again.
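
The decision to switch Feature B off at T1 and back on at T2 can be sketched as a simple priority check. This is an illustrative sketch only: the `enabled_features` function, the numeric usage figures, and the threshold of 100 units are assumptions invented for the example, not values from the figure.

```python
def enabled_features(
    usage: dict[str, float],
    priority: list[str],
    threshold: float,
) -> list[str]:
    """Return the features that may run, most critical first.

    Features are admitted in priority order until the next one would
    push total resource use over the threshold. Everything from that
    point on is treated as disabled, so less critical features are
    always the first to be switched off.
    """
    enabled: list[str] = []
    total = 0.0
    for name in priority:
        if total + usage[name] > threshold:
            break  # this feature and all less critical ones stay off
        enabled.append(name)
        total += usage[name]
    return enabled


# Hypothetical numbers: B is the least critical feature.
priority = ["A", "C", "B"]
at_t1 = {"A": 40.0, "B": 30.0, "C": 35.0}   # total would exceed 100
at_t2 = {"A": 30.0, "B": 30.0, "C": 25.0}   # A and C have diminished

print(enabled_features(at_t1, priority, threshold=100.0))
print(enabled_features(at_t2, priority, threshold=100.0))
```

With these assumed numbers, the first call leaves Feature B out and the second call admits it again, which mirrors the interval between T1 and T2 described above. The built-in generic type hints require Python 3.9 or later.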

The autoscaling and throttling approaches can also be combined to help keep the applications responsive and within SLAs. If the demand is expected to remain high, throttling provides a temporary solution while the system scales out. Once the additional capacity is in place, the full functionality of the system can be restored.

The next figure shows an area graph of the overall resource use by all applications running in a system against time, and illustrates how throttling can be combined with autoscaling.

At time T1, the threshold specifying the soft limit of resource use is reached. At this point, the system can start to scale out. However, if the new resources don't become available quickly enough, then the existing resources might be exhausted and the system could fail. To prevent this from occurring, the system is temporarily throttled, as described earlier. When autoscaling has completed and the additional resources are available, throttling can be relaxed.
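
The interplay between the soft limit, scale-out, and temporary throttling can be sketched as a small state holder. Everything here is assumed for illustration: the `CapacityController` class, the soft limit expressed as 80% of capacity, and the `scale_out_completed` callback are hypothetical, and a real system would wire these to its own monitoring and provisioning services.

```python
from dataclasses import dataclass


@dataclass
class CapacityController:
    """Combines autoscaling with temporary throttling (illustrative only)."""

    capacity: float                 # resource units currently provisioned
    soft_limit_ratio: float = 0.8   # assumed: soft limit is 80% of capacity
    scaling: bool = False           # True while a scale-out is in progress
    throttling: bool = False        # True while requests are being throttled

    def soft_limit(self) -> float:
        return self.capacity * self.soft_limit_ratio

    def observe(self, usage: float) -> None:
        """Called with the latest measured resource use."""
        if usage >= self.soft_limit():
            self.scaling = True      # ask for more resources (not immediate)
            self.throttling = True   # protect existing resources meanwhile

    def scale_out_completed(self, added: float, usage: float) -> None:
        """Called when the new resources are available."""
        self.capacity += added
        self.scaling = False
        # Relax throttling only if usage is now under the new soft limit.
        self.throttling = usage >= self.soft_limit()
```

In this sketch, `observe` corresponds to time T1 in the description above, and `scale_out_completed` corresponds to the moment the additional resources come online and throttling can be relaxed.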

Issues and considerations

You should consider the following points when deciding how to implement this pattern:

When to use this pattern

Use this pattern:

Example

The final figure illustrates how throttling can be implemented in a multi-tenant system. Users from each of the tenant organizations access a cloud-hosted application where they fill out and submit surveys. The application contains instrumentation that monitors the rate at which these users are submitting requests to the application.

To prevent the users from one tenant affecting the responsiveness and availability of the application for all other users, a limit is applied to the number of requests per second the users from any one tenant can submit. The application blocks requests that exceed this limit.
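
A per-tenant limit on requests per second could be sketched with a fixed one-second window per tenant. This is one possible approach, not necessarily the one used in the survey application: the `TenantRateLimiter` class, the `allow` method, and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from collections import defaultdict
from typing import Callable


class TenantRateLimiter:
    """Fixed-window limiter: at most max_per_second requests per tenant."""

    def __init__(
        self,
        max_per_second: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_per_second = max_per_second
        self.clock = clock  # injectable so the limiter can be tested
        # tenant id -> (window number, requests counted in that window)
        self._windows: dict[str, tuple[int, int]] = defaultdict(lambda: (-1, 0))

    def allow(self, tenant_id: str) -> bool:
        """Return True if the request is admitted, False if it is blocked."""
        window = int(self.clock())  # whole seconds identify the window
        start, count = self._windows[tenant_id]
        if start != window:
            start, count = window, 0  # a new second begins: reset the count
        if count >= self.max_per_second:
            self._windows[tenant_id] = (start, count)
            return False  # over the limit for this tenant
        self._windows[tenant_id] = (start, count + 1)
        return True
```

Because each tenant has its own counter, a burst from one tenant does not consume the allowance of another. A fixed window is simple but allows short bursts across window boundaries; a sliding window or token bucket would smooth that out at the cost of more bookkeeping.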


Related patterns and guidance

The following patterns and guidance may also be relevant when implementing this pattern:
