Ribbon and Hystrix in Spring Cloud Zuul

2018-12-03 AaronSimon

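This post covers how to configure Ribbon and Hystrix for the Zuul component in Spring Cloud.

Ribbon: the load balancer. Its settings govern load balancing across the multiple instances of a service.
Hystrix: the circuit breaker. When the Zuul gateway calls a backend service, network problems or slow code can leave the call without a response for a long time. Configuring Hystrix keeps such calls from tying up threads indefinitely, which could otherwise exhaust resources and take the service down.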

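Ribbon configuration
The official documentation describes the approach. The settings differ depending on how the routes are defined.

When Eureka is used as the registry and zuul.routes points to a service, configure the timeouts with ribbon.ReadTimeout and ribbon.SocketTimeout.
When zuul.routes points to a URL, configure them with zuul.host.connect-timeout-millis and zuul.host.socket-timeout-millis.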

To apply special settings to one specific service, use the following form:

<serviceName>.ribbon.ReadTimeout

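<serviceName> is the name of the service.

Hystrix configuration
If Zuul is configured with a circuit-breaker fallback, the Hystrix timeout must be configured as well. The property is: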

hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=60000

default means the setting applies to every command. To set the timeout for one specific service, use:

hystrix.command.<serviceName>.execution.isolation.thread.timeoutInMilliseconds=60000

<serviceName> is the name of the service.

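Example
Based on the above, Ribbon and Hystrix are configured as follows (the IDE does not autocomplete these properties, but they still take effect):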

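# Enable retries for routed requests
zuul.retryable=true
# Number of retries against the current instance
ribbon.MaxAutoRetries=1
# Number of retries against a different instance
ribbon.MaxAutoRetriesNextServer=1
# Read timeout for the request, in milliseconds
ribbon.ReadTimeout=5000
# Connection timeout for the request, in milliseconds
ribbon.ConnectTimeout=2000
# Retry all operations, not just GET requests
ribbon.OkToRetryOnAllOperations=true
# The Hystrix timeout should be greater than the Ribbon timeout
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=16000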

When I started Eureka, Zuul, and ServiceA to test this, the Zuul console printed the following warning:

2018-12-03 16:22:30.306  WARN [apigateway,7447152c2c5cc400,7447152c2c5cc400,true] 20024 --- [nio-8102-exec-3] o.s.c.n.z.f.r.s.AbstractRibbonCommand    : The Hystrix timeout of 16000ms for the command serviceA is set lower than the combination of the Ribbon read and connect timeout, 28000ms.

In short, the Hystrix timeout is lower than the Ribbon timeout. But why is the Ribbon timeout 28000 ms? The warning is logged by AbstractRibbonCommand.java, so I looked at its source.

protected static int getHystrixTimeout(IClientConfig config, String commandKey) {
  int ribbonTimeout = getRibbonTimeout(config, commandKey);
  DynamicPropertyFactory dynamicPropertyFactory = DynamicPropertyFactory.getInstance();
  // Read the default Hystrix timeout
  int defaultHystrixTimeout = dynamicPropertyFactory.getIntProperty("hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds", 0).get();
  // Read the service-specific Hystrix timeout; here that is hystrix.command.serviceA.execution.isolation.thread.timeoutInMilliseconds
  int commandHystrixTimeout = dynamicPropertyFactory.getIntProperty("hystrix.command." + commandKey + ".execution.isolation.thread.timeoutInMilliseconds", 0).get();
  int hystrixTimeout;
  // Precedence for hystrixTimeout: service-specific Hystrix timeout > default Hystrix timeout > Ribbon timeout
  if (commandHystrixTimeout > 0) {
    hystrixTimeout = commandHystrixTimeout;
  } else if (defaultHystrixTimeout > 0) {
    hystrixTimeout = defaultHystrixTimeout;
  } else {
    hystrixTimeout = ribbonTimeout;
  }
  // Warn if the default or service-specific Hystrix timeout is lower than the Ribbon timeout
  if (hystrixTimeout < ribbonTimeout) {
    LOGGER.warn("The Hystrix timeout of " + hystrixTimeout + "ms for the command " + commandKey + " is set lower than the combination of the Ribbon read and connect timeout, " + ribbonTimeout + "ms.");
  }

  return hystrixTimeout;
}
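
To make the precedence easier to see in isolation, here is a minimal sketch that mirrors the branching above without Archaius or Spring. The class name HystrixTimeoutPrecedence, the resolve method, and the use of a plain Map in place of DynamicPropertyFactory are all assumptions made for illustration; they are not part of Spring Cloud.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup in getHystrixTimeout(); not Spring Cloud code.
final class HystrixTimeoutPrecedence {

    private static final String SUFFIX = ".execution.isolation.thread.timeoutInMilliseconds";

    // A plain Map replaces DynamicPropertyFactory here (assumption for the sketch).
    static int resolve(Map<String, Integer> props, String commandKey, int ribbonTimeout) {
        int defaultTimeout = props.getOrDefault("hystrix.command.default" + SUFFIX, 0);
        int commandTimeout = props.getOrDefault("hystrix.command." + commandKey + SUFFIX, 0);
        if (commandTimeout > 0) {
            return commandTimeout;   // service-specific value wins
        }
        if (defaultTimeout > 0) {
            return defaultTimeout;   // then the default
        }
        return ribbonTimeout;        // otherwise fall back to the Ribbon timeout
    }

    public static void main(String[] args) {
        Map<String, Integer> props = new HashMap<>();
        System.out.println(resolve(props, "serviceA", 28000)); // nothing set: Ribbon timeout

        props.put("hystrix.command.default" + SUFFIX, 16000);
        System.out.println(resolve(props, "serviceA", 28000)); // default applies

        props.put("hystrix.command.serviceA" + SUFFIX, 30000);
        System.out.println(resolve(props, "serviceA", 28000)); // service-specific applies
        System.out.println(resolve(props, "serviceB", 28000)); // serviceB still uses the default
    }
}
```

Note that a value of 0 or less is treated the same as an unset property, which is why the sketch uses 0 as the lookup default, as the original method does.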

On closer inspection, ribbonTimeout comes from the getRibbonTimeout() method:

protected static int getRibbonTimeout(IClientConfig config, String commandKey) {
  int ribbonTimeout;
  // Defaults to 2 s when there is no client config
  if (config == null) {
    ribbonTimeout = 2000;
  } else {
    // Four parameters are read here: ReadTimeout, ConnectTimeout, MaxAutoRetries, MaxAutoRetriesNextServer. Precedence: service-specific > default
    // 1. Read timeout for the request, default 1 s
    int ribbonReadTimeout = getTimeout(config, commandKey, "ReadTimeout", Keys.ReadTimeout, 1000);
    // 2. Connection timeout for the request, default 1 s
    int ribbonConnectTimeout = getTimeout(config, commandKey, "ConnectTimeout", Keys.ConnectTimeout, 1000);
    // 3. Number of retries against the current instance, default 0
    int maxAutoRetries = getTimeout(config, commandKey, "MaxAutoRetries", Keys.MaxAutoRetries, 0);
    // 4. Number of retries against a different instance, default 1
    int maxAutoRetriesNextServer = getTimeout(config, commandKey, "MaxAutoRetriesNextServer", Keys.MaxAutoRetriesNextServer, 1);
    // How ribbonTimeout is calculated
    ribbonTimeout = (ribbonReadTimeout + ribbonConnectTimeout) * (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
  }

  return ribbonTimeout;
}

So ribbonTimeout is calculated as:

ribbonTimeout = (ribbonReadTimeout + ribbonConnectTimeout) * (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);

Plugging in the values from the example configuration:

ribbonTimeout = (5000 + 2000) * (1 + 1) * (1 + 1) = 28000
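
The same arithmetic can be checked with a few lines of plain Java. This is only a sketch: the class RibbonTimeoutExample and its method are hypothetical names, and the code reproduces the formula from getRibbonTimeout() without reading any real configuration.

```java
// Hypothetical helper; the names are illustrative and not part of Spring Cloud.
final class RibbonTimeoutExample {

    // Each attempt may spend up to readTimeout + connectTimeout, and the
    // number of attempts multiplies across same-instance and next-instance retries.
    static int ribbonTimeout(int readTimeout, int connectTimeout,
                             int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (readTimeout + connectTimeout)
                * (maxAutoRetries + 1)
                * (maxAutoRetriesNextServer + 1);
    }

    public static void main(String[] args) {
        // Values from the example configuration.
        System.out.println(ribbonTimeout(5000, 2000, 1, 1)); // prints 28000
        // Defaults that getRibbonTimeout() falls back to when nothing is set.
        System.out.println(ribbonTimeout(1000, 1000, 0, 1)); // prints 4000
    }
}
```

With one retry on the current instance and one retry on another instance, up to four attempts of 7000 ms each are possible, which is where the 28000 ms in the warning comes from.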

Logically, hystrixTimeout should be greater than ribbonTimeout, so the configuration is changed as follows:

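# Enable retries for routed requests
zuul.retryable=true
# Number of retries against the current instance
ribbon.MaxAutoRetries=1
# Number of retries against a different instance
ribbon.MaxAutoRetriesNextServer=1
# Read timeout for the request, in milliseconds
ribbon.ReadTimeout=5000
# Connection timeout for the request, in milliseconds
ribbon.ConnectTimeout=2000
# Retry all operations, not just GET requests
ribbon.OkToRetryOnAllOperations=true
# The Hystrix timeout should be greater than the Ribbon timeout
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=30000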

If hystrixTimeout is lower than ribbonTimeout, Hystrix may time out and fall back while Ribbon is still switching instances and retrying.
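
One way to guard against this when choosing values is a small sanity check run before deploying a configuration. The sketch below is an assumption-laden illustration: TimeoutSanityCheck and hystrixCoversRibbon are hypothetical names, and the values are passed in by hand rather than read from a running Zuul instance.

```java
// Hypothetical pre-deployment check; not part of Spring Cloud or Hystrix.
final class TimeoutSanityCheck {

    // Returns true when the Hystrix timeout leaves room for every Ribbon attempt,
    // i.e. when AbstractRibbonCommand would not log its warning.
    static boolean hystrixCoversRibbon(int hystrixTimeout, int readTimeout, int connectTimeout,
                                       int maxAutoRetries, int maxAutoRetriesNextServer) {
        int ribbonTimeout = (readTimeout + connectTimeout)
                * (maxAutoRetries + 1)
                * (maxAutoRetriesNextServer + 1);
        return hystrixTimeout >= ribbonTimeout;
    }

    public static void main(String[] args) {
        // First configuration from this post: 16000 ms is below the 28000 ms Ribbon budget.
        System.out.println(hystrixCoversRibbon(16000, 5000, 2000, 1, 1)); // prints false
        // Revised configuration: 30000 ms covers it.
        System.out.println(hystrixCoversRibbon(30000, 5000, 2000, 1, 1)); // prints true
    }
}
```

The comparison uses >= because the warning in the source is only logged when the Hystrix timeout is strictly lower than the Ribbon timeout.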
