Hystrix: a framework for rate limiting, degradation, and circuit breaking in distributed systems

xiaoxiao2025-09-18 141

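Hystrix provides circuit breaking, isolation, fallback, request caching, and monitoring, so that a system stays available when one or more of its dependencies fail at the same time. Isolating a service's endpoints in Hystrix thread pools gives you both rate limiting and circuit breaking. Combined with the Spring Cloud Config configuration center provided by the Tianzhou (天舟) platform, the Hystrix rate limiting parameters can be adjusted dynamically without restarting the service.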

Steps to configure Hystrix in a Spring Boot project:

1. pom.xml:

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-hystrix</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-hystrix-dashboard</artifactId>
    </dependency>

2. Enable Hystrix and the Hystrix Dashboard:

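Add the @EnableHystrix and @EnableHystrixDashboard annotations to the entry class:

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.netflix.hystrix.EnableHystrix;
    import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;

    @SpringBootApplication
    @EnableHystrix
    @EnableHystrixDashboard
    public class ContestDemoApplication {
        public static void main(String[] args) {
            SpringApplication.run(ContestDemoApplication.class, args);
        }
    }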

3. The @HystrixCommand annotation:

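Adding @HystrixCommand to a service endpoint method rate-limits that endpoint. The code below puts the service's /hello endpoint under Hystrix thread isolation and limits its concurrency to 5. When the circuit is open or the call is degraded, the fallback method runs; it returns the exception message with status code 503.

    import javax.servlet.http.HttpServletResponse;

    import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
    import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
    import org.springframework.http.HttpStatus;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RequestMapping(value = "/contest/demo")
    public class HelloController {

        // Isolating controller endpoints in a Hystrix thread pool acts as a rate limiter
        @HystrixCommand(
            commandKey = "helloCommand",       // defaults to the method name
            threadPoolKey = "helloPool",       // defaults to the class name
            fallbackMethod = "fallbackMethod", // fallback method, used when the circuit is open or an exception occurs
            commandProperties = {
                // timeout
                @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
            },
            threadPoolProperties = {
                // concurrency, defaults to 10
                @HystrixProperty(name = "coreSize", value = "5")
            }
        )
        @RequestMapping(
            value = "/hello",
            method = RequestMethod.GET
        )
        public String sayHello(HttpServletResponse httpServletResponse) {
            return "Hello World!：00000";
        }

        /**
         * Fallback method; responds with status code 503.
         * Note: the fallback method must have the same return type and parameters as the
         * original method. It may take one extra Throwable parameter, placed last, to
         * receive the exception.
         */
        public String fallbackMethod(HttpServletResponse httpServletResponse, Throwable e) {
            httpServletResponse.setStatus(HttpStatus.SERVICE_UNAVAILABLE.value());
            return e.getMessage();
        }
    }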
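To make the effect of thread pool isolation more concrete, here is a minimal sketch in plain Java of roughly what the annotation arranges: a fixed pool with no waiting queue, a timeout on the call, and a fallback when the pool is saturated, the call times out, or the task throws. The class `IsolatedExecutor` and its method names are hypothetical and only illustrate the idea; this is not how Hystrix is implemented internally.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical sketch of thread pool isolation; this is not Hystrix code. */
final class IsolatedExecutor {
    private final ExecutorService pool;
    private final long timeoutMillis;

    IsolatedExecutor(int coreSize, long timeoutMillis) {
        // A SynchronousQueue holds no waiting tasks, so once all coreSize
        // threads are busy, submit() is rejected immediately.
        this.pool = new ThreadPoolExecutor(coreSize, coreSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the task in the isolated pool and calls the fallback on any failure. */
    <T> T execute(Callable<T> task, Function<Throwable, T> fallback) {
        Future<T> future;
        try {
            future = pool.submit(task);
        } catch (RejectedExecutionException e) {
            return fallback.apply(e);            // pool saturated: concurrency limit reached
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                 // interrupt the worker thread
            return fallback.apply(e);
        } catch (ExecutionException e) {
            return fallback.apply(e.getCause()); // the task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.apply(e);
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        // Same numbers as the /hello example: 5 threads, 1000 ms timeout.
        IsolatedExecutor executor = new IsolatedExecutor(5, 1000);
        String body = executor.execute(() -> "Hello World!", Throwable::getMessage);
        System.out.println(body);
        executor.shutdown();
    }
}
```

Because the pool has no queue, the sixth concurrent request is rejected at once instead of waiting, which is what turns the pool size into a concurrency limit.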

Hystrix property configuration:

If you add the @HystrixCommand annotation to a method without configuring any property (such as coreSize), the effect is the same as using the default configuration below. For the meaning of each property, see the Hystrix learning materials.

    @HystrixCommand(
        commandProperties = {
            // execution
            @HystrixProperty(name = "execution.isolation.strategy", value = "THREAD"),
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000"),
            @HystrixProperty(name = "execution.timeout.enabled", value = "true"),
            @HystrixProperty(name = "execution.isolation.thread.interruptOnTimeout", value = "true"),
            @HystrixProperty(name = "execution.isolation.thread.interruptOnCancel", value = "false"),
            @HystrixProperty(name = "execution.isolation.semaphore.maxConcurrentRequests", value = "10"),
            // fallback
            @HystrixProperty(name = "fallback.isolation.semaphore.maxConcurrentRequests", value = "10"),
            @HystrixProperty(name = "fallback.enabled", value = "true"),
            // circuit breaker
            @HystrixProperty(name = "circuitBreaker.enabled", value = "true"),
            @HystrixProperty(name = "circuitBreaker.forceClosed", value = "false"),
            @HystrixProperty(name = "circuitBreaker.forceOpen", value = "false"),
            @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "20"),
            @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "5000"),
            @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "50"),
            // metrics
            @HystrixProperty(name = "metrics.rollingStats.timeInMilliseconds", value = "10000"),
            @HystrixProperty(name = "metrics.rollingStats.numBuckets", value = "10"),
            @HystrixProperty(name = "metrics.rollingPercentile.enabled", value = "true"),
            @HystrixProperty(name = "metrics.rollingPercentile.timeInMilliseconds", value = "60000"),
            @HystrixProperty(name = "metrics.rollingPercentile.numBuckets", value = "6"),
            @HystrixProperty(name = "metrics.rollingPercentile.bucketSize", value = "100"),
            @HystrixProperty(name = "metrics.healthSnapshot.intervalInMilliseconds", value = "500"),
            // request context
            @HystrixProperty(name = "requestCache.enabled", value = "true"),
            @HystrixProperty(name = "requestLog.enabled", value = "true")
        },
        threadPoolProperties = {
            @HystrixProperty(name = "coreSize", value = "10"),
            @HystrixProperty(name = "maximumSize", value = "10"),
            @HystrixProperty(name = "maxQueueSize", value = "-1"),
            @HystrixProperty(name = "queueSizeRejectionThreshold", value = "5"),
            @HystrixProperty(name = "keepAliveTimeMinutes", value = "1"),
            @HystrixProperty(name = "allowMaximumSizeToDivergeFromCoreSize", value = "false"),
            @HystrixProperty(name = "metrics.rollingStats.timeInMilliseconds", value = "10000"),
            @HystrixProperty(name = "metrics.rollingStats.numBuckets", value = "10")
        }
    )

To customize these properties, you first need to understand the priority of Hystrix property configuration.

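Hystrix properties can be set at four levels, listed here from lowest to highest priority.

Global default: the value built into Hystrix, used when none of the other three levels sets the property.

Global configuration property: a global value defined in the configuration file. It overrides the global default at application startup, and with the dynamic refresh provided by Spring Cloud Config and the service's own refresh endpoint it can also be adjusted at runtime.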

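The following is a global configuration of threadpool and command properties. Note that the settings all sit under default.

    # Hystrix global property configuration
    hystrix:
      threadpool:
        default: # the threadPoolKey from the @HystrixCommand annotation goes here; default applies globally
          coreSize: 50 # core thread count of the pool, which is also the maximum number of concurrently executing threads
      command:
        default: # the commandKey from the @HystrixCommand annotation goes here; default applies globally
          execution:
            isolation:
              thread:
                timeoutInMilliseconds: 1000 # execution timeout
          fallback:
            isolation:
              semaphore:
                maxConcurrentRequests: 1000 # fallback calls allowed at any one time; once this value is reached, further requests are rejected outright instead of calling the fallback

Instance default: a default value defined for an instance in code. Setting a property on the instance in code overrides the global configuration. For example:

    @HystrixCommand(
        commandKey = "helloCommand",       // defaults to the method name
        threadPoolKey = "helloPool",       // defaults to the class name
        fallbackMethod = "fallbackMethod", // fallback method, used when the circuit is open or an exception occurs
        commandProperties = {
            // timeout
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
        },
        threadPoolProperties = {
            // concurrency, defaults to 10
            @HystrixProperty(name = "coreSize", value = "5")
        }
    )
    @RequestMapping(
        value = "/hello",
        method = RequestMethod.GET
    )
    public String sayHello(HttpServletResponse httpServletResponse) {
        return "Hello World!：00000";
    }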

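Instance configuration property: a property set for a specific instance in the configuration file. It overrides the three levels above. It can also be adjusted at runtime through the dynamic refresh provided by Spring Cloud Config and the service's own refresh endpoint.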

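Take the /hello endpoint from the sample code, where commandKey=helloCommand and threadPoolKey=helloPool. Its instance configuration looks like this:

    # Hystrix parameter configuration
    hystrix:
      threadpool:
        helloPool: # the threadPoolKey value from the annotation; add an entry when a pool should not use the default settings
          coreSize: 20
      command:
        helloCommand: # the commandKey value from the annotation; add an entry when a command should not use the default settings
          execution:
            isolation:
              thread:
                timeoutInMilliseconds: 4000 # execution timeout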
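The precedence rules can be pictured as a lookup that checks the highest-priority source first and falls through to the next one. The sketch below is only an illustration under that assumption; the class `PropertyResolver` and its four maps are hypothetical and do not reflect how Hystrix stores or resolves properties internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the four-level lookup; this is not Hystrix code. */
final class PropertyResolver {
    // Sources ordered from lowest to highest priority.
    final Map<String, String> globalDefault = new HashMap<>();   // built-in defaults
    final Map<String, String> globalConfig = new HashMap<>();    // hystrix.command.default.* in the config file
    final Map<String, String> instanceDefault = new HashMap<>(); // set in code, e.g. with @HystrixProperty
    final Map<String, String> instanceConfig = new HashMap<>();  // hystrix.command.<commandKey>.* in the config file

    /** Returns the effective value of a property for one command, or null if unset everywhere. */
    String resolve(String commandKey, String property) {
        String instanceName = commandKey + "." + property;
        if (instanceConfig.containsKey(instanceName)) {
            return instanceConfig.get(instanceName);
        }
        if (instanceDefault.containsKey(instanceName)) {
            return instanceDefault.get(instanceName);
        }
        if (globalConfig.containsKey(property)) {
            return globalConfig.get(property);
        }
        return globalDefault.get(property);
    }

    public static void main(String[] args) {
        String timeout = "execution.isolation.thread.timeoutInMilliseconds";
        PropertyResolver resolver = new PropertyResolver();
        resolver.globalDefault.put(timeout, "1000");
        resolver.instanceDefault.put("helloCommand." + timeout, "1000");
        resolver.instanceConfig.put("helloCommand." + timeout, "4000");
        // The instance configuration wins for helloCommand.
        System.out.println(resolver.resolve("helloCommand", timeout));
    }
}
```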

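Service circuit breaking

In a microservice architecture, microservices call one another. For example, microservice A calls microservices B and C, which in turn call other microservices; this is known as "fan-out". If a microservice somewhere on the fan-out chain responds too slowly or is unavailable, unanswered requests pile up in microservice A and eventually bring the system down. This is the "avalanche effect". Circuit breaking is a link protection mechanism for microservices that counters the avalanche effect: when a microservice on the fan-out chain is unavailable or responds too slowly, calls to that node are cut off and an error response is returned quickly.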
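The mechanism can be sketched as a small state machine with three states: closed (calls pass through), open (calls fail fast), and half-open (one trial call decides whether to close again). The class below is a simplified, hypothetical illustration and not Hystrix's implementation. Its constructor parameters borrow the names of the Hystrix properties `requestVolumeThreshold`, `errorThresholdPercentage`, and `sleepWindowInMilliseconds`, but it counts all calls since the last reset instead of using a rolling window, and it serializes calls with a lock for brevity.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical, simplified circuit breaker; this is not Hystrix code. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;   // minimum calls before the breaker may trip
    private final int errorThresholdPercentage; // failure percentage that trips the breaker
    private final long sleepWindowMillis;       // how long to stay open before a trial call
    private final LongSupplier clockMillis;     // injected clock, e.g. System::currentTimeMillis

    private State state = State.CLOSED;
    private int total;
    private int failures;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clockMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clockMillis = clockMillis;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clockMillis.getAsLong() - openedAt < sleepWindowMillis) {
                return fallback.get();          // fail fast without touching the dependency
            }
            state = State.HALF_OPEN;            // sleep window over: allow one trial call
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private void onSuccess() {
        if (state == State.HALF_OPEN) {         // trial call succeeded: close the circuit
            state = State.CLOSED;
            total = 0;
            failures = 0;
        } else {
            total++;
        }
    }

    private void onFailure() {
        if (state == State.HALF_OPEN) {         // trial call failed: open again
            trip();
            return;
        }
        total++;
        failures++;
        if (total >= requestVolumeThreshold
                && failures * 100 >= errorThresholdPercentage * total) {
            trip();
        }
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clockMillis.getAsLong();
        total = 0;
        failures = 0;
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(20, 50, 5000, System::currentTimeMillis);
        String reply = breaker.call(() -> "response from B", () -> "service busy");
        System.out.println(reply + " / " + breaker.state());
    }
}
```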

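Service degradation

Degradation means deliberately lowering the level of service you offer: the client prepares a local fallback callback that returns a default value. For example, during the Double 11 sale, every service unrelated to transactions is degraded, such as viewing Ant Forest (蚂蚁森林), viewing order history, or showing only the latest 100 historical product reviews.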
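A minimal sketch of both ideas, assuming a hypothetical `ReviewService` with a manual degradation switch: when the loader fails, a local fallback returns an empty list instead of an error, and when the switch is on, only the latest 100 reviews are returned. The names and the limit constant are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of a degradation switch with a local fallback. */
final class ReviewService {
    private static final int DEGRADED_LIMIT = 100;
    private volatile boolean degraded;

    void setDegraded(boolean degraded) {
        this.degraded = degraded;
    }

    /** Loads reviews (oldest first); degrades the result instead of failing. */
    List<String> reviews(Supplier<List<String>> loader) {
        List<String> all;
        try {
            all = loader.get();
        } catch (RuntimeException e) {
            return Collections.emptyList();   // local fallback: a default value, not an error
        }
        if (!degraded || all.size() <= DEGRADED_LIMIT) {
            return all;
        }
        // Degraded mode: keep only the latest DEGRADED_LIMIT entries.
        return all.subList(all.size() - DEGRADED_LIMIT, all.size());
    }

    public static void main(String[] args) {
        List<String> stored = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            stored.add("review-" + i);
        }
        ReviewService service = new ReviewService();
        service.setDegraded(true);            // e.g. switched on for the duration of a big sale
        System.out.println(service.reviews(() -> stored).size());
    }
}
```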

Service rate limiting

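The purpose of rate limiting is to protect the system by limiting the rate of concurrent access or requests, or the rate of requests within a time window. Once the limit is reached, the system can deny service (redirect to an error page or report that the resource is gone), queue or wait (as in flash sales, comments, and order placement), or degrade (return fallback or default data, for example showing stock as available by default on a product detail page). Common forms of rate limiting in high-concurrency systems are: limiting total concurrency (for example database connection pools and thread pools), limiting instantaneous concurrency (for example nginx's limit_conn module, which limits the number of concurrent connections), and limiting the average rate within a time window (for example Guava's RateLimiter and nginx's limit_req module, which limit the average rate per second). Others include limiting the call rate of remote interfaces and limiting the consumption rate of an MQ. Rate limiting can also be based on the number of network connections, network traffic, or CPU or memory load.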

Common rate limiting algorithms are the token bucket and the leaky bucket. A counter can also be used for crude rate limiting.

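Counter method

The system maintains a counter: it is incremented by 1 when a request arrives and decremented by 1 when the request has been processed. When the counter exceeds the configured threshold, new requests are rejected. This simple method can be extended with more advanced features. For example, the threshold does not have to be fixed and can be adjusted dynamically, and several counters can manage different services separately so that they do not affect one another.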
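A minimal sketch of the counter method using an atomic integer. The class `CounterLimiter` and its method names are hypothetical. In this sketch at most `limit` requests may be in flight at once, so a request is rejected when the counter has already reached the limit.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical counter-based concurrency limiter. */
final class CounterLimiter {
    private final int limit;
    private final AtomicInteger inFlight = new AtomicInteger();

    CounterLimiter(int limit) {
        this.limit = limit;
    }

    /** Call when a request arrives; returns false if the request must be rejected. */
    boolean tryAcquire() {
        while (true) {
            int current = inFlight.get();
            if (current >= limit) {
                return false;                 // threshold reached: reject
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                return true;                  // counter incremented atomically
            }
            // Another thread changed the counter in between; retry.
        }
    }

    /** Call when an admitted request has been processed. */
    void release() {
        inFlight.decrementAndGet();
    }

    public static void main(String[] args) {
        CounterLimiter limiter = new CounterLimiter(2);
        System.out.println(limiter.tryAcquire());
        System.out.println(limiter.tryAcquire());
        System.out.println(limiter.tryAcquire()); // rejected: two requests already in flight
        limiter.release();
        System.out.println(limiter.tryAcquire());
    }
}
```

A caller would wrap request handling in `if (limiter.tryAcquire()) { try { ... } finally { limiter.release(); } }` so that the counter is always decremented, even when processing fails.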

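The idea behind the leaky bucket algorithm is simple. Water (requests) first flows into the bucket, and the bucket leaks water at a fixed rate (the rate at which the interface responds). When water flows in too fast it overflows (the access frequency exceeds the interface's response rate), and those requests are rejected. The leaky bucket algorithm therefore enforces a hard limit on the transmission rate.

[Figure: leaky bucket diagram]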
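A minimal sketch of a leaky bucket used as a meter: each admitted request adds one unit of water, the water drains at a fixed rate, and a request that would overflow the bucket is rejected. The class `LeakyBucket`, its parameters, and the injected clock are hypothetical choices for illustration. The water level is tracked in nanoseconds of drain time so that all arithmetic stays in integers.

```java
import java.util.function.LongSupplier;

/** Hypothetical leaky bucket that rejects requests when the bucket would overflow. */
final class LeakyBucket {
    private final long nanosPerRequest; // time the bucket needs to leak one request
    private final long capacityNanos;   // bucket capacity, expressed as drain time
    private final LongSupplier clockNanos;
    private long levelNanos;            // current water level, expressed as drain time
    private long lastLeak;

    LeakyBucket(int capacity, int leakPerSecond, LongSupplier clockNanos) {
        this.nanosPerRequest = 1_000_000_000L / leakPerSecond;
        this.capacityNanos = capacity * nanosPerRequest;
        this.clockNanos = clockNanos;
        this.lastLeak = clockNanos.getAsLong();
    }

    synchronized boolean tryAcquire() {
        long now = clockNanos.getAsLong();
        // Leak the water that drained since the last call; the level never goes below zero.
        levelNanos = Math.max(0, levelNanos - (now - lastLeak));
        lastLeak = now;
        if (levelNanos + nanosPerRequest > capacityNanos) {
            return false;               // the bucket would overflow: reject
        }
        levelNanos += nanosPerRequest;  // pour this request into the bucket
        return true;
    }

    public static void main(String[] args) {
        // Capacity of 2 requests, leaking 10 requests per second.
        LeakyBucket bucket = new LeakyBucket(2, 10, System::nanoTime);
        for (int i = 0; i < 3; i++) {
            System.out.println(bucket.tryAcquire());
        }
    }
}
```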

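The token bucket algorithm has the same effect as the leaky bucket but works in the opposite direction, and it is easier to understand. As time passes, the system adds a token to the bucket at a constant interval of 1/QPS (with QPS=100, the interval is 10 ms). Picture it as the reverse of the leaky bucket: a tap keeps adding water. If the bucket is already full, no more tokens are added. Each new request takes one token; if no token is left, the request blocks or is rejected.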

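Another advantage of the token bucket is that the rate is easy to change: to raise the rate, increase the rate at which tokens are put into the bucket. Typically a fixed number of tokens is added to the bucket at a regular interval (for example every 100 milliseconds); some variants instead compute in real time how many tokens should be added.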
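A minimal sketch of the variant that computes the tokens to add in real time instead of using a timer: on every request it derives how many 1/QPS intervals have passed since the last refill. The class `TokenBucket`, its parameters, and the injected clock are hypothetical, and starting with a full bucket is an assumption of this sketch. Requests are rejected rather than blocked when no token is available.

```java
import java.util.function.LongSupplier;

/** Hypothetical token bucket with lazy, on-demand refill. */
final class TokenBucket {
    private final long capacity;
    private final long refillIntervalNanos; // 1/QPS, e.g. 10 ms for QPS = 100
    private final LongSupplier clockNanos;
    private long tokens;
    private long lastRefill;

    TokenBucket(long capacity, int qps, LongSupplier clockNanos) {
        this.capacity = capacity;
        this.refillIntervalNanos = 1_000_000_000L / qps;
        this.clockNanos = clockNanos;
        this.tokens = capacity;             // assumption: the bucket starts full
        this.lastRefill = clockNanos.getAsLong();
    }

    synchronized boolean tryAcquire() {
        refill();
        if (tokens == 0) {
            return false;                   // no token left: reject
        }
        tokens--;
        return true;
    }

    private void refill() {
        long now = clockNanos.getAsLong();
        long newTokens = (now - lastRefill) / refillIntervalNanos;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens); // a full bucket takes no more tokens
            // Advance by whole intervals only, so the remainder counts toward the next token.
            lastRefill += newTokens * refillIntervalNanos;
        }
    }

    public static void main(String[] args) {
        // Bucket of 2 tokens, refilled at QPS = 100.
        TokenBucket bucket = new TokenBucket(2, 100, System::nanoTime);
        for (int i = 0; i < 3; i++) {
            System.out.println(bucket.tryAcquire());
        }
    }
}
```

Unlike the leaky bucket, a full token bucket lets a burst of up to `capacity` requests through at once, after which requests are admitted at the refill rate.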
