Ease Monitor Product Documentation

2017-12-26  集库

http://megaease.com/docs/monitor/

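1. Product Positioning

Ease Monitor is positioned to serve the following purposes:

  1. "Health check"

    • Capacity management. It presents a global view of the system's runtime data, so the engineering team can tell whether more machines or other resources are needed.

    • Performance management. The dashboards help locate system bottlenecks, so the system and the relevant code can be optimized in a targeted way.

  2. "Emergency care"

    • Problem location. It quickly exposes and pinpoints where a problem occurs, helping engineers diagnose it.

    • Performance analysis. When traffic rises unexpectedly, it quickly identifies the system bottleneck and helps developers drill down into the code.

The figure below shows a very common case: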

Diagnosis Case

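2. Design Principles

Ease Monitor is essentially an APM (Application Performance Management) product, but it differs from APM software in the traditional sense.

Two main considerations shaped the design of Ease Monitor:

Accordingly, Ease Monitor follows these design principles: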

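3. System Architecture

Ease Monitor Architecture

The figure above shows the overall Ease Monitor architecture and the technologies it uses.

The technologies in this architecture are all mainstream and mature. The architecture is designed to monitor very large clusters, and its components can be flexibly trimmed or replaced.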

4. System Requirements and Limitations

Currently, Ease Monitor supports only the following system environments.

5. Feature Showcase

5.1 Overview Dashboard

The overview dashboard mainly shows the overall health and capacity of the system.

Overview Dashboard

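5.2 Request Ranking Lists

The request ranking lists show the more time-consuming requests in the system and the related request hotspots.

Nginx request ranking list

JDBC database operation ranking list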

5.3 Function Call Stack Analysis

The figure below shows the function call stack analysis of a request.

Call Stack

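5.4 Call Chain Tracing

The figure below shows the service call chain of a request across the whole system and the corresponding time distribution.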

External Service

5.5 Freely Customizable Dashboards

Dashboard for the underlying operating system

Dashboard

5.6 Event Alerting

The figure below is an event alert report.

Events

6. Technical Details

6.1 Ease Agent

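Ease Agent is a Java Agent. At runtime it uses the java.lang.instrument API to apply bytecode enhancement to specific methods, in order to collect contextual information about method calls, such as the timing of user requests, function call stack information, and distributed call chain tracing.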
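
As a rough illustration of how a java.lang.instrument based agent hooks into class loading, here is a minimal sketch. It is not Ease Agent's code: the class names SketchAgent and SelectingTransformer and the default prefix "com/example/" are hypothetical, and the actual bytecode rewriting (normally done with a bytecode library) is deliberately left out, so the transformer only selects classes and leaves them unchanged.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent class for illustration; not part of Ease Agent.
public final class SketchAgent {

    /** Selects classes by internal-name prefix; the rewriting step itself is omitted. */
    public static final class SelectingTransformer implements ClassFileTransformer {
        private final String internalPrefix; // internal form, e.g. "com/example/"
        private final AtomicInteger matched = new AtomicInteger();

        public SelectingTransformer(String internalPrefix) {
            this.internalPrefix = internalPrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null && className.startsWith(internalPrefix)) {
                matched.incrementAndGet();
                // A real agent would rewrite classfileBuffer here so that the
                // selected methods record timing and call context.
            }
            return null; // null tells the JVM to keep the original bytecode
        }

        public int matchedCount() {
            return matched.get();
        }
    }

    // The JVM calls premain before the application's main() when the agent jar
    // is passed with -javaagent:<jar>[=<agentArgs>].
    public static void premain(String agentArgs, Instrumentation inst) {
        String prefix = (agentArgs == null || agentArgs.isEmpty())
                ? "com/example/" : agentArgs;
        inst.addTransformer(new SelectingTransformer(prefix));
    }
}
```

For the JVM to find premain, the agent jar's manifest has to name the class in a Premain-Class attribute.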

6.1.1 Design Principles

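Because Ease Agent runs in the same JVM as the host process, its reliability and low intrusiveness are especially important. It is therefore designed with:

  1. An independent ClassLoader. The agent loads its own bytecode with a separate ClassLoader, isolated from the host's bytecode, which avoids bytecode conflicts.
  2. A careful loading technique. A custom loading mechanism lets the agent share bytecode that the host already has, greatly reducing redundant bytecode dependencies and making deployment and execution more efficient.
  3. Efficient sampling. Several call sampling mechanisms are available to meet the strict performance requirements of different scenarios.
  4. Easy extensibility. A concise built-in DSL allows an extension to be implemented in about ten lines of code.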
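
The first two points can be sketched roughly as follows. This is an assumption-laden illustration, not Ease Agent's loader: the class name IsolatedAgentLoader and the idea of sharing by package prefix are hypothetical. The loader has no parent, so by default it sees only bootstrap classes and its own jars, and it falls back to the host's loader only for an explicitly shared prefix.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical loader showing isolation plus selective sharing.
public final class IsolatedAgentLoader extends URLClassLoader {
    private final ClassLoader host;     // the application's class loader
    private final String sharedPrefix;  // class-name prefix the agent may borrow from the host

    public IsolatedAgentLoader(URL[] agentJars, ClassLoader host, String sharedPrefix) {
        super(agentJars, null); // null parent: only bootstrap classes are visible by default
        this.host = host;
        this.sharedPrefix = sharedPrefix;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        try {
            return super.findClass(name); // the agent's own jars come first
        } catch (ClassNotFoundException e) {
            if (name.startsWith(sharedPrefix)) {
                return host.loadClass(name); // reuse a class the host already provides
            }
            throw e; // everything else on the host's classpath stays invisible
        }
    }
}
```

With this arrangement a library bundled in the agent jar cannot clash with a different version on the host's classpath, while classes under the shared prefix do not need to be shipped twice.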

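6.1.2 Compatibility and Requirements

  1. Supports Oracle JDK and OpenJDK 6 through 8.
  2. Supports all Servlet containers compatible with Java Servlet 3.0, such as Tomcat, Jetty, and JBoss.
  3. Supports all JDBC-compatible database drivers; some advanced features support MySQL (mysql-connector-java v5.1.33
  4. Supports Apache HTTP Client v4.5.x
  5. Supports Jedis v2.9.x
  6. Supports Spring RestTemplate v4.x
  7. Supports Zipkin v1.19.2+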

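6.1.3 Types of Collected Data

  1. Metrics for HTTP requests received by the server, plus associated call information (such as call stacks)
  2. Metrics for JDBC Connection acquisition and Statement execution, plus associated call information (such as URL and SQL)
  3. Distributed call chain data compatible with the Zipkin protocol, including:
    • HTTP receive and send
    • SQL execution
    • Redis access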

6.1.4 Installation and Usage

After downloading easeagent-dep.jar, add the following Java runtime argument:

-javaagent:easeagent-dep.jar

6.2 iOS/Android SDK

coming soon...

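6.3 Event Alerting

Currently, event alerting in Ease Monitor supports the following use cases.

6.4 Data Storage Format

The following is the data storage format Ease Monitor uses in Elasticsearch.

6.4.1 Index Format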

| Index mapping template | Index pattern | Description |
| --- | --- | --- |
| ease-monitor-metrics-* | ease-monitor-metrics-YYYY.MM.DD | Saves time-series metrics of monitored objects from different categories. Metrics from each kind of monitored object are saved into a dedicated document type. |
| ease-monitor-aggregate-metrics-* | ease-monitor-aggregate-metrics-YYYY.MM.DD | Saves precomputed performance statistics for the dimensions that monitoring requires. Statistics for each dimension are saved into a dedicated document type. Because the statistics are computed directly on the input metrics as a stream and the results are written to this index in advance, they can be loaded and used without further aggregation (e.g. grouping and computing). This helps the performance of ad-hoc queries over the fine-grained metrics stored in ES, especially when the metrics volume is large. This index is designed to hold only statistics that can be computed by simple (fast) and fixed (implementable at product design time rather than at runtime) functions. |
| ease-monitor-logs-* | ease-monitor-logs-YYYY.MM.DD | Saves the logs output by the OS, middleware, and applications. Each kind of log is saved into a dedicated document type. |
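
To make the daily index pattern concrete, here is a small sketch of deriving an index name from a date. The helper name DailyIndexName is hypothetical, and which time zone defines the day boundary is not stated in this document, so the caller supplies the date. It uses java.time, which requires Java 8.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the class and method names are illustrative only.
public final class DailyIndexName {
    // "yyyy.MM.dd" means calendar year, month, and day of month. In java.time
    // the pattern letters "YYYY" and "DD" mean week-based year and day of year,
    // so the pattern is not copied letter for letter from the table.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    /** Appends the formatted day to an index prefix such as "ease-monitor-metrics-". */
    public static String forDay(String prefix, LocalDate day) {
        return prefix + DAY.format(day);
    }

    public static void main(String[] args) {
        System.out.println(forDay("ease-monitor-metrics-", LocalDate.of(2017, 12, 26)));
        // prints: ease-monitor-metrics-2017.12.26
    }
}
```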

6.4.2 Document Type Format

We use the following storage formats for document types:

Examples:

| Index mapping template | Category | Document type | Description |
| --- | --- | --- | --- |
| ease-monitor-metrics-* | application | http_request | Saves application HTTP request records, which contain the URL address and parameters, execution duration, response code, and other useful fields. |
| | platform | jvm_memory | Saves JVM performance counters and statistics for heap, non-heap, and each memory space. |
| | | jvm_gc | Saves JVM performance counters and statistics for the garbage collector. |
| | | tomcat_global | Saves the performance counters and statistics of the global request processor and thread pool. |
| | | tomcat_cache | Saves the performance counters and statistics of each context cache. |
| | | tomcat_servlet | Saves the performance counters and statistics of each servlet. |
| | | nginx | Saves nginx performance counters and statistics. |
| | | mysql | Saves mysql performance counters and statistics. |
| | | redis_server | Saves redis server performance counters and statistics. |
| | | redis_keyspace | Saves redis key space performance counters and statistics. |
| | infrastructure | cpu | Saves the percentage utilization of a specific logical core. |
| | | memory | Saves the percentage utilization and capacity in bytes. |
| | | interface | Saves the performance counters and statistics for each network interface separately (excluding the 'lo' loopback device), e.g. tx and rx bytes. |
| | | disk | Saves the performance counters and statistics for each block device separately, e.g. iops, mbps. (A busy percentage indicator will be added in the future.) |
| | | df | Saves the utilization counters for each block device. |
| ease-monitor-aggregate-metrics-* | application | http_request | Saves the calculated values of separate and total executions per second over 1, 5, and 15 minutes. The request count is saved as well. |
| | | jdbc_statement | Saves the calculated values of separate and total executions per second over 1, 5, and 15 minutes. Also saves the minimum, mean, maximum, and 25%, 50%, 75%, 95%, 98%, 99%, 99.9% percentiles of execution duration. The execution count is saved as well. |
| | | jdbc_connection | Saves the calculated values of database connection establishments per second over 1, 5, and 15 minutes. Also saves the minimum, mean, maximum, and 25%, 50%, 75%, 95%, 98%, 99%, 99.9% percentiles of connection establishment duration. The establishment count is saved as well. |
| ease-monitor-logs-* | application | <component-name> | Saves log records collected from the application's components. |
| | platform | tomcat_exception | Saves the exception messages of the stack. |
| | | nginx_access | Saves HTTP access records from the nginx access log. |
| | | nginx_error | Saves error records from the nginx error log. |
| | | mysql_slow_sql | Saves slow SQL records from the MySQL log. |
| | infrastructure | os_syslog | Saves log records from the OS 'syslog' file. |
| | | os_dmesg | Saves log records from the OS 'dmesg' file. |
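
As an illustration of the percentile statistics listed for jdbc_statement and jdbc_connection, here is a minimal sketch using the nearest-rank method. The document does not say how Ease Monitor computes its percentiles, so the method, the class name DurationStats, and the sample durations are all assumptions.

```java
import java.util.Arrays;

// Hypothetical helper; not Ease Monitor code.
public final class DurationStats {

    /** Nearest-rank percentile over samples sorted in ascending order; p is in (0, 100]. */
    public static long percentile(long[] sortedMillis, double p) {
        if (sortedMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        int rank = (int) Math.ceil(p / 100.0 * sortedMillis.length); // 1-based rank
        return sortedMillis[rank - 1];
    }

    public static void main(String[] args) {
        long[] durations = {12, 7, 30, 9, 15, 11, 250, 10, 8, 14}; // made-up samples, in ms
        Arrays.sort(durations);
        for (double p : new double[] {25, 50, 75, 95, 98, 99, 99.9}) {
            System.out.println(p + "% -> " + percentile(durations, p) + " ms");
        }
    }
}
```

With only a handful of samples the high percentiles all collapse onto the maximum, which is one reason such statistics are usually computed over a window of many executions.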