
SkyWalking: Customization

2021-08-18  程序员王旺

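Some business scenarios call for custom functionality. By adding a small amount of code to our business code, we can implement monitoring tied to our own business, such as adding special information to trace logs, or monitoring changes in the number of orders or users.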

I. Trace

Defining a custom traced method is simple: add the @Trace annotation to the method you want to trace. It also needs the support of the activations/apm-toolkit-trace-activation-8.6.0.jar plugin.

  1. Add the dependency to the Spring Boot project's pom.xml:
<dependency>
     <groupId>org.apache.skywalking</groupId>
     <artifactId>apm-toolkit-trace</artifactId>
     <version>${skywalking.version}</version>
</dependency>
  2. Define a Controller and add the following request handler:
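import org.apache.skywalking.apm.toolkit.trace.ActiveSpan;
import org.apache.skywalking.apm.toolkit.trace.Tag;
import org.apache.skywalking.apm.toolkit.trace.Tags;
import org.apache.skywalking.apm.toolkit.trace.Trace;
import org.apache.skywalking.apm.toolkit.trace.TraceContext;

@GetMapping("tractAnnotation")
public User traceAnnotation(@RequestParam("name") String name) {
    log.info("param:[{}]", name);
    User user = trace(name);
    // Attach a custom tag to the currently active span
    ActiveSpan.tag("user-tag", user.toString());
    // Log the ID of the current trace
    log.info("traceId:[{}]", TraceContext.traceId());
    return user;
}

// Traced under the operation name "myTrace"; the tags record the
// first argument and the name property of the returned object
@Trace(operationName = "myTrace")
@Tags({
        @Tag(key = "param", value = "arg[0]"),
        @Tag(key = "returnValue", value = "returnedObj.name")
})
private User trace(String name) {
    User user = new User();
    user.setName(name);
    return user;
}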
  3. After requesting http://localhost:9000/tractAnnotation?name=xxx, check the record in the trace panel of the UI.
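To make the tag expressions used in step 2 more concrete, here is a simplified model of how `arg[0]` and `returnedObj.name` could be resolved against a method call. It is not SkyWalking's implementation: `TagExpressionSketch`, its `evaluate` method and the nested `User` class are hypothetical names used only for illustration, and only the expression shapes that appear in this article are handled.

```java
import java.lang.reflect.Method;

final class TagExpressionSketch {

    /** Hypothetical stand-in for the article's User class. */
    public static final class User {
        private String name;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

    /**
     * Resolves "arg[N]", "returnedObj" or "returnedObj.property"
     * against the call arguments and the returned object.
     */
    static String evaluate(String expression, Object[] args, Object returned) {
        if (expression.startsWith("arg[") && expression.endsWith("]")) {
            // "arg[0]" -> index 0 of the argument array
            int index = Integer.parseInt(expression.substring(4, expression.length() - 1));
            return String.valueOf(args[index]);
        }
        if (expression.equals("returnedObj")) {
            return String.valueOf(returned);
        }
        if (expression.startsWith("returnedObj.")) {
            // "returnedObj.name" -> call the public getter getName()
            String property = expression.substring("returnedObj.".length());
            String getter = "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
            try {
                Method method = returned.getClass().getMethod(getter);
                return String.valueOf(method.invoke(returned));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read property " + property, e);
            }
        }
        throw new IllegalArgumentException("Unsupported expression: " + expression);
    }

    public static void main(String[] args) {
        User user = new User();
        user.setName("xxx");
        Object[] callArgs = {"xxx"};
        System.out.println(evaluate("arg[0]", callArgs, user));
        System.out.println(evaluate("returnedObj.name", callArgs, user));
    }
}
```

In this model both tags of the example request resolve to the request parameter, because the traced method copies its argument into the returned object's name.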

II. Meter

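SkyWalking introduced metrics monitoring in 8.0 and can also work with Micrometer. This lets you define custom metrics in your own business system, such as total visits or total orders, which makes it more extensible. The following example demonstrates the feature.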

Modify the OAP configuration

  1. First, add a custom metrics file, spring-meter.yaml, on the server side. It must follow MAL (Meter Analysis Language) syntax.

Note: place spring-meter.yaml under config/meter-analyzer-config.

expSuffix: instance(['service'], ['instance'])
metricPrefix: meter_order
metricsRules:
  - name: new_increase_count
    exp: new_increase_count.increase("PT1M")
  2. In config/application.yml, find meterAnalyzerActiveFiles (around line 280) and set it to the name of the file above, spring-meter.yaml, without the suffix.
agent-analyzer:
  selector: ${SW_AGENT_ANALYZER:default}
  default:
   ....
   meterAnalyzerActiveFiles: ${SW_METER_ANALYZER_ACTIVE_FILES:spring-meter}

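If MySQL is used for storage, a table named meter_order_new_increase_count is created after the service starts, which indicates that the server-side configuration succeeded.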
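The `${SW_METER_ANALYZER_ACTIVE_FILES:spring-meter}` value reads as "use the environment variable if it is set, otherwise the default after the colon". The sketch below models that reading and then maps each active name back to a rule file under config/meter-analyzer-config. `ActiveFilesSketch`, `resolve` and `ruleFiles` are hypothetical names, and the comma-separated handling is an assumption made for illustration, not the OAP's actual loader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ActiveFilesSketch {

    /** Resolves "${NAME:default}" against the given environment map. */
    static String resolve(String placeholder, Map<String, String> env) {
        if (!placeholder.startsWith("${") || !placeholder.endsWith("}")) {
            return placeholder; // plain literal, nothing to resolve
        }
        String body = placeholder.substring(2, placeholder.length() - 1);
        int colon = body.indexOf(':');
        String name = colon < 0 ? body : body.substring(0, colon);
        String fallback = colon < 0 ? "" : body.substring(colon + 1);
        String value = env.get(name);
        return value == null ? fallback : value;
    }

    /** Maps names without suffix to rule files (assumed comma-separated). */
    static List<String> ruleFiles(String activeFiles) {
        List<String> files = new ArrayList<>();
        for (String name : activeFiles.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                files.add("config/meter-analyzer-config/" + trimmed + ".yaml");
            }
        }
        return files;
    }

    public static void main(String[] args) {
        String active = resolve("${SW_METER_ANALYZER_ACTIVE_FILES:spring-meter}", System.getenv());
        System.out.println(ruleFiles(active));
    }
}
```

With no environment variable set, the default `spring-meter` is used, which is why the file name is written without its `.yaml` suffix in application.yml.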

Application-side development

Add the meter dependency to the Spring Boot application:

<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-meter</artifactId>
    <version>${skywalking.version}</version>
</dependency>

Write a Controller, request meter several times to simulate changes in the order count, and check whether new records appear in the meter_order_new_increase_count table.

@GetMapping("meter")
public void meter() {
    // The counter name is the one referenced by exp in spring-meter.yaml
    Counter counter = MeterFactory
            .counter(new MeterId("new_increase_count", MeterId.MeterType.COUNTER))
            .tag("Order Count", "100")
            .mode(Counter.Mode.INCREMENT)
            .build();
    counter.increment(Math.random() * 10);
    log.info("{}:{}", counter.getName(), counter.get());
}
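The rule `new_increase_count.increase("PT1M")` is applied to this counter on the server. As a rough mental model only, the sketch below treats the counter as a series of cumulative samples and reports how much it grew over the ISO-8601 window `PT1M` (one minute). `IncreaseSketch` is a hypothetical class, and the OAP's real windowing may differ from this simplification.

```java
import java.time.Duration;
import java.util.Map;
import java.util.TreeMap;

final class IncreaseSketch {

    // epoch milliseconds -> cumulative counter value at that time
    private final TreeMap<Long, Double> samples = new TreeMap<>();

    void record(long epochMillis, double cumulativeValue) {
        samples.put(epochMillis, cumulativeValue);
    }

    /** Growth of the counter over the given ISO-8601 window, e.g. "PT1M". */
    double increase(String isoWindow) {
        if (samples.isEmpty()) {
            return 0.0;
        }
        long windowMillis = Duration.parse(isoWindow).toMillis();
        Map.Entry<Long, Double> latest = samples.lastEntry();
        // Newest sample that is at least one window older than the latest one
        Map.Entry<Long, Double> base = samples.floorEntry(latest.getKey() - windowMillis);
        if (base == null) {
            base = samples.firstEntry(); // not enough history yet
        }
        return latest.getValue() - base.getValue();
    }

    public static void main(String[] args) {
        IncreaseSketch orders = new IncreaseSketch();
        orders.record(0L, 5.0);
        orders.record(30_000L, 8.0);
        orders.record(60_000L, 12.0);
        orders.record(90_000L, 20.0);
        System.out.println(orders.increase("PT1M"));
    }
}
```

The point of the model is that the stored metric is a per-window change rather than the raw running total returned by `counter.get()`.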

Note: when starting the Spring Boot application, do not forget to add the javaagent parameters to the VM options:

-javaagent:skywalking-agent\skywalking-agent.jar -Dskywalking.agent.service_name=myapp -Dskywalking.agent.instance_name=myapp -Dskywalking.collector.backend_service=localhost:11800
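Because a missing `-javaagent` option fails silently (the application runs, but nothing is reported), a small startup check can help. The sketch below inspects the JVM input arguments and the `-D` system properties used in the command above. `AgentStartupCheck` is a hypothetical helper; the same options can also come from agent.config or environment variables, so a reported "missing" property is only a hint.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class AgentStartupCheck {

    // Property names taken from the VM options shown above
    static final String[] EXPECTED = {
        "skywalking.agent.service_name",
        "skywalking.collector.backend_service"
    };

    /** True if any JVM argument attaches a skywalking-agent jar. */
    static boolean hasJavaAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-javaagent:") && arg.contains("skywalking-agent")) {
                return true;
            }
        }
        return false;
    }

    /** Expected properties that are absent or blank. */
    static List<String> missingOptions(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("javaagent attached: " + hasJavaAgent(jvmArgs));
        System.out.println("not set via -D: " + missingOptions(System.getProperties()));
    }
}
```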

Usage with Micrometer looks roughly like the following. I have not tried it in practice; interested readers can test it.

<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-micrometer-registry</artifactId>
    <version>${skywalking.version}</version>
</dependency>
@GetMapping("micrometer")
public void micrometer() {
    // List the counters whose rate should be calculated on the agent side
    SkywalkingConfig config = new SkywalkingConfig(Arrays.asList("test_rate_counter"));
    SkywalkingMeterRegistry registry = new SkywalkingMeterRegistry(config);
    io.micrometer.core.instrument.Counter counter = registry.counter("order.count.total","china","beijing");
    counter.increment();

    log.info("Micrometer-{}:{}",registry.getMeters(),counter.measure());
}

UI charts

Edit the UI and add an item. Enter meter_order_new_increase_count as the metric (the one defined on the OAP server above) and select read all values in..

Note: a metric added in the UI must already be defined on the OAP server; otherwise it cannot be added here.

image-20210812161113238.png

III. Log

SkyWalking can collect application logs on the OAP server, so the logs that belong to a request can be viewed within its trace.

  1. Add a logback configuration to the Spring Boot application: logback-spring.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements.  See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License.  You may obtain a copy of the License at
  ~
  ~      http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->
<configuration scan="true" scanPeriod="5 seconds">

    <appender name="stdout" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
                <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <appender name="grpc-log" class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.log.GRPCLogClientAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
                <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <appender name="fileAppender" class="ch.qos.logback.core.FileAppender">
        <file>d:/temp/e2e-service-provider.log</file>
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
                <Pattern>[%sw_ctx] [%level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %logger:%line - %msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="grpc-log"/>
        <appender-ref ref="stdout"/>
    </root>

    <logger name="fileLogger" level="INFO">
        <appender-ref ref="fileAppender"/>
    </logger>
</configuration>
  2. Edit agent.config and add the following settings:
plugin.toolkit.log.grpc.reporter.server_host=${SW_GRPC_LOG_SERVER_HOST:192.168.x.x}
plugin.toolkit.log.grpc.reporter.server_port=${SW_GRPC_LOG_SERVER_PORT:11800}
plugin.toolkit.log.grpc.reporter.max_message_size=${SW_GRPC_LOG_MAX_MESSAGE_SIZE:10485760}
plugin.toolkit.log.grpc.reporter.upstream_timeout=${SW_GRPC_LOG_GRPC_UPSTREAM_TIMEOUT:30}
  3. When the application is accessed, logs appear in SkyWalking.
image-20210812162121233.png
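The stdout pattern in the logback configuration puts the `%X{tid}` field in the first pair of square brackets after the timestamp, which makes it easy to pull the trace field out of console logs and match it against the UI. The sketch below does that with a regular expression. `LogLineSketch` is a hypothetical helper, and the sample line, including its trace ID value, is made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineSketch {

    // Mirrors: %d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+ \\S+) \\[([^\\]]*)\\] \\[([^\\]]*)\\] (\\S+)\\s+(\\S+) -(.*)$");

    /** Returns the content of the [%X{tid}] field, if the line matches the pattern. */
    static Optional<String> traceField(String line) {
        Matcher matcher = LINE.matcher(line);
        return matcher.matches() ? Optional.of(matcher.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample line in the shape produced by the stdout appender
        String sample = "2021-08-12 16:21:21.233 [TID:example-trace-id] "
                + "[http-nio-9000-exec-1] INFO  c.e.demo.DemoController -param:[xxx]";
        System.out.println(traceField(sample).orElse("no trace field"));
    }
}
```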

IV. node-exporter

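SkyWalking also supports importing metrics from Prometheus node-exporter, which makes it possible to monitor operating-system-level metrics. In SkyWalking, metrics of this kind are collected by the OpenTelemetry Collector and received by the OpenTelemetry receiver. Supporting node-exporter therefore takes three setup steps, followed by a check in the UI: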

  1. Start node_exporter on vm01 and vm02:
$ tar -xzvf node_exporter-1.0.1.linux-amd64.tar.gz && cd node_exporter-1.0.1.linux-amd64 
$ nohup ./node_exporter &
  2. Install the OpenTelemetry Collector

    Start an otel-collector with docker-compose:

version: "2"
services:
  # Collector
  otel-collector:
    # Specify the image to start the container from
    image: otel/opentelemetry-collector:0.19.0
    # Set the  otel-collector configfile
    command: ["--config=/etc/otel-collector-config.yaml"]
    # Mapping the configfile to host directory
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "13133:13133" # health_check extension
      - "55678:55678"       # OpenCensus receiver

Edit otel-collector-config.yaml: vm01 and vm02 are the IPs of the machines running node_exporter, and oap must be replaced with the OAP server address.

Note: do not set the logging level to debug, otherwise the logs will fill up the disk.

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 1s
          static_configs:
            - targets: ['vm01:9100']
            - targets: ['vm02:9100']

processors:
  batch:

exporters:
  opencensus:
    endpoint: "oap:11800" # The OAP Server address
    insecure: true
  # Exports data to the console
  logging:
    # Do not make this log level too verbose (e.g. debug), otherwise the disk will fill up
    logLevel: error

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [opencensus,logging]

If you deploy opentelemetry-collector on k8s, refer to the following:

# otel-collector-k8s.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-agent-conf
  labels:
    app: opentelemetry
    component: otel-agent-conf
data:
  otel-agent-config: |
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'otel-collector'
              scrape_interval: 1s
              static_configs:
                - targets: ['vm-1:9100']
                - targets: ['vm-2:9100']     
    
    processors:
      batch:
    
    exporters:
      opencensus:
        endpoint: "oap.skywalking.svc.cluster.local:11800" # The OAP Server address
        insecure: true
      # Exports data to the console  
      #logging:
      #  logLevel: debug
    
    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          processors: [batch]
          exporters: [opencensus]


---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: otel-agent
  labels:
    app: opentelemetry
    component: otel-agent
spec:
  serviceName: otel-agent
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-agent
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-agent
    spec:
      containers:
      - command:
          - "/otelcol"
          - "--config=/conf/otel-agent-config.yaml"
          # Memory Ballast size should be max 1/3 to 1/2 of memory.
          - "--mem-ballast-size-mib=165"
        image: otel/opentelemetry-collector:0.19.0
        name: otel-agent
        resources:
          limits:
            cpu: 500m
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 55679 # ZPages endpoint.
        - containerPort: 4317 # Default OpenTelemetry receiver port.
        - containerPort: 8888  # Metrics.
        volumeMounts:
        - name: otel-agent-config-vol
          mountPath: /conf
        # Do not enable the probes here, otherwise the container exits automatically
        #livenessProbe:
        #  httpGet:
        #    path: /
        #    port: 13133 # Health Check extension default port.
        #readinessProbe:
        #  httpGet:
        #    path: /
        #    port: 13133 # Health Check extension default port.
      volumes:
        - configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
          name: otel-agent-config-vol
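  3. Edit the OAP configuration file config/application.yml to activate the vm rule. The rule files are stored in the otel-oc-rules directory; separate multiple rules with commas. To customize the metrics, edit vm.yaml.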

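    After following the official documentation step by step, nothing showed up in the UI. The catch is that the default selector of receiver-otel is -, so the receiver-otel plugin is not loaded at all. The selector has to be set to default.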

receiver-otel:
  selector: ${SW_OTEL_RECEIVER:default}
  default:
    enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"oc"}
    enabledOcRules: ${SW_OTEL_RECEIVER_ENABLED_OC_RULES:"vm,oap"}
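  4. The VM view in the UI now shows the machines' metrics. The values seem to differ somewhat from the real ones, but I have left that aside for now.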
image-20210812164256777.png
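One way to look into the mismatch mentioned in step 4 is to read the raw node_exporter output and compare it with the UI. node_exporter serves Prometheus text format, so the response body from port 9100 (path /metrics) can be parsed line by line. The sketch below parses such a body that has already been fetched. `NodeExporterTextSketch` is a hypothetical helper, the sample values are made up, and lines carrying an optional trailing timestamp are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NodeExporterTextSketch {

    /** Parses Prometheus text lines into "series -> value", skipping comments. */
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String line : body.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // blank line, or # HELP / # TYPE metadata
            }
            // The value follows the last space; label values may contain spaces
            int lastSpace = trimmed.lastIndexOf(' ');
            if (lastSpace < 0) {
                continue;
            }
            String series = trimmed.substring(0, lastSpace).trim();
            try {
                samples.put(series, Double.parseDouble(trimmed.substring(lastSpace + 1)));
            } catch (NumberFormatException e) {
                // Value not parseable by Double.parseDouble (e.g. "+Inf"); skipped here
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Made-up sample in the shape of a node_exporter response
        String body = "# HELP node_load1 1m load average.\n"
                + "# TYPE node_load1 gauge\n"
                + "node_load1 0.25\n"
                + "node_cpu_seconds_total{cpu=\"0\",mode=\"idle\"} 100.5\n";
        parse(body).forEach((series, value) -> System.out.println(series + " = " + value));
    }
}
```

Counters such as node_cpu_seconds_total are cumulative, so they are expected to differ from a rate or percentage shown in the UI; comparing gauges such as node_load1 is a more direct check.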