Druid

Druid: a detailed look at tasks in Druid

2020-03-02  李小李的路

Overview

Tasks do all ingestion-related work in Druid. For batch ingestion you generally submit tasks directly to the Overlord, while for streaming ingestion tasks are submitted on your behalf by a supervisor.

Task API

Tasks are submitted to the Overlord with a POST to /druid/indexer/v1/task, and task status and logs are available under the same API; see the Druid API reference for the full list of task-related endpoints.

Task reports

Completion report

After a task finishes, a completion report containing row statistics and any parse errors can be retrieved from the Overlord at:

http://<OVERLORD-HOST>:<OVERLORD-PORT>/druid/indexer/v1/task/<task-id>/reports
{
  "ingestionStatsAndErrors": {
    "taskId": "compact_twitter_2018-09-24T18:24:23.920Z",
    "payload": {
      "ingestionState": "COMPLETED",
      "unparseableEvents": {},
      "rowStats": {
        "determinePartitions": {
          "processed": 0,
          "processedWithError": 0,
          "thrownAway": 0,
          "unparseable": 0
        },
        "buildSegments": {
          "processed": 5390324,
          "processedWithError": 0,
          "thrownAway": 0,
          "unparseable": 0
        }
      },
      "errorMsg": null
    },
    "type": "ingestionStatsAndErrors"
  }
}

Live reports

While a task is running, a live report containing ingestion state, unparseable events, and moving-average throughput estimates can be retrieved from the Overlord at:

http://<OVERLORD-HOST>:<OVERLORD-PORT>/druid/indexer/v1/task/<task-id>/reports

and from the MiddleManager worker running the task at:

http://<middlemanager-host>:<worker-port>/druid/worker/v1/chat/<task-id>/liveReports
{
  "ingestionStatsAndErrors": {
    "taskId": "compact_twitter_2018-09-24T18:24:23.920Z",
    "payload": {
      "ingestionState": "RUNNING",
      "unparseableEvents": {},
      "rowStats": {
        "movingAverages": {
          "buildSegments": {
            "5m": {
              "processed": 3.392158326408501,
              "unparseable": 0,
              "thrownAway": 0,
              "processedWithError": 0
            },
            "15m": {
              "processed": 1.736165476881023,
              "unparseable": 0,
              "thrownAway": 0,
              "processedWithError": 0
            },
            "1m": {
              "processed": 4.206417693750045,
              "unparseable": 0,
              "thrownAway": 0,
              "processedWithError": 0
            }
          }
        },
        "totals": {
          "buildSegments": {
            "processed": 1994,
            "processedWithError": 0,
            "thrownAway": 0,
            "unparseable": 0
          }
        }
      },
      "errorMsg": null
    },
    "type": "ingestionStatsAndErrors"
  }
}

Live report metrics

Row stats

While a task is running, its row stats can be retrieved with a GET to the peon running the task:

http://<middlemanager-host>:<worker-port>/druid/worker/v1/chat/<task-id>/rowStats
{
  "movingAverages": {
    "buildSegments": {
      "5m": {
        "processed": 3.392158326408501,
        "unparseable": 0,
        "thrownAway": 0,
        "processedWithError": 0
      },
      "15m": {
        "processed": 1.736165476881023,
        "unparseable": 0,
        "thrownAway": 0,
        "processedWithError": 0
      },
      "1m": {
        "processed": 4.206417693750045,
        "unparseable": 0,
        "thrownAway": 0,
        "processedWithError": 0
      }
    }
  },
  "totals": {
    "buildSegments": {
      "processed": 1994,
      "processedWithError": 0,
      "thrownAway": 0,
      "unparseable": 0
    }
  }
}
For the Kafka and Kinesis indexing services, a combined report of row stats across all tasks managed by a supervisor can be retrieved from the Overlord at:

http://<OVERLORD-HOST>:<OVERLORD-PORT>/druid/indexer/v1/supervisor/<supervisor-id>/stats

Unparseable events

Lists of recently-seen unparseable events can be retrieved from a running task with a GET to the peon:

http://<middlemanager-host>:<worker-port>/druid/worker/v1/chat/<task-id>/unparseableEvents

Task lock system

"Overshadowing" between segments

In short, a segment overshadows another segment covering the same datasource and time chunk when it has a newer version; for example, the segments written by a compaction task overshadow the older segments they were compacted from, and queries are served only from the overshadowing segments once they are loaded.

Locking

Lock priority

The default lock priority depends on the task type:

task type                      default priority
Realtime index task            75
Batch index task               50
Merge/Append/Compaction task   25
Other tasks                    0

You can override the default priority by setting a priority in the task context:
"context" : {
  "priority" : 100
}

Context parameters

The task context is used for various task configuration parameters. The following parameters apply to all task types.

taskLockTimeout
  Default: 300000
  Task lock timeout in milliseconds. For more details, see Locking.

forceTimeChunkLock
  Default: true
  Setting this to false is still experimental. Forces the task to always use a time chunk lock. If not set, each task automatically chooses the lock type to use. If set, it overrides the druid.indexer.tasklock.forceTimeChunkLock configuration for the Overlord. See Locking for more details.

priority
  Default: varies by task type; see Lock priority.
  Task priority.
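As an illustration, a context block combining these parameters might look like the following sketch; the values are arbitrary examples rather than recommendations:

"context" : {
  "taskLockTimeout" : 600000,
  "forceTimeChunkLock" : true,
  "priority" : 50
}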

All task types

index

See Native batch ingestion (simple task).

index_parallel

See Native batch ingestion (parallel task).

index_sub

Submitted automatically, on your behalf, by an index_parallel task.

index_hadoop

See Hadoop-based ingestion.

index_kafka

Submitted automatically, on your behalf, by a
Kafka-based ingestion supervisor.

index_kinesis

Submitted automatically, on your behalf, by a
Kinesis-based ingestion supervisor.

index_realtime

Submitted automatically, on your behalf, by Tranquility.

compact

Compaction tasks merge all segments of the given interval. See the documentation on
compaction for details.
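The compaction task follows a grammar similar to the other tasks on this page. As a rough sketch, using the same placeholder style (the exact field set varies by Druid version, and newer versions specify the interval through an ioConfig instead):

{
    "type": "compact",
    "id": <task_id>,
    "dataSource": <task_datasource>,
    "interval": <interval of segments to compact>,
    "tuningConfig": <optional tuning config>,
    "context": <task context>
}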

kill

Kill tasks delete all metadata about certain segments and remove them from deep storage.
See the documentation on deleting data for details.
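The kill task grammar follows the same placeholder style as the tasks below; a minimal sketch is:

{
    "type": "kill",
    "id": <task_id>,
    "dataSource": <task_datasource>,
    "interval": <interval whose segments will be deleted>
}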

append

Append tasks append a list of segments together into a single segment (one after the other). The grammar is:

{
    "type": "append",
    "id": <task_id>,
    "dataSource": <task_datasource>,
    "segments": <JSON list of DataSegment objects to append>,
    "aggregations": <optional list of aggregators>,
    "context": <task context>
}

merge

Merge tasks merge a list of segments together. Any common timestamps are merged.
If rollup is disabled as part of ingestion, common timestamps are not merged and rows are reordered by their timestamp.

The compact task is often a better choice than the merge task.

The grammar is:

{
    "type": "merge",
    "id": <task_id>,
    "dataSource": <task_datasource>,
    "aggregations": <list of aggregators>,
    "rollup": <whether or not to rollup data during a merge>,
    "segments": <JSON list of DataSegment objects to merge>,
    "context": <task context>
}

same_interval_merge

The same_interval_merge task is a shortcut for the merge task: all segments in the given interval are merged.

The compact task is often a better choice than the same_interval_merge task.

The grammar is:

{
    "type": "same_interval_merge",
    "id": <task_id>,
    "dataSource": <task_datasource>,
    "aggregations": <list of aggregators>,
    "rollup": <whether or not to rollup data during a merge>,
    "interval": <DataSegment objects in this interval are going to be merged>,
    "context": <task context>
}
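For illustration, a filled-in same_interval_merge spec might look like the sketch below; the id, dataSource, aggregator, and interval are placeholder values chosen for this example:

{
    "type": "same_interval_merge",
    "id": "same_interval_merge_wikipedia_2020-03-02T00:00:00.000Z",
    "dataSource": "wikipedia",
    "aggregations": [
        { "type": "count", "name": "count" }
    ],
    "rollup": true,
    "interval": "2017-01-01/2017-01-02"
}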