Monitoring a File with Flume
Official description
Exec source runs a given Unix command on start-up and expects that process to continuously produce data on standard out (stderr is simply discarded, unless property logStdErr is set to true). If the process exits for any reason, the source also exits and will produce no further data. This means configurations such as cat [named pipe] or tail -F [file] are going to produce the desired results, whereas the date command will probably not: the former two commands produce streams of data, whereas the latter produces a single event and exits.
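The difference between a streaming command and a one-shot command can be seen without Flume at all. The following is a minimal sketch, assuming a POSIX shell with mktemp and a tail that supports -F; the temporary files and variable names are purely illustrative and not part of the original setup.

```shell
#!/bin/sh
# One-shot: date writes a single line and exits, so an Exec source
# running it would see one event and then a dead process.
one_shot_lines=$(date | wc -l | tr -d ' ')

# Streaming: tail -F keeps running and emits every appended line.
demo_file=$(mktemp)   # illustrative stand-in for the monitored log
out_file=$(mktemp)    # captures what tail writes to stdout
tail -F "$demo_file" > "$out_file" 2>/dev/null &
tail_pid=$!

sleep 1
echo "first event" >> "$demo_file"
echo "second event" >> "$demo_file"
sleep 2

# Check that tail is still alive after producing output.
if kill -0 "$tail_pid" 2>/dev/null; then
    tail_alive=yes
else
    tail_alive=no
fi
stream_lines=$(wc -l < "$out_file" | tr -d ' ')

# Clean up the background process and the temporary files.
kill "$tail_pid"
rm -f "$demo_file" "$out_file"

echo "date: $one_shot_lines line(s), then exit"
echo "tail -F: $stream_lines line(s) so far, still running: $tail_alive"
```

The date pipeline finishes immediately, while tail -F stays alive until it is killed, which is the behavior the Exec source relies on.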
Overall approach:
exec source + memory channel + logger sink
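Configuration
Writing the conf file
For the source, set the type, command, and shell properties (type and command are required, shell is optional); every other source property can keep its default value.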
exec-memory-logger.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /Users/david/Cores/apache-flume-1.6.0-cdh5.7.0-bin/conf/data.log
a1.sources.r1.shell = /bin/sh -c
# logger sink: prints events to the console
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
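As a variation on the configuration above, the sketch below generates a similar file from a shell function. It swaps in tail -F, which the official description recommends, and turns on logStdErr so that the command's stderr is logged rather than discarded. The function name write_exec_conf and both of its arguments are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write an exec + memory + logger config to OUT_FILE
# with a source that tails LOG_FILE. Both arguments are illustrative.
write_exec_conf() {
    out_file=$1
    log_file=$2
    cat > "$out_file" <<EOF
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = exec
# tail -F (capital F) keeps following the file by name after rotation
a1.sources.r1.command = tail -F $log_file
a1.sources.r1.shell = /bin/sh -c
# log the command's stderr instead of silently discarding it
a1.sources.r1.logStdErr = true

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
}
```

For example, write_exec_conf "$FLUME_HOME/conf/exec-memory-logger.conf" "$FLUME_HOME/conf/data.log" would produce a file equivalent to the one above, apart from the two changed source properties.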
Starting Flume
Start command
flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exec-memory-logger.conf \
-Dflume.root.logger=INFO,console
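Testing
The configuration file names the file to monitor. Append data to that file and watch the Flume console output. If the appended lines show up there as events, the configuration works.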
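One way to drive this test is to append a few numbered lines from a second terminal while the agent is running. The helper below is a small sketch; the function name append_events and the wording of the test lines are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append COUNT numbered test lines to FILE
# (COUNT defaults to 3). Each line should become one Flume event.
append_events() {
    file=$1
    count=${2:-3}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "test event $i" >> "$file"
        i=$((i + 1))
    done
}
```

Calling append_events /Users/david/Cores/apache-flume-1.6.0-cdh5.7.0-bin/conf/data.log 3 against the file named in the configuration should make three new events appear in the agent's console, one per appended line.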