[oVirt Notes] engine-log-collector
2018-05-18
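Preface
As a programmer I need to keep learning. In my spare time I write up my analyses and study notes as blog posts to share with others, and also to keep a record of my own learning.
This article is for study and discussion only and will be removed on request if it infringes any rights.
It is not for commercial use; please credit the source when reposting.
The version analyzed here is oVirt 3.4.5.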
Command usage:
engine-log-collector [options] list       show the host list
engine-log-collector [options] collect    collect the diagnostic report
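Options | Description |
---|---|
--version | Show the program version |
-h,--help | Show the help message |
--quiet | Reduce console output (default false) |
--local-tmp= | Temporary storage directory; by default a randomly named directory is created, e.g. /tmp/LogCultReXCFDV5 |
--ticket-number= | Ticket ID |
--upload= | Upload the report to Red Hat (choose from the list of destinations Red Hat supports) |
--log-file=PATH | Log file path (default /var/log/ovirt-engine/ovirt-log-collector/ovirt-log-collector-yyyyMMddHHmmss.log) |
--conf-file=PATH | Configuration file path (default /etc/ovirt-engine/logcollector.conf) |
--cert-file=PATH | CA certificate used to validate the engine (default /etc/pki/ovirt-engine/ca.pem) |
--insecure | Do not validate the engine (default off) |
--output= | Destination directory in which the report is stored |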
- Engine options: credentials for the engine REST API and filters that restrict log collection to one or more hosts. If --no-hypervisors is set, no data is collected from any host.
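Engine options | Description |
---|---|
--no-hypervisors | Skip collection from hosts (nodes); default false |
-u,--user= | REST API user, e.g. user@engine.example.com; default admin@internal |
-r,--engine= | REST API address of the engine, e.g. localhost:443 |
-c,--cluster= | Cluster filter list (comma-separated cluster names or regular expressions); default None |
-d,--data-center= | Data center filter list (comma-separated data center names or regular expressions); default None |
-H,--hosts= | Host filter list (comma-separated host names, FQDNs, IP addresses or regular expressions); default None |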
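To make the filter semantics concrete, here is a minimal sketch of how comma-separated filters could narrow a host list. It is not the collector's actual code: the function names are made up for illustration, and treating every pattern as a full-match regular expression is an assumption (the real tool may match differently, for example with shell-style wildcards).

```python
import re


def parse_filter(value):
    """Split a comma-separated filter string into a list of patterns."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def matches(name, patterns):
    """Return True when no filter is set or the name fully matches a pattern."""
    if not patterns:
        return True
    return any(re.fullmatch(pattern, name) for pattern in patterns)


def filter_hosts(hosts, datacenters=None, clusters=None, hostnames=None):
    """Keep (datacenter, cluster, host) tuples that pass all three filters."""
    dc_patterns = parse_filter(datacenters)
    cluster_patterns = parse_filter(clusters)
    host_patterns = parse_filter(hostnames)
    return [
        (dc, cluster, host)
        for dc, cluster, host in hosts
        if matches(dc, dc_patterns)
        and matches(cluster, cluster_patterns)
        and matches(host, host_patterns)
    ]


# Hypothetical inventory used only for this illustration.
hosts = [
    ("Default", "Default", "192.168.103.117"),
    ("Default", "Production", "node1.example.com"),
    ("Lab", "Testing", "node2.example.com"),
]
selected = filter_hosts(hosts, clusters="Default,Prod.*")
for row in selected:
    print(" | ".join(row))
```

With no filter at all, every host is returned, which corresponds to the default of None for -c, -d and -H.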
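Connection options | Description |
---|---|
--ssh-port= | SSH port to connect on |
-k,--key-file= | SSH identity file (private key) used to access the file server |
--max-connections= | Maximum number of concurrent connections used to fetch logs (default 10) |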
- PostgreSQL options: connection settings for the database from which related logs are collected. If --no-postgresql is set, the database connection is skipped.
PostgreSQL options | Description |
---|---|
--no-postgresql | Skip collection from the PostgreSQL database |
--pg-user=engine | PostgreSQL user name (default engine) |
--pg-dbname=engine | PostgreSQL database name (default engine) |
--pg-dbhost=localhost | PostgreSQL host address (default localhost) |
--pg-dbport=5432 | PostgreSQL port (default 5432) |
--pg-ssh-user=root | SSH user used to reach a remote PostgreSQL host (default root) |
--pg-host-key=none | Identity file (private key) used to access the PostgreSQL host (not needed when the database is local) |
- The command collects system log information for configuration review and diagnostics.
The command is implemented in Python.
The optparse module
- engine-log-collector uses the optparse module, a standard-library module dedicated to defining and parsing command-line options.
- Code example
    from optparse import OptionParser
    parser = OptionParser(...)
    parser.add_option(...)
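- OptionParser constructor arguments
- None of these arguments is required.
Argument | Description |
---|---|
usage | Usage string printed in help and error messages. |
version | Version string printed when the program is run as %prog --version. |
description | Descriptive text about the program. |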
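- add_option adds a command-line option
Argument | Description |
---|---|
action | Tells optparse what to do when it encounters the option. The default is 'store', which saves the option's value in the options object. Possible values include store, store_true, store_false, store_const, append, count and callback. |
type | Defaults to string; can also be int, float, and so on. |
dest | Name of the attribute on the options object. If dest is not given, it is derived from the option name. |
store_true / store_false | These are values of action, not separate arguments. They handle options that take no value, such as -v or -q, by storing True or False. |
default | Sets the default value. |
help | Help text for the option. |
metavar | Placeholder name shown in the help output for the value the option expects. |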
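- parse_args parses the command-line arguments
- (options, args) = parser.parse_args(): an argument list can be passed to parse_args(); otherwise it defaults to the command-line arguments (sys.argv[1:]).
- parse_args() returns two values:
- options, an object (optparse.Values) that holds the option values. Given the destination name of an option, e.g. file, its value is available as options.file.
- args, a list of the positional arguments left over after option parsing.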
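- When there are many options, they can be grouped
    from optparse import OptionGroup
    group = OptionGroup(parser, "Group title")  # the title argument is required
    group.add_option(...)
    parser.add_option_group(group)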
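The pieces described in this section can be put together in a small runnable sketch. The option names are borrowed from the tables above, but the parser itself is only an illustration and is not the real engine-log-collector parser; the version string, help texts and the dest name hosts_query are assumptions.

```python
from optparse import OptionGroup, OptionParser


def build_parser():
    """Build an illustrative parser with one flag, one typed option and one group."""
    parser = OptionParser(
        usage="%prog [options] list|collect",
        version="%prog 0.1",
        description="Illustrative parser, not the real engine-log-collector one.",
    )
    # store_true: the option takes no value and sets options.quiet to True.
    parser.add_option("--quiet", action="store_true", dest="quiet",
                      default=False, help="reduce console output")
    # type="int": optparse converts the value and rejects non-numeric input.
    parser.add_option("--max-connections", type="int", dest="max_connections",
                      default=10, metavar="N",
                      help="maximum concurrent connections (default %default)")
    # Related options can be grouped so that --help shows them under one title.
    group = OptionGroup(parser, "Engine options",
                        "Filters that select the hosts to collect from.")
    group.add_option("-H", "--hosts", dest="hosts_query", default=None,
                     metavar="LIST", help="comma-separated host filter")
    parser.add_option_group(group)
    return parser


parser = build_parser()
# Passing an explicit list instead of relying on sys.argv[1:].
options, args = parser.parse_args(
    ["--quiet", "--max-connections=4", "-H", "node1,node2", "collect"]
)
print(options.quiet, options.max_connections, options.hosts_query, args)
```

Whatever is left after the options have been consumed (here the collect command) ends up in args.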
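The shutil module
- engine-log-collector uses the shutil module, a high-level module for working with files, directories and archives.
Function | Description |
---|---|
shutil.copyfileobj(fsrc, fdst[, length]) | Copy the contents of one file object to another |
shutil.copyfile(src, dst) | Copy file contents only |
shutil.copy(src, dst) | Copy a file together with its permission bits |
shutil.copy2(src, dst) | Copy a file together with its status information (metadata) |
shutil.copymode(src, dst) | Copy the permission bits only; contents, group and owner are unchanged |
shutil.copystat(src, dst) | Copy the status information only, i.e. file attributes such as mode bits, atime, mtime and flags |
shutil.ignore_patterns(*patterns) | Build an ignore function for copytree that skips names matching the given patterns |
shutil.copytree(src, dst, symlinks=False, ignore=None) | Recursively copy a directory tree |
shutil.rmtree(path[, ignore_errors[, onerror]]) | Recursively delete a directory tree |
shutil.move(src, dst) | Recursively move a file or directory, like the mv command; on the same filesystem this is just a rename |
shutil.make_archive(base_name, format,...) | Create an archive (e.g. zip or tar) and return its path |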
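A short sketch of a few of these functions working together, using a throwaway directory created with tempfile. The file names are invented for the example.

```python
import os
import shutil
import tempfile

# Build a small source tree inside a temporary working directory.
work = tempfile.mkdtemp(prefix="shutil-demo-")
src = os.path.join(work, "src")
os.makedirs(os.path.join(src, "logs"))
with open(os.path.join(src, "logs", "engine.log"), "w") as handle:
    handle.write("sample log line\n")
with open(os.path.join(src, "scratch.tmp"), "w") as handle:
    handle.write("temporary data\n")

# copytree + ignore_patterns: copy recursively but skip *.tmp files.
dst = os.path.join(work, "copy")
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.tmp"))
copied = sorted(os.listdir(dst))

# make_archive: pack the copied tree and get back the archive path.
archive_path = shutil.make_archive(os.path.join(work, "report"), "gztar",
                                   root_dir=dst)
archive_name = os.path.basename(archive_path)
print(copied, archive_name)

# rmtree: remove the whole working directory again.
shutil.rmtree(work)
```

The collector's archive step, shown later in this article, calls shutil.rmtree in the same way to drop its temporary directory once the tarball has been written.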
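The sosreport diagnostic report tool
- sosreport is a tool for generating diagnostic reports, similar to supportconfig. It is written in Python and works on most Linux distributions, including CentOS (as on Red Hat, the package is named sos) and Ubuntu (where the package is named sosreport).
- sosreport is hosted on GitHub at https://github.com/sosreport/sos and is already available in the default repositories of many distributions.
- Red Hat generally also uses the information collected by sosreport for analysis. In releases before RHEL 4.5 the tool was called sysreport. It can be installed with the following command.
yum -y install sos
sosreport usage: sosreport [options]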
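Option | Description |
---|---|
-h,--help | Show the help message |
-l,--list-plugins | List plugins and the available plugin options |
-n,--skip-plugins= | Skip the given plugins |
-e,--enable-plugins= | Enable the given plugins |
-o,--only-plugins= | Enable only the given plugins |
-k | Set plugin options (format plugname.option=value); the available options are shown by -l |
-a,--alloptions | Enable all options of the loaded plugins |
-u,--upload= | Upload the report to an FTP server |
--batch | Do not ask any questions (batch mode) |
--build | Keep the sos tree available and do not package the results |
--no-colors | Do not use terminal colors for text |
--debug | Enable debugging via the Python debugger |
--ticket-number= | Set the ticket ID |
--name= | Set a custom customer name |
--config-file= | Specify an alternate configuration file |
--tmp-dir= | Specify an alternate temporary directory |
--diagnose | Enable diagnostics |
--analyze | Enable analysis |
--report | Enable HTML/XML report generation |
--profile | Turn on profiling |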
[root@localhost ~]# /usr/sbin/sosreport --list-plugins
sosreport (version 2.2)
The following plugins are currently enabled:
acpid acpid related information
activemq ActiveMQ related information
anaconda Anaconda / Installation information
apache Apache related information
auditd Auditd related information
bootloader Bootloader information
cgroups cgroup subsystem information
crontab Crontab information
ctdb Samba CTDB related information
devicemapper device-mapper related information (dm, lvm, multipath)
distupgrade Distribution upgrade information
dovecot dovecot server related information
filesys information on filesystems
foreman Foreman related information
gdm gdm related information
general basic system information
gluster gluster related information
haproxy haproxy information
hardware hardware related information
hpasm HP ASM (hp Server Management Drivers and Agent) information
hts Red Hat Hardware Test Suite related information
i18n i18n related information
ipvs Ipvs information
iscsi iscsi-initiator related information
keepalived Keepalived information
kernel kernel related information
krb5 Samba related information
ldap LDAP related information
libraries information on shared libraries
libvirt libvirt-related information
logrotate logrotate configuration files and debug info
lsbrelease Linux Standard Base information
memory memory usage information
mongodb MongoDB related information
mrggrid MRG GRID related information
mrgmessg MRG Messaging related information
mysql MySQL related information
networking network related information
nfs NFS related information
nfsserver NFS server-related information
ntp NTP related information
openhpi OpenHPI related information
openshift Openshift related information
openssl openssl related information
pam PAM related information
pgsql PostgreSQL related information
postfix mail server related information
postgresql PostgreSQL related information
powerpc IBM Power System related information
printing printing related information (cups)
process process information
psacct Process accounting related information
rpm RPM information
samba Samba related information
selinux selinux related information
ssh ssh-related information
startup startup information
sunrpc Sun RPC related information
system core system related information
tomcat Tomcat related information
udev udev related information
x11 X related information
xen Xen related information
yum yum information
The following plugins are currently disabled:
amd Amd automounter information
autofs autofs server-related information
cloudforms CloudForms related information
cluster cluster suite and GFS related information
cobbler cobbler related information
corosync corosync information
cs Certificate System 7.x Diagnostic Information
dhcp DHCP related information
ds Directory Server information
emc EMC related information (PowerPath, Solutions Enabler CLI and Navisphere CLI)
ftp FTP server related information
infiniband Infiniband related information
initrd initrd related information
ipa IPA diagnostic information
ipsec ipsec related information
iscsitarget iscsi-target related information
kdump Kdump related information
kernel_realtime Information specific to the realtime kernel
kvm KVM related information
named named related information
netdump Netdump Configuration Information
nscd NSCD related information
oddjob oddjob related information
openswan ipsec related information
ovirt oVirt related information
ppp ppp, wvdial and rp-pppoe related information
pxe PXE related information
qpidd Messaging related information
quagga quagga related information
radius radius related information
rhn RHN Satellite related information
rhui Red Hat Update Infrastructure for Cloud Providers
s390 s390 related information
sanitize sanitize specified log files, etc
sanlock sanlock-related information
sar Generate the sar file from /var/log/sa/saXX files
sendmail sendmail information
smartcard Smart Card related information
snmp snmp related information
soundcard Sound card information
squid squid related information
sssd sssd-related Diagnostic Information
systemtap SystemTap information
tftpserver tftpserver related information
veritas veritas related information
vmware VMWare related information
xinetd xinetd information
The following plugin options are available:
apache.log off gathers all apache logs
auditd.logsize 15 max size (MiB) to collect per syslog file
auditd.all_logs off collect all logs regardless of size
devicemapper.lvmdump off collect an lvmdump
devicemapper.lvmdump-am off attempt to collect an lvmdump with advanced options and raw metadata collection
filesys.dumpe2fs off dump full filesystem information
general.syslogsize 15 max size (MiB) to collect per syslog file
general.all_logs off collect all log files defined in syslog.conf
gluster.logsize 5 max log size (MiB) to collect
gluster.all_logs off collect all log files present
kernel.modinfo on gathers information on all kernel modules
libraries.ldconfigv off the name of each directory as it is scanned, and any links that are created.
mysql.dbuser mysql username for database dumps
mysql.dbpass password for database dumps
mysql.dbdump off collect a database dump
mysql.all_logs off collect all MySQL logs
networking.traceroute off collects a traceroute to rhn.redhat.com
openshift.broker off Gathers broker specific files
openshift.node off Gathers node specific files
openshift.gear off Collect information about a specific gear
pgsql.pghome /var/lib/pgsql PostgreSQL server home directory (default=/var/lib/pgsql)
pgsql.username off username for pg_dump (default=postgres)
pgsql.password off password for pg_dump (password visible in process listings)
pgsql.dbname off database name to dump for pg_dump (default=None)
pgsql.dbhost off hostname/IP of the server upon which the DB is running (default=localhost)
pgsql.dbport off database server port number (default=5432)
postgresql.pghome /var/lib/pgsql PostgreSQL server home directory.
postgresql.username postgres username for pg_dump
postgresql.password off password for pg_dump (password visible in process listings)
postgresql.dbname database name to dump for pg_dump
postgresql.dbhost database hostname/IP (do not use unix socket)
postgresql.dbport 5432 database server port number
printing.logsize 5 max size (MiB) to collect per log file
printing.all_logs off collect all cups log files
psacct.all off collect all process accounting files
rpm.rpmq on queries for package information via rpm -q
rpm.rpmva off runs a verify on all packages
selinux.fixfiles off Print incorrect file context labels
selinux.list off List objects and their context
startup.servicestatus off get a status of all running services
yum.yumlist off list repositories and packages
yum.yumdebug off gather yum debugging data
engine-log-collector execution flow
Parse the arguments
    conf = Configuration(parser)
    if not conf.get('pg_pass') and pg_pass:
        conf['pg_pass'] = pg_pass
    collector = LogCollector(conf)
Create the temporary directory and the scratch directory
- The temporary directory contains the log collection (scratch) directory
    if os.path.exists(conf["local_tmp_dir"]):
        if not os.path.isdir(conf["local_tmp_dir"]):
            raise Exception(
                '%s is not a directory.' % (conf["local_tmp_dir"])
            )
    else:
        logging.info(
            "%s does not exist. It will be created." % (
                conf["local_tmp_dir"]
            )
        )
        os.makedirs(conf["local_tmp_dir"])
    conf["local_scratch_dir"] = os.path.join(
        conf["local_tmp_dir"],
        'log-collector-data'
    )
    if not os.path.exists(conf["local_scratch_dir"]):
        os.makedirs(conf["local_scratch_dir"])
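The same preparation step can be written more compactly on Python 3. This is only a sketch under that assumption (the original code targets Python 2.6); prepare_dirs is a hypothetical helper name, and raising NotADirectoryError instead of a bare Exception is a choice made for the example.

```python
import os
import tempfile


def prepare_dirs(local_tmp_dir, scratch_name="log-collector-data"):
    """Ensure the temporary directory and its scratch subdirectory exist."""
    if os.path.exists(local_tmp_dir) and not os.path.isdir(local_tmp_dir):
        raise NotADirectoryError("%s is not a directory." % local_tmp_dir)
    scratch = os.path.join(local_tmp_dir, scratch_name)
    # exist_ok=True makes the call safe when the directories already exist.
    os.makedirs(scratch, exist_ok=True)
    return scratch


base = tempfile.mkdtemp()
scratch_dir = prepare_dirs(os.path.join(base, "collector-tmp"))
print(scratch_dir)
```

Creating the scratch directory with os.makedirs also creates the parent temporary directory, so the two existence checks of the original collapse into one call.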
Run a different method depending on the command
- list: get the host list
- The list is obtained through the REST API; API access is initialized in /usr/lib/python2.6/site-packages/ovirt_log_collector/helper/hypervisors.py.
    def _initialize_api(hostname, username, password, ca, insecure):
        """
        Initialize the oVirt RESTful API
        """
        url = 'https://{hostname}/ovirt-engine/api'.format(
            hostname=hostname,
        )
        api = API(url=url,
                  username=username,
                  password=password,
                  ca_file=ca,
                  validate_cert_chain=not insecure)
        pi = api.get_product_info()
        if pi is not None:
            vrm = '%s.%s.%s' % (
                pi.get_version().get_major(),
                pi.get_version().get_minor(),
                pi.get_version().get_revision()
            )
            logging.debug("API Vendor(%s)\tAPI Version(%s)" % (
                pi.get_vendor(), vrm)
            )
        else:
            api.test(throw_exception=True)
        return api
[root@localhost helper]# engine-log-collector list
This command will collect system configuration and diagnostic
information from this system.
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before
being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip):
Host list (datacenter=None, cluster=None, host=None):
Data Center | Cluster | Hostname/IP Address
Default | Default | 192.168.103.117
- collect: gather the diagnostic report information
- Collect the engine diagnostic report
    def get_engine_data(self):
        logging.info("Gathering oVirt Engine information...")
        collector = ENGINEData(
            "localhost",
            configuration=self.conf
        )
        collector.sosreport()
- Check whether the sosreport tool provides an ovirt plugin, then run the DWH preparation step.
    def __init__(self, hostname, configuration=None, **kwargs):
        super(ENGINEData, self).__init__(hostname, configuration)
        self._plugins = self.caller.call('sosreport --list-plugins')
        if 'ovirt.sensitive_keys' in self._plugins:
            self._engine_plugin = 'ovirt'
        elif 'ovirt-engine.sensitive_keys' in self._plugins:
            self._engine_plugin = 'ovirt-engine'
        elif 'engine.sensitive_keys' in self._plugins:
            self._engine_plugin = 'engine'
        else:
            logging.error('ovirt plugin not found, falling back on default')
            self._engine_plugin = 'ovirt'
        self.dwh_prep()
[root@localhost ~]# /usr/sbin/sosreport --list-plugins | grep ovirt
ovirt oVirt related information
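The plugin detection above is a plain substring search over the text printed by sosreport --list-plugins. The sketch below isolates that logic in a standalone function so it can be tried without sosreport installed; detect_engine_plugin and the sample strings are hypothetical, only the three candidate names and the fallback come from the code shown.

```python
import logging

# Checked in this order because 'engine.sensitive_keys' is also a substring
# of 'ovirt-engine.sensitive_keys'.
CANDIDATE_PLUGINS = ("ovirt", "ovirt-engine", "engine")


def detect_engine_plugin(list_plugins_output, option="sensitive_keys"):
    """Return the first candidate plugin whose option appears in the listing."""
    for name in CANDIDATE_PLUGINS:
        if "%s.%s" % (name, option) in list_plugins_output:
            return name
    logging.error("ovirt plugin not found, falling back on default")
    return "ovirt"


# Hypothetical fragment of a plugin option listing.
sample = "ovirt-engine.sensitive_keys  off  keys to obfuscate"
print(detect_engine_plugin(sample))
```

The order of the checks matters: if engine were tested before ovirt-engine, a listing that only contains ovirt-engine.sensitive_keys would be misdetected as engine.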
- Collect the PostgreSQL diagnostic report
- Collection runs only when the no_postgresql option is not set.
- The database user name, password, host, port and so on are read from the configuration.
- Check whether the diagnostic tool provides the postgresql plugin.
    def get_postgres_data(self):
        if self.conf.get("no_postgresql") is False:
            try:
                try:
                    if not self.conf.get("pg_pass"):
                        self.conf.getpass(
                            "pg_pass",
                            msg="password for the PostgreSQL user, %s, \
    to dump the %s PostgreSQL database instance" %
                            (
                                self.conf.get('pg_user'),
                                self.conf.get('pg_dbname')
                            )
                        )
                    logging.info(
                        "Gathering PostgreSQL the oVirt Engine database and \
    log files from %s..." % (self.conf.get("pg_dbhost"))
                    )
                except Configuration.SkipException:
                    logging.info(
                        "PostgreSQL oVirt Engine database \
    will not be collected."
                    )
                    logging.info(
                        "Gathering PostgreSQL log files from %s..." % (
                            self.conf.get("pg_dbhost")
                        )
                    )
                collector = PostgresData(self.conf.get("pg_dbhost"),
                                         configuration=self.conf)
                collector.sosreport()
            except Exception, e:
                ExitCodes.exit_code = ExitCodes.WARN
                logging.error(
                    "Could not collect PostgreSQL information: %s" % e
                )
        else:
            ExitCodes.exit_code = ExitCodes.NOERR
            logging.info("Skipping postgresql collection...")

    def __init__(self, hostname, configuration=None, **kwargs):
        super(PostgresData, self).__init__(hostname, configuration)
        self._postgres_plugin = 'postgresql'
[root@localhost ~]# sosreport -l | grep postgresql
postgresql PostgreSQL related information
postgresql.pghome /var/lib/pgsql PostgreSQL server home directory.
postgresql.username postgres username for pg_dump
postgresql.password off password for pg_dump (password visible in process listings)
postgresql.dbname database name to dump for pg_dump
postgresql.dbhost database hostname/IP (do not use unix socket)
postgresql.dbport 5432 database server port number
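The plugin options listed above are passed to sosreport with -k in the form plugname.option=value. As a rough sketch of how such a command line could be assembled, the helper below combines the --batch, -o and -k options from the sosreport table; build_sosreport_command is a hypothetical name and the result is not claimed to be the exact command the collector runs. The password option is left out on purpose, since the listing warns that it is visible in process listings.

```python
def build_sosreport_command(plugin, plugin_options, batch=True):
    """Assemble an argument list that runs a single sosreport plugin."""
    command = ["sosreport"]
    if batch:
        command.append("--batch")  # do not ask any questions
    command.extend(["-o", plugin])  # enable only this plugin
    for key in sorted(plugin_options):
        # -k takes plugname.option=value
        command.extend(["-k", "%s.%s=%s" % (plugin, key, plugin_options[key])])
    return command


command = build_sosreport_command(
    "postgresql",
    {"dbname": "engine", "dbhost": "localhost", "dbport": 5432,
     "username": "engine"},
)
print(" ".join(command))
```

Returning a list rather than a single string keeps the arguments separate, so they can be handed to subprocess without shell quoting.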
- Collect the host diagnostic reports
- Collection runs only when the no_hypervisors option is not set.
- The set of hosts to collect from is determined by the engine options of engine-log-collector.
- Hosts are collected in parallel; the maximum number of concurrent connections defaults to 10.
    def get_hypervisor_data(self):
        hosts = self.conf.get("hosts")
        if hosts:
            if not self.conf.get("quiet"):
                # Check if there are more than MAX_WARN_HOSTS_COUNT hosts
                # to collect from
                if len(hosts) >= MAX_WARN_HOSTS_COUNT:
                    logging.warning(
                        _("{number} hypervisors detected. It might take some "
                          "time to collect logs from {number} hypervisors. "
                          "You can use the following filters -c, -d, -H. "
                          "For more information use -h".format(
                              number=len(hosts),
                          ))
                    )
                    _continue = \
                        get_from_prompt(msg="Do you want to proceed(Y/n)",
                                        default='y')
                    if _continue not in ('Y', 'y'):
                        logging.info(
                            _("Aborting hypervisor collection...")
                        )
                        return
                else:
                    continue_ = get_from_prompt(
                        msg="About to collect information from "
                            "{len} hypervisors. Continue? (Y/n): ".format(
                                len=len(hosts),
                            ),
                        default='y'
                    )
                    if continue_ not in ('y', 'Y'):
                        logging.info("Aborting hypervisor collection...")
                        return
            logging.info("Gathering information from selected hypervisors...")
            max_connections = self.conf.get("max_connections", 10)
            import threading
            from collections import deque
            # max_connections may be defined as a string via a .rc file
            sem = threading.Semaphore(int(max_connections))
            time_diff_queue = deque()
            threads = []
            for datacenter, cluster, host in hosts:
                sem.acquire(True)
                collector = HyperVisorData(
                    host.strip(),
                    configuration=self.conf,
                    semaphore=sem,
                    queue=time_diff_queue,
                    gluster_enabled=cluster.gluster_enabled
                )
                thread = threading.Thread(target=collector.run)
                thread.start()
                threads.append(thread)
            for thread in threads:
                thread.join()
            self.write_time_diff(time_diff_queue)
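The concurrency pattern in get_hypervisor_data is a semaphore-bounded thread pool: the main loop acquires a slot before starting each thread, so at most max_connections collectors run at once. The sketch below reproduces the pattern with a fake collector. It assumes that each worker releases the semaphore when it finishes (the original passes the semaphore into HyperVisorData, whose run method is not shown here), and collect_all and fake_collect are hypothetical names.

```python
import threading
import time
from collections import deque


def collect_all(hosts, collect_one, max_connections=10):
    """Run collect_one(host) for every host, at most max_connections at a time."""
    # int() mirrors the original, where the value may arrive as a string.
    sem = threading.Semaphore(int(max_connections))
    results = deque()  # deque.append is thread-safe
    threads = []

    def worker(host):
        try:
            results.append((host, collect_one(host)))
        finally:
            sem.release()  # free the slot even if collection fails

    for host in hosts:
        sem.acquire()  # blocks while max_connections workers are running
        thread = threading.Thread(target=worker, args=(host,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return list(results)


# Fake collector that records how many calls overlap.
state = {"active": 0, "peak": 0}
state_lock = threading.Lock()


def fake_collect(host):
    with state_lock:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # stand-in for the real SSH collection
    with state_lock:
        state["active"] -= 1
    return "ok"


host_names = ["host%d" % i for i in range(8)]
results = collect_all(host_names, fake_collect, max_connections="3")
print(len(results), "hosts collected, peak concurrency", state["peak"])
```

Acquiring the semaphore in the main loop, rather than inside the worker, also limits how many threads exist at any moment, not just how many are doing work.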
- Consolidate the collected diagnostic reports and compress them.
    def archive(self):
        """
        Create a single tarball with collected data from engine, postgresql
        and all hypervisors.
        """
        print _('Creating compressed archive...')
        report_file_ext = 'bz2'
        compressor = 'bzip2'
        caller = Caller({})
        try:
            caller.call('xz --version')
            report_file_ext = 'xz'
            compressor = 'xz'
        except Exception:
            logging.debug('xz compression not available')
        if not os.path.exists(self.conf["output"]):
            os.makedirs(self.conf["output"])
        self.conf["path"] = os.path.join(
            self.conf["output"],
            "sosreport-%s-%s.tar.%s" % (
                'LogCollector',
                time.strftime("%Y%m%d%H%M%S"),
                report_file_ext
            )
        )
        if self.conf["ticket_number"]:
            self.conf["path"] = os.path.join(
                self.conf["output"],
                "sosreport-%s-%s-%s.tar.%s" % (
                    'LogCollector',
                    self.conf["ticket_number"],
                    time.strftime("%Y%m%d%H%M%S"),
                    report_file_ext
                )
            )
        config = {
            'report': os.path.splitext(self.conf['path'])[0],
            'compressed_report': self.conf['path'],
            'compressor': compressor,
            'directory': self.conf["local_tmp_dir"],
        }
        caller.configuration = config
        caller.call("tar -cf '%(report)s' -C '%(directory)s' .")
        shutil.rmtree(self.conf["local_tmp_dir"])
        caller.call("%(compressor)s -1 '%(report)s'")
        os.chmod(self.conf["path"], stat.S_IRUSR | stat.S_IWUSR)
        md5_out = caller.call("md5sum '%(compressed_report)s'")
        checksum = md5_out.split()[0]
        with open("%s.md5" % self.conf["path"], 'w') as checksum_file:
            checksum_file.write(md5_out)
        msg = ''
        if os.path.exists(self.conf["path"]):
            archiveSize = float(os.path.getsize(self.conf["path"])) / (1 << 20)
            size = '%.1fM' % archiveSize
            msg = _(
                'Log files have been collected and placed in {path}.\n'
                'The MD5 for this file is {checksum} and its size is {size}'
            ).format(
                path=self.conf["path"],
                size=size,
                checksum=checksum,
            )
            if archiveSize >= 1000:
                msg += _(
                    '\nYou can use the following filters in the next '
                    'execution -c, -d, -H to reduce the archive size.'
                )
        return msg
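For comparison, the same archive step can be sketched in pure Python 3 without shelling out to tar, xz and md5sum. This is not how the collector does it: tarfile and hashlib are swapped in for the external commands, archive_directory is a hypothetical helper, and the xz-then-bzip2 fallback is approximated by checking whether the lzma module can be imported.

```python
import hashlib
import os
import stat
import tarfile
import tempfile
import time

try:
    import lzma  # noqa: F401  (only used to detect xz support)
    EXT, MODE = "xz", "w:xz"
except ImportError:
    EXT, MODE = "bz2", "w:bz2"  # fall back to bzip2, as the original does


def archive_directory(source_dir, output_dir, ticket_number=None):
    """Pack source_dir into a compressed tarball and write an .md5 file."""
    os.makedirs(output_dir, exist_ok=True)
    parts = ["sosreport", "LogCollector"]
    if ticket_number:
        parts.append(str(ticket_number))
    parts.append(time.strftime("%Y%m%d%H%M%S"))
    path = os.path.join(output_dir, "-".join(parts) + ".tar." + EXT)

    with tarfile.open(path, MODE) as tar:
        tar.add(source_dir, arcname=".")
    # Owner read/write only, like the os.chmod call in the original.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            md5.update(chunk)
    checksum = md5.hexdigest()
    with open(path + ".md5", "w") as handle:
        handle.write("%s  %s\n" % (checksum, path))

    size_mib = os.path.getsize(path) / float(1 << 20)
    return path, checksum, size_mib


work = tempfile.mkdtemp()
data_dir = os.path.join(work, "data")
os.makedirs(data_dir)
with open(os.path.join(data_dir, "engine.log"), "w") as handle:
    handle.write("sample log line\n")

path, checksum, size_mib = archive_directory(
    data_dir, os.path.join(work, "out"), ticket_number="12345"
)
print("%s (%.1fM), MD5 %s" % (path, size_mib, checksum))
```

The file name follows the pattern used in the original, sosreport-LogCollector-[ticket-]timestamp.tar.ext, and the .md5 file uses the same two-column layout that md5sum prints.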