[oVirt Notes] engine-log-collector

2018-05-18  Posted by 58bc06151329

Preface

As one of the many programmers out there, I need to keep learning. In my spare time I write up my analysis summaries and study notes as blog posts, both to exchange ideas with others and to record my own learning journey.

This article is for learning and exchange only; any infringing content will be removed on request.
It is not for commercial use. Please credit the source when reposting.

The version analyzed here is oVirt 3.4.5.

Command usage:
engine-log-collector [options] list     List the hosts
engine-log-collector [options] collect  Collect a diagnostic report

Option group  Description
--version  Show the program's version number
-h, --help  Show the help message
--quiet  Reduce console output (default: false)
--local-tmp=  Temporary storage directory; generated randomly, e.g. /tmp/LogCultReXCFDV5
--ticket-number=  Ticket ID
--upload=  Upload the report to Red Hat (for Red Hat support cases)
--log-file=PATH  Log file path (default: /var/log/ovirt-engine/ovirt-log-collector/ovirt-log-collector-yyyyMMddHHmmss.log)
--conf-file=PATH  Configuration file path (default: /etc/ovirt-engine/logcollector.conf)
--cert-file=PATH  CA certificate used to validate the engine (default: /etc/pki/ovirt-engine/ca.pem)
--insecure  Do not validate the engine (default: off)
--output=  Destination directory for the generated report
Engine configuration group  Description
--no-hypervisors  Skip collection from the hosts (Nodes); default: false
-u, --user=  REST API user, e.g. user@engine.example.com; default: admin@internal
-r, --engine=  REST API address, e.g. localhost:443
-c, --cluster=  Cluster filter list (comma-separated cluster names or regexes); default: None
-d, --data-center=  Data center filter list (comma-separated data center names or regexes); default: None
-H, --hosts=  Host filter list (comma-separated hostnames, FQDNs, IP addresses or regexes); default: None
Connection configuration group  Description
--ssh-port=  Port to use for SSH connections
-k, --key-file=  SSH identity file (private key) used to access the file server
--max-connections=  Maximum number of concurrent connections for fetching logs (default: 10)
PostgreSQL database configuration group  Description
--no-postgresql  Skip collection of the PostgreSQL database
--pg-user=engine  PostgreSQL database user name (default: engine)
--pg-dbname=engine  PostgreSQL database name (default: engine)
--pg-dbhost=localhost  PostgreSQL database host (default: localhost)
--pg-dbport=5432  PostgreSQL database port (default: 5432)
--pg-ssh-user=root  SSH user for connecting to the remote PostgreSQL database host (default: root)
--pg-host-key=none  Identity file (private key) for accessing the PostgreSQL database host (not needed when it is the local machine)

The command is implemented in Python.

The optparse module

from optparse import OptionParser
parser = OptionParser(...)
parser.add_option(...)
Parameter  Description
usage  The usage message printed in help output.
version  The version string printed for %prog --version.
description  A description of the program.
Parameter  Description
action  Tells optparse what to do when it sees this option. The default is 'store', which saves the command-line value on the options object. Valid values are store, store_true, store_false, store_const, append, count and callback.
type  The value type; string by default, but may also be int, float, etc.
dest  If dest is not given, the option name itself is used to store and retrieve the value on the options object.
store  store has the store_true and store_false variants, used for flags that take no value, such as -v or -q.
default  The default value for the option.
help  The help text for the option.
metavar  A hint to the user about the expected argument.
group = OptionGroup(parser)
group.add_option()
parser.add_option_group(group)
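
Putting the pieces above together, here is a minimal self-contained sketch; the option names are modeled on the engine-log-collector options documented earlier, but this is an illustration, not the tool's actual parser:

```python
from optparse import OptionGroup, OptionParser

# Illustrative options, modeled on the engine-log-collector style above.
parser = OptionParser(
    usage="usage: %prog [options] list|collect",
    version="%prog 3.4.5",
    description="Collect oVirt Engine diagnostic logs."
)
parser.add_option("--quiet", action="store_true", dest="quiet",
                  default=False, help="reduce console output")
parser.add_option("--max-connections", type="int", dest="max_connections",
                  default=10, metavar="N",
                  help="maximum concurrent connections (default: %default)")

group = OptionGroup(parser, "PostgreSQL Configuration")
group.add_option("--pg-user", dest="pg_user", default="engine",
                 help="PostgreSQL user (default: %default)")
parser.add_option_group(group)

options, args = parser.parse_args(["--quiet", "--max-connections", "5", "list"])
print(options.quiet, options.max_connections, options.pg_user, args)
# → True 5 engine ['list']
```

Positional arguments that optparse does not recognize as options (here, the `list` subcommand) come back in `args`, which is how the tool can dispatch on `list` vs `collect`.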

The shutil module

Function  Description
shutil.copyfileobj(fsrc, fdst[, length])  Copy the contents of one file object to another
shutil.copyfile(src, dst)  Copy a file's contents
shutil.copy(src, dst)  Copy a file and its permission bits
shutil.copy2(src, dst)  Copy a file and its status information (metadata)
shutil.copymode(src, dst)  Copy only the permission bits; contents, owner and group are unchanged
shutil.copystat(src, dst)  Copy only the status information, i.e. file attributes: mode bits, atime, mtime, flags
shutil.ignore_patterns(*patterns)  Build a filter for selectively excluding files when copying
shutil.copytree(src, dst, symlinks=False, ignore=None)  Recursively copy a directory tree
shutil.rmtree(path[, ignore_errors[, onerror]])  Recursively delete a directory tree
shutil.move(src, dst)  Recursively move a file or directory; similar to the mv command (essentially a rename)
shutil.make_archive(base_name, format, ...)  Create an archive (e.g. zip or tar) and return its path
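
A quick sketch exercising a few of these calls in a throwaway directory (all file and directory names below are made up for the demo):

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
src = os.path.join(base, "src")
os.makedirs(os.path.join(src, "logs"))
with open(os.path.join(src, "logs", "engine.log"), "w") as f:
    f.write("sample log line\n")
with open(os.path.join(src, "logs", "engine.tmp"), "w") as f:
    f.write("scratch\n")

# Recursively copy the tree, skipping *.tmp files via ignore_patterns.
dst = os.path.join(base, "dst")
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.tmp"))
copied = sorted(os.listdir(os.path.join(dst, "logs")))
print(copied)  # ['engine.log']

# Pack the copy into a tarball, then delete the whole working area.
archive = shutil.make_archive(os.path.join(base, "report"), "gztar", dst)
print(archive.endswith(".tar.gz"))  # True
shutil.rmtree(base)
```

This copy-then-archive-then-rmtree sequence is essentially the shape of what the collector does with its scratch directory later on.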

The sosreport diagnostic tool

yum -y install sos

sosreport usage: Usage: sosreport [options]

Option  Description
-h, --help  Show the help message
-l, --list-plugins  List plugins and available plugin options
-n, --skip-plugins=  Disable the listed plugins
-e, --enable-plugins=  Enable the listed plugins
-o, --only-plugins=  Enable only the listed plugins
-k  Set a plugin option (plugname.option=value format); available options can also be seen with -l
-a, --alloptions  Enable all options for loaded plugins
-u, --upload=  Upload the report to an FTP server
--batch  Do not ask any questions (batch mode)
--build  Keep the sos tree available and do not package the results
--no-colors  Do not use terminal text colors
--debug  Enable debugging through the Python debugger
--ticket-number=  Set the ticket ID
--name=  Specify a customer name
--config-file=  Specify an alternate configuration file
--tmp-dir=  Specify an alternate temporary directory
--diagnose  Enable diagnostics
--analyze  Enable analysis
--report  Enable HTML/XML report generation
--profile  Turn on profiling
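
Since sosreport is driven entirely by flags, a non-interactive invocation can be assembled as a plain argument list. The helper below is a hypothetical sketch (it is not part of sos or ovirt-log-collector) using only the options documented above:

```python
def build_sosreport_cmd(only_plugins=None, plugin_opts=None,
                        ticket_number=None, tmp_dir=None):
    """Build a batch-mode sosreport command line (hypothetical helper)."""
    cmd = ["/usr/sbin/sosreport", "--batch"]
    if only_plugins:
        cmd += ["-o", ",".join(only_plugins)]
    for opt in (plugin_opts or []):  # e.g. "postgresql.dbname=engine"
        cmd += ["-k", opt]
    if ticket_number:
        cmd.append("--ticket-number=%s" % ticket_number)
    if tmp_dir:
        cmd.append("--tmp-dir=%s" % tmp_dir)
    return cmd

print(build_sosreport_cmd(only_plugins=["postgresql"],
                          plugin_opts=["postgresql.dbname=engine"],
                          ticket_number="12345"))
```

A list like this could then be handed to subprocess on a machine where sos is installed; engine-log-collector itself builds comparable command strings for its Caller helper.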
[root@localhost ~]# /usr/sbin/sosreport --list-plugins

sosreport (version 2.2)

The following plugins are currently enabled:

 acpid           acpid related information
 activemq        ActiveMQ related information
 anaconda        Anaconda / Installation information
 apache          Apache related information
 auditd          Auditd related information
 bootloader      Bootloader information
 cgroups         cgroup subsystem information
 crontab         Crontab information
 ctdb            Samba CTDB related information
 devicemapper    device-mapper related information (dm, lvm, multipath)
 distupgrade     Distribution upgrade information
 dovecot         dovecot server related information
 filesys         information on filesystems
 foreman         Foreman related information
 gdm             gdm related information
 general         basic system information
 gluster         gluster related information
 haproxy         haproxy information
 hardware        hardware related information
 hpasm           HP ASM (hp Server Management Drivers and Agent) information
 hts             Red Hat Hardware Test Suite related information
 i18n            i18n related information
 ipvs            Ipvs information
 iscsi           iscsi-initiator related information
 keepalived      Keepalived information
 kernel          kernel related information
 krb5            Samba related information
 ldap            LDAP related information
 libraries       information on shared libraries
 libvirt         libvirt-related information
 logrotate       logrotate configuration files and debug info
 lsbrelease      Linux Standard Base information
 memory          memory usage information
 mongodb         MongoDB related information
 mrggrid         MRG GRID related information
 mrgmessg        MRG Messaging related information
 mysql           MySQL related information
 networking      network related information
 nfs             NFS related information
 nfsserver       NFS server-related information
 ntp             NTP related information
 openhpi         OpenHPI related information
 openshift       Openshift related information
 openssl         openssl related information
 pam             PAM related information
 pgsql           PostgreSQL related information
 postfix         mail server related information
 postgresql      PostgreSQL related information
 powerpc         IBM Power System related information
 printing        printing related information (cups)
 process         process information
 psacct          Process accounting related information
 rpm             RPM information
 samba           Samba related information
 selinux         selinux related information
 ssh             ssh-related information
 startup         startup information
 sunrpc          Sun RPC related information
 system          core system related information
 tomcat          Tomcat related information
 udev            udev related information
 x11             X related information
 xen             Xen related information
 yum             yum information

The following plugins are currently disabled:

 amd               Amd automounter information
 autofs            autofs server-related information
 cloudforms        CloudForms related information
 cluster           cluster suite and GFS related information
 cobbler           cobbler related information
 corosync          corosync information
 cs                Certificate System 7.x Diagnostic Information
 dhcp              DHCP related information
 ds                Directory Server information
 emc               EMC related information (PowerPath, Solutions Enabler CLI and Navisphere CLI)
 ftp               FTP server related information
 infiniband        Infiniband related information
 initrd            initrd related information
 ipa               IPA diagnostic information
 ipsec             ipsec related information
 iscsitarget       iscsi-target related information
 kdump             Kdump related information
 kernel_realtime   Information specific to the realtime kernel
 kvm               KVM related information
 named             named related information
 netdump           Netdump Configuration Information
 nscd              NSCD related information
 oddjob            oddjob related information
 openswan          ipsec related information
 ovirt             oVirt related information
 ppp               ppp, wvdial and rp-pppoe related information
 pxe               PXE related information
 qpidd             Messaging related information
 quagga            quagga related information
 radius            radius related information
 rhn               RHN Satellite related information
 rhui              Red Hat Update Infrastructure for Cloud Providers
 s390              s390 related information
 sanitize          sanitize specified log files, etc
 sanlock           sanlock-related information
 sar               Generate the sar file from /var/log/sa/saXX files
 sendmail          sendmail information
 smartcard         Smart Card related information
 snmp              snmp related information
 soundcard         Sound card information
 squid             squid related information
 sssd              sssd-related Diagnostic Information
 systemtap         SystemTap information
 tftpserver        tftpserver related information
 veritas           veritas related information
 vmware            VMWare related information
 xinetd            xinetd information

The following plugin options are available:

 apache.log            off gathers all apache logs
 auditd.logsize        15 max size (MiB) to collect per syslog file
 auditd.all_logs       off collect all logs regardless of size
 devicemapper.lvmdump  off collect an lvmdump
 devicemapper.lvmdump-am off attempt to collect an lvmdump with advanced options and raw metadata collection
 filesys.dumpe2fs      off dump full filesystem information
 general.syslogsize    15 max size (MiB) to collect per syslog file
 general.all_logs      off collect all log files defined in syslog.conf
 gluster.logsize       5 max log size (MiB) to collect
 gluster.all_logs      off collect all log files present
 kernel.modinfo        on gathers information on all kernel modules
 libraries.ldconfigv   off the name of each directory as it is scanned, and any links that are created.
 mysql.dbuser          mysql username for database dumps
 mysql.dbpass                password for database dumps
 mysql.dbdump          off collect a database dump
 mysql.all_logs        off collect all MySQL logs
 networking.traceroute off collects a traceroute to rhn.redhat.com
 openshift.broker      off Gathers broker specific files
 openshift.node        off Gathers node specific files
 openshift.gear        off Collect information about a specific gear
 pgsql.pghome          /var/lib/pgsql PostgreSQL server home directory (default=/var/lib/pgsql)
 pgsql.username        off username for pg_dump (default=postgres)
 pgsql.password        off password for pg_dump (password visible in process listings)
 pgsql.dbname          off database name to dump for pg_dump (default=None)
 pgsql.dbhost          off hostname/IP of the server upon which the DB is running (default=localhost)
 pgsql.dbport          off database server port number (default=5432)
 postgresql.pghome     /var/lib/pgsql PostgreSQL server home directory.
 postgresql.username   postgres username for pg_dump
 postgresql.password   off password for pg_dump (password visible in process listings)
 postgresql.dbname           database name to dump for pg_dump
 postgresql.dbhost           database hostname/IP (do not use unix socket)
 postgresql.dbport     5432  database server port number
 printing.logsize      5 max size (MiB) to collect per log file
 printing.all_logs     off collect all cups log files
 psacct.all            off collect all process accounting files
 rpm.rpmq              on queries for package information via rpm -q
 rpm.rpmva             off runs a verify on all packages
 selinux.fixfiles      off Print incorrect file context labels
 selinux.list          off List objects and their context
 startup.servicestatus off get a status of all running services
 yum.yumlist           off list repositories and packages
 yum.yumdebug          off gather yum debugging data

Execution flow of the engine-log-collector command

Parse the arguments

conf = Configuration(parser)
if not conf.get('pg_pass') and pg_pass:
    conf['pg_pass'] = pg_pass
collector = LogCollector(conf)

Create the temporary and scratch directories

if os.path.exists(conf["local_tmp_dir"]):
    if not os.path.isdir(conf["local_tmp_dir"]):
        raise Exception(
            '%s is not a directory.' % (conf["local_tmp_dir"])
        )
else:
    logging.info(
        "%s does not exist.  It will be created." % (
            conf["local_tmp_dir"]
        )
    )
    os.makedirs(conf["local_tmp_dir"])

conf["local_scratch_dir"] = os.path.join(
    conf["local_tmp_dir"],
    'log-collector-data'
)

if not os.path.exists(conf["local_scratch_dir"]):
    os.makedirs(conf["local_scratch_dir"])

Depending on the subcommand, different methods are then executed.

def _initialize_api(hostname, username, password, ca, insecure):
    """
    Initialize the oVirt RESTful API
    """
    url = 'https://{hostname}/ovirt-engine/api'.format(
        hostname=hostname,
    )
    api = API(url=url,
              username=username,
              password=password,
              ca_file=ca,
              validate_cert_chain=not insecure)
    pi = api.get_product_info()
    if pi is not None:
        vrm = '%s.%s.%s' % (
            pi.get_version().get_major(),
            pi.get_version().get_minor(),
            pi.get_version().get_revision()
        )
        logging.debug("API Vendor(%s)\tAPI Version(%s)" % (
            pi.get_vendor(), vrm)
        )
    else:
        api.test(throw_exception=True)
    return api
[root@localhost helper]# engine-log-collector list
This command will collect system configuration and diagnostic
information from this system.
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before
being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip): 
Host list (datacenter=None, cluster=None, host=None):
Data Center          | Cluster              | Hostname/IP Address
Default              | Default              | 192.168.103.117
def get_engine_data(self):
    logging.info("Gathering oVirt Engine information...")
    collector = ENGINEData(
        "localhost",
        configuration=self.conf
    )
    collector.sosreport()

def __init__(self, hostname, configuration=None, **kwargs):
    super(ENGINEData, self).__init__(hostname, configuration)
    self._plugins = self.caller.call('sosreport --list-plugins')
    if 'ovirt.sensitive_keys' in self._plugins:
        self._engine_plugin = 'ovirt'
    elif 'ovirt-engine.sensitive_keys' in self._plugins:
        self._engine_plugin = 'ovirt-engine'
    elif 'engine.sensitive_keys' in self._plugins:
        self._engine_plugin = 'engine'
    else:
        logging.error('ovirt plugin not found, falling back on default')
        self._engine_plugin = 'ovirt'
    self.dwh_prep()
[root@localhost ~]# /usr/sbin/sosreport --list-plugins | grep ovirt
 ovirt             oVirt related information
def get_postgres_data(self):
    if self.conf.get("no_postgresql") is False:
        try:
            try:
                if not self.conf.get("pg_pass"):
                    self.conf.getpass(
                        "pg_pass",
                        msg="password for the PostgreSQL user, %s, \
to dump the %s PostgreSQL database instance" % (
                            self.conf.get('pg_user'),
                            self.conf.get('pg_dbname')
                        )
                    )
                logging.info(
                    "Gathering PostgreSQL the oVirt Engine database and \
log files from %s..." % (self.conf.get("pg_dbhost"))
                )
            except Configuration.SkipException:
                logging.info(
                    "PostgreSQL oVirt Engine database \
will not be collected."
                )
                logging.info(
                    "Gathering PostgreSQL log files from %s..." % (
                        self.conf.get("pg_dbhost")
                    )
                )

            collector = PostgresData(self.conf.get("pg_dbhost"),
                                     configuration=self.conf)
            collector.sosreport()
        except Exception, e:
            ExitCodes.exit_code = ExitCodes.WARN
            logging.error(
                "Could not collect PostgreSQL information: %s" % e
            )
    else:
        ExitCodes.exit_code = ExitCodes.NOERR
        logging.info("Skipping postgresql collection...")

def __init__(self, hostname, configuration=None, **kwargs):
    super(PostgresData, self).__init__(hostname, configuration)
    self._postgres_plugin = 'postgresql'
[root@localhost ~]# sosreport -l | grep postgresql
 postgresql      PostgreSQL related information
 postgresql.pghome     /var/lib/pgsql PostgreSQL server home directory.
 postgresql.username   postgres username for pg_dump
 postgresql.password   off password for pg_dump (password visible in process listings)
 postgresql.dbname           database name to dump for pg_dump
 postgresql.dbhost           database hostname/IP (do not use unix socket)
 postgresql.dbport     5432  database server port number
def get_hypervisor_data(self):
    hosts = self.conf.get("hosts")

    if hosts:
        if not self.conf.get("quiet"):
            # Check if there are more than MAX_WARN_HOSTS_COUNT hosts
            # to collect from
            if len(hosts) >= MAX_WARN_HOSTS_COUNT:
                logging.warning(
                    _("{number} hypervisors detected. It might take some "
                      "time to collect logs from {number} hypervisors. "
                      "You can use the following filters -c, -d, -H. "
                      "For more information use -h".format(
                          number=len(hosts),
                      ))
                )
                _continue = \
                    get_from_prompt(msg="Do you want to proceed(Y/n)",
                                    default='y')
                if _continue not in ('Y', 'y'):
                    logging.info(
                        _("Aborting hypervisor collection...")
                    )
                    return
            else:
                continue_ = get_from_prompt(
                    msg="About to collect information from "
                        "{len} hypervisors. Continue? (Y/n): ".format(
                            len=len(hosts),
                        ),
                    default='y'
                )

                if continue_ not in ('y', 'Y'):
                    logging.info("Aborting hypervisor collection...")
                    return

        logging.info("Gathering information from selected hypervisors...")

        max_connections = self.conf.get("max_connections", 10)

        import threading
        from collections import deque

        # max_connections may be defined as a string via a .rc file
        sem = threading.Semaphore(int(max_connections))
        time_diff_queue = deque()

        threads = []

        for datacenter, cluster, host in hosts:
            sem.acquire(True)
            collector = HyperVisorData(
                host.strip(),
                configuration=self.conf,
                semaphore=sem,
                queue=time_diff_queue,
                gluster_enabled=cluster.gluster_enabled
            )
            thread = threading.Thread(target=collector.run)
            thread.start()
            threads.append(thread)

        for thread in threads:
            thread.join()

        self.write_time_diff(time_diff_queue)
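
The semaphore pattern above, which bounds how many collector threads run at once, can be isolated into a small sketch. The worker here is a stand-in for HyperVisorData.run (which in the real code receives the semaphore and releases it when done):

```python
import threading
from collections import deque

MAX_CONNECTIONS = 3  # stands in for conf["max_connections"]
sem = threading.Semaphore(MAX_CONNECTIONS)
results = deque()    # deque.append is thread-safe, like time_diff_queue above

def collect(host):
    # Stand-in for HyperVisorData.run(): do the work, then release the
    # slot so the main loop can start another thread.
    try:
        results.append(host)
    finally:
        sem.release()

threads = []
for host in ["host%d" % i for i in range(10)]:
    sem.acquire(True)  # blocks once MAX_CONNECTIONS workers are in flight
    t = threading.Thread(target=collect, args=(host,))
    t.start()
    threads.append(t)

for t in threads:
    t.join()

print(len(results))  # 10
```

Acquiring in the spawning loop and releasing in the worker means at most MAX_CONNECTIONS threads are ever alive, without needing a full thread-pool implementation.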

    def archive(self):
        """
        Create a single tarball with collected data from engine, postgresql
        and all hypervisors.
        """
        print _('Creating compressed archive...')
        report_file_ext = 'bz2'
        compressor = 'bzip2'
        caller = Caller({})
        try:
            caller.call('xz --version')
            report_file_ext = 'xz'
            compressor = 'xz'
        except Exception:
            logging.debug('xz compression not available')

        if not os.path.exists(self.conf["output"]):
            os.makedirs(self.conf["output"])

        self.conf["path"] = os.path.join(
            self.conf["output"],
            "sosreport-%s-%s.tar.%s" % (
                'LogCollector',
                time.strftime("%Y%m%d%H%M%S"),
                report_file_ext
            )
        )

        if self.conf["ticket_number"]:
            self.conf["path"] = os.path.join(
                self.conf["output"],
                "sosreport-%s-%s-%s.tar.%s" % (
                    'LogCollector',
                    self.conf["ticket_number"],
                    time.strftime("%Y%m%d%H%M%S"),
                    report_file_ext
                )
            )

        config = {
            'report': os.path.splitext(self.conf['path'])[0],
            'compressed_report': self.conf['path'],
            'compressor': compressor,
            'directory': self.conf["local_tmp_dir"],
        }
        caller.configuration = config
        caller.call("tar -cf '%(report)s' -C '%(directory)s' .")
        shutil.rmtree(self.conf["local_tmp_dir"])
        caller.call("%(compressor)s -1 '%(report)s'")
        os.chmod(self.conf["path"], stat.S_IRUSR | stat.S_IWUSR)
        md5_out = caller.call("md5sum '%(compressed_report)s'")
        checksum = md5_out.split()[0]
        with open("%s.md5" % self.conf["path"], 'w') as checksum_file:
            checksum_file.write(md5_out)

        msg = ''
        if os.path.exists(self.conf["path"]):
            archiveSize = float(os.path.getsize(self.conf["path"])) / (1 << 20)

            size = '%.1fM' % archiveSize

            msg = _(
                'Log files have been collected and placed in {path}.\n'
                'The MD5 for this file is {checksum} and its size is {size}'
            ).format(
                path=self.conf["path"],
                size=size,
                checksum=checksum,
            )

            if archiveSize >= 1000:
                msg += _(
                    '\nYou can use the following filters in the next '
                    'execution -c, -d, -H to reduce the archive size.'
                )
        return msg
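
The tar + compress + checksum sequence in archive() shells out to tar, xz/bzip2 and md5sum. The same flow can be sketched portably with the standard library alone (make_archive from the shutil table earlier, plus hashlib); the file names here are stand-ins for the collector's configuration values:

```python
import hashlib
import os
import shutil
import tempfile
import time

tmp_dir = tempfile.mkdtemp()   # stands in for conf["local_tmp_dir"]
output = tempfile.mkdtemp()    # stands in for conf["output"]
with open(os.path.join(tmp_dir, "engine.log"), "w") as f:
    f.write("collected data\n")

base = os.path.join(
    output, "sosreport-LogCollector-%s" % time.strftime("%Y%m%d%H%M%S")
)
# 'bztar' tars and compresses in one step; the real code tars first,
# then runs xz or bzip2 on the result.
path = shutil.make_archive(base, "bztar", tmp_dir)
shutil.rmtree(tmp_dir)  # like the rmtree of local_tmp_dir above

# Equivalent of the md5sum call plus the .md5 sidecar file.
with open(path, "rb") as f:
    checksum = hashlib.md5(f.read()).hexdigest()
with open(path + ".md5", "w") as f:
    f.write("%s  %s\n" % (checksum, path))

print(path.endswith(".tar.bz2"), len(checksum))  # True 32
```

The real implementation prefers xz when available (falling back to bzip2) and tightens the archive's permissions with os.chmod, but the overall pack-clean-checksum shape is the same.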