Sentry execution mechanism

2018-05-09  xuefly

Part of the article series: Big Data Security in Practice https://www.jianshu.com/p/76627fd8399c


Startup process

bin/sentry --command service --conffile sentry-site.xml

The launcher script ultimately invokes the jar with exec $HADOOP jar ${SENTRY_HOME}/lib/${_CMD_JAR} org.apache.sentry.SentryMain ${args[@]}. The relevant portion of bin/sentry:

# ...
if [ "${RUN_CONFIG_TOOL}" = "0" ]
then
  for f in ${SENTRY_HOME}/lib/server/*.jar; do
    HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${f}
  done
  for f in ${SENTRY_HOME}/lib/plugins/*.jar; do
    HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${f}
  done
  # Add Hive client configs to the classpath of Sentry
  HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HIVE_CONF_DIR}

  exec $HADOOP jar ${SENTRY_HOME}/lib/${_CMD_JAR} org.apache.sentry.SentryMain ${args[@]}
else
  exec ${SENTRY_HOME}/bin/config_tool ${args[@]}
fi

SentryMain maps the value of --command to a concrete command class and then invokes it:

private static final ImmutableMap<String, String> COMMANDS = ImmutableMap
    .<String, String>builder()
    .put("service", "org.apache.sentry.service.thrift.SentryService$CommandImpl")
    .put("config-tool", "org.apache.sentry.binding.hive.authz.SentryConfigTool$CommandImpl")
    .put("schema-tool", "org.apache.sentry.provider.db.tools.SentrySchemaTool$CommandImpl")
    .build();

((Command) command).run(commandLine.getArgs());
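
Putting the two snippets together, the dispatch roughly works as follows. This is a hedged sketch: the import path of the Command interface and the reflective instantiation are inferred from the snippet above, not copied from SentryMain.

import org.apache.sentry.Command;  // assumed location of the Command interface

public class SentryDispatchSketch {
    public static void main(String[] args) throws Exception {
        // COMMANDS.get("service") from the map above resolves to this class name.
        String commandClazz = "org.apache.sentry.service.thrift.SentryService$CommandImpl";

        // Load and instantiate the command implementation reflectively.
        Object command = Class.forName(commandClazz).newInstance();
        if (!(command instanceof Command)) {
            throw new IllegalStateException(commandClazz + " does not implement Command");
        }

        // Hand the remaining CLI arguments (e.g. -conffile sentry-site.xml) to the command,
        // which for "service" starts the Sentry Thrift server.
        ((Command) command).run(args);
    }
}
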
The RPC endpoint that clients use to reach the Sentry service is configured in sentry-site.xml:

<property>
  <name>sentry.service.client.server.rpc-port</name>
  <value>3893</value>
</property>
<property>
  <name>sentry.service.client.server.rpc-address</name>
  <value>hostname</value>
</property>
<property>
  <name>sentry.service.client.server.rpc-connection-timeout</name>
  <value>200000</value>
</property>
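
As a hedged illustration of how such client-side settings are consumed, the sketch below simply loads sentry-site.xml into a Hadoop Configuration and reads the three properties (the file path and the fallback values are placeholders, not Sentry defaults):

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class SentryClientConfSketch {
    public static void main(String[] args) {
        // Start from an empty configuration and load the Sentry client config file.
        Configuration conf = new Configuration(false);
        conf.addResource(new Path("/etc/sentry/conf/sentry-site.xml"));  // placeholder path

        String host = conf.get("sentry.service.client.server.rpc-address", "localhost");
        int port = conf.getInt("sentry.service.client.server.rpc-port", 8038);
        long timeoutMs = conf.getLong("sentry.service.client.server.rpc-connection-timeout", 200000L);

        // A Sentry client would open a Thrift connection to this endpoint.
        InetSocketAddress endpoint = new InetSocketAddress(host, port);
        System.out.println("Sentry service endpoint: " + endpoint + ", connect timeout " + timeoutMs + " ms");
    }
}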

Request processing


Hive's execution flow

From the architecture diagram we can see that, from the moment a user submits a query (say, through the CLI) until the final result is returned, Hive's internal execution flow mainly consists of the following steps (a small client-side submission sketch follows the list):

1. The CLI receives the user's query, parses the entered command, and hands it to the Driver;
2. The Driver, working with the compiler (COMPILER) and the metastore (METASTORE), compiles and parses the query;
3. Based on the parse result (the query plan), MapReduce jobs are generated and submitted to Hadoop for execution;
4. The final result is fetched and returned.
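
To make the entry point concrete, here is a minimal, hedged sketch of submitting a statement to HiveServer2 over JDBC (host, port, user, and query are placeholders; the Hive JDBC driver must be on the classpath). Every statement submitted this way goes through the compile/execute pipeline above, which is exactly where the Sentry hooks described below are invoked.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveSubmitSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder HiveServer2 endpoint and database.
        String url = "jdbc:hive2://hiveserver2-host:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "hive_user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}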

Integration points between Hive and Sentry

Thrift services
(Figure: the Thrift architecture stack)
As shown in the figure, the yellow part is the business logic implemented by the user; the brown part is the client- and server-side code skeleton generated from the Thrift service interface definition file; the red part is code, also generated from the Thrift file, that implements reading and writing of the data. Below the red part sit Thrift's transport stack, protocols, and underlying I/O. With Thrift you can easily define a service and then choose different transport protocols and transport layers without regenerating any code.
Reference: Apache Thrift - a scalable cross-language service development framework
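
To make the layering concrete, below is a minimal, hypothetical sketch of wiring a generated Thrift client to a transport and a protocol. MyService is a made-up stub name standing in for code generated from a .thrift file (Sentry's real service stubs are not shown here); swapping TSocket or TBinaryProtocol for other implementations does not require regenerating the stub, which is the point made above.

import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class ThriftClientSketch {
    public static void main(String[] args) throws TException {
        // Transport layer: a plain TCP socket (host and port are placeholders).
        TTransport transport = new TSocket("sentry-host", 8038);
        transport.open();
        try {
            // Protocol layer: how requests and responses are serialized on the wire.
            TProtocol protocol = new TBinaryProtocol(transport);
            // The generated stub would be constructed on top of the protocol, e.g.:
            // MyService.Client client = new MyService.Client(protocol);
            // client.ping();
        } finally {
            transport.close();
        }
    }
}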

Which hooks does Sentry use (from the official documentation)
What exactly happens when a statement is submitted?

This compile-time hook collects the objects that the query needs to access for reading and for writing; Sentry's Hive binding then converts them into authorization requests based on the SQL authorization model.
HiveSessionHook
package org.apache.hive.service.cli.session;

import org.apache.hadoop.hive.ql.hooks.Hook;
import org.apache.hive.service.cli.HiveSQLException;

/**
 * HiveSessionHook.
 * HiveServer2 session level Hook interface. The run method is executed
 *  when session manager starts a new session
 *
 */
public interface HiveSessionHook extends Hook {

  /**
   * @param sessionHookContext context
   * @throws HiveSQLException
   */
  public void run(HiveSessionHookContext sessionHookContext) throws HiveSQLException;
}
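
A hedged sketch of what an implementation of this interface can look like (this is not Sentry's actual session hook, only the extension point; the property written here is purely illustrative):

import org.apache.hive.service.cli.HiveSQLException;
import org.apache.hive.service.cli.session.HiveSessionHook;
import org.apache.hive.service.cli.session.HiveSessionHookContext;

/**
 * Runs once for every new HiveServer2 session; a typical use is to adjust the
 * session's HiveConf before any statement is executed.
 */
public class LoggingSessionHook implements HiveSessionHook {

  @Override
  public void run(HiveSessionHookContext sessionHookContext) throws HiveSQLException {
    String user = sessionHookContext.getSessionUser();
    // Illustrative only: stamp the session configuration with the session user.
    sessionHookContext.getSessionConf().set("example.session.user", user);
    System.out.println("New HiveServer2 session opened for " + user);
  }
}

The hook class is registered through the hive.server2.session.hook property in hive-site.xml; Sentry registers its own session hook, HiveAuthzBindingSessionHook, the same way.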


Policy maintenance involves two steps. During query compilation, Hive calls Sentry's authorization task factory to produce the Sentry-specific tasks that will be executed as part of the query. These tasks use the Sentry store client to send RPC requests to the Sentry service, asking it to change the authorization policy.

<!--
Properties required on Hive to talk to the Sentry policy store service (hive-site.xml):
-->
<configuration>
  <property>
    <name>hive.security.authorization.task.factory</name>
    <value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
  </property>
</configuration>
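
As a hedged sketch (it mirrors, rather than reproduces, Hive's internal wiring), the factory class named by hive.security.authorization.task.factory can be resolved from the configuration like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.parse.authorization.HiveAuthorizationTaskFactory;

public class TaskFactorySketch {
    public static void main(String[] args) throws Exception {
        // hive-site.xml must be on the classpath for addResource(String) to find it.
        Configuration conf = new Configuration(false);
        conf.addResource("hive-site.xml");

        // Fall back to Sentry's implementation if the property is unset.
        String factoryClass = conf.get(
            "hive.security.authorization.task.factory",
            "org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl");

        // Loading the class requires the Sentry binding jar on the classpath.
        Class<?> clazz = Class.forName(factoryClass);
        System.out.println(clazz.getName() + " implements HiveAuthorizationTaskFactory: "
            + HiveAuthorizationTaskFactory.class.isAssignableFrom(clazz));
    }
}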

Sentry integrates with the Hive Metastore through a pre-event listener hook. The metastore runs this hook before it executes a metadata maintenance request. The metastore binding turns every metadata-modifying request submitted through the metastore or the HCatalog client into a Sentry authorization request.

<!--
Properties required on the Metastore to talk to the Sentry policy store service (hive-site.xml):
-->
<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.sentry.binding.metastore.MetastoreAuthzBinding</value>
  <description>Comma-separated list of listeners for metastore events.</description>
</property>
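
For context, the listener named above extends the metastore's pre-event listener base class. Below is a minimal, hedged sketch of that extension point (it is not Sentry's actual MetastoreAuthzBinding): throwing InvalidOperationException from onEvent rejects the pending metadata change, which is how an authorization binding can veto unauthorized DDL.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.MetaStorePreEventListener;
import org.apache.hadoop.hive.metastore.api.InvalidOperationException;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
import org.apache.hadoop.hive.metastore.events.PreEventContext;

/** Invoked by the metastore before it applies a metadata change. */
public class AuditingPreListener extends MetaStorePreEventListener {

  public AuditingPreListener(Configuration config) {
    super(config);
  }

  @Override
  public void onEvent(PreEventContext context)
      throws MetaException, NoSuchObjectException, InvalidOperationException {
    // Inspect the pending operation; a real binding would build an authorization
    // request from it and check it against the policy store before letting it through.
    System.out.println("Metastore pre-event: " + context.getEventType());
  }
}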

References
Hive 源码解析之 Hive 基本框架和执行入口 - 我们俩 - SegmentFault
竞项网—你想了解的Hive Query生命周期--钩子函数篇
Hive Hook类型 - CSDN博客

Invocation example

(Figure: Sentry remote invocation)

Client side


(Figure: client-side SentryHiveAuthorizationTaskFactory)
package org.apache.hadoop.hive.ql.parse.authorization;

import java.io.Serializable;
import java.util.HashSet;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.common.classification.InterfaceAudience.LimitedPrivate;
import org.apache.hadoop.hive.common.classification.InterfaceStability.Evolving;
import org.apache.hadoop.hive.ql.exec.Task;
import org.apache.hadoop.hive.ql.hooks.ReadEntity;
import org.apache.hadoop.hive.ql.hooks.WriteEntity;
import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.SemanticException;

/**
 * Hive's pluggable factory for authorization DDL tasks (CREATE/DROP/SHOW/GRANT/REVOKE
 * ROLE and related statements). Sentry plugs in here through
 * SentryHiveAuthorizationTaskFactoryImpl, so these statements become tasks that talk
 * to the Sentry service instead of Hive's built-in authorization.
 */
@LimitedPrivate({"Apache Hive, Apache Sentry (incubating)"})
@Evolving
public interface HiveAuthorizationTaskFactory {

    Task<? extends Serializable> createCreateRoleTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createDropRoleTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createShowRoleGrantTask(ASTNode node, Path resultFile, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createGrantRoleTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createRevokeRoleTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createGrantTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createShowGrantTask(ASTNode node, Path resultFile, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createRevokeTask(ASTNode node, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createSetRoleTask(String roleName, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createShowCurrentRoleTask(HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs, Path resultFile) throws SemanticException;

    Task<? extends Serializable> createShowRolePrincipalsTask(ASTNode node, Path resultFile, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;

    Task<? extends Serializable> createShowRolesTask(ASTNode node, Path resultFile, HashSet<ReadEntity> inputs, HashSet<WriteEntity> outputs) throws SemanticException;
}

Server side

