[Security] Setting Up a Secure Hadoop Cluster and Integrating Components with Kerberos
0. Environment Information
Software | Version |
---|---|
OS | openEuler 22.03 (LTS-SP1) |
BiSheng JDK | OpenJDK 64-Bit Server VM BiSheng (build 1.8.0_342-b11) |
Apache Hadoop | 3.2.0 |
1. Kerberos: Principles and Usage
1.1 How Kerberos Works
Kerberos is a ticket-based, centralized network authentication protocol for the client/server (C/S) model, originally developed at the Massachusetts Institute of Technology (MIT).
- The Key Distribution Center (KDC) is the core component of Kerberos; it contains the AS (Authentication Server) and the TGS (Ticket Granting Server).
- The AS (Authentication Server) authenticates user credentials and issues TGTs (Ticket Granting Tickets) to clients.
- The TGS (Ticket Granting Server) issues STs (Service Tickets) and Session Keys to clients.
![](https://img.haomeiwen.com/i21744606/32959122d36e9d9c.png)
The Kerberos authentication sequence is shown in the diagram below.
![](https://img.haomeiwen.com/i21744606/414269efc5a8d3ef.png)
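The exchanges above map directly onto the command-line tools used later in this article. A minimal illustration, assuming a principal such as the krbuser1 account created in 1.2.1 and a Kerberized service such as HDFS from section 2:
$kinit krbuser1   # AS exchange: the client obtains a TGT from the KDC
$klist            # the credential cache now holds krbtgt/HADOOP.COM@HADOOP.COM (the TGT)
$hdfs dfs -ls /   # first access to a Kerberized service triggers the TGS exchange;
                  # running klist again also shows the corresponding service ticket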
1.2 Installing and Using Kerberos
1.2.1 Installing Kerberos
Install and configure krb5 on server1
# Step 1: install krb5, krb5-libs, krb5-server and krb5-client on server1
# Option 1: install from a configured yum repository
$yum install -y krb5 krb5-libs krb5-server krb5-client
# Option 2: download the Kerberos RPMs matching the OS release from https://repo.openeuler.org/openEuler-22.03-LTS/OS/aarch64/Packages/ and install them
$rpm -iv krb5-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-libs-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-devel-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-server-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-client-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-help-1.19.2-6.oe2203sp1.noarch.rpm
# Step 2: edit krb5.conf as follows
$vi /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
default_realm = HADOOP.COM
#default_ccache_name = KEYRING:persistent:%{uid}
[realms]
HADOOP.COM = {
kdc = server1
admin_server = server1
}
[domain_realm]
.hadoop.com = HADOOP.COM
hadoop.com = HADOOP.COM
# Step 3: create the KDC database
$ll /var/kerberos/krb5kdc/
$kdb5_util create -s
# Enter a password, e.g. test123, then enter it again to confirm.
$ll /var/kerberos/krb5kdc/
# New files now exist: principal, principal.kadm5, principal.kadm5.lock, principal.ok, etc.
# Step 4: grant full privileges to all admin principals; edit the file as follows
$vi /var/kerberos/krb5kdc/kadm5.acl
*/admin@HADOOP.COM *
# Step 5: comment out the last two lines that make KCM the default credential cache
$vi /etc/krb5.conf.d/kcm_default_ccache
#[libdefaults]
# default_ccache_name = KCM:
# Step 6: create the administrator root/admin
$kadmin.local
#1. Enter addprinc root/admin -- create the administrator root/admin, so that clients can later log in via kadmin with the root/admin password.
#2. Enter the root/admin password, e.g. test123, then enter it again to confirm.
#3. Enter listprincs -- check that the new administrator is listed.
#4. Enter exit -- quit.
# Step 7: start the kadmin and krb5kdc services and enable them at boot
$systemctl start kadmin krb5kdc
$systemctl enable kadmin krb5kdc
$chkconfig --level 35 krb5kdc on
$chkconfig --level 35 kadmin on
# Step 8: test the administrator login
$kadmin
#1. Enter the root/admin password.
#2. Enter listprincs.
#3. Enter exit to quit.
# Step 9: create host principals
$kadmin.local
#1. Enter addprinc -randkey host/server1 -- add host/server1 with a random key.
#2. Enter addprinc krbuser1 -- add an ordinary account krbuser1.
#3. Enter krbuser1's Kerberos password, e.g. 123, then enter it again to confirm.
#4. Enter exit to quit.
Notes:
kadmin.local: must be run on the KDC server itself and manages the database without a password.
kadmin: can be run on any host within the KDC realm, but requires the administrator password.
Install and configure the krb5 client on agent1~3
# Step 1: install the krb5 client packages on agent1~3
$rpm -iv krb5-libs-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-client-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-help-1.19.2-6.oe2203sp1.noarch.rpm
# Step 2: edit krb5.conf as follows
$vi /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
default_realm = HADOOP.COM
#default_ccache_name = KEYRING:persistent:%{uid}
[realms]
HADOOP.COM = {
kdc = server1
admin_server = server1
}
[domain_realm]
.hadoop.com = HADOOP.COM
hadoop.com = HADOOP.COM
# Step 3: comment out the last two lines that make KCM the default credential cache
$vi /etc/krb5.conf.d/kcm_default_ccache
#[libdefaults]
# default_ccache_name = KCM:
# Step 4: create the host principal
$kadmin.local
#1. Enter addprinc -randkey host/agent1 -- add host/agent1 with a random key.
#2. Enter exit to quit.
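To confirm that the client configuration works, the ordinary account created in step 9 of the server setup can be used from any agent (a quick check, assuming krbuser1 exists):
$kinit krbuser1   # enter the krbuser1 password, e.g. 123
$klist            # should show a krbtgt/HADOOP.COM@HADOOP.COM ticket
$kdestroy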
1.2.2 Using Kerberos
Example 1: add the Kerberos principal for the namenode
$kadmin.local
$addprinc -randkey nn/server1@HADOOP.COM
#List all principals
$listprincs
#Export the namenode credentials to a keytab file in the target directory
$ktadd -k /etc/security/keytab/nn.keytab nn/server1
#Export the root/admin and root credentials to a keytab file in the target directory
$ktadd -k /etc/security/keytab/root.keytab root/admin
$ktadd -k /etc/security/keytab/root.keytab root
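The contents of an exported keytab can be inspected with klist; a quick check for the files generated above:
$klist -kt /etc/security/keytab/nn.keytab
$klist -kt /etc/security/keytab/root.keytab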
Example 2: authenticate as a specified Kerberos user such as root/admin
# Step 1: log in as the chosen principal
#Option 1: enter the password
$kinit root
#Option 2: log in with a keytab file
$kinit -kt /etc/security/keytab/root.keytab root/admin
# Step 2: check the login information
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting Expires Service principal
07/31/2023 09:12:05 08/01/2023 09:12:05 krbtgt/HADOOP.COM@HADOOP.COM
renew until 07/31/2023 09:12:05
07/31/2023 09:26:21 08/01/2023 09:12:05 host/agent1@
renew until 07/31/2023 09:12:05
# Step 3: log out
$kdestroy
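Related: because renew_lifetime is set to 7d in krb5.conf, a still-valid renewable ticket can also be refreshed instead of re-authenticating (a small aside; kinit -R fails once the ticket has already expired):
$kinit -R
$klist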
2. Integrating HDFS with Kerberos
Notes:
1. The file and directory permissions for the HDFS namenode, secondarynamenode and datanode components all use root; no finer-grained permission split is made.
2. The namenode and secondarynamenode are deployed on server1, and the datanodes on agent1~agent3.
2.1 Create the Kerberos principals and keytab files for HDFS
#Create the principals and export the keytab files on server1
$mkdir -p /etc/security/keytab
$kadmin.local
$addprinc -randkey nn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/nn.keytab nn/server1
$addprinc -randkey sn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/sn.keytab sn/server1
$addprinc -randkey dn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/dn.keytab dn/server1
#Copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
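Keytab files are equivalent to passwords, so it is common practice to restrict their permissions on every node; a minimal hardening sketch, assuming all services run as root as stated in the notes above:
$chmod 700 /etc/security/keytab
$chmod 400 /etc/security/keytab/*.keytab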
2.2 Configure core-site.xml and hdfs-site.xml
Determine the Hadoop installation directory
$env |grep HADOOP_HOME
HADOOP_HOME=/usr/local/hadoop
Add the following configuration to core-site.xml
$cd /usr/local/hadoop/etc/hadoop
$vi core-site.xml
<!-- add the following properties -->
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.token.service.use_ip</name>
<value>true</value>
</property>
<property>
<name>hadoop.rpc.protection</name>
<value>authentication</value>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1@$0](nn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](sn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](dn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](rm@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](tl@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](jh@HADOOP.COM)s/.*/mapred/
RULE:[2:$1@$0](HTTP@HADOOP.COM)s/.*/hdfs/
DEFAULT
</value>
</property>
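The auth_to_local rules can be verified before any daemon is started by using Hadoop's built-in principal-mapping tool (a quick check; HadoopKerberosName ships with hadoop-common):
$hadoop org.apache.hadoop.security.HadoopKerberosName nn/server1@HADOOP.COM
#should report a mapping to hdfs
$hadoop org.apache.hadoop.security.HadoopKerberosName rm/server1@HADOOP.COM
#should report a mapping to yarn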
Add the following configuration to hdfs-site.xml
$cd /usr/local/hadoop/etc/hadoop
$vi hdfs-site.xml
<!-- add the following properties -->
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>nn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/security/keytab/nn.keytab</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>sn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/etc/security/keytab/sn.keytab</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.journalnode.kerberos.principal</name>
<value>jn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.journalnode.keytab.file</name>
<value>/etc/security/keytab/jn.keytab</value>
</property>
<property>
<name>dfs.journalnode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>dn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytab/dn.keytab</value>
</property>
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hdfs</value>
<description>The name of the group of super-users.</description>
</property>
<property>
<name>dfs.http.policy</name>
<value>HTTP_ONLY</value>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>
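Note that the configuration above references the HTTP/server1@HADOOP.COM principal and /etc/security/keytab/spnego.keytab, which are not created in section 2.1. If they do not exist yet, they can be created with the same kadmin workflow used earlier (a sketch):
$kadmin.local
$addprinc -randkey HTTP/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/spnego.keytab HTTP/server1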
Sync core-site.xml and hdfs-site.xml to agent1~3
$scp core-site.xml agent1:/usr/local/hadoop/etc/hadoop
$scp hdfs-site.xml agent1:/usr/local/hadoop/etc/hadoop
# ... (omitted for the remaining agents)
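The omitted copies to the remaining agents can also be done in a single loop (a small convenience sketch, assuming passwordless scp to every agent):
$for h in agent1 agent2 agent3; do scp core-site.xml hdfs-site.xml $h:/usr/local/hadoop/etc/hadoop; done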
2.3 Start the services and verify
Start the namenode, secondarynamenode and datanodes
#Run on the server1 node
$kinit -kt /etc/security/keytab/root.keytab root
$cd /usr/local/hadoop/sbin
$./start-dfs.sh
Verify HDFS
#server1
$ps -ef |grep namenode
$netstat -anp|grep 50070
#agent1~3
$ps -ef |grep datanode
#Using the HDFS CLI
$hdfs dfs -ls /
$echo "func test!" > functest.txt
$hdfs dfs -put functest.txt /
$hdfs dfs -cat /functest.txt
$hdfs dfs -rm -f /functest.txt
Open the web UI at http://server1:50070 and check Live Nodes under Overview -> Summary.
![](https://img.haomeiwen.com/i21744606/94a1234f6d33d946.jpg)
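Besides the browser, SPNEGO authentication on the web endpoint can also be checked from the shell (a quick check, assuming curl was built with GSS-Negotiate support and a ticket has been obtained with kinit):
$curl --negotiate -u : "http://server1:50070/webhdfs/v1/?op=LISTSTATUS"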
3. Integrating YARN with Kerberos
Notes:
1. The file and directory permissions for the YARN resourcemanager and nodemanager components all use root; no finer-grained permission split is made.
2. The resourcemanager is deployed on server1, and the nodemanagers on agent1~agent3.
3.1 Create the Kerberos principals and keytab files for YARN
#Create the principals and export the keytab files on server1
$kadmin.local
$addprinc -randkey rm/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/rm.keytab rm/server1
$addprinc -randkey nm/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/nm.keytab nm/server1
#Copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
3.2 Configure yarn-site.xml
Add the following configuration to yarn-site.xml
$cd /usr/local/hadoop/etc/hadoop
$vi yarn-site.xml
<!-- add the following properties -->
<property>
<name>yarn.resourcemanager.principal</name>
<value>rm/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/etc/security/keytab/rm.keytab</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>nm/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/security/keytab/nm.keytab</value>
</property>
<property>
<name>yarn.timeline-service.principal</name>
<value>tl/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.timeline-service.keytab</name>
<value>/etc/security/keytab/tl.keytab</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.type</name>
<value>kerberos</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.kerberos.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>yarn.http.policy</name>
<value>HTTP_ONLY</value>
</property>
Note: it is recommended to set yarn.timeline-service.enabled to false.
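Since section 3.1 only creates the rm and nm principals (tl.keytab is never exported), disabling the timeline service avoids startup failures; a sketch of the corresponding yarn-site.xml property:
<property>
<name>yarn.timeline-service.enabled</name>
<value>false</value>
</property>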
Add the following configuration to mapred-site.xml
$cd /usr/local/hadoop/etc/hadoop
$vi mapred-site.xml
<!-- add the following properties -->
<property>
<name>mapreduce.jobhistory.keytab</name>
<value>/etc/security/keytab/jh.keytab</value>
</property>
<property>
<name>mapreduce.jobhistory.principal</name>
<value>jh/server1@HADOOP.COM</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.spnego-keytab-file</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.spnego-principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>mapreduce.jobhistory.http.policy</name>
<value>HTTP_ONLY</value>
</property>
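The JobHistory server configured above is not started by start-yarn.sh; it can be started and checked separately (a sketch, assuming the jh/server1 principal and /etc/security/keytab/jh.keytab have been created the same way as in 3.1, and the default web port 19888):
$kinit -kt /etc/security/keytab/root.keytab root
$mapred --daemon start historyserver
$ps -ef |grep JobHistoryServer
$netstat -anp|grep 19888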
3.3 Start the services and verify
Start the resourcemanager and nodemanagers
#Run on the server1 node
$kinit -kt /etc/security/keytab/root.keytab root
$cd /usr/local/hadoop/sbin
$./start-yarn.sh
$ps -ef|grep resourcemanager
$netstat -anp|grep 8088
#Open the YARN web UI at http://server1:8088/cluster
Verify YARN functionality
$hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 10 10
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/tez-0.10.0/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Number of Maps = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
... (output omitted)
2023-07-31 14:50:28,215 INFO mapreduce.Job: map 0% reduce 0%
2023-07-31 14:50:33,257 INFO mapreduce.Job: map 100% reduce 0%
2023-07-31 14:50:39,277 INFO mapreduce.Job: map 100% reduce 100%
2023-07-31 14:50:39,282 INFO mapreduce.Job: Job job_1690451579267_0899 completed successfully
2023-07-31 14:50:39,379 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=226
FILE: Number of bytes written=2543266
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2620
HDFS: Number of bytes written=215
HDFS: Number of read operations=45
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=10
Launched reduce tasks=1
Data-local map tasks=10
Total time spent by all maps in occupied slots (ms)=176478
Total time spent by all reduces in occupied slots (ms)=17310
Total time spent by all map tasks (ms)=29413
Total time spent by all reduce tasks (ms)=2885
Total vcore-milliseconds taken by all map tasks=29413
Total vcore-milliseconds taken by all reduce tasks=2885
Total megabyte-milliseconds taken by all map tasks=180713472
Total megabyte-milliseconds taken by all reduce tasks=17725440
Map-Reduce Framework
Map input records=10
Map output records=20
Map output bytes=180
Map output materialized bytes=280
Input split bytes=1440
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=280
Reduce input records=20
Reduce output records=0
Spilled Records=40
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=754
CPU time spent (ms)=5880
Physical memory (bytes) snapshot=5023223808
Virtual memory (bytes) snapshot=85769342976
Total committed heap usage (bytes)=16638803968
Peak Map Physical memory (bytes)=467083264
Peak Map Virtual memory (bytes)=7938478080
Peak Reduce Physical memory (bytes)=422326272
Peak Reduce Virtual memory (bytes)=6416666624
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1180
File Output Format Counters
Bytes Written=97
Job Finished in 20.347 seconds
Estimated value of Pi is 3.20000000000000000000
4. Integrating ZooKeeper with Kerberos
Notes:
1. The file and directory permissions for all ZooKeeper components use root; no finer-grained permission split is made.
2. The ZooKeeper servers are deployed on agent1~3.
4.1 Create the Kerberos principal and keytab file for ZooKeeper
#Create the principal and export the keytab file on server1
$kadmin.local
$addprinc -randkey zookeeper/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/zookeeper.keytab zookeeper/server1
#Copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
4.2 Configure zoo.cfg and jaas.conf
#Perform the following on agent1
$env |grep ZOOK
ZOOKEEPER_HOME=/usr/local/zookeeper
$cd /usr/local/zookeeper/conf
$vi zoo.cfg
#Add the following configuration
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
jaasLoginRenew=3600000
#Create a new file jaas.conf
$vi jaas.conf
Server {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/etc/security/keytab/zookeeper.keytab"
storeKey=true
useTicketCache=false
principal="zookeeper/server1@HADOOP.COM";
};
#Copy zoo.cfg and jaas.conf to the other agent nodes
$scp zoo.cfg jaas.conf agent2:/usr/local/zookeeper/conf
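For the ZooKeeper server to actually load jaas.conf, the JAAS login configuration also has to be passed to the server JVM; a sketch, assuming the standard zkEnv.sh behaviour of sourcing conf/java.env on each node:
$vi /usr/local/zookeeper/conf/java.env
export SERVER_JVMFLAGS="-Djava.security.auth.login.config=/usr/local/zookeeper/conf/jaas.conf"
#copy java.env to the other agent nodes as well
$scp java.env agent2:/usr/local/zookeeper/conf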
4.3 Start the services and verify
#Perform the following on agent1~3 in turn
$cd /usr/local/zookeeper/bin
$./zkServer.sh start
$ps -ef |grep zookeeper
$netstat -anp|grep 2181
#Verify basic ZooKeeper functionality
$./zkServer.sh status
$./zkCli.sh
ls /
5. Integrating Hive with Kerberos
Notes:
1. The file and directory permissions for the Hive metastore, HiveServer2 and Hive CLI components all use root; no finer-grained permission split is made.
2. The metastore is deployed on server1.
3. Deploying HiveServer2 and integrating it with Kerberos are not covered in this article.
5.1 Create the Kerberos principal and keytab file for Hive
#Create the principal and export the keytab file on server1
$kadmin.local
$addprinc -randkey hive/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/hive.keytab hive/server1
#Copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
5.2 Configure hive-site.xml
Note: delete the Hadoop configuration files core-site.xml and hdfs-site.xml from the /usr/local/hive/conf directory; Hive will then automatically pick up the corresponding configuration from etc/hadoop under HADOOP_HOME.
#Look up HIVE_HOME
$env |grep HIVE_HOME
HIVE_HOME=/usr/local/hive
$cd /usr/local/hive/conf
$vi hive-site.xml
<!-- add the following properties -->
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/server1@HADOOP.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/security/keytab/hive.keytab</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive@HADOOP.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/security/keytab/hive.keytab</value>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>
5.3 Start and verify the metastore and Hive CLI
Start the metastore
$hive --service metastore -p 9083 &
$ps -ef |grep metastore
$netstat -anp|grep 9083
Verify the Hive CLI
#Use the root Kerberos principal
$kinit -kt /etc/security/keytab/root.keytab root
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting Expires Service principal
07/31/2023 12:45:37 08/01/2023 12:45:37 krbtgt/HADOOP.COM@HADOOP.COM
renew until 07/31/2023 12:45:37
#Start the Hive CLI
$hive
use default;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (
t1_a INT,
t1_b INT,
t1_c INT,
t1_d INT
);
INSERT INTO table1 (t1_a, t1_b, t1_c, t1_d)
VALUES
(1, 10, 100, 1000),
(2, 20, 200, 2000),
(3, 30, 300, 3000),
(4, 40, 400, 4000),
(5, 50, 500, 5000),
(6, 60, 600, 6000),
(7, 70, 700, 7000),
(8, 80, 800, 8000),
(9, 90, 900, 9000),
(10, 100, 1000, 10000);
SELECT * FROM table1;
6. Integrating Spark with Kerberos
Notes:
1. The file and directory permissions for all Spark components use root; no finer-grained permission split is made.
2. The HistoryServer is deployed on server1.
6.1 Configure spark-defaults.conf
$env |grep SPARK
SPARK_HOME=/usr/local/spark
$cd /usr/local/spark/conf
$vi spark-defaults.conf
#Add the following configuration
spark.kerberos.principal root@HADOOP.COM
spark.kerberos.keytab /etc/security/keytab/root.keytab
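Instead of (or in addition to) spark-defaults.conf, the same credentials can be passed per job on the command line (an equivalent sketch using the spark-sql CLI from section 6.3):
$spark-sql \
--conf spark.kerberos.principal=root@HADOOP.COM \
--conf spark.kerberos.keytab=/etc/security/keytab/root.keytab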
6.2 Start and verify the HistoryServer
#Start the HistoryServer
$cd /usr/local/spark/sbin
$./start-history-server.sh
$ps -ef |grep history
$netstat -anp|grep 18080
#Open the Spark history page at http://server1:18080/
6.3 Launch and verify a native spark-sql job
#Use the root Kerberos principal
$kinit -kt /etc/security/keytab/root.keytab root
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting Expires Service principal
07/31/2023 12:45:37 08/01/2023 12:45:37 krbtgt/HADOOP.COM@HADOOP.COM
renew until 07/31/2023 12:45:37
#Interactive verification with spark-sql
$spark-sql
use default;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (
t1_a INT,
t1_b INT,
t1_c INT,
t1_d INT
);
INSERT INTO table1 (t1_a, t1_b, t1_c, t1_d)
VALUES
(1, 10, 100, 1000),
(2, 20, 200, 2000),
(3, 30, 300, 3000),
(4, 40, 400, 4000),
(5, 50, 500, 5000),
(6, 60, 600, 6000),
(7, 70, 700, 7000),
(8, 80, 800, 8000),
(9, 90, 900, 9000),
(10, 100, 1000, 10000);
SELECT * FROM table1;
#Launch the native spark-sql job Q_01
$spark-sql \
--deploy-mode client \
--driver-cores 2 \
--driver-memory 8g \
--num-executors 5 \
--executor-cores 2 \
--executor-memory 8g \
--master yarn \
--conf spark.task.cpus=1 \
--conf spark.sql.orc.impl=native \
--conf spark.sql.shuffle.partitions=600 \
--conf spark.sql.adaptive.enabled=true \
--conf spark.sql.autoBroadcastJoinThreshold=100M \
--conf spark.sql.broadcastTimeout=1000 \
--database tpcds_bin_partitioned_orc_5 \
--name spark_sql_01 \
-e "SELECT
dt.d_year,
item.i_brand_id brand_id,
item.i_brand brand,
SUM(ss_ext_sales_price) sum_agg
FROM date_dim dt, store_sales, item
WHERE dt.d_date_sk = store_sales.ss_sold_date_sk
AND store_sales.ss_item_sk = item.i_item_sk
AND item.i_manufact_id = 128
AND dt.d_moy = 11
GROUP BY dt.d_year, item.i_brand, item.i_brand_id
ORDER BY dt.d_year, sum_agg DESC, brand_id
LIMIT 100"
Common Issues
Issue 1: the HDFS CLI has no Kerberos credentials
Symptom: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Solution: delete or comment out default_ccache_name in the /etc/krb5.conf configuration file.
Issue 2: Spark presents incorrect credentials to the secure HDFS cluster
Symptom: not enough datanodes are available; the error log reads: ERROR SparkContext: Error initializing SparkContext.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/.sparkStaging/application_1690451579267_0910/__spark_libs__993965846103132811.zip could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
Solution: delete hdfs-site.xml from the conf directory under SPARK_HOME, so that Spark falls back to the hdfs-site.xml under HADOOP_HOME, or copy the HDFS Kerberos settings into the hdfs-site.xml under the conf directory of SPARK_HOME.
Issue 3: the Hive CLI cannot obtain credentials for the secure HDFS cluster
Symptom: the Hive CLI fails to connect to the datanodes when executing a SELECT statement: Failed with exception java.io.IOException:org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-977099591-192.168.43.210-1684921499379:blk_1074594764_854564 file=/user/hive/warehouse/table2/part-00000-c2edd0d3-580d-4de0-826d-ee559c6a61f6-c000
Solution: add the following configuration to hive-site.xml under the conf directory of HIVE_HOME.
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>