Note17: Spark 2.1.3 Installation and Configuration

2020-06-23  K__3f8b

Download and Install

[kevin@hadoop112 software]$ tar -zxvf spark-2.1.3-bin-hadoop2.7.tgz -C /opt/module/
[kevin@hadoop112 software]$ cd /opt/module/
[kevin@hadoop112 module]$ mv spark-2.1.3-bin-hadoop2.7/ spark-2.1.3
[kevin@hadoop112 module]$ cd /opt/module/spark-2.1.3
[kevin@hadoop112 spark-2.1.3]$ bin/spark-shell

# In the Scala REPL you can type sys.exit to quit
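Before quitting, a quick smoke test confirms the local SparkContext works. A minimal sketch, assuming the shell exposes the context as sc (it prints this binding at startup):

scala> sc.parallelize(1 to 100).sum()
res0: Double = 5050.0

scala> sys.exit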

Run Modes

Local mode
[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --executor-memory 1G --total-executor-cores 2 ./examples/jars/spark-examples_2.11-2.1.3.jar 100
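When no --master flag is given, spark-submit defaults to local[*] (one worker thread per CPU core). You can pass it explicitly to cap the parallelism; a sketch using two threads:

[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[2] ./examples/jars/spark-examples_2.11-2.1.3.jar 100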
Standalone mode (Spark handles both resource scheduling and computation)
[kevin@hadoop112 spark-2.1.3]$ cd conf/
[kevin@hadoop112 conf]$ mv slaves.template slaves

# Change the contents to
# A Spark Worker will be started on each of the machines listed below.
hadoop112
hadoop113
hadoop114
[kevin@hadoop112 conf]$ cd ..
[kevin@hadoop112 spark-2.1.3]$ vim sbin/spark-config.sh

# Add
export JAVA_HOME=/opt/module/jdk1.8.0_241
[kevin@hadoop112 spark-2.1.3]$ mv conf/spark-env.sh.template conf/spark-env.sh
[kevin@hadoop112 spark-2.1.3]$ vim conf/spark-env.sh

# Comment out the YARN setting (if present) and add the Spark Master settings
#YARN_CONF_DIR=/opt/module/hadoop-2.7.2/etc/hadoop

SPARK_MASTER_HOST=hadoop112
SPARK_MASTER_PORT=7077
[kevin@hadoop112 spark-2.1.3]$ cd ..
[kevin@hadoop112 module]$ xsync.sh spark-2.1.3/
[kevin@hadoop112 module]$ cd spark-2.1.3
[kevin@hadoop112 spark-2.1.3]$ sbin/start-all.sh
[kevin@hadoop112 spark-2.1.3]$ xcall.sh jps

If the startup succeeded, jps will show the Master and Worker processes.
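Roughly like the following (a sketch; the exact banner format depends on your xcall.sh script, and hadoop112 runs both a Master and a Worker because it is also listed in slaves):

--------- hadoop112 ---------
Jps
Master
Worker
--------- hadoop113 ---------
Jps
Worker
--------- hadoop114 ---------
Jps
Worker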

Note: if you hit a "JAVA_HOME not set" exception, add the following setting to the spark-config.sh file in the sbin directory:

export JAVA_HOME=XXXX
[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://hadoop112:7077 --executor-memory 1G --total-executor-cores 2 ./examples/jars/spark-examples_2.11-2.1.3.jar 100
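You can also verify the cluster through the standalone Master's web UI (port 8080 by default, listing alive workers and running applications), or by attaching a shell to the Master:

http://hadoop112:8080

[kevin@hadoop112 spark-2.1.3]$ bin/spark-shell --master spark://hadoop112:7077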
YARN mode (YARN handles resource scheduling, Spark handles computation) (the important one)
[kevin@hadoop112 spark-2.1.3]$ cd conf/
[kevin@hadoop112 conf]$ vim spark-env.sh

# spark-env.sh was already created from the template in the Standalone section;
# renaming the template again here would overwrite the master settings.

# Append to spark-env.sh (this tells Spark where to find the YARN configuration)
YARN_CONF_DIR=/opt/module/hadoop-2.7.2/etc/hadoop
[kevin@hadoop112 conf]$ cd /opt/module/hadoop-2.7.2/etc/hadoop/
[kevin@hadoop112 hadoop]$ vim yarn-site.xml

# Append the following to yarn-site.xml
# After editing, distribute yarn-site.xml to the other nodes
[kevin@hadoop112 hadoop]$ xsync.sh yarn-site.xml

# Content to add
    <!-- Whether to run a thread that checks the physical memory used by each task and kills any task that exceeds its allocation. Default is true -->
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>

    <!-- Whether to run a thread that checks the virtual memory used by each task and kills any task that exceeds its allocation. Default is true -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
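Disabling both checks is the quick fix for the "running beyond virtual memory limits" container kills that small test VMs tend to hit. A gentler alternative, as a sketch, is to keep the checks and raise the virtual-to-physical memory ratio instead (the default is 2.1):

    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
    </property>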
[kevin@hadoop112 hadoop]$ hadoop-cluster.sh start
[kevin@hadoop112 hadoop]$ xcall.sh jps
[kevin@hadoop112 hadoop]$ cd /opt/module/spark-2.1.3
[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client ./examples/jars/spark-examples_2.11-2.1.3.jar 100
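In client mode the driver runs inside the spark-submit process on hadoop112, which is convenient for reading SparkPi's output directly. With --deploy-mode cluster the driver runs inside a YARN ApplicationMaster instead; a sketch:

[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster ./examples/jars/spark-examples_2.11-2.1.3.jar 100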
JobHistoryServer Configuration
[kevin@hadoop112 spark-2.1.3]$ cd conf/
[kevin@hadoop112 conf]$ mv spark-defaults.conf.template spark-defaults.conf
[kevin@hadoop112 conf]$ vim spark-defaults.conf

# Add the following to enable event logging
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://hadoop112:9000/directory
spark.yarn.historyServer.address hadoop112:18080
spark.history.ui.port            18080

The directory hdfs://hadoop112:9000/directory must already exist.

Command to create it:

[kevin@hadoop112 spark-2.1.3]$ hdfs dfs -mkdir /directory

[kevin@hadoop112 conf]$ vim spark-env.sh

# Add
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://hadoop112:9000/directory"
[kevin@hadoop112 spark-2.1.3]$ xsync.sh conf/
[kevin@hadoop112 spark-2.1.3]$ sbin/start-history-server.sh
[kevin@hadoop112 spark-2.1.3]$ bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --executor-memory 1G --total-executor-cores 2 ./examples/jars/spark-examples_2.11-2.1.3.jar 100

http://hadoop112:18080/
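The history server UI at this address lists completed applications read from hdfs://hadoop112:9000/directory. To stop it later:

[kevin@hadoop112 spark-2.1.3]$ sbin/stop-history-server.sh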

Comparison of the Modes

Mode         Machines with Spark installed   Processes to start   Owned by
Local        1                               None                 Spark
Standalone   3                               Master and Worker    Spark
YARN         1                               YARN and HDFS        Hadoop
Using MySQL with Spark

Simply copy the MySQL JDBC driver JAR into Spark's jars/ directory.
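With the driver in place, Spark can read a table over JDBC straight from the shell. A minimal sketch; the database, table, and credentials below are placeholders, not values from this setup:

[kevin@hadoop112 spark-2.1.3]$ bin/spark-shell
scala> val df = spark.read.format("jdbc").
     |   option("url", "jdbc:mysql://hadoop112:3306/testdb").
     |   option("dbtable", "some_table").
     |   option("user", "root").
     |   option("password", "123456").
     |   load()
scala> df.show()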

Using an External Hive with Spark

Spark ships with a built-in Hive.

To use an external Hive instead, copy hive-site.xml from Hive's conf directory into Spark's conf directory, first deleting the me...-.. metadata (the metadata directory left behind by the built-in metastore).
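Once hive-site.xml is in place, the connection can be checked from spark-shell; a sketch, assuming the external metastore is reachable (the databases shown depend on your Hive warehouse):

[kevin@hadoop112 spark-2.1.3]$ bin/spark-shell
scala> spark.sql("show databases").show()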
