Big Data Study Notes 17: Compiling Spark 2.3.1
1. Source download page: http://spark.apache.org/downloads.html
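As a rough sketch of fetching and unpacking the source, the commands below assume the release is still available in the Apache archive under the usual `dist/spark` layout and that the source lives in /home/hadoop/source (the directory used in step 4). The helper name `spark_source_url` is made up for this example, and the script only prints the commands so they can be checked before running.

```shell
#!/bin/sh
SPARK_VERSION="2.3.1"
SOURCE_DIR="/home/hadoop/source"   # assumed from the pom.xml path in step 4

# Build the archive URL of the source tarball for a given Spark release.
# Assumption: the Apache archive layout dist/spark/spark-<version>/spark-<version>.tgz
spark_source_url() {
    echo "https://archive.apache.org/dist/spark/spark-$1/spark-$1.tgz"
}

# Print the commands for review; run them by hand once they look right.
echo "wget -P $SOURCE_DIR $(spark_source_url "$SPARK_VERSION")"
echo "tar -xzf $SOURCE_DIR/spark-$SPARK_VERSION.tgz -C $SOURCE_DIR"
```

Unpacking this way should leave the tree in /home/hadoop/source/spark-2.3.1, which matches the pom.xml path edited later.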
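2. Environment used for this build (make sure you have a VPN; otherwise some packages cannot be downloaded and the build will fail)
JDK 1.8, Maven 3.3.9, Scala 2.11.8, Hadoop 2.6.0-cdh5.7.0, Hive 1.1.0, Flume 1.6.0, ZooKeeper 3.4.5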
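Before starting, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch, not part of the original notes; `check_tool` is a hypothetical helper, and it only reports presence, so the versions still have to be compared by hand against the list above.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Tools needed for the build itself; adjust the list as required.
for tool in java mvn scala git; do
    check_tool "$tool"
done
```

For any tool reported as found, `java -version`, `mvn -version`, `scala -version` and `git --version` print the installed version.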
3. Install Git on CentOS 7
# yum install git -y
4. Edit /home/hadoop/source/spark-2.3.1/pom.xml
Update the version values inside <properties></properties> in pom.xml
Add the Cloudera and Aliyun repositories as <repository></repository> entries:
https://maven.aliyun.com/nexus/content/groups/public
https://repository.cloudera.com/artifactory/cloudera-repos
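The two repository entries might look roughly like the output of the sketch below. The URLs are the ones listed above; the `<id>` values and the helper name `print_repositories` are illustrative assumptions, and the printed XML is meant to be pasted by hand next to the existing <repository> entries in pom.xml.

```shell
#!/bin/sh
# Print two <repository> entries for pom.xml.
# The <id> values are illustrative; any unique id should work.
print_repositories() {
    cat <<'EOF'
<repository>
  <id>aliyun</id>
  <url>https://maven.aliyun.com/nexus/content/groups/public</url>
</repository>
<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
</repository>
EOF
}

print_repositories
```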
Run: mvn -X -Dmaven.test.skip=true -Dscala-2.11 -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.7.0 -Phive -Phive-thriftserver clean scala:compile package
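As an alternative to calling mvn directly, the Spark source tree ships a `dev/make-distribution.sh` script that wraps the Maven build and can produce a deployable tarball. The sketch below is an assumption about how the same profiles could be passed to it, not something taken from these notes. The `--name` value and the helper `build_args` are hypothetical, and the MAVEN_OPTS value is the one suggested in the Spark 2.3 build documentation. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Give Maven more memory, as suggested in the Spark 2.3 build documentation.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

# CDH artifact versions are written without a dash after "cdh", e.g. 2.6.0-cdh5.7.0.
HADOOP_VERSION="${HADOOP_VERSION:-2.6.0-cdh5.7.0}"

# Assemble the profile and property flags used in the mvn command above.
build_args() {
    echo "-Pyarn -Phadoop-2.6 -Dhadoop.version=$1 -Phive -Phive-thriftserver"
}

# Print the command for review instead of running it.
# --name only labels the output tarball; the value here is illustrative.
echo "./dev/make-distribution.sh --name $HADOOP_VERSION --tgz $(build_args "$HADOOP_VERSION")"
```

Run from the root of the Spark source tree, the printed command would be expected to leave a spark-2.3.1-bin-<name>.tgz file there if the build succeeds.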
5. Dependency coordinates can also be looked up at these addresses:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_vd_cdh5_maven_repo.html
http://mvnrepository.com/
https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh5_maven_repo.html