
Spark03: Building a Scala Project with Gradle and Running It on Spark

2020-05-17  山高月更阔

Create the Scala project

mkdir demo
cd demo
gradle init --type scala-library

When prompted, choose Groovy as the Gradle DSL; everything else can be left at the defaults.
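For orientation, `gradle init --type scala-library` produces roughly the following layout (sketched from a typical Gradle 5.x/6.x run; minor details vary by version):

```
demo
├── build.gradle
├── settings.gradle
├── gradlew / gradlew.bat
├── gradle/wrapper/
└── src
    ├── main/scala/
    └── test/scala/
```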

The generated default build.gradle:

plugins {
    // Apply the scala plugin to add support for Scala
    id 'scala'
}

repositories {
    // Use jcenter for resolving your dependencies.
    // You can declare any Maven/Ivy/file repository here.
    jcenter()
}

dependencies {
    // Use Scala 2.12 in our library project
    implementation 'org.scala-lang:scala-library:2.12.7'

    // Use Scalatest for testing our library
    testImplementation 'junit:junit:4.12'
    testImplementation 'org.scalatest:scalatest_2.12:3.0.5'

    // Need scala-xml at test runtime
    testRuntimeOnly 'org.scala-lang.modules:scala-xml_2.12:1.1.1'
}

Configure Maven repositories

repositories {
    // Use jcenter for resolving dependencies.
    // You can declare any Maven/Ivy/file repository here.
    mavenLocal()
    maven { url 'http://maven.aliyun.com/nexus/content/groups/public/' }
    mavenCentral()
    jcenter()
}

Add the Spark dependencies

dependencies {
    // Use Scala 2.12 in our library project
    implementation 'org.scala-lang:scala-library:2.12.8'

    // https://mvnrepository.com/artifact/org.apache.spark/spark-core
    implementation 'org.apache.spark:spark-core_2.12:2.4.5'
    compileOnly 'org.apache.spark:spark-sql_2.12:2.4.5'

    // https://mvnrepository.com/artifact/org.apache.spark/spark-streaming
    implementation 'org.apache.spark:spark-streaming_2.12:2.4.5'

    // Use Scalatest for testing our library
    testImplementation 'junit:junit:4.12'
    testImplementation 'org.scalatest:scalatest_2.12:3.0.8'

    // Need scala-xml at test runtime
    testRuntimeOnly 'org.scala-lang.modules:scala-xml_2.12:1.2.0'
}

Note in particular that spark-sql_2.12 must be declared with compileOnly; otherwise, running the jar fails with an error that the main method cannot be found.

Configure the jar task

jar {
    // See https://docs.gradle.org/current/dsl/org.gradle.api.tasks.bundling.Jar.html for details
    archivesBaseName = 'Example' // base name of the jar file
    manifest { // configure the jar's manifest
        attributes(
                "Manifest-Version": 1.0,
                'Main-Class': 'com.andy.example.Main' // the class containing the main method
        )
    }
    // bundle runtime dependencies into the jar (fat jar)
    from {
        (configurations.runtimeClasspath).collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
}
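One refinement worth knowing about (my addition, not part of the original setup): when a dependency ships as a signed jar, unpacking it into a fat jar leaves behind stale signature files that can make the resulting jar refuse to run. A common fix is to exclude them in the same task:

```
jar {
    // ... manifest and from { } as above ...
    // strip stale signature files carried over from signed dependency jars
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
```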

The complete build.gradle:

/*
 * This file was generated by the Gradle 'init' task.
 *
 * This generated file contains a sample Scala library project to get you started.
 * For more details take a look at the Scala plugin chapter in the Gradle
 * User Manual available at https://docs.gradle.org/5.6.2/userguide/scala_plugin.html
 */

plugins {
    // Apply the scala plugin to add support for Scala
    id 'scala'
    id 'maven-publish'
    id 'idea'
}

repositories {
    // Use jcenter for resolving dependencies.
    // You can declare any Maven/Ivy/file repository here.
    mavenLocal()
    maven { url 'http://maven.aliyun.com/nexus/content/groups/public/' }
    mavenCentral()
    jcenter()
}

dependencies {
    // Use Scala 2.12 in our library project
    implementation 'org.scala-lang:scala-library:2.12.8'


    // https://mvnrepository.com/artifact/org.apache.spark/spark-core
    implementation 'org.apache.spark:spark-core_2.12:2.4.5'
    compileOnly 'org.apache.spark:spark-sql_2.12:2.4.5'

    // https://mvnrepository.com/artifact/org.apache.spark/spark-streaming
    implementation 'org.apache.spark:spark-streaming_2.12:2.4.5'


    // Use Scalatest for testing our library
    testImplementation 'junit:junit:4.12'
    testImplementation 'org.scalatest:scalatest_2.12:3.0.8'

    // Need scala-xml at test runtime
    testRuntimeOnly 'org.scala-lang.modules:scala-xml_2.12:1.2.0'
}

jar {
    // See https://docs.gradle.org/current/dsl/org.gradle.api.tasks.bundling.Jar.html for details
    archivesBaseName = 'Example' // base name of the jar file
    manifest { // configure the jar's manifest
        attributes(
                "Manifest-Version": 1.0,
                'Main-Class': 'com.andy.example.Main' // the class containing the main method
        )
    }
    // bundle runtime dependencies into the jar (fat jar)
    from {
        (configurations.runtimeClasspath).collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
}

Install the Scala plugin in IDEA

(Screenshot: installing the Scala plugin from IDEA's plugin settings)

Install the Scala plugin as shown above.

Test code

Under src/main/scala, create the package com.andy.example and add a Main object:

package com.andy.example

import org.apache.spark.sql.SparkSession

import scala.math.random


object Main {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("Spark Pi")
      .getOrCreate()
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.sparkContext.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y <= 1) 1 else 0
    }.reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / (n - 1)}")
    spark.stop()

  }
}
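The sampling logic in Main can be sanity-checked without a Spark cluster. Here is a plain-Scala sketch of the same Monte Carlo estimate (illustrative only; PiEstimate is my own name, not part of the project, and a fixed seed is used for reproducibility):

```scala
import scala.util.Random

// Plain-Scala version of the same Monte Carlo Pi estimate, without Spark.
object PiEstimate {
  def estimate(n: Int, seed: Long = 42L): Double = {
    val rng = new Random(seed)
    // count random points in the unit square that fall inside the unit circle
    val count = (1 until n).count { _ =>
      val x = rng.nextDouble() * 2 - 1
      val y = rng.nextDouble() * 2 - 1
      x * x + y * y <= 1
    }
    4.0 * count / (n - 1)
  }

  def main(args: Array[String]): Unit =
    println(s"Pi is roughly ${estimate(200000)}")
}
```

Compared with Main above, this swaps the Spark RDD for a local loop; the arithmetic `4.0 * count / (n - 1)` is unchanged.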

Build

gradle build

Run

spark-submit build/libs/Example.jar 

By default this runs in local mode; for other deployment modes, refer to the spark-submit documentation.
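For example, the optional slices argument (args(0) in Main) and an explicit master can be passed on the command line (illustrative usage; the standalone master URL is hypothetical and must match your cluster):

```
# local run with 8 threads and 10 slices
spark-submit --master 'local[8]' build/libs/Example.jar 10

# standalone cluster run (replace host with your master)
spark-submit --master spark://host:7077 build/libs/Example.jar 10
```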
