Scala Beginner Notes

2018-11-21  geekAppke

Installing Scala on macOS

brew cask install java
brew install scala

Setting up the local Scala environment in ~/.zshrc

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home
export SCALA_HOME=/Library/Scala/scala-2.10.6
export PATH="$PATH:${SCALA_HOME}/bin:${JAVA_HOME}/bin"
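
To confirm which Scala and Java versions the shell actually picks up after editing ~/.zshrc, a small check program can help. This is a minimal sketch; the object name EnvCheck is hypothetical and not part of the original notes.

```scala
object EnvCheck {
  def main(args: Array[String]): Unit = {
    // Version of the Scala library this program is running against
    println(scala.util.Properties.versionString)
    // Version of the JVM that launched the program
    println(System.getProperty("java.version"))
    // Value of SCALA_HOME as seen by the process, if it is set at all
    println(sys.env.getOrElse("SCALA_HOME", "(SCALA_HOME not set)"))
  }
}
```

If the reported Scala version differs from the directory named in SCALA_HOME, another installation (for example the one from brew) probably comes first on the PATH.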

Hello World

➜  ~ scala
Welcome to Scala 2.12.7 (OpenJDK 64-Bit Server VM, Java 11.0.1).
Type in expressions for evaluation. Or try :help.

scala> print("hello world!")
hello world!
scala> :quit

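Installing Scala for IDEA and on Mac
Fixing the "could not find or load main class" error when running Scala programs in IDEA
Click + to add the Scala SDK as a Library (the program then runs without errors), replacing the Scala SDK originally listed under the module's Dependencies (which compiles without errors but fails at run time with "could not find or load main class").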

Plugin downloads stalling? Configure an HTTP proxy.

Running a Scala program in IDEA

object Test {
  def main(args: Array[String]): Unit = {
    println("Hello World~ ~ ~")
  }
}

Configuring the Scala plugin in Eclipse

Download the plugin (it must match your Eclipse version):
http://scala-ide.org/download/prev-stable.html  


Go into the dropins directory of the Eclipse installation, create a folder named scala, and copy the plugin's features and plugins folders into dropins/scala.

The six features listed on the official Scala site

WordCount in Scala

Import the spark-assembly-1.6.0-hadoop2.6.0.jar package and create a words.txt file in the project.

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.rdd.RDD.rddToPairRDDFunctions

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setMaster("local").setAppName("WC")
    val sc = new SparkContext(conf)
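    val lines: RDD[String] = sc.textFile("./words.txt")
    // flatMap: each line yields many words
    val words: RDD[String] = lines.flatMap { line => line.split(" ") }
    // map: each word becomes one (word, 1) pair
    val pairs: RDD[(String, Int)] = words.map { x => (x, 1) }
    // reduceByKey: sum the counts that share the same word
    val result = pairs.reduceByKey { (a, b) => a + b }
    // result.sortBy(_._1, false).foreach(println)  // descending by word
    result.sortBy(_._1, true).foreach(println)      // ascending by word

    // Shorthand version
    // lines.flatMap { _.split(" ") }.map { (_, 1) }.reduceByKey(_ + _).foreach(println)

    sc.stop()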
  }
}

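flatMap: one-to-many; each input line produces multiple words.
map: one-to-one; each input element produces exactly one output element (here, each word String becomes one (word, 1) pair).
reduceByKey: records with the same key are placed in one group, and the values belonging to that key are then aggregated (here, summed).
In short: group first, then aggregate the values of each key.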
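
The same three steps can be tried without Spark on an ordinary Scala collection. The sketch below is only an illustration: the object name WordCountLocal and the sample lines are hypothetical, and groupBy followed by a sum stands in for Spark's reduceByKey.

```scala
object WordCountLocal {
  // Count words in plain Scala collections, mirroring the Spark pipeline
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // one-to-many: line -> words
      .map(w => (w, 1))        // one-to-one: word -> (word, 1)
      .groupBy(_._1)           // group the pairs that share a key
      .map { case (w, ps) => (w, ps.map(_._2).sum) } // sum the values per key

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, not the words.txt used in the notes
    val sample = Seq("hello java", "hello python", "hello")
    count(sample).toSeq.sortBy(_._1).foreach(println)
    // prints:
    // (hello,3)
    // (java,1)
    // (python,1)
  }
}
```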

Output

(c++,2)
(hbase,2)
(hello,17)
(hive,1)
(java,5)
(matlab,3)
(mongodb,1)
(mysql,3)
(objective-c,2)
(oracle,1)
(pig,1)
(python,8)
(redies,2)
(sqoop,3)
(swift,3)
(word,4)
(zookeeper,1)
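
The output above is ordered alphabetically by word. To rank by frequency instead, sort on the second element of each pair; on the RDD this would be result.sortBy(_._2, false). A minimal sketch on a plain collection follows, using a few pairs taken from the output above; the object name RankByCount is hypothetical.

```scala
object RankByCount {
  // Sort (word, count) pairs by count, largest first
  def rank(counts: Seq[(String, Int)]): Seq[(String, Int)] =
    counts.sortBy { case (_, n) => -n }

  def main(args: Array[String]): Unit = {
    // A few pairs copied from the WordCount output
    val counts = Seq(("hive", 1), ("java", 5), ("hello", 17), ("python", 8))
    rank(counts).foreach(println)
    // prints:
    // (hello,17)
    // (python,8)
    // (java,5)
    // (hive,1)
  }
}
```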

