Spark and Functional Programming

2021-04-20  诺之林

This article builds on the earlier Introduction to Functional Programming (函数式编程简介).

Contents

Introduction

Characteristics

This means functions can be treated as values: they can be assigned to variables, passed as arguments to other functions, and returned from functions.

```scala
// Assign a function literal to a value
val f = (s: String) => println(s)

// Pass the function value into another function (map)
Array("Hello", "Scala").map(f)
```

Since data structures can't be changed, 'adding' or 'removing' something from an immutable collection means creating a new collection just like the old one, but with the needed change. Spark's RDD follows the same principle: a transformation such as map never modifies the source RDD, it returns a new one, as the spark-shell session below shows.

```bash
/opt/services/spark/bin/spark-shell
```

```scala
val rdd = sc.parallelize(Array(1, 2, 3, 4, 5))
// rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize

val mapRDD = rdd.map(i => 10 + i)
// mapRDD: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map
```
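Collecting both RDDs confirms that map created a new dataset and left the original untouched. Continuing the same session (a sketch, with the output Spark would print for this data):

```scala
mapRDD.collect()
// res0: Array[Int] = Array(11, 12, 13, 14, 15)

rdd.collect()
// res1: Array[Int] = Array(1, 2, 3, 4, 5) -- the source RDD is unchanged
```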

Requirements

- Pure Function: given the same input, the output is always the same
- No Side Effects: the function does not modify any variable outside itself
- Expression: a pure computation that always yields a value
- Statement: performs some action and yields no value

A short Scala sketch contrasting these terms follows this list.
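As a minimal sketch of the four terms above (the function names are illustrative, not from the original article):

```scala
// Pure function: the same input always yields the same output
def square(x: Int): Int = x * x

// Side effect: this function reads and mutates state outside itself
var counter = 0
def impureNext(): Int = { counter += 1; counter }

// Expression: a pure computation that yields a value (if/else is an expression in Scala)
val label = if (square(3) > 5) "big" else "small"

// Statement: performs an action and yields no meaningful value (Unit)
println(label)
```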

Value

[Figures: function-programming-introduction-01.png, function-programming-introduction-02.png]
