
Spark from Beginner to Mastery 66: Other Common Dataset Functions

2020-07-21  勇于自信

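Other commonly used Dataset functions include:
Date functions: current_date, current_timestamp
Math functions: round
Random functions: rand
String functions: concat, concat_ws
User-defined functions: UDF and UDAF
The official function reference: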
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions
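The list above mentions UDFs and UDAFs, but the practice code in this post only covers the built-in functions. The following is a minimal sketch of both, not the author's code: the case classes Employee and AvgBuffer, the salaryBand rule (20000 or more counts as "high"), and the AvgSalary aggregator are hypothetical names introduced for illustration. It assumes a Spark 2.x or later build with the typed Aggregator API, and it builds a small in-memory Dataset from a few of the sample rows so that it needs no input file.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical types for this sketch; JSON integers are inferred as Long, so Long is used here too.
case class Employee(name: String, age: Long, depId: Long, gender: String, salary: Long)
case class AvgBuffer(sum: Long, count: Long)

// UDAF sketch: a typed Aggregator that computes the average salary.
object AvgSalary extends Aggregator[Employee, AvgBuffer, Double] {
  def zero: AvgBuffer = AvgBuffer(0L, 0L)
  // Fold one input row into the running buffer.
  def reduce(b: AvgBuffer, e: Employee): AvgBuffer = AvgBuffer(b.sum + e.salary, b.count + 1)
  // Combine buffers produced on different partitions.
  def merge(a: AvgBuffer, b: AvgBuffer): AvgBuffer = AvgBuffer(a.sum + b.sum, a.count + b.count)
  // Turn the final buffer into the result; an empty input yields 0.0.
  def finish(b: AvgBuffer): Double = if (b.count == 0) 0.0 else b.sum.toDouble / b.count
  def bufferEncoder: Encoder[AvgBuffer] = Encoders.product[AvgBuffer]
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

object UdfUdafSketch {
  // Plain Scala function wrapped by the UDF (the 20000 threshold is an assumption).
  def salaryBand(salary: Long): String = if (salary >= 20000L) "high" else "normal"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("UdfUdafSketch").master("local").getOrCreate()
    import spark.implicits._

    // Three rows copied from employee.json.
    val employee = Seq(
      Employee("Leo", 25, 1, "male", 20000),
      Employee("Marry", 30, 2, "female", 25000),
      Employee("Jack", 35, 1, "male", 15000)
    ).toDS()

    // UDF: applied row by row, like a built-in column function.
    val salaryBandUdf = udf(salaryBand _)
    employee.select(col("name"), salaryBandUdf(col("salary")).as("salary_band")).show()

    // UDAF: collapses all rows into a single aggregated value.
    employee.select(AvgSalary.toColumn.name("avg_salary")).show()

    spark.stop()
  }
}
```

The UDF works on one value at a time, while the Aggregator keeps a buffer across rows, which is why it needs the zero, reduce, merge, and finish steps.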

Practice:
Input data:
employee.json:

{"name": "Leo", "age": 25, "depId": 1, "gender": "male", "salary": 20000}
{"name": "Marry", "age": 30, "depId": 2, "gender": "female", "salary": 25000}
{"name": "Jack", "age": 35, "depId": 1, "gender": "male", "salary": 15000}
{"name": "Tom", "age": 42, "depId": 3, "gender": "male", "salary": 18000}
{"name": "Kattie", "age": 21, "depId": 3, "gender": "female", "salary": 21000}
{"name": "Jen", "age": 30, "depId": 2, "gender": "female", "salary": 28000}
{"name": "Jen", "age": 19, "depId": 2, "gender": "female", "salary": 8000}

department.json:

{"id": 1, "name": "Technical Department"}
{"id": 2, "name": "Financial Department"}
{"id": 3, "name": "HR Department"}

Code:

package com.spark.ds

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object OtherFunction {

  def main(args: Array[String]): Unit = {
    // Build a local SparkSession; the warehouse path is specific to the author's machine.
    val spark = SparkSession
      .builder()
      .appName("OtherFunction")
      .master("local")
      .config("spark.sql.warehouse.dir", "D:/spark-warehouse")
      .getOrCreate()
    // Load the sample data shown above.
    val employee = spark.read.json("inputData/employee.json")
    // department is loaded here but not used by the query below.
    val department = spark.read.json("inputData/department.json")
    // Apply the date, random, math, and string functions in a single select.
    employee.select(employee("name"), current_date(), current_timestamp(), rand(),
      round(employee("salary"), 2), concat(employee("gender"), employee("age")),
      concat_ws("|", employee("gender"), employee("age")))
      .show()

    spark.stop()
  }
}

Output:

+------+--------------+--------------------+-------------------------+----------------+-------------------+-------------------------+
|  name|current_date()| current_timestamp()|rand(9122844294398711437)|round(salary, 2)|concat(gender, age)|concat_ws(|, gender, age)|
+------+--------------+--------------------+-------------------------+----------------+-------------------+-------------------------+
|   Leo|    2020-07-21|2020-07-21 00:47:...|       0.7503891080032685|           20000|             male25|                  male|25|
| Marry|    2020-07-21|2020-07-21 00:47:...|       0.4340990615089132|           25000|           female30|                female|30|
|  Jack|    2020-07-21|2020-07-21 00:47:...|       0.2020792875471602|           15000|             male35|                  male|35|
|   Tom|    2020-07-21|2020-07-21 00:47:...|       0.1784916488061976|           18000|             male42|                  male|42|
|Kattie|    2020-07-21|2020-07-21 00:47:...|       0.3918989540118957|           21000|           female21|                female|21|
|   Jen|    2020-07-21|2020-07-21 00:47:...|       0.3349504449575764|           28000|           female30|                female|30|
|   Jen|    2020-07-21|2020-07-21 00:47:...|       0.8679772995821763|            8000|           female19|                female|19|
+------+--------------+--------------------+-------------------------+----------------+-------------------+-------------------------+
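Two details in this output are worth a closer look. The header rand(9122844294398711437) shows the seed Spark picked for that run, so the random column changes on every execution, and round(salary, 2) leaves the values unchanged because salary is already an integer. The sketch below is one possible variation, not part of the original code: it passes an explicit seed to rand, rounds a fractional value, and renames the generated columns. The object name, the column aliases, the seed 42, and the idea of dividing salary by 12 are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object AliasedFunctionsSketch {
  // Same functions as above, but with readable column names (all aliases are hypothetical).
  def withAliases(employee: DataFrame): DataFrame =
    employee.select(
      col("name"),
      current_date().as("today"),
      rand(42L).as("random"),                      // explicit seed; 42 is arbitrary
      round(col("salary") / 12, 2).as("monthly"),  // assumption: treat salary as a yearly figure
      concat_ws("|", col("gender"), col("age")).as("gender_age")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AliasedFunctionsSketch").master("local").getOrCreate()
    import spark.implicits._

    // Two rows copied from employee.json, so the sketch needs no input file.
    val employee = Seq(
      ("Leo", 25L, "male", 20000L),
      ("Marry", 30L, "female", 25000L)
    ).toDF("name", "age", "gender", "salary")

    // Passing false disables truncation of long cell values such as timestamps.
    withAliases(employee).show(false)

    spark.stop()
  }
}
```

Aliases keep the headers short and stable, which also makes the result easier to reference in later select or join steps.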