Spark Terminology

2019-03-09 436048bfc6a1

Application

User program built on Spark. Consists of a driver program and executors on the cluster.

A user application built on Spark (e.g. a Scala object written in IDEA).
It consists of one driver program and multiple executors on the cluster.

Application jar

A jar containing the user's Spark application. In some cases users will want to create an "uber jar" containing their application along with its dependencies. The user's jar should never include Hadoop or Spark libraries, however; these will be added at runtime.

A jar that contains the user's Spark application.

Driver program

The process running the main() function of the application and creating the SparkContext

The process that runs the application's main() method and creates the SparkContext.
So the program that creates the SparkContext in its main method is the driver program.

Cluster manager

An external service for acquiring resources on the cluster (e.g. standalone manager, Mesos, YARN)

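An external service that acquires resources on the cluster.
Benefit: while developing, you do not need to care where the code will run;
the code is the same under every mode.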

Deploy mode

Distinguishes where the driver process runs. In "cluster" mode, the framework launches the driver inside of the cluster. In "client" mode, the submitter launches the driver outside of the cluster.

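Determines where the driver process runs. In cluster mode, the framework launches the driver inside the cluster;
in client mode, the submitter launches the driver outside the cluster.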
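
As a rough illustration of the two deploy modes, the sketch below builds a spark-submit argument list in plain Python. The helper `build_submit_command`, the jar name `my-app.jar`, and the class name `com.example.MyApp` are hypothetical and used only for illustration; `--master`, `--deploy-mode`, and `--class` are standard spark-submit options.

```python
from typing import List


def build_submit_command(app_jar: str, main_class: str,
                         master: str, deploy_mode: str) -> List[str]:
    """Build a spark-submit argument list (hypothetical helper, not part of Spark)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode}")
    return [
        "spark-submit",
        "--master", master,            # which cluster manager to talk to
        "--deploy-mode", deploy_mode,  # where the driver process runs
        "--class", main_class,
        app_jar,
    ]


# Same application jar; only the placement of the driver differs.
client_cmd = build_submit_command("my-app.jar", "com.example.MyApp", "yarn", "client")
cluster_cmd = build_submit_command("my-app.jar", "com.example.MyApp", "yarn", "cluster")
print(" ".join(client_cmd))
print(" ".join(cluster_cmd))
```

Only the value after `--deploy-mode` changes between the two commands, which matches the point above that the application code itself stays the same.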

Worker node

Any node that can run application code in the cluster

A node in the cluster that can run application code is called a worker node.

Executor

A process launched for an application on a worker node that runs tasks and keeps data in memory or disk storage across them. Each application has its own executors.

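A process launched for an application on a worker node (e.g. a YARN NodeManager).
It runs tasks (e.g. ones applying map or filter) inside a container,
and it can keep data in memory or on disk across those tasks.
Each application has its own independent executors.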

Task

A unit of work that will be sent to one executor

The unit of work sent to an executor.

Job

A parallel computation consisting of multiple tasks that gets spawned in response to a Spark action (e.g. save, collect)

A parallel computation made up of multiple tasks; one action triggers one job.
Put simply, each call to an action (such as the collect operator) is one job.
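
To make the job/task relationship concrete, here is a toy model in plain Python. It is not Spark's API: `ToyDataset`, its `stats` counter, and the thread pool standing in for executors are all assumptions made for this sketch. Transformations are only recorded; calling the action `collect` starts one job, which runs one task per partition.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyDataset:
    """Toy stand-in for a partitioned dataset; NOT Spark's RDD API."""

    def __init__(self, partitions, ops=(), stats=None):
        self.partitions = partitions
        self.ops = tuple(ops)  # recorded transformations, not yet executed
        self.stats = stats if stats is not None else {"jobs": 0, "tasks": 0}

    def map(self, fn):
        # Transformation: only recorded, no job is started.
        return ToyDataset(self.partitions, self.ops + (("map", fn),), self.stats)

    def filter(self, pred):
        # Transformation: only recorded, no job is started.
        return ToyDataset(self.partitions, self.ops + (("filter", pred),), self.stats)

    def _run_task(self, partition):
        # One task applies every recorded transformation to one partition.
        rows = list(partition)
        for kind, fn in self.ops:
            if kind == "map":
                rows = [fn(x) for x in rows]
            else:
                rows = [x for x in rows if fn(x)]
        return rows

    def collect(self):
        # Action: starts one job made of one task per partition.
        self.stats["jobs"] += 1
        self.stats["tasks"] += len(self.partitions)
        with ThreadPoolExecutor(max_workers=2) as pool:  # stand-in for executors
            results = list(pool.map(self._run_task, self.partitions))
        return [x for part in results for x in part]


data = ToyDataset([[1, 2, 3], [4, 5, 6]])
pipeline = data.map(lambda x: x * 2).filter(lambda x: x % 4 == 0)
jobs_before_action = data.stats["jobs"]  # still 0: nothing has run yet
result = pipeline.collect()
print(result, data.stats)
```

In this toy model, calling `collect` a second time would count a second job, mirroring the rule that each action call is one job.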

Stage

Each job gets divided into smaller sets of tasks called stages that depend on each other (similar to the map and reduce stages in MapReduce).

A job is divided into one or more stages.
Each shuffle encountered starts a new stage.
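
The "new stage at every shuffle" rule can be sketched with a small plain-Python function. This is a simplified model, not Spark's actual scheduler: the function `split_into_stages` is hypothetical, the `SHUFFLE_OPS` set is an assumption about which operations shuffle, and placing the shuffle operation at the start of the new stage is a simplification.

```python
from typing import List

# Assumption for this sketch: operations treated as requiring a shuffle.
SHUFFLE_OPS = {"reduceByKey", "groupByKey", "join", "repartition"}


def split_into_stages(ops: List[str]) -> List[List[str]]:
    """Split a linear chain of operation names into stages at shuffle boundaries."""
    stages: List[List[str]] = [[]]
    for op in ops:
        # A shuffle closes the current stage and opens a new one.
        if op in SHUFFLE_OPS and stages[-1]:
            stages.append([])
        stages[-1].append(op)
    return stages


# A word-count style chain: one shuffle, so the job has two stages.
word_count = ["textFile", "flatMap", "map", "reduceByKey", "map", "collect"]
stages = split_into_stages(word_count)
for i, stage in enumerate(stages):
    print(f"stage {i}: {stage}")
```

A chain with no shuffle operation stays in a single stage, while each additional shuffle adds one more.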
