Spark Resources

2019-01-29  丹之

Spark Source: External Data Sources

  1. https://blog.csdn.net/oopsoom/article/details/42064075
  2. Spark SQL source code analysis
    https://blog.csdn.net/oopsoom/article/details/38257749

RDD

Spark SQL Execution Flow

Spark Catalyst

Memory

Spark Task

Spark Tuning

1.http://marsishandsome.github.io/SparkSQL-Internal/03-performance-turning/
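Covers tuning in terms of Spark parallelism, data format (columnar storage), and choosing a suitable number of tasks (200 by default for Spark SQL shuffles).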
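
As a rough illustration of the "suitable number of tasks" point, here is a minimal sketch that estimates a shuffle partition count from the data size. The helper names, the 128 MiB target per partition, and the rounding to whole waves of cores are all assumptions made for this example, not something taken from the linked article. Only the config keys `spark.sql.shuffle.partitions` and `spark.default.parallelism` are real Spark settings.

```python
import math

# Spark SQL's default value for spark.sql.shuffle.partitions.
DEFAULT_SHUFFLE_PARTITIONS = 200

# Assumed rule of thumb for this sketch: about 128 MiB of shuffle data per task.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_shuffle_partitions(shuffle_bytes, total_cores,
                               target_partition_bytes=TARGET_PARTITION_BYTES):
    """Hypothetical helper: pick a task count so that each task handles
    roughly target_partition_bytes, rounded up to a multiple of total_cores."""
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if shuffle_bytes <= 0:
        # Nothing to size against; use one task per core.
        return total_cores
    by_size = math.ceil(shuffle_bytes / target_partition_bytes)
    # Round up to full "waves" so that no core sits idle in the last wave.
    waves = math.ceil(by_size / total_cores)
    return waves * total_cores


def tuning_conf(shuffle_bytes, total_cores):
    """Hypothetical helper: build a dict of Spark settings from the estimate."""
    n = suggest_shuffle_partitions(shuffle_bytes, total_cores)
    return {
        "spark.sql.shuffle.partitions": str(n),
        "spark.default.parallelism": str(n),
    }


if __name__ == "__main__":
    # 10 GiB of shuffle data on a cluster with 16 cores in total.
    print(tuning_conf(10 * 1024 ** 3, total_cores=16))
```

The resulting values could be passed to a job with `--conf key=value` on `spark-submit`. Whether the estimate beats the default of 200 depends on the workload, so treat it as a starting point to measure against.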

Spark storage

1.http://jerryshao.me/2013/10/08/spark-storage-module-analysis/
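Introduces the storage module from the communication and storage layers, covering communication between the driver and executors and the core class BlockManager.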

Spark Scheduling

Spark Streaming

1.Structured Streaming: implementation approach and overview
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/1.1%20Structured%20Streaming%20%E5%AE%9E%E7%8E%B0%E6%80%9D%E8%B7%AF%E4%B8%8E%E5%AE%9E%E7%8E%B0%E6%A6%82%E8%BF%B0.md
2.Source analysis
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/2.1%20Structured%20Streaming%20%E4%B9%8B%20Source%20%E8%A7%A3%E6%9E%90.md
3.Sink analysis
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/2.2%20Structured%20Streaming%20%E4%B9%8B%20Sink%20%E8%A7%A3%E6%9E%90.md
