Spark Notes

2017-03-24  开水的杯子

Constructing RDDs

Shared Variables

Implementation

RDD Implementation

Shared Variables Implementation

Interpreter Integration

Performance benchmarks

Related Work

Future work — was this achieved?

  1. Formally characterize the properties of RDDs and Spark’s other abstractions, and their suitability for various classes of applications and workloads.
  2. **Enhance the RDD abstraction to allow programmers to trade between storage cost and re-construction cost.**
  3. Define new operations to transform RDDs, including a “shuffle” operation that repartitions an RDD by a given key. Such an operation would allow us to implement group-bys and joins.
  4. Provide higher-level interactive interfaces on top of the Spark interpreter, such as SQL and R [4] shells.
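
To make item 3 concrete, here is a minimal sketch in plain Python of how a key-based shuffle enables group-bys and joins. It is not Spark's API or implementation: the function names `shuffle_by_key`, `group_by_key`, and `join_by_key` are hypothetical, a "partition" is assumed to be just a list of `(key, value)` pairs, and hash partitioning is assumed as the repartitioning rule.

```python
from collections import defaultdict


def shuffle_by_key(partitions, num_partitions):
    """Repartition (key, value) records so equal keys share one output partition.

    Assumption: hash partitioning, i.e. partition index = hash(key) % num_partitions.
    """
    out = [[] for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            out[hash(key) % num_partitions].append((key, value))
    return out


def group_by_key(partitions, num_partitions):
    """Group values per key; after the shuffle each partition is grouped locally."""
    result = []
    for part in shuffle_by_key(partitions, num_partitions):
        groups = defaultdict(list)
        for key, value in part:
            groups[key].append(value)
        result.append(dict(groups))
    return result


def join_by_key(left, right, num_partitions):
    """Inner join: shuffle both sides the same way, then join partition by partition."""
    left_parts = shuffle_by_key(left, num_partitions)
    right_parts = shuffle_by_key(right, num_partitions)
    result = []
    for left_part, right_part in zip(left_parts, right_parts):
        index = defaultdict(list)
        for key, value in left_part:
            index[key].append(value)
        result.append([
            (key, (left_value, right_value))
            for key, right_value in right_part
            for left_value in index.get(key, [])
        ])
    return result


# Two toy datasets, each already split into two input partitions.
visits = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
names = [[("a", "Alice")], [("b", "Bob")]]

grouped = group_by_key(visits, 2)
joined = join_by_key(visits, names, 2)
```

The point of the sketch is that once both inputs are partitioned by the same rule, all records for a given key sit in the same partition index, so grouping and joining need no further cross-partition communication.
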