Spark RPC Client Request and Server

2018-05-11  博弈史密斯

This post should be read together with the articles below; what follows is a record of my own study.
https://blog.csdn.net/u011564172/article/details/62043236
https://blog.csdn.net/u011564172/article/details/60875013
https://blog.csdn.net/u011564172/article/details/60143168
https://blog.csdn.net/u011564172/article/details/59113617

In Master's main method, RpcEnv.create is called and returns a NettyRpcEnv instance; NettyRpcEnv extends RpcEnv, and the create method ultimately starts the Netty server (see Spark RPC之Netty启动 for details).

setupEndpoint is then called on the NettyRpcEnv instance returned by RpcEnv.create:

    val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
    val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
      new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
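
The second argument to setupEndpoint is the endpoint itself; Master is an RpcEndpoint. As a minimal sketch of what such an endpoint looks like (EchoEndpoint below is a hypothetical example, not Spark source), an RpcEndpoint implements receive for one-way messages and receiveAndReply for messages that expect an answer:

  // Hypothetical endpoint, for illustration only; Master follows the same pattern.
  class EchoEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {

    // One-way messages (RpcEndpointRef.send) end up here.
    override def receive: PartialFunction[Any, Unit] = {
      case msg: String => println(s"got one-way message: $msg")
    }

    // Request/reply messages (RpcEndpointRef.ask) end up here.
    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case msg: String => context.reply(s"echo: $msg")
    }
  }

  // Registration works exactly like the Master's:
  val echoRef = rpcEnv.setupEndpoint("echo", new EchoEndpoint(rpcEnv))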

This actually delegates to the Dispatcher's registerRpcEndpoint method:

  // code from NettyRpcEnv.scala
  override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
    dispatcher.registerRpcEndpoint(name, endpoint)
  }
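
Roughly, registerRpcEndpoint builds a NettyRpcEndpointRef for the endpoint, stores the endpoint together with a freshly created Inbox, and returns the ref. The following is a simplified sketch in the style of Spark 2.x, not the verbatim source:

  // Simplified sketch of Dispatcher.registerRpcEndpoint (not the verbatim Spark source).
  def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
    val addr = RpcEndpointAddress(nettyEnv.address, name)
    val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
    synchronized {
      // EndpointData bundles the endpoint, its ref and a new Inbox that queues its messages.
      if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
        throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
      }
      // Offer it to the receivers queue so a MessageLoop thread processes its pending
      // messages (the Inbox starts out containing an OnStart message).
      receivers.offer(endpoints.get(name))
    }
    endpointRef
  }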

NettyRpcEnv.scala also creates a TransportContext:

  private val transportContext = new TransportContext(transportConf,
    new NettyRpcHandler(dispatcher, this, streamManager))

When the TransportContext is constructed, a NettyRpcHandler is created and passed in as its RpcHandler; NettyRpcHandler extends RpcHandler. Here is part of the NettyRpcHandler class:

private[netty] class NettyRpcHandler(
    dispatcher: Dispatcher,
    nettyEnv: NettyRpcEnv,
    streamManager: StreamManager) extends RpcHandler with Logging {

  override def receive(
      client: TransportClient,
      message: ByteBuffer,
      callback: RpcResponseCallback): Unit = {
    val messageToDispatch = internalReceive(client, message)
    dispatcher.postRemoteMessage(messageToDispatch, callback)
  }

  override def receive(
      client: TransportClient,
      message: ByteBuffer): Unit = {
    val messageToDispatch = internalReceive(client, message)
    dispatcher.postOneWayMessage(messageToDispatch)
  }
}
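
Both overloads first call internalReceive, which is not shown above. Roughly, it reads the sender's address from the TransportClient's channel and deserializes the ByteBuffer into a RequestMessage carrying the sender address, the target endpoint and the message content. A simplified sketch follows (details such as the deserialization call differ across Spark versions):

  // Simplified sketch of NettyRpcHandler.internalReceive (not the verbatim Spark source).
  private def internalReceive(client: TransportClient, message: ByteBuffer): RequestMessage = {
    // Resolve the remote (sender) address from the underlying Netty channel.
    val addr = client.getChannel.remoteAddress().asInstanceOf[InetSocketAddress]
    val clientAddr = RpcAddress(addr.getHostString, addr.getPort)
    // Deserialize the raw bytes into a RequestMessage(senderAddress, receiver, content).
    val requestMessage = nettyEnv.deserialize[RequestMessage](client, message)
    // If the sender did not embed its own address, fall back to the client's socket address.
    if (requestMessage.senderAddress == null) {
      RequestMessage(clientAddr, requestMessage.receiver, requestMessage.content)
    } else {
      requestMessage
    }
  }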

There are two overridden receive methods. As we know, receive is what accepts RPC messages sent from the remote side; both overloads ultimately post the message to the Dispatcher (postRemoteMessage / postOneWayMessage, which both go through Dispatcher's postMessage).
So where is receive itself called from? It is invoked from the rpcHandler member of TransportRequestHandler, which holds a reference to our NettyRpcHandler. Let's trace how NettyRpcHandler is handed, step by step, to TransportRequestHandler's rpcHandler:

First, TransportContext's rpcHandler member takes the NettyRpcHandler reference:

  public TransportContext(TransportConf conf, RpcHandler rpcHandler) {
    this(conf, rpcHandler, false);
  }

  public TransportContext(...RpcHandler rpcHandler) {
    ...
    this.rpcHandler = rpcHandler;
  }

TransportContext then passes rpcHandler on to the TransportServer it creates:

  public TransportServer createServer(int port, List<TransportServerBootstrap> bootstraps) {
    return new TransportServer(this, null, port, rpcHandler, bootstraps);
  }

TransportServer's appRpcHandler member now holds the NettyRpcHandler reference:

  public TransportServer(...RpcHandler appRpcHandler) {
    ...
    this.appRpcHandler = appRpcHandler;
  }

In TransportServer's init method, appRpcHandler (assigned to a local rpcHandler variable, which any server bootstraps may wrap) is passed to TransportContext's initializePipeline method:

private void init(String hostToBind, int portToBind) {
  ...
  context.initializePipeline(ch, rpcHandler);
}

Let's look at TransportContext's initializePipeline method:

public TransportChannelHandler initializePipeline(SocketChannel channel, RpcHandler channelRpcHandler) {
  ...
  TransportChannelHandler channelHandler = createChannelHandler(channel, channelRpcHandler);
  // the TransportChannelHandler is added to the pipeline below
  channel.pipeline()
        .addLast("encoder", ENCODER)
        .addLast(TransportFrameDecoder.HANDLER_NAME, NettyUtils.createFrameDecoder())
        .addLast("decoder", DECODER)
        .addLast("idleStateHandler", new IdleStateHandler(0, 0, conf.connectionTimeoutMs() / 1000))
        // NOTE: Chunks are currently guaranteed to be returned in the order of request, but this
        // would require more logic to guarantee if this were not part of the same event loop.
        .addLast("handler", channelHandler);
   return channelHandler;
}

initializePipeline creates a TransportChannelHandler and returns it.
Here is the createChannelHandler method:

  private TransportChannelHandler createChannelHandler(...RpcHandler rpcHandler) {
    TransportRequestHandler requestHandler = new TransportRequestHandler(channel, client,
      rpcHandler);
    return new TransportChannelHandler(client, responseHandler, requestHandler,
      conf.connectionTimeoutMs(), closeIdleConnections);
  }

So, in the end, the rpcHandler member of TransportRequestHandler holds the NettyRpcHandler reference.

Now let's look at where TransportRequestHandler uses rpcHandler:

  private void processRpcRequest(final RpcRequest req) {
      rpcHandler.receive(reverseClient, req.body().nioByteBuffer(), new RpcResponseCallback() {
        @Override
        public void onSuccess(ByteBuffer response) {
          respond(new RpcResponse(req.requestId, new NioManagedBuffer(response)));
        }
      });
  }

  private void processOneWayMessage(OneWayMessage req) {
    rpcHandler.receive(reverseClient, req.body().nioByteBuffer());
  }

This is where NettyRpcHandler's overridden receive methods are finally called back, via rpcHandler.receive.
The two methods above are in turn invoked from handle:

  @Override
  public void handle(RequestMessage request) {
    if (request instanceof ChunkFetchRequest) {
      processFetchRequest((ChunkFetchRequest) request);
    } else if (request instanceof RpcRequest) {
      processRpcRequest((RpcRequest) request);
    } else if (request instanceof OneWayMessage) {
      processOneWayMessage((OneWayMessage) request);
    } else if (request instanceof StreamRequest) {
      processStreamRequest((StreamRequest) request);
    } else {
      throw new IllegalArgumentException("Unknown request type: " + request);
    }
  }

The handle method dispatches on the concrete RequestMessage subtype, which confirms what we described above.
handle itself is called from channelRead in TransportChannelHandler.java:

  @Override
  public void channelRead(ChannelHandlerContext ctx, Object request) throws Exception {
    if (request instanceof RequestMessage) {
      requestHandler.handle((RequestMessage) request);
    } else if (request instanceof ResponseMessage) {
      responseHandler.handle((ResponseMessage) request);
    } else {
      ctx.fireChannelRead(request);
    }
  }

And the messages arriving in channelRead are the ones a client sent over RPC.
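
To connect this back to the client side: these RpcRequest/OneWayMessage frames originate from an RpcEndpointRef on the other end. A minimal, hypothetical client-side snippet (the endpoint name "echo" and the host are made up; askSync is the newer name, older versions use ask/askWithRetry):

  // Hypothetical client side: obtain a ref to a remote endpoint and talk to it.
  // send() becomes a OneWayMessage on the wire, askSync() becomes an RpcRequest that is
  // answered by an RpcResponse -- exactly the types handled in channelRead/handle above.
  val echoRef = clientRpcEnv.setupEndpointRef(RpcAddress("remote-host", 7077), "echo")
  echoRef.send("ping")                         // one-way
  val reply = echoRef.askSync[String]("ping")  // request/reply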

Handling the client's RpcRequest

RpcEndpointRef and RpcEndpoint on different machines

Step 3 of the flow diagram in the referenced post simplifies this part of the process; that simplified flow is exactly what we analyzed above.

When they are not on the same machine, Netty is required; the rough steps are:

  1. As described in Spark RPC之Netty启动, the Netty server is started when the RpcEnv is created, and the TransportChannelHandler is added to the pipeline.
  2. The TransportChannelHandler handles the data Netty receives and passes it in turn to TransportRequestHandler and NettyRpcHandler, as analyzed above.
  3. Finally the message is handed over to the Dispatcher and Inbox; see Spark RPC之Dispatcher、Inbox、Outbox. A simplified sketch of this hand-off follows:
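
In rough terms, postRemoteMessage wraps the Netty callback in an RpcCallContext, posts an InboxMessage to the target endpoint's Inbox, and a pool of MessageLoop threads drains the receivers queue and lets each Inbox invoke the endpoint's receive / receiveAndReply. A simplified sketch in the style of Spark 2.x (not the verbatim source):

  // Simplified sketch of the Dispatcher/Inbox hand-off (not the verbatim Spark source).
  def postRemoteMessage(message: RequestMessage, callback: RpcResponseCallback): Unit = {
    // The callback is how the reply travels back over the Netty channel later on.
    val rpcCallContext =
      new RemoteNettyRpcCallContext(nettyEnv, callback, message.senderAddress)
    val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
    postMessage(message.receiver.name, rpcMessage, e => callback.onFailure(e))
  }

  private def postMessage(
      endpointName: String,
      message: InboxMessage,
      callbackIfStopped: Exception => Unit): Unit = synchronized {
    // Queue the message in the endpoint's Inbox and mark the endpoint as having work.
    val data = endpoints.get(endpointName)
    data.inbox.post(message)
    receivers.offer(data)
  }

  // Each MessageLoop thread repeatedly takes an endpoint with pending messages and
  // processes its Inbox, which ends up calling the endpoint's receive/receiveAndReply.
  private class MessageLoop extends Runnable {
    override def run(): Unit = {
      while (true) {
        val data = receivers.take()
        data.inbox.process(Dispatcher.this)
      }
    }
  }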

RpcEndpointRef and RpcEndpoint on the same machine

When they are on the same machine, Netty is not needed; the RpcEndpoint is accessed directly, and the message is still handed to the Dispatcher and Inbox for processing, as in the sketch below.
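
On the sending side, NettyRpcEnv decides between the two paths by comparing the receiver's address with its own. A simplified sketch (not verbatim; details differ across Spark versions):

  // Simplified sketch of NettyRpcEnv.send routing (not the verbatim Spark source).
  private[netty] def send(message: RequestMessage): Unit = {
    val remoteAddr = message.receiver.address
    if (remoteAddr == address) {
      // Receiver lives in this RpcEnv: skip Netty entirely, post straight to the Dispatcher.
      dispatcher.postOneWayMessage(message)
    } else {
      // Receiver is remote: serialize the message and hand it to an Outbox,
      // which writes it to the Netty channel as a OneWayMessage.
      postToOutbox(message.receiver, OneWayOutboxMessage(message.serialize(this)))
    }
  }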

