2018-01-27 7 HDFS Performance Tu

2018-01-27  鸭鸭学语言

Defining and changing performance parameters

Parameters are defined in hdfs-site.xml.
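
To check what a given hdfs-site.xml actually sets, the file can be read as plain XML. Below is a minimal sketch, assuming the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children. The sample content and the helper name `read_properties` are hypothetical and only for illustration; the values are not recommendations.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; values are illustrative only.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    for name, value in read_properties(SAMPLE_HDFS_SITE).items():
        print(f"{name} = {value}")
```

A property that is absent from the file simply does not appear in the returned dict; in that case Hadoop falls back to its built-in default.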

Cloudera Manager provides a friendly GUI for changing these parameters, so there is no need to edit the XML file by hand.

Start Cloudera Manager:

    In a terminal, run:  $ sudo /home/cloudera/cloudera-manager --express --force

    Then, in Firefox, open: quickstart.cloudera:7180/cmf/services/8/config


Four main parameters affect performance:

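    DFS block size -- dfs.blocksize : default 64 MB in Hadoop 1.x (128 MB in Hadoop 2.x and later). Directly affects name node memory usage and the number of map tasks: larger blocks mean fewer blocks per file, so less metadata to track and fewer map tasks.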

    HDFS replication -- dfs.replication : default 3. Reducing replication saves storage but trades away robustness. Replication mitigates node failures, and HDFS supports it with the mechanisms below:

        periodic heartbeats from each data node to the name node.

        checksums stored alongside each block, so that a corrupted read is detected and the block is re-read from another healthy node.

    Number of handlers on each data node -- dfs.datanode.handler.count

    Maximum number of blocks per file -- dfs.namenode.fs-limits.max-blocks-per-file
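
The effect of the first two parameters can be worked out with simple arithmetic. The sketch below counts the blocks a file occupies for two candidate block sizes and the raw disk space used at a given replication factor. It assumes the common case of roughly one map task per block (one input split per block). The helper names `block_count` and `raw_storage_bytes` are hypothetical.

```python
MB = 1024 * 1024


def block_count(file_size_bytes, block_size_bytes):
    """Number of HDFS blocks for one file; the last block may be partial."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


def raw_storage_bytes(file_size_bytes, replication):
    """Total disk space consumed across the cluster for one file."""
    return file_size_bytes * replication


if __name__ == "__main__":
    file_size = 1024 * MB  # a 1 GB file, chosen only for illustration
    for block_size in (64 * MB, 128 * MB):
        blocks = block_count(file_size, block_size)
        # Assumption: about one map task per block.
        print(f"block size {block_size // MB} MB -> {blocks} blocks, ~{blocks} map tasks")
    print(f"raw storage at replication 3: {raw_storage_bytes(file_size, 3) // MB} MB")
```

For the 1 GB example, doubling the block size from 64 MB to 128 MB halves the block count from 16 to 8, which also halves the number of block entries the name node has to keep in memory for that file.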


lesson 7 - slides
