Mongodb实践我爱编程mongodb学习

Mongodb分片集群部署

2018-04-22  本文已影响1747人  平凡的运维之路

Mongodb分片概括

1、分片目的

对于单台数据库服务器,庞大的数据量及高吞吐量的应用程序对它而言无疑是个巨大的挑战。频繁的CRUD操作能够耗尽服务器的CPU资源,快速的数据增长也会让硬盘存储无能为力,最终内存无法满足数据需要导致大量的I/O,主机负载严重。为了解决这种问题,对于数据库系统一般有两种方法:垂直扩展分片(水平扩展)。

【垂直扩展】:添加更多的CPU和存储资源来增加系统性能。这种方式缺点是:拥有大量CPU和RAM资源的高端机器比普通PC机器昂贵得太多,而且单点故障会影响整个系统的服务。

【分片】:相反地,分片将大的数据集分配到多台主机上,每个分片是一个独立的数据库,这些分片整体上构成一个完整的逻辑数据库。分片减少了每台服务器上的数据操作量,随着集群的增长,每台分片处理越来越少的数据,结果,增加了系统整体服务能力。另外,分片还减少了每台服务器需要存储的数据量。

2、MongoDB中的分片

MongoDB通过配置分片集群来支持分片,一个分片集群包括以下几个组件:分片,查询路由,配置服务器

数据划分

MongoDB的数据划分,是以集合级别为标准。分片通过shard key来划分集合数据。

为了对集合分片,你需要指定一个shard key。shard key既可以是集合的每个文档的索引字段也可以是集合中每个文档都有的组合索引字段。MongoDB将shard keys值按照块(chunks)划分,并且均匀的将这些chunks分配到各个分片上。MongoDB使用基于范围划分基于散列划分来划分chunks的。

MongoDB通过shard key值将数据集划分到不同的范围就称为基于范围划分。对于数值型的shard key:你可以虚构一条从负无穷到正无穷的直线(理解为x轴),每个shard key 值都落在这条直线的某个点上,然后MongoDB把这条线划分为许多更小的没有重复的范围成为块(chunks),一个chunk就是就某些最小值到最大值的范围。

MongoDB计算每个字段的hash值,然后用这些hash值建立chunks。

基于范围划分对于范围查询比较高效。假设在shard key上进行范围查询,查询路由很容易能够知道哪些块与这个范围重叠,然后把相关查询按照这个路线发送到仅仅包含这些chunks的分片。但是基于范围划分很容易导致数据不均匀分布,这样会削弱分片集群的功能。例如当shard key是个成直线上升的字段,如时间。那么,所有在给定时间范围内的请求都会映射到相同的chunk,也就是相同的分片上。这种情况下,小部分的分片将会承受大多数的请求,那么系统整体扩展并不理想。

相反的,基于散列划分是以牺牲高效范围查询为代价,它能够均匀的分布数据,散列值能够保证数据随机分布到各个分片上。

MongoDB允许DBA们通过标签标记分片的方式直接平衡数据分布策略,DBA可以创建标签并且将它们与shard key值的范围进行关联,然后分配这些标签到各个分片上,最终平衡器转移带有标签标记的数据到对应的分片上,确保集群总是按标签描述的那样进行数据分布。标签是控制平衡器行为及集群中块分布的主要方法

4、维持数据分布平衡

新加入的数据及服务器都会导致集群数据分布不平衡,MongoDB采用两种方式确保数据分布的平衡:

拆分是一个后台进程,防止块变得太大。当一个块增长到指定块大小的时候,拆分进程就会块一分为二,整个拆分过程是高效的。不会涉及到数据的迁移等操作。

平衡器是一个后台进程,管理块的迁移。平衡器能够运行在集群任何的mongd实例上。当集群中数据分布不均匀时,平衡器就会将某个分片中比较多的块迁移到拥有块较少的分片中,直到数据分片平衡为止。举个例子:如果集合users有100个块在分片1里,50个块在分片2中,那么平衡器就会将分片1中的块迁移到分片2中,直到维持平衡。

分片采用后台操作的方式管理着源分片和目标分片之间块的迁移。在迁移的过程中,源分片中的块会将所有文档发送到目标分片中,然后目标分片会获取并应用这些变化。最后,更新配置服务器上关于块位置元数据。

添加新分片到集群中会产生数据不平衡,因为新分片中没有块,当MongoDB开始迁移数据到新分片中时,等到数据分片平衡恐怕需要点时间。

当删除一个分片时,平衡器将会把分片中所有块迁移到另一个分片中,在完成这些迁移并更新元数据后,你就可以安全的删除分片了。

分片集群

[图片上传失败...(image-aa2350-1517882631314)]

片键
sh.shardCollection(namespace, key)

2、哈希片键使用单字段的哈希索引进行数据在分片之间的平均分发,除数取余一致性哈希
3、被选为片键的字段必须有足够大的基数,或者有足够多的不同的值,对于单调的递增的字段如果ObjectID或是时间戳,哈希索引效果更好
4、如果在一个空集合创建哈希片键,Mongodb会自动创建并迁移数据块,以保证每个分片上都有两个数据块,也可以执行shardCollection指定numInitialChunks参数以控制初始化时Mongodb创建数据块数目,或者手动调用split命令在分片上分裂数据块
5、对使用了哈希片键分片的集合进行请求时,Mongodb会自动计算哈希值,应用不需要解析哈希值

shard集群部署

配置config server 副本集
[root@My-Dev db2]# vim config1.conf 
[root@My-Dev db1]# vim configsvr.conf 
logpath=/home/mongodb/test/db1/log/db1.log
pidfilepath=/home/mongodb/test/db1/db1.pid
logappend=true
port=30000  
fork=true
dbpath=/home/mongodb/test/db1/data
configsvr=true   # 在配置文件添加此项就行
oplogSize=512
replSet=config


[root@My-Dev db2]# vim config2.conf 
logpath=/home/mongodb/test/db2/log/db2.log
pidfilepath=/home/mongodb/test/db2/db2.pid
logappend=true
port=30001
fork=true
dbpath=/home/mongodb/test/db2/data
oplogSize=512
replSet=config
configsvr=true


[root@My-Dev db2]# vim config3.conf
logpath=/home/mongodb/test/db3/log/db3.log
pidfilepath=/home/mongodb/test/db3/db3.pid
logappend=true
port=30002
fork=true
dbpath=/home/mongodb/test/db3/data
oplogSize=512
replSet=config
configsvr=true


[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db1/config1.conf 
about to fork child process, waiting until server is ready for connections.
forked process: 5260
child process started successfully, parent exiting

[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db2/config2.conf 
about to fork child process, waiting until server is ready for connections.
forked process: 5202
child process started successfully, parent exiting


[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db3/config3.conf 
about to fork child process, waiting until server is ready for connections.
forked process: 4260
child process started successfully, parent exiting

> use admin
switched to db admin

> config = { _id:"config",members:[ {_id:0,host:"conf1:30000"}, {_id:1,host:"conf2:30001"}, {_id:2,host:"conf3:30002"}] }        #定义副本集
{
    "_id" : "config",
    "members" : [
        {
            "_id" : 0,
            "host" : "conf1:30000"
        },
        {
            "_id" : 1,
            "host" : "conf2:30001"
        },
        {
            "_id" : 2,
            "host" : "conf3:30002"
        }
    ]
}
> rs.initiate(config)     #初始化副本集
{ "ok" : 1 }

配置mongos
[root@My-Dev db4]# vim  mongos.conf 

logpath=/home/mongodb/test/db4/log/db4.log
pidfilepath=/home/mongodb/test/db4/db4.pid
logappend=true
port=40004
fork=true
configdb=mongos/172.17.237.33:30000,172.17.237.34:30001,172.17.237.36:30002   #如果有多个mongo confi的话就用逗号分隔开 


[root@My-Dev bin]# ./mongos -f /home/mongodb/test/db4/mongos.conf 
about to fork child process, waiting until server is ready for connections.
forked process: 6268
child process started successfully, parent exiting

shard2副本集集群部署
[root@My-Dev db8]# more shard21.conf 
logpath=/home/mongodb/test/db8/log/db8.log
pidfilepath=/home/mongodb/test/db8/db8.pid
directoryperdb=true
logappend=true
port=60000
fork=true
dbpath=/home/mongodb/test/db8/data 
oplogSize=512
replSet=sha
shardsvr=true



[root@My-Dev db9]# more shard22.conf 
logpath=/home/mongodb/test/db9/log/db9.log
pidfilepath=/home/mongodb/test/db9/db9.pid
directoryperdb=true
logappend=true
port=60001
fork=true
dbpath=/home/mongodb/test/db9/data 
oplogSize=512
replSet=sha
shardsvr=true


[root@My-Dev db10]# more shard23.conf 
logpath=/home/mongodb/test/db10/log/db10.log
pidfilepath=/home/mongodb/test/db10/db10.pid
directoryperdb=true
logappend=true
port=60002
fork=true
dbpath=/home/mongodb/test/db10/data 
oplogSize=512
replSet=sha
shardsvr=true


[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db8/shard21.conf 
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db9/shard22.conf 
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db10/shard23.conf 

> use admin 
switched to db admin
> sha = { _id:"sha",members:[ {_id:0,host:"sha1:60000"}, {_id:1,host:"sha2:60001"}, {_id:2,host:"sha3:60002"}]}
{
    "_id" : "sha",
    "members" : [
        {
            "_id" : 0,
            "host" : "sha1:60000"
        },
        {
            "_id" : 1,
            "host" : "sha2:60001"
        },
        {
            "_id" : 2,
            "host" : "sha3:60002"
        }
    ]
}
> rs.initiate(sha)
{ "ok" : 1 }



shard1副本集集群部署
[root@My-Dev db5]# vim shard1.conf 
logpath=/home/mongodb/test/db5/log/db5.log
pidfilepath=/home/mongodb/test/db5/db5.pid
directoryperdb=true
logappend=true
port=50000
fork=true
dbpath=/home/mongodb/test/db5/data
oplogSize=512
replSet=shard
shardsvr=true


[root@My-Dev db6]# vim shard2.conf 
logpath=/home/mongodb/test/db6/log/db6.log
pidfilepath=/home/mongodb/test/db6/db6.pid
directoryperdb=true
logappend=true
port=50001
fork=true
dbpath=/home/mongodb/test/db6/data
oplogSize=512
replSet=shard
shardsvr=true


[root@My-Dev db7]# vim shard3.conf 
logpath=/home/mongodb/test/db7/log/db7.log
pidfilepath=/home/mongodb/test/db7/db7.pid
directoryperdb=true
logappend=true
port=50002
fork=true
dbpath=/home/mongodb/test/db7/data
oplogSize=512
replSet=shard
shardsvr=true

[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db7/shard1.conf 
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db7/shard2.conf 
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db7/shard3.conf 

> use admin
switched to db admin
> shard = { _id:"shard",members:[ {_id:0,host:"shard1:50000"}, {_id:1,host:"shard2:50001"}, {_id:2,host:"shard3:50002"}] }
{
    "_id" : "shard",
    "members" : [
        {
            "_id" : 0,
            "host" : "shard1:50000"
        },
        {
            "_id" : 1,
            "host" : "shard2:50001"
        },
        {
            "_id" : 2,
            "host" : "shard3:50002"
        }
    ]
}
> rs.initiate(shard)
{ "ok" : 1 }

分片配置
[root@My-Dev bin]# ./mongo mongos:40004    #登陆到mongos中

mongos> sh.status()  #查看分片状态
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
  shards:
  active mongoses:
    "3.4.1" : 1
 autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled:  yes
    Currently running:  no
        Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:

#首先要登陆到shard副本集中查看那个是主节点,本次实验室使用了两个shard副本集 sh.addShard("<replSetName>/主节点IP/port") 
mongos> sh.addShard("shard/shard1:50000")  
{ "shardAdded" : "shard", "ok" : 1 }

mongos> sh.addShard("sha/sha:60000")
{ "shardAdded" : "shard", "ok" : 1 }

mongos> sh.status()  #查看分片集群已经成功把shard加入分片中
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
  shards:
    {  "_id" : "sha",  "host" : "sha/sha1:60000,sha2:60001,sha3:60002",  "state" : 1 }
    {  "_id" : "shard",  "host" : "shard/shard1:50000,shard2:50001,shard3:50002",  "state" : 1 }
  active mongoses:
    "3.4.1" : 1
 autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled:  yes
    Currently running:  no
        Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
    Failed balancer rounds in last 5 attempts:  5
    Last reported error:  Cannot accept sharding commands if not started with --shardsvr
    Time of Reported error:  Thu Feb 09 2017 17:42:21 GMT+0800 (CST)
    Migration Results for the last 24 hours: 
        No recent migrations
  databa

mongos> sh.enableSharding("zhao")  #指定zhao数据库中使用分片
{ "ok" : 1 }

mongos> sh.shardCollection("zhao.call",{name:1,age:1})   #在zhao数据库和call集合中创建了name和age为升序的片键
{ "collectionsharded" : "zhao.call", "ok" : 1 }
 

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
  shards:
    {  "_id" : "sha",  "host" : "sha/sha1:60000,sha2:60001,sha3:60002",  "state" : 1 }
    {  "_id" : "shard",  "host" : "shard/shard1:50000,shard2:50001,shard3:50002",  "state" : 1 }
  active mongoses:
    "3.4.1" : 1
 autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled:  yes
    Currently running:  no
        Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
    Failed balancer rounds in last 5 attempts:  5
    Last reported error:  Cannot accept sharding commands if not started with --shardsvr
    Time of Reported error:  Thu Feb 09 2017 17:56:02 GMT+0800 (CST)
    Migration Results for the last 24 hours: 
        No recent migrations
  databases:
    {  "_id" : "zhao",  "primary" : "shard",  "partitioned" : true }
        zhao.call
            shard key: { "name" : 1, "age" : 1 }
            unique: false
            balancing: true
            chunks:
                shard    1
            { "name" : { "$minKey" : 1 }, "age" : { "$minKey" : 1 } } -->> { "name" : { "$maxKey" : 1 }, "age" : { "$maxKey" : 1 } } on : shard Timestamp(1, 0) 

mongos> for ( var i=1;i<10000000;i++){db.call.insert({"name":"user"+i,age:i})};


mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
  shards:
    {  "_id" : "sha",  "host" : "sha/sha1:60000,sha2:60001,sha3:60002",  "state" : 1 }
    {  "_id" : "shard",  "host" : "shard/shard1:50000,shard2:50001,shard3:50002",  "state" : 1 }
  active mongoses:
    "3.4.1" : 1
 autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled:  yes
    Currently running:  no
        Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
    Failed balancer rounds in last 5 attempts:  5
    Last reported error:  Cannot accept sharding commands if not started with --shardsvr
    Time of Reported error:  Thu Feb 09 2017 17:56:02 GMT+0800 (CST)
    Migration Results for the last 24 hours: 
        4 : Success
  databases:
    {  "_id" : "zhao",  "primary" : "shard",  "partitioned" : true }
        zhao.call
            shard key: { "name" : 1, "age" : 1 }
            unique: false
            balancing: true
            chunks:   #数据已经分片到两个chunks里面了
                sha    4
                shard    5
            { "name" : { "$minKey" : 1 }, "age" : { "$minKey" : 1 } } -->> { "name" : "user1", "age" : 1 } on : sha Timestamp(4, 1) 
            { "name" : "user1", "age" : 1 } -->> { "name" : "user1", "age" : 21 } on : shard Timestamp(5, 1) 
            { "name" : "user1", "age" : 21 } -->> { "name" : "user1", "age" : 164503 } on : shard Timestamp(2, 2) 
            { "name" : "user1", "age" : 164503 } -->> { "name" : "user1", "age" : 355309 } on : shard Timestamp(2, 3) 
            { "name" : "user1", "age" : 355309 } -->> { "name" : "user1", "age" : 523081 } on : sha Timestamp(3, 2) 
            { "name" : "user1", "age" : 523081 } -->> { "name" : "user1", "age" : 710594 } on : sha Timestamp(3, 3) 
            { "name" : "user1", "age" : 710594 } -->> { "name" : "user1", "age" : 875076 } on : shard Timestamp(4, 2) 
            { "name" : "user1", "age" : 875076 } -->> { "name" : "user1", "age" : 1056645 } on : shard Timestamp(4, 3) 
            { "name" : "user1", "age" : 1056645 } -->> { "name" : { "$maxKey" : 1 }, "age" : { "$maxKey" : 1 } } on : sha Timestamp(5, 0) 



mongos> sh.getBalancerState()
true

选择sharing kes'注意点

sharing 进级

mongos> sh.addShard("sha4/192.168.2.10:21001")

Balancer
mongos> sh.startBalancer()

```

- 设置Balancer进程运行时间窗口
默认情况ixaBalancing进程在运行时为降低Balancing进程对系统的影响,可以设置Balancer进程的运行时间窗口,让Balancer进程在指定时间窗口操作

```
#设置时间窗口
{ $set :{ activeWindow : { start : "23:00", stop : "6:00" } } }, true )  ;

```

- 查看Balancer运行时间窗口

```
# 查看Balancer时间窗口
mongos> db.settings.find();
{ "_id" : "balancer", "activeWindow" : { "start" : "23:00", "stop" : "6:00" }, "stopped" : false }

mongos> sh.getBalancerWindow()
{ "start" : "23:00", "stop" : "6:00" }


```

- 删除Balancer进程运行时间窗口

```
 mongos> db.settings.update({ "_id" : "balancer" }, { $unset : { activeWindow : 1 }});
mongos> db.settings.find();
{ "_id" : "chunksize", "value" : 10 }
{ "_id" : "balancer", "stopped" : false }

```


##### 在shell脚本中执行mongodb

```
[root@My-Dev ~]# echo  -e "use zhao \n  db.call.find()" |mongo --port 60001 

```


##### Mongodb片键的添加
- 首先进入mongos的的admin数据库中

```
mongos> use admin
switched to db admin  
mongos> db.runCommand({"enablesharding":"zl"})   #创建zl库中
{ "ok" : 1 }
mongos> db.runCommand(db.runCommand({"shardcollection":"$ent.t_srvappraise_back","key")
```

- 分片脚本

```
#!/bin/bash
url=10.241.96.155
port=30000
ent=test1

./mongo $url:$port/admin <<EOF
db.runCommand({"enablesharding":"$ent"});
db.runCommand({"shardcollection":"$ent.t_srvappraise_back","key":{"sa_seid":"hashed"}})
exit;
EOF

```
上一篇 下一篇

猜你喜欢

热点阅读