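Docker Swarm Deployment
Swarm Cluster Deployment
1. Host environment
182.48.115.237  swarm manager node  (hostname: manager-node)
182.48.115.238  swarm worker node  (hostname: node1)
182.48.115.239  swarm worker node  (hostname: node2)
Ports that must be open between the nodes:
2377/tcp: cluster management port
7946/tcp and 7946/udp: node-to-node communication port
4789/udp: overlay network traffic port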
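If the hosts filter traffic, these ports have to be reachable between all three nodes. Below is a minimal sketch, assuming the nodes run firewalld (the source does not say which firewall, if any, is in use). Port 7946 is opened for both TCP and UDP because Docker uses both for node communication. The function name `swarm_firewall_commands` is made up for illustration, and it only prints the commands so they can be reviewed before being run as root.

```shell
#!/usr/bin/env bash
# Print the firewall-cmd calls that open the swarm ports.
# Assumption: firewalld is the active firewall on each node.
swarm_firewall_commands() {
  local port
  for port in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    printf 'firewall-cmd --permanent --add-port=%s\n' "$port"
  done
  # Reload so that the permanent rules take effect.
  printf 'firewall-cmd --reload\n'
}

swarm_firewall_commands
```

To apply the rules, the printed commands can be piped into a root shell on each node, for example `swarm_firewall_commands | sudo sh`.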
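2. Install Docker on all nodes and pull the swarm image (the image is only needed for the older standalone Swarm; the docker swarm commands used below are built into Docker 1.12 and later)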
[root@manager-node ~]# yum install -y docker
[root@manager-node ~]# vim /etc/sysconfig/docker
......
OPTIONS='-H 0.0.0.0:2375 -H unix:///var/run/docker.sock' //add these options inside the quotes after OPTIONS
[root@manager-node ~]# systemctl restart docker
[root@manager-node ~]# docker pull swarm
[root@manager-node ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/swarm latest 36b1e23becab 4 months ago 15.85 MB
3. Create the swarm and view cluster information
[root@manager-node ~]# docker swarm init --advertise-addr 182.48.115.237
Swarm initialized: current node (1gi8utvhu4rxy8oxar2g7h6gr) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-4roc8fx10cyfgj1w1td8m0pkyim08mve578wvl03eqcg5ll3ig-f0apd81qfdwv27rnx4a4y9jej \
182.48.115.237:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
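After this command runs, the host automatically joins the swarm as a manager, and a join token unique to this cluster is generated. Every other node needs this token to join the cluster later, so note it down. The --advertise-addr parameter sets the IP address that the other nodes in the swarm use to contact this manager. The command output includes the exact command that other nodes run to join the cluster.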
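If the token was not recorded, the manager can print it again. The sketch below uses `docker swarm join-token -q worker`, which prints only the worker token, and assembles the join command for the manager address used in this walkthrough. The helper name `build_join_command` is hypothetical.

```shell
#!/usr/bin/env bash
# Assemble the worker join command from a token and the manager address.
# 2377 is the management port listed in the host environment section.
build_join_command() {
  local token=$1 manager_ip=$2
  printf 'docker swarm join --token %s %s:2377\n' "$token" "$manager_ip"
}

# On the manager, fetch the current worker token and print the command to
# run on each worker. Skipped when docker is missing or this host is not
# a swarm manager.
if command -v docker >/dev/null 2>&1 &&
   token=$(docker swarm join-token -q worker 2>/dev/null); then
  build_join_command "$token" 182.48.115.237
fi
```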
[root@manager-node ~]# docker info
Swarm: active
Is Manager: true
...
[root@manager-node ~]# docker node ls //the * marks the node you are currently connected to
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
1gi8utvhu4rxy8oxar2g7h6gr * manager-node Ready Active Leader
4. Add nodes to the swarm cluster
Log in to node1 and node2 and run the join command that was printed when the swarm was created:
[root@node1 ~]# docker swarm join --token SWMTKN-1-4roc8fx10cyfgj1w1td8m0pkyim08mve578wvl03eqcg5ll3ig-f0apd81qfdwv27rnx4a4y9jej 182.48.115.237:2377
This node joined a swarm as a worker.
[root@node2 ~]# docker swarm join --token SWMTKN-1-4roc8fx10cyfgj1w1td8m0pkyim08mve578wvl03eqcg5ll3ig-f0apd81qfdwv27rnx4a4y9jej 182.48.115.237:2377
This node joined a swarm as a worker.
Check the cluster status again
[root@manager-node ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
1gi8utvhu4rxy8oxar2g7h6gr * manager-node Ready Active Leader
ei53e7o7jf0g36329r3szu4fi node1 Ready Active
f1obgtudnykg51xzyj5fs1aev node2 Ready Active
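Changing a node's availability
A node's availability in the swarm can be active or drain (a third value, pause, stops new assignments without shutting down running tasks):
active: the node accepts task assignments from the manager;
drain: the node shuts down its tasks and no longer accepts task assignments from the manager (in effect, the node is taken offline).
[root@manager-node ~]# docker node update --availability drain node1 //take node1 offline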
[root@manager-node ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
1gi8utvhu4rxy8oxar2g7h6gr * manager-node Ready Active Leader
ei53e7o7jf0g36329r3szu4fi node1 Ready drain
f1obgtudnykg51xzyj5fs1aev node2 Ready Active
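Once a node is set to drain, it receives no new tasks, and the tasks it was already running are rescheduled onto other nodes. To remove node1 from the cluster, the command is "docker node rm --force node1"
[root@manager-node ~]# docker node update --availability active node1 //bring the drained node back online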
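Draining is typically done around maintenance. Below is a minimal sketch of that cycle, assuming it runs on a manager and that `docker node inspect --format '{{ .Spec.Availability }}'` is used to confirm the change. The function names are illustrative, not part of Docker.

```shell
#!/usr/bin/env bash
# Print a node's availability (active, pause, or drain) as stored in its spec.
node_availability() {
  docker node inspect --format '{{ .Spec.Availability }}' "$1"
}

# Set a node's availability and confirm that the manager recorded it.
# Arguments: node name, wanted availability.
set_node_availability() {
  local node=$1 wanted=$2
  docker node update --availability "$wanted" "$node" >/dev/null || return 1
  [ "$(node_availability "$node")" = "$wanted" ]
}
```

A maintenance window would then look like `set_node_availability node1 drain`, followed by the maintenance work, followed by `set_node_availability node1 active`.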
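5. Deploy a service in the swarm, using nginx as an example
Docker 1.12 provides service scaling, health checks, and rolling updates, along with built-in DNS and VIP mechanisms that give services discovery and load balancing.
Before starting any containers, create an overlay network, which lets containers on different hosts communicate with each other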
[root@manager-node ~]# docker network create -d overlay ngx_net
[root@manager-node ~]# docker network ls
NETWORK ID NAME DRIVER SCOPE
8bbd1b7302a3 bridge bridge local
9e637a97a3b9 docker_gwbridge bridge local
b5a41c8c71e7 host host local
1x45zepuysip ingress overlay swarm
3ye6vfp996i6 ngx_net overlay swarm
0808a5c72a0a none null local
On manager-node, create an nginx service with one replica attached to this overlay network:
[root@manager-node ~]# docker service create --replicas 1 --network ngx_net --name my-test -p 80:80 nginx
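--replicas: the number of instances (tasks) that make up the service, 1 here
nginx: the image to run; it does not need to be pulled onto the nodes in advance
List the running services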
[root@manager-node ~]# docker service ls
ID NAME REPLICAS IMAGE COMMAND
0jb5eebo8j9q my-test 1/1 nginx
Inspect the service
[root@manager-node ~]# docker service inspect --pretty my-test //--pretty prints the output in a human-readable format
ID: 0jb5eebo8j9qb1zc795vx3py3
Name: my-test
Mode: Replicated
Replicas: 1
Placement:
UpdateConfig:
Parallelism: 1
On failure: pause
ContainerSpec:
Image: nginx
Resources:
Networks: 3ye6vfp996i6eq17tue0c2jv9
Ports:
Protocol = tcp
TargetPort = 80
PublishedPort = 80
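For scripts, single fields are easier to consume than the pretty-printed view. The sketch below pulls two values out of the service spec with Go templates. The helper names are hypothetical, and the field paths are the ones exposed by the swarm service spec.

```shell
#!/usr/bin/env bash
# Print the configured replica count of a replicated service.
service_replicas() {
  docker service inspect --format '{{ .Spec.Mode.Replicated.Replicas }}' "$1"
}

# Print the image the service is configured to run.
service_image() {
  docker service inspect --format '{{ .Spec.TaskTemplate.ContainerSpec.Image }}' "$1"
}
```

For example, `service_replicas my-test` and `service_image my-test` should report the same Replicas and Image values shown in the output above.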
Find out which node is running the service
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2m8qqpoa0dpeua5jbgz1infuy my-test.1 nginx manager-node Running Running 3 minutes ago
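Because pulling the image takes time, the CURRENT STATE column initially shows Preparing and only changes to Running after a short while
The container was scheduled onto manager-node, so the application can be reached at http://182.48.115.237
Log in to that node and look at the running nginx container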
[root@manager-node ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1ea1d72007da nginx:latest "nginx -g 'daemon off" 4 minutes ago Up 4 minutes 80/tcp my-test.1.2m8qqpoa0dpeua5jbgz1infuy
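The wait for a task to go from Preparing to Running can be scripted. Here is a rough sketch, assuming the `docker service ps` column layout shown above (Docker 1.12; later versions may lay the columns out differently). The helper names `count_running_tasks` and `wait_for_service` are hypothetical.

```shell
#!/usr/bin/env bash
# Read `docker service ps` output on stdin and count the tasks that are
# Running in both DESIRED STATE and CURRENT STATE.
# History rows carry "\_" before the task name, which shifts the fields by one.
count_running_tasks() {
  awk 'NR > 1 {
         off = ($2 == "\\_") ? 1 : 0
         if ($(5 + off) == "Running" && $(6 + off) == "Running") n++
       }
       END { print n + 0 }'
}

# Poll once per second until the expected number of tasks is running.
# Arguments: service name, expected task count, max attempts (default 60).
wait_for_service() {
  local service=$1 expected=$2 tries=${3:-60}
  while [ "$tries" -gt 0 ]; do
    [ "$(docker service ps "$service" 2>/dev/null | count_running_tasks)" -ge "$expected" ] && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

On a manager, `wait_for_service my-test 1` returns 0 once the single replica is running and 1 if it is still not running after 60 attempts.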
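-------------------------------Scaling a service dynamically in the swarm------------------------------------------
If all it did was start containers through services, swarm would be nothing new. Services also provide replication (similar to replicas in k8s). The docker service scale command sets the number of container replicas in a service:
Scale the my-test service above to 5 replicas: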
[root@manager-node ~]# docker service scale my-test=5
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2m8qqpoa0dpeua5jbgz1infuy my-test.1 nginx manager-node Running Running 9 minutes ago
aqko8yhmdj53gmzs8gqhoylc2 my-test.2 nginx node2 Running Running 2 minutes ago
erqk394hd4ay7nfwgaz4zp3s0 my-test.3 nginx node1 Running Running 2 minutes ago
2dslg6w16wzcgboa2hxw1c6k1 my-test.4 nginx node1 Running Running 2 minutes ago
bmyddndlx6xi18hx4yinpakf3 my-test.5 nginx manager-node Running Running 2 minutes ago
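One point to be clear about: if a node goes down, Docker reschedules the containers that were running on it onto other nodes, so that the specified number of replicas keeps running. For example, after node1 is shut down or its docker service is stopped, the tasks on it move to other nodes. When node1 recovers, the tasks that moved away do not move back on their own; node1 only receives tasks again when something is rescheduled, for example after a failure on another node. While node1 is down, "docker node ls" reports it as Down rather than Ready.
The sequence looks like this: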
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2m8qqpoa0dpeua5jbgz1infuy my-test.1 docker.io/nginx manager-node Running Running 33 minutes ago
aqko8yhmdj53gmzs8gqhoylc2 my-test.2 docker.io/nginx node2 Running Running 26 minutes ago
di99oj7l9x6firw1ai25sewwc my-test.3 docker.io/nginx node2 Running Running 6 minutes ago
erqk394hd4ay7nfwgaz4zp3s0 \_ my-test.3 docker.io/nginx node1 Shutdown Complete 5 minutes ago
aibl3u3pph3fartub0mhwxvzr my-test.4 docker.io/nginx node2 Running Running 6 minutes ago
2dslg6w16wzcgboa2hxw1c6k1 \_ my-test.4 docker.io/nginx node1 Shutdown Complete 5 minutes ago
bmyddndlx6xi18hx4yinpakf3 my-test.5 docker.io/nginx manager-node Running Running 26 minutes ago
When port 80 on node 182.48.115.239 is accessed, the swarm load balancer routes the request to an available container on any node.
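One way to see this routing at work is to request port 80 on every node, including nodes that currently run no my-test task. Below is a sketch, assuming curl is installed on the machine running it. `probe_nodes` is an illustrative name.

```shell
#!/usr/bin/env bash
# Request port 80 on each given node and print "<ip> <HTTP status code>".
probe_nodes() {
  local ip code
  for ip in "$@"; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$ip/")
    printf '%s %s\n' "$ip" "$code"
  done
}
```

Calling `probe_nodes 182.48.115.237 182.48.115.238 182.48.115.239` prints one line per node. If the published port is reachable on every node, each line should carry the same status code regardless of where the tasks are running.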
[root@node2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
216abf6bebea docker.io/nginx:latest "nginx -g 'daemon off" 7 minutes ago Up 7 minutes 80/tcp my-test.3.di99oj7l9x6firw1ai25sewwc
1afd12cc9140 docker.io/nginx:latest "nginx -g 'daemon off" 7 minutes ago Up 7 minutes 80/tcp my-test.4.aibl3u3pph3fartub0mhwxvzr
cc90da57c25e docker.io/nginx:latest "nginx -g 'daemon off" 27 minutes ago Up 27 minutes 80/tcp my-test.2.aqko8yhmdj53gmzs8gqhoylc2
Now, on node2, stop the two tasks that were moved over from node1
[root@node2 ~]# docker stop my-test.3.di99oj7l9x6firw1ai25sewwc my-test.4.aibl3u3pph3fartub0mhwxvzr
my-test.3.di99oj7l9x6firw1ai25sewwc
my-test.4.aibl3u3pph3fartub0mhwxvzr
Query the service's task list again; the two tasks have been rescheduled onto node1
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2m8qqpoa0dpeua5jbgz1infuy my-test.1 docker.io/nginx manager-node Running Running 38 minutes ago
aqko8yhmdj53gmzs8gqhoylc2 my-test.2 docker.io/nginx node2 Running Running 31 minutes ago
7dhmc63rk0bc8ngt59ix38l44 my-test.3 docker.io/nginx node1 Running Running about a minute ago
di99oj7l9x6firw1ai25sewwc \_ my-test.3 docker.io/nginx node2 Shutdown Complete about a minute ago
erqk394hd4ay7nfwgaz4zp3s0 \_ my-test.3 docker.io/nginx node1 Shutdown Complete 9 minutes ago
607tyjv6foc0ztjjvdo3l3lge my-test.4 docker.io/nginx node1 Running Running about a minute ago
aibl3u3pph3fartub0mhwxvzr \_ my-test.4 docker.io/nginx node2 Shutdown Complete about a minute ago
2dslg6w16wzcgboa2hxw1c6k1 \_ my-test.4 docker.io/nginx node1 Shutdown Complete 9 minutes ago
bmyddndlx6xi18hx4yinpakf3 my-test.5 docker.io/nginx manager-node Running Running 31 minutes ago
A service can also be scaled down, here to 1 replica
[root@manager-node ~]# docker service scale my-test=1
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
2m8qqpoa0dpeuasdfsdfdfsdf my-test.1 nginx manager-node Running Running 3 minutes ago
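[root@node2 ~]# docker ps //the containers on node2 have been stopped
[root@manager-node ~]# docker service rm my-test //remove the service, deleting its containers on all nodes
Updating or modifying a service's startup parameters.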
[root@manager-node ~]# docker service update --replicas 3 my-test
my-test
[root@manager-node ~]# docker service ls
ID NAME REPLICAS IMAGE COMMAND
d7cygmer0yy5 my-test 3/3 nginx /bin/bash
[root@manager-node ~]# docker service ps my-test
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
ddkidkz0jgor751ffst55kvx4 my-test.1 nginx node1 Running Preparing 4 seconds ago
1aucul1b3qwlmu6ocu312nyst \_ my-test.1 nginx manager-node Shutdown Complete 5 seconds ago
4w9xof53f0falej9nqgq064jz \_ my-test.1 nginx manager-node Shutdown Complete 19 seconds ago
0e9szyfbimaow9tffxfeymci2 \_ my-test.1 nginx manager-node Shutdown Complete 30 seconds ago
27aqnlclp0capnp1us1wuiaxm my-test.2 nginx manager-node Running Preparing 1 seconds ago
7dmmmle29uuiz8ey3tq06ebb8 my-test.3 nginx manager-node Running Preparing 1 seconds ago
Upgrade the image
[root@manager-node ~]# docker service update --image nginx:new my-test
[root@manager-node ~]# docker service ls
ID NAME REPLICAS IMAGE COMMAND
d7cygmer0yy5 my-test 3/3 nginx:new /bin/bash
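The image update above relies on the service's stored update settings. The sketch below sets the pace explicitly with `--update-parallelism` and `--update-delay`. The 10s delay is an arbitrary example value, `rolling_update` is a hypothetical helper name, and nginx:new is the placeholder tag from the command above.

```shell
#!/usr/bin/env bash
# Roll a service to a new image one task at a time, pausing between tasks.
# Arguments: service name, image, optional delay between tasks (default 10s).
rolling_update() {
  local service=$1 image=$2 delay=${3:-10s}
  docker service update \
    --update-parallelism 1 \
    --update-delay "$delay" \
    --image "$image" \
    "$service"
}
```

On a manager, `rolling_update my-test nginx:new` replaces the tasks one by one, and `docker service ps my-test` can be used to watch the old tasks shut down as the new ones start.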
6. Using volumes in the swarm
[root@manager-node ~]# docker volume create --name myvolume
myvolume
[root@manager-node ~]# docker volume ls
DRIVER VOLUME NAME
local 11b68dce3fff0d57172e18bc4e4cfc252b984354485d747bf24abc9b11688171
local 1cd106ed7416f52d6c77ed19ee7e954df4fa810493bb7e6cf01775da8f9c475f
local myvolume
The src parameter can also be written as source; dst is the path inside the container and can also be written as target
[root@manager-node ~]# docker service create --replicas 2 --mount type=volume,src=myvolume,dst=/wangshibo --name test-nginx nginx
[root@manager-node ~]# docker service ls
ID NAME REPLICAS IMAGE COMMAND
8s9m0okwlhvl test-nginx 2/2 nginx
[root@manager-node ~]# docker service ps test-nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
32bqjjhqcl1k5z74ijjli35z3 test-nginx.1 nginx node1 Running Running 23 seconds ago
48xoypunb3g401jkn690lx7xt test-nginx.2 nginx node2 Running Running 23 seconds ago
[root@node1 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d471569629b2 nginx:latest "nginx -g 'daemon off" 2 minutes ago Up 2 minutes 80/tcp test-nginx.1.32bqjjhqcl1k5z74ijjli35z3
[root@node1 ~]# docker exec -ti d471569629b2 /bin/bash
root@d471569629b2:/# cd /wangshibo/
root@d471569629b2:/wangshibo# ls
root@d471569629b2:/wangshibo# echo "ahahha" > test
root@d471569629b2:/wangshibo# ls
test
[root@node1 ~]# docker volume inspect myvolume
[
{
"Name": "myvolume",
"Driver": "local",
"Mountpoint": "/var/lib/docker/volumes/myvolume/_data",
"Labels": null,
"Scope": "local"
}
]
[root@node1 ~]# cd /var/lib/docker/volumes/myvolume/_data/
[root@node1 _data]# ls
test
[root@node1 _data]# cat test
ahahha
[root@node1 _data]# echo "12313" > 123
[root@node1 _data]# ls
123 test
root@d471569629b2:/wangshibo# ls
123 test
root@d471569629b2:/wangshibo# cat test
ahahha
Create a symlink to the volume's data directory on node1
[root@node1 ~]# ln -s /var/lib/docker/volumes/myvolume/_data /wangshibo
[root@node1 ~]# cd /wangshibo
[root@node1 wangshibo]# ls
123 test
[root@node1 wangshibo]# rm -f test
[root@node1 wangshibo]# echo "5555" > haha
root@d471569629b2:/wangshibo# ls
123 haha
root@d471569629b2:/wangshibo# cat haha
5555
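The walkthrough above reads the Mountpoint out of the full `docker volume inspect` output by hand. This small sketch extracts just that field with a Go template and lists the directory. `volume_path` and `list_volume_files` are hypothetical helpers. Because myvolume uses the local driver, each node keeps its own copy of the data, so the functions have to run on the node whose copy you want to see, with enough privileges to read /var/lib/docker.

```shell
#!/usr/bin/env bash
# Print the host directory that backs a named volume on this node.
volume_path() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# List the files stored in a named volume on this node.
list_volume_files() {
  local dir
  dir=$(volume_path "$1") || return 1
  ls -A "$dir"
}
```

On node1, `list_volume_files myvolume` would list the same files that the container sees under /wangshibo.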
Command format for a bind mount
docker service create --mount type=bind,target=/container_data/,source=/host_data/
target is the path inside the container; source is the path on the host's local disk
[root@manager-node ~]# docker service create --replicas 1 --mount type=bind,target=/usr/share/nginx/html/,source=/opt/web/ --network ngx_net --name haha-nginx -p 8880:80 nginx
[root@manager-node ~]# docker service ls
ID NAME REPLICAS IMAGE COMMAND
9t9d58b5bq4u haha-nginx 1/1 nginx
[root@manager-node ~]# docker service ps haha-nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
bji4f5tikhvm7nf5ief3jk2is haha-nginx.1 nginx node2 Running Running 18 seconds ago
Log in to node2 and write test data under the mounted directory /opt/web
[root@node2 _data]# cd /opt/web/
[root@node2 web]# ls
[root@node2 web]# cat wang.html
sdfasdf
Log in to the container and check; the data written on the host is visible there
[root@node2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3618e3d1b966 nginx:latest "nginx -g 'daemon off" 28 seconds ago Up 24 seconds 80/tcp haha-nginx.1.bji4f5tikhvm7nf5ief3jk2is
[root@node2 ~]# docker exec -ti 3618e3d1b966 /bin/bash
root@3618e3d1b966:/# cd /usr/share/nginx/html
root@3618e3d1b966:/usr/share/nginx/html# ls
wang.html
root@3618e3d1b966:/usr/share/nginx/html# cat wang.html
sdfasdf
root@3618e3d1b966:/usr/share/nginx/html# touch test
touch: cannot touch 'test': Permission denied
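As this shows, with this setup the container has no write permission in the mounted directory; to update content, just place it in the mounted directory on the host.
All in all, Swarm is easy to get started with. Docker swarm makes it very convenient to create replicated services like those in kubernetes, keeping a set number of containers running so that the service stays highly available. Judging from the official documentation alone, however, its feature set seems somewhat basic.
An overall comparison of swarm, kubernetes, and mesos:
1) Swarm's strength and its weakness are the same thing: it uses the standard Docker interface. That makes it simple to use and easy to integrate into existing systems, but it also makes more complex scheduling, such as scheduling defined through custom interfaces, harder to support.
2) Kubernetes is a self-contained management tool with its own service discovery and replication. It requires existing applications to be redesigned, but it supports fault tolerance and scaling.
3) Mesos is a low-level, battle-hardened scheduler that supports several container management frameworks such as Marathon, Kubernetes, and Swarm. At present Kubernetes and Mesos are more stable than Swarm. In terms of scalability, Mesos has been proven on very large systems with hundreds or thousands of hosts. For a small cluster, say fewer than a dozen nodes, Mesos may be overly complex.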