2015-07-24

Deploying a Hello World fleet cluster

Contents

  • Introduction to fleet
  • fleet cluster dependencies
  • Deploying an etcd cluster
  • fleet usage example
  • References

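Introduction to fleet

Components of a fleet cluster

A fleet cluster consists mainly of an etcd cluster, the fleetd daemon, and systemd:

  • etcd provides shared cluster configuration and service discovery; the fleetd processes communicate with each other through etcd.
  • systemd manages services: starting, stopping, restarting, and so on.
  • fleet sends instructions to the other fleet nodes through etcd and decides on which fleet node a service is started.

As the list above shows, fleet, etcd, and systemd must all be installed before fleet can be used. Because we are running CoreOS, this is not a concern: CoreOS ships with all three by default. By default, however, etcd is not set up as a cluster, so we have to create an etcd cluster ourselves. fleet builds its own cluster on top of etcd, so an etcd cluster must be in place before fleet can control the machines in the cluster.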
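A quick way to confirm that the etcd cluster has formed and that every fleetd has joined it is to count the rows printed by `fleetctl list-machines`. The sketch below is only an illustration: `count_machines` is a hypothetical helper, not part of fleet, and the expected count of 3 is an assumption that matches the three-node cluster used in this article. The sample text is the `list-machines` output shown later in the usage example.

```shell
#!/bin/sh
# count_machines: hypothetical helper (not part of fleet).
# Reads `fleetctl list-machines` output on stdin and prints how many
# machines are listed, skipping the header row and blank lines.
count_machines() {
    awk 'NF > 0 && $1 != "MACHINE" { n++ } END { print n + 0 }'
}

# Sample output from this article. On a real node you would instead run:
#   fleetctl list-machines | count_machines
sample='MACHINE      IP            METADATA
4371d26c...  172.17.8.102  -
9662f791...  172.17.8.101  -
d8c2fedf...  172.17.8.103  -'

expected=3   # assumption: a three-node cluster
found=$(printf '%s\n' "$sample" | count_machines)

if [ "$found" -eq "$expected" ]; then
    echo "all $expected machines have joined"
else
    echo "only $found of $expected machines are visible"
fi
```

If fewer machines than expected are listed, the usual suspect is the etcd cluster itself, since fleet only sees the nodes that registered through etcd.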

fleet cluster dependencies

etcd

For how to create an etcd cluster, [see here]().

systemd

systemd manages system services through unit files. For more, [see here]().
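To make the idea of a unit file concrete, here is a small sketch that prints one for a Docker container. `render_unit` and its three parameters are hypothetical names used only for illustration. The directives themselves (`Description`, `After`, `Requires`, `TimeoutStartSec`, `ExecStartPre`, `ExecStart`, `ExecStop`) are standard systemd ones, and the layout mirrors the myapp.service file used in the usage example of this article.

```shell
#!/bin/sh
# render_unit NAME IMAGE COMMAND: hypothetical helper that prints a
# systemd unit file running COMMAND in a Docker container called NAME.
render_unit() {
    name=$1    # container name, e.g. busybox1
    image=$2   # Docker image, e.g. busybox
    cmd=$3     # command to run inside the container
    cat <<EOF
[Unit]
Description=$name
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
# A leading "-" tells systemd to ignore a non-zero exit status.
ExecStartPre=-/usr/bin/docker kill $name
ExecStartPre=-/usr/bin/docker rm $name
ExecStartPre=/usr/bin/docker pull $image
ExecStart=/usr/bin/docker run --name $name $image $cmd
ExecStop=/usr/bin/docker stop $name
EOF
}

# Print a unit equivalent to the Hello World service of this article.
render_unit busybox1 busybox \
    '/bin/sh -c "while true; do echo Hello World; sleep 1; done"'
```

Redirecting the output to a file such as `myapp.service` gives something that can be handed to fleet. The two `ExecStartPre=-` lines are why the status output later shows `kill` and `rm` exiting with a failure without stopping the service from starting.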

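fleet usage example

Deploying a Hello World service

Once the etcd cluster has been created, you can send instructions to the cluster with fleetctl, as shown below:

  • List the machines in the cluster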

core@core-01 ~ $ fleetctl list-machines
MACHINE		IP		METADATA
4371d26c...	172.17.8.102	-
9662f791...	172.17.8.101	-
d8c2fedf...	172.17.8.103	-

  • Create a unit file to add to the fleet cluster, as shown below. The service is named myapp.

core@core-01 ~ $ cat myapp.service
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop busybox1

  • Add it to fleet

core@core-01 ~ $ fleetctl submit myapp.service

  • Start the service in the cluster through fleet. As the output shows, the cluster deployed myapp.service on 172.17.8.102.

core@core-01 ~ $ fleetctl start myapp.service
Unit myapp.service launched on 4371d26c.../172.17.8.102

  • List the services in the fleet cluster; the service is running on 102

core@core-01 ~ $ fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
myapp.service	4371d26c.../172.17.8.102	active	running

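  • After a service such as myapp.service has been started in the cluster, its status can be checked as shown below. In the first run Docker is still downloading the image; running the command again a little later shows that the application has started successfully.
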
core@core-01 ~ $ fleetctl status myapp.service
  myapp.service - MyApp
   Loaded: loaded (/run/fleet/units/myapp.service; linked-runtime; vendor preset: disabled)
   Active: activating (start-pre) since Thu 2015-07-23 23:06:15 UTC; 1min 2s ago
  Process: 1223 ExecStartPre=/usr/bin/docker rm busybox1 (code=exited, status=1/FAILURE)
  Process: 1140 ExecStartPre=/usr/bin/docker kill busybox1 (code=exited, status=1/FAILURE)
  Control: 1233 (docker)
   Memory: 11.3M
      CPU: 55ms
   CGroup: /system.slice/myapp.service
           └─control
             └─1233 /usr/bin/docker pull busybox

Jul 23 23:06:30 core-02 docker[1233]: 6ce2e90b0bc7: Pulling fs layer
Jul 23 23:06:30 core-02 docker[1233]: 8c2e06607696: Pulling fs layer
Jul 23 23:06:30 core-02 docker[1233]: 8c2e06607696: Pulling fs layer
Jul 23 23:06:30 core-02 docker[1233]: 8c2e06607696: Layer already being pulled by another client. Waiting.
Jul 23 23:06:40 core-02 docker[1233]: cf2616975b4a: Verifying Checksum
Jul 23 23:06:40 core-02 docker[1233]: cf2616975b4a: Download complete
Jul 23 23:06:40 core-02 docker[1233]: cf2616975b4a: Pull complete
Jul 23 23:06:42 core-02 docker[1233]: 8c2e06607696: Verifying Checksum
Jul 23 23:06:42 core-02 docker[1233]: 8c2e06607696: Download complete
Jul 23 23:06:42 core-02 docker[1233]: 8c2e06607696: Download complete
core@core-01 ~ $ fleetctl status myapp.service
  myapp.service - MyApp
   Loaded: loaded (/run/fleet/units/myapp.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Thu 2015-07-23 23:08:45 UTC; 13s ago
  Process: 1233 ExecStartPre=/usr/bin/docker pull busybox (code=exited, status=0/SUCCESS)
  Process: 1223 ExecStartPre=/usr/bin/docker rm busybox1 (code=exited, status=1/FAILURE)
  Process: 1140 ExecStartPre=/usr/bin/docker kill busybox1 (code=exited, status=1/FAILURE)
 Main PID: 1315 (docker)
   Memory: 11.5M
      CPU: 76ms
   CGroup: /system.slice/myapp.service
           └─1315 /usr/bin/docker run --name busybox1 busybox /bin/sh -c while true; do echo Hello World; sleep 1; done

Jul 23 23:08:49 core-02 docker[1315]: Hello World
Jul 23 23:08:50 core-02 docker[1315]: Hello World
Jul 23 23:08:51 core-02 docker[1315]: Hello World
Jul 23 23:08:52 core-02 docker[1315]: Hello World
Jul 23 23:08:53 core-02 docker[1315]: Hello World
Jul 23 23:08:54 core-02 docker[1315]: Hello World
Jul 23 23:08:55 core-02 docker[1315]: Hello World
Jul 23 23:08:56 core-02 docker[1315]: Hello World
Jul 23 23:08:57 core-02 docker[1315]: Hello World
Jul 23 23:08:58 core-02 docker[1315]: Hello World

  • Now let's look again at where myapp.service is running
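
The same question can be answered from a script by reading the MACHINE column of `fleetctl list-units`. The following is a minimal sketch: `unit_machine` is a hypothetical helper, and the sample row assumes the unit is still on the machine reported by `fleetctl start` above.

```shell
#!/bin/sh
# unit_machine UNIT: hypothetical helper (not part of fleet).
# Reads `fleetctl list-units` output on stdin and prints the IP address
# of the machine that UNIT is scheduled on.
unit_machine() {
    awk -v unit="$1" '$1 == unit { split($2, m, "/"); print m[2]; exit }'
}

# Sample row based on this article. On a real node you would instead run:
#   fleetctl list-units | unit_machine myapp.service
sample='UNIT           MACHINE                    ACTIVE  SUB
myapp.service  4371d26c.../172.17.8.102   active  running'

printf '%s\n' "$sample" | unit_machine myapp.service   # prints 172.17.8.102
```

The MACHINE column has the form `<short machine ID>/<IP>`, so splitting on `/` is enough here. Nothing is printed if the unit is not listed.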

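Simulating a cluster failure

As described above, the fleet cluster automatically assigned the service to the machine 172.17.8.102. Now we shut that machine down by hand to simulate a failure and see what happens.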
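
The manual checks in the steps below could also be automated. What follows is a rough sketch, not something fleet provides: `wait_for_move` is a hypothetical function that polls `fleetctl list-units` until the unit is reported as running on a machine other than the original one. The `FLEETCTL` variable, the retry count, and the polling interval are assumptions introduced here so that the command can be replaced by a stub when trying the script without a cluster.

```shell
#!/bin/sh
# Command used to talk to fleet; override it to use a stub or another path.
FLEETCTL=${FLEETCTL:-fleetctl}

# wait_for_move UNIT OLD_IP [TRIES] [INTERVAL]
# Hypothetical helper: polls list-units until UNIT is in the "running"
# sub-state on a machine whose IP differs from OLD_IP.
wait_for_move() {
    unit=$1
    old_ip=$2
    tries=${3:-30}      # assumption: give up after 30 polls
    interval=${4:-5}    # assumption: 5 seconds between polls
    while [ "$tries" -gt 0 ]; do
        new_ip=$($FLEETCTL list-units 2>/dev/null |
            awk -v unit="$unit" \
                '$1 == unit && $4 == "running" { split($2, m, "/"); print m[2]; exit }')
        if [ -n "$new_ip" ] && [ "$new_ip" != "$old_ip" ]; then
            echo "$unit is now running on $new_ip"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    echo "$unit did not move within the allotted time" >&2
    return 1
}

# Example call (needs a live cluster):
#   wait_for_move myapp.service 172.17.8.102
```

The function reads `list-units` rather than `status`, because `status` is executed over SSH on the machine that hosts the unit and fails while that machine is unreachable.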

  • Log in to 172.17.8.102 and shut it down
  • Now look at which machines are in the cluster; 102 is gone

core@core-01 ~ $ fleetctl list-machines
MACHINE      IP            METADATA
9662f791...  172.17.8.101  -
d8c2fedf...  172.17.8.103  -

  • Use fleetctl status myapp.service to watch the service. At first the command fails: fleet reports that it cannot connect to 102.

core@core-01 ~ $ fleetctl status myapp.service
Error running remote command: ssh: handshake failed: read tcp 172.17.8.102:22: connection reset by peer
core@core-01 ~ $ fleetctl status myapp.service
Error running remote command: ssh: handshake failed: read tcp 172.17.8.102:22: connection reset by peer
core@core-01 ~ $ fleetctl status myapp.service
Error running remote command: ssh: handshake failed: read tcp 172.17.8.102:22: connection reset by peer

  • Wait a moment and run fleetctl status myapp.service again: myapp.service has started downloading the Docker image on a new machine.

core@core-01 ~ $ fleetctl status myapp.service
  myapp.service - MyApp
   Loaded: loaded (/run/fleet/units/myapp.service; linked-runtime; vendor preset: disabled)
   Active: activating (start-pre) since Thu 2015-07-23 23:12:30 UTC; 36s ago
  Process: 1472 ExecStartPre=/usr/bin/docker rm busybox1 (code=exited, status=1/FAILURE)
  Process: 1391 ExecStartPre=/usr/bin/docker kill busybox1 (code=exited, status=1/FAILURE)
  Control: 1479 (docker)
   Memory: 9.0M
      CPU: 51ms
   CGroup: /system.slice/myapp.service
           └─control
             └─1479 /usr/bin/docker pull busybox

Jul 23 23:12:45 core-01 docker[1479]: 6ce2e90b0bc7: Pulling fs layer
Jul 23 23:12:45 core-01 docker[1479]: 8c2e06607696: Pulling fs layer
Jul 23 23:12:45 core-01 docker[1479]: 8c2e06607696: Pulling fs layer
Jul 23 23:12:45 core-01 docker[1479]: 8c2e06607696: Layer already being pulled by another client. Waiting.
Jul 23 23:12:56 core-01 docker[1479]: 8c2e06607696: Verifying Checksum
Jul 23 23:12:56 core-01 docker[1479]: 8c2e06607696: Download complete
Jul 23 23:12:56 core-01 docker[1479]: 8c2e06607696: Download complete
Jul 23 23:12:56 core-01 docker[1479]: cf2616975b4a: Verifying Checksum
Jul 23 23:12:56 core-01 docker[1479]: cf2616975b4a: Download complete
Jul 23 23:12:57 core-01 docker[1479]: cf2616975b4a: Pull complete

  • Check the status again a little later. As shown below, the service has recovered.

core@core-01 ~ $ fleetctl status myapp.service
● myapp.service - MyApp
   Loaded: loaded (/run/fleet/units/myapp.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Thu 2015-07-23 23:14:28 UTC; 36s ago
  Process: 1479 ExecStartPre=/usr/bin/docker pull busybox (code=exited, status=0/SUCCESS)
  Process: 1472 ExecStartPre=/usr/bin/docker rm busybox1 (code=exited, status=1/FAILURE)
  Process: 1391 ExecStartPre=/usr/bin/docker kill busybox1 (code=exited, status=1/FAILURE)
 Main PID: 1548 (docker)
   Memory: 7.0M
      CPU: 62ms
   CGroup: /system.slice/myapp.service
           └─1548 /usr/bin/docker run --name busybox1 busybox /bin/sh -c while true; do echo Hello World; sleep 1; done

Jul 23 23:14:55 core-01 docker[1548]: Hello World
Jul 23 23:14:56 core-01 docker[1548]: Hello World
Jul 23 23:14:57 core-01 docker[1548]: Hello World
Jul 23 23:14:58 core-01 docker[1548]: Hello World
Jul 23 23:14:59 core-01 docker[1548]: Hello World
Jul 23 23:15:00 core-01 docker[1548]: Hello World
Jul 23 23:15:01 core-01 docker[1548]: Hello World
Jul 23 23:15:02 core-01 docker[1548]: Hello World
Jul 23 23:15:03 core-01 docker[1548]: Hello World
Jul 23 23:15:04 core-01 docker[1548]: Hello World

  • List the services in the fleet cluster. The service is now running on 101: fleet has restored it automatically within the cluster.

core@core-01 ~ $ fleetctl list-units
UNIT           MACHINE                    ACTIVE  SUB
myapp.service  9662f791.../172.17.8.101   active  running

  • Control the cluster from outside with fleetctl. In this session fleetctl is not on the workstation's PATH, so the calls that succeed invoke the binary by its path, ../fleet-v0.10.2/fleetctl.

[2015-07-24 07:36:52 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $export FLEETCTL_TUNNEL=127.0.0.1:2222

[2015-07-24 07:36:54 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $ssh-add ~/.vagrant.d/insecure_private_key
Identity added: /Users/coreos/.vagrant.d/insecure_private_key (/Users/coreos/.vagrant.d/insecure_private_key)

[2015-07-24 07:36:59 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $fleetctl list-machines
-bash: fleetctl: command not found

[2015-07-24 07:37:05 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $../fleet-v0.10.2/fleetctl list-machines
The authenticity of host '[127.0.0.1]:2222' can't be established.
RSA key fingerprint is d1:13:64:23:9b:80:72:1f:6e:f5:0f:bb:2e:35:78:b6.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[127.0.0.1]:2222' (RSA) to the list of known hosts.
MACHINE      IP            METADATA
9662f791...  172.17.8.101  -
d8c2fedf...  172.17.8.103  -

[2015-07-24 07:37:15 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $vim dillinger.service

[2015-07-24 07:39:31 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant]
-bash: fleetctl: command not found

[2015-07-24 07:39:47 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $../fleet-v0.10.2/fleetctl submit dillinger.service

[2015-07-24 07:39:59 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $../fleet-v0.10.2/fleetctl list-units
UNIT           MACHINE                    ACTIVE  SUB
myapp.service  9662f791.../172.17.8.101   active  running

[2015-07-24 07:40:07 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $fleetctl start dillinger.service
-bash: fleetctl: command not found

[2015-07-24 07:40:36 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant]

[2015-07-24 07:40:53 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $fleetctl list-units
-bash: fleetctl: command not found

[2015-07-24 07:41:03 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $../fleet-v0.10.2/fleetctl list-units
UNIT               MACHINE                    ACTIVE  SUB
dillinger.service  d8c2fedf.../172.17.8.103   active  running
myapp.service      9662f791.../172.17.8.101   active  running

[2015-07-24 07:41:09 coreos@coreosdeAir /Users/coreos/coreos/coreos-vagrant] $../fleet-v0.10.2/fleetctl journal dillinger.service
The authenticity of host '172.17.8.103' can't be established.
RSA key fingerprint is ca:cc:f6:ef:fb:2d:15:e7:3c:41:7b:cc:9f:74:16:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.17.8.103' (RSA) to the list of known hosts.
-- Logs begin at Tue 2015-07-21 15:47:48 UTC, end at Thu 2015-07-23 23:41:25 UTC. --
Jul 23 23:40:52 core-03 systemd[1]: Started dillinger.io.
Jul 23 23:40:52 core-03 systemd[1]: Starting dillinger.io...
Jul 23 23:40:53 core-03 docker[3958]: Unable to find image 'dscape/dillinger:latest' locally
Jul 23 23:41:04 core-03 docker[3958]: Pulling repository dscape/dillinger
Jul 23 23:41:08 core-03 docker[3958]: 51ab2c16c4f4: Pulling image (latest) from dscape/dillinger
Jul 23 23:41:08 core-03 docker[3958]: 51ab2c16c4f4: Pulling image (latest) from dscape/dillinger, endpoint: https://registry-1.docker.io/v1/
Jul 23 23:41:11 core-03 docker[3958]: 51ab2c16c4f4: Pulling dependent layers
Jul 23 23:41:11 core-03 docker[3958]: 8dbd9e392a96: Pulling metadata
Jul 23 23:41:13 core-03 docker[3958]: 8dbd9e392a96: Pulling fs layer
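
Because `fleetctl` was not on the workstation's PATH in the session above, every working call had to spell out `../fleet-v0.10.2/fleetctl`. A small wrapper avoids that repetition. This is only a sketch: `fleet_remote`, `FLEETCTL_BIN`, and `FLEET_TUNNEL_ADDR` are hypothetical names chosen here, and the default path and tunnel address are simply the values from the session above, which will differ on another setup. `FLEETCTL_TUNNEL` itself is the environment variable the session already uses.

```shell
#!/bin/sh
# Path to the fleetctl binary and the SSH tunnel endpoint. Both defaults
# come from the session above and are assumptions for any other setup.
FLEETCTL_BIN=${FLEETCTL_BIN:-../fleet-v0.10.2/fleetctl}
FLEET_TUNNEL_ADDR=${FLEET_TUNNEL_ADDR:-127.0.0.1:2222}

# fleet_remote ARGS...: hypothetical wrapper that runs fleetctl through
# the tunnel. The subshell keeps FLEETCTL_TUNNEL out of the caller's
# environment.
fleet_remote() {
    (
        export FLEETCTL_TUNNEL="$FLEET_TUNNEL_ADDR"
        "$FLEETCTL_BIN" "$@"
    )
}

# Example calls (need the Vagrant cluster to be up and the key loaded
# with ssh-add, as in the session above):
#   fleet_remote list-machines
#   fleet_remote submit dillinger.service
#   fleet_remote start dillinger.service
#   fleet_remote list-units
```

Adding the directory that contains `fleetctl` to PATH would work just as well; the wrapper merely keeps the tunnel setting and the binary location in one place.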

References