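Outline
Overview
This case describes in detail the procedure and precautions for upgrading docker-monitor, and briefly explains the purpose and value of the upgrade.
Environment for this case: TDH 5.2.2, docker-monitor 1.0.2
Details
docker-monitor addresses the problem of healthy services being killed by mistake when docker behaves abnormally. A typical symptom is that the Manager or Agent process disappears without explanation while its pid file remains. Another is that individual nodes frequently report port conflicts, which prevent services from starting normally: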
$ /etc/init.d/transwarp-manager status
transwarp-manager is Not running but pid file found
Caused by: java.net.BindException: Address already in use
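The first symptom (process gone, pid file left behind) can be checked with a short shell sketch. The function name `pid_file_is_stale` is hypothetical, and the pid file path in the usage note is an assumption, not a documented location.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the pid file exists
# but no process with the recorded PID is alive.
# Run as root: for an unprivileged user, kill -0 also fails on
# processes owned by other users, which would look like "stale".
pid_file_is_stale() {
    pid_file=$1
    # No pid file at all: nothing stale to report.
    [ -f "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # An empty or non-numeric pid file is treated as stale.
    case $pid in
        ''|*[!0-9]*) return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the PID exists.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

A possible call, assuming the pid file lives at `/var/run/transwarp-manager.pid` (a guess; check the init script for the real path): `pid_file_is_stale /var/run/transwarp-manager.pid && echo "stale pid file"`.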
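Procedure
- Place the docker-monitor rpm package in the repo
- Update the repo index and clear the yum cache
- Install docker-monitor and verify that it works
Place the docker-monitor rpm package in the repo
For versions before 5.2, copy the docker-monitor rpm package to the following directory on the Manager node: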
$ pwd
/var/lib/transwarp-manager/master/pub/transwarp/hadoop_related/common/docker
For 5.2 and later versions, the path changes as follows:
CentOS:
$ pwd
/var/lib/transwarp-manager/master/pub/native/RHEL7/transwarp/hadoop_related/common/docker/
SUSE:
$ pwd
/var/lib/transwarp-manager/master/pub/native/SLES12/transwarp/hadoop_related/common/docker/
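The three target directories can be captured in one lookup. This is a minimal sketch: the function name `docker_rpm_dir` and the layout labels `pre5.2` and `5.2+` are hypothetical, while the paths themselves are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the docker rpm directory for a given
# repo layout ("pre5.2" or "5.2+") and OS family ("centos" or "suse").
docker_rpm_dir() {
    base=/var/lib/transwarp-manager/master/pub
    sub=transwarp/hadoop_related/common/docker
    case "$1:$2" in
        pre5.2:*)    echo "$base/$sub" ;;
        5.2+:centos) echo "$base/native/RHEL7/$sub" ;;
        5.2+:suse)   echo "$base/native/SLES12/$sub" ;;
        *)           echo "unsupported layout/OS: $1 $2" >&2; return 1 ;;
    esac
}
```

For example, `cp docker-monitor-tos-*.rpm "$(docker_rpm_dir 5.2+ centos)/"` would copy the package on a CentOS Manager node; the rpm file name pattern here is an assumption based on the package name.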
Update the repo index and clear the yum cache
$ cd /var/lib/transwarp-manager/master/pub/transwarp
# Do not forget the trailing "." after createrepo
$ createrepo .
Spawning worker 0 with 4 pkgs
Spawning worker 1 with 4 pkgs
Spawning worker 2 with 4 pkgs
Spawning worker 3 with 4 pkgs
Spawning worker 4 with 4 pkgs
Spawning worker 5 with 4 pkgs
Spawning worker 6 with 4 pkgs
Spawning worker 7 with 4 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
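After createrepo finishes, it can be worth confirming that the index exists and that the rpm really sits below the repo root. The sketch below assumes only that createrepo writes `repodata/repomd.xml`; the function name `check_repo` and the `docker-monitor*.rpm` file name pattern are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check a repo root after running "createrepo .".
# Returns 1 if the metadata index is missing, 2 if no docker-monitor
# rpm is found below the repo root, 0 otherwise.
check_repo() {
    repo_root=$1
    if [ ! -s "$repo_root/repodata/repomd.xml" ]; then
        echo "no repodata/repomd.xml under $repo_root; rerun 'createrepo .'" >&2
        return 1
    fi
    # The index is only useful if the rpm is actually below the repo root.
    if [ -z "$(find "$repo_root" -name 'docker-monitor*.rpm' -print)" ]; then
        echo "no docker-monitor rpm found under $repo_root" >&2
        return 2
    fi
    echo "repo metadata present"
}
```

A possible call is `check_repo /var/lib/transwarp-manager/master/pub/transwarp`. The check does not prove that the index was rebuilt after the rpm was copied, so rerun createrepo if in doubt.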
Clear the yum cache
CentOS:
$ yum clean all
Loaded plugins: fastestmirror
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Cleaning repos: os transwarp
Cleaning up everything
Cleaning up list of fastest mirrors
SUSE:
$ zypper clean -a
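When the same procedure runs on both CentOS and SUSE nodes, the cache-clearing command can be selected from `/etc/os-release`. This is a sketch under the assumption that the `ID` field is `centos` or `rhel` on the yum side and `sles` or an `opensuse` variant on the zypper side; the function name `cache_clean_cmd` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the cache-clearing command that matches
# the distribution named in an os-release file (default: /etc/os-release).
cache_clean_cmd() {
    os_release=${1:-/etc/os-release}
    # os-release is a shell-compatible KEY=value file; source it in a
    # subshell so its variables do not leak into the caller.
    id=$( ID=; . "$os_release" 2>/dev/null && echo "$ID" )
    case $id in
        centos|rhel)    echo "yum clean all" ;;
        sles|opensuse*) echo "zypper clean -a" ;;
        *)              echo "unknown distribution: ${id:-?}" >&2; return 1 ;;
    esac
}
```

The function only prints the command, so the operator can review it before running it, for example with `cmd=$(cache_clean_cmd) && $cmd`.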
Install docker-monitor and verify that it works
$ yum remove docker-monitor-tos -y
$ yum -y install docker-monitor-tos
Loaded plugins: fastestmirror
os | 2.9 kB 00:00:00
transwarp | 2.9 kB 00:00:00
(1/2): transwarp/primary_db | 27 kB 00:00:00
(2/2): os/primary_db | 2.8 MB 00:00:00
Determining fastest mirrors
Package docker-monitor-tos-1.0-1.el7.centos.x86_64 already installed and latest version
Nothing to do
Enable docker-monitor, then start it
$ systemctl enable docker-monitor
$ systemctl start docker-monitor
Verify the docker-monitor version and functionality
$ md5sum /usr/sbin/docker-monitor
2c656f64f82a00c9535fa65bf73e0c1e /usr/sbin/docker-monitor
$ docker-monitor version
docker_monitor version 1.0.2 (amd64)
$ journalctl -f -u docker-monitor
-- Logs begin at Wed 2019-10-09 09:39:02 CST. --
Oct 10 15:40:52 tdh-01 systemd[1]: Started Docker Monitor Service.
Oct 10 15:40:52 tdh-01 systemd[1]: Starting Docker Monitor Service...
Oct 10 15:40:52 tdh-01 docker-monitor[4005]: I1010 15:40:52.279824 4005 docker_client.go:241] Connecting to docker on unix:///var/run/docker.sock
Oct 10 15:40:52 tdh-01 docker-monitor[4005]: I1010 15:40:52.279918 4005 docker_client.go:41] Start docker client with request timeout=1m59s
Oct 10 15:40:52 tdh-01 docker-monitor[4005]: I1010 15:40:52.281213 4005 metrics.go:106] Starting metrics server: :9323
Oct 10 15:40:52 tdh-01 docker-monitor[4005]: I1010 15:40:52.281759 4005 monitor.go:187] [docker version] func work normally, docker version: 1.13.1
Oct 10 15:40:53 tdh-01 docker-monitor[4005]: I1010 15:40:53.375403 4005 containerd_client.go:53] containerd currently has 67 running containers
Oct 10 15:40:53 tdh-01 docker-monitor[4005]: I1010 15:40:53.375446 4005 monitor.go:224] containerd component is healthy.
Oct 10 15:40:53 tdh-01 docker-monitor[4005]: I1010 15:40:53.396667 4005 monitor.go:238] [containers status] total: 112, running: 67, paused: 0, stopped: 45
Oct 10 15:40:53 tdh-01 docker-monitor[4005]: I1010 15:40:53.400259 4005 monitor.go:252] [docker exec] func work normally!
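The version and log checks above can be scripted rather than read by eye. The sketch below is illustrative only: the function names `monitor_version` and `monitor_log_healthy` are hypothetical, and the three log markers are copied from the sample output for version 1.0.2, so other versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: extract the version number from the output of
# "docker-monitor version", e.g. "docker_monitor version 1.0.2 (amd64)".
monitor_version() {
    awk '$2 == "version" { print $3; exit }'
}

# Hypothetical helper: read journal lines on stdin and succeed only if
# every health message seen in the sample log is present.
monitor_log_healthy() {
    log=$(cat)
    for marker in \
        "[docker version] func work normally" \
        "containerd component is healthy" \
        "[docker exec] func work normally"
    do
        # grep -F treats the marker as a fixed string, so "[" is literal.
        printf '%s\n' "$log" | grep -qF -- "$marker" || {
            echo "missing log marker: $marker" >&2
            return 1
        }
    done
}
```

Possible calls are `docker-monitor version | monitor_version` and `journalctl -u docker-monitor --no-pager -n 50 | monitor_log_healthy`. Using `-n 50` instead of `-f` lets the pipeline terminate; the line count is an arbitrary choice.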
To obtain the download link for the docker-monitor rpm package, contact a Transwarp after-sales engineer.