Part 1: Configure pacemaker and corosync for high availability
Steps:
1. Install the MooseFS master on server4, edit name resolution, and start the service
[root@server4 ~]# ls
moosefs-master-3.0.103-1.rhsystemd.x86_64.rpm
[root@server4 ~]# rpm -ivh moosefs-master-3.0.103-1.rhsystemd.x86_64.rpm
warning: moosefs-master-3.0.103-1.rhsystemd.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID cf82adba: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:moosefs-master-3.0.103-1.rhsystem################################# [100%]
[root@server4 ~]# vim /etc/hosts
[root@server4 ~]#
[root@server4 ~]# vim /usr/lib/systemd/system/moosefs-master.service
8 ExecStart=/usr/sbin/mfsmaster -a
(the -a option lets mfsmaster recover its metadata automatically on start, which is what a failover needs)
[root@server4 ~]# systemctl daemon-reload
[root@server4 ~]# systemctl start moosefs-master
[root@server4 ~]#
[root@server4 ~]# netstat -atnlp
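By default the MooseFS master listens on ports 9419 (metalogger), 9420 (chunkserver) and 9421 (client); a quick way to confirm is to filter the netstat output:
[root@server4 ~]# netstat -antlp | grep mfsmaster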
2. Configure the HighAvailability and ResilientStorage yum repositories on server1 and server4
The repository paths can be found on the physical host like this:
[root@foundation19 ~]# cd /var/www/html/rhel7.3
[root@foundation19 rhel7.3]# ls
addons GPL LiveOS release-notes RPM-GPG-KEY-redhat-release
EFI images media.repo repodata TRANS.TBL
EULA isolinux Packages RPM-GPG-KEY-redhat-beta
[root@foundation19 rhel7.3]# cd addons/
[root@foundation19 addons]# ls
HighAvailability ResilientStorage
[root@foundation19 addons]# cd
server1:
[root@server1 ~]# vim /etc/yum.repos.d/rhel.repo
[root@server1 ~]# yum clean all
[root@server1 ~]# yum repolist
[rhel7.3]
name=rhel7.3
baseurl=http://172.25.19.250/rhel7.3
gpgcheck=0
enabled=1
[HighAvailability]
name=HighAvailability
baseurl=http://172.25.19.250/rhel7.3/addons/HighAvailability
gpgcheck=0
[ResilientStorage]
name=ResilientStorage
baseurl=http://172.25.19.250/rhel7.3/addons/ResilientStorage
gpgcheck=0
Copy the finished repo file to server4:
[root@server1 ~]# scp /etc/yum.repos.d/rhel.repo server4:/etc/yum.repos.d/
server4:
[root@server4 ~]# yum clean all
[root@server4 ~]# yum repolist
3. Install pacemaker and corosync on server1 and server4 (corosync provides the heartbeat layer)
[root@server1 ~]# ls
3.0.103 pacemaker
[root@server1 ~]# cd pacemaker/
[root@server1 pacemaker]# yum install -y pacemaker corosync
[root@server4 ~]# ls
moosefs-master-3.0.103-1.rhsystemd.x86_64.rpm pacemaker
[root@server4 ~]# cd pacemaker/
[root@server4 pacemaker]# yum install -y pacemaker corosync
4. Generate an SSH key on server1 and distribute it to server4 (and to server1 itself) for passwordless SSH
[root@server1 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3b:70:a4:57:11:9c:c0:e1:43:d3:f6:6d:84:2d:01:bd root@server1
The key's randomart image is:
+--[ RSA 2048]----+
| .==+=.+ |
| o..=.+ o |
| +... = |
| o o E o |
| o S . |
| + . |
| o |
| . |
| |
+-----------------+
[root@server1 ~]# ssh-copy-id server4
[root@server1 ~]# ssh-copy-id server1
5. Install the cluster management tool pcs on server1 and server4 (pcs is the cluster configuration suite; its daemon is pcsd, which did not exist in RHEL 6)
[root@server1 ~]# yum install -y pcs
[root@server1 ~]# systemctl start pcsd
[root@server1 ~]# systemctl enable pcsd
After installation you can see that a hacluster user was created automatically; set a password for it:
[root@server1 ~]# tail -n 3 /etc/passwd
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
mfs:x:997:995:MooseFS:/var/lib/mfs:/sbin/nologin
hacluster:x:189:189:cluster user:/home/hacluster:/sbin/nologin
[root@server1 ~]# passwd hacluster
Changing password for user hacluster.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
Do the same on server4. Make sure the hacluster user's uid and gid on server4 are exactly the same as on server1.
[root@server4 ~]# yum install -y pcs
[root@server4 ~]# systemctl start pcsd
[root@server4 ~]# systemctl enable pcsd
[root@server4 ~]# tail -n 3 /etc/passwd
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
mfs:x:997:995:MooseFS:/var/lib/mfs:/sbin/nologin
hacluster:x:189:189:cluster user:/home/hacluster:/sbin/nologin
[root@server4 ~]# passwd hacluster
Changing password for user hacluster.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
6. Authenticate the nodes, create the cluster, and start it
[root@server1 ~]# pcs cluster auth server1 server4
Username: hacluster
Password:
server4: Authorized
server1: Authorized
[root@server1 ~]# pcs cluster setup --name mycluster server1 server4
[root@server1 ~]# pcs cluster start --all
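If you also want corosync and pacemaker to start automatically after a reboot, they can be enabled as well (not done here, which is why the pcs status output below shows them as active/disabled):
[root@server1 ~]# pcs cluster enable --all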
7. Check the status
[root@server1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 172.25.19.1
status = ring 0 active with no faults
[root@server1 ~]# pcs status corosync
Membership information
----------------------
Nodeid Votes Name
1 1 server1 (local)
2 1 server4
8. Validate the configuration with crm_verify and fix the error (pcs property set stonith-enabled=false changes a cluster property; STONITH, Shoot The Other Node In The Head, is the fencing mechanism, which is disabled for now and configured properly in Part 3)
[root@server1 ~]# crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
[root@server1 ~]# pcs property set stonith-enabled=false
[root@server1 ~]# crm_verify -L -V
9. Create the VIP resource. ocf:heartbeat:IPaddr2 is a resource agent; tab completion (or pcs itself, as sketched below) can be used to browse the available agents.
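Besides tab completion, pcs can list the supported standards and agents directly, for example:
[root@server1 ~]# pcs resource standards
[root@server1 ~]# pcs resource agents ocf:heartbeat | grep IPaddr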
[root@server1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.25.19.100 cidr_netmask=32 op monitor interval=30s
[root@server1 ~]# pcs resource show
vip (ocf::heartbeat:IPaddr2): Started server1
[root@server1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: server1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Sat May 18 14:45:10 2019 Last change: Sat May 18 14:44:56 2019 by root via cibadmin on server1
2 nodes and 1 resource configured
Online: [ server1 server4 ]
Full list of resources:
vip (ocf::heartbeat:IPaddr2): Started server1
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
Monitor from server4:
[root@server4 ~]# crm_mon
10. Check the IP addresses on server1; the VIP is on server1
[root@server1 ~]# ip addr
(1) Stop the cluster on server1
[root@server1 ~]# pcs cluster stop server1
server1: Stopping Cluster (pacemaker)...
server1: Stopping Cluster (corosync)...
(2) The VIP is no longer on server1
[root@server1 ~]# ip addr
(3) Check on server4: the VIP has failed over to it
[root@server4 ~]# ip addr
In the monitor you can see that server1 is down and server4 is online.
(4) Start server1 again; the VIP does not move back
[root@server1 ~]# pcs cluster start server1
server1: Starting Cluster...
[root@server1 ~]# ip addr
Part 2: Configure iSCSI for shared storage
Steps:
1. On the physical host, unmount the previous MooseFS mount point; stop the master service on server1 and the chunkserver service on server2 and server3
[root@foundation19 ~]# umount /mnt/mfs
[root@foundation19 ~]# df
[root@server1 ~]# systemctl stop moosefs-master
[root@server2 ~]# systemctl stop moosefs-chunkserver
[root@server3 ~]# systemctl stop moosefs-chunkserver
2. Add the following resolution entry on every node (including the physical host):
172.25.19.100 mfsmaster
3. Add a virtual disk to server2 (one possible way is sketched below)
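The original does not show how the disk was added; from the physical host it could be done with libvirt roughly like this (the image path, size, and target name vdb are assumptions):
[root@foundation19 ~]# qemu-img create -f raw /var/lib/libvirt/images/server2-data.img 10G
[root@foundation19 ~]# virsh attach-disk server2 /var/lib/libvirt/images/server2-data.img vdb --persistent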
4. Install targetcli on server2 and export the new disk as an iSCSI target
[root@server2 ~]# yum install -y targetcli
[root@server2 ~]# targetcli
/> cd /backstores/block
/backstores/block> create my_disk2 /dev/vdb
/> cd iscsi
/iscsi> create iqn.2019-05.com.example:server2
/iscsi> cd iqn.2019-05.com.example:server2/tpg1/luns
/iscsi/iqn.20...er2/tpg1/luns>
/iscsi/iqn.20...er2/tpg1/luns> create /backstores/block/my_disk2
/iscsi/iqn.20...er2/tpg1/luns> cd ../acls
/iscsi/iqn.20...er2/tpg1/acls> create iqn.2019-05.com.example:client
5. Install the iSCSI initiator utilities on server1 and set the initiator name
[root@server1 ~]# yum install -y iscsi-*
[root@server1 ~]# vim /etc/iscsi/initiatorname.iscsi
[root@server1 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2019-05.com.example:client
6. Discover the iSCSI target and log in
[root@server1 ~]# iscsiadm -m discovery -t st -p 172.25.19.2
172.25.19.2:3260,1 iqn.2019-05.com.example:server2
[root@server1 ~]# iscsiadm -m node -l
7. Partition and format the disk
[root@server1 ~]# fdisk -l
[root@server1 ~]# fdisk /dev/sdb
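The interactive fdisk dialog is not shown; to get the single partition that the next command formats, the keystrokes would typically be (a sketch, prompts vary slightly between fdisk versions):
Command (m for help): n
Partition type: p, partition number 1, accept the default first and last sectors
Command (m for help): w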
[root@server1 ~]# mkfs.xfs /dev/sdb1
8. Mount the partition, copy the master's data onto it, and change the owner and group of /mnt to mfs (note: the chown must be done while the device is mounted so that it applies to the filesystem on the shared disk, not to the mount point directory)
[root@server1 ~]# mount /dev/sdb1 /mnt
[root@server1 ~]# df
[root@server1 ~]# cd /var/lib/mfs
[root@server1 mfs]# ls
[root@server1 mfs]# cp -p * /mnt
[root@server1 mfs]# cd /mnt
[root@server1 mnt]# ls
[root@server1 mnt]# chown mfs.mfs /mnt/
[root@server1 ~]# umount /mnt
[root@server1 ~]#
[root@server1 ~]# mount /dev/sdb1 /var/lib/mfs
[root@server1 ~]# systemctl start moosefs-master
[root@server1 ~]# systemctl stop moosefs-master
Note: to be safe, check whether the master's ports are still in use before starting the service; if they are, kill the stale process with kill -9 and then start the service with mfsmaster -a.
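A quick way to do that check (the port numbers assume the default MooseFS master ports, and <PID> is a placeholder for whatever netstat reports):
[root@server1 ~]# netstat -antlp | grep -E '9419|9420|9421'
[root@server1 ~]# kill -9 <PID>
[root@server1 ~]# mfsmaster -a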
9. Configure server4 the same way as server1, except that the disk does not need to be partitioned again
[root@server4 ~]# yum install -y iscsi-*
[root@server4 ~]# vim /etc/iscsi/initiatorname.iscsi
[root@server4 ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2019-05.com.example:client
[root@server4 ~]# iscsiadm -m discovery -t st -p 172.25.19.2
172.25.19.2:3260,1 iqn.2019-05.com.example:server2
[root@server4 ~]# iscsiadm -m node -l
[root@server4 ~]# fdisk -l
[root@server4 ~]# mount /dev/sdb1 /var/lib/mfs
[root@server4 ~]# df
[root@server4 ~]# systemctl start moosefs-master
10. Create the remaining resources; every step can be watched from the monitor on server4
[root@server1 ~]# pcs resource create mfsdata ocf:heartbeat:Filesystem device=/dev/sdb1 directory=/var/lib/mfs fstype=xfs op monitor interval=30s
[root@server1 ~]# pcs resource create mfsd systemd:moosefs-master op monitor interval=1min
[root@server1 ~]# pcs resource group add mfsgroup vip mfsdata mfsd
[root@server1 ~]# crm_mon
11. Test:
(1) Stop the cluster on node server4
[root@server1 ~]# pcs cluster stop server4
server4: Stopping Cluster (pacemaker)...
server4: Stopping Cluster (corosync)...
(2) Check the monitor: all resources are now on server1
[root@server1 ~]# crm_mon
Connection to the CIB terminated
(3) Bring server4 back up and check the monitor: the resources do not move back
[root@server1 ~]# pcs cluster start server4
server4: Starting Cluster...
[root@server1 ~]# crm_mon
Connection to the CIB terminated
[root@server1 ~]# ip addr
Part 3: Configure fencing
Steps:
1. Install fence-virt (the client side) on server1 and server4
[root@server1 ~]# yum install -y fence-virt
[root@server4 ~]# yum install -y fence-virt
[root@server1 ~]# pcs stonith list
fence_virt - Fence agent for virtual machines
fence_xvm - Fence agent for virtual machines
2. Install the fence server side on the physical host (three packages in total, two of which are plugins; see the note below)
[root@foundation19 ~]# yum install -y fence-virtd
[root@foundation19 ~]# yum install -y fence-virtd-libvirt.x86_64
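Only two install commands appear above although the text counts three packages; on RHEL 7 the multicast listener plugin is usually the third one required (this package name is an assumption, it is not shown in the original):
[root@foundation19 ~]# yum install -y fence-virtd-multicast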
3. Create a directory to hold the fence key and generate the configuration (when running fence_virtd -c, keep pressing Enter to accept the defaults; only two prompts need manual input)
[root@foundation19 ~]# mkdir /etc/cluster
[root@foundation19 ~]# cd /etc/cluster
[root@foundation19 cluster]# ls
[root@foundation19 cluster]# fence_virtd -c
Generate a key file:
[root@foundation19 cluster]# dd if=/dev/urandom of=fence_xvm.key bs=128 count=1
1+0 records in
1+0 records out
128 bytes (128 B) copied, 0.000166492 s, 769 kB/s
[root@foundation19 cluster]# ls
fence_xvm.key
4. Copy the generated key to server1 and server4
[root@foundation19 cluster]# scp fence_xvm.key server1:
root@server1's password:
fence_xvm.key 100% 128 0.1KB/s 00:00
[root@foundation19 cluster]# scp fence_xvm.key server4:
root@server4's password:
fence_xvm.key 100% 128 0.1KB/s 00:00
Create the same directory on server1 and server4 and move the key file into it:
[root@server1 ~]# mkdir /etc/cluster
[root@server4 ~]# mkdir /etc/cluster
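The key was copied into root's home directory above, so it still has to be moved into /etc/cluster on each node (the exact command is not shown in the original; this is one way to do it):
[root@server1 ~]# mv /root/fence_xvm.key /etc/cluster/
[root@server4 ~]# mv /root/fence_xvm.key /etc/cluster/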
5. Start the fence service on the physical host
[root@foundation19 cluster]# systemctl start fence_virtd
You can see that port 1229 is now open:
[root@foundation19 ~]# netstat -anulp | grep :1229
6. Configure the stonith resource on server1
[root@server1 ~]# cd /etc/cluster
[root@server1 cluster]# ls
fence_xvm.key
[root@server1 cluster]# pcs stonith create vmfence fence_xvm pcmk_host_map="server1:server1;server4:server4" op monitor interval=1min
[root@server1 cluster]# pcs property set stonith-enabled=true
[root@server1 cluster]# crm_verify -L -V
[root@server1 cluster]# fence_xvm -H server4
[root@server1 cluster]# crm_mon
7. After server4 is fenced, the resources move to server1; bring server4 back up afterwards
8. Crash server1's kernel; server1 is powered off by the fence agent and restarts automatically (one way to trigger the crash is sketched below)
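A common way to force a kernel crash for this test (an assumption; the original does not show the command) is the sysrq trigger:
[root@server1 ~]# echo c > /proc/sysrq-trigger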
The monitor then shows that all resources are on server4.
9. After server1 boots back up, the resources do not fail back.
Notes:
1. keepalived is not used for high availability here because it is better suited to stateless services.
2. In any cluster, keep every node's information as consistent as possible, including user uid and gid.
3. To ignore the quorum check (useful in a two-node cluster): no-quorum-policy=ignore, set with pcs property set no-quorum-policy=ignore.
4. Resources in a group start in the order in which they were added.