justlinux2010's Column: Fixing a cman Startup Failure Caused by the mv Command

Wednesday, October 30, 2013

Fixing a cman Startup Failure Caused by the mv Command

References: https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=667703

http://www.qingsword.com/qing/1086.html

Today I was using cman to manage a cluster, and it failed at startup with the following errors:

[root@CentOS____102 ~]# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... I/O warning : failed to load external entity "/etc/cluster/cluster.conf"

Unable to get the configuration
I/O warning : failed to load external entity "/etc/cluster/cluster.conf"
corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
corosync [MAIN  ] Corosync built-in features: nss dbus rdma snmp
corosync [MAIN  ] Unable to read config from /etc/cluster/cluster.conf
corosync [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1695.
corosync died: Could not read cluster configuration Check cluster logs for details
                                                           [FAILED]
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
[root@CentOS____102 ~]#

The cause is that corosync cannot read the configuration file /etc/cluster/cluster.conf. The file's attributes are as follows:

[root@CentOS____102 ~]# ls -l /etc/cluster/cluster.conf 
-rw-r--r--. 1 root root 995 Oct 30 16:30 /etc/cluster/cluster.conf

With the same permissions, the file causes no problem on machine A (192.168.56.101); the failure occurs only on machine B (192.168.56.102).

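On machine B, the file was first copied from machine A into /root with scp and then moved into /etc/cluster with mv. Because the file was created under /root, it received the admin_home_t label at creation, and mv keeps a file's existing SELinux label rather than assigning the default label for the destination. The file therefore still carries admin_home_t after the move: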

[root@CentOS____102 ~]# ls -Z /etc/cluster/cluster.conf 
-rw-r--r--. root root unconfined_u:object_r:admin_home_t:s0 /etc/cluster/cluster.conf

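Because of this label, SELinux blocks corosync from reading the file. admin_home_t marks a file as belonging to root's home directory. The file cannot be opened because cluster.conf kept the SELinux context it inherited from its original directory, which differs from the context expected under /etc/cluster. The SELinux log (/var/log/audit/audit.log) contains the following record: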

type=AVC msg=audit(1383121897.046:28821): avc:  denied  { getattr } for  pid=2645 comm="corosync" path="/etc/cluster/cluster.conf" dev=sda2 ino=1311842 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file
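A record like this is easier to read once its key=value fields are pulled apart. The following is a minimal sketch, assuming a POSIX shell with sed and tr; the helper name avc_field is made up for this illustration and is not part of any SELinux tool. On a live system, the audit package's ausearch (for example `ausearch -m avc -c corosync`) can locate such records in the first place.

```shell
#!/bin/sh
# avc_field NAME RECORD  (hypothetical helper for illustration)
# Prints the value of one key=value field from an AVC record,
# with any surrounding double quotes removed.
avc_field() {
    printf '%s\n' "$2" | sed -n "s/.* $1=\([^ ]*\).*/\1/p" | tr -d '"'
}

# The denial record shown in this post.
record='type=AVC msg=audit(1383121897.046:28821): avc:  denied  { getattr } for  pid=2645 comm="corosync" path="/etc/cluster/cluster.conf" dev=sda2 ino=1311842 scontext=unconfined_u:system_r:corosync_t:s0 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file'

# Print the fields that matter for diagnosing the denial.
for key in comm path scontext tcontext tclass; do
    printf '%-9s %s\n' "$key" "$(avc_field "$key" "$record")"
done
```

The loop lists which process (comm) was denied access to which file (path), together with the source and target contexts that the policy decision was based on.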

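In this record, scontext is the source context: it describes the SELinux context of the corosync process that attempted the access. Its first field, unconfined_u, is the SELinux user, here one that is not confined to a specific user. The second field, system_r, is the role, here the one used by system processes. The third, corosync_t, is the type (for a process, also called the domain); every process and file has a type. The context of cluster.conf is given by tcontext, the target context: the first field is again the unconfined SELinux user, the second, object_r, is the role given to files and directories, and the third, admin_home_t, is the type assigned to files in root's home directory. The trailing s0 in both contexts is the sensitivity level. The policy does not let the corosync_t domain access files of type admin_home_t, which is why the getattr request was denied.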
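The field layout described above (user:role:type:level) can be taken apart with plain shell parameter expansion. This is a small sketch, assuming a POSIX shell; split_context is a hypothetical name chosen for the example. Only the first three colons are treated as separators, because the level field may itself contain colons.

```shell
#!/bin/sh
# split_context CONTEXT  (hypothetical helper for illustration)
# Splits an SELinux context of the form user:role:type:level.
split_context() {
    ctx=$1
    seuser=${ctx%%:*}; rest=${ctx#*:}    # text before the first colon
    serole=${rest%%:*}; rest=${rest#*:}  # text before the second colon
    setype=${rest%%:*}                   # text before the third colon
    selevel=${rest#*:}                   # everything after the third colon
    printf 'user=%s role=%s type=%s level=%s\n' \
        "$seuser" "$serole" "$setype" "$selevel"
}

# scontext of the corosync process, then tcontext of cluster.conf.
split_context 'unconfined_u:system_r:corosync_t:s0'
split_context 'unconfined_u:object_r:admin_home_t:s0'
```

The two calls print `user=unconfined_u role=system_r type=corosync_t level=s0` and `user=unconfined_u role=object_r type=admin_home_t level=s0`, which makes it easy to see that the process and the file differ in role and type.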

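With the cause known, the fix is simple. The most obvious approach is to turn off SELinux. If you would rather keep SELinux on, restore the file's default label with the following command: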

[root@CentOS____102 ~]# restorecon -v /etc/cluster/cluster.conf 
restorecon reset /etc/cluster/cluster.conf context unconfined_u:object_r:admin_home_t:s0->unconfined_u:object_r:cluster_conf_t:s0

After the restore, ls -Z shows the same information for the file as on machine A, and restarting the cman service now succeeds.
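To avoid repeating the mistake, the move and the relabel can be combined into one step. The sketch below is only an illustration under stated assumptions: install_with_label is a made-up helper name, and it assumes that selinuxenabled and restorecon are installed wherever SELinux is in use. On systems without SELinux the relabel step is skipped.

```shell
#!/bin/sh
# install_with_label SRC DST  (hypothetical helper, not part of any package)
# Moves SRC to DST, then resets the SELinux label of the moved file to the
# policy default for its new path when SELinux is enabled.
install_with_label() {
    src=$1
    dst=$2
    # If DST is a directory, the file lands inside it under its own name.
    if [ -d "$dst" ]; then
        target=${dst%/}/$(basename "$src")
    else
        target=$dst
    fi
    mv -- "$src" "$target" || return 1
    # Relabel only where SELinux tools exist and SELinux is enabled.
    if command -v selinuxenabled >/dev/null 2>&1 && selinuxenabled; then
        restorecon -v "$target"
    fi
}
```

In the situation from this post, the call would have been `install_with_label /root/cluster.conf /etc/cluster`, run as root.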
