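On some of our KVM clusters, the virtual machines lost network connectivity after the physical host was rebooted: they could not ping the gateway or the host, and could not reach the external network.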

First, check that the NIC configuration files and the routes are correct on both the host and the virtual machines.
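
One way to make the routing half of this check concrete is to extract the default gateway from `ip route show` on the host and on the guest, then compare it with what the configuration files say. The following is only a sketch: the helper name `default_gateway` is hypothetical, and it assumes iproute2 is installed.

```shell
#!/bin/sh
# default_gateway: read `ip route show` output on stdin and print the gateway
# of the first "default via ..." route. The exit status is non-zero when no
# such route is present.
default_gateway() {
    awk '
        $1 == "default" && $2 == "via" { print $3; found = 1; exit }
        END { exit !found }
    '
}

# Usage on a live system (run on the host and inside the guest):
#   ip addr show                       # addresses actually configured
#   ip route show | default_gateway    # gateway actually in use
```

A missing default route on the guest would explain unreachable external hosts, but not a failure to ping the local gateway, so it is worth running both commands before moving on to the bridge.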

Then use brctl to check the state of the bridges on the host:

[root@node084 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
br-13e4c7e43788         8000.024222969321       no
br-b7a940d7d211         8000.0242ac89ed2a       no
br0             8000.ac1f6b1aca54       no              eno1
docker0         8000.02421cc19f82       no              vethe5249c6
virbr0          8000.5254004f8ee0       yes             virbr0-nic

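As shown, the br0 bridge has only the physical interface eno1 attached; none of the virtual interfaces are on it. That pins down the problem. The next step is to list all virtual NICs on the host: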

[root@node084 ~]# ip a | grep -i vnet
40: vnet3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
42: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
44: vnet4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
48: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
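
The listing above shows every vnet interface, whether or not it is already attached to a bridge. To narrow it down to the detached ones, one option is to filter the one-line-per-interface output of `ip -o link show` for entries with no `master` field. This is a minimal sketch; the function name `unbridged_vnets` is hypothetical.

```shell
#!/bin/sh
# unbridged_vnets: read `ip -o link show` output on stdin and print the names
# of vnet* interfaces whose line carries no "master <bridge>" field.
unbridged_vnets() {
    awk '
        {
            name = $2
            sub(/:$/, "", name)    # "vnet3:" -> "vnet3"
            sub(/@.*/, "", name)   # drop an "@ifN" suffix if present
        }
        name ~ /^vnet[0-9]+$/ && $0 !~ / master / { print name }
    '
}

# Usage on a live host:
#   ip -o link show | unbridged_vnets
```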

Use brctl to attach each virtual NIC to the br0 bridge in turn:

[root@node084 ~]# brctl addif br0 vnet3
[root@node084 ~]# brctl addif br0 vnet0
[root@node084 ~]# brctl addif br0 vnet4
[root@node084 ~]# brctl addif br0 vnet1
[root@node084 ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
br-13e4c7e43788         8000.024222969321       no
br-b7a940d7d211         8000.0242ac89ed2a       no
br0             8000.ac1f6b1aca54       no              eno1
                                                        vnet0
                                                        vnet1
                                                        vnet3
                                                        vnet4
docker0         8000.02421cc19f82       no              vethe5249c6
virbr0          8000.5254004f8ee0       yes             virbr0-nic
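
With more than a handful of guests, the `brctl addif` calls above can be wrapped in a loop. The sketch below assumes, as on this host, that every interface fed to it belongs on the same bridge; on hosts where some guests use another bridge such as virbr0, the input list would need filtering first. The function name `attach_all` and the `DRY_RUN` variable are hypothetical. `ip link set dev <if> master <bridge>` is the iproute2 equivalent of `brctl addif`.

```shell
#!/bin/sh
# attach_all BRIDGE: attach every interface name read from stdin (one per
# line) to BRIDGE. With DRY_RUN=1 the commands are printed, not executed.
attach_all() {
    bridge=$1
    while read -r dev; do
        [ -n "$dev" ] || continue          # skip empty lines
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "brctl addif $bridge $dev"
        else
            brctl addif "$bridge" "$dev"
        fi
    done
}

# Usage on a live host: preview first, then apply.
#   ls /sys/class/net | grep '^vnet' | DRY_RUN=1 attach_all br0
#   ls /sys/class/net | grep '^vnet' | attach_all br0
```

Attaching interfaces this way changes only the running state, so it would have to be repeated after the next reboot unless the underlying cause is fixed.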

The virtual machine's network is back to normal:

[root@node084 ~]# ping 172.26.5.32
PING 172.26.5.32 (172.26.5.32) 56(84) bytes of data.
64 bytes from 172.26.5.32: icmp_seq=1 ttl=63 time=0.151 ms
64 bytes from 172.26.5.32: icmp_seq=2 ttl=63 time=0.117 ms
64 bytes from 172.26.5.32: icmp_seq=3 ttl=63 time=0.135 ms
64 bytes from 172.26.5.32: icmp_seq=4 ttl=63 time=0.143 ms
64 bytes from 172.26.5.32: icmp_seq=5 ttl=63 time=0.140 ms
^C
--- 172.26.5.32 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.117/0.137/0.151/0.013 ms

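P.S.
The NetworkManager service can also interfere with bridge port bindings. If NetworkManager is enabled, the virtual machines may be unable to reach the external network!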
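
To see whether NetworkManager is running on a host, `systemctl is-active` can be queried. The wrapper below is a small sketch that assumes a systemd-based host; the function name `nm_state` is hypothetical.

```shell
#!/bin/sh
# nm_state: print the state systemd reports for NetworkManager
# (for example "active" or "inactive"), or "unknown" when systemctl
# is not available on this machine.
nm_state() {
    if command -v systemctl >/dev/null 2>&1; then
        # is-active returns non-zero for any state other than "active",
        # so the exit status is deliberately ignored here.
        systemctl is-active NetworkManager 2>/dev/null || true
    else
        echo unknown
    fi
}

# Usage:
#   nm_state
# If the service is active and suspected of detaching bridge ports, it can
# be stopped and disabled (check first that nothing else depends on it):
#   systemctl stop NetworkManager
#   systemctl disable NetworkManager
```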