
[DNS] "Can't resolve host" as non-root user

2022-07-02  Bogon

I. Background

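After a virtual machine was migrated and rebooted, some of the applications on it failed to restart.

The startup logs of those applications showed that a hostname could not be resolved, even though that hostname was already listed in /etc/hosts: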

xx.xx.xx.xx   oa.bogon.com

ping: oa.bogon.com: Name or service not known


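Running ping oa.bogon.com as the user that owns the service process confirmed that the name really could not be resolved, while nslookup oa.bogon.com, which goes through DNS, resolved it fine.

After su - root, however, ping resolved the name without any trouble.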
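
The difference between the two tools matters here: ping resolves names through the C library's name service switch (NSS), whereas nslookup sends its queries straight to the DNS servers and does not consult /etc/nsswitch.conf or /etc/hosts. A minimal sketch for checking the NSS path on its own, meant to be run as the affected user; nss_status is a hypothetical helper name, and getent is assumed to be installed:

```shell
# nss_status NAME: report whether NAME resolves through the NSS path
# (the same path ping uses). Hypothetical helper, not from the original post.
nss_status() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "resolved"
    else
        echo "failed"
    fi
}
```

If nss_status oa.bogon.com prints "failed" while nslookup succeeds, the problem is likely in the local NSS configuration rather than in DNS.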


II. Investigation

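On a Linux server /etc/hosts normally has the highest resolution priority, so why was it being ignored here?
The failure was conditional, though: it affected only ordinary users, and everything worked as root.

That naturally raised the suspicion that file or network permissions related to name resolution were to blame.

Use strace to trace the system calls made during resolution under each user: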

# su   -  root 

# strace -e trace=open    ping oa.bogon.com

open("/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libonion.so", O_RDONLY|O_CLOEXEC) = 3
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libidn.so.11", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcrypto.so.10", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libresolv.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/etc/pki/tls/legacy-settings", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 4
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 4
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 4
PING oa.bogon.com (10.0.8.7) 56(84) bytes of data.
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 4
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=1 ttl=64 time=0.033 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=3 ttl=64 time=0.044 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=4 ttl=64 time=0.043 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=5 ttl=64 time=0.042 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=6 ttl=64 time=0.045 ms
64 bytes from oa.bogon.com (10.0.8.7): icmp_seq=7 ttl=64 time=0.045 ms
strace: Process 18039 detached

# su - test
$ strace -e trace=open    ping oa.bogon.com
open("/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libonion.so", O_RDONLY|O_CLOEXEC) = 3
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libidn.so.11", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libcrypto.so.10", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libresolv.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/etc/pki/tls/legacy-settings", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
open("/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
ping: socket: Operation not permitted
+++ exited with 2 +++

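The trace as test ends with "ping: socket: Operation not permitted" before any resolver file is opened, most likely because ping does not get its raw-socket privilege when a non-root user runs it under strace. The root trace is therefore the one that shows which files name resolution reads. Focus on the following three: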
/etc/hosts
/etc/host.conf
/etc/nsswitch.conf

$ ls -l  /etc/hosts
-rw-r--r-- 1 root root 257 Jul  2 11:58 /etc/hosts

$ ls -l  /etc/host.conf
-rw-r--r-- 1 root root 9 Jun  7  2013 /etc/host.conf

$ ls -l  /etc/nsswitch.conf
-rw-rw----. 1 root root 1746 Mar  7  2019 /etc/nsswitch.conf

$ cat /etc/nsswitch.conf
cat: /etc/nsswitch.conf: Permission denied
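
The manual ls -l checks above can be bundled into a small function. This is only a sketch: report_unreadable is a hypothetical name, and it assumes the usual ls -l mode string in which the eighth character is the read bit for "other" users.

```shell
# report_unreadable FILE...: print each file that "other" users cannot read.
# Hypothetical helper; relies on the standard 10-character ls mode string.
report_unreadable() {
    for f in "$@"; do
        # Keep only the mode field, e.g. "-rw-rw----" (drops the SELinux dot).
        mode=$(ls -ld -- "$f" 2>/dev/null | cut -c1-10)
        case $mode in
            '')        echo "$f: missing" ;;
            ???????r*) ;;  # read bit for "other" is set, nothing to report
            *)         echo "$f: mode $mode is not world-readable" ;;
        esac
    done
}
```

Called as report_unreadable /etc/hosts /etc/host.conf /etc/nsswitch.conf on the machine above, it would flag only /etc/nsswitch.conf, whose mode is -rw-rw----.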


III. Solution

#  chmod  644  /etc/hosts
#  chmod  644  /etc/host.conf
#  chmod  644   /etc/nsswitch.conf

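The nsswitch.conf (name service switch configuration) file lives in /etc. It specifies which sources are consulted, and in what order, to look up each type of information, and it can also specify what action the system takes when a source succeeds or fails.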

$  cat    /etc/nsswitch.conf

hosts:      files dns myhostname

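Lookups try /etc/hosts first; if that fails, they query the nameservers listed in /etc/resolv.conf; if that fails too, myhostname is consulted, which resolves the machine's own hostname.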
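
To see the lookup order on a given machine at a glance, the hosts: line can be pulled out with awk. A minimal sketch; hosts_sources is a hypothetical helper name, and the file path defaults to /etc/nsswitch.conf:

```shell
# hosts_sources [FILE]: list the entries on the "hosts:" line in lookup order.
# Hypothetical helper; FILE defaults to /etc/nsswitch.conf.
hosts_sources() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++) printf "%d. %s\n", i - 1, $i
    }' "${1:-/etc/nsswitch.conf}"
}
```

For the line shown above this prints files, dns, and myhostname numbered 1 to 3. Action items such as [NOTFOUND=return] are not treated specially and would be numbered like sources. The function must be run by a user who can read the file, which is exactly what the ordinary user in this incident could not do.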

IV. Postmortem

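  1. A failure of the underlying physical host caused the virtual machine to be migrated and rebooted (a virtualization mechanism). After the reboot, the mode of /etc/nsswitch.conf had become 660; the default should be 644.

  2. Without the comparison against ping run as root, it would have been hard to know where to look at first.

  3. Tracing ping's system calls with strace as root revealed the files opened during resolution.

  4. An ordinary user without read permission on /etc/nsswitch.conf cannot make use of /etc/hosts.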

V. References

/etc/hosts entries not being used for non-root users
https://www.unixsherpa.com/solution/etchosts-entries-not-being-used-for-non-root-users/

Cannot resolve host as non-root user
https://serverfault.com/questions/637274/cannot-resolve-host-as-non-root-user

"Can't resolve host" as user, but works fine as root
https://www.linuxquestions.org/questions/linux-networking-3/can%27t-resolve-host-as-user-but-works-fine-as-root-494270/

nslookup-OK-but-ping-fail
https://plantegg.github.io/2019/01/09/nslookup-OK-but-ping-fail

Linux: can ping an IP but not a hostname, and how to fix it
https://www.cnblogs.com/gaoyuechen/p/8378138.html

The /etc/nsswitch.conf file on Linux
https://www.bbsmax.com/A/Ae5RaXXLJQ
https://blog.csdn.net/waqwn/article/details/51687719

System Administration Guide: Naming and Directory Services (DNS, NIS, and LDAP)
https://docs.oracle.com/cd/E24847_01/html/E22302/a12swit-22067.html

strace, an essential Linux tool, explained
https://www.cnblogs.com/johnny666888/p/12629216.html
