The status check for `named` reports the daemon as "dead", yet its PID file, which is supposed to disappear when the process stops, is still present. In other words, the PID file points at a process that no longer exists: the service manager expects `named` to be running but cannot find it, and the lingering file confuses the startup/shutdown logic.
Common Causes and Fixes
- Stale PID File Due to Unexpected Termination:
  - Diagnosis: Check whether a `named` process exists: `ps aux | grep '[n]amed'` (the bracket keeps `grep` from matching its own command line). If nothing is running, check for the PID file: `ls -l /var/run/named/named.pid` (the exact path might vary, e.g., `/run/named/named.pid`). If the file exists, no `named` process is running, and the timestamp is old, the file is stale.
  - Fix: Remove the stale PID file: `sudo rm /var/run/named/named.pid`. Then restart `named`: `sudo systemctl restart named`.
  - Why it works: The PID file signals to the system that `named` is running. If `named` crashed or was killed abruptly, the file might not have been cleaned up. Removing it lets the service manager correctly detect that `named` is not running and proceed with a fresh start.
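The stale-file check above can be sketched as a small shell function. This is a minimal sketch, not part of BIND or systemd: the name `pidfile_state` is hypothetical, and it assumes a Linux system where `/proc` is mounted.

```shell
# Hypothetical helper: classify a PID file as "missing", "live", or "stale".
# It only checks that *some* process has the recorded PID; it does not
# verify that the process is actually named.
pidfile_state() {
    pidfile=$1
    [ -f "$pidfile" ] || { echo missing; return 0; }

    # Read the first line and strip whitespace.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')

    # An empty or non-numeric value cannot refer to a live process.
    case $pid in
        ''|*[!0-9]*) echo stale; return 0 ;;
    esac

    # kill -0 delivers no signal; it only tests whether the PID exists.
    # The /proc check covers processes we lack permission to signal.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        echo live
    else
        echo stale
    fi
}
```

For example, `pidfile_state /var/run/named/named.pid` printing `stale` would suggest the file is safe to remove before restarting the service.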
- Permissions Issue with PID File Directory:
  - Diagnosis: Check the ownership and permissions of the directory containing the PID file: `ls -ld /var/run/named/` (or equivalent). The user/group that `named` runs as (often `named` or `bind`) needs write permission on this directory.
  - Fix: Ensure that user/group owns the directory and has write permission. For example, if the user is `named` and the group is `named`: `sudo chown named:named /var/run/named/` and `sudo chmod 755 /var/run/named/`. Then restart `named`.
  - Why it works: The `named` process needs to create and delete its PID file. If the directory it is supposed to write to is not owned by the `named` user or lacks write permission, `named` can fail to start, or fail to clean up during shutdown and leave an old file behind.
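The ownership and write-permission check can also be scripted. The function below is a rough sketch under stated assumptions: the name `pid_dir_ok` is hypothetical, the owner may be given as a user name or numeric UID, and only the owner write bit is tested (not ACLs or SELinux).

```shell
# Hypothetical helper: succeed only if DIR is a directory owned by OWNER
# with the owner write bit set.
pid_dir_ok() {
    dir=$1
    owner=$2
    # find prints the directory only when every test matches;
    # -perm -200 means "at least the owner write bit is set".
    [ -n "$(find "$dir" -maxdepth 0 -type d -user "$owner" -perm -200 2>/dev/null)" ]
}
```

A call such as `pid_dir_ok /var/run/named named || echo "fix ownership or mode"` would flag the directory before you attempt a restart.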
- SELinux Blocking PID File Creation/Deletion:
  - Diagnosis: Check the SELinux audit log for denials involving `named`: `sudo ausearch -m AVC -c named`. To see which rules would allow the denied operations, pipe the raw records into `audit2allow`: `sudo ausearch -m AVC -c named --raw | audit2allow`. Look for AVC denials related to file creation or deletion in `/var/run/named/`.
  - Fix: If SELinux is the culprit, you might need to create a custom SELinux policy. A common immediate workaround (not recommended for production without careful consideration) is to set SELinux to permissive mode temporarily: `sudo setenforce 0`. Then restart `named`. If it works, create the proper policy and return to enforcing mode with `sudo setenforce 1`.
  - Why it works: SELinux enforces security contexts. If the `named` process does not have the correct SELinux context to write to or delete files in the PID file directory, it is blocked, leading to stale files or startup failures.
- Incorrect `named` Configuration Leading to Early Exit:
  - Diagnosis: Examine the `named` configuration, typically `/etc/named.conf` or `/etc/bind/named.conf`. Look for syntax errors, incorrect zone definitions, or misconfigured options that might cause `named` to exit immediately after starting. Use `named-checkconf` to validate: `sudo named-checkconf /etc/named.conf`. It prints nothing when the syntax is valid. Adding `-z` also test-loads the primary zones listed in the configuration.
  - Fix: Correct any syntax errors or logical misconfigurations in `named.conf` and its included zone files. For example, if a zone file is missing or has incorrect permissions, `named` might fail. After fixing the configuration, restart `named`.
  - Why it works: A fundamentally broken configuration causes `named` to fail its startup checks or hit an unrecoverable error very early in its execution. It might run long enough to create a PID file but then exit, leaving the file behind.
- Systemd Unit File Issues:
  - Diagnosis: Inspect the `systemd` service unit file for `named`, usually located at `/usr/lib/systemd/system/named.service` or `/etc/systemd/system/named.service`. Look for an incorrect `PIDFile=` directive or a faulty `ExecStart=` command. Check `systemctl status named` for the current state and recent log lines.
  - Fix: Ensure the `PIDFile=` directive points to the location where `named` actually writes its PID file, and correct any errors in the `ExecStart=` command. Make the change under `/etc/systemd/system/` (for example with `sudo systemctl edit named`), because package updates overwrite files in `/usr/lib/systemd/system/`. After modifying the unit, reload systemd: `sudo systemctl daemon-reload`, then restart `named`.
  - Why it works: The unit file tells `systemd` how to manage the `named` service, including where to find its PID file. If this directive is wrong, `systemd` misinterprets the service's state, leading to the "dead but PID file exists" error.
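To compare the unit's `PIDFile=` setting with the path where the PID file really lives, it helps to extract the value mechanically. The following is only a sketch: `unit_pidfile` is a hypothetical helper that parses a single unit file as plain text and does not account for drop-in overrides.

```shell
# Hypothetical helper: print the PIDFile= value from a unit file.
# If the directive appears more than once, the last assignment is printed.
unit_pidfile() {
    sed -n 's/^[[:space:]]*PIDFile[[:space:]]*=[[:space:]]*//p' "$1" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}
```

Running `unit_pidfile /usr/lib/systemd/system/named.service` and comparing the result with the path found via `ls -l` earlier shows whether the two agree.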
- Resource Limits or Dependencies Not Met:
  - Diagnosis: Check system logs (`journalctl -xe` or `/var/log/messages`) for messages about `named` failing due to insufficient memory, file descriptors, or other system resources. Also verify that any necessary dependencies (`nss-util`, `dnssec-tools`, etc.) are installed and functional.
  - Fix: Adjust resource limits or free up system resources. For a service started by `systemd`, set limits in the unit file with directives such as `LimitNOFILE=`; `/etc/security/limits.conf` applies only to PAM login sessions, so it matters only when `named` is launched by hand from such a session. Ensure all required packages are installed. After making changes, restart `named`.
  - Why it works: `named` might fail to start or stay running if the system cannot provide the necessary resources. This can lead to an abrupt exit, leaving the PID file behind.
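When `named` does stay up long enough to inspect, the limits it actually runs with can be read from `/proc`. This is a minimal sketch assuming Linux; `soft_nofile_limit` is a hypothetical helper name.

```shell
# Hypothetical helper: print the soft "Max open files" limit of a running PID.
soft_nofile_limit() {
    # Line format in /proc/PID/limits:
    #   Max open files   <soft>   <hard>   files
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}
```

For instance, `soft_nofile_limit "$(cat /var/run/named/named.pid)"` reports the limit of the running daemon, which you can compare against what the logs say it ran out of.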
- Corrupted `named` Binary or Libraries:
  - Diagnosis: Run `named` manually in the foreground with debug output: `sudo /usr/sbin/named -g -d 5 -u named` (substitute the account your distribution runs it as). The `-g` flag keeps the process in the foreground and sends all logging to stderr. If it fails with segmentation faults or library loading errors, the binary or its dependencies might be corrupt.
  - Fix: Reinstall the `bind` or `named` package. For example, on Debian/Ubuntu: `sudo apt-get update && sudo apt-get --reinstall install bind9`. On RHEL/CentOS: `sudo yum reinstall bind`. Then restart `named`.
  - Why it works: A corrupted executable or missing/corrupted shared libraries prevent `named` from running correctly, leading to premature exits and stale PID files.
After resolving these, the next error you might encounter is a zone-loading failure if the zone data itself has problems, or a network binding error if `named` can't acquire the ports it needs.