A Drone CI pipeline step is failing because the `drone-runner` process, which executes build jobs, cannot communicate with the Drone server API. From the runner's perspective, this usually shows up as a connection refused or timeout error.
Common Causes and Fixes
- Network Connectivity Between Runner and Server
  - Diagnosis: From the host where the `drone-runner` is running, try to `curl` the Drone server's API endpoint.
    ```shell
    curl -v http://drone.example.com:8000/api/v1/ping
    ```
    (Replace `drone.example.com:8000` with your Drone server's actual address and port.)
  - Fix: Ensure that the `drone-runner` host can reach the Drone server's IP address and port. This might involve:
    - Firewall Rules: Open the Drone server's port (default 8000) on the server's firewall.
    - DNS Resolution: If using a hostname, verify that DNS resolves it correctly.
    - Network ACLs/Security Groups: If running in a cloud environment, ensure your network security groups or ACLs allow ingress traffic on port 8000 from the `drone-runner`'s IP.
  - Why it works: The runner needs to establish a TCP connection to the server's API to receive job instructions and report status. If this connection is blocked or fails, the runner cannot proceed.
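
When `curl` is not installed on the runner host, the raw TCP connection described above can still be tested. The following is a minimal sketch, assuming `bash` and the coreutils `timeout` command are available; the function names `check_tcp` and `report_tcp` are hypothetical helpers, not part of Drone.

```shell
# check_tcp HOST PORT
# Succeeds if a TCP connection opens within 3 seconds. It relies on bash's
# /dev/tcp pseudo-device, so it only needs bash and coreutils' timeout.
check_tcp() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# report_tcp HOST PORT
# Prints a one-line verdict and returns non-zero when the port is unreachable.
report_tcp() {
  if check_tcp "$1" "$2"; then
    echo "reachable: $1:$2"
  else
    echo "unreachable: $1:$2"
    return 1
  fi
}
```

For example, `report_tcp drone.example.com 8000` run from the runner host should print `reachable: drone.example.com:8000` once firewall rules, DNS, and security groups are correct. A failure here points at the network path rather than at Drone itself.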
- Incorrect Drone Server Address/Port in Runner Configuration
  - Diagnosis: Check the `DRONE_SERVER` environment variable or configuration file for the `drone-runner`. Variable names differ between runner versions (Drone 1.x runners, for instance, read the address from `DRONE_RPC_HOST` and `DRONE_RPC_PROTO`), so confirm the names your runner version documents.
    ```shell
    # Example for docker/kubernetes runner
    echo $DRONE_SERVER
    # Or check the runner's configuration file if applicable
    ```
  - Fix: Update the `DRONE_SERVER` environment variable to the correct address and port of your Drone server. For example, if your Drone server is at `http://192.168.1.100:8000`, set:
    ```shell
    export DRONE_SERVER=http://192.168.1.100:8000
    ```
    If using HTTPS, ensure the scheme is `https` and the port is correct (usually 443).
  - Why it works: The runner uses this URL to find and communicate with the Drone server. An incorrect URL means it's trying to connect to the wrong place.
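
A malformed URL (missing scheme, wrong port) is easy to overlook by eye. The sketch below splits a server URL into scheme, host, and port so each part can be checked separately. `parse_server_url` is a hypothetical helper written for this article; it assumes plain `http`/`https` URLs and does not handle IPv6 literals or embedded credentials.

```shell
# parse_server_url URL
# Prints "scheme host port" for an http(s) URL, filling in the default port
# (80 or 443) when none is given. Returns 1 for anything it cannot parse.
parse_server_url() {
  url="$1"
  case "$url" in
    http://*)  scheme=http;  default_port=80 ;;
    https://*) scheme=https; default_port=443 ;;
    *) echo "missing or unsupported scheme: $url" >&2; return 1 ;;
  esac
  rest="${url#*://}"       # drop the scheme
  hostport="${rest%%/*}"   # drop any path component
  case "$hostport" in
    *:*) host="${hostport%%:*}"; port="${hostport##*:}" ;;
    *)   host="$hostport";       port="$default_port" ;;
  esac
  if [ -z "$host" ]; then
    echo "missing host: $url" >&2
    return 1
  fi
  echo "$scheme $host $port"
}
```

Running `parse_server_url "$DRONE_SERVER"` on the runner host shows exactly which host and port the configured value resolves to, which can then be compared against the address the server actually listens on.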
- Drone Server Not Running or Unhealthy
  - Diagnosis: Check the status of the Drone server process or container.
    ```shell
    # If running as a systemd service
    sudo systemctl status drone-server
    # If running in Docker
    docker ps | grep drone-server
    # If running in Kubernetes
    kubectl get pods -l app=drone-server -n drone
    ```
    Also, check the Drone server logs for any errors.
  - Fix: Start or restart the Drone server. If it's crashing, investigate the server logs for the root cause. This might involve increasing resource limits (CPU/memory) if the server is overloaded.
  - Why it works: The runner is a client; it needs a healthy server to connect to. If the server is down, the connection will inevitably fail.
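
After a restart, the server may need a few seconds before it accepts connections, so a single probe can give a false negative. One way to handle that is a small retry loop. The `retry` function below is a generic, hypothetical helper, not a Drone feature; the attempt count and delay are arbitrary choices.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed, sleeping
# DELAY_SECONDS between tries. Returns 0 on the first success, 1 otherwise.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry 10 3 curl -fsS -o /dev/null http://drone.example.com:8000/api/v1/ping` probes the endpoint used earlier up to ten times, three seconds apart, and fails only if the server never answers successfully.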
- TLS/SSL Certificate Issues (for HTTPS)
  - Diagnosis: If your Drone server uses HTTPS, check the `DRONE_TLS_SECRET` and `DRONE_RPC_SECRET` environment variables on both the server and the runner configuration. Also, verify the `DRONE_SERVER` URL in the runner configuration uses `https`.
  - Fix:
    - Matching Secrets: Ensure `DRONE_RPC_SECRET` is identical on both the Drone server and the `drone-runner`. This secret is used for authentication between them.
    - Valid Certificate: If the Drone server is using a self-signed certificate and the runner is not configured to trust it, the connection will fail. You can do one of the following:
      - Configure the runner to trust the CA that signed the server's certificate.
      - For testing/simplicity, disable TLS verification on the runner (not recommended for production): Set `DRONE_SKIP_VERIFY=true` in the runner's environment variables.
      - Use a valid, trusted certificate for the Drone server.
  - Why it works: The RPC secret authenticates the runner to the server. TLS verification ensures the runner is talking to the actual Drone server and not an imposter. If these fail, the connection is dropped.
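
To tell a TLS failure apart from a plain network failure, `curl`'s exit code is often enough. The sketch below maps the documented `curl` exit codes most relevant here to the causes discussed in this article. `explain_curl_exit` is a hypothetical helper, and the wording of each message is this article's own summary rather than `curl` output.

```shell
# explain_curl_exit CODE
# Translates a curl exit status into the likely category of failure.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns: could not resolve host" ;;
    7)  echo "network: failed to connect (refused or unreachable)" ;;
    28) echo "network: operation timed out" ;;
    35) echo "tls: handshake failed" ;;
    60) echo "tls: server certificate not trusted" ;;
    *)  echo "other: see the curl manual for exit code $1" ;;
  esac
}
```

A typical use is `curl -sS -o /dev/null https://drone.example.com/; explain_curl_exit $?`. A result of `tls: server certificate not trusted` suggests the runner host does not trust the CA that signed the server's certificate, whereas the `dns` and `network` results point back to the connectivity checks.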
- Runner Resource Constraints (CPU/Memory/Disk)
  - Diagnosis: Check the resource utilization on the host where the `drone-runner` is running. Look for high CPU, low memory, or disk space exhaustion.
    ```shell
    # On Linux
    top
    df -h
    ```
    Also, check the `drone-runner` logs for any resource-related errors.
  - Fix: Allocate more resources (CPU, RAM) to the `drone-runner` process or the host it's running on. Ensure sufficient disk space is available for build artifacts and Docker images.
  - Why it works: A starved runner might not be able to establish or maintain network connections, or it might crash before it can properly communicate with the server.
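
Disk exhaustion in particular is easy to check from a script. The following is a rough sketch that reads the usage percentage from POSIX-format `df` output; `disk_used_pct` and `disk_alert` are hypothetical helper names, and any threshold passed in is an assumption to tune for your host.

```shell
# disk_used_pct PATH
# Prints the usage percentage (without the % sign) of the filesystem
# that holds PATH, taken from the Capacity column of `df -P`.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_alert PATH LIMIT
# Succeeds (returns 0) when usage is at or above LIMIT percent.
disk_alert() {
  used="$(disk_used_pct "$1")"
  [ "$used" -ge "$2" ]
}
```

For example, `disk_alert /var/lib/docker 90 && echo "disk nearly full"` warns when the filesystem holding Docker's data directory (assumed here to be the default `/var/lib/docker`) is at least 90% full.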
- Incorrect `DRONE_RPC_SECRET` Configuration
  - Diagnosis: Verify the `DRONE_RPC_SECRET` value in the `drone-runner` configuration matches the `DRONE_RPC_SECRET` value configured for the Drone server.
  - Fix: Ensure the `DRONE_RPC_SECRET` environment variable or configuration setting is identical for both the Drone server and all registered `drone-runner`s. For example, if set in Docker Compose:
    ```yaml
    # drone-server service
    environment:
      - DRONE_RPC_SECRET=your-super-secret-key

    # drone-runner service
    environment:
      - DRONE_RPC_SECRET=your-super-secret-key
    ```
  - Why it works: This secret acts as a shared key for secure communication and authentication between the runner and the server. Mismatched secrets prevent them from trusting each other.
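
Comparing secrets by pasting them into a terminal or chat is risky, and stray whitespace is hard to spot by eye. One option is to compare short fingerprints instead. The sketch below assumes `sha256sum` from GNU coreutils is installed; `secret_fingerprint` is a hypothetical helper, and the 12-character length is an arbitrary choice.

```shell
# secret_fingerprint VALUE
# Prints the first 12 hex characters of the SHA-256 digest of VALUE.
# printf '%s' avoids appending a newline, so only the value itself is hashed.
secret_fingerprint() {
  printf '%s' "$1" | sha256sum | cut -c1-12
}
```

Running `secret_fingerprint "$DRONE_RPC_SECRET"` in the server's environment and again in the runner's should print the same string. If the two differ, the configured values differ, possibly only by a trailing space or newline.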
After addressing these, the next error you might encounter is a resource-limit-exceeded failure in a pipeline step, which can occur if the runner still has insufficient resources for the actual build jobs.