CircleCI’s Kubernetes orb makes deploying to your Kubernetes cluster a breeze, but sometimes things go sideways.
Fix: CircleCI Orb Failed to Deploy to Kubernetes
The kubernetes/deploy job in your CircleCI pipeline failed because the kubectl command, executed by the orb, couldn’t authenticate with your Kubernetes cluster. This is usually because the service account credentials provided to CircleCI are either invalid, missing necessary permissions, or haven’t been propagated correctly within the cluster.
Here are the most common reasons this happens and how to fix them:
- Incorrect Service Account Name or Namespace: The orb uses a service account to interact with your Kubernetes API. If the `kubernetes-namespace` or `kubernetes-service-account` parameters in your `.circleci/config.yml` don’t match the actual service account and its namespace in your cluster, `kubectl` will fail to find it.
  - Diagnosis: Check your `.circleci/config.yml` for the `kubernetes/deploy` job and note the `kubernetes-namespace` and `kubernetes-service-account` values. Then, in your Kubernetes cluster, run:

    ```shell
    kubectl get serviceaccount <your-service-account-name> -n <your-namespace>
    ```

    If this command returns a `NotFound` error such as `serviceaccounts "<your-service-account-name>" not found`, the name or the namespace is wrong.
  - Fix: Update the `kubernetes-namespace` and `kubernetes-service-account` parameters in your `config.yml` to match the exact name and namespace of a valid service account in your cluster. For example:

    ```yaml
    jobs:
      deploy_to_k8s:
        executor: kubernetes/docker
        steps:
          - kubernetes/deploy:
              actions: |
                kubectl apply -f k8s/deployment.yaml
              kubernetes-namespace: "production"
              kubernetes-service-account: "circleci-deployer"
    ```

    Ensure the `circleci-deployer` service account exists in the `production` namespace.
  - Why it works: This directly aligns the orb’s configuration with the actual resources available in your Kubernetes cluster, allowing `kubectl` to find and use the specified service account for authentication.
- Service Account Lacks Required Permissions (RBAC): Even if the service account exists, it needs specific Role-Based Access Control (RBAC) permissions to perform actions like `create`, `update`, and `delete` on Kubernetes resources (Deployments, Services, etc.).
  - Diagnosis: Check the Role(s) and RoleBinding(s) associated with your service account. RoleBindings cannot be filtered by subject with a field selector, so list them with their subjects and look for your service account in the `SERVICEACCOUNTS` column:

    ```shell
    # List the RoleBindings in the namespace together with their subjects
    kubectl get rolebindings -n <your-namespace> -o wide

    # Describe the associated Role to see permissions
    kubectl describe role <role-name-from-rolebinding> -n <your-namespace>
    ```

    Look for resources like `deployments`, `pods`, `services`, `ingresses`, etc., with verbs like `create`, `get`, `list`, `watch`, `update`, `patch`, `delete`. If the ones your deployment needs are missing, the deployment will fail.
  - Fix: Create or update a `Role` and `RoleBinding` to grant the necessary permissions. For example, a `Role` might look like this:

    ```yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      namespace: production
      name: circleci-deployer-role
    rules:
      - apiGroups: ["apps", ""] # "" indicates the core API group
        resources: ["deployments", "services", "pods", "configmaps", "secrets"]
        verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    ```

    And a `RoleBinding` to link it to your service account:

    ```yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: circleci-deployer-binding
      namespace: production
    subjects:
      - kind: ServiceAccount
        name: circleci-deployer
        namespace: production
    roleRef:
      kind: Role
      name: circleci-deployer-role
      apiGroup: rbac.authorization.k8s.io
    ```

    Apply these with `kubectl apply -f rbac.yaml`.
  - Why it works: This explicitly grants the service account the granular permissions required by `kubectl` to manage the specific Kubernetes resources your deployment modifies.
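The RBAC check above can also be run from the service account's point of view. The sketch below wraps `kubectl auth can-i` with impersonation (`--as`) and prints every verb/resource pair that is denied. The function name `missing_permissions` and the `verb:resource` argument format are made up for this example, and it assumes your own user is allowed to impersonate service accounts in the cluster.

```shell
# Print each "<verb> <resource>" pair the service account is NOT allowed to perform.
# Usage: missing_permissions <namespace> <service-account> <verb:resource>...
# Example: missing_permissions production circleci-deployer create:deployments patch:services
missing_permissions() {
  local namespace="$1" account="$2"
  shift 2
  local pair verb resource
  for pair in "$@"; do
    verb="${pair%%:*}"      # text before the first colon
    resource="${pair#*:}"   # text after the first colon
    # `kubectl auth can-i` exits non-zero when the action is denied.
    if ! kubectl auth can-i "$verb" "$resource" \
        --as="system:serviceaccount:${namespace}:${account}" \
        -n "$namespace" >/dev/null 2>&1; then
      echo "${verb} ${resource}"
    fi
  done
}
```

Empty output means every listed action is permitted. Note that a failure to reach the cluster is also reported as a denial here, so confirm connectivity first if everything shows up as missing.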
- Kubernetes Token Expiration or Revocation: The service account token used by CircleCI can expire or be revoked. Tokens issued on request (for example with `kubectl create token`) have a limited lifetime, while legacy Secret-based tokens stay valid until the Secret or the service account is deleted. In either case, authentication fails once the token is no longer valid.
  - Diagnosis: This is harder to diagnose directly from CircleCI logs. A common symptom is intermittent failures or failures occurring after a long period of successful deployments. If your cluster uses Secret-based tokens, you can check the Secrets in your namespace for the service account’s token:

    ```shell
    kubectl get secret <service-account-name>-token-<token-suffix> -n <your-namespace> -o jsonpath='{.data.token}' | base64 --decode
    ```

    If this command fails or the decoded token is empty, the Secret might be misconfigured or removed. On Kubernetes 1.24 and later, token Secrets are no longer created automatically for service accounts, so a missing Secret is expected there unless you created one yourself.
  - Fix: Clusters older than Kubernetes 1.24 create a token Secret automatically for each service account. Newer clusters do not, so issue a fresh token explicitly with `kubectl create token <your-service-account-name> -n <your-namespace>`. Alternatively, you can delete and recreate the service account:

    ```shell
    kubectl delete serviceaccount <your-service-account-name> -n <your-namespace>
    kubectl create serviceaccount <your-service-account-name> -n <your-namespace>
    # Re-apply your Role and RoleBinding if you deleted them
    ```

    Then update the token stored in CircleCI so the orb authenticates with the new credentials. If you’re provisioning the cluster and service account via CI, it’s best practice to use `kubernetes/create-GKE-cluster` or similar orb commands, as they handle token management.
  - Why it works: A newly issued token replaces the expired or revoked one, so CircleCI has valid credentials for the service account again.
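To confirm whether a token has actually expired, you can read the `exp` claim from its JWT payload. The helper below is a minimal sketch that uses only `cut`, `tr`, `sed`, and `base64`. The function names `jwt_payload` and `token_expiry` are invented for this example, and it assumes the token is a standard three-part JWT. Legacy Secret-based tokens normally carry no `exp` claim, in which case `token_expiry` prints nothing.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
jwt_payload() {
  # Convert base64url to standard base64.
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 --decode
}

# Print the expiry time (seconds since the Unix epoch) if the token has one.
token_expiry() {
  jwt_payload "$1" |
    sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example (requires kubectl 1.24+ and a reachable cluster):
#   token="$(kubectl create token circleci-deployer -n production --duration=1h)"
#   token_expiry "$token"
```

Comparing the printed value with `date +%s` shows whether the token is still within its lifetime.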
- Network Policy Blocking CircleCI: If your Kubernetes cluster has Network Policies configured, they might be preventing your CI traffic from reaching the Kubernetes API server. Network Policies only govern traffic to and from pods, so this mainly applies when your CircleCI runners run as pods inside the cluster. With cloud-hosted CircleCI, connection timeouts more often come from firewall rules or IP allowlists in front of the API server.
  - Diagnosis: Look for errors in CircleCI logs that suggest a connection timeout or refusal when trying to reach the Kubernetes API endpoint. Check your cluster’s network policies:

    ```shell
    kubectl get networkpolicy -n <your-namespace>
    ```

    If policies exist, examine them to see if they restrict egress from the runner pods to the Kubernetes API server’s IP/port.
  - Fix: Modify your Network Policies to allow egress traffic from the runner pods to the Kubernetes API server’s IP and port (usually 443). This often involves adding a new `egress` rule to an existing policy or creating a new one. Example:

    ```yaml
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-circleci-egress
      namespace: production
    spec:
      podSelector: {} # Applies to all pods in the namespace
      policyTypes:
        - Egress
      # Once an egress policy selects a pod, only explicitly allowed egress is permitted.
      egress:
        - to:
            - ipBlock:
                # Replace with the CIDR of your Kubernetes API server endpoint
                cidr: 192.168.0.0/16
          ports:
            - protocol: TCP
              port: 443
    ```
  - Why it works: This ensures that network traffic from the CI runner pods can reach the Kubernetes API server, as allowed by the cluster’s security policies.
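Before changing any policy, it helps to separate a network problem from an authentication problem. The sketch below reads the API server URL from the active kubeconfig and probes its `/version` endpoint with `curl`. The function name `api_reachable` is invented for this example. It passes `--insecure` because it only tests reachability, not certificate validity, and it treats any HTTP response (including 401 or 403) as "reachable".

```shell
# Return 0 if the API server answers at all, or curl's exit code otherwise
# (for example 7 = connection refused, 28 = timed out).
# Usage: api_reachable [timeout-seconds]
api_reachable() {
  # Server URL of the cluster selected by the current kubeconfig context.
  server="$(kubectl config view --minify \
    -o jsonpath='{.clusters[0].cluster.server}')" || return 2
  curl --silent --insecure --max-time "${1:-5}" \
    --output /dev/null "${server}/version"
}
```

Run it from the same environment as the failing job: a timeout or refused connection points at network rules, while a successful probe followed by a failing `kubectl` command points back at credentials or RBAC.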
- Incorrect Kubernetes Context or Credentials (if not using Service Account directly): If you’re not explicitly providing a service account and instead rely on CircleCI’s ability to pick up a context (e.g., from a `kubeconfig` file), that context might be misconfigured or outdated.
  - Diagnosis: Ensure your CircleCI project has a `kubeconfig` file stored as a secret and that the context within that file points to the correct cluster and uses valid credentials. Test the `kubeconfig` locally:

    ```shell
    export KUBECONFIG=/path/to/your/kubeconfig.yaml
    kubectl config current-context
    kubectl get pods -n <your-namespace>
    ```
  - Fix: Update your `kubeconfig` secret in CircleCI with a valid, up-to-date configuration. Ensure the `cluster.server` URL is correct and that the `user` section contains valid client certificates or tokens.

    ```yaml
    # In your CircleCI .circleci/config.yml
    jobs:
      deploy_to_k8s:
        docker:
          - image: cimg/base:stable
        steps:
          - checkout
          - restore_cache:
              keys:
                - kubeconfig-{{ checksum "kubeconfig.yaml" }}
          - run:
              name: Set up KUBECONFIG
              command: |
                mkdir -p ~/.kube
                echo "$KUBECONFIG_CONTENTS" > ~/.kube/config
                chmod 600 ~/.kube/config
          # ... use kubernetes/deploy orb or kubectl commands
    ```

    Make sure `KUBECONFIG_CONTENTS` is a project environment variable containing the full `kubeconfig` content.
  - Why it works: Providing a valid `kubeconfig` file allows `kubectl` to establish a direct, authenticated connection to the correct Kubernetes cluster endpoint.
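Multi-line values can be awkward to store in an environment variable, so a common variation is to store the `kubeconfig` base64-encoded on a single line and decode it inside the job. The following is a minimal sketch of that round trip. The function names and the `KUBECONFIG_B64` variable are hypothetical and not something the orb defines.

```shell
# Encode a kubeconfig file as a single line of base64 (run locally,
# then paste the output into a CircleCI environment variable).
encode_kubeconfig() {
  base64 < "$1" | tr -d '\n'
}

# Decode the base64 string back into a kubeconfig file (run in the job).
# Usage: write_kubeconfig <base64-string> [target-path]
write_kubeconfig() {
  target="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$target")"
  printf '%s' "$1" | base64 --decode > "$target"
  chmod 600 "$target"   # keep the credentials readable by the owner only
}

# Example inside a CircleCI run step, assuming KUBECONFIG_B64 is set:
#   write_kubeconfig "$KUBECONFIG_B64"
#   kubectl config current-context
```

Encoding does not protect the credentials. It only avoids quoting and newline problems, so the variable should still be treated as a secret.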
After fixing these issues, your next hurdle will likely be dealing with image pull secrets for private container registries if your deployment manifests specify them.