Comparing topology snapshots before and after deployments is crucial for verifying that your infrastructure changes haven’t inadvertently altered critical network paths or resource relationships.

Let’s say we have a simple application deployed across two Kubernetes clusters, cluster-a and cluster-b, with a service my-app in cluster-a that needs to reach a database my-db in cluster-b.

Initial State (Before Deployment)

Imagine we’re using a tool like kubectl to inspect our resources.

In cluster-a:

kubectl get service my-app -o yaml -n default

Output might show:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: my-app
  type: ClusterIP

And in cluster-b, for the database:

kubectl get service my-db -o yaml -n database

Output might show:

apiVersion: v1
kind: Service
metadata:
  name: my-db
  namespace: database
spec:
  ports:
  - port: 5432
    protocol: TCP
    targetPort: 5432
  selector:
    app: postgresql
  type: ClusterIP

For my-app in cluster-a to reach my-db in cluster-b, some form of cross-cluster networking has to be in place. Typical options are a service mesh with multi-cluster capabilities (such as Istio or Linkerd), a CNI plugin that handles cross-cluster communication, or something simpler like direct VPN tunnels.

Let’s assume a multi-cluster service mesh is in place. When my-app in cluster-a calls my-db, the name resolves to a Kubernetes Service, but the actual network traffic is intercepted by the service mesh’s sidecar proxy. This proxy then routes the traffic to the correct endpoint in cluster-b, potentially via a gateway or a direct mesh-to-mesh connection. The topology we’re concerned with here is therefore not just the set of Kubernetes API objects, but the effective network path.

Deployment Scenario

Now, let’s imagine a deployment happens in cluster-a. We might be updating the my-app deployment itself, or perhaps changing its associated Service object, or even updating the networking configuration that enables cross-cluster communication.

For instance, a common mistake might be to accidentally change the selector on the my-app service in cluster-a to something that no longer matches the pods running the application. Or, perhaps the cross-cluster networking configuration was modified, breaking the path to cluster-b.
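The selector mistake described above can be checked mechanically. The following is a minimal sketch, assuming you have already pulled the Service selector and the pod labels into plain dictionaries (for example from `kubectl get ... -o json`). The function names and the pod name `my-app-7d9f4` are hypothetical, and only equality-based selectors, the kind a Service uses, are handled.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if every key/value pair in the selector is present in the pod labels."""
    if not selector:
        # A Service with no selector does not select pods automatically.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def selected_pods(selector: dict, pods: dict) -> list:
    """Return the names of pods (name -> labels) that the selector would match."""
    return sorted(name for name, labels in pods.items() if selector_matches(selector, labels))


# Hypothetical pod inventory for the default namespace in cluster-a.
pods = {"my-app-7d9f4": {"app": "my-app", "pod-template-hash": "7d9f4"}}

print(selected_pods({"app": "my-app"}, pods))      # selector before the deployment
print(selected_pods({"app": "my-app-new"}, pods))  # selector after the accidental change
```

An empty result for the post-deployment selector is the signal that the Service has been cut off from its pods.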

Post-Deployment State (After Deployment)

After the deployment, we’d re-run our checks.

In cluster-a:

kubectl get service my-app -o yaml -n default

Suppose the selector was accidentally changed to app: my-app-new:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: my-app-new # <-- THIS CHANGED
  type: ClusterIP

If the pods still have the label app: my-app, this service will no longer route traffic to them. This failure is local to cluster-a, and it is visible in the Service manifest itself.
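Spotting a one-line change like this by eye is error-prone, so it can help to diff the two snapshots programmatically. The sketch below assumes the before and after manifests are already loaded as dictionaries (for instance with `json.load` on saved `kubectl get service my-app -o json` output). The helpers `flatten` and `diff_snapshots` are illustrative names, not part of any library.

```python
import copy

ABSENT = "<absent>"  # marker for a path that exists in only one snapshot


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {"spec.ports[0].port": 80, ...}."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def diff_snapshots(before, after):
    """Return {path: (old, new)} for every leaf value that differs."""
    b, a = flatten(before), flatten(after)
    return {
        path: (b.get(path, ABSENT), a.get(path, ABSENT))
        for path in sorted(b.keys() | a.keys())
        if b.get(path, ABSENT) != a.get(path, ABSENT)
    }


before = {
    "metadata": {"name": "my-app", "namespace": "default"},
    "spec": {
        "ports": [{"port": 80, "protocol": "TCP", "targetPort": 8080}],
        "selector": {"app": "my-app"},
        "type": "ClusterIP",
    },
}
after = copy.deepcopy(before)
after["spec"]["selector"]["app"] = "my-app-new"

print(diff_snapshots(before, after))
# {'spec.selector.app': ('my-app', 'my-app-new')}
```

With real snapshots you would likely want to drop volatile fields such as metadata.resourceVersion, metadata.uid, metadata.creationTimestamp, and metadata.managedFields before diffing, since they change on every update. Note that this simple flattening ignores empty dicts and lists.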

But let’s focus on the cross-cluster aspect. If the cross-cluster networking configuration was changed, the my-app service might still look correct in kubectl, but its ability to reach my-db in cluster-b is broken.

To compare topologies, we need to go beyond just kubectl get service. We need to understand how the system behaves from an end-to-end perspective.
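One simple behavioral check is a TCP probe run before and after the deployment from a pod in cluster-a. This is a minimal sketch using only the Python standard library. The function name `tcp_reachable` is hypothetical, and the hostname in the usage comment assumes the in-cluster DNS name (or its cross-cluster equivalent) resolves from where the probe runs.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage from a pod in cluster-a (hostname is an assumption):
#   tcp_reachable("my-db.database.svc.cluster.local", 5432)
```

Recording the result before and after a deployment gives a crude but direct signal. One caveat: when a sidecar proxy intercepts outbound traffic, the local proxy may accept the TCP connection even if the upstream is unreachable, so an application-level check (such as an actual database handshake) is more trustworthy in a mesh.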

Tools and Techniques for Comparison

  1. Service Mesh Observability: If you’re using a service mesh, this is your primary tool. Tools like Kiali (for Istio) or the Linkerd dashboard provide visualizations of your service graph.

    • Before: You’d capture a screenshot or export the service graph from Kiali/Linkerd showing my-app in cluster-a successfully connecting to my-db in cluster-b.
    • After: You’d capture the same graph. If the connection is broken, you’d see a red line, a missing edge, or a significant increase in error rates between the two services. The visualization directly shows the effective topology.
  2. Network Policy and Firewall Rules: For cross-cluster communication, especially if it’s not managed by a service mesh, you might have explicit network policies or firewall rules.

    • Before: You’d audit your NetworkPolicy objects in Kubernetes (if applicable) and any cloud provider firewall rules (e.g., AWS Security Groups, Azure Network Security Groups) that permit traffic between the clusters on the required ports (e.g., PostgreSQL’s 5432).
    • After: You’d re-audit. A common deployment error is to create new policies that inadvertently deny traffic that was previously allowed, or to forget to update rules when IP ranges change. For example, a new NetworkPolicy in cluster-b might restrict ingress on port 5432 to pods labeled app: database. Cross-cluster traffic typically arrives from a gateway or from an IP range outside the cluster (e.g., a new gateway IP for the cross-cluster connection), so it will not match that pod selector and will be blocked unless an ipBlock rule or a rule selecting the gateway pods allows it.
  3. DNS Resolution and Endpoints: Even if the service mesh or CNI handles routing, the underlying Kubernetes Service and Endpoints (or EndpointSlice) objects are still relevant.

    • Before: You’d check the Endpoints for my-db in cluster-b.
      kubectl get endpoints my-db -n database -o yaml
      
      This would show the actual IP addresses of the pods serving my-db. You’d also verify DNS resolution from cluster-a to my-db.database.svc.cluster.local (or its cross-cluster equivalent).
    • After: You’d check again. If the my-db deployment in cluster-b had issues (e.g., pods crashing, incorrect labels), the Endpoints object might be empty or list the pod IPs only under notReadyAddresses. If the cross-cluster DNS mechanism was broken, my-app pods might not be able to resolve the service name at all.
  4. Ingress/Egress Gateway Configuration: If your cross-cluster communication relies on ingress or egress gateways (common in service meshes or custom network setups), their configurations are critical.

    • Before: Examine the configuration of the gateway in cluster-a (for egress from my-app) and cluster-b (for ingress to my-db). This might involve custom Kubernetes resources like Gateway and VirtualService (Istio) or specific CNI configurations.
    • After: Check these configurations. A deployment might have accidentally altered a VirtualService rule that dictates how traffic leaving cluster-a destined for cluster-b is handled, or a Gateway resource that defines the entry point into cluster-b. For example, a change to an egress gateway’s IP address or port without a matching update on the ingress side in cluster-b (such as a listener or an allowlist) would break the connection.
  5. CNI Plugin State: If your cross-cluster connectivity is handled by your CNI, you might need to inspect the CNI’s specific control plane or configuration. This is highly dependent on the CNI. For example, some CNIs might expose their own CRDs or APIs to check network connectivity status between nodes or pods across clusters.
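Several of the checks above reduce to comparing two captured states. As one example, the Endpoints comparison from item 3 can be sketched as follows, assuming the before and after objects were saved from `kubectl get endpoints my-db -n database -o json` and loaded as dictionaries. The function names and the sample IP addresses are hypothetical.

```python
def ready_addresses(endpoints: dict) -> set:
    """Collect "ip:port" strings for ready addresses in a v1 Endpoints object."""
    result = set()
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        # Only "addresses" are ready; "notReadyAddresses" are deliberately skipped.
        for addr in subset.get("addresses") or []:
            for port in ports:
                result.add(f'{addr["ip"]}:{port}')
    return result


def compare_endpoints(before: dict, after: dict) -> dict:
    """Report which ready backends disappeared or appeared between two snapshots."""
    b, a = ready_addresses(before), ready_addresses(after)
    return {
        "removed": sorted(b - a),
        "added": sorted(a - b),
        "now_empty": bool(b) and not a,
    }


# Hypothetical snapshots: one database pod stops being ready after the deployment.
before = {"subsets": [{"addresses": [{"ip": "10.8.1.5"}, {"ip": "10.8.2.7"}],
                       "ports": [{"port": 5432}]}]}
after = {"subsets": [{"addresses": [{"ip": "10.8.1.5"}],
                      "notReadyAddresses": [{"ip": "10.8.2.7"}],
                      "ports": [{"port": 5432}]}]}

print(compare_endpoints(before, after))
```

A non-empty "removed" list is worth investigating, and "now_empty" flags the case where the Service has lost every ready backend.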

The Counterintuitive Truth About Topology

The actual network topology of a distributed system is often a complex, layered abstraction. What kubectl shows you are Kubernetes API objects, which mostly describe the desired state. The actual state of network reachability is determined by a combination of your CNI, service mesh, network policies, cloud provider firewalls, and the underlying network fabric. A "successful" deployment in Kubernetes can create all the right API objects, yet if a firewall rule was missed or a service mesh gateway configuration was subtly altered, the effective topology will have broken connections that are invisible from a simple kubectl get perspective.
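One way to make the effective topology comparable is to record it as a set of observed edges and diff the sets. This is a sketch under assumptions: the `Edge` type and the "cluster/namespace/service" naming are hypothetical, and how the edges are collected (a service graph export, reachability probes, flow logs) is left open.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One observed, working connection in the effective topology."""
    source: str
    target: str
    port: int


def diff_topology(before: set, after: set) -> dict:
    """Return the edges that were lost and gained between two snapshots."""
    return {"lost": sorted(before - after), "gained": sorted(after - before)}


# Hypothetical snapshots: the cross-cluster edge disappears after the deployment.
before = {Edge("cluster-a/default/my-app", "cluster-b/database/my-db", 5432)}
after = set()

print(diff_topology(before, after))
```

A lost edge with no corresponding change in the Kubernetes manifests points at one of the other layers: network policies, firewalls, gateways, or the CNI.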

Next Steps

After ensuring your application can reach its dependencies across clusters, the next common problem is ensuring that external clients can reach your application’s ingress points, which involves a similar process of verifying ingress controller configurations and associated load balancers or firewall rules.
