Argo Workflows isn’t just a Kubernetes-native workflow engine; it’s a declarative state machine for your distributed applications, where the "state" is the successful completion of a step.

Let’s see it in action. Imagine you need to process a batch of images: resize them, apply a filter, and then upload the results.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: image-processing-
spec:
  entrypoint: main
  templates:
  - name: main
    dag:
      tasks:
      - name: download-images
        template: download
      - name: resize-images
        template: resize
        dependencies: [download-images]
      - name: apply-filter
        template: filter
        dependencies: [resize-images]
      - name: upload-results
        template: upload
        dependencies: [apply-filter]

  - name: download
    container:
      image: alpine:latest
      command: ["sh", "-c"]
      args: ["echo 'Downloading images...' && sleep 5 && echo 'Done downloading.'"]

  - name: resize
    container:
      image: alpine:latest
      command: ["sh", "-c"]
      args: ["echo 'Resizing images...' && sleep 7 && echo 'Done resizing.'"]

  - name: filter
    container:
      image: alpine:latest
      command: ["sh", "-c"]
      args: ["echo 'Applying filter...' && sleep 6 && echo 'Done filtering.'"]

  - name: upload
    container:
      image: alpine:latest
      command: ["sh", "-c"]
      args: ["echo 'Uploading results...' && sleep 4 && echo 'Done uploading.'"]

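Because this manifest uses generateName rather than a fixed name, you submit it with argo submit or kubectl create; kubectl apply rejects manifests that rely on generateName. Submitting creates a Workflow resource. The Argo workflow controller watches for these resources and translates the dag definition into Pods, creating each one only after the tasks it depends on have succeeded. Here the dependencies form a chain, so the four Pods run one after another. Each Pod is a single step in your workflow, executing a defined container. The Workflow resource itself tracks the overall state in status.phase, moving from Pending to Running and ending in Succeeded, Failed, or Error.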

The core problem Argo Workflows solves is managing complex, multi-step processes that need to be automated, reliable, and observable. Think CI/CD pipelines, data processing jobs, machine learning model training, or infrastructure provisioning. It moves the logic of "what to do next" from imperative scripts scattered across your infrastructure to a declarative, auditable definition within Kubernetes.

You control the workflow’s execution through its lifecycle. You can:

  • Trigger workflows: Manually via argo submit or automatically via eventing (e.g., Git commits, S3 object creation).
  • Monitor progress: Using the argo list command, the Argo UI, or by inspecting the Workflow resource status.
  • Debug failures: By examining logs of individual steps (Pods) and the Workflow status.
  • Parameterize and reuse: Define templates that accept inputs and produce outputs, making them reusable across different workflows.
  • Handle retries and error conditions: Configure steps to automatically retry on failure or execute specific error-handling paths.
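As a rough illustration of the "parameterize and reuse" point, the sketch below gives the resize template an input parameter and has the DAG task supply a value. The parameter name width and the value 800 are assumptions made up for this example; they are not part of the workflow above.

```yaml
# Sketch only: "width" is a hypothetical parameter name.
- name: main
  dag:
    tasks:
    - name: resize-images
      template: resize
      arguments:
        parameters:
        - name: width
          value: "800"          # the caller supplies the value

- name: resize
  inputs:
    parameters:
    - name: width               # declared input; must be provided by the caller
  container:
    image: alpine:latest
    command: ["sh", "-c"]
    # {{inputs.parameters.width}} is substituted before the Pod starts
    args: ["echo 'Resizing images to {{inputs.parameters.width}}px wide...'"]
```

Any other task, in this or another workflow, can call the same template with a different width.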

The DAG (Directed Acyclic Graph) template is powerful for defining parallel execution paths. In the example above, resize-images runs only after download-images completes successfully. If you had multiple independent processing steps after download-images, you could define them to run concurrently by having them all depend only on download-images and not on each other. Argo creates each task's Pod only once its dependencies are satisfied; the Kubernetes scheduler then places that Pod on a node.
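A minimal sketch of that fan-out pattern follows. The extract-metadata task and its metadata template are hypothetical additions for illustration and would need their own template definition.

```yaml
# Sketch only: extract-metadata / metadata are hypothetical names.
- name: main
  dag:
    tasks:
    - name: download-images
      template: download
    # Both tasks below depend only on download-images,
    # so Argo is free to run them at the same time.
    - name: resize-images
      template: resize
      dependencies: [download-images]
    - name: extract-metadata
      template: metadata
      dependencies: [download-images]
    # Fan-in: starts only after both parallel branches have succeeded.
    - name: upload-results
      template: upload
      dependencies: [resize-images, extract-metadata]
```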

A common pattern for handling sensitive information or configuration is to use Kubernetes Secrets or ConfigMaps and mount them into your workflow steps. For instance, an upload step might need S3 credentials. You’d define a Secret in Kubernetes, and then in your upload template, specify:

  - name: upload
    container:
      image: amazon/aws-cli:latest
      command: ["aws", "s3", "cp", "/data/results.txt", "s3://my-bucket/results/"]
      volumeMounts:
      - name: aws-credentials
        mountPath: /root/.aws/credentials
        subPath: credentials
    volumes:
    - name: aws-credentials
      secret:
        secretName: my-aws-secret
        defaultMode: 0400

The subPath: credentials setting mounts just the credentials key of your my-aws-secret Kubernetes Secret as a file at /root/.aws/credentials, the location the AWS CLI checks when the container runs as root. The defaultMode: 0400 ensures the credential file is only readable by the owner, which is good practice.
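For reference, a Secret that would fit this mount could look roughly like the sketch below. It assumes the Secret lives in the same namespace as the workflow and stores the whole AWS credentials file under a key named credentials; the angle-bracket values are placeholders, not real settings.

```yaml
# Sketch only: placeholder values, to be replaced with real credentials.
apiVersion: v1
kind: Secret
metadata:
  name: my-aws-secret           # must match secretName in the template's volume
type: Opaque
stringData:
  # The key name must match the subPath used in the volumeMount.
  credentials: |
    [default]
    aws_access_key_id = <your-access-key-id>
    aws_secret_access_key = <your-secret-access-key>
```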

When a workflow fails, the Workflow resource reports a status.phase of Failed (or Error). If you set spec.activeDeadlineSeconds, Argo also fails a workflow that runs longer than that limit. You can define resilience strategies on individual steps with retryStrategy, which lets you specify limit, backoff, and retryPolicy (Always, OnFailure, OnError, or OnTransientError). This is crucial for transient network issues or temporary service unavailability in your workflow steps.
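As a sketch of what a retryStrategy might look like on the upload template, consider the fragment below. The specific numbers (3 retries, 10 second initial delay, and so on) are illustrative assumptions, not recommendations.

```yaml
# Sketch only: the limits and durations are arbitrary example values.
- name: upload
  retryStrategy:
    limit: "3"                  # at most 3 retries after the first attempt
    retryPolicy: OnFailure      # retry when the main container exits non-zero
    backoff:
      duration: "10s"           # wait before the first retry
      factor: "2"               # multiply the wait on each subsequent retry
      maxDuration: "5m"         # stop retrying once this much time has passed
  container:
    image: alpine:latest
    command: ["sh", "-c"]
    args: ["echo 'Uploading results...' && sleep 4 && echo 'Done uploading.'"]
```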

The concept of "artifact passing" is fundamental to how data flows between steps. When a step produces output that another step needs, it’s typically uploaded to an artifact repository (like S3, GCS, or MinIO) and then referenced by the downstream step. Argo Workflows abstracts this by letting you declare outputs.artifacts in a template, each with a path to a file or directory inside the container (and optionally an archive strategy) that is uploaded when the step finishes. Downstream steps declare inputs.artifacts and receive those files through a reference to the outputs of the earlier step.
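The fragment below sketches artifact passing between two DAG tasks. It assumes an artifact repository is already configured for the cluster; the artifact name resized and the file paths are hypothetical choices for this example.

```yaml
# Sketch only: requires a configured artifact repository.
- name: main
  dag:
    tasks:
    - name: resize-images
      template: resize
    - name: apply-filter
      template: filter
      dependencies: [resize-images]
      arguments:
        artifacts:
        - name: resized
          # Reference the artifact produced by the upstream task
          from: "{{tasks.resize-images.outputs.artifacts.resized}}"

- name: resize
  container:
    image: alpine:latest
    command: ["sh", "-c"]
    args: ["echo 'resized image data' > /tmp/resized.txt"]
  outputs:
    artifacts:
    - name: resized
      path: /tmp/resized.txt    # file uploaded after the step finishes

- name: filter
  inputs:
    artifacts:
    - name: resized
      path: /tmp/input.txt      # where the artifact is placed before the container starts
  container:
    image: alpine:latest
    command: ["sh", "-c"]
    args: ["cat /tmp/input.txt"]
```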

The next logical step is understanding how to orchestrate complex, stateful applications beyond simple linear or DAG flows, which leads to exploring steps templates and recursive workflow definitions.
