Argo Workflows can store output artifacts directly in Amazon S3, so you do not need to run a separate artifact repository service.
Let’s see how this looks in practice. Imagine you have a simple Argo Workflow that creates a file and wants to save it as an artifact.
    apiVersion: argoproj.io/v1alpha1
    kind: Workflow
    metadata:
      generateName: s3-artifact-example-
    spec:
      entrypoint: main
      templates:
        - name: main
          container:
            image: alpine:latest
            command: ["sh", "-c"]
            args:
              - echo "Hello from Argo!" > /tmp/hello.txt
          outputs:
            artifacts:
              - name: hello-artifact
                path: /tmp/hello.txt
This workflow, when executed, will produce a file named hello.txt in its container’s /tmp directory. The outputs.artifacts section tells Argo to treat this file as an artifact. By default, Argo Workflows will try to upload this artifact to a configured artifact repository.
To make Argo Workflows use S3 for storing artifacts, you need to configure the artifactRepository in your Argo Workflows controller’s configuration. This typically involves setting up an S3 bucket and providing the necessary credentials.
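Here’s a snippet of how the workflow-controller-configmap ConfigMap might be configured for S3 artifact storage. The controller reads the artifact repository from this ConfigMap rather than from a command-line flag. The key is the artifactRepository entry, specifically its s3 block. The namespace and region shown are placeholders.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: workflow-controller-configmap
      namespace: argo
    data:
      artifactRepository: |
        s3:
          bucket: my-argo-artifacts-bucket
          endpoint: s3.amazonaws.com
          region: us-east-1
          keyFormat: argo-workflows/{{workflow.name}}/{{pod.name}}
          useSDKCreds: true

In this example:
- my-argo-artifacts-bucket is the name of your S3 bucket.
- endpoint and region identify the S3 endpoint that Argo talks to.
- argo-workflows/{{workflow.name}}/{{pod.name}} is the keyFormat, a key prefix within the bucket. Argo substitutes its own {{...}} template tags to build paths dynamically: {{workflow.name}} and {{pod.name}} are replaced with the actual names of the workflow and its pods at runtime. This ensures that artifacts from different workflows and pods are organized and don’t overwrite each other.
- useSDKCreds: true tells Argo to obtain credentials through the AWS SDK’s default credential chain, such as an IAM role, rather than from a Kubernetes Secret.

Together, these settings place artifacts under s3://my-argo-artifacts-bucket/argo-workflows/, followed by the workflow name and the pod name.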
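If the cluster cannot rely on IAM roles, the credentials can instead come from a Kubernetes Secret. The following is a minimal sketch, not a definitive setup: the Secret name my-s3-credentials, its key names, and the argo namespace are hypothetical. The Secret generally has to live in the namespace where the workflow pods run.

```yaml
# Hypothetical Secret holding static AWS credentials (placeholder values).
apiVersion: v1
kind: Secret
metadata:
  name: my-s3-credentials
  namespace: argo
type: Opaque
stringData:
  accessKey: REPLACE_WITH_ACCESS_KEY_ID
  secretKey: REPLACE_WITH_SECRET_ACCESS_KEY
---
# Artifact repository settings that point at the Secret above.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo
data:
  artifactRepository: |
    s3:
      bucket: my-argo-artifacts-bucket
      endpoint: s3.amazonaws.com
      keyFormat: argo-workflows/{{workflow.name}}/{{pod.name}}
      accessKeySecret:
        name: my-s3-credentials
        key: accessKey
      secretKeySecret:
        name: my-s3-credentials
        key: secretKey
```

Static keys are simpler to set up but need rotating, which is one reason IAM roles are usually preferred on AWS.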
For Argo Workflows to write to S3, the pods that perform the upload need appropriate IAM permissions. The upload is carried out by the Argo executor that runs alongside your container in each workflow pod, so this is typically achieved by attaching an IAM role to the service account used by the workflow pods. The Argo server needs read access as well if artifacts are to be viewed through the UI. The IAM role must have permissions such as s3:PutObject, s3:GetObject, and s3:DeleteObject for the specified bucket.
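On Amazon EKS, one common way to attach such a role is IAM Roles for Service Accounts, where an annotation on the service account names the role. The sketch below assumes that mechanism. The service account name argo-workflow-runner, the argo namespace, and the role ARN are placeholders, and the service account still needs the usual Kubernetes RBAC permissions that the Argo executor requires.

```yaml
# Service account that workflow pods run as; the annotation links it to an IAM role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-workflow-runner          # hypothetical name
  namespace: argo                     # assumed namespace
  annotations:
    # Placeholder ARN: the role should allow s3:PutObject / s3:GetObject on the bucket.
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/argo-artifacts-writer
---
# The example workflow, now running its pods under that service account.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: s3-artifact-example-
  namespace: argo
spec:
  serviceAccountName: argo-workflow-runner
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args:
          - echo "Hello from Argo!" > /tmp/hello.txt
      outputs:
        artifacts:
          - name: hello-artifact
            path: /tmp/hello.txt
```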
When the workflow above runs, after the main template completes, Argo Workflows will:
- Inspect the outputs.artifacts for the main template.
- Find the hello.txt file at /tmp/hello.txt.
- Construct the S3 destination key using the configured artifact repository key template and the runtime values for {{workflow.name}} and {{pod.name}}.
- Upload hello.txt to that S3 location. By default, the executor packages the artifact as a gzipped tarball before uploading it.

The artifact can then be accessed at its S3 location, which will be visible in the Argo Workflows UI or obtainable via the Argo CLI.
The ability to use S3 directly simplifies your infrastructure by eliminating the need to manage a separate artifact repository service. It leverages existing AWS infrastructure and IAM for security and access control, making it a common choice for teams already invested in the AWS ecosystem.
The real power comes from dynamic path templating. The key template is not limited to workflow and pod names; it can reference other workflow-level variables, such as the namespace, the creation timestamp, labels, annotations, and parameters defined in your workflow, to create well-organized and searchable artifact layouts in S3.
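As an illustration, a key template can group artifacts by date as well as by workflow. This is a sketch under assumptions: it uses the keyFormat field of the artifactRepository entry in workflow-controller-configmap, the namespace is a placeholder, and the set of variables available may differ between Argo versions.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo                     # assumed namespace
data:
  artifactRepository: |
    s3:
      bucket: my-argo-artifacts-bucket
      endpoint: s3.amazonaws.com
      # Year/month/day of workflow creation, then workflow and pod names.
      keyFormat: "argo-workflows/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{workflow.name}}/{{pod.name}}"
```

A date-first layout like this makes it easier to apply S3 lifecycle rules or to list everything produced on a given day.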
Once artifacts are stored in S3, subsequent steps in your workflow can consume them as input artifacts, and other systems can read them from the bucket by referencing their S3 URIs.
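Within a single workflow, a later step usually does not need the S3 URI at all: it can refer to the earlier step’s output artifact by name, and Argo fetches it from the repository. The following sketch extends the earlier example; the pipeline and print-file template names and the message artifact name are made up for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: s3-artifact-passing-
spec:
  entrypoint: pipeline
  templates:
    - name: pipeline
      steps:
        # Step 1: produce the artifact (uploaded to the artifact repository).
        - - name: generate
            template: main
        # Step 2: consume it as an input artifact.
        - - name: consume
            template: print-file
            arguments:
              artifacts:
                - name: message
                  from: "{{steps.generate.outputs.artifacts.hello-artifact}}"
    - name: main
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args:
          - echo "Hello from Argo!" > /tmp/hello.txt
      outputs:
        artifacts:
          - name: hello-artifact
            path: /tmp/hello.txt
    - name: print-file
      inputs:
        artifacts:
          # Argo downloads the artifact and places it at this path before the container starts.
          - name: message
            path: /tmp/message.txt
      container:
        image: alpine:latest
        command: ["sh", "-c"]
        args:
          - cat /tmp/message.txt
```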