Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

cargo-athena

Write a normal Rust program. Get an Argo Workflow.

cargo-athena compiles ordinary, annotated Rust into Argo Workflows YAML — and ships your compiled binary so every step runs your real code.

use cargo_athena::{workflow, container};

#[workflow]
fn pipeline() {
    let raw = fetch("https://example.com/data".to_string());   // a #[container]
    let clean = transform(raw, 3);
    publish(clean);                                            // a #[container]
}

#[container(image = "ghcr.io/acme/app:latest")]
fn transform(data: String, factor: i64) -> String {
    format!("{data} x{factor}")          // this actually runs in the pod
}

That #[workflow] becomes an Argo WorkflowTemplate whose DAG wires fetch → transform → publish by their data dependencies. transform becomes a container template of its own. In-pod, your binary deserializes data and factor, runs the function body, and serializes the result for the next step.

# `cargo athena emit` → one WorkflowTemplate per fn, wired by name:
kind: WorkflowTemplate              # the #[workflow] is a DAG
metadata: { name: app-pipeline }
spec:
  templates:
    - dag:
        tasks:
          - { name: fetch }
          - { name: transform, dependencies: [fetch] }   # data dep → edge
          - { name: publish,   dependencies: [transform] }
# …plus one WorkflowTemplate per #[container]: your image, your binary,
# and a tiny bootstrap that runs the right function.

Why

  • No YAML. The workflow is the program. Refactor with the compiler, not a templating language.
  • Type-checked data flow. Passing the wrong type between steps, a missing struct field, or consuming a step that returns nothing is a compile error — caught long before a cluster ever sees it.
  • Composable. A workflow is a Rust type. Reference it from another crate to force-link it — workflows compose across modules and crates with no registry or hand-run codegen.
  • Real Rust in any image. Your binary is fetched at runtime, so each step runs your Rust on top of any image you pick — a vendor’s postgres:16, a CUDA base, your team’s tooling image — with no custom Dockerfile per step. The image just needs sh and uname.

How it fits together

You writecargo-athena produces
#[workflow] fnan Argo WorkflowTemplate (a DAG, or sequential steps)
#[container] fnan Argo WorkflowTemplate (a container step) that runs your real Rust
#[fragment] fna plain helper that carries pod resources into its callers
fn main()the entrypoint into your workflow binary

Ship and run it in two commands: cargo athena publish (cross-compile

  • upload the binary), then cargo athena submit <workflow> (registers the templates and starts the run) — see the CLI.

Read Getting Started to go hands-on, then Core Concepts for the mental model. The #[workflow] and #[container] pages are the complete feature reference.

Getting Started

From nothing to a running workflow — assuming you have an S3-compatible bucket and a reachable Argo cluster.

Install

cargo install cargo-athena                      # the `cargo athena` subcommand
cargo add cargo-athena --no-default-features    # the library, in your workflow crate

# …or via Nix:
nix profile install github:mostlymaxi/cargo-athena
nix run github:mostlymaxi/cargo-athena -- athena …

⚠️ Library users: --no-default-features. A workflow crate needs only the macros + runtime; the default cli feature pulls a heavy CLI tree (kube, reqwest, tokio, …) it doesn’t use.

emit needs nothing but an athena.toml. publish also needs the Zig cross toolchain: cargo install cargo-zigbuild + zig (cargo athena build checks and tells you what’s missing).

A tiny pipeline

Three containers in a chain — data flow becomes the DAG. Source · Emitted YAML

use cargo_athena::{container, workflow};

#[workflow]
fn pipeline() {
    let raw = fetch("https://example.com/data".to_string());
    let summary = summarize(raw, 3);
    publish(summary);
}

#[container(image = "ghcr.io/acme/app:latest")]
fn fetch(url: String) -> String {
    format!("data-from:{url}")
}

#[container]
fn summarize(data: String, top_n: i64) -> String {
    format!("top-{top_n}:{data}")
}

#[container]
fn publish(report: String) {
    println!("publishing {report}");
}

fn main() {
    cargo_athena::entrypoint::<pipeline>();
}

Add an athena.toml at or above your crate (found by walking up, like Cargo.toml):

[artifact_repository.s3]
endpoint = "s3.amazonaws.com"
bucket   = "my-bucket"
region   = "us-east-1"
access_key_secret = { name = "my-s3", key = "accessKey" }
secret_key_secret = { name = "my-s3", key = "secretKey" }

[artifact]
key = "athena/{crate}/{version}/{bin}.tar.gz"

[bootstrap]
targets = ["x86_64-unknown-linux-musl", "aarch64-unknown-linux-musl"]

Ship it

cargo athena emit                                    # inspect the YAML, no infra needed
cargo athena publish                                 # cross-compile + upload the binary
cargo athena submit cargo-athena-example-getting-started-pipeline

submit type-checks args, confirms the binary is uploaded, registers the WorkflowTemplates (y/N on drift), creates the run, and prints its name. Credentials come from AWS_* env vars or instance-role identity. -y skips prompts, --update re-applies, --argo-server/$ARGO_SERVER selects the REST path — see the CLI page.

GitOps alternative: cargo athena emit | kubectl apply -f - registers the templates; argo submit --from workflowtemplate/<root> runs them. Names are stable and deterministic.

Try one step locally before deploying? cargo athena container emulate runs a single #[container] under docker/podman exactly as Argo would.

Next: Core Concepts.

Core Concepts

A few ideas explain everything cargo-athena does.

1. Templates are types

Each #[workflow] / #[container] lowers to a unit-struct type implementing an internal Template trait. Names, inputs, and the emitted YAML are derived by the compiler.

Referencing a template type force-links its defining crate; emission walks the reachable closure from your entrypoint via monomorphic calls. No registry, no DCE concern — nothing uncalled is emitted, nothing called is missed. Workflows compose across modules and crates through normal Rust name resolution.

2. #[workflow] is a statically analyzed DAG

A workflow body is read, not executed. Each let x = t(args); or t(args); becomes an Argo task; data flow becomes templateRef wiring and DAG edges.

Because the body is read, it is also type-checked as ordinary Rust by a hidden, never-run “ghost” — wrong types, arity, fields, or calling a non-template are compile errors. Only the lowered shapes (let/call statements, if/else) are accepted; anything else is a spanned compile_error!. Full details on the #[workflow] page.

3. #[container] runs real Rust in a pod

A container body genuinely executes inside its pod. Arguments arrive as Argo input parameters; the return is captured as outputs.parameters.return for the next step. I/O is serde-bound at compile time — take and return owned types.

Arguments can also be spliced into the pod spec: `image = “repo:”

  • taginjects an argument into the image (and likewise intoservice_account, node_selector). See [#[container]` → Parameter injection](container.md).

4. Data flows as JSON parameters

Every value between steps is JSON-encoded into an Argo parameter and decoded by the receiver. The uniformity is what makes the DAG analysis tractable and what keeps a String "7" a string on the wire (never silently parsed as a number).

It also makes more specialised plumbing fall out cleanly: b.field on a binding wires only that field via a JSON path on the producer’s output, and list.fan_out(|x| step(x)) lowers to Argo withParam — one task invocation per element of the list.

5. #[fragment] carries pod resources

Pod resources (host! mounts, S3 artifact ports) are declared inside a #[container] or #[fragment]. A fragment is a normal helper that runs as real code in the calling pod; every resource it declares is collected onto each container that transitively calls it — composable resource decls, no global registry.

6. The binary runs in two worlds

The same compiled binary plays two roles:

  • Emit-time, on your machine — cargo athena emit / publish / submit walks the template closure from your entrypoint and prints one WorkflowTemplate per template.
  • Pod-time, in Argo — the binary deserializes the step’s inputs, calls the matching #[container] body, and serializes the return.

cargo athena publish cross-compiles it static-musl per athena.toml target and uploads it to your S3 ArtifactRepository. emit adds it as an input artifact to every container template; a tiny sh bootstrap picks the matching app-<triple> and execs it. The image needs only sh and uname — distroless works.


With these in mind, the reference pages are the details: #[workflow], #[container], the CLI, and athena.toml.

#[workflow]

A #[workflow] is an Argo DAG. Its body is statically analyzed, not executed: each statement is lowered to an Argo task, data flow becomes templateRef wiring and DAG edges, and the function name becomes a WorkflowTemplate. The entrypoint is a type:

fn main() { cargo_athena::entrypoint::<run_foo>(); }

The body is also type-checked as ordinary Rust by a hidden, never-run “ghost” copy: wrong argument types or arity, a missing struct field, consuming a workflow that has no return, or calling a #[fragment] / regular function from a #[workflow] are all compile errors, not runtime surprises.

Attribute arguments

#[workflow(name = "...", steps,
           node_selector = { "k" = "v", ... },
           on_exit_if_root = path::to::template,
           retry(limit = 2, policy = "OnError", backoff = "30s"),
           ttl_if_root(after_completion = 86400, after_success = 3600, after_failure = 7200),
           pod_gc_if_root(strategy = "OnWorkflowSuccess"),
           active_deadline_if_root = "2h")]
ArgEffect
name = "my-name"Override the Argo template name. Default: <crate>-<fn> (kebab).
stepsEmit an Argo steps: (sequential) template instead of the default data-dependency dag:.
node_selector = { "k" = "v" }nodeSelector on this dag/steps template; the controller cascades it onto every task pod this workflow runs. Keys and values are literal strings — see Node selector.
on_exit_if_root = tWhole-workflow exit handler on this template’s own spec.hooks.exit. Fires only when this template is the workflow you submit. Distinct from the per-task .on_exit(t) builder.
retry(limit = N | unlimited, policy = "…", backoff = <dur>)Template-level Argo retryStrategy. limit is required (unlimited ⇒ no cap); policyAlways|OnFailure|OnError|OnTransientError; backoff is an int (seconds) or a humantime string.
ttl_if_root(after_completion = <s>, after_success = <s>, after_failure = <s>)WorkflowSpec ttlStrategy: GC the finished Workflow. ≥1 of the three is required (int seconds or humantime). Root-only.
pod_gc_if_root(strategy = "<S>")WorkflowSpec podGC. strategyOnPodCompletion|OnPodSuccess|OnWorkflowCompletion|OnWorkflowSuccess. Root-only.
active_deadline_if_root = <secs | "2h">WorkflowSpec activeDeadlineSeconds — the whole-workflow runtime cap. The only timeout that works on a #[workflow]. Root-only. See Timeouts.

All are optional. A parameter name (i.e. a function argument) or a name = "…" value that a YAML 1.1 parser reads as a boolean/null (y/yes/n/no/on/off/true/false, null, ~, any case) is a compile error — Argo’s YAML→JSON parser would silently mis-type it.

Timeouts

To time-bound a whole workflow, use active_deadline_if_root — the only mechanism Argo enforces at workflow scope. The other two knobs (timeout, pod_running_timeout) are per-pod and live on #[container].

_if_root is load-bearing: like ttl_if_root/pod_gc_if_root, the cap applies only when this WorkflowTemplate is the workflow you actually submit. It is inert when this template is templateRef’d as a nested sub-workflow.

Every duration is an integer (seconds) or a humantime string ("90s", "1h30m", "2d").

Node selector

#[workflow(node_selector = {
    "kubernetes.io/arch" = "amd64",
    "topology.kubernetes.io/region" = "{{workflow.parameters.region}}",
})]
fn pipeline() { /* ... */ }

athena puts the selector on the dag/steps template; the Argo controller cascades it onto every task pod the workflow runs. Unlike #[container(node_selector = …)], keys and values are literal strings only — no "lit" + arg parameter injection.

For a dynamic value, drop in {{workflow.parameters.<NAME>}} as a literal (as in region above). That’s the only interpolation Argo keeps through the cascade, and it always resolves against the submitted root — supply <NAME> to the workflow you actually submit, not to this sub-workflow.

The body

Only three statement shapes are lowered:

let x = template(args);   // a task; `x` binds its output
template(args);           // a task (no output consumed)
if cond { ... } else { ... }  // see "if / else" below

Everything else — match, for/while/loop, macros, arbitrary method calls, let with non-ident/tuple patterns, let … else — is a hard compile_error! with a spanned message. Nothing is silently dropped.

Arguments to a template call

FormLowers to
literal "s", 7, truea static Argo parameter value
a #[workflow] input param{{inputs.parameters.<name>}}
a prior let binding{{tasks.<dep>.outputs.parameters.return}} + a DAG edge
binding.clone() / binding.to_owned()same as the binding (type-preserving)
"lit".to_string() / "lit".into()same as the literal (literal-only)
binding.field.sub{{=toJSON(fromJSON(<src>)['field']['sub'])}} (named struct fields; tuple/index access is not lowered)
a nested call foo(bar())bar becomes its own task; foo takes a ref to it (recursive: foo(bar(baz())))

Notes:

  • .clone() is the fan-out marker. The body is faithful Rust (real move semantics). Sending one binding to two consumers requires an explicit .clone() — which is exactly correct, since Argo copies the output parameter into each consumer.
  • .to_string() / .into() are literal-only. On a binding/input they would change the Rust type while emit still passes the raw serialized parameter — a silent mismatch — so they are rejected there. Any literal value is fine (every parameter value is emitted as JSON, so a string like "no" is unambiguous).

Return values

A #[workflow] with a return type bubbles its terminal task’s output up as the template’s own outputs.parameters.return, so a parent consumes a sub-workflow exactly like a container:

#[workflow]
fn sub(seed: String) -> String {
    let fetched = fetch(seed);
    transform(fetched, 7)        // tail call == this workflow's return
}

#[workflow]
fn parent() {
    let r = sub("seed".to_string());
    publish(r);                  // {{tasks.sub.outputs.parameters.return}}
}

The terminal is the tail template call, a returned/tail binding, or a value-if (below). A return type with no resolvable terminal is a compile error.

Custom method calls

Per-task builder chain

A task call may be suffixed, in any order, with:

fetch(url).continue_on(failed, error);          // dependents proceed on failure/error
transform(x).on_exit(cleanup);                  // unconditional per-task exit hook
transform(x).on_exit(record("done"));           // hook target may take args
transform(x).on_success(notify).on_failure(alarm);   // repeatable phase hooks
transform(x).on_error(alarm);
transform(x).hook_if("workflow.status == 'Failed'" = alarm);  // raw Argo expr escape hatch
  • .continue_on(failed | error | failed, error) — ≤1; sets Argo continueOn.
  • .on_exit(t) / .on_exit(t(args)) — ≤1; the special unconditional exit hook.
  • .on_success(t) / .on_failure(t) / .on_error(t) — repeatable; athena generates the Argo phase expression.
  • .hook_if("raw-argo-expression" = t, …) — repeatable; verbatim Argo expression escape hatch.

Any hook target is t or t(args) (args resolved like task args). Hook templates are force-linked and emitted like any callee.

.fan_out(|x| C(x, …)) — list fan-out

let b = a.fan_out(|x| caps(x, "!".to_string())); runs caps once per element of a (Argo withParam; the closure parameter is {{item}}, {{item.field}} for a field of it). b is the aggregated Vec<U>, consumed downstream like any output.

  • a (the source) must be a prior let binding or a #[workflow] input that is a list.
  • the closure body must be a single template call.
  • the element/closure/result types are checked by the ghost (AthenaList<T> is blanket-implemented for Vec<T>/[T; N]).

if / else / else if

Real Rust conditionals lower to synthesized, when-gated wrapper workflows; exactly one branch runs.

// statement-if / else-if / else
if n == 0 {
    note("zero".to_string());
} else if m.id == "abc" && n > 1 {
    note(chosen);
} else {
    note("other".to_string());
}

// value-if: the wrapper selects + returns the taken branch
let chosen = if n > 3 { left(n) } else { right(n) };
  • Conditions are a closed grammar: comparisons == != < <= > >=, combined with && / || / !. Operands are a binding, a #[workflow] input, an a.field of one, a literal, or a nested template call (if foo() > 3foo runs as a parent task, since Rust evaluates the condition unconditionally). Anything outside this grammar (method calls, arithmetic, casts) is a targeted compile error.
  • Value-if requires an else and both arms producing the same type — Rust enforces this, and the ghost inherits it.
  • Bindings created inside an arm are not visible after the if (Argo has no phi node); use the value-if form to pass a result out.

Type checking & strictness, in one line

The data flow is compiler-enforced and the body contract is fail-loud: if it compiles, the argument/field/return types line up and every statement was lowered — there is no silent mis-emit.

#[container]

A #[container] is a workflow step whose body is ordinary Rust, executed in a pod. Unlike a #[workflow] (statically analyzed), a container’s body really runs: arguments are deserialized from Argo parameters, the function executes, and its return value is serialized back out.

#[container(image = "ghcr.io/acme/app:latest")]
fn run_a_container(a: String) -> String {
    println!("regular code, got: {a}");
    format!("done:{a}")          // -> outputs.parameters.return
}

It compiles to its own Argo WorkflowTemplate. The image is arbitrary — it needs only POSIX sh and uname, so distroless and read-only-rootfs images work fine.

I/O contract

  • Each function argument is an Argo input parameter, deserialized (serde) from that parameter at pod start.
  • The return value is serialized to outputs.parameters.return, so a #[workflow] consumes it as {{tasks.<t>.outputs.parameters.return}}.
  • Container I/O is compile-time bound to serde (DeserializeOwned / Serialize). Borrows can’t cross this boundary — take/return owned types (String, not &str).

How it runs in-pod

cargo athena publish cross-compiles a static-musl, multi-arch binary and uploads it to the S3 ArtifactRepository in athena.toml. emit adds that binary as an input artifact to every container template; Argo stages it at /athena/bin, and a small sh bootstrap picks the matching app-<triple> and execs it.

All athena paths live under a pod-scoped emptyDir at /athena.

Attribute arguments

#[container(
    image = "ghcr.io/acme/app:latest",
    name = "...",
    service_account = "athena-runner",
    node_selector = { "kubernetes.io/arch" = "amd64", "disktype" = "ssd" },
    on_exit_if_root = path::to::template,
    retry(limit = 3, policy = "OnError", backoff = "30s"),
    timeout = "5m",
    pod_running_timeout = 600,
    ttl_if_root(after_completion = 86400, after_success = 3600, after_failure = 7200),
    pod_gc_if_root(strategy = "OnWorkflowSuccess"),
    active_deadline_if_root = "2h",
)]
ArgEffect
image = "…"Container image. Default: [bootstrap].default_image from athena.toml.
name = "…"Override the Argo template name. Default: <crate>-<fn> (kebab).
service_account = "…"Pod ServiceAccount. Default: [defaults].service_account.
node_selector = { "k" = "v", … }Template-level nodeSelector; the controller cascades it onto this template’s pods. Keys are literal; values may be injected (below).
on_exit_if_root = tWhole-workflow exit handler. Fires only when this template is the workflow you submit. Distinct from the per-task .on_exit(t) builder.
retry(limit = N | unlimited, policy = "…", backoff = <dur>)Template-level Argo retryStrategy. limit is required (unlimited ⇒ no cap); policyAlways|OnFailure|OnError|OnTransientError; backoff is an int (seconds) or a humantime string.
timeout = <secs | "5m">Argo Template.timeout. Controller-enforced node timeout, counts Pending time. See Timeouts.
pod_running_timeout = <secs | "1h30m">Argo Template.activeDeadlineSeconds on the pod. Kubelet-enforced; only counts time Running. See Timeouts.
ttl_if_root(after_completion = <s>, after_success = <s>, after_failure = <s>)WorkflowSpec ttlStrategy: GC the finished Workflow. ≥1 of the three is required (int seconds or humantime). Root-only.
pod_gc_if_root(strategy = "<S>")WorkflowSpec podGC. strategyOnPodCompletion|OnPodSuccess|OnWorkflowCompletion|OnWorkflowSuccess. Root-only.
active_deadline_if_root = <secs | "2h">WorkflowSpec activeDeadlineSeconds — the whole-workflow runtime cap. Root-only. See Timeouts.

All optional. As with #[workflow], an argument name or a name = "…" value that a YAML 1.1 parser reads as a boolean/null is a compile error.

Timeouts

Argo has three “stop after a while” knobs.

AttributeArgo fieldEnforced byClock starts
timeoutTemplate.timeoutArgo controllernode creation (includes Pending)
pod_running_timeoutTemplate.activeDeadlineSecondsKubernetes kubeletpod Running
active_deadline_if_rootWorkflowSpec.activeDeadlineSecondsArgo controllerwhole-workflow start (root-only)

A pod stuck Pending trips timeout but not pod_running_timeout. Both are #[container]-only; Argo applies neither to dag/steps templates, so they’re rejected on a #[workflow]. The only working whole-workflow timeout is active_deadline_if_root.

Every duration accepts an integer (seconds) or a humantime string ("90s", "1h30m", "2d").

Parameter injection

image, service_account, and node_selector values can splice in the container’s own arguments — Argo substitutes the real value into those fields when the pod is created:

#[container(
    image           = "ghcr.io/acme/app:" + tag,            // arg
    service_account = "athena-" + tenant + "-runner",        // literal + arg + literal
    node_selector   = { "kubernetes.io/arch" = "amd64",      // literal value
                        "disktype" = profile.disk },         // a named struct field
)]
fn run(tag: String, tenant: String, profile: Profile) { /* ... */ }

Rules:

  • The value is a string literal, or a +-concatenation of string literals and operands. An operand is an argument (tag) or a named struct field of one (profile.disk, a.b.c — named fields only, no a.0/a[i]).
  • String-literal segments are emitted verbatim. A hand-written {{workflow.parameters.x}} inside a literal passes through untouched — the escape hatch if you know Argo’s templating and want it raw.
  • Operands must be String/&str or a number (i64, f64, …). That’s enforced at compile time: anything else (a struct, Vec, bool, …) is an error, because only those round-trip to the obvious raw scalar. A non-argument identifier, a tuple/index field, or any other expression is also a targeted error.
  • node_selector keys are always literal. (Argo can substitute them, but a dynamic label key is a foot-gun, so athena forbids it by design.)
  • name is the static Argo template identity and on_exit_if_root is a template path — neither is an injection target.

Under the hood an operand lowers to {{=fromJSON(inputs.parameters['arg']['field'…])}} — Argo evaluates it to the raw scalar at pod creation. You don’t need to know that; the point is image = "repo:" + tag just works.

#[fragment]

A #[fragment] is a plain helper functionnot a template. It is genuinely called as Rust, so it executes inside the calling container’s pod:

#[container(image = "ghcr.io/acme/tools:latest")]
fn build() {
    frag_a();                                  // ordinary call, runs in this pod
}

#[fragment]
fn frag_a() {
    let _ = cargo_athena::host!("/var/lib/a"); // resource carried to `build`
    frag_b();                                  // transitive
}

#[fragment]
fn frag_b() { let _ = cargo_athena::host!("/var/lib/b"); }

Its purpose is to carry pod-resource declarations across function boundaries. Every host! / artifact-port macro a fragment uses is collected onto each #[container] that transitively calls it (resolved as a closure at emit time). A #[fragment] cannot be called from a #[workflow] (it is not a Template, so it fails as a type error).

Macro calls

These declare pod resources and are only valid inside a #[container] or #[fragment] (the public form is a compile_error! anywhere else, and a #[workflow] using one is a hard error):

MacroEffectRuntime value
host!("/abs/path")a hostPath volume mounted at that path&str path
load_artifact!("key")an Argo S3 input artifact port at the exact athena.toml object keyVec<u8>
load_artifact_str!("key")same, as textString
save_artifact!("key", bytes)an Argo S3 output artifact portwrites impl AsRef<[u8]>
save_artifact_str!("key", text)same, as textwrites impl AsRef<str>
#[container]
fn publish(report: String) {
    let notes = cargo_athena::load_artifact_str!("notes");   // S3 input port
    println!("publishing {report} (notes: {notes})");
    cargo_athena::save_artifact!("receipt", format!("ok:{report}"));
}

Key properties:

  • Literal key only. The argument is the exact S3 object key (for artifacts) or absolute path (for host!) — a string literal/const, resolved at compile time.
  • Static AST union, not a trace. Declarations are collected from every if/match/loop branch, not the one path that runs. This is correct, not approximate: Argo fixes the pod spec before the pod runs, so the union is the only expressible semantics.
  • Decoupled through the bucket. Artifact producer and consumer share only the S3 key — there is no DAG dependency, no {{tasks.…}} wiring, and no ordering imposed. A missing object is an Argo error at run time.
  • Carried through #[fragment]s transitively, as above.

Used path-qualified (cargo_athena::host!) by convention so it doesn’t require a use and the gating compile-errors stay obvious.

The cargo athena CLI

After cargo install cargo-athena you have the cargo athena subcommand. It drives your workflow crate’s binary (the one whose main calls cargo_athena::entrypoint::<Root>()) in the right mode.

cargo athena [-c F] emit  [--package PKG] [--bin B] [--out F] [--with-workflow]
cargo athena [-c F] container ls       [-p PKG] [--bin B] [--all]
cargo athena [-c F] container emulate  <name> [-a k=v].. [--input-file F] [-p PKG] [--bin B]
                                       [--build|--tarball F] [--runtime R] [--skip-artifacts]
cargo athena [-c F] container describe <name> [-p PKG] [--bin B]
cargo athena [-c F] workflow  ls       [-p PKG] [--bin B] [--include-synthetic]
cargo athena [-c F] workflow  describe <name> [-p PKG] [--bin B]
cargo athena [-c F] submit <name> [-a k=v].. [-n NS] [--service-account SA]
                          [--node-selector k=v].. [--argo-server URL] [-y] [--update]
cargo athena [-c F] build [--package PKG] [--bin B] [--target T].. [--print]
cargo athena [-c F] publish [--package PKG] [--bin B] [--target T].. [--tarball F] [--print]

-c, --config <FILE> (global) points at an athena.toml. By default the nearest one walking up from the cwd is used (like Cargo.toml), or $ATHENA_CONFIG.

emit

Relays the multi-document YAML: one WorkflowTemplate per reachable template, cross-referenced by templateRef. The names are stable and deterministic (<crate>-<fn>) — register them and trigger runs with argo submit --from workflowtemplate/<root>.

The ergonomic path is publish + cargo athena submitsubmit does this emit + register and starts the run for you. Reach for emit directly to inspect the YAML, or for a GitOps kubectl apply pipeline.

cargo athena emit --package my-crate                    # to stdout
cargo athena emit --package my-crate --out wf.yaml      # to a file
cargo athena emit --package my-crate | kubectl apply -f -   # register

--with-workflow also appends a convenience runnable Workflow (generateName, workflowTemplateRef → root), so cargo athena emit --with-workflow … | kubectl create -f - registers and fires one run — handy for demos. Off by default: a generateName object isn’t idempotent and isn’t something you’d GitOps; the deterministic templates are.

Needs only an athena.toml (it bakes the artifact source into the YAML) — no cluster, S3, or cross-build. The fast iteration loop.

container emulate

Runs one #[container] locally under docker/podman, exactly as Argo would: the same image, the same injected bootstrap, the same ATHENA_PARAM_* env, the /athena scratch dir, host! binds, and S3 artifact ports. Test a single node locally — no Kubernetes, no source on the node.

# default: pull the *deployed* binary from S3 and run it in its image
cargo athena container emulate my-crate-transform -a data=hello -a factor=4

cargo athena container emulate my-crate-fetch --input-file args.json
cargo athena container emulate my-crate-fetch --build         # local musl build instead

Fidelity is by construction: the binary reports its run metadata from the same Template::build() emit uses, so there’s nothing to keep in sync.

  • <name> (positional) — the full template name (<crate>-<fn> kebab, or the #[container(name = "…")] override). cargo athena container ls lists them. A #[workflow] is rejected (it’s a DAG, not a pod — emulate its containers individually).
  • -a name=value (repeatable, --arg) / --input-file F — the function arguments. A value is parsed as JSON if it parses (-a n=4 → number), else a string; all are JSON-encoded into the env exactly as Argo passes them. Arguments are type-checked against the fn’s real signature before anything launches — missing, unknown (with did-you-mean), and wrong scalar/array kinds fail fast.
  • -p/--package, --bin select the cargo target (see package selection).
  • Binary source: default = pull the deployed tarball from the athena.toml S3 repo (smoke-test what’s live). --build packages a local host-arch musl binary; --tarball F uses one verbatim. S3 credentials come from the standard AWS_* env vars.
  • --runtime docker|podman (default: autodetect, prefer docker); --skip-artifacts to bypass S3 load/save_artifact! sync.

Limitations — this runs the container body faithfully, not the pod’s Kubernetes context. docker run has no notion of a ServiceAccount, so #[container(service_account=…)] and any podSpec-level concerns (RBAC, nodeSelector, podSpecPatch) are not emulated. For those, exercise the real Argo path (emit + submit).

container describe

Prints, as JSON, the exact runner metadata one template reports — its image, parameters and their Rust types, the binary/host!/artifact S3 ports, and the scratch + result paths. It’s the same metadata emulate consumes (derived from the same Template::build() as emit), so it’s the way to see what would run, or to script around it:

cargo athena container describe my-crate-transform

container ls

Lists the templates your workflow binary reports — full name, kind, and typed args — so they’re discoverable for emulate/describe (no guessing the <crate>-<fn> name):

cargo athena container ls            # #[container]s only
cargo athena container ls --all      # + #[workflow]s and synthetic templates
NAME                                  KIND       ARGS
my-crate-fetch                        container  url: String
my-crate-transform                    container  data: String, factor: i64

workflow ls

The #[workflow]s in the package (name + typed inputs). athena’s synthesized if/else wrapper + arm sub-workflows are an implementation detail, so they’re hidden unless --include-synthetic:

cargo athena workflow ls                      # your #[workflow]s
cargo athena workflow ls --include-synthetic  # + the if/else machinery

workflow describe

Same metadata dump as container describe, for any template — handy on a #[workflow] to see its resolved inputs:

cargo athena workflow describe my-crate-pipeline

submit

Run a #[workflow] (or a single #[container]) on a real cluster — argo submit --from workflowtemplate/<name> with the safety rails you’d otherwise do by hand. Paired with publish this is the recommended deploy+run flow: publish ships the binary, submit registers the templates and starts the run — no hand-run emit/kubectl apply:

cargo athena submit my-crate-pipeline -a seed=hello
W=$(cargo athena submit my-crate-pipeline -a seed=hello -y)   # scriptable

Before anything is created it:

  1. type-checks -a/--input-file against the template’s real signature (same report as emulate);
  2. confirms the binary tarball is uploaded (so pods can bootstrap; --skip-binary-check to bypass);
  3. registers + drift-checks every reachable WorkflowTemplate: missing ones are created, ones that differ from emit are updated — after a y/N prompt (the change list is shown; --update re-applies all, -y/--yes skips every prompt);
  4. creates the Workflow (a second y/N), then prints its name to stdout (everything else is on stderr, so W=$(… -y) works).

Transport mirrors the argo CLI: with --argo-server/$ARGO_SERVER set it uses the Argo Server REST API ($ARGO_TOKEN for auth, --insecure-skip-tls-verify if needed); otherwise it creates the CR through the Kubernetes API via your kubeconfig / in-cluster config (client-certs, tokens, and EKS/GKE/AKS exec-credential plugins all work — it’s kube-rs).

Overrides: -n/--namespace ($ARGO_NAMESPACE[defaults].namespacedefault), --service-account (→ [defaults].service_account), and --node-selector k=v (repeatable; set root-scoped on the submitted Workflow — Argo applies it to every pod).

build

Cross-compiles a static-musl binary for each target in athena.toml’s matrix, packages them as app-<triple> inside one .tar.gz, and prints the exact upload destination:

cargo athena build --package my-crate           # build + package
cargo athena build --package my-crate --print   # dry run: just resolve + print the key
  • --target T (repeatable) overrides the athena.toml target matrix.
  • Requires the Zig cross toolchain: cargo install cargo-zigbuild and zig. build checks for both up front and tells you exactly what to install if either is missing.

cargo athena publish

The one-shot build + upload: cross-compiles + packages (exactly like build) and then uploads the tarball to athena.toml’s artifact repository — the same key emit resolves, so it lands where the injected bootstrap fetches it:

cargo athena publish --package my-crate   # cross-compile + package + upload
  • Takes the same flags as build (-p/--bin, --target, --print). --print is a dry run (resolve + print the key, no build/upload). Use plain build when you want the tarball locally without uploading (CI artifact, inspection).
  • --tarball F uploads F verbatim and skips the build — build-once / upload-many (reuse one CI-built artifact; the kind e2e uses this).
  • S3 credentials: the standard AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (/ AWS_SESSION_TOKEN) env vars, else the ambient cloud identity (EC2 IMDS / ECS task role / IRSA web-identity) — the same object_store path submit/emulate use. The shared ~/.aws/credentials file and AWS_PROFILE are not read (object_store is not the AWS SDK).
  • AWS_ENDPOINT_URL (AWS-SDK standard; AWS_ENDPOINT_URL_S3 too) overrides the athena.toml endpoint for this upload only — for when S3 is reached differently here than from the pods (a port-forward, or a public vs in-cluster host). It does not change what emit bakes into the templates.
  • The destination s3://bucket/key is printed on stdout (scriptable); progress on stderr.

emit injects that tarball plus a tiny sh bootstrap into every container template, so one artifact serves every step on any node architecture.

Package selection

cargo athena runs your crate’s binary. Which one is resolved, in order:

  1. -p/--package and --bin flags (same meaning as for cargo itself);
  2. else [defaults] in athena.tomlpackage = "…" / bin = "…" (set them once instead of repeating the flags, like a project default);
  3. else cargo’s single-package / default-bin autodetect.

So in a configured workspace cargo athena container ls and cargo athena container emulate my-crate-fetch -a url=… just work with no target flags. (-p is package here — function arguments to emulate are -a/--arg.)

This precedence (and the -p short flag) is for container/workflow/submit. emit/build/publish take --package/--bin explicitly — no -p, no [defaults] fallback (pass them, or rely on cargo’s single-package autodetect).

Working in this repo instead of an installed binary? Any cargo athena <cmd> above is cargo run -p cargo-athena --bin cargo-athena -- athena <cmd>.

athena.toml

athena.toml describes where the binary and artifacts live and a few pod defaults. It is required by cargo athena and is never read in-pod (everything it controls is baked into the emitted YAML). The nearest one walking up from the cwd is used (like Cargo.toml); point at a specific file with cargo athena -c FILE … or the ATHENA_CONFIG env var.

A complete example (the one the kind e2e uses):

[artifact_repository.s3]
endpoint = "minio.argo.svc.cluster.local:9000"
bucket = "athena-artifacts"
region = "us-east-1"
insecure = true                                    # plain HTTP (e.g. MinIO)
access_key_secret = { name = "athena-s3", key = "accessKey" }
secret_key_secret = { name = "athena-s3", key = "secretKey" }

[artifact]
key = "athena/bin/e2e/0.1.0/e2e.tar.gz"

[bootstrap]
targets = ["x86_64-unknown-linux-musl", "aarch64-unknown-linux-musl"]

[defaults]
service_account = "default"
# package   = "my-workflows" # so `cargo athena` needs no --package/-p
# bin       = "app"          # …or --bin, in a multi-bin crate
# namespace = "argo"         # default namespace for `cargo athena submit`

[artifact_repository.s3]

The S3-compatible bucket holding both the binary tarball and every load_artifact! / save_artifact! object. Emitted into each container template as an Argo s3{} artifact source.

KeyMeaning
endpointS3 endpoint (host:port).
bucketBucket name.
regionS3 region.
insecuretrue for plain HTTP (e.g. local MinIO).
access_key_secret / secret_key_secretKubernetes { name, key } secret selectors for credentials.

[artifact]

KeyMeaning
keyThe exact S3 object key for the binary .tar.gz. cargo athena publish uploads here; emit references it.

[bootstrap]

KeyMeaning
targetsThe static-musl target triples to cross-compile. Each becomes app-<triple> in the tarball; in-pod the bootstrap picks the one matching uname.
default_image(optional) Image for #[container]s with no explicit image. Needs only POSIX sh and uname (distroless works).

[defaults]

KeyMeaning
service_accountPod ServiceAccount for every container, unless overridden by #[container(service_account = "…")].
package(optional) Default cargo package the cargo athena subcommands drive, so you don’t repeat -p/--package. The flag wins.
bin(optional) Default cargo bin within it (multi-bin crates need this). The --bin flag wins.
namespace(optional) Default Kubernetes namespace for cargo athena submit. Precedence: -n/--namespace$ARGO_NAMESPACE → this → default.

The artifact bucket is the only coupling between an artifact’s producer and consumer (see #[container] → macro calls): they share a key, not a DAG edge.

Cookbook

Common patterns, each a few lines. The full rules behind them are on the #[workflow] and #[container] pages.

Sequential vs. parallel

Edges come from data, not statement order. Independent calls run in parallel; a shared input creates the dependency:

#[workflow]
fn pipeline() {
    let a = ingest("src".to_string());   // a and b have no relation:
    let b = probe();                     //   they run in parallel
    combine(a, b);                       // depends on BOTH -> joins them
}

Need a strict order without a data dependency? Make the dependency explicit by threading a return value through.

Sub-workflow that returns a value

A #[workflow] with a return type bubbles its terminal task’s output up like a container’s:

#[workflow]
fn sub(seed: String) -> String {
    let f = fetch(seed);
    transform(f, 7)                      // tail call == this workflow's return
}

#[workflow]
fn parent() {
    let r = sub("seed".to_string());
    publish(r);
}

Nested calls

foo(bar()) lowers bar to its own task and wires foo’s arg to its output — a shorthand for let x = bar(); foo(x);:

#[workflow]
fn pipeline() {
    publish(transform(fetch("u".to_string()), 7));
}

Recursive: foo(bar(baz())) works the same way.

Fan-out over a list

#[workflow]
fn batch() {
    let items = make_list();                       // -> Vec<String>
    let out = items.fan_out(|x| caps(x, "!".to_string()));  // Argo withParam
    summarize(out);                                // out: Vec<String>
}

caps runs once per element ({{item}}); out is the aggregated Vec, consumed like any output.

Conditionals

Real if / else / else if become when-gated wrappers; a value-if selects the taken branch:

#[workflow]
fn gated() {
    let n = decide("hello".to_string());
    let chosen = if n > 3 { left(n) } else { right(n) };  // value-if
    if n == 0 {
        note("zero".to_string());
    } else {
        note(chosen);
    }
}

Conditions are a closed grammar (== != < <= > >=, && || ! over bindings/inputs/a.field/literals/nested calls).

Sequential steps mode

The default workflow body is a DAG (edges from data deps). Add steps to emit an Argo steps: template — one statement per sequential group:

#[workflow(steps)]
fn pipeline() {
    let p = prepare("seed".to_string());
    finalize(p);
}

Same body, different emit shape.

Passing one field of a struct

a.field (or a.field.sub) wires only that field to the next task — lowered to a JSON path on the producer’s output:

#[derive(serde::Serialize, serde::Deserialize)]
struct Meta { id: String, n: i64 }

#[container] fn make_meta() -> Meta { Meta { id: "abc".into(), n: 7 } }
#[container] fn use_id(id: String) { println!("id={id}"); }

#[workflow]
fn pipeline() {
    let m = make_meta();
    use_id(m.id);                          // only `id` is wired through
}

Named fields only (no a.0 / a[i]). The ghost type-checks that the field exists and matches the consumer.

Decoupled artifacts (no DAG edge)

A producer and consumer that share only an S3 key — no ordering, no wiring:

#[container]
fn produce() { cargo_athena::save_artifact_str!("report", "hello"); }

#[container]
fn consume() {
    let r = cargo_athena::load_artifact_str!("report");
    println!("{r}");
}

Artifact chain (with a DAG dep)

The recipe above has no ordering. To chain artifact-producing containers — explicit dependency, guaranteed ordering — bridge them with a return value:

#[container]
fn produce() -> String {
    cargo_athena::save_artifact_str!("report", "hello");
    "ok".to_string()                       // return value creates the edge
}

#[container]
fn consume(seq: String) {
    let r = cargo_athena::load_artifact_str!("report");
    println!("seq={seq}: {r}");
}

#[workflow]
fn pipeline() {
    let token = produce();
    consume(token);                        // edge: produce must finish first
}

The artifact key is still a literal; the return-value just orders the two tasks.

Per-task hooks & resilience

.continue_on / .on_success / .on_failure / .on_error / .on_exit are per-task builders — they fire for that one task, always:

#[workflow]
fn resilient() {
    let raw = fetch("u".to_string()).continue_on(failed, error);
    transform(raw, 9)
        .on_failure(alarm)
        .on_exit(cleanup);     // runs when *this task* finishes
}

Retry with backoff

A flaky step retries itself:

#[container(retry(limit = 3, policy = "OnError", backoff = "30s"))]
fn fetch(url: String) -> String { /* … */ "ok".into() }

limit is required (unlimited for no cap); policyAlways|OnFailure|OnError|OnTransientError; backoff is an int (seconds) or a humantime string. Works on #[workflow] too.

Timeouts

Three knobs for three scopes — stack as many as you need:

#[container(
    timeout = "5m",                       // controller; counts Pending time
    pod_running_timeout = "2m",           // kubelet; only counts time Running
)]
fn long_step() { /* … */ }

#[workflow(active_deadline_if_root = "1h")]   // whole-workflow cap (root-only)
fn pipeline() { /* … */ }

Full distinctions: Timeouts.

Whole-workflow cleanup

on_exit_if_root is the workflow-level exit handler — distinct from the per-task .on_exit(t) above. It runs once, when the workflow finishes, but only for the workflow you actually submit:

#[workflow(on_exit_if_root = teardown)]
fn pipeline() { /* … */ }

Every #[workflow]/#[container] that sets it carries the hook on its own template, so argo submit --from workflowtemplate/pipeline runs teardown. When pipeline is instead a templateRef’d sub-step of a bigger run, its on_exit_if_root stays inert (Argo fires only the submitted workflow’s) — submit it directly to get it.

Pinning a pod (image / SA / node)

Static, or with a container argument spliced in (image / service_account / node_selector values accept "lit" + arg):

#[container(
    image           = "ghcr.io/acme/heavy:" + tag,          // arg injected
    service_account = "athena-" + tenant + "-runner",
    node_selector   = { "kubernetes.io/arch" = "amd64",
                        "disktype" = profile.disk },          // a struct field
)]
fn heavy(tag: String, tenant: String, profile: Profile) -> String { tag }

Operands are an argument or a named struct field of one, and must be String/&str/number. See #[container] → Parameter injection.

Shared pod resources via #[fragment]

A #[fragment] is a normal helper that runs inside the calling container. Every host! / artifact decl it makes is carried onto each container that transitively calls it — share pod resources without a registry:

#[fragment]
fn with_secrets() {
    let _ = cargo_athena::host!("/secrets/db");
    let _ = cargo_athena::host!("/secrets/api");
}

#[container]
fn step_a() { with_secrets(); /* both /secrets paths land on step_a */ }

#[container]
fn step_b() { with_secrets(); /* …and on step_b */ }

Resources travel with the function; the calling containers don’t have to know what’s inside.

Pitfalls

  • Fan-out a value to two consumers → .clone(). The body is faithful Rust; Argo copies the output into each consumer, so the explicit clone is correct, not boilerplate.
  • Workflow bodies are strict. Loops, match, and arbitrary expressions are compile errors — by design, so a step is never silently dropped. if/else/else if, nested calls, and the builder/fan_out chain are supported.
  • Parameter values are JSON. Every value athena emits is JSON, so any string is safe (t("no") works) and a String "7" stays a string, not a number.

Testing

Three levels, from fastest to most thorough.

1. A single step’s real logic

A #[container] body is ordinary Rust, so the fastest test is just a plain #[test] calling the function directly — no harness, no infra.

For the step as it actually runs in-pod (its image, the injected bootstrap, ATHENA_PARAM_* env, host!/artifact ports), use cargo athena container emulate — docker/podman locally, exactly as Argo would, with the deployed binary pulled from S3:

cargo athena container emulate my-crate-transform -a data=hi -a factor=2

See the CLI page for --build (local binary) and the Kubernetes-context limitations.

2. Guard the generated workflow

cargo athena emit is deterministic, so snapshot it and fail CI on an unintended change — catching DAG/wiring regressions before a cluster ever sees them:

cargo athena emit --package my-workflows > expected.yaml   # commit this
# in CI:
diff <(cargo athena emit --package my-workflows) expected.yaml

(cargo-athena’s own test suite does exactly this with checked-in expected YAML across a broad “all features” fixture, plus trybuild compile-fail tests pinning the strict #[workflow] contract and the macro guards — cargo test in the repo.)

3. End-to-end on real Argo

Register the templates and submit on any real Argo + S3 (see Getting Started step 4) and assert the run Succeeded.

Conformance is not a claim: every push to main runs the project’s example workflow against a live Argo + MinIO per supported version and asserts success — see Supported Argo Versions. The repo’s scripts/{deploy,e2e-test,teardown}.sh reproduce that locally (a Docker/Podman daemon required; nix develop provides kind/argo/mc if you use Nix).

Supported Argo Versions

Every push to main submits the real examples/e2e workflow to a live Argo + MinIO per version and asserts it Succeeded. The support matrix is therefore not a claim — it is a continuously verified result.

ArgoSupport
v4.0.5maintained (latest minor)
v3.7.14maintained (n‑1 minor)
v3.6.19minimum supported (EOL, hard-gated)

Argo Workflows maintains the two most recent minors; cargo-athena tracks that plus the minimum that still works. All three are blocking CI jobs (no continue-on-error).

Why ≤ 3.5 is unsupported

cargo-athena emits one WorkflowTemplate per template, wired via templateRef. Argo’s submit-time validator before 3.6 cannot resolve {{tasks.X.outputs.*}} across a templateRef boundary, so any multi-step workflow fails instantly with failed to resolve {{tasks.a.outputs.…}}.

This is intrinsic to the one-template-per-function model and was fixed in Argo 3.6 — the emitted YAML is correct and passes 3.6/3.7/4.0 unchanged. Older versions may still work for trivial cases; use at your own risk.

Live badges

GitHub has no per-matrix-job badge, so each matrix job publishes its pass/fail to a gist and the README renders shields.io endpoint badges from it — the badges at the top of the README are that live e2e result.