What is a uw.workflow?

A workflow in Underworld3 is a structured way to organise a simulation — and everything that surrounds it — as a graph of named, cached computations. This page explains the mental model. The API reference documents the symbols; the convection example user guide walks through one workflow end to end.

The problem a workflow solves

Geodynamic simulations rarely live in isolation. A typical project ends up with:

  • A simulation that runs for hours or days and produces large HDF5 checkpoint chains.

  • Diagnostics computed during the run (Nu, Vrms, viscosity profiles, …).

  • Aggregations across multiple runs (Nu vs Ra plots, parameter sweeps).

  • Visualisations rendered after-the-fact (frame stacks, movies).

  • Restarts, warm-starts, parameter ramps that re-use prior work.

Without structure, all of this becomes ad-hoc: scripts that orchestrate each other, files in semi-conventional locations, no record of what config produced what output. Re-running the analysis often means re-running the simulation because the script doesn’t know what’s already been done.

A workflow encodes the structure once. Then “give me the Nu-vs-Ra plot” is a single function call: the framework figures out which simulations are still needed, runs only those, and assembles the plot.

The mental model: a DAG of products

A workflow is a directed acyclic graph (DAG) where:

  • Nodes are products — named, persistable outputs. Examples: mesh, run_directory, nu_vs_ra_csv, temperature_movie.

  • Edges encode dependencies. An edge from mesh to run_directory says “to build the run directory, you first need the mesh”.

Each product is built by exactly one step — a Python function decorated with @workflow_step, declaring what it produces and what it requires. Together the steps form the DAG.

import underworld3 as uw
from underworld3.workflows import workflow_step

# Root step: no upstream requirements.
@workflow_step(produces=["mesh"])
def create_mesh(config):
    return uw.meshing.UnstructuredSimplexBox(...)

# Receives the upstream "mesh" product as an argument.
@workflow_step(produces=["run_directory"], requires=["mesh"])
def evolve(mesh, config):
    ...
    return run

# Post-processing step: depends only on the run directory.
@workflow_step(produces=["run_summary"], requires=["run_directory"])
def summarise_run(run_directory, config):
    ...
    return summary

When you ask the runner to build("run_summary"), it walks the DAG, runs whatever steps are needed (and only those), and caches the results. Asking for the same product again in a later session hits the cache.
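The walk described above can be sketched in plain Python. This is a toy illustration only, not the WorkflowRunner implementation: toy_step, STEPS, CALLS and this build function are hypothetical names, the cache is a plain in-memory dict, and steps take no config argument.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a step registry: product name -> (requires, builder).
STEPS: Dict[str, tuple] = {}
CALLS: List[str] = []  # records which steps actually ran


def toy_step(produces: str, requires: tuple = ()):
    """Register a function as the builder of one named product."""
    def decorator(fn: Callable) -> Callable:
        STEPS[produces] = (tuple(requires), fn)
        return fn
    return decorator


@toy_step("mesh")
def create_mesh():
    CALLS.append("create_mesh")
    return "mesh-object"


@toy_step("run_directory", requires=("mesh",))
def evolve(mesh):
    CALLS.append("evolve")
    return f"run-from-{mesh}"


@toy_step("run_summary", requires=("run_directory",))
def summarise_run(run_directory):
    CALLS.append("summarise_run")
    return f"summary-of-{run_directory}"


def build(name: str, cache: dict) -> object:
    """Depth-first walk: build upstream products first, skip anything cached."""
    if name in cache:
        return cache[name]
    requires, fn = STEPS[name]
    inputs = [build(dep, cache) for dep in requires]
    cache[name] = fn(*inputs)
    return cache[name]


cache: dict = {}
build("run_directory", cache)  # runs create_mesh, then evolve
build("run_summary", cache)    # only summarise_run runs; the rest is cached
print(CALLS)
```

The second build call reuses the cached mesh and run directory, so each step appears in CALLS exactly once. The sketch assumes the graph is acyclic and does no cycle detection.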

The vocabulary

A small set of words you’ll see across every workflow. The same glossary appears at the bottom of each per-workflow user guide so non-technical readers don’t need to come here first.

Workflow

A Python module of @workflow_step-decorated functions plus a Pydantic config class. Every workflow exports the same shape, so once you’ve learned the mechanics of one (see the convection example), you can read any other.

Step

A single function decorated with @workflow_step, declaring what it produces (one or more named outputs) and requires (named outputs of upstream steps).

Product

A named, cached output of a step. The runner persists products through WorkflowProducts to a <output_dir>/products/ directory with type-aware serialisation:

Object type                    On-disk form
-----------------------------  ---------------------------------------------------
Mesh                           HDF5 + XDMF
MeshVariable                   HDF5 + XDMF
Surface / SurfaceCollection    VTK
ndarray                        NPZ
Run (run-directory)            recorded path; the directory itself is the artefact
Path (file artefact)           recorded path; the producer wrote the file
anything else                  YAML fallback
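One way to picture type-aware serialisation is a registry keyed by Python type with a fallback entry. The sketch below is an assumption about the general pattern, not the WorkflowProducts code: ToyMesh, ToyRun, SERIALISERS and on_disk_form are hypothetical names, and the strings only label the on-disk form rather than writing anything.

```python
from pathlib import Path, PurePath


class ToyMesh:
    """Hypothetical stand-in for a mesh object."""


class ToyAdaptedMesh(ToyMesh):
    """Subclass, used to show that dispatch follows inheritance."""


class ToyRun:
    """Hypothetical stand-in for a run-directory wrapper."""


# Hypothetical registry: Python type -> label of the on-disk form.
SERIALISERS = {
    ToyMesh: "hdf5+xdmf",
    ToyRun: "recorded path",
    PurePath: "recorded path",
}
FALLBACK = "yaml"


def on_disk_form(obj) -> str:
    """Return the label for the first registered class in the object's MRO."""
    for cls in type(obj).__mro__:
        if cls in SERIALISERS:
            return SERIALISERS[cls]
    return FALLBACK


print(on_disk_form(ToyAdaptedMesh()))
print(on_disk_form(Path("plots/nu_vs_ra.png")))
print(on_disk_form({"nu": 4.9}))
```

Walking the method resolution order means a subclass of a registered type is handled without its own registry entry, and anything unregistered drops through to the fallback.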

Cache key

A short hex digest derived from a product’s inputs (the relevant config fields plus upstream products’ cache keys). Two products with the same cache key are equivalent under deterministic producers; mismatched cache keys mean “the inputs changed, the cached version is stale, rebuild”. Stored next to each product in the manifest.
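A minimal sketch of such a digest, assuming SHA-256 over a canonical JSON payload truncated to 12 hex characters. The function name toy_cache_key, the payload layout and the digest length are illustrative choices here, not the package's actual cache_key computation.

```python
import hashlib
import json


def toy_cache_key(config_fields: dict, upstream_keys: list, length: int = 12) -> str:
    """Hash the relevant config fields together with upstream cache keys."""
    payload = json.dumps(
        {"config": config_fields, "upstream": sorted(upstream_keys)},
        sort_keys=True,  # canonical ordering so equal inputs give equal keys
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:length]


# A mesh product depends only on its own config field.
mesh_key = toy_cache_key({"mesh_resolution": 32}, [])
# A run depends on its own fields plus the mesh's key.
run_key = toy_cache_key({"rayleigh": 1e5}, [mesh_key])

# Changing the mesh resolution changes the mesh key, which changes the run key.
finer_mesh_key = toy_cache_key({"mesh_resolution": 64}, [])
finer_run_key = toy_cache_key({"rayleigh": 1e5}, [finer_mesh_key])

print(run_key != finer_run_key)
```

Because each key folds in the keys of its upstream products, a change near the root of the DAG changes the key of everything downstream of it.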

Identity fields

The subset of a config whose change should invalidate cached products. Mesh and physics fields are identity; step caps and tolerances aren’t. Declared on each WorkflowConfig subclass:

from underworld3.workflows import WorkflowConfig

class MyConfig(WorkflowConfig):
    _identity_fields = ("mesh_resolution", "rayleigh", "viscosity")
    ...

If _identity_fields is None (the default), all products fall back to existence-based caching — the legacy pre-cache_key behaviour, preserved for backward compatibility.
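The effect of identity fields can be illustrated without Pydantic. The sketch below uses a plain dataclass as a stand-in for a config class; ToyConfig, identity_fields and identity_key are hypothetical names, and returning None for "no identity declared" is an assumption made for the illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import ClassVar, Optional, Tuple


@dataclass
class ToyConfig:
    mesh_resolution: int = 32
    rayleigh: float = 1e5
    viscosity: float = 1.0
    max_steps: int = 500      # step cap: not identity
    tolerance: float = 1e-6   # solver tolerance: not identity

    # Hypothetical analogue of _identity_fields; None means "not declared".
    identity_fields: ClassVar[Optional[Tuple[str, ...]]] = (
        "mesh_resolution",
        "rayleigh",
        "viscosity",
    )

    def identity_key(self) -> Optional[str]:
        """Hash only the identity subset; None signals existence-based caching."""
        if self.identity_fields is None:
            return None
        subset = {name: getattr(self, name) for name in self.identity_fields}
        payload = json.dumps(subset, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


base = ToyConfig()
looser = ToyConfig(max_steps=10, tolerance=1e-3)  # same identity
hotter = ToyConfig(rayleigh=1e6)                  # different identity

print(base.identity_key() == looser.identity_key())
print(base.identity_key() == hotter.identity_key())
```

Changing a step cap or tolerance leaves the key unchanged, so cached products stay valid; changing the Rayleigh number produces a new key.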

Freshness

The property of a cached product whose cache key still matches what the current config expects. When the runner is asked for a product, it walks the DAG; products that are fresh hit the cache, products that are stale rebuild.
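A freshness check reduces to comparing the stored key with the expected one. In this sketch the manifest is a plain dict with a hypothetical layout (product name mapped to an entry holding "cache_key"), and is_fresh is an illustrative helper rather than part of the package.

```python
from typing import Optional


def is_fresh(manifest: dict, name: str, expected_key: Optional[str]) -> bool:
    """Decide whether a cached product can be reused."""
    entry = manifest.get(name)
    if entry is None:
        return False  # never built, so it must be built now
    if expected_key is None:
        return True   # existence-based fallback: being present is enough
    return entry.get("cache_key") == expected_key


# Hypothetical manifest contents.
manifest = {
    "mesh": {"cache_key": "a1b2c3d4e5f6"},
    "run_directory": {"cache_key": "0f9e8d7c6b5a"},
}

print(is_fresh(manifest, "mesh", "a1b2c3d4e5f6"))           # key matches
print(is_fresh(manifest, "run_directory", "ffffffffffff"))  # inputs changed
print(is_fresh(manifest, "run_summary", "123456789abc"))    # not built yet
```

The None branch mirrors the existence-based behaviour described for configs that declare no identity fields.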

Recipe

An example script that composes the workflow’s primitives — e.g. warm_start, ramp_rayleigh. Recipes live alongside the workflow code; they are not part of the public API. The discipline (Run/Manifest/etc. is API, recipes are example code) keeps the public surface small while letting per-workflow patterns live where they belong.

Run directory

The on-disk folder for one model run: manifest, h5 chain, timeseries.csv, summary. Wrapped by the Run class. For time-loop workflows, the run directory is itself a product (a run_directory typed entry in the products manifest).

How the pieces fit together

┌──────────────────────┐
│ WorkflowConfig       │  Parameters with types, bounds, identity
│ (Pydantic)           │  hashable via .cache_key()
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Workflow module      │  @workflow_step functions declaring
│ (your code)          │  produces/requires
└──────────┬───────────┘
           │ inspected by
           ▼
┌──────────────────────┐    ┌──────────────────────┐
│ WorkflowRunner       │◄──►│ WorkflowProducts     │
│ runner.build(...)    │    │ <output_dir>/products│
│ runner.dag()         │    │ /manifest.yaml       │
└──────────────────────┘    └──────────────────────┘

  • WorkflowConfig is your parameter object. Subclass it, declare _identity_fields (the fields whose change invalidates caches), and populate it with Pydantic Fields.

  • Your workflow module has @workflow_step-decorated functions with produces= and requires= declared.

  • WorkflowRunner(module, config, products=...) ties them together. runner.build("name") resolves the DAG, runs whatever is stale, caches results in memory and on disk.

  • WorkflowProducts is the on-disk persistence layer. Type registry handles save/load; cache keys handle freshness.

Two flavours of workflow

Workflows come in two structural shapes — same machinery, different patterns.

Time-loop workflow

A simulation that integrates forward in time. The dominant product is a run directory that grows incrementally (h5 chain plus timeseries.csv). Cached at two granularities:

  • Inner: the run directory’s own manifest tracks “is this run steady?” and short-circuits when re-invoked on the same directory with a matching identity hash.

  • Outer: the runner records the run directory as a product with a cache_key derived from inputs. Config changes propagate freshness to downstream products (summary, tables, plots, movies).

The convection example is a time-loop workflow.

Product-graph workflow

A pipeline of one-shot computations: mesh → adapted mesh → stress → permeability → … No time loop; each step’s output is built entirely or not at all. All caching is at the product granularity.

The H2Ex example is a product-graph workflow.

Recipes vs API

Some operations naturally compose the workflow’s primitives:

  • warm_start(source_dir, target_dir, **overrides) — start a new run from an existing converged run, optionally with different settings.

  • ramp_rayleigh(values, base_dir=...) — sequential warm-starts through a sequence of Ra values.

These are recipes — example scripts that compose Run.open / Run.load_field / Run.create / Run.append_step / WorkflowRunner.build. They live next to the workflow’s code, not in the underworld3.workflows package.
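The shape of a ramp recipe can be sketched as a loop in which each run starts from the previous one. Everything here is hypothetical: toy_ramp, the run_one callback and the step_NN directory naming are illustrative, and the callback stands in for whatever warm-start function a workflow provides.

```python
from pathlib import Path
from typing import Callable, List, Optional, Sequence


def toy_ramp(
    values: Sequence[float],
    base_dir: str,
    run_one: Callable[..., None],
) -> List[Path]:
    """Run each value in turn, seeding every run from the one before it."""
    previous: Optional[Path] = None
    targets: List[Path] = []
    for index, rayleigh in enumerate(values):
        target = Path(base_dir) / f"step_{index:02d}"
        # The first run has no source; later runs warm-start from `previous`.
        run_one(source_dir=previous, target_dir=target, rayleigh=rayleigh)
        previous = target
        targets.append(target)
    return targets


# A recording callback in place of a real simulation.
log = []


def fake_run(source_dir, target_dir, rayleigh):
    log.append((source_dir, target_dir, rayleigh))


dirs = toy_ramp([1e4, 1e5, 1e6], "runs", fake_run)
print(dirs)
```

Passing the runner in as a callback keeps the loop independent of any particular workflow, which is one reason such patterns can live as copyable recipes.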

The discipline: round-trip primitives (the methods and classes that are consumed by every workflow) are API. Composition patterns (warm_start, ramp, branch_run, ensemble_from_ic) stay recipes until three or more workflows have written nearly identical versions. Then they’re worth promoting to API.

This keeps the public surface small and the recipes flexible. You’re encouraged to copy a recipe into your workflow’s repository and adapt it; that’s the point.

Versioning

The workflow runtime is at underworld3.workflows.__api_version__ = "0.2" — pre-1.0. The shape of Run, Manifest, manifest schemas, and the cache_key computation may shift in 0.x. When a second consumer of the time-loop primitives (a fault-mechanics workflow, a subduction workflow) lands and exercises the API without major changes, we bump to 1.0 and commit to backward compatibility.

Each manifest carries a workflow_api field stamping the version that wrote it. Reading a manifest written before version stamping existed is non-fatal: Manifest.workflow_api returns None, and freshness checks fall back to existence-based caching.

What’s currently in the package

Public API surface (see API reference for detail):

  • Configuration: WorkflowConfig (Pydantic base), config_cache_key, config_snapshot.

  • Runtime: @workflow_step, WorkflowRunner, WorkflowProducts.

  • Run-directory primitives: Run, Manifest, RUN_NAME.

  • CLI helper: cli_from_config, config_from_args.

  • Diagram: diagram(module), render(module, output_path).

  • Discovery: view, list_workflows, init_workflow.

Out of scope on purpose:

  • Solver wrappers — workflows use UW3 solvers directly.

  • A “workflow GUI” — the package gives observers + DAG metadata so a UI layer can be built on top, but doesn’t ship one.

  • Generic time-loop scaffold (run_loop) — stays in the example until a second time-loop consumer makes the right callback shape obvious.

Where to go next