Stacks & Stages


Stages add support for persistent data via a named stack, plus basic push/pop/peek operators.

Any task associated with compose.mk is able to use a stack just by mentioning it, regardless of whether it is inside or outside of a container.

Stacks live on the filesystem, usually at the project root, and are JSON-backed. Under the hood, the push/pop/etc operators are implemented with jq and jb, using local tools if available but falling back to docker for a portable implementation. (For more background, see the docs on structured IO and proxy wrappers.)
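
To make the JSON-backed idea concrete, here is a minimal sketch of push and pop over a single file using jq. It is not the compose.mk implementation: the function names stack_push and stack_pop are hypothetical, and the layout (one JSON array per file, newest item last) is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a JSON-backed stack; NOT the compose.mk implementation.
# Assumption: the stack file holds one JSON array, with the newest item last.

stack_push() {  # usage: echo '<json>' | stack_push <file>
  local file="$1" tmp
  [ -s "$file" ] || echo '[]' > "$file"   # create an empty stack on first use
  tmp="$(mktemp "$file.XXXXXX")"          # temp file next to the stack file
  # stdin is the new item; $stack[0] is the array already stored in the file
  jq -c --slurpfile stack "$file" '$stack[0] + [.]' > "$tmp" && mv "$tmp" "$file"
}

stack_pop() {   # usage: stack_pop <file>; prints the popped item as compact JSON
  local file="$1" tmp
  [ -s "$file" ] || echo '[]' > "$file"
  jq -c '.[-1]' "$file"                   # newest item, or null for an empty stack
  tmp="$(mktemp "$file.XXXXXX")"
  jq -c '.[:-1]' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage: push one item, then pop it back off
demo="$(mktemp -d)/stack.json"
echo '{"data":789}' | stack_push "$demo"
stack_pop "$demo"                         # prints {"data":789}
```

Renaming a temp file created in the same directory means a reader never sees a half-written stack, although two concurrent writers can still race with each other.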

Other than the datastructure itself, there are few restrictions on how that data is used, or where it's used from. By default, there are no isolation guarantees at the level of PIDs, parent PIDs, or containers. However, isolation per compose.mk-managed project is usually implicit, because a stage is uniquely determined by the stage name and the working directory when the stage is used.
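
As a sketch of why that naming gives per-project isolation but nothing finer, the helper below joins a working directory and a stage name. The function stage_file is hypothetical and not part of compose.mk; the .flux.stage.<NAME> file name is taken from the log output shown in the stand-alone examples.

```shell
# Hypothetical helper, not part of compose.mk: where a stage's stack file lands.
# Assumption: the file is named .flux.stage.<NAME> inside the working directory,
# matching the "stack file @ .flux.stage.BUILD" log line in this document.
stage_file() {  # usage: stage_file <stage-name> [working-dir]
  printf '%s/.flux.stage.%s\n' "${2:-$PWD}" "$1"
}

stage_file BUILD /srv/project-a   # prints /srv/project-a/.flux.stage.BUILD
stage_file BUILD /srv/project-b   # prints /srv/project-b/.flux.stage.BUILD
```

Two projects using the same stage name get different files, while any two processes in the same working directory share one.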

Since access to the stage-stack should "just work" from the docker host or a container, and works inside or outside the particular subshell for any existing make-target, this adds a lot of missing flexibility for "crosswise" communication that task-DAGs and pipelining would usually struggle with.

The point of stacks and stages is to have something small, fast, and light that you can use out of the box or build on top of. Stage-stacks in compose.mk are a project-local datastructure that feels native, is basically dependency free, and is actually kind of scalable. After all, jq will happily chug through a very large pile of JSON, and if file IO becomes a bottleneck, switching to shared memory is straightforward.

Stage Basics


At a high level, these are the rules for stages:

  • Once entered, a stage lives forever unless explicitly exited.
  • The lifetime of a stage-stack is just the lifetime of the stage.
  • Stages may be re-entered from anywhere without an exit.
  • Entering and exiting stages leaves time-stamped events inside the stack.
  • Any reference to a non-existent stage enters the stage implicitly, creating the stage-stack.

Stack-operations are logged by default to stderr, so it's easy for users or devs to keep track of the stack state. Stages also automatically get a banner / visual divider on entry, helpful for making automation output more readable.

You can use stages via stand-alone mode or programmatically from a script, and either way it looks pretty similar.

Stages in Stand Alone Mode


Creating a stack is as simple as mentioning it.

$ ./compose.mk flux.stage.enter/BUILD

╔═══════════════════════════════════════════════════════════╗
║                           BUILD                           ║
╚═══════════════════════════════════════════════════════════╝
⇄ flux.stage // BUILD //  stack file @ .flux.stage.BUILD
Φ flux.stage.push // BUILD  io.stack.push // stack@.flux.stage.BUILD   {"stage.entered":"Mon 24 Feb 2025 08:53:39 PM PST"}

Pushing data works with pipes:

$ echo '{"data":789}' | ./compose.mk flux.stage.push/BUILD

Φ flux.stage.push // BUILD  io.stack.push // stack@.flux.stage.BUILD   {"data":789}

You can also use the builtin wrapper for jb like this:

$ ./compose.mk jb foo=bar | ./compose.mk flux.stage.push/BUILD

Popping data emits pipe-safe JSON for downstream consumers, while stderr is annotated for humans.

$ ./compose.mk flux.stage.pop/BUILD
Φ flux.stage.pop // BUILD
⇄ io.stack.pop // stack@.flux.stage.BUILD {
  "data": 789
}

$ ./compose.mk flux.stage.pop/BUILD
Φ flux.stage.pop // BUILD
⇄ io.stack.pop // stack@.flux.stage.BUILD {
  "stage.entered": "Mon 24 Feb 2025 08:53:39 PM PST"
}

$ ./compose.mk flux.stage.exit/BUILD
Φ flux.stage.clean // BUILD //  removing stack file @  .flux.stage.BUILD

Stage Idioms


If you're scripting with stages, the main difference is that flux.stage.enter and flux.stage.exit can be used like function decorators:

#!/usr/bin/env -S make -f
# demos/stages-idiom.mk: 
#
#   Demonstrating stages, stacks, and artifact-related features of compose.mk
#   Part of the `compose.mk` repo. This file runs as part of the test-suite.  
#   See the docs for more discussion: https://robot-wranglers.github.io/compose.mk/stages
#
#   USAGE: ./demos/stages-idiom.mk

include compose.mk
.DEFAULT_GOAL := validate

# Wrap targets with entry/exit as an explicit context-manager
validate: \
    flux.stage.enter/VALIDATION \
        project.scan \
        project.analyze \
    flux.stage.exit/VALIDATION

project.scan:
    echo '["results"]' | ./compose.mk flux.stage.push/VALIDATION

project.analyze:
    echo '["other results"]' | ./compose.mk flux.stage.push/VALIDATION

In this case, when validate succeeds the stage file is cleaned up like we'd expect, and if there's a crash then the partial results can be inspected. There are various ways to write this, like replacing calls to ./compose.mk with ${make} expansions, or using macros instead of targets, but the idea is the same.


Using an implicit context manager with flux.stage.wrap/<target> is an alternative style. The example below shows that, plus overriding the banner-printing mechanism to use the (dockerized) figlet tool instead of using gum.

#!/usr/bin/env -S make -f
# demos/stage-wrapper.mk: 
#   Demonstrating stages, stacks, and artifact-related features of compose.mk
#   Part of the `compose.mk` repo. This file runs as part of the test-suite.  
#
#   See the docs for more discussion: https://robot-wranglers.github.io/compose.mk/stages
#
#   USAGE: ./demos/stage-wrapper.mk

include compose.mk

project.scan:
    echo '["results"]' | ./compose.mk flux.stage.push/VALIDATION

project.analyze:
    echo '["other results"]' | ./compose.mk flux.stage.push/VALIDATION

# Override the default target used to print the entry-banner
export banner_target?=io.figlet

__main__: flux.stage.wrap/INIT/project.scan,project.analyze


Another source of usage hints about stages is the test-suite, so it's included below for reference.

If you are scripting, note that using ${@} as shorthand for "current target name" is a good thing to organize around, since it reduces typing and typos. Another idea is to parse out a prefix from the current target name, thus ensuring that related targets all use the same stack.

#!/usr/bin/env -S make -f
# demos/stages.mk: 
#   Demonstrating stages, stacks, and artifact-related features of compose.mk
#
# Part of the `compose.mk` repo. This file runs as part of the test-suite.  
# See the docs for more discussion: https://robot-wranglers.github.io/compose.mk/stages
# USAGE: ./demos/stages.mk

include compose.mk

# disable gum usage by overriding the default target for printing banners
export banner_target?=io.print.banner

__main__: flux.star/test.stage

test.stage.basic:
    @# Note that ${@} is shorthand for "current target name"-- 
    @# we use that for the stage name everywhere
    $(call log.test, declare a stage & get stage name back)
    ./compose.mk flux.stage/${@} flux.stage 

    $(call log.test, stage stack should exist still with legal JSON if not explicitly exited)
    ls .flux.stage.${@} && cat .flux.stage.${@} | ${jq} -e .

    $(call log.test, exiting the stage removes the stack file)
    ${make} flux.stage.exit/${@}
    ! ls .flux.stage.${@} 2>/dev/null

    $(call log.test, using a stage by pushing data causes stack to exist)
    ${jb} one=1 | ./compose.mk flux.stage.push/${@} 
    ls .flux.stage.${@} 2>/dev/null
    ${jb} two=2 | ./compose.mk flux.stage.push/${@} 

    $(call log.test, getting the whole stack is possible and returns JSON)
    ./compose.mk flux.stage.stack/${@} | ${jq} -e .
    ${make} flux.stage.exit/${@}

    $(call log.test, testing popping JSON data off the stack)
    ${jb} foo=bar | ./compose.mk flux.stage.push/${@} 
    ./compose.mk flux.stage.stack/${@} | ${jq} .
    ./compose.mk flux.stage.pop/${@} | ${stream.peek} | ${jq} -e -r .foo

    ${make} flux.stage.exit/${@}

test.stage.empty:
    $(call log.test, popping an empty stack is also allowed)
    ./compose.mk flux.stage.pop/${@}
    ./compose.mk flux.stage.pop/${@}
    ./compose.mk flux.stage.pop/${@}
    ./compose.mk flux.stage.pop/${@}
    ${make} flux.stage.exit/${@}
    ./compose.mk flux.stage.pop/${@}
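
As a sketch of the prefix idea mentioned above, the snippet below writes a throwaway Makefile from the shell and runs two related targets. The target names build.scan and build.report and the stage variable are hypothetical; only standard GNU make functions (subst, firstword) are used, and nothing here calls compose.mk itself.

```shell
# Hypothetical demo: derive a shared stage name from the prefix of the target name.
# Requires GNU make; the target names and the `stage` variable are made up.
tmp="$(mktemp -d)"
cat > "$tmp/Makefile" <<'EOF'
# Everything before the first dot in the target name becomes the stage name.
stage = $(firstword $(subst ., ,$@))
build.scan build.report: ; @echo "$@ uses stage $(stage)"
EOF
make -s -f "$tmp/Makefile" build.scan build.report
# prints:
#   build.scan uses stage build
#   build.report uses stage build
```

Because stage is a recursively expanded variable, ${@} is resolved separately inside each recipe, so related targets end up agreeing on one stage name without repeating it.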

See the implementation details below for quick links to the full API.

Support for Non-JSON Artifacts?


For non-JSON artifacts, output typically doesn't need to be input somewhere else in the same pipeline, and you can just dump the content on the filesystem anywhere you like as usual because tool containers share the project directory.

So, unless you want build pipelines, there's little advantage to putting compose.mk in the middle. Plus, while jamming base64 binary data directly into the JSON isn't really recommended.. ipynb files do it all the time! :)

If you think you do have a use-case for something like this you might want to look at the packaging docs, which can at least help with creating archives.

Stage API


Stages are implemented using the flux.stage.* family of targets, and the lower level io.stack helpers.

Why Stacks, though?


Nothing prevents further downstream slicing with jq, so stacks as presented thus far can be easily changed or extended to do LIFO/FIFO, express KV stores, etc. Any changes along these lines can usually be completely expressed in jqlang, and should be fairly fast/atomic. [1]
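
A few jq expressions illustrate that kind of slicing. This is a sketch under the assumption that a stack serializes as a JSON array with the newest item last; the sample data and function names are made up, and only core jq filters are used.

```shell
# Hypothetical sample data; assumes a stack is a JSON array, newest item last.
stack='[{"a":1},{"b":2},{"a":3}]'

lifo_peek() { jq -c '.[-1]'; }   # stack view: newest item
fifo_peek() { jq -c '.[0]'; }    # queue view: oldest item
kv_view()   { jq -cS 'add'; }    # KV view: merge objects, later pushes win

echo "$stack" | lifo_peek   # prints {"a":3}
echo "$stack" | fifo_peek   # prints {"a":1}
echo "$stack" | kv_view     # prints {"a":3,"b":2}
```

Each view is a read-only filter over the same data, so the choice between stack, queue, and key-value semantics can be made by the consumer rather than baked into the file.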

Practically speaking, tools like terraform, helm, kubectl and ansible all emit JSON, and you frequently need the output of one as the input of another. Without stages, compose.mk still makes that relatively easy, but pure pipelining means it's hard to reuse values, or you need to track a temp file, etc.

More philosophically, native support for stacks gives compose.mk a new capability, in terms of stuff like programming paradigms. The stack is also a convenient way to communicate between polyglots and across matrioshka layers. And, for what it's worth, now you can pretty directly architect your automation in terms of pushdown automata or queue machines :)

References



  1. Nope, this isn't perfectly thread safe.. but (famous last words) YAGNI