Skip to content

HTTP middleware using WASIp3#3384

Draft
itowlson wants to merge 2 commits intospinframework:mainfrom
itowlson:http-middleware-final-final-v2
Draft

HTTP middleware using WASIp3#3384
itowlson wants to merge 2 commits intospinframework:mainfrom
itowlson:http-middleware-final-final-v2

Conversation

@itowlson
Copy link
Collaborator

@itowlson itowlson commented Feb 3, 2026

Fixes #3291.

This currently includes #3383; once that lands I will rebase. For now just ignore the wit directory. Done!

Example usage: https://github.com/itowlson/spin-middleware-terrifying-nonsense

Setting it up in spin.toml:

[[trigger.http]]
route = "/..."
component = "middleware-terrifying-nonsense"
components.middleware = ["middlybiddly", "middlybiddly2"]
executor = { type = "wasip3-unstable" }

Implementing a middleware component in Rust (today: we'd presumably provide a wrapper for the onbound handle import):

#[spin_sdk::http_wasip3::http_service]
async fn handle(mut request: Request) -> impl IntoResponse {
    munge(&mut request);
    wasi::http0_3_0_rc_2026_01_06::handler::handle(request.into_request().unwrap()).await
    // could process the response here if needed
}

Needs lots of renaming, tests, etc. but anyway getting it into the system. (and yes, yes, I'll squash the commits)

@itowlson itowlson force-pushed the http-middleware-final-final-v2 branch 2 times, most recently from 76ccf41 to 0371b3c Compare February 4, 2026 22:03
@fibonacci1729 fibonacci1729 moved this to Backlog in Spin 4.0 Feb 10, 2026
@itowlson itowlson force-pushed the http-middleware-final-final-v2 branch 2 times, most recently from 6d30b34 to 654d07b Compare February 15, 2026 21:17
@itowlson
Copy link
Collaborator Author

Updated to rework OCI precomposition. The way I did it before resulted in two code paths which were essentially the same but couldn't be exactly the same. Now there is only one code path and it is exactly the same as itself.

I am getting increasingly fearful that I am getting lost in a labyrinth of my own changes and although it all seems to work for me I would really value somebody else looking at it and kicking the tyres because what if I am not testing it right. I was sure that this last rework had broken something, and yet seemingly not, which makes me worry whether I am testing what I think I'm testing.

@fibonacci1729 fibonacci1729 moved this from Backlog to In progress in Spin 4.0 Feb 24, 2026
@fibonacci1729 fibonacci1729 self-requested a review February 25, 2026 20:06
Signed-off-by: itowlson <ivan.towlson@fermyon.com>
@itowlson itowlson force-pushed the http-middleware-final-final-v2 branch from 867c0be to c08e5d1 Compare March 4, 2026 01:33
@itowlson
Copy link
Collaborator Author

itowlson commented Mar 4, 2026

One thing that is troubling me with all this is isolation behaviour. Like if you want a middleware to call an auth provider, then it needs network permissions. So:

  1. the main component has to declare network permissions
  2. the middleware then inherits all the component's network permissions

We can probably work around 1 in the increasingly terrifying normalisation step but 2 is a gazillion times more complicated. The only way around that today is to use a service chaining approach where it passes through the host rather than being directly composed. (Which is do-able, but a non-trivial rewrite, and we'd need to check that this would work with downstream hosts who don't support service chaining.)

@fibonacci1729 fibonacci1729 requested a review from a team March 5, 2026 17:42
Copy link
Collaborator

@fibonacci1729 fibonacci1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early comments.

if self.precompose_only {
let Some(precompose_component_id) = self.precompose_component_id.as_ref() else {
anyhow::bail!("got --precompose-only but no --precompose-component-id");
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this added to aid in the research? I'm not sure what's happening here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used by the OCI subsystem during precomposition. spin registry push needs to precompose Wasm binaries for cases like SpinKube where composition at load time isn't available. But the OCI subsystem doesn't know how to deal with HTTP middleware because that's an HTTP concern, not an OCI concern. So when the OCI subsystem sees complications in a trigger, such as HTTP middleware, it issues issues a command to the relevant trigger saying "hey, you're so smart, you precompose this for me."


/// An object which composes extras onto the primary component.
/// TODO: the combination of functions and objects and traits is a bit funny and we may/should be able to streamline it.
fn complicator() -> impl spin_factors_executor::Complicator {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know complicator is a placeholder so I'll suggest we use something generic here like fn compose_extras() -> impl spin_factors_executor::ComposeExtras.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think I originally had "extras" but if anything I hated that even more than "complication." I am not sure what a good name would be!

middleware_blobs: impl Iterator<Item = &'a ComplicationData>,
) -> anyhow::Result<Vec<u8>> {
const MW_NEXT_INBOUND: &str = "wasi:http/handler@0.3.0-rc-2026-01-06";
const MW_NEXT_OUTBOUND: &str = "wasi:http/handler@0.3.0-rc-2026-01-06";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we unify these into a single const MW_HANDLER_INTERFACE: &str = "wasi:http/handler@0.3.0-rc-2026-01-06";?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably can. Originally they were different, and I guess I wanted to retain clarity about which context we were in at any given point in case we ever needed to prise them apart again. But yeah that does seem unlikely.

/// but with different middleware. In this case, we will synthesise a component
/// for each such trigger, with the same main configuration but with its own
/// "extra" components.
fn reassign_extra_components_from_triggers(mut locked: LockedApp) -> LockedApp {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can avoid cloning the locked app if we break this up into 3 parts:

  1. Collect a lightweight TriggerInfo struct containing the trigger_id, component_id and whether "extras" are present.
  2. Use those to determine splitting, then clone only the individual components that need it.
  3. Move extras from triggers to components.

Something like (untested):

/// We want all component/composition graph information to be in the component,
/// because the component ID is how Spin looks this stuff up. So if a trigger
/// contains a `components` table, e.g. specifying middleware, we want to move
/// that to the component.
///
/// But it's possible to have two triggers pointing to the same primary component,
/// but with different middleware. In this case, we will synthesise a component
/// for each such trigger, with the same main configuration but with its own
/// "extra" components.
fn reassign_extra_components_from_triggers(mut locked: LockedApp) -> LockedApp {
    use std::collections::{HashMap, HashSet};

    fn trigger_component_id(trigger: &LockedTrigger) -> Option<String> {
        trigger
            .trigger_config
            .get("component")
            .and_then(|v| v.as_str())
            .map(|s| s.to_string())
    }

    fn extra_components(trigger: &LockedTrigger) -> Option<&ValuesMap> {
        trigger
            .trigger_config
            .get("components")
            .and_then(|v| v.as_object())
    }

    fn has_extra_components(trigger: &LockedTrigger) -> bool {
        extra_components(trigger).is_some_and(|xcs| !xcs.is_empty())
    }

    // Collect lightweight trigger metadata to determine which components
    // need splitting. This avoids cloning the entire LockedApp.
    struct TriggerInfo {
        trigger_id: String,
        component_id: String,
        has_extras: bool,
    }

    let trigger_infos: Vec<TriggerInfo> = locked
        .triggers
        .iter()
        .filter_map(|t| {
            trigger_component_id(t).map(|cid| TriggerInfo {
                trigger_id: t.id.clone(),
                component_id: cid,
                has_extras: has_extra_components(t),
            })
        })
        .collect();

    // Group triggers by component ID, find components that need splitting
    // (multiple triggers reference the same component AND at least one has extras).
    let mut cid_triggers: HashMap<&str, Vec<&TriggerInfo>> = HashMap::new();
    for info in &trigger_infos {
        cid_triggers
            .entry(&info.component_id)
            .or_default()
            .push(info);
    }

    let mut seen = HashSet::new();
    let mut disambiguator = 0;

    // For components referenced by multiple triggers where at least
    // one has extras, create synthetic component clones so each trigger gets
    // its own composition graph.
    let needs_splitting: Vec<_> = cid_triggers
        .into_iter()
        .filter(|(_, triggers)| {
            triggers.len() > 1 && triggers.iter().any(|t| t.has_extras)
        })
        .collect();

    for (cid, triggers) in &needs_splitting {
        for info in triggers {
            if !info.has_extras {
                // Unenriched triggers can continue pointing to the original component.
                continue;
            }
            let mut synthetic_id = format!("{cid}-for-{}", info.trigger_id);
            if seen.contains(&synthetic_id) {
                disambiguator += 1;
                synthetic_id = format!("{synthetic_id}-d{disambiguator}");
            }
            seen.insert(synthetic_id.clone());

            let component = locked
                .components
                .iter()
                .find(|c| c.id == **cid)
                .expect("trigger references non-existent component")
                .clone();

            let mut synthetic = component;
            synthetic.id = synthetic_id.clone();
            locked.components.push(synthetic);

            // Update the trigger to point to the new synthetic component
            let trigger = locked
                .triggers
                .iter_mut()
                .find(|t| t.id == info.trigger_id)
                .expect("trigger disappeared during splitting");
            trigger
                .trigger_config
                .as_object_mut()
                .expect("trigger config should be an object")
                .insert("component".into(), synthetic_id.into());
        }
    }

    // Move extras from triggers onto their respective components.
    // Now each enriched trigger has its own component, so composition graphs
    // are uniquely identified by component ID.
    for trigger in &mut locked.triggers {
        if let Some(extras) = extra_components(trigger) {
            if let Some(component_id) = trigger_component_id(trigger) {
                if let Some(component) = locked.components.iter_mut().find(|c| c.id == component_id)
                {
                    component
                        .metadata
                        .insert("trigger-extras".into(), extras.clone().into());
                    component.metadata.insert(
                        "resolve-extras-using".into(),
                        trigger.trigger_type.clone().into(),
                    );
                    trigger
                        .trigger_config
                        .as_object_mut()
                        .expect("trigger config should be an object")
                        .remove("components");
                }
            }
        }
    }

    locked
}

.await
.unwrap();
assert_eq!(4, locked_app.triggers.len());
// Splitting should resultm in triggers pointing to different IDs, but the same primary source
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Splitting should resultm in triggers pointing to different IDs, but the same primary source
// Splitting should result in triggers pointing to different IDs, but the same primary source

for component_ref in components {
let component_ref = component_ref
.as_str()
.context("middleware should be strings curently")?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.context("middleware should be strings curently")?;
.context("middleware should be strings currently")?;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also i might say: "middleware references should be strings currently"

async fn load_source(&self, source: &Self::Source) -> anyhow::Result<Vec<u8>> {
let path = self
.wasm_loader
.load_component_source("bippety-boppety", source)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.load_component_source("bippety-boppety", source)
.load_component_source("<extra-component>", source)

assert!(t.trigger_config.get("components").is_none());
}

println!("{}", serde_json::to_string_pretty(&locked_app).unwrap());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: delete this println!

let value = toml::Value::deserialize(deserializer)?;
if let Ok(val) = T::deserialize(value.clone()) {
Ok(vec![val])
if let Some(arr) = value.as_array() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe delete the old path commented out below and add a note here:

// NOTE: We explicitly check for array first rather than trying T::deserialize
// first, because toml's serde impl will treat an array as a sequence of fields
// to be assigned to struct members (e.g. Component), producing nonsensical results.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonsensical

would you compromise on "the valuable guano of our much-admired chiropterans"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer it, actually.😂


let temp_dir = tempfile::tempdir().unwrap();
let working_dir = temp_dir.path();
let locked_url = write_locked_app(&locked, working_dir).await.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these .unwrap()s safe? Or should we .context(...)? here?

@fibonacci1729
Copy link
Collaborator

fibonacci1729 commented Mar 5, 2026

One thing that is troubling me with all this is isolation behaviour. Like if you want a middleware to call an auth provider, then it needs network permissions. So:

1. the _main_ component has to declare network permissions

2. the middleware then inherits _all_ the component's network permissions

We can probably work around 1 in the increasingly terrifying normalisation step but 2 is a gazillion times more complicated. The only way around that today is to use a service chaining approach where it passes through the host rather than being directly composed. (Which is do-able, but a non-trivial rewrite, and we'd need to check that this would work with downstream hosts who don't support service chaining.)

Capability isolation remains a thorn for composition. We either need to accept that middlewares inherit capability from the primary component (or alternatively deny-all by default via some adapter like we do for dependencies with an opt-in) or what you suggest with service-chaining (which I prefer less for the reason you postulate).

}
}

async fn compose_middlewares<'a>(
Copy link
Collaborator

@fibonacci1729 fibonacci1729 Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to just push a commit but i would simplify this to use wac-graph (addressing the TODO). You can delete the wasm-compose, wasmparser and tempfile dependencies using this.

Note: I added chain as a general purpose thing thinking it would be useful to upstream but for now it can live here.

/// Chain a list of component packages into a middleware pipeline.
///
/// `packages` is ordered from **outermost** (first to receive a request) to
/// **innermost** (the final handler).  Every component except the last must
/// import a name equal to `import_name` and every component must export a name
/// equal to `export_name`.  In the common middleware pattern these are the same
/// (e.g. both `"handle"`), but they can differ if the WIT uses separate names.
///
/// Returns the [`NodeId`] of the alias for the outermost component's export,
/// ready to be passed to [`CompositionGraph::export`].
///
/// # Errors
///
/// Returns an error if fewer than two packages are provided, or if any
/// alias / argument wiring step fails.
fn chain(
    graph: &mut CompositionGraph,
    packages: &[PackageId],
    import_name: &str,
    export_name: &str,
) -> anyhow::Result<NodeId> {
    if packages.len() < 2 {
        bail!("chain requires at least 2 packages, got {}", packages.len());
    }

    // Start from the innermost component (last in the list) and work outward.
    // The innermost component is instantiated first with no wiring — its
    // unsatisfied imports (if any) will become implicit imports of the
    // composed component.
    let mut iter = packages.iter().rev();
    let innermost = *iter.next().unwrap();
    let mut instance = graph.instantiate(innermost);
    let mut upstream_handle = graph.alias_instance_export(instance, export_name)?;

    // For each remaining component (moving outward), instantiate it and
    // wire the previous component's export into its import.
    for &pkg in iter {
        instance = graph.instantiate(pkg);
        graph.set_instantiation_argument(instance, import_name, upstream_handle)?;
        upstream_handle = graph.alias_instance_export(instance, export_name)?;
    }

    Ok(upstream_handle)
}

async fn compose_middlewares<'a>(
    primary: Vec<u8>,
    middleware_blobs: impl Iterator<Item = &'a ComplicationData>,
) -> anyhow::Result<Vec<u8>> {
    const MW_HANDLER_INTERFACE: &str = "wasi:http/handler@0.3.0-rc-2026-01-06";

    let mut graph = CompositionGraph::new();
    let mut package_ids: Vec<PackageId> = Vec::new();

    // Register middleware packages (outermost → innermost order).
    for (index, blob) in middleware_blobs.enumerate() {
        let bytes: Vec<u8> = match blob {
            ComplicationData::InMemory(data) => data.clone(),
            ComplicationData::OnDisk(path) => tokio::fs::read(path)
                .await
                .with_context(|| format!("reading middleware from {}", path.display()))?,
        };
        let name = format!("middleware{index}");
        let package = Package::from_bytes(&name, None, bytes, graph.types_mut())
            .context("parsing middleware component")?;
        package_ids.push(graph.register_package(package)?);
    }

    // Register the primary component (innermost in the chain).
    let package = Package::from_bytes("primary", None, primary, graph.types_mut())
        .context("parsing primary component")?;
    package_ids.push(graph.register_package(package)?);

    // Wire the pipeline: outermost middleware → … → primary.
    let outermost_export = chain(
        &mut graph,
        &package_ids,
        MW_HANDLER_INTERFACE,
        MW_HANDLER_INTERFACE,
    )?;

    // Export the outermost handler as the composed component's export.
    graph.export(outermost_export, MW_HANDLER_INTERFACE)?;

    Ok(graph.encode(Default::default())?)
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @fibonacci1729 - this is so much clearer and well-structured than what I had - I really appreciate you taking the time to figure it out and share it!

Signed-off-by: itowlson <ivan.towlson@fermyon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

WASI P3 HTTP middleware

2 participants