@spacebear21 (Collaborator) commented Dec 15, 2025

This PR introduces a unified payjoin-service binary that combines the OHTTP relay and payjoin directory services, as discussed in https://github.com/orgs/payjoin/discussions/775 and tracked in #941.

This approach refactors the relay and directory as tower Services and routes requests to those existing services based on URL path discrimination, which introduces payjoin-service with minimal code changes. The idea is to merge this PR ~as-is as a foundation, then fold individual components into payjoin-service one by one in follow-ups.

Much of this PR was written by Claude, with close supervision and scrutiny by me. I smoke-tested the binary with some basic curl to ensure it was routing properly, and updated payjoin-test-utils directory and relay test services to instances of payjoin-service. This way, we'll have some regression testing already in place as we fold more components into payjoin-service.

This is a preparatory refactor ahead of introducing the unified
`payjoin-service`.

This makes ohttp relay modular as a tower Service, in preparation for
the unified payjoin-service.

Accept any Body implementation instead of only hyper::body::Incoming.
This enables integration with axum and other frameworks that use
different body types.

Replace hyper_tungstenite with manual WebSocket upgrade handling
since hyper_tungstenite::upgrade() requires Request<Incoming>.
The generic hyper::upgrade::on() combined with tokio_tungstenite
provides equivalent functionality with generic body support.

This introduces the payjoin-service binary crate, which lives outside of
the workspace for now to enable independent testing and Cargo.lock
changes without causing conflicts.

@coveralls (Collaborator) commented Dec 15, 2025

Pull Request Test Coverage Report for Build 20284095851

Details

  • 224 of 297 (75.42%) changed or added relevant lines in 9 files are covered.
  • 32 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.4%) to 82.967%

Changes Missing Coverage          Covered Lines   Changed/Added Lines   %
payjoin-test-utils/src/lib.rs     35              36                    97.22%
ohttp-relay/src/bootstrap/ws.rs   35              41                    85.37%
ohttp-relay/src/lib.rs            46              54                    85.19%
payjoin-directory/src/lib.rs      27              37                    72.97%
payjoin-service/src/main.rs       0               12                    0.0%
payjoin-service/src/config.rs     0               17                    0.0%
payjoin-service/src/lib.rs        68              87                    78.16%

Files with Coverage Reduction     New Missed Lines   %
ohttp-relay/src/lib.rs            1                  80.35%
payjoin-directory/src/lib.rs      31                 48.22%

Totals
Change from base Build 20280996222: -0.4%
Covered Lines: 9854
Relevant Lines: 11877

💛 - Coveralls

@nothingmuch (Collaborator) commented

concept ACK.

IMO we should also let go of all of the config/cli boilerplate coming from the directory (which in turn borrows from payjoin-cli), since that mainly grew with not breaking compatibility in mind. A clean slate approach would be simpler and cleaner, and we can take our time with it.

I think main.rs can be removed entirely from this PR, focusing on making this a library crate that can be cleanly used in our test utils crate.

Then for deployment, a minimal main.rs can be added in a followup PR, tailored for container usage only.

The majority of the boilerplate in the existing code arises from the interface between clap and config. Changing this code in a way that actually makes sense is tricky because of the many interactions between the different layers, but the CLI API was mainly designed around the existing environment variables.

So I guess the next thing to figure out is how to configure such a minimal main.rs. The config crate with only a mandatory config file is easiest on the Rust side but may be a hassle to set up with e.g. docker compose. The config crate using only environment variables may be a better fit.

And we should probably look at how other projects handle this stuff.

@spacebear21 force-pushed the unified-payjoin-service branch from 634793b to b007070 on December 16, 2025 03:57

To introduce payjoin-service, it can simply route requests to the
ohttp-relay or payjoin-directory sub-services based on URL path
discrimination. Individual components (e.g. health checks, metrics...)
can then be migrated to axum in follow-ups, and use `tower` middleware
where appropriate to reduce boilerplate.

@spacebear21 force-pushed the unified-payjoin-service branch from b007070 to d6fe5ef on December 16, 2025 21:43

This replaces the direct dependencies on ohttp-relay and
payjoin-directory with a dependency on payjoin-service. The test
services still spin up two instances of the payjoin-service to simulate
a relay and directory running on isolated infrastructure.

@spacebear21 force-pushed the unified-payjoin-service branch from d6fe5ef to dc01a20 on December 16, 2025 22:01

@spacebear21 marked this pull request as ready for review on December 16, 2025 22:17

@spacebear21 (Collaborator Author) commented

Good points about the cli boilerplate. Instead of removing main.rs entirely I stripped it to a minimal binary which accepts a single optional argument for the config file path (inspired by cdk-mintd).
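
For illustration, the argument handling of such a minimal binary might look roughly like the sketch below. It is not the code from this PR: the function name `config_path_from_args` and the fallback file name `config.toml` are assumptions made for the example.

```rust
use std::path::PathBuf;

/// Hypothetical default; the PR does not state which file name is used.
const DEFAULT_CONFIG_PATH: &str = "config.toml";

/// Returns the config file path from the single optional positional argument.
/// `args` is expected to start with the program name, like `std::env::args()`.
pub fn config_path_from_args<I>(args: I) -> Result<PathBuf, String>
where
    I: IntoIterator<Item = String>,
{
    // Skip the program name.
    let mut args = args.into_iter().skip(1);
    let path = args
        .next()
        .unwrap_or_else(|| DEFAULT_CONFIG_PATH.to_string());
    // Reject anything beyond one argument instead of silently ignoring it.
    if let Some(extra) = args.next() {
        return Err(format!("unexpected extra argument: {extra}"));
    }
    Ok(PathBuf::from(path))
}
```

In a real `main`, this would be called as `config_path_from_args(std::env::args())`.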

Since your last review, the latest push contains the above change, moves all the core logic to lib.rs, adds manual TLS support for tests, and replaces the payjoin-directory + ohttp-relay direct dependencies in test utils with two instances of payjoin-service.

@spacebear21 requested a review from zealsham on December 16, 2025 22:22

@zealsham (Collaborator) commented

are we opting to keep hyper @nothingmuch ? asking because the initial plan was to move away from hyper and do all of our http stuff with axum

@nothingmuch (Collaborator) commented

> are we opting to keep hyper @nothingmuch ? asking because the initial plan was to move away from hyper and do all of our http stuff with axum

@spacebear21 said that the challenges with hyper/tower service traits were easy enough to fix, so the original motivation for short cutting to axum might not be as strong as we thought (recall that originally i thought it would be easier to first use only the tower service trait and then integrate axum, but that was causing some headaches)

IMO there's no need to keep hyper: it's more low-level than we need and has no native concept of routers, which forces us to write more boilerplate. But if still allowing the hyper Service to be used is easy and doesn't get in the way (which kinda makes sense since axum uses it under the hood anyway, and both in turn rely on tower's service trait IIRC), then we don't need to remove it

@spacebear21 (Collaborator Author) commented

@zealsham Indeed, instead of rewriting all the routing from scratch in axum I figured we could keep the existing logic in payjoin-directory/ohttp-relay mostly untouched while we introduce payjoin-service for the unified binary. In follow-ups we can rewrite individual components as tower services in chunks as we see fit, and softly deprecate hyper. For example, we might want to start with a metrics service via axum middleware, replacing the current payjoin-directory /metrics endpoint. It should also be possible to do this work in parallel with other components.

@spacebear21 (Collaborator Author) left a comment

Comments from mob programming session. Small fixes will go in immediately (including blocking self requests with HMAC of the body) and we will follow up in future PRs.


#[derive(Clone)]
pub struct Service {
    config: Arc<RelayConfig>,

Idiomatic approach: make RelayConfig public and turn it into a builder. Rename the current RelayConfig to InternalConfig, and use the RelayConfig name for the builder that produces that type.
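
A rough sketch of what that split could look like, under stated assumptions: the fields (`gateway_origin`, `timeout`), the 30-second default, and the empty-origin check are all hypothetical and only illustrate the builder shape, not the actual ohttp-relay types.

```rust
use std::time::Duration;

/// Immutable settings the service runs with (the type currently named
/// `RelayConfig`). Fields are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct InternalConfig {
    gateway_origin: String,
    timeout: Duration,
}

impl InternalConfig {
    pub fn gateway_origin(&self) -> &str {
        &self.gateway_origin
    }
    pub fn timeout(&self) -> Duration {
        self.timeout
    }
}

/// Public builder that produces an `InternalConfig`.
#[derive(Debug, Clone)]
pub struct RelayConfig {
    gateway_origin: String,
    timeout: Duration,
}

impl RelayConfig {
    /// Starts a builder with a hypothetical default timeout of 30 seconds.
    pub fn new(gateway_origin: impl Into<String>) -> Self {
        Self { gateway_origin: gateway_origin.into(), timeout: Duration::from_secs(30) }
    }

    /// Overrides the timeout.
    pub fn timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }

    /// Validates the collected settings and freezes them.
    pub fn build(self) -> Result<InternalConfig, String> {
        if self.gateway_origin.is_empty() {
            return Err("gateway origin must not be empty".to_string());
        }
        Ok(InternalConfig { gateway_origin: self.gateway_origin, timeout: self.timeout })
    }
}
```

With this shape, `Service` would hold an `Arc<InternalConfig>` while callers only ever touch the `RelayConfig` builder.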

fn new(config: Arc<RelayConfig>) -> Self { Self { config } }

pub async fn new_with_gateway(gateway_origin: GatewayUri) -> Self {
    let config = RelayConfig::new_with_default_client(gateway_origin);

Consider hardcoding payjo.in, as it's the only gateway that would've been targeted until today (and the parameter should get deprecated). Crash if the env variable (GATEWAY_URI) is unexpected.


pub async fn new_with_gateway(gateway_origin: GatewayUri) -> Self {
    let config = RelayConfig::new_with_default_client(gateway_origin);
    config.prober.assert_opt_in(&config.default_gateway).await;

We can remove this because we control the default gateway and the gateway is known to opt in

}

-impl<D: Db> tower::Service<Request<Incoming>> for Service<D> {
+impl<D: Db, B> tower::Service<Request<B>> for Service<D>

hyper::body::Incoming is an unread buffer, so we might be able to use Request<Buffer> as long as we enforce the body size limit (can be done with off-the-shelf middleware).
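
To make the "buffer with a size limit" idea concrete, here is a std-only sketch. It stands in for whatever off-the-shelf middleware ends up being used (for example a body-limit wrapper such as `http_body_util::Limited`), and the helper name `read_limited` is hypothetical.

```rust
use std::io::{self, Read};

/// Buffers `body` fully, failing if it is longer than `limit` bytes.
pub fn read_limited<R: Read>(body: R, limit: usize) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Read one byte past the limit so an oversized body can be told apart
    // from one that is exactly `limit` bytes long.
    body.take((limit as u64).saturating_add(1))
        .read_to_end(&mut buf)?;
    if buf.len() > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "body exceeds size limit",
        ));
    }
    Ok(buf)
}
```

The point is that at most `limit + 1` bytes are ever held in memory, so buffering the request up front does not open the service to unbounded allocations.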

-#[instrument]
-pub(crate) async fn try_upgrade(
-    req: Request<Incoming>,
+#[instrument(skip(req))]

Check if we can add Debug to the B trait bound.

}
}

/// Routes incoming requests to ohttp-relay or payjoin-directory based on the path.

We could have the gateway enforce that the relay isn't itself with an HMAC header.

Comment on lines +61 to +63
/// OHTTP Gateway authorities are either hostnames (fully qualified, so containing a `.`) or IP
/// addresses (containing `.` or `:`). Directory mailboxes are 13 bech32 characters, so there is no
/// ambiguity.

This distinction can be removed. If you're receiving an OHTTP request, it's definitely for the directory gateway. Everything else has static paths.

}
}

match services.directory.call(req).await {

Edge case: GET / is the directory static page, while the default gateway relay request is POST /. Check how the directory does it in its manual dispatcher to discriminate on method + path. Axum might have a cleaner way to do this.
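
A minimal sketch of discriminating on method + path for the root route. The `Target` enum and `route_root` function are hypothetical names, and methods are plain strings here to keep the example free of HTTP crates. Only the two cases named above are encoded; every other route is left to the existing static paths.

```rust
/// Which sub-service should handle a request (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum Target {
    Directory,
    Relay,
}

/// Decides who handles a request to `/`. Returns `None` for anything this
/// sketch does not cover, so the caller can fall back to its static routes.
pub fn route_root(method: &str, path: &str) -> Option<Target> {
    match (method, path) {
        // GET / serves the directory's static page.
        ("GET", "/") => Some(Target::Directory),
        // POST / is a relay request for the default gateway.
        ("POST", "/") => Some(Target::Relay),
        _ => None,
    }
}
```

In axum the same split would typically be expressed as two method handlers registered on the one `/` route.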

    Ok(Duration::from_secs(secs))
}

impl Config {

It should be possible to add another config source for environment variables without redundant overrides. (Easier to set env vars in a docker file than it is to write a config file)
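
As a sketch of what layering environment variables over the file config could mean, assuming hypothetical variable names `PJ_STORAGE_DIR` and `PJ_TIMEOUT` and a trimmed-down `Config` with only the two fields kept from the struct below. The lookup is passed in as a closure so the logic can be exercised without mutating the process environment. In practice the config crate's environment source would likely do this instead.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Trimmed-down stand-in for the service config (illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub storage_dir: PathBuf,
    pub timeout: Duration,
}

/// Overlays environment values on top of `config`. Variables that are not
/// set leave the file-provided value untouched.
pub fn apply_env<F>(mut config: Config, lookup: F) -> Result<Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Hypothetical variable names, not taken from the PR.
    if let Some(dir) = lookup("PJ_STORAGE_DIR") {
        config.storage_dir = PathBuf::from(dir);
    }
    if let Some(secs) = lookup("PJ_TIMEOUT") {
        let secs: u64 = secs
            .parse()
            .map_err(|e| format!("invalid PJ_TIMEOUT: {e}"))?;
        config.timeout = Duration::from_secs(secs);
    }
    Ok(config)
}
```

A real caller would pass `|key| std::env::var(key).ok()` as the lookup.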

    pub storage_dir: PathBuf,
    #[serde(deserialize_with = "deserialize_duration_secs")]
    pub timeout: Duration,
    pub gateway_origin: String,

Delete (hardcoded)

@nothingmuch (Collaborator) commented Jan 7, 2026

my mob review conclusion: concept ACK, made some minor requests for changes in the call (posted by spacebear):

  • handling of / path (GET -> directory, POST -> relay)
  • remove authority / short ID distinction as it is irrelevant
  • Debug trait bound for req type, we should only have 2 concrete ones and both should be Debug (Incoming and boxed Bytes)
  • best effort self loop detection with sentinel header
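
For the last item, a best-effort sentinel check could be sketched as below. The header name `x-payjoin-relay-instance`, the `Headers` alias, and both function names are hypothetical. Headers are modelled as a plain map to keep the example self-contained, and how the per-process instance ID is generated is left out.

```rust
use std::collections::HashMap;

/// Hypothetical header name; the review does not specify one.
pub const SENTINEL_HEADER: &str = "x-payjoin-relay-instance";

/// Simplified stand-in for an HTTP header map.
pub type Headers = HashMap<String, String>;

/// Relay side: tag an outgoing gateway request with this instance's ID.
pub fn tag_outgoing(headers: &mut Headers, instance_id: &str) {
    headers.insert(SENTINEL_HEADER.to_string(), instance_id.to_string());
}

/// Gateway side: true if the request carries this same instance's ID,
/// meaning the relay forwarded the request to itself.
pub fn is_self_loop(headers: &Headers, instance_id: &str) -> bool {
    headers.get(SENTINEL_HEADER).map(String::as_str) == Some(instance_id)
}
```

This is only best effort: a plain sentinel can be stripped or forged by an intermediary, which is what the HMAC-over-body variant mentioned in the review would guard against.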
