Implement Core JetHinter #334

Open
ivanlele wants to merge 1 commit into BlockstreamResearch:master from ivanlele:feature/core-jet-hinter

Conversation

@ivanlele
Contributor

This PR introduces the CoreJetHinter implementation, demonstrating that the main executable jet set can be swapped out.

@ivanlele ivanlele requested a review from delta1 as a code owner May 27, 2026 10:39
@ivanlele
Contributor Author

@apoelstra After this PR, we can finally do the thing. We can load external implementations for Jet, JetHL, and JetHinter using dlopen2. I'll try to set up a minimal working example with it and a few jets.

Comment thread src/jet/core.rs
}

pub fn source_type(jet: Core) -> Vec<AliasedType> {
match jet {
Contributor

In bf0a1d8:

Can you replace this massive match with something that parses the TypeName returned from Jet::source_ty and Jet::target_ty?

Contributor Author

@ivanlele ivanlele May 27, 2026

I've dreamt of doing this for months. Right now, this function returns several aliases instead of native types, for example Pubkey, Message64, Scalar, and Gej. If we could drop these aliased types here in favor of parsing them into native types, that would be great.

The thing is that this might break something during syntax analysis, but I'm not sure.

Ideally, it would be cool to resolve these aliases in some preprocessing step so they become native types everywhere in the codebase before being fed to the lexer. But I'm not sure whether that would negatively affect the quality of the errors output by the compiler, which Chumsky currently handles well.

Pros and cons, I guess.

Contributor Author

> Right now, this function returns several aliases instead of native types

Just to point out: this is the main reason why we can't just parse it to target_ty, because the parsing can't be deterministic. I forgot to highlight that.

Contributor

Can we match and only manually handle the cases that have Pubkey, Message64, etc.? And then use a _ wildcard match for all other jets?

This would address all the arithmetic jets, which are the vast majority of them.

Contributor

@apoelstra apoelstra May 27, 2026

I see, even with the arithmetic jets we have things like vec![U8.into(), U8.into()] for jets whose input signature actually looks like U16. But conceptually these are "binary jets" that take two inputs.

Contributor Author

We can. On it.

Contributor

@apoelstra apoelstra May 27, 2026

I would suggest we do something like

enum JetClassification {
    Simple,
    BinaryInput,
    Custom(Vec<AliasedType>),
}

(this should be a private enum). Then we can implement source_type like

fn source_type(jet: Core) -> Vec<AliasedType> {
    match jet_classification(jet) {
        JetClassification::Simple => parse_ty(jet.source_ty()),
        JetClassification::BinaryInput => split_in_half(parse_ty(jet.source_ty())),
        JetClassification::Custom(vec) => vec,
    }
}

where jet_classification has a giant match in it, but ideally this is the only function with a giant match. And it's not trying to write out full type signatures, it's just doing a loose classification.

Does this make sense?
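
For readers following along, the classification idea above can be sketched in a self-contained form. Everything here is a toy stand-in and an assumption for illustration: `ToyType` replaces `AliasedType`, `ToyJet` replaces `Core`, and `source_bits` replaces actually parsing `jet.source_ty()`. None of these names come from the crate.

```rust
/// Toy stand-in for `AliasedType` (hypothetical, not the crate's type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyType {
    /// Native unsigned integer of the given bit width.
    UInt(u32),
    /// Rich alias that cannot be recovered from the raw type name.
    Alias(&'static str),
}

/// Toy stand-in for `Core` with one jet per classification.
#[derive(Debug, Clone, Copy)]
enum ToyJet {
    Complement8,
    Add8,
    PubkeyCheck,
}

/// Loose classification, as proposed in the comment above.
enum JetClassification {
    Simple,
    BinaryInput,
    Custom(Vec<ToyType>),
}

/// Stand-in for parsing `jet.source_ty()`: total input width in bits.
fn source_bits(jet: ToyJet) -> u32 {
    match jet {
        ToyJet::Complement8 => 8,
        ToyJet::Add8 => 16,
        ToyJet::PubkeyCheck => 256,
    }
}

/// The only place that needs a per-jet match; everything else is derived.
fn jet_classification(jet: ToyJet) -> JetClassification {
    match jet {
        ToyJet::Add8 => JetClassification::BinaryInput,
        ToyJet::PubkeyCheck => JetClassification::Custom(vec![ToyType::Alias("Pubkey")]),
        _ => JetClassification::Simple,
    }
}

/// Split an even-width integer into two equal halves; leave anything else alone.
fn split_in_half(ty: ToyType) -> Vec<ToyType> {
    match ty {
        ToyType::UInt(bits) if bits % 2 == 0 => vec![ToyType::UInt(bits / 2); 2],
        other => vec![other],
    }
}

fn source_type(jet: ToyJet) -> Vec<ToyType> {
    let parsed = ToyType::UInt(source_bits(jet));
    match jet_classification(jet) {
        JetClassification::Simple => vec![parsed],
        JetClassification::BinaryInput => split_in_half(parsed),
        JetClassification::Custom(types) => types,
    }
}
```

In this sketch, `source_type(ToyJet::Add8)` yields two 8-bit inputs even though the parsed source is a single 16-bit value, which is the "binary jet" case discussed earlier in the thread.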

Contributor

Basically, I really want there to be a "single source of truth" for these types. I understand that for the crypto jets where we have "rich types" for inputs and outputs, this is not really possible. But there aren't a ton of these jets.

Contributor Author

I think it deserves a separate PR

Contributor Author

While experimenting with this today, I found that it's not that easy to determine which number types should be used in different cases. For example, with a notation like **lll, it's ambiguous whether it should represent [u64, u64, u64], [u128, u64], or some other composition. Similarly, a notation such as ****22*22**22*22***22*22**22*22 raises the same question: should it be interpreted as [u8, u8], as [u16], or as something else?
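
To make the ambiguity concrete, here is a minimal sketch that computes only the total bit width of a type name. It assumes `*` is a binary product written in prefix form (as the examples in this thread suggest) and handles only the symbols from the table quoted in this comment; `bit_width` is a hypothetical helper, not an existing function in the crate.

```rust
/// Total bit width of a prefix-notation type name, or `None` if it is
/// malformed or uses a symbol this sketch does not know about.
fn bit_width(name: &str) -> Option<u32> {
    fn go(chars: &mut std::str::Chars<'_>) -> Option<u32> {
        match chars.next()? {
            // Assumed: `*` is a product of the two types that follow it.
            '*' => {
                let left = go(chars)?;
                let right = go(chars)?;
                Some(left + right)
            }
            '1' => Some(0),
            '2' => Some(1),
            'c' => Some(8),
            's' => Some(16),
            'i' => Some(32),
            'l' => Some(64),
            'h' => Some(256),
            _ => None,
        }
    }

    let mut chars = name.chars();
    let width = go(&mut chars)?;
    // Reject trailing characters after a complete type.
    if chars.next().is_none() {
        Some(width)
    } else {
        None
    }
}
```

Under these assumptions, `bit_width("**lll")` is 192 and `bit_width("****22*22**22*22***22*22**22*22")` is 16. The width alone says nothing about whether 16 bits means one `u16` or two `u8` arguments, which is the problem described above.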

A potential solution would be to introduce a stricter notation on the generated source_ty and target_ty side. We already have the following spec:

/// | char | type         |
/// |------|--------------|
/// | `1`  | unit         |
/// | `2`  | single bit   |
/// | `c`  | 8-bit word   |
/// | `s`  | 16-bit word  |
/// | `i`  | 32-bit word  |
/// | `l`  | 64-bit word  |
/// | `h`  | 256-bit word |

We could change the *_ty methods to avoid this uncertainty. Instead of notations like ****22*22**22*22***22*22**22*22, which could be either [u8, u8] or [u16], we could encode them more directly: for example, *cc would represent [u8, u8], while s would represent [u16].

For u128, we could introduce a new symbol, for example m, to represent a 128-bit word.
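
A rough sketch of how such a stricter notation could be decoded follows. It rests on several assumptions that are not settled in this thread: `*` is a prefix binary product, every leaf is a word symbol, nested products are flattened into one argument list, and `m` is the proposed (not yet existing) 128-bit symbol. `decode_strict` and `word_name` are hypothetical names.

```rust
/// Map a word symbol to a native type name. `m` is the symbol proposed
/// above for 128-bit words; it is not part of the current spec.
fn word_name(symbol: char) -> Option<&'static str> {
    match symbol {
        'c' => Some("u8"),
        's' => Some("u16"),
        'i' => Some("u32"),
        'l' => Some("u64"),
        'm' => Some("u128"),
        'h' => Some("u256"),
        _ => None,
    }
}

/// Decode a strict type name into a flat list of native argument types.
/// Returns `None` for malformed input or unsupported symbols.
fn decode_strict(name: &str) -> Option<Vec<&'static str>> {
    fn go(chars: &mut std::str::Chars<'_>, out: &mut Vec<&'static str>) -> Option<()> {
        match chars.next()? {
            // Assumed: a product contributes the arguments of both operands.
            '*' => {
                go(chars, out)?;
                go(chars, out)
            }
            symbol => {
                out.push(word_name(symbol)?);
                Some(())
            }
        }
    }

    let mut chars = name.chars();
    let mut out = Vec::new();
    go(&mut chars, &mut out)?;
    // Reject trailing characters after a complete type.
    if chars.next().is_none() {
        Some(out)
    } else {
        None
    }
}
```

With this encoding, `*cc` decodes to two `u8` values and `s` to a single `u16`, while `**lll` and `*ml` distinguish three `u64` values from a `u128` followed by a `u64`. Flattening discards any tuple nesting, so a real design would need to decide whether that is acceptable.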

2 participants