add Wasmtime plugin RFC #39
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
i don't see a way to access a .so global variable from wasm...
indeed, I did not add this functionality in the current Does rvolosatovs/wasi-dl#1 address your concern? Perhaps Edit: added rvolosatovs/wasi-dl@30ea77f |
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
yes!
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
Did a small update to ensure the symbol lookups are typed b9ae4c9
There are two kinds of unsafe relevant here. One is whether the plugin code is unsafe, and I agree that this is basically the same with any host plugin system we'd design here. The other is whether Wasm code using the plugin code is unsafe. The libffi-style approach in this proposal looks like it means that we'd additionally have to treat the Wasm that calls the code as unsafe by default, and while there are potential ways to make it safe, they aren't described here. Also, the libffi-style approach in this proposal looks like it would mean that the Wasm would not be portable, in general, because libffi doesn't encapsulate all C ABI details. What is
This proposal does not currently describe how this would work. And, signatures alone would not be sufficient, because libffi-style bindings also include raw pointers. Perhaps it would be possible to design an interface description language sophisticated enough to describe these interfaces, including signatures, lifetime information, synchronization information, and perhaps also resource lifetimes (eg. open files that need to be explicitly closed and not used thereafter), and perhaps eventually even a way to describe C
It looks like this proposal would also usually want "adapter" libraries too, or at least adapter layers, because I don't expect we'll want normal Wasm code talking directly to these low-level libffi-style APIs, for ergonomics, language-independence, portability, and potential security reasons. And these adapters are going to be tedious to write and maintain, because they need to be written for each source language that needs them, and they'll have a lot of repetitive low-level code. I imagine we'd pretty quickly find ourselves wanting bindings generators for this task. And if we're going to design a language-independent sandboxable interface description language with tooling around it for generating bindings, we should think carefully about whether or not we already have one, and what relationships we want 😄. |
fitzgen
left a comment
Thanks for writing up this RFC!
I agree that a plugin system geared towards allowing hosts to define and expose new capabilities to Wasm guests that Wasmtime has no builtin knowledge of is very valuable.
Unfortunately, I think a missing constraint is that we fundamentally cannot trust Wasm guests, so we can't just expose dlopen/dlsym and raw FFI types to them. Therefore, I don't think the solution proposed here is something we can pursue. More details inline below.
That said, I also sketch (very roughly) an alternative approach that should address the same motivations but which avoids giving untrusted Wasm guests raw dlopen powers.
/// Constructs a function from an opaque `alloc` and a type signature
/// Fails if type of `alloc` is not `ffi-type::primitive(primitive-type::pointer)`
from-alloc: static func(alloc: alloc, args: list<ffi-type>, ret: option<ffi-type>) -> result<function>;
Would this result in memory unsafety if the Wasm (which is untrusted, and potentially malicious) passes the wrong number or type of arguments and returns?
Or is it expected that Wasmtime will somehow dynamically check these calls?
Similar question for declaring FFI struct types and their fields.
Such interface is *unsafe* and it must be used with extreme care, however that is no different from any other host plugin, which would be loaded via `dlopen`.
I guess the answer to the previous question is "yes" then.
I see @sunfishcode's comments now, and I agree with the gist of his points.
There is a difference between whether
1. the plugin internally is using `unsafe` but exposing a safe interface, and
2. the plugin's interface is itself `unsafe`.
With (1) the (untrusted and potentially malicious) Wasm guest cannot trigger any memory unsafety, modulo implementation bugs in the plugin itself.
With (2) the (untrusted and potentially malicious) Wasm guest can trivially trigger memory unsafety. That is, (2) is handing security vulnerabilities to Wasm guests by design.
So (2) is a complete non-starter; it is contradictory to Wasmtime's (and the BA's) mission and values.
And -- correct me if I'm wrong! -- this RFC seems to be proposing (2) so, unless I am misunderstanding the proposal, this is not an approach we should consider or pursue any further.
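To make the distinction between (1) and (2) concrete, here is a minimal Rust sketch. It is not taken from the RFC or from Wasmtime; `Table`, `get_checked` and `get_raw` are hypothetical names chosen only for illustration.

```rust
/// A trusted plugin's internal table of values (hypothetical example type).
pub struct Table {
    items: Vec<u32>,
}

impl Table {
    pub fn new(items: Vec<u32>) -> Self {
        Table { items }
    }

    /// Case (1): `unsafe` is used internally, but the interface is safe.
    /// The bounds check runs before the unchecked access, so no index
    /// supplied by an untrusted caller can read out of bounds.
    pub fn get_checked(&self, index: usize) -> Option<u32> {
        if index < self.items.len() {
            // SAFETY: `index` was just checked against the length.
            Some(unsafe { *self.items.get_unchecked(index) })
        } else {
            None
        }
    }

    /// Case (2): the interface itself is `unsafe`.
    /// Soundness depends entirely on the caller passing a valid index,
    /// which an untrusted guest cannot be relied upon to do.
    ///
    /// # Safety
    /// `index` must be less than the number of items.
    pub unsafe fn get_raw(&self, index: usize) -> u32 {
        // SAFETY: guaranteed by the caller, per the contract above.
        unsafe { *self.items.get_unchecked(index) }
    }
}
```

With `get_checked`, a bad index yields `None`; with `get_raw`, a bad index is undefined behaviour, and exposing that shape of function to a guest moves the safety obligation onto the guest.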
To be more constructive, I would suggest an alternative approach that maintains a safe interface to Wasm, something like:
- There is some well-known symbol that plugin `.so`s should export, describing their WIT interface (maybe literally just a `static WIT_INTERFACE: &'static str = "..."` or alternatively the binary encoding of the same thing).
- Wasmtime loads a `plugin.so` and reads its WIT interface
- Wasmtime `dlsym`s the functions described by the WIT interface
- Wasmtime adds functions for that WIT interface to a `Linker`, these functions
  - translate Wasm / canonical ABI arguments into the equivalent in some sort of native ABI
  - call their corresponding `dlsym`ed functions from `plugin.so`
  - translate the native ABI's result back into Wasm / canonical ABI
In the above sketch, the plugin.so is trusted, but the Wasm is not. Any unsafety can only come from bugs in the plugin.so (either from its internal implementation or if its functions' types don't match the WIT interface it claims). Notably, unsafety cannot originate from within (untrusted and potentially malicious) Wasm guests, no matter what garbage values they indirectly pass to plugin.so.
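The trust boundary in that sketch can be illustrated with a small, std-only Rust model. This is an assumption-laden stand-in, not Wasmtime's API: `Host` plays the role of the `Linker`, `Ty`/`Val` stand in for WIT types and canonical ABI values, and `plugin_add` stands in for a `dlsym`ed function. The point it shows is that the host checks guest-supplied values against the declared signature before the trusted plugin code ever runs.

```rust
use std::collections::HashMap;

/// Value types in a declared interface (tiny hypothetical stand-in for WIT types).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ty {
    U32,
    F64,
}

/// Values crossing the guest/host boundary.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Val {
    U32(u32),
    F64(f64),
}

impl Val {
    fn ty(&self) -> Ty {
        match self {
            Val::U32(_) => Ty::U32,
            Val::F64(_) => Ty::F64,
        }
    }
}

/// A function exported by the trusted plugin, with its declared parameter types.
pub struct PluginFunc {
    params: Vec<Ty>,
    call: fn(&[Val]) -> Val,
}

/// Hypothetical stand-in for the host-side linker.
#[derive(Default)]
pub struct Host {
    funcs: HashMap<String, PluginFunc>,
}

impl Host {
    /// Registers a plugin function under the signature the plugin declared.
    pub fn register(&mut self, name: &str, params: Vec<Ty>, call: fn(&[Val]) -> Val) {
        self.funcs.insert(name.to_string(), PluginFunc { params, call });
    }

    /// Entry point for untrusted guests: validates arity and types first,
    /// so malformed arguments are rejected instead of reaching the plugin.
    pub fn call_from_guest(&self, name: &str, args: &[Val]) -> Result<Val, String> {
        let f = self
            .funcs
            .get(name)
            .ok_or_else(|| format!("unknown function `{name}`"))?;
        if args.len() != f.params.len() {
            return Err(format!("expected {} arguments, got {}", f.params.len(), args.len()));
        }
        for (i, (arg, want)) in args.iter().zip(&f.params).enumerate() {
            if arg.ty() != *want {
                return Err(format!("argument {i}: expected {want:?}, got {:?}", arg.ty()));
            }
        }
        Ok((f.call)(args))
    }
}

/// Trusted plugin function: may assume its arguments match the declared signature.
pub fn plugin_add(args: &[Val]) -> Val {
    match (args[0], args[1]) {
        (Val::U32(a), Val::U32(b)) => Val::U32(a.wrapping_add(b)),
        _ => unreachable!("the host validated the signature"),
    }
}
```

After `host.register("add", vec![Ty::U32, Ty::U32], plugin_add)`, a guest call with two `u32` values succeeds, while a call with the wrong arity or a wrong type returns an `Err` without invoking the plugin. A mismatch between the plugin's declared signature and its real one would still be a plugin bug, as noted above.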
The tricky parts here will be:
- What is the native ABI? Can we reuse the canonical ABI or a variant of it? I could imagine a `bindgen`-y proc macro that does some variant of the canonical ABI for plugins with statically-known interfaces, but what about dynamic interfaces (i.e. the common case for the `wasmtime` cli, rather than a `wasmtime` crate embedding that happens to use plugins of a certain shape)? What can we do to avoid arg/result translation overheads?
- A `plugin.so` may want some per-`Store` state, for example if `wasi-sockets` was implemented as a plugin, it would want any open sockets to be attached to the `Store`. How do we let `plugin.so` create that per-`Store` state? Where do we keep it? How do we pass it back to `plugin.so` on each call? How do we let `plugin.so` destroy it when we drop the store?
- Finally, it isn't clear to me whether this RFC proposes that `plugin.so`s are forwards compatible with new `wasmtime` versions (i.e. new Wasmtime releases are backwards compatible with old `plugin.so`s) or not. If so, then the ABI concerns described above are doubly important and we need to make sure they remain extensible for future additions and changes, which will involve a lot of subtleties.
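One possible shape for the per-`Store` state question, sketched in std-only Rust. Everything here is hypothetical: the `Plugin` trait, its three hooks, and this `Store` type are illustrative stand-ins and do not correspond to Wasmtime's `Store` or to anything proposed in the RFC. The sketch only shows the lifecycle: the host asks the plugin to create state, hands it back on every call, and returns it for destruction when the store is dropped.

```rust
use std::any::Any;
use std::cell::Cell;

/// Hypothetical lifecycle hooks a plugin could expose for per-store state.
pub trait Plugin {
    fn create_state(&self) -> Box<dyn Any>;
    fn call(&self, state: &mut dyn Any, arg: u32) -> u32;
    fn destroy_state(&self, state: Box<dyn Any>);
}

/// Stand-in for a store that owns the opaque plugin state.
pub struct Store<'p> {
    plugin: &'p dyn Plugin,
    state: Option<Box<dyn Any>>,
}

impl<'p> Store<'p> {
    /// The state is created once, when the store is created.
    pub fn new(plugin: &'p dyn Plugin) -> Self {
        Store { plugin, state: Some(plugin.create_state()) }
    }

    /// The host passes the state back to the plugin on each call.
    pub fn call(&mut self, arg: u32) -> u32 {
        let state = self.state.as_mut().expect("state lives as long as the store");
        self.plugin.call(&mut **state, arg)
    }
}

impl Drop for Store<'_> {
    /// Dropping the store hands the state back to the plugin for teardown.
    fn drop(&mut self) {
        if let Some(state) = self.state.take() {
            self.plugin.destroy_state(state);
        }
    }
}

/// Example plugin: keeps a running total per store and counts teardowns.
pub struct Counter {
    destroyed: Cell<u32>,
}

impl Counter {
    pub fn new() -> Self {
        Counter { destroyed: Cell::new(0) }
    }

    pub fn destroyed(&self) -> u32 {
        self.destroyed.get()
    }
}

impl Plugin for Counter {
    fn create_state(&self) -> Box<dyn Any> {
        Box::new(0u32)
    }

    fn call(&self, state: &mut dyn Any, arg: u32) -> u32 {
        let total = state.downcast_mut::<u32>().expect("state created by this plugin");
        *total += arg;
        *total
    }

    fn destroy_state(&self, _state: Box<dyn Any>) {
        self.destroyed.set(self.destroyed.get() + 1);
    }
}
```

Across a real `.so` boundary the `Box<dyn Any>` would more plausibly be an opaque pointer plus create/destroy function pointers, which is where the ABI and forwards-compatibility concerns raised above come in.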
Thanks for the feedback @sunfishcode @fitzgen!
In general I feel that perhaps I misjudged the expected level of detail for RFCs in this repository. This RFC currently is very much a high-level idea/direction, as opposed to a directly-implementable design document, which seems to be what people are looking for here.
First, let's agree on some terms: in this RFC, by component composition I mostly refer to function-style composition, and not component composition as defined at https://component-model.bytecodealliance.org/creating-and-consuming/composing.html#what-is-composition
More formally, let's assume that components are morphisms (functors) that map a set of interfaces (imports) to another set of interfaces (exports). Their composition is depicted here:
Here's an example in context of this RFC:
// Trusted Wasm targets this world
world plugin {
// These two interfaces are provided by the host:
import wasi:sockets/tcp;
import wasi:dl/dl;
// These two interfaces are provided to the guest:
export wasi:sockets/tcp;
export wasi:keyvalue/store;
}
// Untrusted Wasm targets this world
world guest {
// These two interfaces are either directly provided by the plugin component or passed through to the host *statically* by the composition tool:
import wasi:sockets/tcp;
import wasi:keyvalue/store;
export wasi:http/incoming-handler;
// NOTE: This import would *not* be satisfied:
// import wasi:dl/dl;
}
world composed {
import wasi:sockets/tcp;
import wasi:dl/dl;
export wasi:http/incoming-handler;
}
@fitzgen you seem to imply that all Wasm is implicitly untrusted. I'm not sure I agree with that statement, and the assumption I'm operating upon is that whether a trusted piece of code is compiled into a native application/library or a Wasm component should not change the "trustworthiness" of the produced artifact. That's a key assumption on which this RFC is built. Is there something specific about Wasm components I'm not aware of, that would make them inherently untrusted?
In #39 (comment) you've outlined a way how a plugin could be loaded by Wasmtime:
Note, that adding functions to the In the context of this RFC, the plugin could operate exactly like you've outlined in #39 (comment), except wasmtime CLI would load Let's consider an example with a shared library plugin (this is not an API suggestion, just a quick example sketch):
An example usage with a Wasm component plugin could look like this:
In both cases, one way or another, Arguably, the Wasm plugin option is safer, since the runtime can control what libraries, symbols and their signatures can the plugin access. In this RFC I've decided to start with a simple approach and give the If (trusted)
From perspective of memory safety purely, if Whether we trust the plugin code or not, guest code directly or indirectly invoking a symbol loaded from a shared object will always be potentially memory unsafe. Like I mentioned above, the runtime could limit Effectively, One potential strategy could be using value definitions or just functions (since recursive types are not currently allowed) to either process a C header file ahead-of-time or somehow else (e.g. manually) produce something roughly similar to: (component
(import "wasi:dl/ffi" (instance
(export "primitive-type" (type $primitive_type (enum
"c-char"
"uint64-t"
;; etc..
)))
;; etc...
))
(import "wasi:dl/dll" (instance
(export "function" (type $function (sub resource)))
;; etc...
))
;; using value definition
(export "SOMECONST" (value $primitive_type (enum "uint64-t")))
;; using a function, returns the C type of the constant
(export "SOMECONST" (func (result $primitive_type)))
;; returns a typed `wasi:dl/dll.function`
(export "myfunc" (func (result $function)))
)
If the (trusted) plugin was a Wasm component, there'd be no need for any custom symbols or ABI - answers to most of these questions would be provided directly by the component model.
I don't think that being able to load every dynamic library in existence should be a goal here, the intention is to have an interface with an overlay with C type system just big enough to be useful, but not more than that. I'd expect complex or very platform-specific to be structured the following way:
Right, so In terms of re-exporting WASI, good news is that I've already done this for Rust: https://github.com/wasmCloud/wasi-passthrough/tree/1ade95ee6d2046ffefa5a72731bec22a6d470157/src (roughly based it on There's certainly a lot of tooling that would be required here to make this nice. In spirit of this RFC, however, such tooling would be general purpose Wasm component tooling, as opposed to something built specifically for Wasmtime plugins. If we went the route of Wasmtime doing the "runtime composition" by giving the If people insist on the "single-Wasm" and plugin-as-a-shared-object approach, then I'd still suggest relying on WIT and component model ABI as much as possible and perhaps use cabish for value encoding/decoding |
I think having adapter.so directly provide the wasm component interface rather than having to use an intermediate plugin.wasm is safer, faster and easier to use for the end user.
Plugin.wasm is effectively unsandboxed, as any mistake in its use of wasi-dl would cause UB. It is a lot easier to directly define a safe wasm component interface in adapter.so than to export an unsafe C api and then separately consume this C api in plugin.wasm and hope that you didn't accidentally cause an ABI mismatch. As soon as you use any non-fixed size integer type (or an integer type larger than the register size), or you use a struct type or enum in your C api, it becomes non-trivial to match the ABI unless you are the C compiler that compiled adapter.so. And if adapter.so is written in Rust, avoiding a separate plugin.wasm may enable the plugin writer to entirely avoid unsafe code.
Having the intermediate plugin.wasm also requires you to copy all data twice: once from adapter.so to plugin.wasm and once from plugin.wasm to the wasm module that uses the plugin. If adapter.so directly provides a wasm component interface, it only needs to be copied once.
And finally it is easier for the end user if only adapter.so exists. This way there can't be a version mismatch between adapter.so and plugin.wasm (which will likely cause UB) and you only need to copy a single file around to use the plugin.
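The point about non-fixed size integer types can be checked with a short Rust sketch. `c_widths` and `long_matches` are hypothetical helper names; `std::ffi::c_int` and `std::ffi::c_long` are the standard library's aliases for the C types on the current target.

```rust
use std::ffi::{c_int, c_long};
use std::mem::size_of;

/// Byte widths of C `int`, C `long` and a pointer-sized integer on this target.
pub fn c_widths() -> (usize, usize, usize) {
    (size_of::<c_int>(), size_of::<c_long>(), size_of::<usize>())
}

/// A Wasm-side declaration has to commit to a fixed width for a C `long`.
/// Returns whether a declaration of `declared_bytes` matches this host.
pub fn long_matches(declared_bytes: usize) -> bool {
    size_of::<c_long>() == declared_bytes
}
```

A C `long` is 8 bytes on 64-bit Linux and 4 bytes on 64-bit Windows, so a signature written once on the Wasm side as "64-bit integer" would match one host and mismatch the other. Struct padding and enum representation add further degrees of freedom of the same kind.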
I've outlined an example approach in #39 (comment), which would let prevent UB in using Structs are also fully supported by
A With a
Otherwise, we'd still need two copies |
|
Writing an RFC for Given that it does not appear that Wasm-based Wasmtime plugins is something people are interested in at this time, I'll take a step back and just go ahead and close this PR, instead replacing it by my original proposal: #40 |
How does Wasmtime know what type signature that adapter.so needs?
There are edge cases where even two C compilers for the same platform disagree on the right ABI. Libffi can not know which ABI to use in those cases.
I personally would still love to see dylib based plugins that directly interface with wasm interface types, but RPC based plugins are also nice. While they would almost certainly be a bit slower, they would be easier to support for other wasm engines that can't support dlopen and would be much easier to sandbox at an OS level.
Refs bytecodealliance/wasmtime#7348
Rendered