Heh, I’m slowly turning even more into a Wasm Components true believer (if that’s even possible). Ran a size comparison between an async-std based TCP echo server, and a native WASI 0.2 one. Both optimized in release mode, debug info stripped, LTO on.
The results: async-std comes in at half a megabyte for the echo server. WASI 0.2 comes in at just 100 kilobytes. And in even better news: it currently still uses a WASI 0.1 adapter that weighs 80 kilobytes. WASI binaries are *small*.
This is literally for the smallest network server I could write. The differences are going to be far more drastic once we pull in e.g. HTTP.
Not that binary sizes are what most people should care about per se. But it’s a decent proxy for things like compile times.
@yosh As soon as I pulled in some libraries like opensearch + reqwest, I quickly ended up with 4-5MB. The further you go from what the host runtime provides, the closer it gets to normal native size.
For the HTTP server side I’m really happy with the small, kB-sized request handlers though
@smndtrl Oh I didn’t realize reqwest already worked on WASI 0.2. Is that new? Do you know if they’re using the native WASI HTTP client, or rolling their own?
Your broader point about code size definitely stands. Though I’d add one fun bit: because abstractions like async sockets and HTTP often sit in the critical path for most of a program, actually placing them inside the platform can remove bottlenecks from the compilation pipeline!
@yosh I rolled my own fork of reqwest to make an API shim for things like proxies, cookies, and TLS, as they are not in WASI HTTP and I wanted to use downstream libs like opensearch. I did give input on potential API design based on my “end user” expectation of how it should work on a wasip2 target, though.
I did a quick and dirty hack over Fermyon’s spin_executor and stuck it into reqwest just to get me to a working state.
@yosh Smaller binaries also matter for cache hit rates.
@yosh it also opens a lot of doors to optimize, for example, green energy usage by moving workloads to follow green energy supply. But that is only viable if the cost of moving is incredibly low. Or if the binary size is small enough to just store all workloads everywhere.
@yosh what about performance?
@yerke good question, hah — haven’t yet measured it. But I don’t see any reason why WASI would do worse.
And in an apples:apples scenario (WASI vs Docker), I think the numbers should be really close. But we should actually measure tho!
@yosh How much of that half MB comes from the panic backtrace implementation? It is a non-trivial amount of code which is omitted on wasm due to wasm not providing any way to look at the call stack.
@bjorn3 oh good question, I actually didn’t look at that. I definitely buy that backtraces are a lot of code; I think “panic=abort” should get rid of most of that, right?
@yosh panic=abort gets rid of the unwinding code, but doesn't affect backtrace generation for panic at all. For that you need to use build-std and enable the panic_immediate_abort feature. You may also want to enable it for your wasi-p2 version for comparison purposes as even there it removed some code, in particular the panic formatting code that is still left over when backtrace printing is disabled.
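(For anyone wanting to try this: the build-std route described above requires a nightly toolchain, and the exact invocation is a sketch — flag spellings can shift on nightly. Something along these lines:)

```shell
# Cargo.toml already has: lto = true, strip = true, panic = "abort"
# in [profile.release]. Then rebuild std itself with backtrace
# machinery compiled out via the panic_immediate_abort feature:
cargo +nightly build --release \
    --target wasm32-wasip2 \
    -Z build-std=std,panic_abort \
    -Z build-std-features=panic_immediate_abort
```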
@yosh I just measured it with async-std's tcp-echo example:
* LTO + panic=abort + strip = 558K
* LTO + panic_immediate_abort + strip = 240K
So a big difference, but still not anywhere close to your WASI component.
For reference, a plain hello world with panic_immediate_abort is 20K, while with just panic=abort, LTO, and strip it is 342K; in other words, the overhead of not enabling panic_immediate_abort is roughly a fixed 320K.
@bjorn3 ohh, ty for measuring this!
@yosh @bjorn3 there's a recent (nightly) Rust feature flag to use smaller core/std impls for size-constrained environments. Probably not a huge effect yet, but maybe worth a try?
https://github.com/rust-lang/rust/pull/125011
@yosh small components are one of my special interests. There's a reason Claw and template-compiler generate such absolutely tiny components.
There's also a bunch of really exciting optimization work like @cfallin's weval, and at some point I want to start writing component-aware optimization passes for component-opt.
We've been able to show that incredible things are possible and it'll be exciting to see where that goes.