Switch from Wasmer to Wasmtime #3349

Open
wants to merge 8 commits into
base: main

bjorn3

@bjorn3 bjorn3 commented May 14, 2024

Zellij is currently unable to update Wasmer: the version in use is the last one without mandatory WASIX support, which Zellij doesn't want to adopt for various reasons. In the meantime, a CVE has been found in the Cranelift version used by this Wasmer version. While it has been worked around in #2830, future CVEs may not be possible to work around without updating Cranelift and thus Wasmer. This PR switches from Wasmer to Wasmtime to avoid all these problems.

Each individual commit builds. All commits except for the last two commits keep using Wasmer and thus could land without the actual switch to Wasmtime if preferred.

.appender("logFile")
.build("wasmer_compiler_cranelift", LevelFilter::Warn),
.appender("logPlugin")
.build("wasmtime_wasi", LevelFilter::Warn),
Author

@bjorn3 bjorn3 May 15, 2024

Should this go to logFile or logPlugin? On the one hand these are logs in response to plugin actions, on the other hand they are not logs directly produced by the plugin itself.

@bjorn3
Author

bjorn3 commented May 15, 2024

In Wasmer trying to create a file with a relative path creates it in the /host directory. In Wasmtime it will result in an error that there is no "pre-opened file descriptor" through which the file can be opened. As WASI doesn't have any such concept as a current working directory, the Wasmtime behavior is more natural. It does break the current released version of zjstatus however. dj95/zjstatus@25e3a56 works around this. I tried to work around the difference, but while it fixed this specific issue, it broke a lot more things. As such I would prefer documenting the difference rather than making Wasmtime behave the same way as Wasmer.
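For plugin authors affected by this difference, one option is to stop relying on relative paths and anchor them under a directory the host actually pre-opens. The following is a minimal sketch, assuming `/host` is pre-opened for the plugin. `anchor_under_host` and `HOST_DIR` are hypothetical names for illustration; this is not taken from Zellij or from the zjstatus commit mentioned above.

```rust
use std::path::{Component, Path, PathBuf};

/// Guest directory assumed to be pre-opened by the host (assumption).
const HOST_DIR: &str = "/host";

/// Hypothetical helper: turn a relative path into an absolute one under
/// `HOST_DIR`. Absolute paths are returned unchanged.
fn anchor_under_host(path: &Path) -> PathBuf {
    if path.is_absolute() {
        return path.to_path_buf();
    }
    let mut anchored = PathBuf::from(HOST_DIR);
    for component in path.components() {
        match component {
            // Skip a leading "." so "./foo" and "foo" map to the same file.
            Component::CurDir => {}
            other => anchored.push(other.as_os_str()),
        }
    }
    anchored
}
```

A plugin would then call something like `std::fs::write(anchor_under_host(Path::new(".zjstatus.log")), "")`, which targets a path under the pre-opened directory on both runtimes instead of depending on an implicit working directory.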

@bjorn3
Author

bjorn3 commented May 15, 2024

CI should be fixed now.

@bjorn3
Author

bjorn3 commented May 19, 2024

(Rebased. Will likely keep doing this as soon as conflicts arise to avoid accumulating conflicts.)

@syrusakbary

Given that this PR can break many behaviors for plugins on Zellij, we can commit to update Cranelift in Wasmer very soon, so it doesn't become a problem for the project

@imsnif
Member

imsnif commented May 20, 2024

Given that this PR can break many behaviors for plugins on Zellij, we can commit to update Cranelift in Wasmer very soon, so it doesn't become a problem for the project

Much appreciated @syrusakbary !! But since I don't want you to perform extra work, I will emphasize something I mentioned elsewhere: we will not consider any upgrade of wasmer that includes wasix or any sort of extra dependency that comes with wasix.

Given that I understood this is a big ask for your project (totally legit ofc, every project and application has different requirements), I gave @bjorn3 the go-ahead for this work. If all checks out with it (I haven't gone over it yet) and we manage to iron out the differences with plugins you rightfully mentioned (some of them have already been brought up as points of order by @bjorn3 ) - we will go through with this change.

@syrusakbary

We will not consider any upgrade of wasmer that includes wasix or any sort of extra dependency that comes with wasix. Given that I understood this is a big ask for your project

It's not a big ask at all! Because WASIX is a superset of WASI, it is actually feasible to ship only the base WASI calls, for example via a Cargo feature flag. We can do that for you so that you can upgrade Wasmer easily, without the burden of changing runtimes and plugins!

I've created this issue so we can follow up on that quickly: wasmerio/wasmer#4722

@imsnif
Member

imsnif commented May 20, 2024

Thanks @syrusakbary ! As mentioned here and elsewhere, a feature flag will probably not suffice. We also need things like dependencies that are only used in wasix not to be included at all (for our packagers). This would likely have to be an entirely different crate (though I of course do not know about your internals).

But as I said - the work has already started here, so I intend to see it through. If we hit a road block that prevents us from migrating, or feel the migration is too risky - I'll be sure to check out the work in the linked issue to see if it would be a better path forward for us.

Thanks for your efforts and I am sorry things have to be this way, but we've sounded this alarm before a few times and we have to think of our application first.

@syrusakbary

syrusakbary commented May 20, 2024

@imsnif of course, no worries! Do what you think is best for the project.

Just a minor clarification: on the Wasmer side, we can turn off the WASIX syscalls, dependencies and code paths incredibly easily via a feature flag, so a new crate shall not be needed if you would like to stick with Wasmer :)

Member

@imsnif imsnif left a comment

Hey @bjorn3 - first, thanks so much for your thorough work on this. This is a major infrastructure change for us and I do not want to take it lightly - so I hope you don't mind being patient with my review cycle.

I'm going on a short vacation tomorrow and didn't want to leave this hanging without a response while I'm away. So I added some comments about things that immediately jumped out at me to get us started. I will give this a more thorough shake-up and review when I get back around the beginning of June.

In Wasmer trying to create a file with a relative path creates it in the /host directory. In Wasmtime it will result in an error that there is no "pre-opened file descriptor" through which the file can be opened. As WASI doesn't have any such concept as a current working directory, the Wasmtime behavior is more natural. It does break the current released version of zjstatus however. dj95/zjstatus@25e3a56 works around this. I tried to work around the difference, but while it fixed this specific issue, it broke a lot more things. As such I would prefer documenting the difference rather than making Wasmtime behave the same way as Wasmer.

I've been thinking about this specific issue and while I hear you about the difficulty with the behavior change, I'm not enthusiastic about it. zjstatus is just one example that we happened to catch, I'm sure there are more. Backwards compatibility of already-compiled plugins is a strong value for our ecosystem.

I'm not saying this is a deal-breaker, but I'd like to understand the trade-offs before going ahead with this. What other stuff breaks if we try to work around this problem? How big of a hack is it? Will it affect upgrades? What else?

@@ -84,7 +84,7 @@ fn start_zellij(channel: &mut ssh2::Channel) {
     )
     .unwrap();
     channel.flush().unwrap();
-    std::thread::sleep(std::time::Duration::from_secs(1)); // wait until Zellij stops parsing startup ANSI codes from the terminal STDIN
+    std::thread::sleep(std::time::Duration::from_secs(3)); // wait until Zellij stops parsing startup ANSI codes from the terminal STDIN
Member

Any idea why we need the increase here? I don't mind increasing it in this case for the e2e tests if we need it, but I'd prefer to understand what happened.

Author

Without it, the first snapshot for several tests seems to be taken while wasm is still compiling, which suggests that compiling is slower. This is surprising to me, as locally it actually seemed to be faster. Maybe the singlepass compiler of Wasmtime is slower than the one of Wasmer and only the optimizing compiler is faster?
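A fixed sleep has to be sized for the slowest machine that runs the tests. One alternative is to poll for readiness with a deadline. This is only a sketch: `wait_until` and whatever readiness check is passed to it are hypothetical and not part of the Zellij e2e harness.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `is_ready` every `interval` until it returns true or `timeout`
/// elapses. Returns whether the condition was met in time.
fn wait_until(
    mut is_ready: impl FnMut() -> bool,
    timeout: Duration,
    interval: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if is_ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

With a helper like this, a fast machine proceeds as soon as the condition holds, while a slow CI runner still gets the full timeout before the test fails.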

Note: Winch compiles about 2-6 times slower than Wasmer's Singlepass for non-trivial applications (>2 MB). This article has more examples, such as the time to compile SpiderMonkey or ffmpeg on each runtime:

https://wasmi-labs.github.io/blog/posts/wasmi-v0.32/

Author

All builtin plugins together add up to 2MB, but point taken. I agree Winch being a fair bit slower at compiling is likely the cause here. Caching compiled wasm modules between tests rather than recompiling every single time would likely help a lot. I haven't looked into how hard that would be, but Zellij already caches compiled wasm modules under normal operation. To be honest I'm surprised it isn't already cached between tests. Or maybe it is already getting cached, but would just need a warmup with a higher timeout before any tests run?
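To illustrate the caching idea, here is a minimal sketch of an in-memory cache that compiles each distinct wasm binary only once. `ModuleCache` is a hypothetical type, not Zellij's existing cache, and `M` stands in for the runtime's compiled-module type, which is assumed to be cheap to clone.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache of compiled modules, keyed by a hash of the wasm bytes.
/// A real cache would likely use a stronger digest to rule out collisions.
struct ModuleCache<M> {
    compiled: HashMap<u64, M>,
}

impl<M: Clone> ModuleCache<M> {
    fn new() -> Self {
        Self { compiled: HashMap::new() }
    }

    /// Returns the cached module for `wasm`, calling `compile` only on a miss.
    fn get_or_compile(&mut self, wasm: &[u8], compile: impl FnOnce(&[u8]) -> M) -> M {
        let mut hasher = DefaultHasher::new();
        wasm.hash(&mut hasher);
        let key = hasher.finish();
        self.compiled
            .entry(key)
            .or_insert_with(|| compile(wasm))
            .clone()
    }
}
```

Sharing one such cache across tests would mean only the first test that loads a given plugin pays the compile cost, which fits the warmup idea mentioned above.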

@bjorn3
Author

bjorn3 commented May 27, 2024

I've been thinking about this specific issue and while I hear you about the difficulty with the behavior change, I'm not enthusiastic about it. zjstatus is just one example that we happened to catch, I'm sure there are more. Backwards compatibility of already-compiled plugins is a strong value for our ecosystem.

I just reproduced this behavior in standalone Wasmer. Turns out it is completely disregarding the intention of the WASI specification:

fn main() {
    println!("{:?}", std::fs::write("foo", ""));
    println!("{:?}", std::fs::read("foo"));
    println!("{:?}", std::fs::read_dir("/").map(|dir| dir.collect::<Vec<_>>()));
    println!("{:?}", std::fs::read_dir(".").map(|dir| dir.collect::<Vec<_>>()));
    println!("{:?}", std::fs::read_dir("/tmp/foo/x").map(|dir| dir.collect::<Vec<_>>()));
}

when run in wasmtime with a /tmp/foo/x directory (containing my_file) passed through to the guest will print:

Err(Custom { kind: Uncategorized, error: "failed to find a pre-opened file descriptor through which \"foo\" could be opened" })
Err(Custom { kind: Uncategorized, error: "failed to find a pre-opened file descriptor through which \"foo\" could be opened" })
Err(Custom { kind: Uncategorized, error: "failed to find a pre-opened file descriptor through which \"/\" could be opened" })
Err(Custom { kind: Uncategorized, error: "failed to find a pre-opened file descriptor through which \".\" could be opened" })
Ok([Ok(DirEntry("/tmp/foo/x/my_file"))])

as I would have expected while wasmer prints

Ok(())
Ok([])
Ok([Ok(DirEntry("/.app")), Ok(DirEntry("/.private")), Ok(DirEntry("/bin")), Ok(DirEntry("/dev")), Ok(DirEntry("/etc")), Ok(DirEntry("/foo")), Ok(DirEntry("/tmp"))])
Ok([Ok(DirEntry("./.app")), Ok(DirEntry("./.private")), Ok(DirEntry("./bin")), Ok(DirEntry("./dev")), Ok(DirEntry("./etc")), Ok(DirEntry("./foo")), Ok(DirEntry("./tmp"))])
Ok([Ok(DirEntry("/tmp/foo/x/my_file"))])

In other words, rather than passing through the directories the user specified as pre-opened directories, as seems to be intended by the WASI specification, it creates an in-memory virtual filesystem whose contents are discarded upon exit, and inside this VFS it mounts the user-specified directories. The pre-opened directories it passes are / (fd 4), / (fd 5) and . (fd 6). (Yes, it passes / twice for whatever reason. I can imagine that some guest languages will get upset about that.) Wasmtime directly passes /tmp/foo/x as a pre-opened directory instead.
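The difference can be pictured with a simplified model of how a guest maps a path onto a pre-opened directory. This is a sketch for illustration only, not the actual lookup code used by Rust's standard library or wasi-libc; `Preopen` and `resolve` are hypothetical names.

```rust
/// A pre-opened directory as seen by the guest: a file descriptor plus the
/// guest-visible path it was registered under.
struct Preopen {
    fd: u32,
    guest_path: &'static str,
}

/// Simplified model: pick the pre-open with the longest guest path that
/// contains `path`, and return its fd plus the path relative to it.
fn resolve<'a>(preopens: &[Preopen], path: &'a str) -> Result<(u32, &'a str), String> {
    let mut best: Option<(&Preopen, &'a str)> = None;
    for preopen in preopens {
        let rest = if preopen.guest_path == "." {
            // "." only serves relative paths.
            if path.starts_with('/') {
                continue;
            }
            path
        } else if preopen.guest_path == "/" {
            match path.strip_prefix('/') {
                Some(rest) => rest,
                None => continue,
            }
        } else {
            match path.strip_prefix(preopen.guest_path) {
                Some("") => "",
                // Require a separator so "/tmp/foo/xy" does not match "/tmp/foo/x".
                Some(rest) if rest.starts_with('/') => &rest[1..],
                _ => continue,
            }
        };
        let is_longer = best.map_or(true, |(current, _)| {
            preopen.guest_path.len() > current.guest_path.len()
        });
        if is_longer {
            best = Some((preopen, rest));
        }
    }
    best.map(|(preopen, rest)| (preopen.fd, rest)).ok_or_else(|| {
        format!(
            "failed to find a pre-opened file descriptor through which {:?} could be opened",
            path
        )
    })
}
```

In this model, a host that registers only `/tmp/foo/x` makes `resolve` fail for `foo`, `/` and `.`, matching the Wasmtime output above. A host that also registers `/` and `.` makes every path resolvable, which is why the same program succeeds under Wasmer.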

$ RUST_LOG=wasmer_wasix::syscalls::wasi::fd_prestat_dir_name=trace wasmer run target/wasm32-wasi/debug/foo.wasm
2024-05-27T17:00:01.380160Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: return=Errno::success fd=3 path="/"
2024-05-27T17:00:01.380179Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: close time.busy=31.2µs time.idle=8.81µs fd=3 path="/"
2024-05-27T17:00:01.380201Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: return=Errno::success fd=4 path="/"
2024-05-27T17:00:01.380205Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: close time.busy=4.83µs time.idle=448ns fd=4 path="/"
2024-05-27T17:00:01.380213Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: return=Errno::success fd=5 path="."
2024-05-27T17:00:01.380217Z TRACE ThreadId(01) fd_prestat_dir_name: wasmer_wasix::syscalls::wasi::fd_prestat_dir_name: close time.busy=4.09µs time.idle=326ns fd=5 path="."
[...]

It would probably be possible to emulate this with wasmtime-wasi, but I think it would only hide bugs inside plugins. From what I can tell, before dj95/zjstatus@25e3a56 the .zjstatus.log file would have been discarded rather than saved on the host as intended. However, based on https://discord.com/channels/771367133715628073/1240239515071942727/1240246058970517556 it seems that it somehow gets stored in the path mounted at /host anyway. If you still want me to implement support for emulating the Wasmer behavior, I will do so. Maybe it could emit a future-compatibility warning when a plugin appears to depend on the Wasmer behavior.
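One possible shape for such a warning is a host-side check on the paths a plugin tries to open. The sketch below is purely illustrative: `compat_warning` is a hypothetical function, the list of mounted directories is an assumption, and where such a check would hook into the WASI layer is left open.

```rust
/// Hypothetical check: returns a warning when `path` would only work under
/// the Wasmer-style virtual filesystem, meaning it is relative or lies
/// outside every directory the host actually mounts.
fn compat_warning(path: &str, mounted_dirs: &[&str]) -> Option<String> {
    if !path.starts_with('/') {
        return Some(format!(
            "plugin opened relative path {:?}; this relies on Wasmer-specific behavior",
            path
        ));
    }
    let inside_mount = mounted_dirs.iter().any(|dir| {
        // Match the directory itself or anything below it, so "/hostile"
        // is not treated as being inside "/host".
        path == *dir
            || path
                .strip_prefix(*dir)
                .map_or(false, |rest| rest.starts_with('/'))
    });
    if inside_mount {
        None
    } else {
        Some(format!(
            "plugin opened {:?}, which is outside all mounted directories",
            path
        ))
    }
}
```

For example, with `/host` as the only mounted directory, `.zjstatus.log` and `/etc/passwd` would each produce a warning while `/host/.zjstatus.log` would not.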

This will enable PluginEnv to be the Store context when migrating to
Wasmtime.
This will allow removing the Clone impl from PluginEnv when migrating to
Wasmtime as required by the missing Clone impl on Wasmtime's WasiCtx.
Wasmtime requires storing the read/write end of the pipe outside of the
WasiCtx. Passing PluginEnv to these functions allows storing them in the
PluginEnv.
To wait for all plugins to be compiled.

3 participants