167 points by syrusakbary 21 hours ago | 18 comments
marmaduke 20 hours ago
Couldn’t a tcc or similarly simple C compiler be used instead of a 100MB Clang? Where’s the C to wasm compiler hiding?
mananaysiempre 18 hours ago
One issue with Wasm is you essentially can't target it with a single-pass compiler, unlike just about any real machine. Wasm can only represent reducible control flow, so you have to pass your control-flow graph through some variation of the Relooper[1,2]. I don't know if upstream tcc can do that (there are apparently some forks?..).

[1] http://troubles.md/why-do-we-need-the-relooper-algorithm-aga...

[2] https://medium.com/leaningtech/solving-the-structured-contro...
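To make the reducibility point concrete, here is a minimal sketch in C. The function names are made up for illustration, and the second function shows only the generic "dispatch variable" fallback, not what any particular compiler emits. `count_goto` has a loop that can be entered at either of two labels, so it has no single header; `count_dispatch` computes the same thing using only a loop and a `switch`, which is the kind of shape Wasm can express.

```c
/* Irreducible control flow: the cycle formed by labels `a` and `b`
   can be entered at either label, so it has no single loop header. */
int count_goto(int n, int start_at_b)
{
    int steps = 0;
    if (start_at_b) goto b;
a:
    steps++;
    if (--n <= 0) return steps;
b:
    steps += 10;
    if (--n <= 0) return steps;
    goto a;
}

/* Same behaviour with only structured constructs: a dispatch variable
   selects which "block" runs next inside a single loop. */
int count_dispatch(int n, int start_at_b)
{
    int steps = 0;
    int label = start_at_b ? 1 : 0;   /* 0 = block a, 1 = block b */
    for (;;) {
        switch (label) {
        case 0:
            steps++;
            if (--n <= 0) return steps;
            label = 1;
            break;
        case 1:
            steps += 10;
            if (--n <= 0) return steps;
            label = 0;
            break;
        }
    }
}
```

The dispatch form costs an extra variable and a branch per block, which is why real implementations try to recover proper loops and ifs first and fall back to this only where they must.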

titzer 18 hours ago
> you essentially can't target it with a single-pass compiler,

That might be true if your source language has goto, but for other languages that start with structured control flow, it's possible to just carry the structure through and emit Wasm directly from the AST.
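A rough sketch of what "carry the structure through" can look like, in C. Everything here (the `Node` type, `emit`, the flat text output) is hypothetical and only meant to illustrate the idea: each AST node is turned into Wasm text instructions in one recursive walk, with a `while` lowered to the usual `block`/`loop` pair. Conditions are assumed to be pre-rendered instruction text that leaves an i32 on the stack.

```c
#include <stdio.h>
#include <string.h>

typedef enum { N_OP, N_SEQ, N_IF, N_WHILE } Kind;

typedef struct Node {
    Kind kind;
    const char *text;       /* N_OP: one instruction; N_IF/N_WHILE: condition code */
    const struct Node *a;   /* N_SEQ: first; N_IF: then-branch; N_WHILE: body */
    const struct Node *b;   /* N_SEQ: second; N_IF: else-branch or NULL */
} Node;

/* Append one line to `out`, truncating silently if the buffer is full. */
static void put(char *out, size_t cap, const char *line)
{
    size_t len = strlen(out);
    if (len < cap)
        snprintf(out + len, cap - len, "%s\n", line);
}

/* One recursive walk: every construct is emitted the moment it is visited.
   `out` must hold a NUL-terminated string (possibly empty) on entry. */
void emit(const Node *n, char *out, size_t cap)
{
    if (!n) return;
    switch (n->kind) {
    case N_OP:
        put(out, cap, n->text);
        break;
    case N_SEQ:
        emit(n->a, out, cap);
        emit(n->b, out, cap);
        break;
    case N_IF:
        put(out, cap, n->text);
        put(out, cap, "if");
        emit(n->a, out, cap);
        if (n->b) {
            put(out, cap, "else");
            emit(n->b, out, cap);
        }
        put(out, cap, "end");
        break;
    case N_WHILE:
        put(out, cap, "block");
        put(out, cap, "loop");
        put(out, cap, n->text);
        put(out, cap, "i32.eqz");
        put(out, cap, "br_if 1");   /* condition false: leave the outer block */
        emit(n->a, out, cap);
        put(out, cap, "br 0");      /* jump back to the loop header */
        put(out, cap, "end");
        put(out, cap, "end");
        break;
    }
}
```

There is no `break`, `continue` or `goto` in this toy AST; supporting the first two only needs a label-depth counter, while `goto` is where the single-pass approach stops working.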

mananaysiempre 17 hours ago
Sure, I was speaking in the context of C specifically. (In non-simplistic compilers, you may not want to preserve the source structure anyway—e.g. in Scheme or Lua with tail calls all over the place.)
RossBencina 13 hours ago
Presumably C's `switch` is also a problem.
metadat 10 hours ago
Yes, I don't recall all the confusing elements and technicalities of what's allowed in `switch` statements in C offhand, but here are a few brainfscks:

https://old.reddit.com/r/C_Programming/comments/16kg48y/mind...

https://old.reddit.com/r/programminghorror/comments/ylc7f3/w...

r1chardnl 6 hours ago
I went down a rabbithole and wow.

Found a comment from the author of https://github.com/stclib/STC apparently and then came across this example:

https://stackoverflow.com/a/76887723

  int coro_a(struct a* g)
  {
   cco_routine (g) {
    printf("entering a\n");
    for (g->i = 0; g->i < 3; g->i++) {
     printf("A step %d\n", g->i);
     cco_yield();
    }
    cco_final:
    printf("returning from a\n");
   }
   return 0; // done
  }

  gcc -E -ISTC/include co.c

After running it through a preprocessor, it gives me this.

  int coro_a(struct a* g)
   {
    for (int* _state = &(g)->cco_state; *_state != CCO_STATE_DONE; *_state = CCO_STATE_DONE) _resume: switch (*_state) case 0: {
     printf("entering a\n");
     for (g->i = 0; g->i < 3; g->i++) {
      printf("A step %d\n", g->i);
      do { *_state = 14; return CCO_YIELD; goto _resume; case 14:; } while (0);
     }
     *_state = CCO_STATE_FINAL; case CCO_STATE_FINAL:
     printf("returning from a\n");
    }
    return 0;
   }
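
The trick in that expansion is that `case` labels may sit inside a loop body, so a `switch` on a saved state can jump straight back into the middle of the loop. A stripped-down sketch of the same idea, without any macros (the names `Counter`, `counter_next`, `CO_YIELD` and `CO_DONE` are invented here and are not STC's API):

```c
typedef struct {
    int state;   /* 0 = not started, 1 = suspended inside the loop, -1 = finished */
    int i;       /* loop counter kept in the struct, not on the stack */
} Counter;

enum { CO_DONE = 0, CO_YIELD = 1 };

/* Returns CO_YIELD with the next value in *out, or CO_DONE when exhausted. */
int counter_next(Counter *c, int limit, int *out)
{
    switch (c->state) {
    case 0:
        for (c->i = 0; c->i < limit; c->i++) {
            *out = c->i;
            c->state = 1;
            return CO_YIELD;
    case 1:;               /* a resumed call lands here, inside the for body */
        }
        c->state = -1;
        /* fall through */
    default:
        return CO_DONE;
    }
}
```

Starting from a zero-initialised `Counter` and calling `counter_next(&c, 3, &v)` repeatedly yields 0, 1, 2 and then reports `CO_DONE`. Anything that must survive a yield has to live in the struct, which is why the STC example keeps `i` in `g`.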
alexdovzhanyn 17 hours ago
This is true. In Theta (https://github.com/ThetaLang/Theta) this is exactly what we do -- no need for more than one pass for the WASM codegen.
halb 17 hours ago
If all you want to do is compile and run C code in the browser, you could run tcc in the blink x86_64 emulator, running in wasm. It would take ~300Kb, less than the js & css used in the average webpage.
syrusakbary 20 hours ago
The whole LLVM toolchain is a bit big. I think we can reduce the size much further. We actually looked into using tcc, but unfortunately tcc doesn’t have a wasm backend (for generating wasm output). It would be awesome if they added it!
fuhsnn 20 hours ago
Check out https://github.com/tyfkda/xcc, I've only used the native backend, but it's small and fast.
syrusakbary 20 hours ago
Nice! I didn’t know about the project. Thanks for sharing!
mdhb 3 hours ago
This project is also very much worth checking out.

https://cranelift.dev/

From the page:

Cranelift is a fast, secure, relatively simple and innovative compiler backend. It takes an intermediate representation of a program generated by some frontend and compiles it to executable machine code. Cranelift is meant to be used as a library within an "embedder".

It is in successful use by the Wasmtime WebAssembly virtual machine, for just-in-time (JIT) and ahead-of-time (AOT) compilation, and also as an experimental backend for the Rust compiler.

Cranelift is an optimizing compiler, but it aims to take a fresh look at which optimizations are necessary. We have explicitly avoided features -- such as advanced alias analysis or use of undefined behavior -- that have historically led to subtle miscompilations in other compilers. Cranelift consists of about 200 thousand lines of code; in contrast, e.g. LLVM consists of over 20 million lines of code, a hundred times larger. This difference also allows Cranelift to be relatively approachable to developers, researchers, auditors and others who wish to understand how it works.

gnulinux 16 hours ago
I recently wanted to use tcc for a homebaked programming side project and was surprised to find it's no longer supported, at least not by Fabrice Bellard. Upstream git still has some light activity but no releases. I wasn't sure how good of an idea it is to rely on it as a code generator.
stefanos82 15 hours ago
It's alive and kicking my friend https://repo.or.cz/tinycc.git/shortlog

We wait for grischka to decide when to announce a new release https://lists.nongnu.org/archive/html/tinycc-devel/2024-10/m...

gnulinux 0 minutes ago
I see thanks, that's great.
kachapopopow 20 hours ago
clang can target wasm already.
mati365 5 hours ago
You can compile C using JavaScript and target DOS if you are hardcore enough. https://github.com/Mati365/ts-c-compiler
kylewlacy 18 hours ago
Very cool! I've been watching the "toolchains in Wasm" landscape for a while, and seeing a Clang/LLVM toolchain running in Wasm is awesome!

YoWASP has also had an LLVM toolchain working in Wasm for a while[1], although it seems like this version solves the subprocess problem by providing an implementation of `posix_spawn`, whereas the YoWASP one uses some patches to avoid subprocesses altogether.

My biggest question marks around this version are about runtime/platform support. As I understand it, this toolchain uses WASIX, which (AFAICT) works with Wasmer's own runtime and with a browser shim, but with none of the other runtimes. Are there plans to get WASIX more widely adopted across more runtimes, or to get WASIX caught up to the latest WASI standard (preview2)? Or maybe even better, bring the missing features from WASIX to mainline WASI like `posix_spawn`[2]? I'd love to be able to adopt this toolchain, but it doesn't seem like WASIX support has really caught on across the other runtimes

[1]: https://discourse.llvm.org/t/rfc-building-llvm-for-webassemb...

[2]: https://github.com/WebAssembly/WASI/issues/414

ancientstraits 13 hours ago
A few weeks ago, I tried to compile Clang to WebAssembly but got several different errors. I fixed a lot of them, but some seemed kind of impossible to fix, so I thought I would try again at a later date. However, it seems I will not need to try again. I feel angry that someone made a convenient solution before I did, but also happy, because this probably implies that they made a consistent process to compile Clang for WASM.
egnehots 18 hours ago
It's pretty misleading not to mention the performance overhead. That's an obvious downside and quite easy to benchmark. Skipping any discussion of performance feels like sweeping it under the marketing rug :/
bilekas 18 hours ago
> Skipping any discussion of performance feels like sweeping it under the marketing rug

Expecting performance while compiling C in the browser feels redundant right now though.

zengid 17 hours ago
Didn't Gary Bernhardt do this in 2014? /sarcasm
pjmlp 6 hours ago
Not really, on Firefox

    panicked at /Users/syrusakbary/Development/wasmer/lib/api/src/js/instance.rs:62:84:
    called `Result::unwrap()` on an `Err` value: JsValue(Function(bound 846))

    Stack:

    fe/_.wbg.__wbg_new_abda76e883ba8a5f@https://wasmer.sh/assets/index-CgFg6VHw.js:17:6582
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[1125]:0x2b4276
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2888]:0x3ab373
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[8254]:0x435ed3
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[4825]:0x3fa7de
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[517]:0x1af753
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[294]:0xbed03
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2039]:0x34b10e
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[2896]:0x3abaa1
    @https://wasmer.sh/assets/wasmer_js_bg-BruS15W0.wasm:wasm-function[9393]:0x43fde1
    S@https://wasmer.sh/assets/index-CgFg6VHw.js:11:424
    c@https://wasmer.sh/assets/index-CgFg6VHw.js:11:264
apignotti 16 hours ago
GCC? That's easy! :-) What about a complete system? https://webvm.io

Shameless plug: we are hosting a WebVM Hackathon next week (11-14 October) over Discord. For more information: https://cheerpx.io/hackathon

baudaux 7 hours ago
Can I compete with exaequOS?
apignotti 7 hours ago
No, the competition is explicitly based around CheerpX: an X86 virtualization technology built on top of WebAssembly
jedisct1 7 hours ago
What about something that doesn't even require WebAssembly and is faster? https://bellard.org/jslinux/
yuri91 4 hours ago
very unscientific benchmark of `clang hello.c`, after a few runs to make sure the code is downloaded/cached:

jslinux: 4.7s

wasmer: 1.3s

webvm: 1.2s

corysama 20 hours ago
Is it possible/already existing to have interactive C++ lessons where the user's C++ code is compiled and run client-side in a web page?
halb 18 hours ago
Absolutely! You can even run clang in wasm targeting x86_64, and then emulate the resulting program using the blink x86_64 emulator.

I'm working on something similar, where students can compile intel assembly and run it client-side: https://github.com/robalb/x86-64-playground

legobmw99 19 hours ago
nunobrito 8 hours ago
Thanks, I'm looking at it, but the documentation is so scarce and I'm not a proficient C expert.

What syntax can be used to run emception? Thank you.

legobmw99 1 hour ago
It’s sadly a bit more of a proof of concept than a hackable project. The docker build in the readme did work last time I tried, and there is a demo site at https://jprendes.github.io/emception/, but I’ve failed to modify it in the past to do other things

There is a fork at https://github.com/emception/emception that is trying to make it more production ready, but it looks like that may have stalled

fwip 20 hours ago
Definitely been possible for at least 5 years now. Would probably be a weekend project now.
legobmw99 19 hours ago
If what I want is not an executable but a shared library, does this get me anything?

I currently have a use case that uses a server running an emscripten build (using `-sMODULARIZE` and some exports, I suppose it’s not a true dylib).

Muromec 18 hours ago
Importing a wasm module from a wasm module is (non)surprisingly impossible to do -- you have to have a linker, abi and all that.
okwhateverdude 17 hours ago
It is possible with some care. I was looking into this with WAForth, which compiles the wasm and loads it via a host function (i.e. it is the host's responsibility to make it available). I wanted to enable dynamic loading of words from disk, which requires some bookkeeping and shuffling a bunch of bytes around during compilation to write out the bits necessary to have the host do that linking. It isn't impossible to do, just tedious, and in my case having to write it in WAT is a pain.
Muromec 16 minutes ago
Yep, you need to do the nasty bits by hand, that's what I mean.
CyberDildonics 19 hours ago
100MB on every page refresh just to compile C is a pretty bold direction to go in.
jampekka 19 hours ago
Except if/when it's cached.
adastra22 18 hours ago
I don’t want my cache requirements ballooning by 100mb.
Retr0id 20 hours ago
> note: it requires a 100MB download

Is this how big a clang toolchain usually is?

syrusakbary 20 hours ago
The Clang WASI SDK weighs about 100Mb compressed. We optimized things a bit but still have a way to go (we are not yet compressing in the network). I believe we can serve everything in about 30Mb
01HNNWZ0MV43FF 20 hours ago
MB, right? Mb is megabit

I only have to bring this up because network providers still insist on measuring bits
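
For the arithmetic behind the nitpick, a tiny sketch in C (the function name and the example link speed are made up for illustration): sizes are usually quoted in megabytes and link speeds in megabits per second, so there is a factor of 8 between them.

```c
/* Ideal transfer time in seconds for a payload given in megabytes (MB)
   over a link rated in megabits per second (Mb/s). Ignores protocol
   overhead and latency; 1 byte = 8 bits. */
double download_seconds(double size_megabytes, double rate_megabits_per_s)
{
    return size_megabytes * 8.0 / rate_megabits_per_s;
}
```

So a 100 MB download on a nominal 100 Mb/s link takes at least 8 seconds, not 1.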

kevin_thibedeau 19 hours ago
They insist on it because it is the proper way to measure data rates on serial bit streams where out-of-band encoding doesn't divide up on octet boundaries.
msla 12 hours ago
They insist on it because big numbers sell better.
westurner 18 hours ago
Cling (the interactive C++ interpreter) should also compile to WASM.

There's a xeus-cling Jupyter kernel, which supports interactive C++ in notebooks: https://github.com/jupyter-xeus/xeus-cling

There's not yet a JupyterLite (WASM) kernel for C or C++.

zh2408 18 hours ago
What's the use case?
baanist 15 hours ago
Like most things in software the use cases are the limits of one's imagination. The browser has always been a Turing complete development environment so this is just another demonstration.
dxroshan 8 hours ago
I was also asking exactly the same question.
ttflee 11 hours ago
Every few years, new progress reminds me of this talk by Gary Bernhardt:

https://www.destroyallsoftware.com/talks/the-birth-and-death...

pjmlp 6 hours ago
Yeah, mostly because WebAssembly is the new kid in bytecode town.
whytevuhuni 19 hours ago
Now all this needs is a simple OS running in a browser, that can edit and compile itself, post the resulting binary onto a WebDAV somewhere, and reload itself from there.

Then it becomes a fully self-sustaining OS that can live forever in a browser.

baudaux 14 hours ago
Something like exaequOS? https://exaequos.com
d_philla 17 hours ago
Check out Jeff Lindsay's Wanix project: https://wanix.sh/
dangerwill 18 hours ago
Very interesting idea but I have to say that those goals are not possible with a simple OS, at least by OS definitions of simple :P
whytevuhuni 18 hours ago
The old https://webassembly.sh/ and the new https://wasmer.sh/ have come a long way already.

All you need is a virtual filesystem of some sort, a way to download, a way to upload, an editor, a compiler, and a VT100 JS library. We already have WASI for the rest.

If the JS is too undesirable, then perhaps go with the old framebuffer graphics mode (e.g. a region of the WASM memory that is interpreted as an ASCII screen, or maybe even as a full bitmap buffer). Then the JavaScript side just needs to forward keyboard/mouse into memory and that screen region out of memory.
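
A minimal sketch of the ASCII-screen variant on the guest side, in C. All names here (`screen_base`, `screen_clear`, `screen_puts`) and the 80x25 size are assumptions for illustration; in a real Wasm build the functions would be exported and the host would read the bytes at the returned address out of linear memory on each frame.

```c
#include <stdint.h>
#include <string.h>

enum { COLS = 80, ROWS = 25 };

/* The text screen: one byte per cell, row-major. */
static uint8_t screen[ROWS * COLS];

/* Lets the host find where the screen lives in memory. */
uint8_t *screen_base(void) { return screen; }

/* Fill the whole screen with spaces. */
void screen_clear(void) { memset(screen, ' ', sizeof screen); }

/* Write a string at (row, col), clipping at the end of the row. */
void screen_puts(int row, int col, const char *s)
{
    if (row < 0 || row >= ROWS || col < 0) return;
    for (; *s && col < COLS; s++, col++)
        screen[row * COLS + col] = (uint8_t)*s;
}
```

Input would go the other way: the host writes key and mouse events into another agreed-upon region and the guest polls it.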

Arshia001 17 hours ago
The framebuffer idea is used in this wasm doom port: https://github.com/diekmann/wasm-fizzbuzz/tree/main/doom

WASIX already does all the other stuff you mentioned, including in the browser. The one thing it's missing is GUI, mainly because there's no standard GUI interface in POSIX.

baudaux 2 hours ago
It is possible. I already embedded chibicc in exaequOS. I will continue with xcc and clang
jauntywundrkind 19 hours ago
And then use webrtc (or ideally someone can revive a webtransport-p2p please!) to serve itself from a page to other people.

Ideally http3 over webtransport-p2p!

Then add some network discovery so we can advertise & find what's available on our networks!

spirodonfl 8 hours ago
Do you have a proper link to the webtransport-p2p idea? I've done a few searches but I think there's some mix of current implementation and deprecated implementation somehow.

What is it that needs reviving?

jauntywundrkind 4 hours ago
The spec is inactive, afaik, no implementations. Got it backwards, pardon, p2p-webtransport. https://github.com/w3c/p2p-webtransport

I don't know why it's fallen off, to be honest, or what was raised against it. Highly desirable to a lot of p2p folk, a very promising webrtc datatransport replacement.

bilekas 18 hours ago
"Yeah, yeah, but your scientists were so preoccupied with whether or not they could that they didn't stop to think if they should." ....
_ache_ 3 hours ago
"We do what we must

Because we can"