Interesting, I hadn't heard of this system before. It seems to be funded by the EU. Apparently it has been written in pure Rust since 2020, and Andrew "bunnie" Huang seems to be involved.

Is there a PDF version of the book (https://betrusted.io/xous-book/)?

It's not directly funded by the EU; it's funded by NLNet, which is only in part funded by the EU. The goal is to collect money from large sources (e.g. the EU) through relatively complex subsidies that are too big for small projects, then evaluate projects and dispatch the funds.

Source : I have an NLNet funded project, so like Xous https://github.com/betrusted-io/xous-core?tab=readme-ov-file... I have such banners at the bottom of my repository.

Thanks to a PR from a community member (millette) there's now a PDF: https://betrusted.io/xous-book/pdf/xous-book.pdf
  • romac · 14 hours ago
There is a single-page version of the book that you can save as a PDF: https://betrusted.io/xous-book/print.html
Great, thanks.

I assume the "kernel" makes heavy use of "unsafe", because all the infrastructure assumed by Rust is not available. Or how was this solved?

  • ajb · 12 hours ago
From the talk linked above, they went to considerable effort to design a system with a cheap processor which nevertheless contains an mmu, and so most other embedded kernels, which assume the lack of one, are not applicable. So the point of writing in rust is that they can ensure that some of the guarantees of rust are enforced by the hardware. (It's been a while since I watched that talk, so I don't recall exactly which ones). And this is a microkernel, not a monolithic kernel, so they will be using hardware guarantees even between kernel components.
To be fair, 1) Zephyr can take advantage of an MMU if you have one, and 2) Linux itself scales down surprisingly far. Keep in mind that its lineage extends far back in time and that it retains much of its ability to run on low-spec hardware.
It's not really about infrastructure but yes kernels and firmwares have to do a lot of stuff the compiler can't verify as safe, eg writing to a magic memory address you obtained from the datasheet that enables some feature of the chip. And that will need to happen in unsafe code blocks. I wouldn't call that a problem but it is a reality.
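To make the "magic memory address" point concrete, here is a minimal sketch of the usual pattern: the volatile access lives in a small `unsafe` block, and everything above it is safe code. The `Reg32` wrapper is a made-up name for illustration, not something from Xous, and a local variable stands in for the datasheet address so the sketch runs on a desktop.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Hypothetical wrapper around one 32-bit register (name invented here).
struct Reg32 {
    addr: *mut u32,
}

impl Reg32 {
    /// # Safety
    /// `addr` must be valid, aligned, and only accessed through this
    /// wrapper for as long as the wrapper is in use.
    unsafe fn new(addr: *mut u32) -> Self {
        Reg32 { addr }
    }

    /// Safe to call: the contract was established once, in `new`.
    fn write(&mut self, value: u32) {
        // SAFETY: `new` requires `addr` to be valid and aligned.
        unsafe { write_volatile(self.addr, value) }
    }

    fn read(&self) -> u32 {
        // SAFETY: same contract as `write`.
        unsafe { read_volatile(self.addr) }
    }
}

fn main() {
    // Stand-in for a real register address taken from a datasheet.
    let mut fake_register: u32 = 0;
    // SAFETY: the pointer comes from a live local that outlives `reg`.
    let mut reg = unsafe { Reg32::new(&mut fake_register) };
    reg.write(0x1);
    assert_eq!(reg.read(), 0x1);
}
```

On real hardware the only change would be where the pointer comes from; the compiler still cannot check that the address is right, which is exactly why `new` is `unsafe`.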
Are you one of the authors? Concerning the "infrastructure": Rust assumes a runtime, the standard library assumes a stack exists, a heap exists, and that main() is called by an OS; in a kernel, none of this is true. And the borrow checker cannot reason about things like e.g. DMA controllers mutating memory the CPU believes it owns, Memory-mapped I/O where a "read" has side effects (violating functional purity), context switches that require saving register state to arbitrary memory locations, or interrupt handlers that violate the call stack model. That's what I mean by "infrastructure". It's essentially the same issue with every programming language to some degree, but for Rust it is relevant to understand that the "safety guarantees" don't apply to all parts of an operating system, even if written in Rust.
I am a maintainer. I think what you're referring to is the problem where `std` is actually a veneer on C - so for example, when Rust allocates memory on an x86-class desktop, it actually invokes a C library (jemalloc, or whatever the OS is using) and that networking is all built on top of libc. Thus a bunch of nice things like threads, time, filesystem, allocators are all actually the same old C libraries that everything else uses underneath a patina of Rust.

In Xous, considerable effort went in to build the entire `std` in Rust as well, so no C compilers are required to build the OS, including `std`. You can see some of the bindings here in the Rust fork that we maintain: https://github.com/betrusted-io/rust/tree/1.92.0-xous/librar...

Thus to boot the OS, a few lines of assembly are required to set up the stack pointer and some default exception handler state, and from there we jump into Rust and stay in Rust. Even the bootloaders are written in Rust using the small assembly shim then jump to Rust trick.

Xous is a Tier-3 Rust OS, so we are listed as a stable Rust target. We build and host the binaries for our `std` library, which native rustc knows how to link against.

Thanks, interesting. My concern was less about which language implements std, but rather about the semantic mismatch between Rust's ownership model and hardware behavior (e.g. DMA aliasing, MMIO side effects). So I was curious what work-around you found; do you e.g. use wrapper types with VolatileCell, or just raw pointers?
  • wmf · 8 hours ago
> standard library assumes a stack exists, a heap exists, and that main() is called

A small assembly stub can set up the stack and heap and call main(); from then on you can run Rust code. The other topics you mention are definitely legitimate concerns that require discipline from the programmer because Rust won't automatically handle them but the result will still be safer than C.

I have no affiliation, I'm just a commenter.

The standard library requires a heap and such, but you can enable the no_std attribute to work in environments where they don't exist. https://docs.rust-embedded.org/book/intro/no-std.html

Rust's safety model only applies to code you write in your program, and there's a lot that's unsafe (cannot be verified by the compiler) about writing a kernel or a firmware, agreed. You could have similar problems when doing FFI as well.

  Rust assumes a runtime, the standard library assumes a stack exists, a heap
  exists, and that main() is called by an OS;
Wrong.

Source: I'm writing Rust without a runtime without a heap and without a main function. You can too.

  • xobs · 5 hours ago
The Rust runtime will, at a minimum, set up the stack pointer, zero out the .bss, and fill in the .data section. You're right in that a heap is optional, but Rust will get very cranky if you don't set up the .data or .bss sections.
  • comex · 5 hours ago
As will C.
No idea what either of you are talking about. It's the operating system that sets those things up not the language runtime.
Use of "unsafe" is unavoidable. Various pieces of hardware are directly writing into the address space. Concepts of "ownership" and "mutability" go beyond code semantics.
  • junon · 13 hours ago
You can't write a kernel without `unsafe` appearing somewhere.
Yeah. That's why my preferred approach isn't to use Rust for the core TCB. It'd be mostly unsafe anyway, so what's the point? You can write in an all-unsafe language if you want. you can still prove it correct out of band, and seL4 has done that work for you.

Sure, you could just use unsafe Rust and prove it correct with Prusti or something, but why duplicate work?

It is true that hardware, by definition, is a big ball of globally mutable state with no guarantees about concurrency, data types, or anything else. However, one could take the view that it's the role of the OS to restrict & refine that raw power into a set of APIs that are safe, through a set of disciplines, such as reasoning through why an unsafe block might actually be sound.

unsafe means that the compiler can't provide any guarantees about what happens inside the unsafe block. However, it is possible to manually ensure those guarantees.

Thus as a matter of discipline every time an unsafe block is used there's a comment next to it reciting the mantra of safety: This `unsafe` is sound because ... all data types are representable, the region is aligned & initialized with valid data, the lifetime is static, we structurally guarantee only one owner at a time (no concurrency issues)...oftentimes in writing that comment, I'll be like, "oh, right. I didn't actually think about concurrency, we're going to need an Atomic somewhere around this to guarantee that" - and that saves me a really hard-to-find concurrency bug down the road.

So while this is a very manual process, I have found the process of documenting safety to be pretty helpful in improving code quality.

Once you've established safety, then you do get some nice things in your life, like Mutexes, RefCells, Arcs, and the whole collections library to build an OS on top of, which saves us a lot of bugs. It is kind of nice to have a situation where if the code compiles, it often just works.
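As an illustration of the safety-comment discipline described above, here is a small sketch of what such a comment can look like next to an `unsafe` block. The `Scratch` type and its `with` method are hypothetical names invented for this example, not Xous code; the atomic flag is the kind of "we're going to need an Atomic" fix the comment tends to surface.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical one-owner-at-a-time scratch buffer.
struct Scratch {
    taken: AtomicBool,
    data: UnsafeCell<[u8; 16]>,
}

// SAFETY: all access to `data` is gated by the `taken` flag, so at most
// one thread can hold a reference into the buffer at any time.
unsafe impl Sync for Scratch {}

static SCRATCH: Scratch = Scratch {
    taken: AtomicBool::new(false),
    data: UnsafeCell::new([0; 16]),
};

impl Scratch {
    /// Runs `f` with exclusive access, or returns `None` if the buffer is
    /// already in use. (If `f` panics the flag stays set; a real
    /// implementation would use a guard type.)
    fn with<R>(&self, f: impl FnOnce(&mut [u8; 16]) -> R) -> Option<R> {
        if self.taken.swap(true, Ordering::Acquire) {
            return None;
        }
        // SAFETY: this `unsafe` is sound because
        // - every bit pattern is a valid `u8` and the array is initialized;
        // - `UnsafeCell::get` returns a properly aligned pointer;
        // - the buffer lives at least as long as `&self`;
        // - the `taken` flag guarantees only one owner at a time.
        let result = f(unsafe { &mut *self.data.get() });
        self.taken.store(false, Ordering::Release);
        Some(result)
    }
}

fn main() {
    let answer = SCRATCH.with(|buf| {
        buf[0] = 41;
        buf[0] + 1
    });
    assert_eq!(answer, Some(42));
}
```

Callers only ever see the safe `with` method, so the reasoning in the comment has to be checked once, in one place.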

  • junon · 1 hour ago
Because not ALL of it is unsafe. The point of using Rust in the kernel is to write abstractions over the unsafe bits and then utilize safe Rust for all the logic you build on top.
I guess then you aren't writing a kernel anymore, you're writing a driver suite for seL4.
Yep. And that's a good place to be. Keep in mind that the "driver suite" in an seL4 system includes a bunch of things that others would put in the kernel: memory management and swap, networking, filesystems, linking and loading, and so on are all userspace. So, if you want, you still get to differentiate based on interesting low-level things.

Calling seL4 system guts a "driver suite" is like calling rustc "just a preprocessor for LLVM IR". True, but only in the most uselessly pedantic sense.

>It'd be mostly unsafe anyway, so what's the point?

The vast majority of the code that will be tagged "unsafe", will be done so because you're doing the equivalent of FFI, but implemented in hardware. If there was a way to automatically generate the binding from a register map, the only purpose of the unsafe keyword would be to warn you that the effect of the ffi call you are doing is unknown. In other words, the unsafe marker isn't some kind of admission of defeat. It marks the potentially unsafe sections of the code where additional care might be required.

This means you're throwing out the baby with the bathwater.

This was a really great talk. Full of interesting things. E.g. his BIO system for replacing Raspberry Pi's proprietary PIO. It uses RV32E (16 registers) and then uses x16-31 as custom registers to directly control the pins so you can do GPIO without the usual delays from MMIO.
The talk is great. Good intro to Xous, and to using the MMU to do a sort of Rust borrow-checked owned-memory system.

Also Bunnie talks about making the Baochip: a 1+4 small-tiny risc-v design. That hitchhikes on another company's core! "Can I also put fuses on so we can use this as a risc-v design?" "Sure"! So awesome!!

Surprisingly, nobody asked: how does it compare to Redox, another message-passing microkernel system written in Rust? Also, what does "for embedded devices" mean? Does it have specific features that other microkernel systems don't, or does it just mean it is limited in scope*?

*Contrast to Redox that is meant to be general purpose but also offers an embedded-oriented minimal version.

Can't comment on Redox as I'm not familiar with it (maybe xobs is), but "for embedded devices" means design choices are made to accommodate smaller memory footprints - hundred k's of RAM, ROM; not the gigabytes expected in desktop-class OSes. So, this is in the same class as e.g. zephyr, threadx, chibi-os, Tock, etc. and has no explicit aspirations to be able to run e.g. server workloads. An example of such a trade-off is sticking with a 32-bit pointer size from the get-go. No desktop or server-class OS could make that trade-off, but the memory savings from smaller pointer and object sizes is meaningful on a memory-constrained device.
That's a neat little system.

Two surprising design decisions:

- One-way messages. You send, then, in a separate operation, you wait for a reply. This happens at each end. That means two extra trips through the scheduler and more time jitter. QNX has a blocking "MsgSend" which sends and waits for a reply. The scheduler transfers control to the receiving thread in the common case where the receiver is waiting, which behaves like a coroutine with bounded latency. It's a subtle point, but one of the reasons QNX is so well behaved about jitter.

- Interprocess communication by memory remapping instead of copying. This is high overhead for small messages, and at some fairly large size, becomes a win. Remapping pages means a lot of MMU and cache churn. Cost varies with the CPU and memory architecture. Mach worked that way, and the overhead was high. Not sure how expensive it is with modern MMUs. Do you have to stop other threads that might have access to the page about to be unmapped?

  • xobs · 5 hours ago
> One-way messages

Messages are either one-way (Send or Scalar), or are two-way (BlockingScalar, Lend, or MutableLend). For two-way messages, the receiving process inherits the quantum of the sending process, so the only penalty is the cost of two context switches.

> Interprocess communication by memory remapping instead of copying

This is true for Send, Lend, and MutableLend, but for Scalar or BlockingScalar you get 5xusize values instead, which are used for things like `msleep` or `uptime`.

You would have to stop access by other threads that might have access to the page about to be unmapped, but Rust guarantees that if you have a mutable reference, you're the only one with access to the page.
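A rough way to picture the five message kinds above is to map them onto ordinary Rust ownership. This is only an in-process sketch: the `Message` enum and `serve` function are hypothetical stand-ins, not the real `xous` crate types, and in the real system the loan crosses a process boundary and is enforced by remapping pages rather than by a borrow within one program.

```rust
/// Hypothetical stand-ins for the message kinds named above.
enum Message<'a> {
    Scalar([usize; 5]),         // one-way, plain values
    BlockingScalar([usize; 5]), // two-way, plain values
    Send(Vec<u8>),              // one-way, memory moves to the server
    Lend(&'a [u8]),             // two-way, read-only loan
    MutableLend(&'a mut [u8]),  // two-way, exclusive loan
}

impl Message<'_> {
    /// Two-way messages block the caller until the server responds.
    fn is_blocking(&self) -> bool {
        matches!(
            self,
            Message::BlockingScalar(_) | Message::Lend(_) | Message::MutableLend(_)
        )
    }
}

/// Toy server handler: returns a value only for two-way messages.
fn serve(msg: Message<'_>) -> Option<usize> {
    match msg {
        Message::Scalar(_) => None,
        Message::BlockingScalar(args) => Some(args.iter().sum()),
        Message::Send(buf) => {
            drop(buf); // the server now owns the memory
            None
        }
        Message::Lend(buf) => Some(buf.len()),
        Message::MutableLend(buf) => {
            buf.fill(0xAA); // the server may write during the loan
            Some(buf.len())
        }
    }
}

fn main() {
    let mut page = vec![0u8; 4096];
    assert!(Message::Lend(&page).is_blocking());
    assert_eq!(serve(Message::Lend(&page)), Some(4096));
    assert_eq!(serve(Message::MutableLend(&mut page)), Some(4096));
    assert_eq!(page[0], 0xAA);
    assert_eq!(serve(Message::Send(page)), None);
    // `page` has been moved; touching it here would not compile.
}
```

The `MutableLend` case is the one the last paragraph refers to: while the `&mut` loan is live, the borrow checker rejects any other use of `page` in the lender.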

> Xous Operating System

So it is a kernel and can run on "hardware". On which "hardware", is left as an exercise for the user.

I have mixed feelings about Rust, but the more I look into the Xous docs, the more interesting this becomes
Once in a while projects like pgrx, or xous come along that really beat the pack as far as Rust projects are concerned. Actually delivering novel capabilities
What problem is this solving? Are there no OSes for medium embedded systems? Are they too expensive?
Key aspects from the talk iirc (I was in the audience :)):

* Real time embedded CPUs are usually without an MMU -> kernels such as FreeRTOS lack secure memory due to the lack of MMUs in those CPUs

* A kernel targeting embedded CPUs with MMUs that supports secure memory management

* Secure memory communication: a so-called server/client method that leverages the Rust borrow checker at build time, so that "user-land processes" can later communicate via pages.

These things combined allow a very small kernel, with user-space implementation of usually kernel-level functionality, such as the system clock timer (presented in the talk).

All of this is meant to provide a complete trustworthy processing chain, from CPU dies that can be inspected through infrared microscopy through the CPU epoxy package/cover to the entire build/software tool chain.

The Xous OS project takes care of both the kernel and the CPU/RISC-V runtime with an MMU, something that is usually quite difficult to obtain - but due to synergy effects with another chip consumer/organization they managed to get their custom processor manufactured.

Trust and transparency: https://betrusted.io
The problem is: do you trust your hardware? If not, can you build, or buy, hardware that you can verify? So they built https://www.crowdsupply.com/sutajio-kosagi/precursor with an FPGA instead of a CPU from Intel or SpacemiT and are going up and down the chain to ensure that EVERYTHING can be inspected.
It's about providing the security benefits we get from MMUs (e.g. process isolation) to microcontrollers. There are no OSes for that space because basically no microcontrollers have MMUs. They had to make one for this OS.

I highly recommend watching the talk, it's very good!

There is QNX. seL4 is another.

The former is proprietary. The latter kernel is GPL2, similar to Linux.

QNX is not open source.

And seL4 is a kernel, not an OS. And it's pretty hard to work with, especially if you want any kind of dynamic system.

What did you mean by a dynamic system ?
  • pjmlp · 1 hour ago
One where processes, drivers and libraries come and go during the whole OS uptime.
(2022)
  • gjvc · 11 hours ago
not helpful
Oh, cool. An operating system.

> Every Xous Server contains a central loop that receives a Message, matches the Message Opcode, and runs the corresponding rust code

Rust? Only Rust?

An OS has no business dictating implementation language. Inside my isolated microservice, I should be able to run anything I damn well please.

Rust's own safety guarantees are a red herring for security at this level, BTW, because you can't trust them over an IPC or system call boundary. The other process can just lie to you about being safe.

I'm a fan of microkernels and microservice models in general, but not if they sacrifice one of their core advantages: arms-length decoupling of implementation strategies through having isolated services communicate only through stable, versioned interfaces.
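For readers who haven't seen the book, the server loop quoted at the top of this comment has roughly the following shape. This is a sketch using `std::sync::mpsc` channels as a stand-in for kernel IPC; the `Opcode` enum and `run_server` function are hypothetical names, not the Xous API.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical opcodes for a toy server.
enum Opcode {
    /// Add two numbers and send the result back on the reply channel.
    Add(u32, u32, mpsc::Sender<u32>),
    /// Stop the server loop.
    Quit,
}

/// Central loop: receive a message, match its opcode, run the handler.
/// Returns how many requests were handled.
fn run_server(rx: mpsc::Receiver<Opcode>) -> u32 {
    let mut handled = 0;
    while let Ok(msg) = rx.recv() {
        match msg {
            Opcode::Add(a, b, reply) => {
                // Ignore the error if the client has already gone away.
                let _ = reply.send(a + b);
                handled += 1;
            }
            Opcode::Quit => break,
        }
    }
    handled
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let server = thread::spawn(move || run_server(rx));

    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Opcode::Add(2, 3, reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), 5);

    tx.send(Opcode::Quit).unwrap();
    assert_eq!(server.join().unwrap(), 1);
}
```

Nothing in the loop itself depends on the client's language; what matters for the decoupling argument is how the opcode and its arguments are laid out on the wire.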

> A thread connects to a Server using a 128-bit ID. This ID may be textual if the server uses a well-known name that is exactly 16-bytes wide such as b"ticktimer-server" or b"xous-name-server", or it may be a random number generated by a TRNG.

What? This mechanism seems ripe for squatting attacks. How do I know I'm talking to the service I want to contact instead of somebody squatting the name? It uses the same namespace for randomly generated IDs (binary!) and for an ASCII name stuffed into the same bytes.

Better to give every object on the system its own unique unforgeable, unguessable ID and treat mapping from human-legible names to these strong IDs as its own service, one that can have namespace and authentication policies tailored to a given environment.
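To see what "an ASCII name stuffed into the same bytes" means in practice, here is a small sketch. It assumes a server ID is represented as 16 raw bytes; the `ServerId` alias and the `as_name` heuristic are invented for illustration and are not how Xous itself represents or classifies IDs.

```rust
/// Hypothetical representation: a 128-bit server ID as 16 raw bytes.
type ServerId = [u8; 16];

// These only compile because each literal is exactly 16 bytes long.
const TICKTIMER: ServerId = *b"ticktimer-server";
const NAME_SERVER: ServerId = *b"xous-name-server";

/// Returns the textual form if every byte is printable ASCII.
/// This is a heuristic assumed here: nothing in the ID itself says
/// whether it is a name or a random number.
fn as_name(id: &ServerId) -> Option<&str> {
    if id.iter().all(|b| b.is_ascii_graphic()) {
        std::str::from_utf8(id).ok()
    } else {
        None
    }
}

fn main() {
    // A name and a number are the same 128 bits, viewed differently.
    let as_number = u128::from_le_bytes(TICKTIMER);
    assert_eq!(as_number.to_le_bytes(), TICKTIMER);

    assert_eq!(as_name(&NAME_SERVER), Some("xous-name-server"));

    // Stand-in for a TRNG-generated ID (fixed bytes so the sketch is repeatable).
    let random: ServerId = [
        0x9f, 0x00, 0x13, 0xe2, 0x7a, 0xc4, 0x01, 0xff,
        0x55, 0x08, 0xb1, 0x6d, 0x90, 0x2e, 0xaa, 0x37,
    ];
    assert_eq!(as_name(&random), None);
}
```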

> since sending pages of memory is extremely cheap.

Depending on architecture, doing virtual address tricks ranges from expensive to exorbitant. Real-world systems doing bulk transfers over shared memory either rotate among pre-mapped buffers (Wayland, Surface/BufferQueue) or just have the kernel do one efficient scatter/gather memcpy into address space controlled by the recipient (Binder).

I'm not excited by this "lend" IPC primitive Xous has. Seems like more trouble than it's worth. You can add a queue of pre-mapped buffers on top as a separate service if you need it.

> Processes can allocate interrupts by calling the ClaimInterrupt call.

Good! It's about time more people write drivers as regular programs that treat IRQs like any other input event and less as magical things that for some ghastly reason must run with ultimate privileges just to do a DMA once in a while.

That said, just as a matter of elegance, I'd treat an interrupt literally like a regular input source and make it a device node on the FS, not some special kind of resource managed with its own system call.

In Linux terms, I should be able to open /dev/irq/5 and expect it to work like an eventfd. Isn't that elegant?

> ...memory will not be backed by a real page, and will only be allocated by the kernel once you access the page

Ugh. Contractual overcommit. Linux does overcommit too. It's an unfixable mistake. I'm disappointed to see a greenfield OS adopt the same strategy. Doubly so for an embedded system that might want precise control over commit charge.

See, in more mature virtual memory setups, we distinguish between reserving address space (which you do with mmap and such) and reserving allocated capacity (which we call "commit"). If you turn overcommit off on Linux (or use Windows at all) you get an elegant model where you can mmap(..., PROT_NONE) and not have your process "billed" for the memory in your allocated region -- but once you mprotect(..., PROT_WRITE), you get "charged" for that memory because after the mprotect returns, you're contractually permitted to write to that memory with the expectation you don't segfault or get some kind of "Oops. Just kidding. Don't have the memory after all!" signal.

> IncreaseHeap(usize, MemoryFlags) will increase a program's heap by the given amount

What?? No! sbrk() is a terrible interface. Why get locked into having one region of address space called "the heap"? Modern systems (OpenBSD does especially well here among POSIX systems) don't have a "heap" like a damn PDP-11. Instead, malloc allocates out of memory pools that it internally manages using general purpose mmap. The set of anonymous memory regions so managed is what constitutes the heap. No magic. Kernel doesn't even need to know what a heap is. It speaks only the sweet, soothing language of mmap.

> There are different memory regions in virtual

Wait. There are two dozen hard coded virtual addresses that form an ABI? There goes ASLR. What is this, MS-DOS? Should I load an XMS driver? Maybe shadow video RAM?

> The kernel supports enabling the gdb-stub feature which will provide a gdb-compatible server on the 3rd serial port

Good. Maybe they have build IDs and a symbol server too?

> The loader uses a miniature version of the ELF file format.

Good, but...

> A problem with the ELF format is that it contains a lot of overhead

Bad call. I'm not a big fan of ELF (e.g. relative to PE) but it's not that bad and any conceivable savings in things like dynamic section segment descriptions isn't going to be worth a lifetime of compatibility headaches.

Just use standard ELF. It was designed for computers shittier than the ones in disposable vapes today.

> ELF supports multiple flags. For example, it is possible to mark a section as Executable, Read-Only, or Read-Write.

Yes, but...

> Unfortunately these flags don't work well in practice, and issues can arise from various permissions problems.

Eh, they work fine. (Plus, the section flags aren't relevant. Dead metadata. You don't need a section table at all, technically. An ELF loader cares about the segment table, and segments often span more than one section.)

> The Xous build system uses the xtask concept

You want Yocto. You'll lose a huge chunk of your audience once they learn your OS doesn't build with Yocto. Is it fair? No. Yocto sucks. But it's what the embedded world uses, and if you're already asking them to make a leap of faith using your new OS, you don't ask them to wear a blindfold and learn a new build system at the same time.

("Isn't Yocto for Linux?" You can use Yocto to build whatever OS you want if you don't use Poky, the default Linux distribution the Yocto build system produces. Mechanism vs. policy separation.)

> Push notifications are used when we want to be alerted of a truly unpredictable, asynchronous event that can happen at any time.

IMHO, the more useful distinction is between two-way and one-way messages. For the latter, you don't expect a response, but otherwise every single part of the protocol stack is the same. I wouldn't have made a separate "push notification" facility.

> The Plausibly Deniable DataBase (PDDB) is Xous' filesystem ...features "plausible deniability", which aims to make it difficult to prove "beyond a reasonable doubt" that additional secrets exist on the disk

I'm super happy to see a feature like this integrated into an OS.

> The core of a Mutex is a single AtomicUsize. This value is 0 when the Mutex is unlocked, and nonzero when it is locked.

No priority inheritance? Shame. I'm a huge fan of robust PI in small systems as a way to bound critical operation latencies dynamically.
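For context, a lock word of the kind quoted above can be sketched as follows. This is a minimal illustration assuming only what the quote says (one `AtomicUsize`, 0 when unlocked, nonzero when locked); `RawLock` is a made-up name, and the contended path, where a real implementation would block or yield instead of just failing, is left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical lock word: 0 = unlocked, nonzero = locked.
struct RawLock {
    state: AtomicUsize,
}

impl RawLock {
    const fn new() -> Self {
        RawLock { state: AtomicUsize::new(0) }
    }

    /// Tries to flip the word from 0 to 1; returns true on success.
    fn try_lock(&self) -> bool {
        self.state
            .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Releases the lock by storing 0.
    fn unlock(&self) {
        self.state.store(0, Ordering::Release);
    }
}

fn main() {
    let lock = RawLock::new();
    assert!(lock.try_lock());  // first taker wins
    assert!(!lock.try_lock()); // second attempt sees a nonzero word
    lock.unlock();
    assert!(lock.try_lock());  // free again after unlock
}
```

Note that a word which only records "locked or not" carries no owner identity, which is the information a priority-inheritance scheme would need in order to boost the holder.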

  • xobs · 5 hours ago
> Rust? Only Rust?

Yes, this OS is written in Rust. However, since it has a well-defined ABI, and all services are defined to use `#[repr(C)]`, and the interface is simple primitive enums, it's designed with C-like language support in mind. The hardest part in C is getting an equivalent to `#[repr(C, align(4096))]` which, last time I checked, only let you do alignments up to 64 or so without resorting to linker tricks.
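On the Rust side, the page-aligned layout mentioned above looks like this. The `Page` struct is a hypothetical example type, not one taken from the Xous sources; the point is only that `#[repr(C, align(4096))]` gives a C-compatible field layout with page alignment.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical page-sized, page-aligned buffer with a C-compatible layout.
#[repr(C, align(4096))]
struct Page {
    bytes: [u8; 4096],
}

fn main() {
    // The type's alignment and size are both one 4 KiB page.
    assert_eq!(align_of::<Page>(), 4096);
    assert_eq!(size_of::<Page>(), 4096);

    // A heap-allocated instance starts on a page boundary.
    let page = Box::new(Page { bytes: [0; 4096] });
    let addr = &*page as *const Page as usize;
    assert_eq!(addr % 4096, 0);
    assert_eq!(page.bytes.len(), 4096);
}
```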

> This mechanism seems ripe for squatting attacks

There are only a few services with well-known names, and they start up before things like the scheduler are running. Most things go through the nameserver service which supports things like attestation, finite-client limits, and signature checking.

> Just use standard ELF.

Sure, there's a loader available that lets you run standard ELF files: https://github.com/betrusted-io/xous-core/tree/main/apps/app...

The bootloader uses the MiniELF format because we can make assertions about things like the order of sections and about merging multiple segments, while also stripping non-loadable sections. It would be possible to just bundle all the ELF images for all programs together, but if you're generating the loader image you might as well shrink the image a bit.

> What?? No! sbrk() is a terrible interface.

Then you can call `MapMemory(NULL, NULL, [size], RWX)`: https://docs.rs/xous/latest/xous/syscall/fn.map_memory.html

Yocto? Nobody is expecting Yocto for deeply embedded systems likely to be built on this OS. It’s closer to FreeRTOS, Zephyr, Embassy, but with these additional hardware-level safety guarantees.
My Yocto point might be colored by my experience of having seen multiple teams stick with Yocto beyond (IMHO) the point of pain and reason due to familiarity and industry inertia.
> An OS has no business dictating implementation language.

As opposed to all those OSes that only publish headers in one language, right, and require everyone to go through heroic effort to interoperate with them?

> This mechanism seems ripe for squatting attacks. How do I know I'm talking to the service I want to contact instead of somebody squatting the name?

Nobody is going to stop you from typosquatting yourself, no

> As opposed to all those OSes that only publish headers in one language right that require everyone to go through heroic effort to interoperate with it?

Everyone can speak C ABI. It's just a matter of making memory and registers spell out the right things.

Only Rust can speak to a Rust API that needs monomorphization before it can be used. I'd have the same objection to an OS project based on a C++ API, BTW. (Android gets this wrong for some of their HALs, sadly.)

> Nobody is going to stop you from typosquatting yourself, no

True, but if it's about the same amount of work either way, you might as well design the system such that it could conceivably run services at different trust levels.

And if you don't have different trust levels? Why not just stick everything in one address space, like Midori?

I can see a case for something simpler and more streamlined than your typical Unix system, but I imagine that thing being something more like Zephyr or FreeRTOS and not requiring an MMU. This project does require an MMU and virtual memory.

I appreciate the effort that's gone into Xous, but I'm not sure who it's for except Rust fans who really like greenfield projects. It's just as alien as seL4 but without the mathematical correctness assurances.

I think it'd be cool to make something like this a middleware layer atop seL4 that helps make building systems with it practical.