`dmesg` shows the connections it blocks (I think this is maybe the audit feature). I used an example sandboxer.c as a base (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...) except I just set mine up to not touch file restricting, just networking so that it has that one whitelisted incoming port.
./network-sandboxer-tool 8000 some-program arg1 arg2 etc.
I like it because it just works as an unprivileged usermode program without setting anything up. A tiny C program. It works inside containers without having to set up any firewalls. Aside from having to compile a small C program, there is little fuss. I found the whole Landlock thing trying to find out alternatives to bubblewrap because I couldn't figure out how to do the same thing in bwrap conveniently.The "unprivileged" in "Landlock: unprivileged access control" for me was the selling point for this use case.
I don't consider this effective against actively adversarial programs though.
- https://github.com/containers/bubblewrap
- https://codeberg.org/dannyfritz/dotfiles/src/commit/38343008...
bwrap --unshare-pid --dev-bind / / --tmpfs /home --bind "$(pwd)" "$(pwd)" bash
it seems to work fairly well? But I just started playing with bwrap this weekend. I do wish bwrap could be told "put the program in this pre-prepared network namespace" because accessing unsecured local dev servers could also be an issue.Bwrap seems like a better option.
bwrap seems a lot easier but if you want more control (or, for instance, want to run a Ubuntu basis because that's what a lot of games are compiled against), systemd-nspawn can be quite powerful.
Even the example code builds a somewhat questionable 'sandbox' that hits a problem discussed in those threads. Say we're ok with an app having r-w access to home except for a couple of places such as ~/.ssh. Now you could try to add a rule to exclude access to ~/.ssh, but the security object must exist when the policy is being established (the rules refer to directories by fds). As such, no .ssh directory, means not rules denying access. You start a sandboxed app thinking you've set up a tight sandbox, at some point ~/.ssh gets created, and now the untrusted app can read your ssh keys.
That will inevitably lag behind what the kernel supports, but more importantly I don't foresee many container image packagers, Helm recipe maintainers and other YAML wranglers getting into the business of maintaining a Landlock sandbox policy.
It makes sense for an application to use Landlock directly to sandbox some parser or other sensitive component. But if the CRI just blocks the syscalls by default, no infra person is going to take on the maintainance of their own sandbox policy for every app. The app will just see ENOSYS and not be sandboxed.
I might be missing the whole idea here, but I really don't see why we need some custom layer in the middle instead of having container runtimes let the security syscalls through?
Are you referring to [0, 1]?
> But if the CRI just blocks the syscalls by default
Does it? Where are you getting this from?
> I might be missing the whole idea here, but I really don't see why we need some custom layer in the middle instead of having container runtimes let the security syscalls through?
Because in the latter case you have to trust the application it will actually do the appropriate locking?
[0]: https://github.com/opencontainers/runc/issues/2859
[1]: https://github.com/opencontainers/runtime-spec/issues/1110
> A official c library doesn’t exist yet unfortunately, but there’s several out there you can try.
> Landlock is a Linux Security Module (LSM) available since Linux 5.13
Since when is not a C API the first and foremost interface for developers when it comes to Linux kernel stuff?
> A official c library doesn’t exist yet unfortunately, but there’s several out there you can try.
Also, it looks like there is more than zero support for C programs calling Landlock APIs. Even without a 3rd-party library you're not just calling syscall() with a magic number:
https://docs.kernel.org/userspace-api/landlock.html
https://github.com/torvalds/linux/blob/6bda50f4/include/uapi...
https://github.com/torvalds/linux/blob/6bda50f4/include/linu...
For C, unofficial support apparently sufficed for now.
"This is the io_uring library, liburing. liburing provides helpers to setup and teardown io_uring instances, and also a simplified interface for applications that don't need (or want) to deal with the full kernel side implementation."
Meaning glibc or musl or your favorite C library probably doesn't have this yet, but since the system calls are well defined you can use A C library (create your own header file using the _syscallN macro for example).
The Linux kernel has a relatively simple example on how to use the syscall even if you can't find a library to deal with it for you: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
There is no "liblandlock" or whatever, though there totally could be. The only reason Rust, Go, and Haskell have an easy-to-use API for this syscall is because someone bothered to implement a wrapper and publish it to the usual package managers. Whatever procedure distros use to add new libraries could just as easily be used to push a landlock library/header to use it in C.
Since the kernel developers don't make userland software?
https://github.com/torvalds/linux/blob/master/include/uapi/l...
you can add simple one-line wrappers if you don't like using syscall() function.
Landlock requires you to commit upfront to what is "deny-default"ed but they only added a control for TCP socket bind and nothing else. So you can "default-deny" tcp bind but all the other socket paths in the kernel are not guarded by landlock. It tries really hard to have the commit of features be an integral part of the landlock API so that you can have an application able to run on multiple kernel versions that support different parts of the landlock spec. But that means that as they develop the API the older versions of landlock need to be less restrictive than newer versions otherwise programs dont work across kernel versions.
That way, a program that is very restrictive on say kernel 6.30 can also run on kernel 6.1 with less restrictions. The program keeps functioning the same way (never break userspace). The only way to do that is to have the developer tell what parts need to be restricted explicitly and you can't restrict what isn't implemented yet.
They're planning to extend it to all socket types. This is also mentioned in the linked article https://github.com/landlock-lsm/linux/issues/6
I guess if you want to run without networking at all today you can just unshare into a fresh network namespace, or maybe use seccomp strict mode
There is an ongoing patch series for udp and another one for general socket control.
You can read about it on the linux-security-module mailing list.
Basically UDP is harder to hook into because it's a connectionless protocol. So bind and connect don't really work the same way.
https://lore.kernel.org/all/20241214184540.3835222-1-matthie...
https://lore.kernel.org/linux-security-module/20251118134639...
And raw sockets require elevated privileges anyway iirc.
That's like relying on criminals to cuff themselves when they have committed a crime.
Can sysadmins disable access to Landlock syscalls via seccomp? Not that I can see why they'd want to, just wondering how this is layered.
I suppose the problem might be if the system has been set up at some earlier point in time to whitelist a set of syscalls for a process and, as Landlock is newer, its syscall numbers won't be included. A program that has been updated to use Landlock while a seccomp policy that predates Landlock is applied would presumably be terminated with SIGSYS due to this?
How can a program determine if Landlock is present without just trying the syscalls and seeing if they work?
The questions you have about seccomp depend on the rules. Well-written filters would return -ENOSYS in that case, so it would look to the program as though the syscall is unsupported.
I also don't think doing so is extraordinarily useful.
If you allow something in landlock, it's still subject to traditional DAC and other restrictions because its a stackable LSM. It can only restrict existing access, not allow new accesses.
My questions are:
- How does this help with malware? I want to craft an environment where any program trying to read f.ex. anything inside ~/.ssh is automatically denied. I don't want a malicious build script to exfiltrate all my sensitive data!
- It seems that this software is well-positioned for us to write application launchers with, is that true? If so, well, I like the idea but it seems too manual.
Maybe I'm looking at the wrong thing. I strongly prefer deny-by-default in an invisible manner i.e. my system to refuse most requests to access this or that. Not opting in to it. Bad actors will not graciously limit their own program with Landlock. They'll try to get anything before I can even blink my eyes.
I feel I'm missing crucially important context. Can somebody help?
It is not a suitable technology for sandboxing a program that wasn't designed to be sandboxed in this way. For that, you need one of the other technologies listed in the article.
That requires a MAC security model like apparmor [0] or selinux [1]. Those can deny filesystem access based on process environment data, such as the executable path or its security context. But these require the access rules to be enumerated externally, whereas landlock is about an application voluntarily limiting itself -- or limiting its children: e.g. it would be a very good idea for npm to restrict the scope of package post-install scripts to only the npm cache/build tree.
It seems that this software is well-positioned for us to write application launchers
Like OpenBSD's pledge[2], this API is primarily meant for application writers, not launchers. But where the Openbsd base system is maintained as a whole by the same group of people, Linux is a hodgepodge of different distributions using various software to construct a complete system. This means it's going to take a long while before landlock will reach anywhere close to the same coverage that pledge already has in OpenBSD. In the meantime, wrappers/launchers is the best that you can do on Linux.
[0] https://en.opensuse.org/SDB:AppArmor_geeks#Anatomy_of_a_prof...
[1] https://manpages.opensuse.org/Tumbleweed/selinux-policy-doc/...
Your package manager would specify a policy that only allows specific access by build scripts. Or you'd use a wrapper.
> - It seems that this software is well-positioned for us to write application launchers with, is that true? If so, well, I like the idea but it seems too manual.
It could be. It's for anyone who knows what their program does, basically.
It is only useful for guarding your own process against someone using malicious inputs to get your process to do something you don’t intend. It is not a guard against programs written by malicious actors (malware), there exist other mechanisms to guard against malware.
https://man7.org/linux/man-pages/man7/landlock.7.html
But i suppose i am missing somehting then people would like...
What would you want an library to do here? abstract over it to make it easier? (relatively simple api already)
legit question, not trying to poke anyone here.. trying to find out what ppl expect from libraries which wrap around these syscalls or stdlib things.
I'm kind of surprised glibc doesn't provide a normal interface yet, but I suppose it has to do with non-Linux compatibility?
Thankfully we have had unified syscall numbers on Linux (for almost all architectures) for the past few years so tracking them is less painful than it used it be.
But also: even within a container (which isn't itself a sandbox) or a VM, you still have concentric circles of trust and/or privilege. If you're installing arbitrary dependencies from the Internet, for example, you probably want a basic initial defense of preventing those dependencies from exfiltrating your secrets at build time.
Another benefit is that it makes it easier for fine-grain control of resources in the application lifecycle. Maybe on initialization the app needs credentials to fetch some data and later on the all doesnt need them. Landlock allows the app to remove its own access to those credentials.
Linux holds on to a negligible share of the overall desktop market OS, but it is marginally more popular among tech savvy people, which have plenty of disposable income, meaning the platform has steadily growing interest for malware authors and distributors despite its relatively low usage.
VM's can be security wrappers, but if you expose all of $HOME to a VM, then there really isn't much security happening, in terms of your data.
This lets developers of applications harden themselves, it doesn't require the end-user to do anything(like put it in a VM).
In a container, provided the container software doesn't do it for you(which is likely true), you just have to break 1 kernel.
The technologies that enabled containerization (namespaces, chroot, and cgroups, and their predecessors on BSD/Solaris) were created specifically for security and resource isolation.
The people who came up with "containers" as we know them today found a clever hack: combining those security-oriented tools with a filesystem-in-a-box and packaging system allowed people to package entire OS userlands and run them pretty deterministically in multiple places. The security isolation properties of namespaces/cgroups/chroot also happened to provide increased determinism.
And I'm not criticizing that; containers are a very clever hack that solved a problem a lot of people have. I use them every day.
That said, the fact that containers became so ubiquitous in the first place speaks a completely self-induced problem that we didn't need to have in the software engineering community. That problem is, unfortunately, human/incentive-related in nature, so containers are probably the best we're going to get--problem is, they're not that good.
I complained about the root problems here awhile ago, easier to link than rehash that here: https://news.ycombinator.com/item?id=44069483
Drew deVault also explained it much more thoroughly and better than I could: https://drewdevault.com/2021/09/27/Let-distros-do-their-job....
It could even be paired with a chroot to make a container runtime. It's more like a building block for process restrictions
Further, while "whole process" sandboxing like containerizing is very effective under some conditions, having more fine grained access and the ability to reduce permissions over time is incredible.
Consider that I may need to open a file in my program. The file path will be provided by an env var `CONFIG_PATH`. My program now has to have total file system read permissions if it is going to support reading arbitrary configuration file paths, even though it only has to read one file.
I can instead set my program up to read that file one time and then never again, or I can set things up to only ever need to read that single file and no others, etc. I can incrementally reduce permissions, and that's really cool. You can't do that with a container - containers get what they get.
Defense in depth. Lock your valuables inside a safe, inside of your locked house. Why lock them in a safe when your house is already locked? Because if someone breaks into your house, you want additional defense "just in case". So just in case I wrote some shitty code and my server got hacked, lock the valuables in a safe anyway so that thief can't steal the expensive silverware (prod credentials).
Which means in the real world, the likelihood of that stuff being on and secure is fairly low, but not zero.
With landlock, pledge/unveil and similar tech, the developers of the software write and configure it, it's on by default and probably can't be turned off(or at least not easily).
This is massive since most ways of dropping privileges on Linux require already having significant permissions (ie: root).
I'd love to see a comparison of landlock to restricted containers.
Fun fact: because landlock is unprivleged, you can even use it inside containers; or to build an unprivileged container runtime :)
One thing to consider is that containers virtualize. You enter new "namespaces" where you aren't necessarily restricted within that namespace, but the namespace as a whole is sort of your own playground. So a PID namespace only allows you to see other processes within that namespace.
This is very distinct from a resource oriented approach like landlock. Landlock may allow you to say "you can do certain actions to certain processes" but you wouldn't get the same semantics as "I can only see specific processes to begin with". They would layer nicely.
Similarly, containers provide virtualized file systems. A write happens in a container and it's allowed, but the write is isolated from the host. Landlock would instead allow or deny that write.
They go very well together.
So the restrictions are permanent for the life of the program. Even root can't undo them.
Deno is a JS runtime that often runs, at my behest, code that I did not myself write and haven't vetted. At run time, I can invoke Deno with --allow-read=$PWD and know that Deno will prevent all that untrusted JS from reading any files outside the current directory.
If Deno itself is compromised then yeah, that won't work. But that's a smaller attack surface than all my NPM packages.
Just one example of how something like this helps in practise.
The API doesn't allow un-restriction, only restriction. Since one typically applies restrictions at program start, they will be applied before an attacker gains remote-execution, and the attacker is then limited in what they can do...
I did not know of this and I am looking for simple ways to isolate processes for multiple reasons. I am building a coding agent, https://github.com/brainless/nocodo, that runs (headless) on a Linux instance. Generated code is immediately available for demo.
I am new to isolation and not looking for a container based approach. Isolation from a security standpoint but I do not know enough. This approach looks like a great start for me.
https://github.com/Zouuup/landrun
All you gotta do is apply a policy and do a fork() exec(). There is also support in firejail.
Also, it's very easy to write your own LandLock policy in the programming language of your choice and wrap whatever program you like rather than downloading stuff from Github. Here's another example in Go:
package main
import (
"fmt"
"github.com/landlock-lsm/go-landlock/landlock"
"log"
"os"
"os/exec"
)
func main() {
// Define the LandLock policy
err := landlock.V1.RestrictPaths(...)
// Execute FireFox
cmd := exec.Command("/usr/bin/firefox")
}It allows running non/semi-trusted workloads with isolation. Pretty useful to onboard applications into a proper scheduler with all bells and whistles without having to containerise, but still with decent levels of isolation between them.
package main
import (
"flag"
"fmt"
"github.com/landlock-lsm/go-landlock/landlock"
"io/ioutil"
"log"
"os"
)
// simple program that demonstrates how landlock works in Go on Linux systems.
// Requires 5.13 or newer kernel and .config should look something like this:
// CONFIG_SECURITY_LANDLOCK=y
// CONFIG_LSM="landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo"
func main() {
var help = flag.Bool("help", false, "landlock-example -f /path/to/file.txt")
var file = flag.String("f", "", "the file path to read")
flag.Parse()
if *help || len(os.Args) == 1 {
flag.PrintDefaults()
return
}
// allow the program to read files in /home/user/tmp
err := landlock.V1.RestrictPaths(landlock.RODirs("/home/user/tmp"))
if err != nil {
log.Fatal(err)
}
// attempt to read a file
bytes, err := ioutil.ReadFile(*file)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(bytes))
}It's becoming increasingly usable as a wrapper for untrusted applications as well.