Context Brainstorming
Posted on
This is a blog post brainstorming about contexts.
I'll use the term contexts here, as tmandry is leaning toward, since it seems to make sense to keep capabilities a distinct concept. Idiomatic capability-based code and the Principle of Least Authority prefer fine-grained access to resources, which contexts don't seem like a good fit for. So let's keep these concepts distinct for now.
yoshuawuyts showed me there is a way we might use something like contexts to retrofit an awareness of ambient authority into Rust. Here's an attempt to sketch up more of what that might look like.
Automatic contexts
Let's extend the contexts proposal with a concept of automatic contexts, that
functions would implement by default. Just like how Rust has automatic trait impls.
Like automatic trait impls, you can opt out, with negative with-declarations, using ! syntax.
And let's introduce the concept of supercontexts, which are contexts that imply other contexts. Much like supertraits in Rust. This isn't strictly necessary, but it helps with granularity.
With those, and the observation that contexts are a way of coloring functions, let's introduce some hypothetical automatic contexts:
- global_allocator, the ability to use the Rust global allocator.
- ambient_authority. Similar to this AmbientAuthority, but as a context, so it can be more. This would be a supercontext which includes:
  - fs - the current process' filesystem namespace
  - net - the current process' network namespace
  - time - the current process' time namespace. Preventing code from observing time entirely is hard, especially if there can be multiple threads, so maybe this time context would just be about the explicit time APIs rather than blocking all potential time sources.
  - stdio - access to the ambient stdin, stdout, and stderr
  - process - the ability to spawn arbitrary child processes
  - mutable_static - write to or read from statically-allocated mutable and interior-mutable state in the process. There are use cases where statically-allocated state is useful, but since we have contexts here, for maximal modularity, these cases should ideally use contexts instead of implicitly associating state with the whole process.
  - and others. OS's attach a lot of miscellaneous authorities to processes. Ideally we'd make sure we have everything covered.
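Since supercontexts are described as being much like supertraits, here is a minimal sketch of that analogy in today's Rust. Everything in it is hypothetical: the traits Fs, Net and AmbientAuthority, the two marker types, and the functions are made-up names for illustration, and real contexts would presumably not be passed as ordinary arguments.

```rust
/// Hypothetical capability traits, one per fine-grained context.
pub trait Fs {
    fn fs_name(&self) -> &'static str {
        "fs"
    }
}

pub trait Net {
    fn net_name(&self) -> &'static str {
        "net"
    }
}

/// The "supercontext": anything that grants AmbientAuthority also grants Fs and Net.
pub trait AmbientAuthority: Fs + Net {}

/// A value standing in for the full authority of the process.
pub struct ProcessAuthority;
impl Fs for ProcessAuthority {}
impl Net for ProcessAuthority {}
impl AmbientAuthority for ProcessAuthority {}

/// A value standing in for filesystem access only.
pub struct FsOnly;
impl Fs for FsOnly {}

/// Asks only for the fine-grained context it needs.
pub fn needs_fs(cx: &impl Fs) -> &'static str {
    cx.fs_name()
}

/// Asks for the whole supercontext, and can use every part of it.
pub fn needs_everything(cx: &impl AmbientAuthority) -> (&'static str, &'static str) {
    (cx.fs_name(), cx.net_name())
}
```

A ProcessAuthority can be passed to needs_fs because the supertrait bound implies Fs, while an FsOnly value is rejected by needs_everything at compile time. That is the granularity benefit the supercontext idea is after.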
These being automatic is kind of a way to retroactively reinterpret existing Rust code. All code now defaults to having these contexts, and we can then opt out of them, like this:
fn useless()
with
!ambient_authority
{
}
Here, we can know immediately that this useless is a useless function just by looking at its signature. It has no return value, no arguments, no ambient authority. All it could do is return, panic, or infloop.
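The with syntax is hypothetical, but the idea of reading authority off a signature can be roughly emulated today by passing an explicit token. The sketch below assumes a made-up AmbientAuthority marker type and made-up functions read_len and checksum. It works only by convention, since nothing stops checksum from calling std::fs directly, which is exactly the gap automatic contexts would close.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical marker standing in for the ambient_authority context.
/// Holding a reference to it represents permission to use process-wide resources.
pub struct AmbientAuthority(());

impl AmbientAuthority {
    /// The one place where authority is conjured up, ideally only near main.
    pub fn assume() -> Self {
        AmbientAuthority(())
    }
}

/// Needs ambient authority: the path is resolved in the process' filesystem namespace.
pub fn read_len(_auth: &AmbientAuthority, path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}

/// Roughly "with !ambient_authority": no token in the signature, so by convention
/// the result can only depend on the bytes that were passed in.
pub fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(u32::from(*b)))
}
```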
Panic could unwind, and it'd be nice to add a context for that too:
- unwind - the ability to unwind the stack
Then !unwind could be used for functions that can't unwind. Maybe this could even be connected to LLVM's nounwind. Anyway, with !unwind, we could write code like this:
fn totally_pure(a: &A) -> B
with
!ambient_authority +
!unwind
{
// lots of interesting stuff
}
I think someone told me once that the Rust compiler can know whether types have interior mutability. Let's assume it can, and that this includes types that hold I/O handles. In theory, if A here has no interior mutability, this should allow Rust to annotate functions like this with optimizer attributes like LLVM's readonly, meaning calls to it could be redundant-code-eliminated.
Beyond just LLVM though, this could enable MIR-level redundant-code elimination of calls, even pre-monomorphization. No need to do complex alias analysis or escape analysis, because the type system just tells you what you need to know up front!
But it wouldn't get dead-code elimination, because of the possibility of inflooping. More on that later.
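To make the redundant-call idea concrete, here is the transformation written out by hand. This is only an illustration of what an optimizer would be permitted to do under the assumptions above. The functions totally_pure, before and after are hypothetical stand-ins, and today's compiler gets no such guarantee from the signature alone.

```rust
/// Stand-in for a "with !ambient_authority + !unwind" function: its result depends
/// only on the slice contents, and [u32] has no interior mutability.
pub fn totally_pure(a: &[u32]) -> u32 {
    a.iter().fold(0u32, |acc, x| acc.wrapping_add(*x))
}

/// What the programmer wrote: two calls with the same argument.
pub fn before(a: &[u32]) -> u32 {
    totally_pure(a).wrapping_add(totally_pure(a))
}

/// What redundant-code elimination could produce: the second call is reused.
pub fn after(a: &[u32]) -> u32 {
    let once = totally_pure(a);
    once.wrapping_add(once)
}
```

Removing an unused call entirely is a different matter: if totally_pure could loop forever, deleting the call would change whether the program terminates.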
Pure, except where indicated otherwise
By the way, if one of the arguments has a type that does have an I/O handle, including a filesystem handle, then the function can still do I/O through that handle. The fs context is about the process' filesystem namespace. So with !fs, you can't do File::open, but you can use a Dir you've been given as an argument to do Dir::open, because it's resolved relative to a directory you have an explicit handle to, rather than the process' filesystem namespace.
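As a rough illustration of the difference, here is a toy directory handle built on std alone. This Dir is a hypothetical stand-in written for this post, not the real Dir from cap-std. It only rejects absolute paths and .. components, and it does nothing about symlinks, which a real implementation needs OS support to handle.

```rust
use std::fs::File;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Toy stand-in for a directory capability (hypothetical, not cap-std's Dir).
pub struct Dir {
    root: PathBuf,
}

impl Dir {
    /// Turning an ambient path into a Dir is the step that would need the fs context.
    pub fn from_ambient_path(root: PathBuf) -> Self {
        Dir { root }
    }

    /// Opens a path relative to this directory. Absolute paths and ".." are refused,
    /// so the caller cannot name anything outside the directory it was handed.
    /// Symlinks are not handled in this sketch.
    pub fn open(&self, rel: impl AsRef<Path>) -> io::Result<File> {
        let rel = rel.as_ref();
        let stays_inside = rel
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
        if !stays_inside {
            return Err(io::Error::new(
                io::ErrorKind::PermissionDenied,
                "path escapes the directory",
            ));
        }
        File::open(self.root.join(rel))
    }
}

/// Roughly "with !fs": all filesystem access goes through the Dir argument.
pub fn file_len(dir: &Dir, name: &str) -> io::Result<u64> {
    Ok(dir.open(name)?.metadata()?.len())
}
```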
Similarly, passing a &mut reference into a function marked this way requires no special ceremony. Unlike "pure" keywords in languages where purity is all or nothing, the rule here is, if the signature has a &mut, the callee can access it as a &mut, including mutating it:
fn pure_except_as_obvious(a: &A, m: &mut M, f: &File) -> B
with
!ambient_authority +
!unwind
{
// lots of interesting stuff, including mutating `*m` and writing to `*f`.
}
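A version of this that compiles today, minus the with clause, might look like the sketch below. It assumes a generic writer W in place of &File so that it can be exercised against an in-memory buffer, and the body is invented for illustration.

```rust
use std::io::{self, Write};

/// Sums the input, bumps the caller's counter, and logs to the handle it was given.
/// It never touches stdout, the filesystem namespace, or any static, so every
/// effect it has is visible in the signature.
pub fn pure_except_as_obvious<W: Write>(a: &[u32], m: &mut u64, f: &mut W) -> io::Result<u32> {
    let sum: u32 = a.iter().sum();
    // Mutation through the &mut argument.
    *m += 1;
    // I/O through the explicitly passed handle.
    writeln!(f, "sum = {}", sum)?;
    Ok(sum)
}
```

Calling it with a Vec<u8> as the writer captures the output in memory, and passing a file handle instead writes to that file and nothing else.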
See First-class I/O for more discussion of this.
Security
It may be surprising that I haven't talked about security in this post yet. It turns out that capability-based security really is just a special case of a deeper capability-based design philosophy. It's similar to how Rust's borrow checker is, on its face, a memory-management strategy, but also much deeper, with things to say about such seemingly unrelated areas as thread safety, pointer aliasing, iterator invalidation, and refactoring. There's a lot going on here.
It also turns out that security for untrusted or compromised-supply-chain code is complex. For example, if we want to completely sandbox a piece of Rust code with language mechanisms, we need to make sure it can't use unsafe blocks, since unsafe Rust could trivially escape any sandbox. An exploit is happy to rely on UB as long as it works with high enough probability in practice.
Getting closer: unsafe
This post is all about contexts though, so let's see if we can use them to fix that problem too:
- new_unsafe - an automatic context representing the ability to introduce new unsafe contexts. This corresponds to the ability to write unsafe { ... }.
- unsafe - a retroactive reinterpretation of what unsafe fn desugars to. Includes new_unsafe as a subcontext, or doesn't, depending on how unsafe blocks in unsafe functions goes.
As an aside, contexts would also be a path for libraries to define unsafe-like concepts for their own invariants, which is something I occasionally see people asking for in Rust.
With new_unsafe, we could write:
fn untrusted_code(x: &X, y: &mut Y) -> Z
with
!ambient_authority +
!new_unsafe
{
// untrusted code here?
}
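The closest thing in today's Rust is probably the unsafe_code lint, which can be forbidden for a single module or function as well as for a whole crate. The sketch below is a coarse approximation, not a sandbox: it is a lint rather than part of the signature, and the module name and function body are invented for illustration.

```rust
/// Everything inside this module is rejected at compile time if it contains
/// an unsafe block, an unsafe fn, or an unsafe impl.
#[forbid(unsafe_code)]
pub mod untrusted {
    /// Appends the input in reverse order and reports the new length.
    pub fn untrusted_code(x: &[u8], y: &mut Vec<u8>) -> usize {
        y.extend(x.iter().rev().copied());
        y.len()
    }
}
```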
Would this be a secure sandbox? Not yet; one problem is that even if we know X has no interior mutability or I/O handles, this code still exposes the address of x or y to untrusted code, because converting a reference to a raw pointer doesn't require unsafe in Rust. The address might tell an attacker something about the ASLR in the process, which might make other attacks more powerful.
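The leak is easy to demonstrate in entirely safe code. The helper name leak_address below is made up for illustration.

```rust
/// Safe Rust: no unsafe block is needed to turn a reference into a numeric address.
/// Only dereferencing a raw pointer requires unsafe, not creating one.
pub fn leak_address<T>(x: &T) -> usize {
    x as *const T as usize
}
```

Formatting a reference with the {:p} specifier reveals the same information, so a context that hides addresses would need to cover that path as well.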
Still getting closer: raw pointers
When all you're doing is writing a blog post about contexts, everything looks like a problem to be solved by adding a new automatic context.
- raw_pointers - the ability to convert references into raw pointers.
In addition to solving this ASLR problem, this attribute has some interesting possibilities. It's awkward how Rust allows APIs with reference arguments to observe whether two references have the same address, when this usually isn't part of the conceptual API. !raw_pointers would be a way to declare that a function doesn't do that.
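For example, a function taking two references can tell equal values apart by identity, using only safe code. The function name same_object is hypothetical; std::ptr::eq is the standard library function.

```rust
/// Returns true only when both references point at the same object,
/// regardless of whether the two strings have equal contents.
pub fn same_object(a: &String, b: &String) -> bool {
    std::ptr::eq(a, b)
}
```

Two strings with identical contents compare equal with ==, yet same_object distinguishes them, which is the kind of observation a !raw_pointers function would promise not to make.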
Further, with !raw_pointers, it'd be possible to have Rust code that doesn't depend on a byte-addressed address space. There'd be no alignment or endianness visible. Objects could be moved at any time, just like in a moving GC. Threads could be migrated to different stacks. This might even open up a path to Rust being able to use Wasm reference types, which Rust can't otherwise hold directly since they're opaque and can't have their representation exposed.
Are we secure yet?
No. But, to keep this blog post scoped, let's ignore side-channel attacks like Spectre, hardware attacks like Rowhammer, crypto miners, and denial-of-service attacks. And let's ignore attacks which change the behavior of the code without breaking the sandbox, such as changing an encryption implementation to emit syntactically valid but insecure data. That's a lot to ignore in reality, but the solutions to those would require radically different mechanisms, so let's put those aside for now.
Ok, now are we done yet?
What about global variables? We included mutable_static in ambient_authority above, so they won't be mutated, but is it a problem if the untrusted code reads any of the program's global immutable state? Could it find authentication secrets? To answer this, we'd need to start getting more specific about the threat model. But to keep things simple, let's say the program doesn't have anything sensitive in immutable global state. It's best to keep sensitive things like authorization credentials as scoped as possible in general anyway.
Along those lines, what about std::env::vars, std::env::args, std::env::home_dir and others? They might contain sensitive information, or even just your username. Let's say these are disallowed by !mutable_static by virtue of being mutable through libc APIs. Or, if needed, we could also add a new context to cover these.
Will it ever stop
This is just a brainstorming post, and it's possible things are missing, but it's likely any such things can be covered by adding more contexts. For the sake of making a finite blog post, let's assume we can cover everything.
So can we say then, that we now, assuming all of our assumptions, finally have a secure hypothetical sandbox here?
fn untrusted_code(x: &X) -> Y
with
!ambient_authority +
!new_unsafe +
!raw_pointers
{
// untrusted code here!
}
Yes.
beat
In theory.
In practice, the Rust compiler isn't currently designed or intended to be used as a security surface in this way. And it's not necessarily worth it for it to try to be one. There'd be work involved, and for this to actually make sense, we'd need to look at real-world use cases and attack vectors, and we wouldn't be able to ignore any of the things we ignored above.
Capability-based programming
However, even if we don't look to !ambient_authority to be the basis of an actual sandbox, and even if the performance impacts of the aliasing, escaping, and side effect knowledge aren't compelling, this overall technique might still be useful.
For people reviewing code, !ambient_authority could reduce the reasoning footprint, because they'd be able to make more local assumptions about the side effects of calling functions.
And for people building large complex applications, it could give them more tools to help ensure that two parts of the application don't have unintended interactions, as explored here.
And for people building wasm components, it could give them more tools to ensure that they're only using APIs which compose cleanly with other components.
Potential downsides
With all these colors, and with users having the ability to define their own colors, we could end up with a lot of colors.
Will having an ecosystem where everyone can use all these colors to enforce their requirements with extraordinary precision increase or decrease overall usability of Rust? Will it lead programmers to waste time pursuing every possible dimension of theoretical purity, regardless of what really matters in practice?
Will these new colors and automatic contexts prompt new rounds of users going through all their dependencies and insisting that they support new colors? If so, will it cause ecosystem churn and/or awkward workarounds, or even ecosystem fragmentation, like #![no_std] sometimes does, and is that worth it?
I don't know.
What I do know is, in a vacuum, it sure is fun to think up new colors.
Tangent: Pretty colors
Let's think about one more possible context, for fun:
- turing_complete - the ability to have loops (or tail recursion, if Rust adds that), that can't be proved to terminate.
The halting problem gets talked about a lot. However, how often does one actually write loop, as opposed to just using for? If we also had a way to assert that iterator implementations don't repeat themselves, a lot of real-world code might be compatible with !turing_complete.
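To see why the iterator condition matters, compare two for loops. Both functions below are made up for illustration. The first iterates a slice, which yields each element exactly once. The second iterates a cycling adapter, where only the take bound keeps the loop finite.

```rust
/// Terminates for every input: a slice iterator yields each element once.
pub fn count_even(xs: &[u32]) -> usize {
    let mut n = 0;
    for x in xs {
        if *x % 2 == 0 {
            n += 1;
        }
    }
    n
}

/// Also a for loop, but over an iterator that repeats itself.
/// Without the take(n) bound this would never finish for a non-empty slice.
pub fn first_n_cycled(xs: &[u32], n: usize) -> Vec<u32> {
    let mut out = Vec::new();
    for x in xs.iter().cycle().take(n) {
        out.push(*x);
    }
    out
}
```

So a bare for is not enough to guarantee termination: a !turing_complete check would also need to know something about the iterator being consumed.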
One of the tricky issues for iterators would be linked lists, which would need to be guaranteed to be acyclic. But it's interesting to note that in Rust, creating a circularly linked list requires either unsafe or interior mutability (for example Rc with RefCell) anyway. So maybe there's something we could do here.
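One caveat on acyclicity: with interior mutability, a ring can be built without writing unsafe at the use site, so any such guarantee would have to account for types like RefCell. A minimal sketch, with hypothetical Node, make_ring and walk:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A singly linked node whose next pointer can be changed after construction.
pub struct Node {
    pub value: u32,
    pub next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node ring a -> b -> a using only safe code.
/// The reference cycle means these nodes are never freed.
pub fn make_ring() -> Rc<Node> {
    let a = Rc::new(Node { value: 1, next: RefCell::new(None) });
    let b = Rc::new(Node { value: 2, next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    a
}

/// Follows next links for at most the given number of hops, collecting values.
/// The explicit bound is what keeps this from looping forever on a ring.
pub fn walk(start: &Rc<Node>, steps: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut cur = Rc::clone(start);
    for _ in 0..steps {
        out.push(cur.value);
        let next = cur.next.borrow().clone();
        match next {
            Some(n) => cur = n,
            None => break,
        }
    }
    out
}
```

The interior mutability is visible in the type of Node, so a compiler that already tracks interior mutability could in principle tell such lists apart from ones that are acyclic by construction.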
This would also address the "or infloop" case mentioned above, so we could also get dead-code-elimination of calls.