Contexts and interactions
Todo
Talk about processes themselves.
Inter-process communication (IPC) lets a process communicate with the hardware and with other processes, so that they can cooperate to accomplish a task. thox provides two IPC mechanisms:
Remote procedure calls (RPC).
Message queues (not implemented yet).
Both mechanisms take place in contexts. A context is a security object that lets processes manage, amongst other things:
Sandboxing of given processes (by whitelisting or blacklisting RPC endpoints they can call).
Logging (by running the process from a process “watcher”).
Compatibility with older thox processes (by converting older RPC endpoints into new ones); see seccomp for a real-world example of such a concept.
Contexts can also correspond to objects such as file handles or network connections. Because references to a context are closed automatically when a process exits, the daemon is guaranteed that the handle will eventually be closed.
Regarding contexts, a process:
Has a default context, given the number 0.
Can get access to other contexts when receiving an answer to an RPC call, using os.pull(), or by creating one, using os.context().
Can share access to a context by answering a call using os.answer(), or by creating a new process using os.run().
A context is owned by the process that created it using os.context(). This process manages the context: it routes the RPC calls and messages made on it, and is therefore responsible for the context’s security when it shares the context with multiple processes.
The initial process gets a context managed by the process manager; the initial process can then create contexts for its children, depending on the security it wants to set up. See initd: process manager with thox IPC for more information.
Todo
Message queues in a different namespace, also in contexts, with the following functions, probably:
os.push(ctx, name, info)
os.listen(ctx, name)
This requires utilities similar to those for RPC calls, as we also need routing (although a call must arrive at exactly one destination, whereas a message can arrive at several).
Todo
What happens on OS shutdown, for example? In which order are processes closed if they themselves need to close processes? Can processes receive a “kill” event so that they can perform their last actions (e.g. sending disconnect messages on network connections)?
Events
thox processes are event-driven; the process manager prepares the events for the process to read, with an optional filter depending on what it expects.
Events can be the following:
RPC events, such as calls received by the process and answers received from previous calls.
Messages from message queues the process has subscribed to. These messages can represent “real world” events, etc.
Events are represented using os.Event objects, and are pulled using os.pull(). It is possible to pull specific events, such as the answer to a specific call, by specifying additional parameters to os.pull(); by default, it returns the oldest event not yet gathered by the process.
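The pulling behaviour described above can be modelled outside thox. The following is a minimal Python sketch, not the thox implementation: the `Event` fields, the `EventQueue` class and the `kind`/`cid` filter parameters are all hypothetical names chosen for illustration, since the actual parameters of os.pull() are not specified here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical stand-in for os.Event."""
    kind: str                  # "call", "answer" or "message" (assumed kinds)
    cid: Optional[int] = None  # call identifier, for RPC events
    payload: object = None


class EventQueue:
    """Toy per-process event queue (illustration only)."""

    def __init__(self):
        self._events = deque()

    def post(self, event):
        # The process manager would append events here as they occur.
        self._events.append(event)

    def pull(self, kind=None, cid=None):
        # Return the oldest event matching the filter, or None if none matches.
        for index, event in enumerate(self._events):
            if kind is not None and event.kind != kind:
                continue
            if cid is not None and event.cid != cid:
                continue
            del self._events[index]
            return event
        return None


queue = EventQueue()
queue.post(Event("message", payload="disk inserted"))
queue.post(Event("answer", cid=7, payload="ok"))
answer = queue.pull(kind="answer", cid=7)  # skips the older message
oldest = queue.pull()                      # no filter: oldest remaining event
```

With a filter, the older message stays queued and is returned by the next unfiltered pull.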
Todo
What if there are too many events? Where is the limit? What happens?
Remote Procedure Call (RPC)
thox processes communicate mainly in a one-to-one fashion using a remote procedure call protocol.
Todo
Since RPC in thox is asynchronous, what about cancelling calls? Some calls might never end, which might clog up the process’s call space. There should be a timeout mechanism, or we could let the process manage its own timeouts using alarms and let it cancel the call. But what happens to the process at the other end of the RPC call? This would be a hang-up, but wouldn’t it be simpler to leave the RPC call open on its side and simply not transmit the answer? But then, should we alert it of the hang-up?
Also, what if cancelling is used to DoS another process that performs complex operations or lots of I/O behind the scenes? E.g. instead of being limited by the maximum number of calls I can emit, I make a lot of calls, cancel all of them, then do it again in a loop. The process scheduler might see this as time-sharing friendly, so it might give the caller more time to make system calls compared to how it is managed on the other end.
Calling procedure
Picture three processes P1, P2 and P3, where P3 manages the default context of both P1 and P2. P1 wants to execute an action using this protocol; P2 implements the corresponding function and wants any other process using its default context to be able to run it.
In order to represent the action, thox uses RPC names such as my.super.function. When started up, P3 first decides, either for each action or globally, what it wants to do. Some common possibilities are:
It provides a fixed set of functions, and does not provide any mechanism to “bind” functions.
It transmits all calls from a given context to another, e.g. calls from a context it created to its default context.
It allows RPC name binding.
The basic context most processes on thox will encounter is the context provided by initd (see initd: process manager with thox IPC for more information). This context allows binding through its os.rpc.bind() and os.rpc.unbind() endpoints.
With binding, daemons such as P2 can route specific RPC calls made on their default context to themselves: subsequent calls by any process to my.super.function on the context provided by P3 will result in P2 receiving a call from that process. Therefore, when making a call to my.super.function on its default context, P1 will receive an answer from P2.
What happens, in order, during the call is the following:
P1 calls the procedure using the rpc call. This actually emits a call to the system using os.call(), which returns a token in the form of a numerical Call IDentifier (CID).
P3 gets a call event, bundled with the CID with which to answer, the arguments given by P1, and some additional request information. It finds out that P2 is bound to the given name, and transmits the call to it using os.transmit().
P2 gets a call event, bundled with the CID with which to answer, the arguments given by P1, and some additional request information.
P2 handles the request accordingly.
P2 emits an answer using os.answer(), passing the CID to it, optionally followed by some return values.
P1 gets an answer event, with the CID (to identify which call the answer is for, in case P1 has sent multiple calls).
For binding a name beforehand, P2 uses os.rpc.bind(); it can also unbind a name using os.rpc.unbind().
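To make the sequence of events concrete, here is a small Python model of the P1/P2/P3 exchange. It is only a sketch under simplifying assumptions: the `Process` and `Router` classes and their methods are hypothetical and do not mirror the signatures of os.call(), os.transmit() or os.answer(); a single CID is shared by all three processes; and answers are delivered through the router object rather than by the system.

```python
import itertools
from collections import deque

SUCCESS, UNBOUND = 0, 253  # values from the "Status codes" section


class Process:
    def __init__(self, name):
        self.name = name
        self.events = deque()  # events waiting to be pulled

    def pull(self):
        return self.events.popleft() if self.events else None


class Router(Process):
    """Plays P3: owns the context and routes the calls made on it."""

    def __init__(self, name):
        super().__init__(name)
        self.bindings = {}  # RPC name -> bound process
        self.callers = {}   # CID -> process that emitted the call
        self._next_cid = itertools.count(1)

    def bind(self, rpc_name, process):
        self.bindings[rpc_name.lower()] = process

    def call(self, caller, rpc_name, *args):
        # The caller gets a CID back immediately; the call itself is queued.
        cid = next(self._next_cid)
        self.callers[cid] = caller
        self.events.append(("call", cid, rpc_name.lower(), args))
        return cid

    def route_one(self):
        # Pull one call event and transmit it, or answer UNBOUND.
        event = self.pull()
        if event is None:
            return
        _, cid, rpc_name, args = event
        target = self.bindings.get(rpc_name)
        if target is None:
            self.answer(cid, UNBOUND)
        else:
            target.events.append(("call", cid, rpc_name, args))

    def answer(self, cid, status, *values):
        # The answer event goes back to whoever emitted the call.
        caller = self.callers.pop(cid)
        caller.events.append(("answer", cid, status, values))


p1, p2, p3 = Process("P1"), Process("P2"), Router("P3")
p3.bind("my.super.function", p2)

cid = p3.call(p1, "my.super.function", 20, 22)
p3.route_one()                            # P3 transmits the call to P2
_, call_cid, _, args = p2.pull()          # P2 gets the call event
p3.answer(call_cid, SUCCESS, sum(args))   # P2 answers with the CID
print(p1.pull())                          # ('answer', 1, 0, (42,))
```

A call to a name nobody has bound comes back to the caller as an answer event carrying the UNBOUND status.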
Remote Procedure Names (RPN)
Remote Procedure Names (RPN) are the names to which remote procedure calls are emitted. They are represented as a dot-joined collection of one or more name components, which are non-empty strings of letters and digits, not beginning with a digit and not being one of the following reserved words:
and, break, do, else, elseif, end, false, for, function, goto, if, in, local, nil, not, or, repeat, return, then, true, until, while
A Remote Procedure Name:
Can end with an underscore. Note that only the last name component can end with an underscore; for example, fs_.open is not allowed.
Must neither end with an empty component (i.e. with a dot) nor have a last component composed solely of an underscore; for example, fs_ and fs.open_ are allowed, whereas _ and fs._ are not.
Is of non-zero arbitrary length [1].
Is case-insensitive, which implies that the callee will receive a lower-cased version of it; e.g. FS.GetSpaceLeft, FS.GETSPACELEFT and fs.getspaceleft will all be received by the callee as fs.getspaceleft.
Note that the underscore has special significance; see Sharing contexts.
The regex for validating names as used for RPC (which requires negative lookaheads) is the following:
/(?!.*\.\_?$)((?!and|break|do|else|elseif|end|false|for|function|goto|if|in|local|nil|not|or|repeat|return|then|true|until|while)([a-z][a-z0-9]*)\.?)*\_?/gi
A Lua function to validate and return the canonicalized RPC name can be found in torpcname.lua.
Some valid and invalid identifiers are the following:
Valid identifiers |
Invalid identifiers |
---|---|
sleep os.module how.deep.does.this.go my.function2 my.function2_ my_ |
for 123hello hello.2theworld my.gawd$ my_.function2 my._ _ |
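The naming rules can also be checked component by component instead of with a single regex. The following Python sketch is an independent illustration, not a port of torpcname.lua; the function name `to_rpc_name` is hypothetical. It assumes that only ASCII letters and digits are allowed, and that a reserved word followed by a trailing underscore (such as `for_`) is still rejected, which the rules above do not state explicitly.

```python
import re

RESERVED = frozenset(
    "and break do else elseif end false for function goto if in local nil "
    "not or repeat return then true until while".split()
)
_COMPONENT = re.compile(r"[a-z][a-z0-9]*")


def to_rpc_name(name):
    """Return the lower-cased name if it is a valid RPN, else None."""
    if not name.isascii():
        return None
    canonical = name.lower()
    # Only the very end of the name may carry an underscore.
    body = canonical[:-1] if canonical.endswith("_") else canonical
    for component in body.split("."):
        # Empty components, leading digits, stray characters and
        # reserved words are all rejected here.
        if not _COMPONENT.fullmatch(component) or component in RESERVED:
            return None
    return canonical


print(to_rpc_name("FS.GetSpaceLeft"))  # fs.getspaceleft
print(to_rpc_name("my_.function2"))    # None
```

Checking each component against the reserved-word set, rather than using a prefix lookahead, means that a component which merely starts with a reserved word (for example `android`) is accepted.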
The rationale behind this definition is to be able to integrate these identifiers into native code using the os.rpc prefix, for example os.rpc.sleep(5) to emit a synchronous call to the sleep() function.
Case insensitivity is motivated by the confusion that a system-wide distinction between fs.GetSpaceLeft and fs.getspaceleft could generate, leading to potential security problems; see typosquatting for a real-world problem similar to the one this mitigation addresses.
Notice that while this API is asynchronous, most of the time, processes will want to call RPC functions synchronously; to make this more accessible, one can use the os.rpc object.
Status codes
RPC calls always return a numeric status code as their first value. This status code should always be between 0 and 255; when the status code provided to os.answer() falls outside those bounds, it is set to UNKNOWN (255).
Special status codes are the following:
SUCCESS (0): returned when the call has succeeded.
UNBOUND (253): returned when the name should be considered as unbound.
UNANSWERED (254): returned when a call has been unanswered. This can be due to it not being picked up from a full event queue, or to bad routing leading a process owning a context to forward it to itself.
UNKNOWN (255): returned when an invalid status code has been provided to os.answer().
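The bounds rule can be written as a short normalization step. The Python sketch below is illustrative only and the function name `normalize_status` is hypothetical; it additionally assumes that non-integer values count as invalid, which the text above does not specify.

```python
SUCCESS, UNBOUND, UNANSWERED, UNKNOWN = 0, 253, 254, 255


def normalize_status(status):
    """Map anything that is not an integer in 0..255 to UNKNOWN (255)."""
    # Assumption: booleans and non-integers are treated as invalid codes.
    if isinstance(status, bool) or not isinstance(status, int):
        return UNKNOWN
    return status if 0 <= status <= 255 else UNKNOWN


print(normalize_status(42))   # 42
print(normalize_status(300))  # 255
```

Since 255 is also what an out-of-range code collapses to, a caller cannot tell an explicit UNKNOWN apart from an invalid code supplied by the callee.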