Future extensions and enhancements

---------------------------------------------------------

In this section, we discuss only the main directions for future
enhancements. Pending minor improvements and fixes are listed in the TODO
file.

In the section "How secure is sysmask", we listed three steps, all of which
a successful exploit must carry out against the system:

	A. Break a daemon (or browser) to make it execute arbitrary code.

	B. Use that code to exploit a vulnerability within the
	security-sensitive parts, to break into the kernel.

	C. Use the kernel break-in to get rid of the sysmask protection for
	the compromised process, or to create a new process with less or no
	sysmask protection.

The current sysmask package only addresses part B, modulo potential
vulnerabilities in the sensitive parts of the kernel.

---------------------------------------------------------

Let's first take a look at part C. Once an exploit gains ring 0 access, it
can modify any data, including the data used by sysmask, provided that it
knows where the data are located.

And this is the point. Currently the kernel is compiled with static data
locations, so an exploit or a trojan horse can easily guess which memory
location corresponds to which data. If you recompile the kernel with the
data locations arbitrarily permuted, this guess becomes much more difficult.
Given the amount of kernel data that is mutually independent and therefore
permutable, a full random permutation makes it practically impossible for
malicious code to find the data it wants, except by indirect methods such as
looking for recognizable code. However, many portions of code are also
mutually independent and therefore permutable.

A good example to start with is the task_struct structure, defined in
include/linux/sched.h. It holds the information most vital to security and
contains a large number of fields. This structure is internal to the
kernel, so if you recompile everything (kernel + modules) with a shuffled
structure, your kernel is already much more secure than before.

Once this is done, utilities can be created to allow anybody to randomly
shuffle data positions before recompiling the kernel. Then every system will
run with a different data order, and it will be the complexity of the
kernel that makes breaking part C practically impossible!
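
As a rough illustration of what such a utility might do at its core, here is
a minimal sketch in C. It is not part of sysmask. The function names
shuffle_fields() and emit_struct() are hypothetical, the fields are passed
in as plain strings, and rand() is used only to keep the example short. A
real tool would also need to parse the kernel headers and leave alone any
field whose position matters.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: Fisher-Yates shuffle of an array of field
 * declarations. rand() keeps the sketch short; a real tool would
 * want a stronger source of randomness. */
void shuffle_fields(const char **fields, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        const char *tmp = fields[i];
        fields[i] = fields[j];
        fields[j] = tmp;
    }
}

/* Hypothetical helper: print a struct definition with the fields in
 * their current (possibly shuffled) order. */
void emit_struct(FILE *out, const char *name,
                 const char **fields, size_t n)
{
    fprintf(out, "struct %s {\n", name);
    for (size_t i = 0; i < n; i++)
        fprintf(out, "\t%s;\n", fields[i]);
    fprintf(out, "};\n");
}
```

A caller would seed the generator with srand(), fill an array with
declarations such as "int a" and "long b", call shuffle_fields() on it, and
write the result with emit_struct(stdout, "example", fields, n).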

Note that here open source gains a real technical advantage over
proprietary software. Randomly shuffling kernel source does not violate the
GPL as long as you don't distribute your shuffled binary (and you have every
interest in keeping your binary private anyway). On the other hand, you
obviously cannot do the same with closed-source software, so users of
proprietary software can only continue to use non-shuffled binaries and
remain vulnerable on this point.

----------------------------------------------------------

For part A, the main point is to prevent arbitrary code from being executed
through a vulnerability. There are two different types of execution: text
scripts and binary code.

Paradoxically, it is the execution of text scripts that is more difficult to
prevent. Many programs call a shell as part of their normal work, so for
these programs execution of sh must be allowed, and an exploit can send
arbitrary scripts to it. In other cases, the program itself contains an
interpreter that can do many things. In these cases, not much can be done
other than using sysmask to limit access. But then we are at step B, not
step A.

On the other hand, efficient solutions should exist to prevent execution of
binary code. Programs can be relinked so that code and data are separated
into different memory areas (today they are intermixed). Once the program
has finished loading, the local descriptor table (LDT) of the process can be
modified to disallow writing to the code area and to disallow execution in
the data area. Further modification of the LDT is then disallowed by setting
the corresponding mask, so that an exploit cannot load malicious binary code
into memory and then execute it.

I haven't evaluated the development cost of doing so. In any case, for
processes for which the execution of text scripts cannot be prohibited,
preventing execution of binaries does not add much to security and might not
be worth the development cost.

-----------------------------------------------

The user friendliness of the system has much room for improvement. Utilities
can be designed to automatically find the optimal policy for a piece of
software. It is easy to pick up the access information from the log file,
sort it, and feed it back into the configuration files.
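
The sorting step could look roughly like the sketch below. It assumes, purely
for illustration, that each log entry has already been read into a string
such as "read /etc/passwd"; that format and the function name
unique_sorted() are hypothetical and are not the actual sysmask log format
or API. Reading the log and writing the configuration files are left out.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sort the log entries and drop duplicates in
 * place. Returns the number of unique entries, which end up at the
 * front of the array in sorted order. */
size_t unique_sorted(const char **entries, size_t n)
{
    if (n == 0)
        return 0;
    qsort(entries, n, sizeof *entries, cmp_str);
    size_t out = 1;
    for (size_t i = 1; i < n; i++) {
        if (strcmp(entries[i], entries[out - 1]) != 0)
            entries[out++] = entries[i];
    }
    return out;
}
```

The unique entries left at the front of the array are the candidate rules
that a policy generator would then translate into configuration lines.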

On the other hand, although sysmask works well without cooperation from the
protected software, it is always better if the latter cooperates. And very
little is needed to achieve this: the software only has to send some
signals to sysmask at important stages of its operation, by issuing a
special system call, or by trying to reach a file with a particular name,
real or fictitious. This can be done in such a way that nothing goes wrong
if sysmask is not there.

For example, it can issue a stat() on, say, a file named
"/tell-sysmask/startup-ended" or "/sysmask_info/socket_opened". Of course
the call will fail because there is no such file. But this doesn't matter,
and if sysmask is active it will capture the request and know that more
masks can be added to the process. That is, the call is a sysmask trigger.
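
On the side of the cooperating program, such a trigger could be as small as
the following sketch. The helper name sysmask_notify() is hypothetical, and
the path passed to it is only the illustrative name used above, not an
established convention. The helper restores errno so that the expected
failure of stat() stays invisible to the rest of the program.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper: tell sysmask that a stage has been reached by
 * probing a trigger path. The stat() call is expected to fail when
 * the file does not exist, so its result is deliberately ignored. */
void sysmask_notify(const char *trigger_path)
{
    struct stat st;
    int saved_errno = errno;   /* keep the caller's errno intact */

    (void)stat(trigger_path, &st);
    errno = saved_errno;
}
```

A daemon would call sysmask_notify("/tell-sysmask/startup-ended") once its
initialization is complete. Without sysmask, the call is just a failed
stat() with no side effect.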

This is better than having the process issue sysmask commands directly.
Direct commands are less flexible, and may run into compatibility troubles
when protocols change.

-------------------------------------------------

Security auditing is another important new domain to explore. On the one
hand, one can now realistically consider applying a proof checker to the
small amount of sensitive code in a system, to see what can be proven
secure and what cannot.

On the other hand, non-proving auditing programs can be designed to help
system administrators and users check whether their configurations are
secure.