Future extensions and enhancements
---------------------------------------------------------

In this section, we discuss only the main directions for future
enhancements. Pending minor improvements and fixes are listed in the
TODO file.

In the section "How secure is sysmask", we listed three steps, all of
which a successful exploit must accomplish against the system:

 A. Break a daemon (or browser) to make it execute arbitrary code.

 B. Use that code to exploit a vulnerability within the
    security-sensitive parts, in order to break into the kernel.

 C. Use the kernel break-in to remove the sysmask protection from the
    compromised process, or to create a new process with less sysmask
    protection or none.

The current sysmask package addresses only part B, modulo potential
vulnerabilities in the sensitive parts of the kernel.

---------------------------------------------------------

Let's first take a look at part C.

Once an exploit gains ring 0 access, it can modify any data, including
the data used by sysmask, provided that it knows where they are. And
this is the point.

Currently the kernel is compiled with static data locations, so an
exploit or a trojan horse can easily guess which memory location holds
which data. If you recompile the kernel with the data locations
arbitrarily permuted, this guess becomes much more difficult.

Given the amount of kernel data that are mutually independent and
therefore permutable, a full random permutation makes it practically
impossible for malicious code to find the data it wants, except by
indirect methods such as looking for recognizable code. However, many
code portions are also mutually independent and therefore permutable.

A good example to start with is the task_struct structure, defined in
include/linux/sched.h. It holds the information most vital to
security, and it contains a large number of fields.
This structure is also internal to the kernel, so if you recompile
everything (kernel + modules) with a shuffled structure, your kernel
is already much more secure than before.

Once this is done, utilities can be created to let anybody randomly
shuffle data positions before recompiling the kernel. Then every
system will run with a different data order, and it will be the
complexity of the kernel itself that makes breaking part C practically
impossible!

Note that here open source gains a real technical advantage over
proprietary software. Randomly shuffling the kernel source does not
violate the GPL as long as you don't distribute your shuffled binary
(and you have every interest in keeping your binary private, don't
you?). On the other hand, you obviously cannot do the same with
closed-source software, so users of proprietary software can only
continue to use non-shuffled binaries and remain vulnerable on this
point.

----------------------------------------------------------

For part A, the main point is to prevent arbitrary code from being
executed through a vulnerability. There are two different types of
execution: text scripts and binary code.

Paradoxically, it is the execution of text scripts that is more
difficult to prevent. Many programs call a shell as part of their
normal work, so for these programs execution of sh must be allowed,
and an exploit can send arbitrary scripts to it. In other cases, the
program itself contains an interpreter that can do many things. In
these cases, not much can be done other than using sysmask to limit
access. But then we are on step B, not step A.

On the other hand, efficient solutions should exist to prevent
execution of binary code. Programs can be relinked so that code and
data are placed in separate memory areas (today they are intermixed).
Once the program has finished loading, the local descriptor table
(LDT) of the process can be modified to disallow writing to the code
area and to disallow execution in the data area. Then you disallow
further modification of the LDT by setting the corresponding mask, and
you can be sure that an exploit cannot load malicious binary code into
memory and then execute it.

I haven't evaluated the development cost of doing so. In any case, for
processes in which the execution of text scripts cannot be prohibited,
preventing execution of binaries does not add much to security and
might not be worth the development cost.

-----------------------------------------------

The user friendliness of the system has much room for improvement.
Utilities can be designed to find the optimal policy for a piece of
software automatically. It is easy to pick up the access information
from the log file, sort it, and feed it back into the configuration
files.

On the other hand, although sysmask works well without the cooperation
of the protected software, it is always better if the latter
cooperates. And very little is needed to achieve this: the software
only has to send some signals to sysmask at important stages of its
operation, either by issuing a special system call or by trying to
reach a file with a particular name, real or fictitious. This can be
done in such a way that nothing goes wrong if sysmask is not there.

For example, it can issue a stat() on, say, a file named
"/tell-sysmask/startup-ended" or "/sysmask_info/socket_opened". Of
course the call will fail because there is no such file. But this
doesn't matter, and if sysmask is active it will capture the request
and know that more masks can be added to the process. That is, the
call is a sysmask trigger.

This is better than having the process issue sysmask commands
directly. Direct commands are less flexible, and may run into
compatibility trouble when protocols change.
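
As an illustration only, the stat() trigger described above might be
written as the following C sketch. The function name sysmask_notify
and the use of "/tell-sysmask" as a fixed trigger directory are
assumptions made for this example (the directory is borrowed from the
sample path above); neither is defined by the current sysmask package.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Assumed trigger directory, taken from the example path in the text.
 * It is not expected to exist on disk. */
#define SYSMASK_TRIGGER_DIR "/tell-sysmask"

/*
 * Announce that the program has reached the given stage, for example
 * "startup-ended". Returns 0 once the stat() has been issued, or -1 if
 * the stage name is missing or too long to form a path.
 *
 * The result of stat() is deliberately ignored: without sysmask the
 * call simply fails (normally with ENOENT) and nothing else happens.
 */
int sysmask_notify(const char *stage)
{
    char path[256];
    struct stat st;
    int saved_errno = errno;
    int n;

    if (stage == NULL)
        return -1;

    n = snprintf(path, sizeof path, "%s/%s", SYSMASK_TRIGGER_DIR, stage);
    if (n < 0 || (size_t)n >= sizeof path) {
        errno = saved_errno;
        return -1;
    }

    (void)stat(path, &st);   /* the request itself is the signal */
    errno = saved_errno;     /* leave the caller's errno untouched */
    return 0;
}
```

A daemon would call sysmask_notify("startup-ended") once its
initialization is complete. Because the helper restores errno and
ignores the outcome of stat(), the program behaves identically whether
or not sysmask is present.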
-------------------------------------------------

Security auditing is another important new domain to explore. On the
one hand, one can now seriously consider applying a proof checker to
the small amount of sensitive code in a system, to see what can be
proven secure and what cannot. On the other hand, non-proving auditing
programs can be designed to help system administrators and users check
whether their configurations are secure.