A Guide to Kernel Exploitation Attacking the Core [Paperback]
465 pág.

A Guide to Kernel Exploitation Attacking the Core [Paperback]


DisciplinaLinux716 materiais1.856 seguidores
Pré-visualização50 páginas
Linux, the kernel resides in the 0xc00000000\u2013
0xffffffff range (the \u201ctop\u201d gigabyte of virtual memory), whereas each process
is free to use all the addresses beneath this range (the \u201clower\u201d 3GB of virtual
memory).
\u2022 Separated kernel and process address space In this scenario, the kernel
and the user-land applications get a full, independent address space. In other
words, both the kernel and the user-land applications can use the whole range
of virtual addresses available.
From an exploitation perspective, the first approach provides a lot of
advantages over the second one, but to better understand this we need to introduce
the concept of execution context. Anytime the CPU is in supervisor mode (i.e., it
is executing a given kernel path), the execution is said to be in interrupt context if
no backing process is associated with it. An example of such a situation is the
consequence of a hardware-generated interrupt, such as a packet on the network
card or a disk signaling the end of an operation. Execution is transferred to an
interrupt service routine and whatever was running on the CPU is scheduled off.
Code in interrupt context cannot block (e.g., waiting for demand paging to bring
in a referenced page) or sleep: the scheduler has no clue when to put the code to
sleep (and when to wake it up).
Instead, we say that a kernel path is executing in process context if there is an
associated process, usually the one that triggered the kernel code path (e.g., as a
consequence of issuing a system call). Such \u201ccode\u201d is not subject to all the limita-
tions that affect code running in interrupt context, and it\u2019s the most common
mode of execution inside the kernel. The idea is to minimize as much as possible
the tasks that an interrupt service routine needs to perform.
16 CHAPTER 1 From User-Land to Kernel-Land Attacks
We just briefly explained what \u201chaving a backing process\u201d implies: that a lot
of process-specific information is available and ready to be used by the kernel
path without having to explicitly load or look for it. This means a variable that
holds this information relative to the current process is kept inside the kernel and
is changed anytime a process is scheduled on the CPU. A large number of kernel
functions consume this variable, thereby acting based on the information
associated to the backing process.
Since you can control the backing process (e.g., you can execute a specific
system call), you clearly control the lower portion of the address space. Now
assume that you found a kernel vulnerability that allows you to redirect the execu-
tion flow wherever you want. Wouldn\u2019t it be nice to just redirect it to some
address you know and control in user land? That is exactly what systems imple-
menting a kernel space on behalf of user space allow you to do. Because the
kernel page table entries are replicated over the process page tables, a single vir-
tual address space composed of the kernel portion plus your process user-land
mappings is active and you are free to dereference a pointer inside it. Obviously,
you need to be in process context, as in interrupt context, you may have no clue
what process was interrupted. There are many advantages to combining user and
kernel address spaces:
\u2022 You do not have to guess where your shellcode will be and you can write it
in C; the compiler will take care of assembling it. This is a godsend when the
code to trigger the vulnerability messes up many kernel structures, thereby
necessitating a careful recovery phase.
\u2022 You do not have to face the problem of finding a large, safe place to store the
shellcode. You have 3GB of controlled address space.
\u2022 You do not have to worry about no-exec page protection. Since you control
the address space, you can map it in memory however you like.
\u2022 You can map in memory a large portion of the address space and fill it with
NOPs or NOP-like code/data, sensibly increasing your chances of success.
Sometimes, as you will see, you might be able to overwrite only a portion of
the return address, so having a large landing point is the only way to write a
reliable exploit.
\u2022 You can easily take advantage of user space dereference (and NULL pointer
dereference) bugs, which we will cover in more detail in Chapter 2.
All of these approaches are inapplicable in a separated user and kernel space
environment. On such systems, the same virtual address has a different meaning
in kernel land and in user land. You cannot use any mapping inside your process
address space to help you during the exploitation process. You could say that the
combined user and kernel address space approach is best: to be efficient, the
separated approach needs some help from the underlying architecture, as happens
with the context registers on UltraSPARC machines. That does not mean it
is impossible to implement such a design on the x86 architecture. The problem
concerns how much of a performance penalty is introduced.
An Exploit Writer\u2019s View of the Kernel 17
OPEN SOURCE VERSUS CLOSED SOURCE OPERATING
SYSTEMS
We spent the last couple of sections introducing generic kernel implementation
concepts that are valid among the various operating systems we will cover in
this book. We will be focusing primarily on three kernel families: Linux (as a
classic example of a UNIX operating system), Mac OS X (with its hybrid
microkernel/UNIX design), and Windows. We will discuss them in more detail
in Chapters 4, 5, and 6. To conclude this chapter, we will provide a quick
refresher on the open source versus closed source saga.
One reason Linux is so popular is its open source strategy: all the source code of
the operating system is released under a particular license, the GNU Public License
(GPL), which allows free distribution and download of kernel sources. In truth, it is
more complicated than it sounds and precisely dictates what can and cannot be done
with the source code. As an example, it imposes that if some GPL code is used as
part of a bigger project, the whole project has to be released under GPL too. Other
UNIX derivates are (fully or mostly) open source as well, with different (and, usually,
more relaxed) licenses: FreeBSD, OpenBSD, NetBSD, OpenSolaris, and, even though
it\u2019s a hybrid kernel, Mac OS X let you dig into all or the vast majority of their kernel
source code base. On the other side of the fence there is the Microsoft Windows
family and some commercial UNIX derivates, such as IBM AIX and HP-UX.
Having the source code available helps the exploit developer, who can more
quickly understand the internals of the subsystem/kernel he or she is targeting and
more easily search for exploitation vectors. Auditing an open source system is also
generally considered a simpler task than searching for vulnerability on a closed
source system: reverse-engineering a closed system is more time-consuming and
requires the ability to grasp the overall picture from reading large portions of
assembly code. On the other hand, open source systems are considered more
\u201crobust,\u201d under the assumption that more eyes check the code and may report issues
and vulnerabilities, whereas closed source issues might go unseen (or, indeed, just
unreported) for potentially a long time. However, entering such a discussion means
walking a winding road. Systems are only as good and secure as the quality of their
engineering and testing process, and it is just a matter of time before vulnerabilities
are found and reliably exploited by some skilled researcher/hacker.
SUMMARY
In this chapter, we introduced our target, the kernel, and why many exploit
developers are interested in it. In the past, kernel exploits have proven to be not
only possible, but also extremely powerful and efficient, especially on systems
equipped with state-of-the-art security patches. This power comes with the expense
of requiring a wide and deep understanding of the kernel