Anti-debugging and Anti-emulation Techniques

Malware is code written with the intent to cause annoyance, to retrieve sensitive information about individuals, to cause data loss, or for other similar purposes. Antivirus companies are constantly trying to detect malware and to lessen or stop these damaging effects. This causes an ongoing struggle between antivirus companies and virus writers. Anti-debugging and anti-emulation techniques have been devised by virus writers to counteract the efforts of antivirus companies to detect and remove malicious code. To further detect and stop malware, these techniques must be understood. This paper attempts to identify current anti-debugging and anti-emulation techniques and to provide code samples to allow for the identification of malicious code.

1. Introduction

To deter or slow down analysts who reverse-engineer malware, virus writers employ anti-debugging and anti-emulation techniques. These techniques slow the reverse-engineering process by helping a virus avoid detection when it is run in an emulator or debugger. Malware employs several different methods to fool dynamic detection and other analysis mechanisms such as emulators and debuggers. A given virus usually employs one or more of these techniques.

This paper discusses the methods virus writers employ to slow down reverse-engineering of their viruses, and it provides examples of these methods. It is intended as an overview of the various anti-debugging and anti-emulation techniques for those who want to defend against them.

The paper is split into several sections. Section 2 addresses anti-emulation techniques, and section 3 addresses anti-debugging techniques. Other anti-detection techniques that do not fall into the first two categories are discussed in section 4. Examples of viruses that combine these techniques are discussed in section 5, and section 6 is the conclusion.

2. Anti-emulation

An emulator provides a restricted environment (an image of an operating system that is installed on top of a host operating system) in which one can dynamically analyze a program. For example, a machine running Linux can have a virtual machine installed on it running Windows XP. An emulator will also include CPU and memory emulation, hardware emulation, and an emulation controller [1]. This provides a safe way to dynamically analyze programs without the possibility of causing damage to the underlying operating system.

There are also drawbacks to this system. First, the more time spent emulating, the longer the user has to wait for any analysis of the program. Second, emulation is slow due to the overhead required to monitor the program and to emulate the operating system and hardware. QEMU, a hardware and operating system emulator, has a slowdown factor of between 4 and 10 for hardware emulation and a slowdown factor of 2 for emulation of the software memory management unit [2]. QEMU does not include any dynamic monitoring, which would increase the slowdown further. A final drawback is that malware employs different anti-emulation techniques to fool the emulator. These techniques fall into three categories: outlasting, outsmarting, or overextending the emulator.

2.1 Outlasting the emulator

Since emulation is expensive, emulators run only a limited number of instructions, on the order of a few hundred to about a thousand, before terminating. On an Intel x86 system, it is believed that running around 1000 instructions is enough to detect malicious code while maintaining a short runtime [1]. If the code does nothing considered malicious during that time, the emulator will not detect the code as malware. Outlasting the emulator can be done by replicating based on a specific probability, by running benign code for a specific time frame before running the malicious code, or by entry point obfuscation.

2.1.1 Probabilistic Replication

Some malware will only replicate a certain percentage of the time. This means that an emulator would have to run the code several times in order to detect the malicious code reliably. An example of this is a virus that runs benign code in 90% of executions and malicious code in only 10%.
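The cost this imposes on the analyst can be estimated with a short calculation. The sketch below assumes each execution triggers independently with probability p, so n emulation runs observe the malicious path at least once with probability 1 - (1 - p)^n. The function names and the 95% target are illustrative assumptions, not taken from any cited tool.

```python
import math


def detection_chance(trigger_probability: float, runs: int) -> float:
    """Chance that at least one of `runs` independent executions triggers."""
    return 1.0 - (1.0 - trigger_probability) ** runs


def runs_needed(trigger_probability: float, confidence: float) -> int:
    """Smallest number of runs whose detection chance reaches `confidence`."""
    if not 0.0 < trigger_probability < 1.0:
        raise ValueError("trigger_probability must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    miss_chance = 1.0 - trigger_probability
    return math.ceil(math.log(1.0 - confidence) / math.log(miss_chance))


if __name__ == "__main__":
    # The 10% example from the text, with an assumed 95% detection target.
    print(runs_needed(0.10, 0.95))
```

For the 10% example, the emulator would need 29 runs to reach a 95% chance of seeing the malicious path even once, which multiplies the per-sample emulation cost accordingly.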

2.1.2 Running Benign Code

Other malware will run benign code on every startup for a specific time frame and only then run the malicious code. The purpose is to let the emulator use up its instruction budget while the code appears to do nothing malicious. Once the emulator has given up, the malware goes on to replicate and cause damage to the system.
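The effect of a fixed instruction budget can be illustrated with a toy model. This is only a sketch: the "instructions" are made-up strings, SUSPICIOUS_OP is a placeholder marker rather than a real opcode, and the 1000-instruction budget is the figure quoted earlier in this section.

```python
from typing import Iterable, Iterator

SUSPICIOUS = "SUSPICIOUS_OP"  # placeholder marker, not a real instruction


def sample_trace(benign_prefix: int) -> Iterator[str]:
    """Yield a made-up trace: `benign_prefix` harmless ops, then one marker."""
    for _ in range(benign_prefix):
        yield "NOP"
    yield SUSPICIOUS


def emulate(trace: Iterable[str], budget: int) -> bool:
    """Return True if the marker is executed within `budget` instructions."""
    for executed, op in enumerate(trace, start=1):
        if executed > budget:
            return False  # budget exhausted, the emulator gives up
        if op == SUSPICIOUS:
            return True
    return False


if __name__ == "__main__":
    print(emulate(sample_trace(500), budget=1000))   # marker inside the budget
    print(emulate(sample_trace(5000), budget=1000))  # marker beyond the budget
```

In this model the sample is flagged only when the marker falls inside the budget, so padding the start of the program with enough harmless work is sufficient to evade a single bounded emulation pass.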

2.1.3 Entry Point Obfuscation

Entry point obfuscation (EPO) is a method in which the malicious code is attached to a specific section of a file and is not executed first. Instead, the virus can look for calls to the ExitProcess() API function and overwrite those calls with a jump to the malicious code. This way the code runs when the executable exits, not when it starts. The virus can also look for a specific sequence of code to overwrite with the malicious code itself, or with a jump to the malicious code. Entry point obfuscation therefore has the added advantage that the virus can still run the original code later, by copying that code to a new location before actually overwriting it [1].

2.2 Outsmarting the emulator

To outsmart the emulator, some malware uses time-based triggers or other conditionals, and encrypted malware may use unusual decryption techniques. A time-based trigger can be as simple as running the malicious code only at 3 pm or only on a specific day. The malware can also check other conditions, such as whether it is being debugged or emulated. If it is, it executes benign code.
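From the analyst's side, one way to expose a time-based trigger is to repeat the emulation under different emulated clock values. The sketch below is a toy model under that assumption: toy_sample stands in for a program whose behavior depends on the hour, and both function names and branch labels are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict


def toy_sample(now: datetime) -> str:
    """Hypothetical sample that changes behavior only during the 3 pm hour."""
    return "trigger-branch" if now.hour == 15 else "benign-branch"


def sweep_clock(sample: Callable[[datetime], str],
                start: datetime, hours: int) -> Dict[str, datetime]:
    """Run the sample once per emulated hour and record when each
    distinct behavior was first observed."""
    first_seen: Dict[str, datetime] = {}
    for offset in range(hours):
        now = start + timedelta(hours=offset)
        first_seen.setdefault(sample(now), now)
    return first_seen


if __name__ == "__main__":
    start = datetime(2010, 6, 16, 9, 0)
    for behavior, when in sweep_clock(toy_sample, start, hours=24).items():
        print(behavior, when.isoformat())
```

A single run at 9 am sees only the benign branch, while the 24-hour sweep also reaches the triggered branch. The price is 24 emulation passes instead of one, which is the cost the malware author is counting on.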

Decryption techniques that can be used include spreading the decryption loop throughout the code or using multiple decryption passes [1]. Multiple decryption layers are used by W32/Harrier, W32/Coke and W32/Zelly, where the first decryptor decrypts the second, the second decrypts the third, and so on [9]. Malware can also decrypt blocks of code only as they are needed in order to outsmart the emulator [1].

The RDA.Fighter virus uses brute-force decryption. The virus does not store its decryption key, so it must try all possible keys in order to decrypt itself. This is useful to the virus writer because it is difficult for antivirus companies to decrypt the virus without using the same brute-force method, which takes a large number of emulated instructions. It is likely, then, that the virus will not finish decrypting itself while the emulator is running. Below is the decryption code of the RDA.Fighter virus [6].

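setup:
    xor ebx,ebx                      ; key counter starts at zero
iterate:
    mov esi,[ebp + hostOffset]       ; esi = start of the encrypted code
    mov edi,esi                      ; edi = same address (decrypt in place)
    mov ecx,[ebp + host_size]        ; ecx = number of bytes to process
    inc ebx                          ; move on to the next key
decrypt:
    lodsb                            ; al = [esi], esi = esi + 1
    xor al,bl                        ; xor with the low byte of the key
    stosb                            ; [edi] = al, edi = edi + 1
    loop decrypt                     ; repeat until ecx reaches zero
check:
    mov esi,[ebp + hostOffset]
    push esi                         ; argument: start of the code
    mov ecx,[ebp + host_size]
    push ecx                         ; argument: size of the code
    mov eax,[ebp + __ADDR_CheckSum] ; whatever this happens to be
    call eax
    test eax,eax
    jnz iterate                      ; nonzero result: try another key
    mov esi,[ebp + hostOffset]
    jmp esi                          ; run the decrypted code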

In the above example, the setup block sets ebx to zero and runs only once. The iterate block then sets esi and edi to the beginning of the encrypted code, sets ecx to the size of the encrypted code, and increments ebx by one. The decrypt block goes through the encrypted code byte by byte and xors each byte with the key held in bl, the low byte of ebx. This loop terminates when ecx reaches zero. The check block then tests whether the current code matches a predetermined checksum. If it does, the code has been decrypted and control jumps to the newly decrypted code. If it does not, control returns to the iterate block, which tries again with the next key.
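The cost of this approach to an emulator can be seen in a small model of the same idea. The sketch below is not the virus's algorithm: it assumes a single-byte xor key, uses zlib.crc32 as a stand-in for the unspecified checksum routine, and tries each key against an unmodified copy of the ciphertext, whereas the listing works in place. All names are illustrative.

```python
import zlib
from typing import Optional, Tuple


def xor_bytes(data: bytes, key: int) -> bytes:
    """Xor every byte of `data` with a single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_key(ciphertext: bytes,
                    expected_checksum: int) -> Optional[Tuple[int, bytes, int]]:
    """Try keys 1..255 in order until the checksum matches.

    Returns (key, plaintext, byte_operations), or None if no key matches.
    """
    for key in range(1, 256):
        candidate = xor_bytes(ciphertext, key)
        if zlib.crc32(candidate) == expected_checksum:
            # Every attempt so far touched each byte of the buffer once.
            return key, candidate, key * len(ciphertext)
    return None


if __name__ == "__main__":
    body = b"placeholder body bytes"      # harmless stand-in data
    checksum = zlib.crc32(body)           # the only value that is stored
    ciphertext = xor_bytes(body, 0x5A)    # the key itself is discarded
    print(brute_force_key(ciphertext, checksum))
```

Even in this tiny model the work grows with the key space multiplied by the buffer size. With a realistic body size and a larger key space, the number of emulated instructions quickly exceeds the budget discussed in section 2.1.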

W95/Fono uses a nonlinear decryption algorithm, meaning that the encrypted portion of the virus is not decrypted linearly as an emulator would expect. Since the virus body is not decrypted one byte after the other, it can confuse emulators. W95/Fono uses a key table, and the decryptor performs substitutions based on this table. Each symbol in the plaintext alphabet corresponds to another symbol in the ciphertext. For example, A corresponds with L, Z corresponds with F, and so on. With this method, each piece of the virus is decrypted in semi-random order and every position is hit only once. W95/Drill and {W32, Linux}/Simile.D use nonlinear decryption as well.
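The general idea of substitution-based decryption in a non-sequential order can be sketched as follows. This is an illustration only and not W95/Fono's actual algorithm: the substitution table and the visiting order are both generated from a seeded random generator, and every name here is an assumption made for the example.

```python
import random
from typing import List, Tuple


def make_tables(length: int, seed: int) -> Tuple[List[int], List[int], List[int]]:
    """Build a visiting order plus a byte substitution table and its inverse."""
    rng = random.Random(seed)
    order = list(range(length))
    rng.shuffle(order)            # each position appears exactly once
    substitution = list(range(256))
    rng.shuffle(substitution)     # plaintext byte -> ciphertext byte
    inverse = [0] * 256
    for plain, cipher in enumerate(substitution):
        inverse[cipher] = plain   # ciphertext byte -> plaintext byte
    return order, substitution, inverse


def encrypt(plaintext: bytes, substitution: List[int]) -> bytes:
    """Substitute every byte through the table."""
    return bytes(substitution[b] for b in plaintext)


def decrypt_nonlinear(ciphertext: bytes, order: List[int],
                      inverse: List[int]) -> bytes:
    """Decrypt positions in the shuffled order rather than front to back."""
    output = bytearray(len(ciphertext))
    for position in order:
        output[position] = inverse[ciphertext[position]]
    return bytes(output)


if __name__ == "__main__":
    data = b"harmless sample text"
    order, substitution, inverse = make_tables(len(data), seed=2010)
    print(order)
    print(decrypt_nonlinear(encrypt(data, substitution), order, inverse))
```

Because the writes jump around the buffer, a heuristic that looks for a loop sweeping memory from start to end sees no such pattern, even though every position is still written exactly once.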

W95/Silcer and W95/Resure force the Windows loader to relocate the infected program images when they are loaded into memory. The relocation itself decrypts the virus body, because the virus inserts special relocation entries for that purpose [9]. W95/Resurrel is a virus written after W95/Resure. It sets the base address of the newly infected file to a value of 0xBFxxxxxx (the xxxxxx is a random value returned by the GetTickCount() API call). It then adds a relocation entry for each DWORD of its own code section and encrypts each DWORD by adding the base address value to it and then subtracting 0x400000. When the application is executed, the program image cannot be loaded at the requested base address, because that address is either invalid or falls within KERNEL32.DLL. The system loader therefore relocates the code to a valid address, decrypting the virus as it does so. The difficulty in decrypting the virus makes it hard to detect [10].
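The arithmetic behind this trick can be checked with a short worked example. The sketch assumes that the loader ends up placing the image at 0x400000 and that each relocation entry is a full 32-bit fixup, meaning the loader adds the difference between the actual and the requested base to the DWORD. The base value 0xBF120000 and the sample DWORDs are hypothetical.

```python
MASK32 = 0xFFFFFFFF
DEFAULT_BASE = 0x400000  # assumed address the loader falls back to


def encrypt_dword(value: int, fake_base: int) -> int:
    """Encryption as described above: add the fake base, subtract 0x400000."""
    return (value + fake_base - DEFAULT_BASE) & MASK32


def apply_relocation(value: int, fake_base: int, load_address: int) -> int:
    """What a 32-bit base relocation does: add (actual base - requested base)."""
    delta = load_address - fake_base
    return (value + delta) & MASK32


if __name__ == "__main__":
    fake_base = 0xBF120000                        # hypothetical 0xBFxxxxxx value
    code = [0x00401000, 0x8B55EC83, 0xFFFFFFFF]   # arbitrary sample DWORDs
    stored = [encrypt_dword(d, fake_base) for d in code]
    loaded = [apply_relocation(d, fake_base, DEFAULT_BASE) for d in stored]
    print([hex(d) for d in stored])
    print([hex(d) for d in loaded])
```

The two steps cancel exactly: (d + base - 0x400000) + (0x400000 - base) = d modulo 2^32. The decryption is therefore performed entirely by the loader, and no decryption loop appears in the virus code for an emulator to recognize.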

S-TECHNOLOGIES

Wednesday, June 16, 2010
