Anti-debugging and Anti-emulation Techniques (part 2)

2.3 Overextending the emulator

Overextending the emulator means executing instructions that will either crash the emulator or reveal that one is running. One way to do this is to execute undocumented instructions that the emulator may not support, which causes it to throw an exception and stop running. One such example is the undocumented CPU instruction SALC, which is used in W95/Vulcano [9]. Another way to detect or crash a running emulator is to try to access large portions of memory at once [1]. This is usually not effective, though, because most operating systems, as well as emulators, will prevent the program from doing so.

A way to detect whether an emulator is running is to call a function twice that should return two different values. An example of this would be calling any of the time functions twice and verifying that there is a difference between the two return values. In Windows this can be achieved through kernel32!QueryPerformanceCounter (which wraps ZwQueryPerformanceCounter), through kernel32!GetTickCount, or by querying the number of CPU cycles executed since the machine started using the RDTSC (read time stamp counter) instruction. An example of the latter is below [15].

push offset handler
push dword ptr fs:[0]
mov fs:[0],esp
rdtsc
push eax
xor eax, eax
div eax ;trigger exception
rdtsc
sub eax, [esp] ;ticks delta
add esp, 4
pop fs:[0]
add esp, 4
cmp eax, 10000h ;threshold
jb @not_debugged
@debugged:
...
@not_debugged:
...
handler:
mov ecx, [esp+0Ch]
add dword ptr [ecx+0B8h], 2 ;skip div
xor eax, eax
ret

The example above shows how the RDTSC instruction can be used to detect whether a debugger is present. RDTSC is executed twice, once before and once after a deliberately triggered exception, and the difference between the two values is computed. This difference is then compared to a threshold (10000h in this example). If the difference is at or above the threshold, then a debugger is assumed to be present and execution falls through to @debugged, away from the malicious code; otherwise the jump to @not_debugged is taken.
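
The same measure-twice-and-compare idea can be sketched in Python. This is only an illustration of the logic, not the listing above: it uses the standard library's monotonic clock instead of RDTSC, and the threshold value and the function names are assumptions chosen for the example.

```python
import time

# Hypothetical threshold (50 ms), chosen only for illustration.
THRESHOLD_NS = 50_000_000

def elapsed_ns(work):
    """Return how long work() took, using a monotonic high-resolution clock."""
    start = time.perf_counter_ns()
    work()
    return time.perf_counter_ns() - start

def looks_slowed_down(delta_ns, threshold_ns=THRESHOLD_NS):
    """Mirror the cmp/jb pair: a delta at or above the threshold counts as 'debugged'."""
    return delta_ns >= threshold_ns

if __name__ == "__main__":
    delta = elapsed_ns(lambda: sum(range(1000)))
    print("slowed down" if looks_slowed_down(delta) else "normal speed")
```

A fixed threshold is fragile in practice, since a busy machine can exceed it without any debugger attached.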

Importing obscure libraries may not actually cause the emulator to stop running, but if the emulator cannot provide the library, the malware code itself will not run.

Malware can also try to retrieve webpages to see whether it has access to the internet. This keeps the virus code from executing on a machine that is not currently connected to the internet, but it also stops execution under an emulator, since most emulators do not allow internet access [1].

Using coprocessor (FPU) instructions is another way to overextend the emulator, because most emulators do not emulate the FPU instructions. The Prizzy polymorphic engine (PPE) can generate 43 different coprocessor instructions for use in its polymorphic decryptor. If these FPU instructions are not supported, the Prizzy decryption does not execute.

Along the same lines, malware can use MMX instructions. This instruction set adds 8 new registers to the architecture. The malware will check to see if MMX support exists by using the CPUID instruction. Examples of malware that use this technique are W32/Legacy and W32/Thorin.
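
The CPUID check comes down to testing single bits in a register. The sketch below assumes the caller has already obtained the EDX value returned by CPUID leaf 1 (Python cannot execute CPUID itself); the sample values and the function name are made up for the example. Bit 0 of that value reports an on-chip FPU and bit 23 reports MMX support.

```python
FPU_BIT = 0    # CPUID leaf 1, EDX bit 0: x87 FPU present
MMX_BIT = 23   # CPUID leaf 1, EDX bit 23: MMX supported

def has_feature(edx, bit):
    """Return True if the given feature bit is set in the EDX value."""
    return bool((edx >> bit) & 1)

if __name__ == "__main__":
    sample_edx = (1 << MMX_BIT) | (1 << FPU_BIT)  # hypothetical CPUID result
    print("MMX:", has_feature(sample_edx, MMX_BIT))
    print("FPU:", has_feature(sample_edx, FPU_BIT))
```

An emulator that reports these bits but does not implement the instructions behind them is exactly what this kind of code is trying to expose.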

Malware will also set up an exception handler, execute a garbage block of code, and then indirectly execute its own handler to transfer control to another part of the polymorphic decryptor. This is done in hopes that the emulator cannot handle the exception [9]. An example that is similar to this technique is shown in section 3.7.

3. Anti-debugging

Anti-debugging techniques are attempts by the malware to detect that it is being debugged, usually by monitoring its own code. To do this, the malware can examine its own code for breakpoints or check for a debugger directly through system calls.

3.1 Breakpoints

To examine its code for breakpoints, the malware can look for the 0xCC opcode (the INT 3 instruction), which on Linux raises a SIGTRAP signal. This is the instruction the debugger uses to gain control from the malware at a breakpoint. The malware can also set false breakpoints if the malware code itself has a signal handler. This way it can continue to execute instructions after the breakpoint that it set.
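
A breakpoint scan of this kind can be sketched as a search for the 0xCC byte in a code buffer. The function name and the sample bytes below are assumptions made for illustration. Note that this naive scan also reports 0xCC bytes that are really operands or data, so it can produce false positives.

```python
INT3 = 0xCC  # one-byte software breakpoint opcode

def find_int3(code: bytes):
    """Return the offsets of every 0xCC byte in the buffer (naive scan)."""
    return [offset for offset, value in enumerate(code) if value == INT3]

if __name__ == "__main__":
    # push ebp / mov ebp, esp / int3 / nop / ret
    sample = bytes.fromhex("5589e5cc90c3")
    print(find_int3(sample))
```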

Malware can also try to overwrite the breakpoints. The W95/Marburg virus uses a backwards decryption loop to overwrite the breakpoint. The viruses in the Yankee_doodle family, on the other hand, use Hamming code to self-correct their code. Hamming code allows programs to detect and correct errors, but in this case it allows the virus to detect and remove breakpoints in its code [9].
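
To show how an error-correcting code can undo a single changed bit, here is the textbook Hamming(7,4) code in Python. This is not the code used by the Yankee_doodle family, whose details are not given here; it is a minimal sketch of the general principle, with function names chosen for the example.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def correct(codeword):
    """Return (corrected codeword, syndrome); the syndrome is the flipped position or 0."""
    syndrome = 0
    for position, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= position
    fixed = list(codeword)
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome

def decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    fixed, _ = correct(codeword)
    return [fixed[2], fixed[4], fixed[5], fixed[6]]

if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate one modified bit
    print(decode(word))
```

This code corrects only one flipped bit per codeword, so a real self-repairing scheme would have to choose its block size with the size of a breakpoint patch in mind.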

3.2 Calculating the checksum

Malware can also checksum its own code. If the checksum has changed, then the virus can assume that it is being debugged and that breakpoints have been placed within the code [3]. VAMPiRE is an anti-anti-debugging tool that gets around the detection of breakpoints [12]. VAMPiRE accomplishes this by keeping a table in memory of the breakpoints that have been set. The program consists of a page-fault handler (PFH), a general protection fault handler (GPFH), a single-step handler and a framework API. When a breakpoint is triggered, either the PFH (which handles breakpoints set on code, data, or memory-mapped I/O) or the GPFH (which handles legacy I/O breakpoints) receives control. The single-step handler is used for breakpoint persistence, allowing breakpoints to be used more than once.
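
The self-checksum check described at the start of this section can be sketched as follows. CRC32 from the standard zlib module is used here purely as an example checksum; the text does not say which checksum any particular virus uses, and the function names and sample bytes are assumptions.

```python
import zlib

def checksum(code: bytes) -> int:
    """Checksum of a code buffer (CRC32 chosen for illustration)."""
    return zlib.crc32(code)

def is_tampered(code: bytes, expected: int) -> bool:
    """Return True if the buffer no longer matches the recorded checksum."""
    return checksum(code) != expected

if __name__ == "__main__":
    original = bytes.fromhex("5589e59090c3")
    expected = checksum(original)

    patched = bytearray(original)
    patched[3] = 0xCC  # a software breakpoint replaces one byte of code
    print(is_tampered(original, expected), is_tampered(bytes(patched), expected))
```

This also shows why a tool like VAMPiRE avoids writing 0xCC into the code at all: a breakpoint that does not modify the code leaves the checksum unchanged.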

3.3 Detecting the debugger

A very simple way of detecting a debugger on a Linux system is to call ptrace, since a process can have only one tracer at a time: if a debugger is already attached, the process's own ptrace request fails [3]. In Windows, the API function IsDebuggerPresent returns a non-zero value if the program is being debugged and 0 otherwise. This function simply checks a flag that is set when the process is running under a debugger. The same check can be done directly by reading the byte at offset 2 in the Process Environment Block (PEB). The following code is an example of this technique.

mov eax, fs:[30h]
movzx eax, byte ptr [eax+2]
test eax, eax
jne @DebuggerDetected

As the above example shows, eax is set to the address of the PEB (Process Environment Block), and the byte at offset 2 of that block is then loaded into eax. A check is done to see whether eax is zero. If it is, then no debugger is present; if not, then there is a debugger.

When a process is created with a debugger already running, the system sets certain flags for the heap manipulation routines in the Windows dll ntdll.dll. These flags are FLG_HEAP_ENABLE_TAIL_CHECK, FLG_HEAP_ENABLE_FREE_CHECK, and FLG_HEAP_VALIDATE_PARAMETERS. These flags can be checked using the following code:

mov eax, fs:[30h]
mov eax, [eax+68h]
and eax, 0x70
test eax, eax
jne @DebuggerDetected

In the above example, we again access the PEB and then read the global flags, which include the flags for the heap manipulation routines, from offset 68h of the PEB. The three heap flags are isolated with the mask 70h and then checked to see if a debugger is present.
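
The two PEB checks above read fixed offsets from one structure, which can be modelled in Python on a plain byte buffer. The buffer here is a stand-in built by hand, not real process memory, and the function names are assumptions. The offsets (2 and 68h) and the mask (70h, the sum of the three heap flags 10h, 20h and 40h) follow the 32-bit listings above.

```python
import struct

BEING_DEBUGGED_OFFSET = 0x02   # byte read by the first listing
NT_GLOBAL_FLAG_OFFSET = 0x68   # dword read by the second listing (32-bit PEB)
HEAP_DEBUG_MASK = 0x10 | 0x20 | 0x40  # tail check, free check, validate parameters

def being_debugged(peb: bytes) -> bool:
    """Model of: movzx eax, byte ptr [peb+2] / test eax, eax."""
    return peb[BEING_DEBUGGED_OFFSET] != 0

def heap_debug_flags_set(peb: bytes) -> bool:
    """Model of: mov eax, [peb+68h] / and eax, 70h / jne."""
    (flags,) = struct.unpack_from("<I", peb, NT_GLOBAL_FLAG_OFFSET)
    return (flags & HEAP_DEBUG_MASK) != 0

if __name__ == "__main__":
    fake_peb = bytearray(0x70)          # hypothetical, zero-filled stand-in
    fake_peb[BEING_DEBUGGED_OFFSET] = 1
    struct.pack_into("<I", fake_peb, NT_GLOBAL_FLAG_OFFSET, 0x70)
    print(being_debugged(fake_peb), heap_debug_flags_set(fake_peb))
```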

Checking flags within heap headers such as the ForceFlags is another way to detect whether a debugger is running or not. Here is an example [15]:

mov eax, fs:[30h]
mov eax, [eax+18h] ;process heap
mov eax, [eax+10h] ;heap flags
test eax, eax
jne @DebuggerDetected

The above example shows how the process heap and its flags can be reached through offsets from the PEB. The flags are then tested to see if the ForceFlags were previously set by a running debugger.

Another possible way to detect the debugger is through the use of the NtQueryInformationProcess syscall. This function can be called with ProcessInformationClass set to 7, which refers to ProcessDebugPort, and it will store a non-zero value (typically -1) in the output variable if the process is being debugged. Below is an example [15].

push 0
push 4
push offset isdebugged
push 7 ;ProcessDebugPort
push -1
call NtQueryInformationProcess
test eax, eax
jne @ExitError
cmp isdebugged, 0
jne @DebuggerDetected

In this example, the parameters for the NtQueryInformationProcess syscall are first pushed onto the stack in reverse order. The last parameter, pushed first, is the return length (0 here, since it is not needed). Next is the process information length (4 bytes in this example), followed by the address of isdebugged, the variable that receives the result. Then comes the process information class (7, specifying ProcessDebugPort), and finally the process handle (-1, which refers to the current process). NtQueryInformationProcess is then called with these parameters. Its status is returned in eax, and a non-zero status is treated as an error. The result is placed in isdebugged: if this value is non-zero, then the process is being run under a debugger; if not, then all is well.

Other simple ways of detecting the debugger are checking whether the device list contains the name of a debugger, checking the registry keys for a debugger, and scanning memory for the debugger's code [9].

Another method, similar to the EPO method, is to instruct the PE loader that the entry point of the program is referenced in the Thread Local Storage (TLS) entry in the PE header. This has the effect of causing the code in the TLS to execute first, before the real entry point of the program. Therefore, the TLS code can perform anti-debugging checks before the program even starts [15]. Starting in the TLS also allows the virus to begin execution before the debugger gains control, since some debuggers break on the main entry point of the program [9].

3.4 Checking for single stepping

Another way that malware can detect a debugger is to check for single stepping. This can be done by leaving a value just below the stack pointer and then checking whether the value is still there. When a debugger is single stepping a process, the trap that hands control to the debugger pushes data onto the stack and pops it back off before the next instruction executes, overwriting whatever was stored below the stack pointer. So if the value is no longer there, something other than the running process has been using the stack, which means that the code is being single stepped [1]. Below is a code example of how malware can detect single stepping by using the stack state [9]:

mov bp, sp ; pick stack pointer
push ax ; store any ax mark on the stack
pop ax ; pick the value from the stack
cmp word ptr [bp-2], ax ; compare against stack
jne debug ; if different, debugger detected.

As the comments in this example show, a value is pushed onto the stack and then popped off again, which leaves a copy of it in memory just below the stack pointer. If a debugger is single stepping the code, the word at [bp-2] will differ from the value that was just popped off the stack, and the appropriate action can be taken.
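
The effect can be reproduced with a toy stack model in Python. Everything here is an assumption made for illustration: the stack is a list of 16-bit words, the trap is modelled as three pushes followed by three pops (flags, segment and return address, with made-up values), and a trap is taken after every instruction when the code is traced.

```python
class Stack:
    """Downward-growing stack of 16-bit words; sp is a word index."""
    def __init__(self, size=64):
        self.mem = [0] * size
        self.sp = size

    def push(self, value):
        self.sp -= 1
        self.mem[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 1
        return value

def single_step_trap(stack):
    """Model a single-step trap: three words are pushed, then popped on return."""
    for word in (0xF202, 0x1234, 0x0100):  # hypothetical flags, cs, ip
        stack.push(word)
    for _ in range(3):
        stack.pop()

def stack_check(traced, marker=0xBEEF):
    """Return True if the marker left below the stack pointer was overwritten."""
    stack = Stack()
    bp = stack.sp                 # mov bp, sp
    stack.push(marker)            # push ax
    if traced:
        single_step_trap(stack)
    ax = stack.pop()              # pop ax
    if traced:
        single_step_trap(stack)   # this trap overwrites the word below sp
    return stack.mem[bp - 1] != ax  # cmp word ptr [bp-2], ax (one word = two bytes)

if __name__ == "__main__":
    print(stack_check(traced=False), stack_check(traced=True))
```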

3.5 Checking for slowdown in runtime

By looking for slowdown in the runtime of the program, malicious code can also detect a debugger. A significant slowdown in runtime likely means that the code is being single stepped. So if the difference between the timestamps of two different calls is too great, the malware can act accordingly [1]. The LTTng/LTTV Linux Trace Toolkit gets around the slowdown problem when tracing a virus. This is because LTTng/LTTV is a modular tool that traces the program without adding breakpoints or performing any analysis at the time of execution. It also uses a lockless re-entry mechanism, meaning that it does not lock any portions of the Linux kernel code that the program being traced might need to use, and therefore does not cause the traced program to slow down and wait [4].

3.6 Instruction prefetching

If the malicious code modifies the next instruction in a sequence and the new instruction is executed, then a debugger is running. This is due to instruction prefetching: during uninterrupted execution the original instruction has already been prefetched and is the one that gets executed, so if the new instruction is fetched and executed instead, there was a break in the execution of the process [1].
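
A toy model makes the reasoning easier to follow. The sketch below is an assumption-laden simplification: instructions are strings, the prefetch queue is a copy of memory taken before execution starts, and a single-step break is modelled as refilling the queue from memory before every instruction. Real processors differ in queue depth and in how they treat self-modifying code.

```python
def run(single_stepped):
    """Return the name of the last instruction that actually executed."""
    memory = ["store", "original"]   # instruction 0 overwrites instruction 1
    queue = list(memory)             # both instructions are already prefetched
    executed = []
    for pc in range(len(memory)):
        if single_stepped:
            queue = list(memory)     # a break in execution refills the queue
        instruction = queue[pc]
        if instruction == "store":
            memory[1] = "modified"   # self-modification of the next instruction
        executed.append(instruction)
    return executed[-1]

if __name__ == "__main__":
    print(run(single_stepped=False), run(single_stepped=True))
```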

3.7 Self modifying code

Malware can also modify its own code and execution environment at run time. One example of this is HDSpoof. This malware starts out with exception handlers and then removes them during execution. That way, if anything goes wrong and an exception is thrown by the process during execution, the virus will terminate. It also modifies the exception handlers at other times during execution by either removing or adding exception handlers. Below is an example of HDSpoof removing all exception handlers except for the default one [7].

exception handlers before:
0x77f79bb8 ntdll.dll:executehandler2@20 + 0x003a
0x0041adc9 hdspoof.exe+0x0001adc9
0x77e94809 __except_handler3

exception handlers after:
0x77e94809 __except_handler3

0x41b770: 8b44240c mov eax,dword ptr [esp+0xc]
0x41b774: 33c9 xor ecx,ecx
0x41b776: 334804 xor ecx,dword ptr [eax+0x4]
0x41b779: 334808 xor ecx,dword ptr [eax+0x8]
0x41b77c: 33480c xor ecx,dword ptr [eax+0xc]
0x41b77f: 334810 xor ecx,dword ptr [eax+0x10]
0x41b782: 8b642408 mov esp,dword ptr [esp+0x8]
0x41b786: 648f0500000000 pop dword ptr fs:[0x0]

Below is code in which HDSpoof creates a new exception handler [7].

0x41f52b: add dword ptr [esp],0x9ca
0x41f532: push dword ptr fs:[0x0]
0x41f539: mov dword ptr fs:[0x0],esp

3.8 Overwriting debugger information

Some malware uses techniques that overwrite debugger information and therefore cause either the debugger or the virus itself to function improperly.

By hooking the INT 1 and INT 3 interrupts (INT 3 is the 0xCC opcode byte that debuggers use), malware can cause the debugger to lose its context. This is harmless during normal execution of the virus. Another option is to hook the interrupts and call another interrupt to run the virus code indirectly. Below is the code from the Tequila virus that hooks INT 1.

new_interrupt_one:
push bp
mov bp,sp
cs cmp b[0a],1 ;masm mod. needed
je 0506 ;masm mod. needed
cmp w[bp+4],09b4
ja 050b ;masm mod. needed
push ax
push es
les ax,[bp+2]
cs mov w[09a0],ax ;masm mod. needed
cs mov w[09a2],es ;masm mod. needed
cs mov b[0a],1
pop es
pop ax
and w[bp+6],0feff
pop bp
iret

Normally the hook routine is set to IRET, as it is without a debugger installed. V2Px uses hooks to decrypt its body with INT 1 and INT 3. The INT 1 and INT 3 vectors are used continuously during execution of the code and calculations are done in the interrupt vector table.

Some viruses also clear the contents of the debug registers (DRn) [9]. This can be done in one of two ways. One way is to use the NtGetContextThread and NtSetContextThread syscalls. Another way is to generate an exception, modify the thread context and then resume normal execution with the new context. An example of this is below [15].

push offset handler
push dword ptr fs:[0]
mov fs:[0],esp
xor eax, eax
div eax ;generate exception
pop fs:[0]
add esp, 4
;continue execution
;...
handler:
mov ecx, [esp+0Ch] ;ecx = pointer to thread context
add dword ptr [ecx+0B8h], 2 ;skip div
mov dword ptr [ecx+04h], 0 ;clean dr0
mov dword ptr [ecx+08h], 0 ;clean dr1
mov dword ptr [ecx+0Ch], 0 ;clean dr2
mov dword ptr [ecx+10h], 0 ;clean dr3
mov dword ptr [ecx+14h], 0 ;clean dr6
mov dword ptr [ecx+18h], 0 ;clean dr7
xor eax, eax
ret

The first line of the above example pushes the offset of the handler onto the stack to make sure that its own handler will get control when the exception is thrown. The remaining setup installs the handler at fs:[0] and sets eax to zero by XORing it with itself. The div eax instruction then generates an exception because the divisor, eax, is zero. The handler skips the two-byte divide instruction, clears dr0 through dr3, dr6 and dr7, and sets eax to zero again to indicate that the exception was handled, after which execution resumes.
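
What the handler does to the thread context can be modelled on a byte buffer. The sketch below uses the same offsets as the listing (04h to 18h for the debug registers, 0B8h for the instruction pointer); the buffer is a hand-built stand-in, and the function name and sample values are assumptions for illustration.

```python
import struct

EIP_OFFSET = 0xB8
DEBUG_REGISTER_OFFSETS = (0x04, 0x08, 0x0C, 0x10, 0x14, 0x18)  # dr0-dr3, dr6, dr7

def handler(context: bytearray, skip_bytes=2):
    """Advance the saved instruction pointer and zero the saved debug registers."""
    (eip,) = struct.unpack_from("<I", context, EIP_OFFSET)
    struct.pack_into("<I", context, EIP_OFFSET, (eip + skip_bytes) & 0xFFFFFFFF)
    for offset in DEBUG_REGISTER_OFFSETS:
        struct.pack_into("<I", context, offset, 0)
    return 0  # same role as 'xor eax, eax': continue execution

if __name__ == "__main__":
    context = bytearray(0xC0)  # just large enough for the offsets used here
    struct.pack_into("<I", context, 0x04, 0x00401000)  # hypothetical hardware breakpoint in dr0
    struct.pack_into("<I", context, EIP_OFFSET, 0x00401234)
    handler(context)
    print(hex(struct.unpack_from("<I", context, EIP_OFFSET)[0]))
```

Because the modified context is loaded back into the processor when execution resumes, any hardware breakpoints a debugger had set in these registers are lost.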

3.9 Detaching the debugger thread

Detaching the thread from the debugger can be done with the NtSetInformationThread syscall. Calling this function with ThreadInformationClass set to 0x11 (ThreadHideFromDebugger) will detach the program's thread from the debugger if there is a debugger present. The following code is an example [15]:

push 0
push 0
push 11h ;ThreadHideFromDebugger
push -2
call NtSetInformationThread

In this example, the parameters for NtSetInformationThread are first pushed onto the stack and then the function is called, removing the program's thread from the debugger. The parameters are pushed in reverse order: 0 is passed for the thread information length and for the thread information, 11h is passed for the thread information class (the ThreadHideFromDebugger value), and -2 is passed for the thread handle, which refers to the current thread.

3.10 Decryption

Decryption can be done in several different ways that also protect against debugging. Some decryption depends upon a specific execution path. If this execution path is not followed, due to a debugger being started at a specific point in the program, the value that the decryption algorithm uses may be incorrect. Therefore, the program will not decrypt itself correctly. HDSpoof uses this technique [7].

Some viruses use the stack to decrypt their code. Using a debugger on such a virus causes the decryption to fail, because the stack is used by INT 1 during debugging. One example is the W95/SK virus that decrypts and builds its code on the stack. Another example of this is the Cascade virus that uses the stack pointer register for one of the decryption keys. Below is the code:

lea si, Start ; position to decrypt
mov sp, 0682 ; length of encrypted body
Decrypt:
xor [si], si ; decryption key/counter 1
xor [si], sp ; decryption key/counter 2
inc si ; increment one counter
dec sp ; decrement the other
jnz Decrypt ; loop until all bytes are decrypted
Start: ; Virus body
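
The Cascade loop can be modelled in Python to see how both counters act as keys. This is a sketch of the loop only, under stated assumptions: the buffer, the start address and the length are made-up values, and the buffer must be one byte longer than the length because each step XORs a 16-bit word. Since every step XORs memory with values that do not depend on the data, running the same loop twice restores the original bytes.

```python
def cascade_xor(data: bytes, start: int, length: int) -> bytes:
    """Apply: xor [si], si / xor [si], sp / inc si / dec sp / jnz."""
    if length >= len(data):
        raise ValueError("buffer must be at least length + 1 bytes")
    out = bytearray(data)
    si, sp = start, length
    while sp != 0:
        offset = si - start
        word = out[offset] | (out[offset + 1] << 8)  # little-endian word at [si]
        word ^= si & 0xFFFF                          # key/counter 1
        word ^= sp & 0xFFFF                          # key/counter 2
        out[offset] = word & 0xFF
        out[offset + 1] = (word >> 8) & 0xFF
        si += 1
        sp -= 1
    return bytes(out)

if __name__ == "__main__":
    body = b"example body bytes" + b"\x00"   # hypothetical stand-in data
    length = len(body) - 1
    encrypted = cascade_xor(body, 0x0100, length)
    print(cascade_xor(encrypted, 0x0100, length) == body)
    print(cascade_xor(encrypted, 0x0100, length - 1) == body)
```

Starting the loop with a different counter value produces the wrong bytes, which is why the loop only works when nothing else disturbs the stack pointer.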

The comments in the Cascade listing explain fairly well how the virus uses the stack pointer to decrypt the virus body. The Cryptor virus, on the other hand, stores its encryption keys in the keyboard buffer, which is also destroyed by a debugger. Tequila uses the decryptor's code as the decryption key, so if the decryptor is modified with a debugger the virus will not be decrypted [9]. Below is the Tequila decryption code [13]:

perform_encryption_decryption:
mov bx,0
mov si,0960
mov cx,0960
mov dl,b[si]
xor b[bx],dl
inc si
inc bx
cmp si,09a0
jb 0a61 ;masm mod. needed
mov si,0960
loop 0a52 ;masm mod. needed
ret

the_file_decrypting_routine:
push cs
pop ds
mov bx,4
mov si,0964
mov cx,0960
mov dl,b[si]
add b[bx],dl
inc si
inc bx
cmp si,09a4
jb 0a7e ;masm mod. needed
mov si,0964
loop 0a6f ;masm mod. needed
jmp 0390 ;masm mod. needed
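
The first Tequila routine above can be modelled in Python to show why patching the key bytes breaks decryption. The model assumes, as the text states, that the key bytes at 0960h to 099Fh belong to the decryptor's own code; the image contents are made-up stand-in data and the function name is chosen for the example.

```python
BODY_LEN = 0x960                    # cx: number of bytes processed
KEY_START, KEY_END = 0x960, 0x9A0   # si cycles through these 64 key bytes

def xor_body(image: bytes) -> bytes:
    """XOR bytes 0..095Fh with the repeating key taken from 0960h..099Fh."""
    out = bytearray(image)
    key = image[KEY_START:KEY_END]
    for bx in range(BODY_LEN):
        out[bx] ^= key[bx % len(key)]
    return bytes(out)

if __name__ == "__main__":
    image = bytes(range(256)) * 10          # hypothetical stand-in image
    encrypted = xor_body(image)
    print(xor_body(encrypted) == image)     # intact key region: round trip works

    patched = bytearray(encrypted)
    patched[KEY_START + 5] = 0xCC           # a breakpoint placed inside the key bytes
    print(xor_body(bytes(patched))[:BODY_LEN] == image[:BODY_LEN])
```

One changed key byte corrupts every 64th byte of the body, so even a single breakpoint in the wrong place is enough to ruin the decrypted code.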

Research is being done on new anti-debugging methods that may be used in the future. One such project works on a multiprocessor computer in which one processor is unused while debugging. This new technique uses parallel processing of the decryption code [11].

4. Other Anti-detection Techniques

4.1 Retroviruses

Retroviruses try to disable the anti-virus software. They do this by carrying a list of process names and killing the processes the program finds running. Many retroviruses also take the process off of the startup list so the process no longer starts when the computer boots. This type of malware may also try to starve the anti-virus software of CPU time or prevent the anti-virus software from connecting to the company’s servers to update its database [1].

5. Combining Techniques

The W32.Gobi virus is a polymorphic retrovirus that uses EPO and several anti-debugging techniques. This virus opens a backdoor on TCP port 666 [8].

Simile (also known as Metaphor) is a very well known and complex virus that is approximately 14,000 lines of assembly [9]. This virus uses EPO by looking for the ExitProcess() API call. It is also a metamorphic virus that uses polymorphic decryption [1]. About 90% of its code is spent on the polymorphic decryption. The virus body and polymorphic decryptor are placed in a semi-random place in a newly infected file each time. The first payload of Simile only activates during March, June, September or December. Variants A and B display their message on the 17th of these months. Variant C displays its message on the 18th. The second payload activates on the 14th of May in variants A and B and on the 14th of July in variant C [9].

Ganda is a retrovirus that uses EPO. It examines the list of startup processes and replaces the first instruction of each startup process with a return. This renders any antivirus programs useless [1].

S-TECHNOLOGIES

Thursday, June 17, 2010
