Advanced Static Analysis

Introduction

Advanced static malware analysis means tearing a malware sample down into its most basic (assembly) form using disassemblers and decompilers. We will reverse engineer the binaries and recreate their code as close as possible to the original source code.

Assembly Basics

Assembly language is a low-level programming language that is specific to a particular computer architecture, in contrast to most high-level programming languages, which are generally portable across multiple systems. Assembly language is converted into executable machine code by a utility program called an assembler, such as NASM or MASM.

For more details, refer to the assembly books in the Book-collection section.

Decompiling And Disassembling Malware

Tools used - Cutter

Open Cutter and select the executable -

Now click on Open. A new window pops up.

No need to change anything here. Just click on OK. After some time we are greeted with a screen showing some familiar details about the malware such as format, hashes, libraries, etc.

At the bottom we have tabs for the earlier static analysis methods, such as -

Clicking on Disassembly, we find a list of functions called upon by the malware on the left and the assembly code on the right.

After scrolling down on the left sidebar, we stumble upon the main function of the program.

We also see the Graph tab at the bottom. It shows the malware's program flow as a graph.

Looking closely at the graph, we can see the program flow of the malware -

It creates a file and reaches out to http:__ssl_6582datamanager.helpdeskbros.local_favicon.ico (Cutter replaces characters such as / and - with underscores in string labels). Depending on whether the file is received, the value left in the eax register decides which branch runs next.

If the file is returned, it executes the branch below, which combines favicon.ico with the created exe file; this in turn starts a bind shell on port 5555.

If the web server is unreachable, it deletes itself and the created exe file, as shown below.

The Decompiler option takes all the assembly information and tries to recreate code that is as close to the original source as possible.
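The branching described above can be modelled in a few lines of Python. This is a hand-written sketch of the flow as read from the graph, not decompiler output; the function name and the step descriptions are made up for illustration, and nothing here touches the network or the file system.

```python
def dropper_flow(download_succeeded: bool) -> list[str]:
    """Return the ordered steps of the sample, as read from the graph view."""
    steps = ["create file", "request favicon.ico"]
    if download_succeeded:
        # Branch taken when the file is returned.
        steps.append("combine favicon.ico with the created exe")
        steps.append("start bind shell on port 5555")
    else:
        # Branch taken when the web server is unreachable.
        steps.append("delete self and the created exe")
    return steps


for outcome in (True, False):
    print(outcome, dropper_flow(outcome))
```

Writing the flow down like this is a handy way to check your reading of the graph against what the decompiler shows.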

More on x86 CPU architecture

For a binary to execute on the x86 architecture, there are three things that should be taken into account -

CPU instructions
Memory registers
Stack

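x86 assembly in Intel syntax is written with the instruction first, followed by the destination and then the source. (Little endian is a separate concept: it describes byte order in memory, where the least significant byte of a value is stored at the lowest address.) For example, in -

	MOV R1,R0

MOV is the operation being performed. It copies the data from R0 (the source) into R1 (the destination). R0 and R1 are placeholder register names here.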
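Since the term little endian comes up a lot when reading disassembly, here is a small sketch of what it means for bytes in memory. It uses Python's standard struct module; the value 0x12345678 is an arbitrary example.

```python
import struct

value = 0x12345678

# "<I" packs an unsigned 32-bit integer in little-endian byte order,
# ">I" packs the same integer in big-endian byte order.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print(little.hex())  # 78563412
print(big.hex())     # 12345678
```

On x86 the little-endian layout is the one you will see in a hex dump, so a 32-bit value reads "backwards" byte by byte.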

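For jumping, the JMP instruction is used. JMP always jumps; logical branching uses conditional jumps such as JE or JNE, which jump only when the flags set by an earlier instruction match.

The stack is a special region of memory that works in last-in, first-out order, i.e. the most recently added item is the first one to be removed. The stack grows downward, i.e. new data is added to the stack at lower addresses.

PUSH is used to add data to the stack. POP is used to remove the most recently pushed data, which sits at the lowest address in use.

The CALL instruction is used to call subroutines. It pushes the return address onto the stack and jumps to the subroutine.

The RET instruction is used to return from the subroutine to its caller. It pops the return address off the stack and jumps to it; a return value is conventionally left in EAX.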
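To make the stack behaviour concrete, here is a toy model in Python. It is only a sketch: the class and function names, the starting address 0x1000 and the example addresses are all made up, and a real CPU does much more than this. It assumes 32-bit mode, so each slot is 4 bytes.

```python
class Stack32:
    """Toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, top: int = 0x1000):
        self.esp = top   # stack pointer
        self.mem = {}    # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                        # move to a lower address first
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.mem.pop(self.esp)       # read the most recent item
        self.esp += 4                        # then release the slot
        return value


def call(stack: Stack32, return_address: int, target: int) -> int:
    """CALL: save where to come back to, then continue at the target."""
    stack.push(return_address)
    return target                            # new instruction pointer


def ret(stack: Stack32) -> int:
    """RET: continue at the address saved by CALL."""
    return stack.pop()                       # new instruction pointer


s = Stack32()
s.push(1)
s.push(2)
print(hex(s.esp))  # 0xff8
print(s.pop())     # 2
```

Note how the second PUSH ends up at a lower address than the first, and how POP hands back the last value pushed.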

Registers -

EAX is the accumulator register.
EDX is the data register.
EBX is the base register.
ESP is the extended stack pointer.
EBP is the extended base pointer.
EIP is the extended instruction pointer.

Below is a brief overview of the above concepts -

Memory Layout

Memory is simply an array of bytes, each byte having its own address. When a program is executed, the operating system allocates a chunk of memory to the program. That memory (called address space) is divided into different segments as shown below:

Memory layout of a process

The text section stores the program executable. When you compile a C program, the compiler converts your code to 0s and 1s, which represent instructions that the CPU will execute. Those 0s and 1s are going to be loaded into this text section when you run the program.
The data section stores initialized data (i.e. global variables that have been initialized).
The heap section is memory that you can dynamically reserve by calling malloc. The heap typically grows upwards, which means it grows toward larger memory addresses.

Keep this memory layout in mind. We will come back to it later when we start programming in assembly.

Registers

It turns out that modern computers require more than physical memory to operate. Inside the CPU, there exists a small piece of memory called registers. Registers are extremely fast, because they can be directly accessed by the CPU. Modern x86–64 processors have 16 general-purpose 64-bit registers, whose names can be overwhelming to understand at first. So I’ll provide some historical context to help you understand them better. But first, the overall layout of the x86–64 registers:

Layout of x86–64 registers

Okay. Ignore all the descriptions on the right. In fact, ignore the entire image above. Don’t try to understand it right now. I put it purely for reference. We can come back to this later.

8-bit registers

The C register was the counter register, used to store counts just like the modern counter variables.
The D register stands for the data register. It was used to store the data of most I/O operations.

The sign bit (S) will be set to 1 if the result of the previous operation is negative, and 0 if the result is non-negative.
The zero bit (Z) will be set to 1 if the result of the previous operation is zero, and 0 if the result is non-zero, and so on.
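The two flag bits above can be sketched in Python as follows. The function name is made up, and the sketch assumes an 8-bit result where the top bit is the sign, as in two's complement arithmetic.

```python
def flags_after(result: int, bits: int = 8) -> dict:
    """Compute the sign (S) and zero (Z) bits for a result of the given width."""
    result &= (1 << bits) - 1               # keep only the low `bits` bits
    return {
        "S": result >> (bits - 1),          # top bit set means negative
        "Z": int(result == 0),              # 1 only when every bit is zero
    }


print(flags_after(5 - 5))  # {'S': 0, 'Z': 1}
print(flags_after(3 - 5))  # {'S': 1, 'Z': 0}
```

Conditional jumps look at exactly these kinds of bits to decide whether to branch.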

And then we have some special registers, like the stack pointer (SP), and the instruction pointer (IP) or the program counter, which we’ll get back to in a moment.

16-bit registers

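For example, if I have 0100 1101 stored in AX (shortened for readability, so each 4-bit group here stands in for one full byte),

AH (the high byte) would store 0100,
AL (the low byte) would store 1101,
AX would represent the entire 0100 1101.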

Same goes for the BX, CX, and DX registers. At this point, I should introduce two new terms. You are going to hear them a lot more often now: a byte, which just means 8 bits; and a word, which means 16 bits.
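With real widths, a word is built from two bytes by shifting the high byte up by 8 bits. Here is a minimal sketch; the function name and the example bytes 0x4D and 0xE2 are chosen only for illustration.

```python
def make_ax(ah: int, al: int) -> int:
    """Combine the high byte (AH) and the low byte (AL) into the word AX."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


ax = make_ax(0x4D, 0xE2)
print(hex(ax))  # 0x4de2
```

Changing AL or AH therefore changes AX as well, because they are the same storage viewed at different widths.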

The 8086 also contains a couple of new word registers and flag bits, among them:

the SI (Source Index) register, used as a pointer to a source in stream operations
the DI (Destination Index) register, used as a pointer to a destination in stream operations
the BP (Base Pointer) register, used as a pointer to the base of a stack frame (we’ll see examples of this)
the SP (Stack Pointer) register — okay, you have seen this one before, and it’s used as a pointer to the current position in the stack (we’ll also see examples of this soon)

Here is a picture summary of what we learned so far.

Don’t worry about the segment registers yet. We’ll probably never touch them in your class.

If you’re feeling pretty overwhelmed at this point, stop reading. Go take a break. Don’t look at the diagram above, but look at the diagram below instead, because things are about to get interesting.

32-bit registers

Modern computers work on at least 32-bit (long or dword, short for double word) registers. Same concept as before: EAX (which stands for extended AX) refers to the entire 32-bit value. If you want to access the lower word of EAX, you can still use AX. Sadly, there is no register name for the upper word of EAX this time; you have to shift it down to read it.

To give an example, suppose EAX stores 1100 0100 1110 0010 (shortened for readability, with each 4-bit group standing in for one byte),

AL would be 0010,
AH would be 1110,
AX would be 1110 0010,
EAX would be 1100 0100 1110 0010.
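The same breakdown with full 32-bit widths can be sketched with masks and shifts. The function name and the value 0x1234C4E2 are made up for the example.

```python
def split_eax(eax: int) -> dict:
    """Show the sub-registers that alias the low part of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF                        # lower word
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,              # high byte of AX
        "AL": ax & 0xFF,                     # low byte of AX
        "upper word": (eax >> 16) & 0xFFFF,  # has no register name of its own
    }


for name, value in split_eax(0x1234C4E2).items():
    print(f"{name:>10} = {value:#x}")
```

The upper word only becomes visible after the shift, which matches the point that there is no direct name for it.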

It is important that you are familiar with all the registers you have learned so far, so study the diagram above!

64-bit registers

Here’s the fun part: if I show you the image I told you to skip in the beginning, it kinda starts to make sense.

x86–64 registers

Yes, that’s what 64-bit (qword, short for quad word) registers look like. You’ll notice that you can now access the entire 64-bit value with RAX (that R just stands for register, I guess they ran out of names…). EAX, AX, AH and AL are still there to maintain backward compatibility. Same goes for all the other registers.

You also have additional general-purpose registers, R8 to R15, which can be used to store anything you like. In fact, this whole time, you can store anything you like in any register — like you don’t need to store counter values in RCX, because RCX is just a regular ol’ piece of memory!

Assembly And Windows API

Going back to Cutter and the dropper malware sample -

Two arguments, argc and argv, are passed implicitly into the main function. argc is the number of command-line arguments, and argv is the array of the argument strings themselves, which are passed along to the malware.
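As a rough analogy, here is how argc and argv relate to each other, sketched in Python. The function name and the example command line are made up; in a C program the first entry of argv is conventionally the program name.

```python
def describe_args(argv: list[str]) -> tuple[int, list[str]]:
    """Return (argc, argv) for a command line, the way main would see them."""
    argc = len(argv)   # argc is just the number of entries in argv
    return argc, argv


argc, argv = describe_args(["sample.exe", "first", "second"])
print(argc)     # 3
print(argv[0])  # sample.exe
```

In a real Python script, sys.argv plays the role of argv and len(sys.argv) plays the role of argc.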

We see that some variables are declared at the start. As we don't know anything about them yet, they are ignored for now.

The base pointer is pushed onto the stack. This is important because it saves the caller's stack frame so that it can be restored before returning. Without it, the caller's frame would be lost once the function sets up its own, and the program would likely crash after returning.
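The effect of saving and restoring the base pointer can be traced with a small simulation. This is a sketch under assumptions: it models a typical 32-bit prologue and epilogue (push ebp / mov ebp, esp / ... / mov esp, ebp / pop ebp / ret), and the function name, addresses and local variable size are made up.

```python
def simulate_call(esp: int, ebp: int, return_address: int, local_bytes: int = 8):
    """Trace ESP and EBP through one call and return; gives (esp, ebp, eip)."""
    stack = {}

    # call <function>: push the return address
    esp -= 4
    stack[esp] = return_address

    # push ebp: save the caller's base pointer
    esp -= 4
    stack[esp] = ebp

    # mov ebp, esp: start the callee's own frame
    ebp = esp

    # sub esp, N: reserve room for local variables
    esp -= local_bytes

    # mov esp, ebp: throw the locals away
    esp = ebp

    # pop ebp: restore the caller's base pointer
    ebp = stack[esp]
    esp += 4

    # ret: pop the return address into the instruction pointer
    eip = stack[esp]
    esp += 4

    return esp, ebp, eip


esp, ebp, eip = simulate_call(0x1000, 0x1100, 0x401005)
print(hex(esp), hex(ebp), hex(eip))  # 0x1000 0x1100 0x401005
```

After the return, ESP and EBP hold exactly what the caller had before the call, which is the whole point of the push at the top of the function.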

Here a call to the API InternetOpenW is being made. Its first argument is the user agent string, which Cutter shows as Mozilla_5.0. The zeros are the remaining arguments passed to the API. This is consistent with the documentation -->

HINTERNET InternetOpenW(
  [in] LPCWSTR lpszAgent,
  [in] DWORD   dwAccessType,
  [in] LPCWSTR lpszProxy,
  [in] LPCWSTR lpszProxyBypass,
  [in] DWORD   dwFlags
);
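In the disassembly the arguments appear in reverse order compared with the prototype. The sketch below shows why, assuming the sample is 32-bit code using the usual Windows API calling convention, where arguments are pushed from right to left. The function name is made up, and str.Mozilla_5.0 is assumed to be the label Cutter gives the user agent string.

```python
def stdcall_listing(function: str, args: list) -> list[str]:
    """Build the push/call sequence for arguments given in prototype order."""
    listing = [f"push {arg}" for arg in reversed(args)]  # last argument first
    listing.append(f"call {function}")
    return listing


# Arguments in prototype order: lpszAgent, dwAccessType, lpszProxy,
# lpszProxyBypass, dwFlags.
for line in stdcall_listing("InternetOpenW", ["str.Mozilla_5.0", 0, 0, 0, 0]):
    print(line)
```

So the four zeros are pushed first and the user agent string last, just before the call.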

After that there is a call to a function -

Clicking on it, we get a whole graph of that function-->

Returning to the main program, there is another call to an API - URLDownloadToFile.

HRESULT URLDownloadToFile(
             LPUNKNOWN            pCaller,
             LPCTSTR              szURL,
             LPCTSTR              szFileName,
  _Reserved_ DWORD                dwReserved,
             LPBINDSTATUSCALLBACK lpfnCB
);
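URLDownloadToFile returns an HRESULT, which ends up in EAX and is what the sample checks before choosing a branch. Here is a minimal sketch of how an HRESULT is interpreted, mirroring the logic of the Windows SUCCEEDED macro. The Python function name is made up; S_OK and INET_E_DOWNLOAD_FAILURE are the documented return values for this API.

```python
S_OK = 0x00000000
INET_E_DOWNLOAD_FAILURE = 0x800C0008


def succeeded(hresult: int) -> bool:
    """True when the HRESULT's top (severity) bit is clear, i.e. no failure."""
    return (hresult & 0x80000000) == 0


print(succeeded(S_OK))                     # True
print(succeeded(INET_E_DOWNLOAD_FAILURE))  # False
```

In the disassembly this check usually shows up as a test on eax followed by a conditional jump.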