Hello World Of Assembly Language
Chapter 1
Building And Execution
A typical MASM program looks like this -
An asm file needs an assembler (MASM in this case) to build it and a driver program to call it. Below is an example C/C++ driver program.
To build and run both the asm file and the C++ driver in one go, create the following batch file. It takes the asm file name as a parameter.
Then run the batch file as follows -
Working of x86_64 CPU
The Intel x86-64 CPU family is classified as a von Neumann architecture machine. Such a machine has three main building blocks - the CPU, memory and the I/O devices - which are interconnected through a system bus.
The CPU communicates with memory and the I/O devices by placing a numeric value on the address bus to select one of the memory locations or I/O port locations, each of which has a unique numeric address. The data is then transferred over the data bus, while the control bus carries the signals that determine the direction of the transfer.
The CPU contains both general purpose registers and special purpose registers. General purpose registers are used for data manipulation and execution flow, whereas the special purpose registers are intended for debuggers and other system level tools.
These registers are overlaid on each other. A 64-bit register overlays a 32-bit register, which in turn overlays a 16-bit register, and so on. For example, EAX is the low 32 bits of RAX, AX is the low 16 bits of EAX, and AL is the low 8 bits of AX.
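As a rough model of the overlay described above, the following C sketch treats a 64-bit integer as the contents of RAX and derives the narrower views with masks and shifts. The function names are hypothetical and exist only for illustration; the sample value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Narrower views of a 64-bit register value, modelled with masks and shifts. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFULL); } /* bits 0-31 */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)(rax & 0xFFFFULL); }     /* bits 0-15 */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)(rax & 0xFFULL); }        /* bits 0-7  */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)((rax >> 8) & 0xFFULL); } /* bits 8-15 */

/* Model of a write to AL: only bits 0-7 change, the rest of RAX is kept. */
uint64_t write_al(uint64_t rax, uint8_t al) {
    return (rax & ~0xFFULL) | al;
}

/* Usage sketch with an arbitrary sample value. */
void overlay_demo(void) {
    uint64_t rax = 0x1122334455667788ULL;
    assert(view_eax(rax) == 0x55667788u);
    assert(view_ax(rax) == 0x7788u);
    assert(view_ah(rax) == 0x77u);
    assert(view_al(rax) == 0x88u);
    assert(write_al(rax, 0xFF) == 0x11223344556677FFULL);
}
```

Because the narrow registers are only views of the wide one, changing AL also changes what AX, EAX and RAX read back.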
In addition to these registers, some special purpose registers such as the floating point registers are also present. The floating point unit (FPU) registers are named ST(0) to ST(7). A program can't access them the way it accesses general purpose registers; it reaches them through FPU instructions, which treat them as a register stack.
In the 1990s Intel introduced the MMX register set and instructions to support SIMD operations. These registers overlaid the ST(0) to ST(7) registers of the FPU, so an application couldn't use FPU and MMX instructions at the same time. Intel later corrected this by adding the XMM register set.
Later AMD/Intel added 128-bit XMM registers (XMM0 to XMM15) along with the SSE/SSE2 instruction sets. Each register can be treated as one 128-bit, two 64-bit, four 32-bit, eight 16-bit or sixteen 8-bit values. Later they were extended to 256 bits as YMM0 to YMM15, with each XMM register forming the low half of the corresponding YMM register.
The RFLAGS register is a 64-bit register that encapsulates several Boolean values. Some of the interesting flags are - Overflow, Direction, Interrupt, Sign, Zero, Auxiliary Carry, Parity, Carry.
The Overflow, Sign, Zero and Carry flags are extremely valuable and are collectively called the "condition codes". These condition codes let us test the result of previous computations.
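To make the four condition codes concrete, here is a minimal C sketch that computes what an 8-bit addition would leave in the Carry, Zero, Sign and Overflow flags. The struct and function names are assumptions made for this example, and the 8-bit width is chosen only to keep the numbers small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cond_codes { bool carry, zero, sign, overflow; };

/* Condition codes an 8-bit add would produce for a + b. */
struct cond_codes add8_flags(uint8_t a, uint8_t b) {
    unsigned wide = (unsigned)a + (unsigned)b;   /* keep the bit carried out of bit 7 */
    uint8_t result = (uint8_t)wide;
    struct cond_codes f;
    f.carry = wide > 0xFF;                        /* unsigned result did not fit */
    f.zero = result == 0;
    f.sign = (result & 0x80) != 0;                /* copy of the result's top bit */
    /* Signed overflow: both operands have a sign that differs from the result's. */
    f.overflow = ((a ^ result) & (b ^ result) & 0x80) != 0;
    return f;
}

/* Usage sketch: 0x7F + 1 overflows as a signed sum but not as an unsigned one. */
void add8_demo(void) {
    struct cond_codes f = add8_flags(0x7F, 0x01);
    assert(f.overflow && f.sign && !f.carry && !f.zero);
}
```

Carry reports an out-of-range unsigned result and Overflow reports an out-of-range signed result, which is why both exist.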
Memory Subsystem
The memory subsystem holds data such as program variables, constants, machine instructions and other information. Memory is organized into cells, each of which holds a small piece of information and these smaller cells are combined to form larger pieces of information.
x86_64 supports byte-addressable memory, which means the basic unit of the memory is a byte, sufficient to hold a single character or a very small integer value.
Data variables are declared in the ".data" section of the program. Each variable is defined with a data declaration directive, in the form -
where label is the data variable identifier and directive is one of the directives listed below
The question mark (?) tells MASM that the object will not have an explicit value when the program loads into memory. If the variable is to be initialized with a value instead, replace the "?" with a value. For example -
MASM doesn't distinguish between signed and unsigned numbers. It only checks whether the value fits in the number of bits that the directive reserves.
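The "fits or not" rule can be sketched in C as a range check. This is an illustration under the assumption that a value is accepted when it fits either as a signed or as an unsigned number of the given width (for a byte, anything from -128 to 255); the helper name is hypothetical and MASM's exact behaviour should be checked against its documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when value fits in `bits` bits as either a signed or an unsigned number. */
bool fits_in_bits(int64_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 32);                 /* sketch covers byte, word, dword */
    int64_t lowest  = -((int64_t)1 << (bits - 1));   /* most negative signed value */
    int64_t highest = ((int64_t)1 << bits) - 1;      /* largest unsigned value */
    return value >= lowest && value <= highest;
}
```

Under this assumption, both -1 and 255 are acceptable initializers for a byte, and they produce the same bit pattern in memory.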
If there are multiple operands in a data declaration statement, MASM will emit the values to sequential memory locations in the order they appear in the operand field.
Data constants can be declared by using the "=" directive. They can be declared anywhere.
Basic Instructions
Mov Instruction
The mov instruction copies data from one location to another.
The source may be a register, a memory variable or a constant. The destination may be a register or a memory variable. Note that the source and the destination can't both be memory variables.
Both operands must be of the same size. The table below lists all the legal mov instruction combinations.
dup Operator
The dup operator repeats a value in a data declaration. A common use is declaring a character buffer.
The "maxLen dup (?)" operand tells MASM to duplicate the (?) (that is, an uninitialized byte) maxLen times. maxLen is a constant set to 256 by an equate directive (=) at the beginning of the source file.
Add And Sub Instruction
Constant operands are limited to a maximum of 32 bits. If the destination operand is 64 bits, the CPU still allows only a 32-bit immediate source operand, which it sign-extends to 64 bits before the operation.
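The effect of the 32-bit limit can be modelled in C. The sketch below assumes the usual x86-64 behaviour in which the 32-bit immediate is sign-extended to 64 bits; both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widen a 32-bit immediate the way a 64-bit add/sub does: by sign extension. */
int64_t sign_extend_imm32(int32_t imm) {
    return (int64_t)imm;
}

/* True when a 64-bit constant can be written as a sign-extended 32-bit immediate. */
bool encodable_as_imm32(int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Usage sketch: -1 widens to all ones, while 0x80000000 needs more than 32 bits. */
void imm32_demo(void) {
    assert((uint64_t)sign_extend_imm32(-1) == 0xFFFFFFFFFFFFFFFFULL);
    assert(!encodable_as_imm32(0x80000000LL));
}
```

A constant that fails this check has to be loaded into a register first and then added or subtracted from there.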
Lea Instruction
The lea (Load Effective Address) instruction is used to load the address of a memory variable. It is similar to the "&" operator in C.
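For readers coming from C, the comparison with "&" can be made concrete with a small sketch. The variable and function names are invented for this example; the global stands in for a variable declared in the ".data" section.

```c
#include <assert.h>
#include <stdint.h>

int64_t counter = 42;   /* plays the role of a variable in the .data section */

/* Like lea: produce the variable's address without reading its value. */
int64_t *address_of_counter(void) {
    return &counter;
}

/* Like a mov from memory: follow an address to fetch the stored value. */
int64_t load_from(const int64_t *address) {
    assert(address != 0);
    return *address;
}
```

The distinction matters when calling C functions that expect a pointer: they need the address that lea produces, not the value that mov would load.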
Call and Ret Instructions
The ret instruction serves the same purpose in an assembly language program as the return statement in C/C++: it returns control from an assembly language procedure.
A procedure is called in MASM by using the call instruction. It can take a couple of forms.
A procedure begins with a proc directive and ends with a matching endp directive.
Below is an example of a procedure call in MASM.
Calling C/C++ Procedures
Rewriting each and every procedure in assembly is painful. Instead, you can import procedures from C/C++ using the externdef directive -
The externdef directive doesn’t let you specify parameters to pass to the "printf()" procedure, nor does the call instruction provide a mechanism for specifying parameters. Instead, you can pass up to four parameters to the "printf()" function in the x86-64 registers RCX, RDX, R8, and R9.
The "printf()" function requires that the first parameter be the address of a format string. Therefore, you should load RCX with the address of a zero-terminated string prior to calling "printf()". If the format string contains any format specifiers (for example, %d), you must pass appropriate parameter values in RDX, R8, and R9.
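Seen from the C side, the same call looks as follows. This is only a sketch: the function name is hypothetical, and the register assignments in the comments describe what a Windows x64 compiler would do for this call, as outlined in the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Prints three integers and returns the number of characters written. */
int print_three(int a, int b, int c) {
    static const char format[] = "%d %d %d\n";   /* zero-terminated by the compiler */
    /* Windows x64: address of format -> RCX, a -> RDX, b -> R8, c -> R9 */
    return printf(format, a, b, c);
}

/* Usage sketch: "1 2 3\n" is six characters long. */
void print_demo(void) {
    assert(print_three(1, 2, 3) == 6);
}
```

An assembly caller has to do by hand what the compiler does here: load the four registers, then issue the call.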
Jmp Instruction
The jmp instruction unconditionally transfers control to a specified symbol.
Like all MASM symbols, statement labels have two major attributes associated with them: an address (which is the memory address of the machine instruction following the label) and a type. The type is label, which is the same type as a proc directive’s identifier.
Conditional Jump Instructions
Conditional jump instructions depend on one or more of the following four flags in the FLAGS register - Carry, Sign, Overflow and Zero.
To execute a conditional jump, first execute an instruction that affects one or more of the condition code flags.
Not all instructions affect the flags. For example, add, sub, and, or and xor set them, whereas mov, lea and not leave them unchanged.
The cmp Instruction
The cmp instruction has the same syntax as the sub instruction. In fact, it also subtracts the second operand from the first operand and sets the condition code flags based on the result, but it discards the difference and leaves both operands unchanged.
The x86-64 CPUs provide an additional set of conditional jump instructions that allow you to test for comparison conditions.
Some of the instructions are synonyms of each other. For example, "jb" and "jc" both jump when the carry flag (CF) is set to 1. The synonyms exist for convenience and readability: after a cmp instruction, jb is much more meaningful than jc.
The cmp instruction sets the flags only for integer comparisons; it doesn't compare floating point values.
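The relationship between cmp, the flags and the conditional jumps can be modelled in C. The sketch below works on 8-bit operands to keep the values readable; the struct and function names are hypothetical, and je and jl are included alongside jb and jc as further examples of standard x86 conditional jumps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flags { bool cf, zf, sf, of; };

/* Flags that "cmp left, right" would set for 8-bit operands (left - right). */
struct flags cmp8(uint8_t left, uint8_t right) {
    uint8_t diff = (uint8_t)(left - right);   /* computed, then thrown away by cmp */
    struct flags f;
    f.cf = left < right;                       /* unsigned borrow */
    f.zf = diff == 0;
    f.sf = (diff & 0x80) != 0;
    /* Signed overflow: operands differ in sign and the result's sign differs from left's. */
    f.of = ((left ^ right) & (left ^ diff) & 0x80) != 0;
    return f;
}

bool takes_jb(struct flags f) { return f.cf; }           /* unsigned "below" */
bool takes_jc(struct flags f) { return f.cf; }           /* same test as jb */
bool takes_je(struct flags f) { return f.zf; }           /* equal */
bool takes_jl(struct flags f) { return f.sf != f.of; }   /* signed "less than" */

/* Usage sketch: 0x80 is 128 unsigned but -128 signed, so jb and jl disagree. */
void cmp8_demo(void) {
    struct flags f = cmp8(0x80, 0x01);
    assert(!takes_jb(f));
    assert(takes_jl(f));
}
```

Since takes_jb and takes_jc test the same flag, the choice between the two mnemonics is purely about which reads better after the preceding instruction.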
Hello World!
Below is the asm program for printing "Hello World!".
Returning Function Results
In order to coordinate function result return location, MASM uses Microsoft Windows ABI (Application Binary Interface).
The Microsoft ABI specifies that the first four parameters to printf() (or any C++ function, for that matter) must be passed in RCX, RDX, R8, and R9.
The Windows ABI also states that functions (procedures) return integer and pointer values (that fit into 64 bits) in the RAX register. So if some C++ code expects your assembly procedure to return an integer result, you would load the integer result into RAX immediately before returning from your procedure.
Below is an example program that follows the Microsoft Windows ABI.
Microsoft ABI
Along with the asm program, the calling C++ program should also follow the ABI rules. Following are the ABI rules that need to be followed.
Variable Size
It's crucial to keep the size of each data object consistent between the C++ and assembly language programs. The following table lists the common data types and their sizes.
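One way to keep the sizes in agreement on the C/C++ side is to use fixed-width types and let the compiler verify them. The sketch below pairs C types with MASM directives of the same size; the typedef names are invented for this example, and the pairing assumes the usual 1, 2, 4 and 8 byte sizes for byte, word, dword/real4 and qword/real8.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width C types paired with the MASM directive of the same size. */
typedef uint8_t  masm_byte;    /* byte  : 1 byte  */
typedef uint16_t masm_word;    /* word  : 2 bytes */
typedef uint32_t masm_dword;   /* dword : 4 bytes */
typedef uint64_t masm_qword;   /* qword : 8 bytes */
typedef float    masm_real4;   /* real4 : 4 bytes */
typedef double   masm_real8;   /* real8 : 8 bytes */

/* Compilation fails if any size differs from what the assembly side expects. */
static_assert(sizeof(masm_byte)  == 1, "byte must be 1 byte");
static_assert(sizeof(masm_word)  == 2, "word must be 2 bytes");
static_assert(sizeof(masm_dword) == 4, "dword must be 4 bytes");
static_assert(sizeof(masm_qword) == 8, "qword must be 8 bytes");
static_assert(sizeof(masm_real4) == 4, "real4 must be 4 bytes");
static_assert(sizeof(masm_real8) == 8, "real8 must be 8 bytes");
```

Fixed-width types are used here instead of int or long because the sizes of the plain C types vary between compilers and platforms.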
Register Usage
Register usage in an assembly language procedure is also subject to certain Microsoft ABI rules. They are -
Code that calls a function can pass the first four integer arguments to the function in the RCX, RDX, R8, and R9 registers, respectively. Programs pass the first four floating-point arguments in XMM0, XMM1, XMM2, and XMM3.
Registers RAX, RCX, RDX, R8, R9, R10, and R11 are volatile, which means that the values in them can be altered by the function.
XMM0/YMM0 through XMM5/YMM5 are also volatile. The function or procedure can alter the values in these registers.
RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are nonvolatile registers. A procedure/function must preserve these registers’ values across a call. If a procedure modifies one of these registers, it must save the register’s value before the first such modification and restore the register’s value from the saved location prior to returning from the function/procedure.
Programs that use the x86-64’s floating-point coprocessor instructions must preserve the value of the floating-point control word across procedure calls. Such procedures should also leave the floating-point stack cleared.
Any procedure/function that uses the x86-64’s direction flag must leave that flag cleared upon return from the procedure/function.
C++ also expects function return values to appear in one of two places -
Integer results come back in the RAX register. If the return type is smaller than 64 bits, the upper bits of the register are undefined, i.e. if the value is 16 bits, it is stored in bits 0-15 and bits 16-63 contain garbage.
Floating point results come back in XMM0 register.
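The rule about undefined upper bits can be illustrated with a short C sketch that treats a 64-bit integer as the contents of RAX after a call. The helper names and the sample register value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 16-bit result out of a 64-bit register image, ignoring bits 16-63. */
uint16_t result_as_u16(uint64_t rax) {
    return (uint16_t)rax;
}

/* Read a 32-bit result out of a 64-bit register image, ignoring bits 32-63. */
uint32_t result_as_u32(uint64_t rax) {
    return (uint32_t)rax;
}

/* Usage sketch: the upper bits hold leftovers, only the low bits are the result. */
void result_demo(void) {
    uint64_t rax = 0xDEADBEEF00001234ULL;
    assert(result_as_u16(rax) == 0x1234u);
    assert(result_as_u32(rax) == 0x00001234u);
}
```

Comparing the full 64-bit register against an expected small value would fail here, which is why the caller must look only at the bits that belong to the return type.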
Data Objects
Microsoft ABI requires all data to be aligned on a natural boundary for that data object. A natural boundary is an address that is a multiple of the object’s size (up to 16 bytes). Therefore, if you intend to pass a word/sword, dword/sdword, or qword/sqword value to a C++ procedure, you should attempt to align that object on a 2-, 4-, or 8-byte boundary, respectively.
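The natural-boundary rule reduces to simple arithmetic on addresses. The sketch below shows a check and a round-up helper; both names are hypothetical, and the helpers assume sizes that are powers of two up to 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for the object sizes this sketch supports: 1, 2, 4, 8 or 16 bytes. */
bool is_supported_size(uint64_t size) {
    return size == 1 || size == 2 || size == 4 || size == 8 || size == 16;
}

/* True when `address` is a multiple of the object's size. */
bool is_naturally_aligned(uint64_t address, uint64_t size) {
    assert(is_supported_size(size));
    return (address & (size - 1)) == 0;
}

/* First address at or after `address` that is a multiple of `size`. */
uint64_t align_up(uint64_t address, uint64_t size) {
    assert(is_supported_size(size));
    return (address + size - 1) & ~(size - 1);
}
```

For instance, a dword at address 0x1002 is misaligned, and the next naturally aligned address for it is 0x1004.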
When calling code written in a Microsoft ABI–aware language, you must ensure that the stack is aligned on a 16-byte boundary before issuing a call instruction. This can severely limit the usefulness of the push and pop instructions. If you use the push instructions to save a register’s value prior to a call, you must make sure you push two (64-bit) values, or otherwise make sure the RSP address is a multiple of 16 bytes, prior to making the call.
Structures and Arrays
The Microsoft ABI expects fields of a structure to be aligned on their natural size: the offset from the beginning of the structure to a given field must be a multiple of the field’s size. On top of this, the whole structure must be aligned at a memory address that is a multiple of the size of the largest object in the structure (up to 16 bytes).
The entire structure’s size must be a multiple of the largest element in the structure (you must add padding bytes to the end of the structure to appropriately fill out the structure’s size).
The Microsoft ABI expects arrays to begin at an address in memory that is a multiple of the element size. For example, if you have an array of 32-bit objects, the array must begin on a 4-byte boundary.
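A C compiler targeting x86-64 applies the same layout rules, so a structure declared in C shows where the padding goes. The structure below is a made-up example; the offsets in the comments are what mainstream x86-64 compilers produce, and the static assertions check the rules themselves rather than specific numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sample {
    uint8_t  tag;     /* offset 0 */
    uint32_t count;   /* offset 4, after 3 padding bytes */
    uint16_t id;      /* offset 8 */
    uint64_t total;   /* offset 16, after 6 padding bytes */
};

/* Each field offset is a multiple of the field's size. */
static_assert(offsetof(struct sample, count) % sizeof(uint32_t) == 0, "count misaligned");
static_assert(offsetof(struct sample, id)    % sizeof(uint16_t) == 0, "id misaligned");
static_assert(offsetof(struct sample, total) % sizeof(uint64_t) == 0, "total misaligned");

/* The total size is a multiple of the largest field's size. */
static_assert(sizeof(struct sample) % sizeof(uint64_t) == 0, "size not padded");
```

An assembly definition of the same structure has to insert the same padding bytes by hand, otherwise the two sides disagree about where each field lives.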
Activation Records
The caller passes the first four parameters in registers rather than on the stack (though it must still reserve storage on the stack for those parameters).
The caller must reserve (at least) 32 bytes of parameter data on the stack, even if there are fewer than five parameters (plus 8 bytes for each additional parameter if there are five or more parameters).
Parameters are always 8-byte values.
RSP must be 16-byte-aligned immediately before the call instruction pushes the return address onto the stack.
Array Parameters
Every procedure/function parameter must consume exactly 64 bits. If a data object is smaller than 64 bits, the high-order (HO) bits of the parameter value (the bits beyond the actual parameter’s native size) are undefined (and not guaranteed to be zero). Procedures should access only the actual data bits for the parameter’s native type and ignore the HO bits.
If a parameter’s native type is larger than 64 bits, the Microsoft ABI requires the caller to pass the parameter by reference rather than by value.
Although the Microsoft calling convention passes the first four parameters in registers, it still requires the caller to allocate storage on the stack for these parameters (shadow storage). In fact, the Microsoft calling convention requires the caller to allocate storage for four parameters on the stack even if the procedure doesn’t have four parameters.
The magic instructions, i.e. "sub rsp,48" on entry and "add rsp,48" before returning, allocate and release storage for local variables and for all the parameter space of the procedures being called, while keeping the stack 16-byte-aligned. However, if you use this trick to allocate storage for your procedures’ parameters, you will not be able to use the push instructions to move the data onto the stack. The storage has already been allocated on the stack for the parameters; you must use mov instructions to copy the data onto the stack (using the [RSP+constant] addressing mode) when copying the fifth and greater parameters.
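Where a constant such as 48 comes from can be worked out with a little arithmetic. The C sketch below is one way to compute it from the rules in this section: at least four 8-byte parameter slots, 8 bytes per parameter, and a 16-byte-aligned RSP at the point of the next call. The helper name and its parameters are hypothetical, and the sketch assumes RSP sits 8 bytes past a 16-byte boundary on entry because the call instruction has just pushed the return address.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes to subtract from RSP in a procedure prologue (hypothetical helper).
 *   call_params  - largest parameter count of any procedure this one calls
 *   local_bytes  - storage wanted for local variables
 *   pushed_bytes - bytes already pushed since entry (for example 8 after push rbp)
 */
size_t frame_size(size_t call_params, size_t local_bytes, size_t pushed_bytes) {
    assert(pushed_bytes % 8 == 0);
    size_t slots = call_params < 4 ? 4 : call_params;   /* shadow storage: at least 4 slots */
    size_t size = slots * 8 + local_bytes;
    size = (size + 7) & ~(size_t)7;                     /* keep RSP a multiple of 8 */
    /* The return address (8 bytes) plus pushes plus this frame must total a multiple of 16. */
    if ((8 + pushed_bytes + size) % 16 != 0)
        size += 8;
    return size;
}

/* Usage sketch: one push, four parameter slots and 16 bytes of locals give 48. */
void frame_demo(void) {
    assert(frame_size(4, 16, 8) == 48);
}
```

Under these assumptions, the same procedure without the initial push would need 56 bytes instead, because the extra 8 bytes are what restores the 16-byte alignment.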