Hello World Of Assembly Language

Chapter 1

Building And Execution

A typical MASM program looks like this -

; The ".code" directive tells MASM that the statements following
; this directive go in the section of memory reserved for machine
; instructions (code).

    .code
 
; Here is the "main" function.

main PROC

; Machine instructions go here

     ret ; Returns to caller

main ENDP

; The END directive marks the end of the source file.

     END

The asm files in these notes are not standalone programs. They are assembled with an assembler (MASM in this case) and linked with a driver program that calls into the assembly code. Below is an example C++ driver program.

// asm_driver.cpp

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// The extern "C" block prevents "name mangling" by the C++ compiler.

extern "C"
{
    // asmMain is the assembly language code's "main program":
        
    void asmMain( void );
    
    // getTitle returns a pointer to a string of characters 
    // from the assembly code that specifies the title of that 
    // program (that makes this program generic and usable
    // with a large number of sample programs in "The Art of 
    // 64-bit Assembly Language").
    
    char *getTitle( void );
    
    // C++ function that the assembly
    // language program can call:
    
    int readLine( char *dest, int maxLen );
    
};

int readLine( char *dest, int maxLen )
{
    // Note: fgets returns NULL if there was an error, else 
    // it returns a pointer to the string data read (which 
    // will be the value of the dest pointer).
    
    char *result = fgets( dest, maxLen, stdin );
    if( result != NULL )
    {
        // Wipe out the newline character at the end of the string,
        // if there is one (a full buffer or end-of-file may leave none)
        
        int len = (int) strlen( result );
        if( len > 0 && dest[ len - 1 ] == '\n' )
        {
            dest[ len - 1 ] = 0;
        }
        return len;
    } 
    return -1; // If there was an error.
}

int main(void)
{
    // Get the assembly language program's title:
    
    try
    {
        char *title = getTitle();
            
        printf( "Calling %s:\n", title );
        asmMain();
        printf( "%s terminated\n", title );
    }
    catch(...)
    {
        printf
        ( 
            "Exception occurred during program execution\n"
            "Abnormal program termination.\n"
        );
    }
    
}

To assemble the asm file and compile the C++ driver in one step, create the following batch file. Pass the name of the asm file (without the .asm extension) as a parameter to the batch file.

rem asm.bat
@echo off
ml64 /nologo /c /Zi /Cp %1.asm
cl /nologo /O2 /Zi /utf-8 /EHa /Fe%1.exe asm_driver.cpp %1.obj

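Then build and run the program from a command prompt where ml64 and cl are on the PATH. Assuming the batch file is saved as asm.bat and the source file is listing.asm -

asm listing     // Assemble listing.asm, compile the driver and link listing.exe
listing         // Run the program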

Working of x86_64 CPU

The Intel x86-64 CPU family is classified as a von Neumann architecture machine. A von Neumann system contains three main building blocks - the CPU, memory and the I/O devices - which are interconnected through a system bus.
The CPU communicates with memory and the I/O devices by placing a numeric value on the address bus to select one of the memory locations or I/O port locations, each of which has a unique numeric address. The data is then transferred over the data bus, while the control bus carries signals that determine the direction of the data transfer.
The CPU contains both general-purpose registers and special-purpose registers. General-purpose registers are used for data manipulation and execution flow, whereas many of the special-purpose registers are intended for operating systems, debuggers and other system-level tools.

These registers overlay one another. Each 64-bit register (for example RAX) contains a 32-bit register (EAX) in its low half, which in turn contains a 16-bit register (AX), whose two bytes are the 8-bit registers AH and AL.
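The overlay can be modelled with plain integer arithmetic. The following is a minimal C++ sketch, not CPU or MASM code; the helper names (low32, write16 and so on) are hypothetical and exist only for this illustration. It also shows one x86-64 rule worth remembering: writing a 32-bit sub-register clears the upper 32 bits, while writing a 16-bit sub-register leaves the rest of the register alone.

```cpp
#include <cassert>
#include <cstdint>

// Read the sub-registers that overlay the low end of a 64-bit register.
constexpr std::uint32_t low32(std::uint64_t r) { return static_cast<std::uint32_t>(r); }     // like EAX
constexpr std::uint16_t low16(std::uint64_t r) { return static_cast<std::uint16_t>(r); }     // like AX
constexpr std::uint8_t  low8 (std::uint64_t r) { return static_cast<std::uint8_t>(r); }      // like AL
constexpr std::uint8_t  high8(std::uint64_t r) { return static_cast<std::uint8_t>(r >> 8); } // like AH

// Writing a 16-bit sub-register leaves the upper 48 bits untouched.
constexpr std::uint64_t write16(std::uint64_t r, std::uint16_t v) {
    return (r & ~0xFFFFull) | v;
}

// Writing a 32-bit sub-register clears the upper 32 bits (x86-64 rule),
// so the old register value does not matter.
constexpr std::uint64_t write32(std::uint64_t /*old_value*/, std::uint32_t v) {
    return v;
}

// Small usage example.
inline void self_check() {
    const std::uint64_t rax = 0x1122334455667788ull;
    assert(low16(rax) == 0x7788);
    assert(write16(rax, 0xABCD) == 0x112233445566ABCDull);
}
```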
In addition to these registers, the CPU has special-purpose register sets such as the floating-point unit (FPU) registers, named ST(0) to ST(7). A program can't access these the way it accesses the general-purpose registers; they are organized as a stack and are manipulated through FPU instructions.
In the 1990s Intel introduced the MMX register set and instructions to support SIMD operations. These registers overlaid the ST(0) to ST(7) registers of the FPU, so applications couldn't use the FPU and MMX instructions at the same time. Intel later corrected this by adding the separate XMM register set.
The XMM registers (XMM0 to XMM15) are 128 bits wide and arrived with the SSE/SSE2 instruction sets. Each register can be treated as one 128-bit, two 64-bit, four 32-bit or eight 16-bit values. The registers were later extended to 256 bits as YMM0 to YMM15, with each XMM register forming the low 128 bits of the corresponding YMM register.
The RFLAGS register is a 64-bit register that encapsulates several Boolean values. Some of the interesting flags are - Overflow, Direction, Interrupt, Sign, Zero, Auxiliary Carry, Parity, Carry.
The Overflow, Sign, Zero and Carry flags are especially valuable and are collectively called the "condition codes". These condition codes let us test the result of previous computations.
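To make the four condition codes concrete, here is a small C++ sketch that computes the flags a 32-bit add would produce. It is a model for illustration only; the Flags struct and the function name are assumptions made for this example, not part of MASM or any library.

```cpp
#include <cassert>
#include <cstdint>

struct Flags { bool carry, overflow, sign, zero; };

// Compute the condition codes that a 32-bit "add a, b" would produce.
inline Flags flags_after_add32(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t sum = a + b;          // wraps modulo 2^32
    Flags f{};
    f.carry = sum < a;                        // unsigned overflow
    f.sign  = (sum >> 31) != 0;               // top bit of the result
    f.zero  = sum == 0;
    // Signed overflow: both operands share a sign that differs from the result's.
    f.overflow = ((~(a ^ b) & (a ^ sum)) >> 31) != 0;
    return f;
}

// Small usage example: 0x7FFFFFFF + 1 overflows as a signed sum only.
inline void self_check() {
    const Flags f = flags_after_add32(0x7FFFFFFFu, 1u);
    assert(f.overflow && f.sign && !f.carry && !f.zero);
}
```

Note that Carry and Overflow answer different questions: Carry reports that the unsigned result did not fit, Overflow that the signed result did not fit.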

Memory Subsystem

The memory subsystem holds data such as program variables, constants, machine instructions and other information. Memory is organized into cells, each of which holds a small piece of information and these smaller cells are combined to form larger pieces of information.
x86_64 supports byte-addressable memory, which means the basic unit of the memory is a byte, sufficient to hold a single character or a very small integer value.
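As a rough illustration of byte-addressable memory, the C++ sketch below models memory as an array of one-byte cells and stores a 32-bit value into four consecutive cells. It writes the low-order byte first, which is the (little-endian) order x86-64 uses. The Memory type and the store32/load32 helpers are hypothetical names for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A tiny model of byte-addressable memory: every cell holds one byte.
using Memory = std::array<std::uint8_t, 16>;

// Store a 32-bit value into four consecutive cells, low-order byte first.
inline void store32(Memory& mem, std::size_t addr, std::uint32_t value) {
    assert(addr + 4 <= mem.size());
    for (std::size_t i = 0; i < 4; ++i)
        mem[addr + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Combine four consecutive cells back into a 32-bit value.
inline std::uint32_t load32(const Memory& mem, std::size_t addr) {
    assert(addr + 4 <= mem.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(mem[addr + i]) << (8 * i);
    return value;
}
```

Storing 0x11223344 at address 4 puts 0x44 in cell 4 and 0x11 in cell 7; the neighbouring cells are not touched.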
Data variables are defined in the data section, which begins with the ".data" directive. Each variable is declared with one of a set of data declaration directives, in the form -

label directive ?

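where label is the data variable identifier and directive is one of the data declaration directives, such as byte, word, dword and qword (unsigned integers), sbyte, sword, sdword and sqword (signed integers), or real4, real8 and real10 (floating-point values).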

The question mark (?) tells MASM that the object will not have an explicit value when the program loads into memory. If the variable is to be initialized with a value instead, replace the "?" with a value. For example -

;; For an integer value
testing sdword -1

;; For a string
strVarName byte 'String of characters', 0

MASM doesn't distinguish between signed and unsigned values when checking an initializer. It only checks whether the value fits in the size given by the directive.
If there are multiple operands in a data declaration statement, MASM will emit the values to sequential memory locations in the order they appear in the operand field.
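The "does it fit" rule can be sketched in C++ as a range check. This is only a model and rests on an assumption: that an n-bit directive accepts the union of the signed and unsigned ranges (for a byte, -128 through 255). The function name fits_in_bits is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `value` fits in a data object of `bits` bits (8, 16 or 32),
// accepting both the signed and the unsigned interpretation (assumed rule).
inline bool fits_in_bits(std::int64_t value, int bits) {
    assert(bits == 8 || bits == 16 || bits == 32);
    const std::int64_t lowest  = -(std::int64_t{1} << (bits - 1));  // e.g. -128 for 8 bits
    const std::int64_t highest = (std::int64_t{1} << bits) - 1;     // e.g. 255 for 8 bits
    return value >= lowest && value <= highest;
}
```

Under this model both -1 and 255 are acceptable byte initializers (they produce the same bit pattern, 0FFh), while 256 is rejected.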

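Data constants can be declared by using the "=" or "equ" directive. They can be declared anywhere in the source file. A numeric symbol defined with "=" can be redefined later in the file, whereas one defined with "equ" cannot.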

;; Format

label = expression
;; or
label equ expression

;; example

data = 256
;;or
data equ 256

Basic Instructions

Mov Instruction

The mov instruction is used to copy data from one location to another.

mov destination_operand, source_operand

The source may be a register, a memory variable or a constant. The destination may be a register or a memory variable, but the source and destination can't both be memory variables.
Both operands must be of the same size. The legal combinations are therefore register to register, memory to register, register to memory, constant to register and constant to memory.

dup Operator

The dup operator repeats a value a given number of times in a data declaration. It is commonly used to declare a character buffer.

input byte maxLen dup (?)

The maxLen dup (?) operand tells MASM to duplicate the (?) (that is, an uninitialized byte) maxLen times. maxLen is a constant set to 256 by an equate directive (=) at the beginning of the source file.

Add And Sub Instruction

add destination_operand, source_operand
sub destination_operand, source_operand

Constant operands are limited to a maximum of 32 bits. If your destination operand is 64 bits, the CPU allows only a 32-bit immediate source operand, which it sign-extends to 64 bits before performing the operation.
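The effect of the 32-bit limit on 64-bit operations can be modelled in C++. In this sketch (the function names are hypothetical), fits_imm32 checks whether a constant can be encoded as an immediate at all, and sign_extend_imm32 shows the 64-bit value the CPU actually uses.

```cpp
#include <cassert>
#include <cstdint>

// True when a 64-bit constant can be encoded as a sign-extended 32-bit immediate.
inline bool fits_imm32(std::int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}

// The 32-bit immediate widened to 64 bits by copying its sign bit
// into the upper 32 bits.
inline std::uint64_t sign_extend_imm32(std::int32_t imm) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(imm));
}

// Small usage example: -1 becomes all ones in 64 bits.
inline void self_check() {
    assert(sign_extend_imm32(-1) == 0xFFFFFFFFFFFFFFFFull);
}
```

One consequence is that a constant such as 0FFFFFFFFh (4294967295) cannot be added directly to a 64-bit register; it has to be loaded into a register first.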

Lea Instruction

The lea (Load Effective Address) instruction is used to load the address of a memory variable. It is similar to the "&" operator in C.

lea reg64, memory_var

Call and Ret Instructions

The ret instruction serves the same purpose in an assembly language program as the return statement in C/C++: it returns control from an assembly language procedure.

ret

A procedure is called in MASM by using the call instruction.

call proc_name

The procedure itself is defined with the proc directive, which opens the procedure body.

proc_name proc

At the end of the procedure, an endp is used to end it.

proc_name endp

Below is an example of a procedure call in MASM.

; A simple demonstration of a user-defined procedure.

        .code

; A sample user-defined procedure that this program can call.

myProc  proc
        ret    ; Immediately return to the caller
myProc  endp

; Here is the "main" procedure.

main    PROC

; Call the user-defined procedure

        call   myProc

        ret     ;Returns to caller

main    ENDP
        END

Calling C/C++ Procedures

Rewriting each and every procedure in assembly is painful. Instead, you can import procedures from C/C++ using the externdef directive.

;; format

externdef symbol:type

;;example
;;Importing printf from C++

externdef printf:proc

The externdef directive doesn’t let you specify parameters to pass to the "printf()" procedure, nor does the call instruction provide a mechanism for specifying parameters. Instead, you can pass up to four parameters to the "printf()" function in the x86-64 registers RCX, RDX, R8, and R9.
The "printf()" function requires that the first parameter be the address of a format string. Therefore, you should load RCX with the address of a zero-terminated string prior to calling "printf()". If the format string contains any format specifiers (for example, %d), you must pass appropriate parameter values in RDX, R8, and R9.
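The register assignment described above can be summarised as a small lookup. The C++ sketch below is only a memory aid; int_arg_location is a hypothetical helper, and it covers integer and pointer arguments only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Where the Microsoft x64 convention places the integer or pointer
// argument at zero-based position `index`.
inline std::string int_arg_location(std::size_t index) {
    static const char* const regs[] = {"RCX", "RDX", "R8", "R9"};
    return index < 4 ? regs[index] : "stack";
}

// Small usage example for a call such as printf("%d %d\n", a, b):
// the format string goes in RCX, a in RDX and b in R8.
inline void self_check() {
    assert(int_arg_location(0) == "RCX");
    assert(int_arg_location(2) == "R8");
}
```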

Jmp Instruction

The jmp instruction unconditionally transfers control to a specified symbol.

jmp statement_label

Like all MASM symbols, statement labels have two major attributes associated with them: an address (which is the memory address of the machine instruction following the label) and a type. The type is label, which is the same type as a proc directive’s identifier.

statement_label: mov eax, 55

;; Label doesn't have to be on the same line as the instruction

rLabel:
 mov eax, 55

Conditional Jump Instructions

Conditional jump instructions test one or more of the following four flags in the FLAGS register - Carry, Sign, Overflow and Zero.

To execute a conditional jump, first execute an instruction that affects one or more of the conditional flags.

   mov eax, int32Var
   add eax, anotherVar
   jc overflowOccurred
; Continue down here if the addition did not
; produce an unsigned overflow (carry).
 .
 .
 .
overflowOccurred:
; Execute this code if the unsigned sum of int32Var and
; anotherVar does not fit into 32 bits.

Not all instructions affect the flags. Among the common instructions, add, sub, and, or and xor set the condition codes, whereas mov, lea and not leave the flags unchanged.

The cmp Instruction

The cmp instruction has the same syntax as the sub instruction and in fact it also subtracts the second operand from the first operand and sets the condition code flags based on the result of the subtraction. Unlike sub, it does not store the difference in the first operand.

cmp left_operand, right_operand

The x86-64 CPUs provide an additional set of conditional jump instructions that allow you to test for comparison conditions.

Some of the instructions are synonyms for each other. For example, "jb" and "jc" are the same instruction; both jump when the carry flag (CF) is set to 1. The synonyms exist for convenience and readability. After a cmp instruction, jb is much more meaningful than jc.

The cmp instruction sets the flags only for integer comparisons; it doesn't compare floating-point values.
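How a cmp followed by a conditional jump works out can be sketched in C++. The model below computes the flags of a 32-bit cmp and then evaluates the conditions tested by a few of the jumps (je, jb, ja, jl, jg). The struct and function names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>

struct Flags { bool carry, overflow, sign, zero; };

// Flags produced by "cmp left, right" on 32-bit operands: the CPU computes
// left - right, sets the flags and throws the difference away.
inline Flags cmp32(std::uint32_t left, std::uint32_t right) {
    const std::uint32_t diff = left - right;   // wraps modulo 2^32
    Flags f{};
    f.carry    = left < right;                 // unsigned borrow
    f.zero     = diff == 0;
    f.sign     = (diff >> 31) != 0;
    f.overflow = (((left ^ right) & (left ^ diff)) >> 31) != 0;  // signed overflow
    return f;
}

inline bool je_taken(Flags f) { return f.zero; }                          // equal
inline bool jb_taken(Flags f) { return f.carry; }                         // unsigned below
inline bool ja_taken(Flags f) { return !f.carry && !f.zero; }             // unsigned above
inline bool jl_taken(Flags f) { return f.sign != f.overflow; }            // signed less
inline bool jg_taken(Flags f) { return !f.zero && f.sign == f.overflow; } // signed greater

// Small usage example: 0FFFFFFFFh is "above" 1 when unsigned but "less" when signed (-1).
inline void self_check() {
    const Flags f = cmp32(0xFFFFFFFFu, 1u);
    assert(ja_taken(f) && jl_taken(f));
}
```

The same bit patterns compare differently depending on whether you follow the cmp with an unsigned jump (ja, jb) or a signed one (jg, jl), which is why both families exist.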

Hello World!

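Below is the asm program for printing "Hello World!". Note that it exports a procedure named asmFunc, so the C++ driver that calls it must declare and call asmFunc rather than the asmMain and getTitle functions used by the driver shown earlier.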

        option  casemap:none
        .data

; Note: "10" value is a line feed character, also known as the "C" newline character.
 
fmtStr  byte    'Hello, World!', 10, 0

        .code

; External declaration so MASM knows about the C/C++ printf function

        externdef   printf:proc

; Here is the "asmFunc" function.
        
        public  asmFunc
asmFunc proc

; "Magic" instruction offered without explanation at this point

        sub     rsp, 56
                
; Here's where we'll call the C printf function to print 
; "Hello, World!" Pass the address of the format string
; to printf in the RCX register. Use the LEA instruction 
; to get the address of fmtStr.
        
        lea     rcx, fmtStr
        call    printf
 
; Another "magic" instruction that undoes the effect of the 
; previous one before this procedure returns to its caller.
       
        add     rsp, 56
        
        ret     ;Returns to caller
        
asmFunc endp
        end

Returning Function Results

To coordinate where function results are returned, MASM code that interoperates with C++ follows the Microsoft Windows ABI (Application Binary Interface).
The Microsoft ABI specifies that the first four parameters to printf() (or any C++ function, for that matter) must be passed in RCX, RDX, R8, and R9.
The Windows ABI also states that functions (procedures) return integer and pointer values (that fit into 64 bits) in the RAX register. So if some C++ code expects your assembly procedure to return an integer result, you would load the integer result into RAX immediately before returning from your procedure.
Below is an example program that follows the Microsoft Windows ABI.

; An assembly language program that demonstrates returning
; a function result to a C++ program.

        option  casemap:none

nl      =       10  ;ASCII code for newline
maxLen  =       256 ;Maximum string size + 1

         .data  
titleStr byte    'Listing 1-8', 0
prompt   byte    'Enter a string: ', 0
fmtStr   byte    "User entered: '%s'", nl, 0

; "input" is a buffer having "maxLen" bytes. This program 
; will read a user string into this buffer.
;
; The "maxLen dup (?)" operand tells MASM to make "maxLen" 
; duplicate copies of a byte, each of which is uninitialized.
 
input    byte   maxLen dup (?)

        .code

        externdef   printf:proc
        externdef   readLine:proc


; The C++ function calling this assembly language module 
; expects a function named "getTitle" that returns a pointer 
; to a string as the function result. This is that function:

         public getTitle
getTitle proc

; Load address of "titleStr" into the RAX register (RAX holds 
; the function return result) and return back to the caller:

         lea rax, titleStr
         ret
getTitle endp

        
; Here is the "asmMain" function.

        
        public  asmMain
asmMain proc
        sub     rsp, 56
                

; Prompt the user to enter a string:

        lea     rcx, prompt
        call    printf


; Ensure the input string is zero terminated (in the event 
; there is an error):

        mov     input, 0
        
; Call the readLine function (written in C++) to read a line 
; of text from the console.
;
; int readLine( char *dest, int maxLen )
;
; Pass a pointer to the destination buffer in the RCX register.
; Pass the maximum buffer size (max chars + 1) in EDX.
; This function ignores the readLine return result.

        lea     rcx, input
        mov     rdx, maxLen
        call    readLine
        
; Print the string input by the user by calling printf:

        lea     rcx, fmtStr
        lea     rdx, input
        call    printf
 
        add     rsp, 56
        ret     ;Returns to caller
        
asmMain endp
        end

Microsoft ABI

Along with the asm program, the calling C++ program should also follow the ABI rules. Following are the ABI rules that need to be followed.

Variable Size

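It's crucial that the C++ and assembly language code agree on the size of each data object. With Microsoft's 64-bit C++ compiler, char is 1 byte (MASM byte/sbyte), short is 2 bytes (word/sword), int and long are 4 bytes (dword/sdword), long long and pointers are 8 bytes (qword/sqword), float is 4 bytes (real4) and double is 8 bytes (real8).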

Register Usage

Code that calls a function can pass the first four integer arguments to the function in the RCX, RDX, R8, and R9 registers, respectively. Programs pass the first four floating-point arguments in XMM0, XMM1, XMM2, and XMM3.
Registers RAX, RCX, RDX, R8, R9, R10, and R11 are volatile, which means that the values in them can be altered by the function.
XMM0/YMM0 through XMM5/YMM5 are also volatile. The function or procedure can alter the values in these registers.
RBX, RBP, RDI, RSI, RSP, R12, R13, R14, and R15 are nonvolatile registers. A procedure/function must preserve these registers’ values across a call. If a procedure modifies one of these registers, it must save the register’s value before the first such modification and restore the register’s value from the saved location prior to returning from the function/procedure.
Programs that use the x86-64’s floating-point coprocessor instructions must preserve the value of the floating-point control word across procedure calls. Such procedures should also leave the floating-point stack cleared.
Any procedure/function that uses the x86-64’s direction flag must leave that flag cleared upon return from the procedure/function.

C++ also expects function return values to appear in one of two places -

Integer results come back in the RAX register. If the return type is smaller than 64 bits, the upper bits of the register are undefined. For example, a 16-bit result is stored in bits 0-15, and bits 16-63 may contain garbage.
Floating point results come back in XMM0 register.
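Because the upper bits of RAX are undefined for narrow return types, the caller must look only at the low bits. A minimal C++ model of that truncation follows; the helper names are hypothetical and the garbage value is made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Extract a 16-bit function result from a raw RAX value, ignoring bits 16-63.
inline std::uint16_t result16(std::uint64_t rax) {
    return static_cast<std::uint16_t>(rax);
}

// Extract a 32-bit function result from a raw RAX value, ignoring bits 32-63.
inline std::uint32_t result32(std::uint64_t rax) {
    return static_cast<std::uint32_t>(rax);
}

// Small usage example: the upper bits hold leftovers, the low 16 bits hold 42.
inline void self_check() {
    assert(result16(0xDEADBEEF1234002Aull) == 42);
}
```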

Data Objects

Microsoft ABI requires all data to be aligned on a natural boundary for that data object. A natural boundary is an address that is a multiple of the object’s size (up to 16 bytes). Therefore, if you intend to pass a word/sword, dword/sdword, or qword/sqword value to a C++ procedure, you should attempt to align that object on a 2-, 4-, or 8-byte boundary, respectively.
When calling code written in a Microsoft ABI–aware language, you must ensure that the stack is aligned on a 16-byte boundary before issuing a call instruction. This can severely limit the usefulness of the push and pop instructions. If you use the push instructions to save a register’s value prior to a call, you must make sure you push two (64-bit) values, or otherwise make sure the RSP address is a multiple of 16 bytes, prior to making the call.
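The alignment rules reduce to simple arithmetic on addresses. The C++ sketch below uses hypothetical helper names and made-up example addresses. It checks natural alignment and shows why a single 8-byte push breaks 16-byte stack alignment while two pushes restore it.

```cpp
#include <cassert>
#include <cstdint>

// True when `address` is a multiple of `alignment` (which must be a power of two).
inline bool is_aligned(std::uint64_t address, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}

// Round `address` down to the previous multiple of `alignment`. The stack grows
// toward lower addresses, so aligning RSP means rounding down.
inline std::uint64_t align_down(std::uint64_t address, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return address & ~(alignment - 1);
}

// Each 64-bit push lowers RSP by 8 bytes.
inline std::uint64_t rsp_after_pushes(std::uint64_t rsp, unsigned pushes) {
    return rsp - 8ull * pushes;
}
```

Starting from a 16-byte-aligned RSP such as 1000h, one push leaves RSP at 0FF8h (misaligned for a call), and a second push brings it to 0FF0h, which is aligned again.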

Arrays

The Microsoft ABI expects fields of a structure to be aligned on their natural size: the offset from the beginning of the structure to a given field must be a multiple of the field’s size. On top of this, the whole structure must be aligned at a memory address that is a multiple of the size of the largest object in the structure (up to 16 bytes).
The entire structure’s size must be a multiple of the largest element in the structure (you must add padding bytes to the end of the structure to appropriately fill out the structure’s size).
The Microsoft ABI expects arrays to begin at an address in memory that is a multiple of the element size. For example, if you have an array of 32-bit objects, the array must begin on a 4-byte boundary.
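A C++ compiler applies these layout rules automatically, so a small struct is a convenient way to see the padding. The Sample structure below is hypothetical and exists only for this illustration; the offsets in the comments are what mainstream compilers produce, which is an assumption about the toolchain rather than something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical structure used only for illustration.
struct Sample {
    std::uint8_t  tag;    // offset 0, followed by 3 padding bytes
    std::uint32_t count;  // offset 4, a multiple of its 4-byte size
    std::uint16_t flags;  // offset 8, a multiple of its 2-byte size, then 2 padding bytes
};

// Each field sits at a multiple of its own alignment, and the total size is a
// multiple of the structure's alignment, so arrays of Sample stay aligned.
static_assert(offsetof(Sample, count) % alignof(std::uint32_t) == 0, "count is naturally aligned");
static_assert(offsetof(Sample, flags) % alignof(std::uint16_t) == 0, "flags is naturally aligned");
static_assert(sizeof(Sample) % alignof(Sample) == 0, "size is a multiple of the alignment");
```

An equivalent MASM declaration would need the same padding bytes added by hand so that both sides agree on where each field lives.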

Activation Records

The caller passes the first four parameters in registers rather than on the stack (though it must still reserve storage on the stack for those parameters).
The caller must reserve (at least) 32 bytes of parameter data on the stack, even if there are fewer than five parameters (plus 8 bytes for each additional parameter if there are five or more parameters).
Parameters are always 8-byte values.
RSP must be 16-byte-aligned immediately before the call instruction pushes the return address onto the stack.

Array Parameters

All procedure/function parameters must consume exactly 64 bits. If a data object is smaller than 64 bits, the HO bits of the parameter value (the bits beyond the actual parameter’s native size) are undefined (and not guaranteed to be zero). Procedures should access only the actual data bits for the parameter’s native type and ignore the HO bits.
If a parameter’s native type is larger than 64 bits, the Microsoft ABI requires the caller to pass the parameter by reference rather than by value.
Although the Microsoft calling convention passes the first four parameters in registers, it still requires the caller to allocate storage on the stack for these parameters (shadow storage). In fact, the Microsoft calling convention requires the caller to allocate storage for four parameters on the stack even if the procedure doesn’t have four parameters.
The "magic" instructions in the listings above, "sub rsp, 56" and "add rsp, 56", allocate and release this storage. They reserve room for local variables and all the parameter space for the procedures being called, and they keep the stack 16-byte-aligned: on entry RSP is 8 bytes off a 16-byte boundary because the call instruction pushed a return address, so subtracting 56 restores the alignment. However, if you use this trick to allocate storage for your procedures’ parameters, you will not be able to use the push instructions to move the data onto the stack. The storage has already been allocated on the stack for the parameters; you must use mov instructions to copy the data onto the stack (using the [RSP+constant] addressing mode) when copying the fifth and greater parameters.
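The size of the stack adjustment can be worked out mechanically. The C++ sketch below is one way to do the arithmetic, under two stated assumptions: RSP is 8 bytes off a 16-byte boundary on entry (the return address was just pushed), and the procedure performs no push instructions of its own. The function name frame_bytes is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bytes to subtract from RSP at procedure entry so that RSP is 16-byte-aligned
// and there is room for parameter storage plus `local_bytes` of local variables.
// `max_call_params` is the largest parameter count of any procedure being called.
inline std::uint64_t frame_bytes(std::uint64_t max_call_params, std::uint64_t local_bytes) {
    // Parameter storage: at least four 8-byte slots, one slot per parameter beyond that.
    const std::uint64_t param_bytes = std::max<std::uint64_t>(max_call_params, 4) * 8;
    std::uint64_t total = param_bytes + local_bytes;
    total = (total + 7) & ~std::uint64_t{7};  // round up to a multiple of 8
    if (total % 16 == 0) total += 8;          // entry RSP is 8 off, so add the odd 8
    return total;
}
```

For a procedure that only calls functions with up to four parameters and has no locals, this gives 40 bytes; with up to 24 bytes of extra space it gives 56, the value used in the listings above.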
