Procedures
Chapter 5
A procedure is a set of instructions that compute a value or take an action.
Calling A Procedure
; Simple procedure call example.
option casemap:none
nl = 10
.const
ttlStr byte "Listing 5-1", 0
.data
dwArray dword 256 dup (1)
.code
; Here is the user-written procedure
; that zeros out a buffer.
zeroBytes proc
mov eax, 0
mov edx, 256
repeatlp: mov [rcx+rdx*4-4], eax
dec rdx
jnz repeatlp
ret
zeroBytes endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
; "Magic" instruction offered without
; explanation at this point:
sub rsp, 48
lea rcx, dwArray
call zeroBytes
add rsp, 48 ;Restore RSP
ret ;Returns to caller
asmMain endp
end
By default, MASM treats the statement labels in a procedure as local to that procedure. If, for some reason, you don’t want this behavior, you can turn scoping on and off with the following statements:
option scoped
option noscoped
Preservation Of Registers
Callee preservation has two advantages: space and maintainability. If the callee (the procedure) preserves all affected registers, only one copy of the push and pop instructions exists—those the procedure contains. If the caller saves the values in the registers, the program needs a set of preservation instructions around every call. This makes your programs not only longer but also harder to maintain. Remembering which registers to save and restore on each procedure call is not easily done.
One big problem with having the caller preserve registers is that your program may change over time. You may modify the calling code or the procedure to use additional registers. Such changes, of course, may change the set of registers that you must preserve. Worse still, if the modification is in the subroutine itself, you will need to locate every call to the routine and verify that the subroutine does not change any registers the calling code uses.
Assembly language programmers use a common convention with respect to register preservation: unless there is a good reason (such as performance) for doing otherwise, a procedure preserves every register it modifies, except for a register in which it explicitly returns a value. This reduces the likelihood of defects that occur because a procedure modifies a register the caller expects to be preserved.
; Preserving registers (caller) example
option casemap:none
nl = 10
.const
ttlStr byte "Listing 5-4", 0
space byte " ", 0
asterisk byte '*, %d', nl, 0
.data
saveRBX qword ?
.code
externdef printf:proc
; print40Spaces-
;
; Prints out a sequence of 40 spaces
; to the console display.
print40Spaces proc
sub rsp, 48 ;"Magic" instruction
mov ebx, 40
printLoop: lea rcx, space
call printf
dec ebx
jnz printLoop ;Until ebx==0
add rsp, 48 ;"Magic" instruction
ret
print40Spaces endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbx
; "Magic" instruction offered without
; explanation at this point:
sub rsp, 40
mov rbx, 20
astLp: mov saveRBX, rbx
call print40Spaces
lea rcx, asterisk
mov rdx, saveRBX
call printf
mov rbx, saveRBX
dec rbx
jnz astLp
add rsp, 40
pop rbx
ret ;Returns to caller
asmMain endp
end
Preserving registers isn’t all there is to preserving the environment. You can also push and pop variables and other values that a subroutine might change. Because the x86-64 allows you to push and pop memory locations, you can easily preserve these values as well.
Procedures And The Stack
Because procedures use the stack to hold the return address, you must exercise caution when pushing and popping data within a procedure.
If a procedure pushes a value, never pops it, and then executes ret, the ret instruction has no way of knowing that the value on the top of the stack is not a valid return address. It simply pops whatever value is on top and jumps to that location.
The program will probably crash or exhibit some other undefined behavior. Therefore, when pushing data onto the stack within a procedure, you must take care to properly pop that data prior to returning from the procedure.
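A minimal sketch of the problem just described, followed by a balanced version. The procedure names MessedUp and Balanced are purely illustrative:

```asm
; MessedUp - deliberately broken: pushes a value and never pops it.
MessedUp proc
        push    rax             ;RSP now points at the saved RAX value
        ret                     ;Pops that value as if it were the return address
MessedUp endp

; Balanced - every push has a matching pop before the ret.
Balanced proc
        push    rax             ;Save RAX
        ; ... code that modifies RAX goes here ...
        pop     rax             ;Restore RAX; the return address is on top again
        ret
Balanced endp
```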

Popping extra data off the stack prior to executing the ret statement can also create havoc in your programs.
Once again, the ret instruction blindly pops whatever data happens to be on the top of the stack and attempts to return to that address. Unlike the previous example, in which the top of the stack was unlikely to contain a valid return address, there is a small possibility that the top of the stack in this example does contain a return address. However, this will not be the proper return address for the calling procedure.
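The opposite mistake can be sketched the same way. The name MessedUp2 is again only illustrative:

```asm
; MessedUp2 - deliberately broken: pops a value it never pushed.
MessedUp2 proc
        pop     rax             ;Removes the real return address from the stack
        ret                     ;"Returns" to whatever the caller left beneath it
MessedUp2 endp
```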

Activation Records
Whenever you call a procedure, the program associates certain information with that procedure call, including the return address, parameters, and automatic local variables, using a data structure called an activation record.
Construction of an activation record begins in the code that calls a procedure. The caller makes room for the parameter data (if any) on the stack and copies the data onto the stack. Then the call instruction pushes the return address onto the stack.
At this point, construction of the activation record continues within the procedure itself. The procedure pushes registers and other important state information and then makes room in the activation record for local variables. The procedure might also update the RBP register so that it points at the base address of the activation record.

// code for above activation record example
void ARDemo(unsigned i, int j, unsigned k)
{
int a;
float r;
char c;
bool b;
short w;
.
.
.
}
Accessing Data From Activation Records
To access objects in the activation record, you must use offsets from the RBP register to the desired object. The two items of immediate interest to you are the parameters and the local variables. You can access the parameters at positive offsets from the RBP register; you can access the local variables at negative offsets from the RBP register.
Intel specifically reserves the RBP (Base Pointer) register for use as a pointer to the base of the activation record. This is why you should avoid using the RBP register for general calculations. If you arbitrarily change the value in the RBP register, you could lose access to the current procedure’s parameters and local variables.
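As a rough sketch of these offsets, consider a hypothetical procedure, arDemo, that receives one qword parameter on the stack and keeps one qword local variable. The name and the layout shown in the comments are assumptions made for illustration:

```asm
; arDemo - hypothetical procedure. Assumes the caller pushed one
; qword parameter immediately before the call.
;
;   [rbp+16]  parameter (pushed by the caller)
;   [rbp+8]   return address (pushed by call)
;   [rbp+0]   caller's RBP (pushed by arDemo)
;   [rbp-8]   local variable
arDemo  proc
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16         ;Room for locals, rounded up to 16 bytes
        mov     rax, [rbp+16]   ;Parameter: positive offset from RBP
        mov     [rbp-8], rax    ;Local: negative offset from RBP
        mov     rsp, rbp
        pop     rbp
        ret
arDemo  endp
```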
Assembly Language Entry Sequence
The caller of a procedure is responsible for allocating storage for parameters on the stack and moving the parameter data to its appropriate location. In the simplest case, this just involves pushing the data onto the stack by using push instructions.
The call instruction pushes the return address onto the stack. It is the procedure’s responsibility to construct the rest of the activation record.
push rbp ; Save a copy of the old RBP value
mov rbp, rsp ; Get ptr to activation record into RBP
sub rsp, num_vars ; Allocate local variable storage plus padding
If the number of bytes of local variables in the procedure is not a multiple of 16, you should round up the value to the next higher multiple of 16 before subtracting this constant from RSP. Doing so will slightly increase the amount of storage the procedure uses for local variables but will not otherwise affect the operation of the procedure.
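One way to let the assembler do this rounding is with a constant expression. In this sketch, localBytes and roundDemo are hypothetical names and 20 is an arbitrary sample size:

```asm
localBytes = 20                         ;Hypothetical: bytes of locals actually needed
num_vars   = (localBytes + 15) and -16  ;Rounded up to a multiple of 16 (here, 32)

        .code
roundDemo proc
        push    rbp
        mov     rbp, rsp
        sub     rsp, num_vars   ;Allocates 32 bytes, not 20
        ; ... procedure body ...
        mov     rsp, rbp
        pop     rbp
        ret
roundDemo endp
```

Adding 15 and then clearing the low 4 bits leaves a value that is already a multiple of 16 unchanged and bumps any other value up to the next multiple.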
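If you cannot ensure that RSP has the expected alignment upon entry into your procedure (RSP mod 16 == 8, because the call instruction has just pushed the 8-byte return address onto a 16-byte-aligned stack), you can always force 16-byte alignment by using the following sequence at the beginning of your procedure: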
push rbp
mov rbp, rsp
sub rsp, num_vars ; Make room for local variables
and rsp, -16 ; Force 16-byte stack alignment
Assembly Language Exit Sequence
Before a procedure returns to its caller, it needs to clean up the activation record. Standard MASM procedures and procedure calls, therefore, assume that it is the procedure’s responsibility to clean up the activation record.
mov rsp, rbp ; Deallocate locals and clean up stack
pop rbp ; Restore pointer to caller's activation record
ret ; Return to the caller
In the Microsoft ABI (as opposed to pure assembly procedures), it is the caller’s responsibility to clean up any parameters pushed on the stack. Therefore, if you are writing a function to be called from C/C++, your procedure doesn’t have to do anything at all about the parameters on the stack.
If you are writing procedures that will be called only from your assembly language programs, it is possible to have the callee (the procedure) rather than the caller clean up the parameters on the stack upon returning to the caller.
mov rsp, rbp ; Deallocate locals and clean up stack
pop rbp ; Restore pointer to caller's activation record
ret parm_bytes ; Return to the caller and pop the parameters
The parm_bytes operand of the ret instruction is a constant that specifies the number of bytes of parameter data to remove from the stack after the return instruction pops the return address.
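A short sketch of callee cleanup, assuming a hypothetical procedure addTwo that takes two qword parameters on the stack (16 bytes in total) and is called only from assembly code:

```asm
; addTwo - hypothetical callee that removes its own parameters.
addTwo  proc
        push    rbp
        mov     rbp, rsp
        mov     rax, [rbp+16]   ;First parameter
        add     rax, [rbp+24]   ;Second parameter
        mov     rsp, rbp
        pop     rbp
        ret     16              ;Pop the return address, then 16 bytes of parameters
addTwo  endp

; callAddTwo - hypothetical caller. It pushes the parameters but
; does not have to remove them after the call.
callAddTwo proc
        push    rdx             ;Second parameter
        push    rcx             ;First parameter
        call    addTwo          ;On return, RSP is back to its value before the pushes
        ret
callAddTwo endp
```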
Intel has added a special instruction to the instruction set to shorten the standard exit sequence: leave. This instruction copies RBP into RSP and then pops RBP.
leave
ret optional_const
Local Variables
Local variables (or, more properly, automatic variables) have their storage allocated upon entry into a procedure, and that storage is returned for other use when the procedure returns to its caller. The name automatic refers to the program automatically allocating and deallocating storage for the variable on procedure invocation and return.
Static objects (those you declare in the .data, .const, .data?, and .code sections) have a lifetime equivalent to the total runtime of the application.
A procedure can access any global .data, .data?, or .const object the same way the main program accesses such variables—by referencing the name. Accessing global objects is convenient and easy. Of course, accessing global objects makes your programs harder to read, understand, and maintain, so you should avoid using global variables within procedures.
Your program accesses local variables in a procedure by using negative offsets from the activation record base address (RBP):
; Accessing local variables
option casemap:none
.code
; localVars - Demonstrates local variable access
;
; sdword a is at offset -4 from RBP
; sdword b is at offset -8 from RBP
;
; On entry, ECX and EDX contain values to store
; into the local variables a & b (respectively)
localVars proc
push rbp
mov rbp, rsp
sub rsp, 16 ;Make room for a & b
mov [rbp-4], ecx ;a = ecx
mov [rbp-8], edx ;b = edx
; Additional code here that uses a & b
mov rsp, rbp
pop rbp
ret
localVars endp
end

A slightly better solution is to create equates for your local variable names.
; Accessing local variables #2
option casemap:none
.code
; localVars - Demonstrates local variable access
;
; sdword a is at offset -4 from RBP
; sdword b is at offset -8 from RBP
;
; On entry, ECX and EDX contain values to store
; into the local variables a & b (respectively)
a equ [rbp-4]
b equ a-4
localVars proc
push rbp
mov rbp, rsp
sub rsp, 16 ;Make room for a & b
mov a, ecx
mov b, edx
; Additional code here that uses a & b
mov rsp, rbp
pop rbp
ret
localVars endp
end
However, getting too crazy with fancy equates doesn’t pay; MASM provides a high-level-like declaration for local variables (and parameters) you can use if you really want your declarations to be as maintainable as possible.
MASM Local Directive
MASM provides a directive that lets you specify local variables, and MASM automatically fills in the offsets for the locals. That directive, local, uses the following syntax:
local list_of_declarations
The list_of_declarations is a list of local variable declarations, separated by commas. A local variable declaration has two main forms:
identifier:type ;; for variables
identifier [elements]:type ;; for array variables
If local directives appear in a procedure, they must be the first statement(s) after the procedure declaration (the proc directive). A procedure may have more than one local statement; if so, all of them must appear together after the proc declaration.
procWithLocals proc
local var1:byte, local2:word, dVar:dword
local qArray[4]:qword
local ptrVar:qword
local userTypeVar:userType
.
. ; Other statements in the procedure.
.
procWithLocals endp
MASM automatically associates appropriate offsets with each variable you declare via the local directive. MASM assigns offsets to the variables by subtracting the variable’s size from the current offset (starting at zero) and then rounding down to an offset that is a multiple of the object’s size.
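Applying that rule by hand to a few scalar locals gives the offsets shown in the comments below. This is only a worked sketch: offsetsDemo is a hypothetical procedure, and the offsets are derived from the rule as stated rather than taken from an assembler listing.

```asm
; offsetsDemo - hypothetical procedure used to work through the offsets.
offsetsDemo proc
        local   var1:byte       ;  0 - 1 =  -1                   -> [rbp-1]
        local   local2:word     ; -1 - 2 =  -3, rounded to -4    -> [rbp-4]
        local   dVar:dword      ; -4 - 4 =  -8                   -> [rbp-8]
        local   ptrVar:qword    ; -8 - 8 = -16                   -> [rbp-16]

        push    rbp             ;Standard entry sequence, written by hand
        mov     rbp, rsp
        sub     rsp, 16         ;The four locals occupy 16 bytes in total
        mov     var1, cl        ;MASM substitutes the RBP-relative address
        mov     dVar, edx
        leave
        ret
offsetsDemo endp
```

Note how local2 skips offset -3: a word must sit at a multiple of 2, so MASM leaves one byte of padding.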
Upon entry into the procedure, you must still allocate storage for the local variables on the stack; that is, you must still provide the code for the standard entry (and standard exit) sequence. MASM does provide a solution (of sorts) for this problem: the option directive. You’ve seen the option casemap:none, option noscoped, and option scoped directives already; the option directive actually supports a wide array of arguments that control MASM’s behavior. Two option operands control procedure code generation when using the local directive: prologue and epilogue.
option prologue:PrologueDef
option prologue:none
option epilogue:EpilogueDef
option epilogue:none
By default, MASM assumes prologue:none and epilogue:none. When you specify none as the prologue and epilogue values, MASM will not generate any extra code to support local variable storage allocation and deallocation in a procedure.
If you insert the option prologue:PrologueDef (default prologue generation) and option epilogue:EpilogueDef (default epilogue generation) into your source file, all following procedures will automatically generate the appropriate standard entry and exit sequences for you.
For MASM’s automatically generated prologue code to work, the procedure must have exactly one entry point. If you define a global statement label as a second entry point, MASM won’t know that it is supposed to generate the prologue code at that point.
MASM deals with the issue of multiple exit points by automatically translating any ret instruction it finds into the standard exit sequence (assuming, of course, that option epilogue:EpilogueDef is active).
In addition to specifying prologue:PrologueDef and epilogue:EpilogueDef, you can also supply a macro identifier after the prologue: or epilogue: options. If you supply a macro identifier, MASM will expand that macro for the standard entry or exit sequence.
Automatic Allocation
One big advantage to automatic storage allocation is that it efficiently shares a fixed pool of memory among several procedures.
For example, say you call three procedures in a row: ProcA, ProcB, and ProcC. The first procedure, ProcA, allocates its local variables on the stack. Upon return, ProcA deallocates that stack storage. Upon entry into ProcB, the program allocates storage for ProcB’s local variables by using the same memory locations just freed by ProcA. Likewise, when ProcB returns and the program calls ProcC, ProcC uses the same stack space for its local variables that ProcB recently freed up. This memory reuse makes efficient use of the system resources and is probably the greatest advantage to using automatic variables.
You must always assume that a local variable is uninitialized upon entry into a procedure, because its storage holds whatever the previous user of that stack space left behind. If you need to maintain the value of a variable between calls to a procedure, you should use one of the static variable declaration types.
Parameters
Pass By Value
A parameter passed by value is just that—the caller passes a value to the procedure. Pass-by-value parameters are input-only parameters. You can pass them to a procedure, but the procedure cannot return values through them.
Example in C:
Procedure_name(Parameter);
Because you must pass a copy of the data to the procedure, you should use this method only for passing small objects like bytes, words, double words, and quad words. Passing large arrays and records by value is inefficient.
; Declare external printf function
externdef printf:proc
option casemap:none
.data
fmt db "Square of %d is %d", 10, 0
.code
Square proc
mov eax, ecx ; ecx contains input number (by value)
imul eax, eax ; square it
ret
Square endp
public asmMain
asmMain proc
sub rsp, 40 ; shadow space (Windows ABI)
mov ecx, 7 ; pass value 7 by value (into ecx)
call Square ; return in eax
; print result
lea rcx, fmt ; address of format string (full 64-bit pointer)
mov edx, 7 ; original value
mov r8d, eax ; result
call printf
add rsp, 40
ret
asmMain endp
end
Pass By Reference
To pass a parameter by reference, you must pass the address of a variable rather than its value. In other words, you must pass a pointer to the data. The procedure must dereference this pointer to access the data.
option casemap:none
.data
staticVar dword ?
.code
externdef someFunc:proc
getAddress proc
mov rcx, offset staticVar
call someFunc
ret
getAddress endp
end
Using the offset operator raises a couple of issues. First of all, it can compute the address of only a static variable; you cannot obtain the address of an automatic (local) variable or parameter, nor can you compute the address of a memory reference involving a complex memory addressing mode. Another problem is that an instruction like mov rcx, offset staticVar is long: it assembles into 10 bytes, because the full 64-bit address is embedded in the instruction.
A more compact way of passing by reference is to use the lea instruction. Another advantage to using lea is that it will accept any memory addressing mode, not just the name of a static variable.
option casemap:none
.data
staticVar dword ?
.code
externdef someFunc:proc
getAddress proc
lea rcx, staticVar
call someFunc
ret
getAddress endp
end
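Because lea accepts any addressing mode, it can also pass the address of a local variable, which offset cannot do. The following sketch assumes the same external someFunc as the listing above; passLocal and its single dword local are hypothetical:

```asm
        option  casemap:none
        .code
        externdef someFunc:proc

; passLocal - hypothetical procedure that passes the address of a
; local dword (at [rbp-4]) to someFunc.
passLocal proc
        push    rbp
        mov     rbp, rsp
        sub     rsp, 48                 ;16 bytes of locals + 32 bytes of shadow storage
        mov     dword ptr [rbp-4], 0    ;Initialize the local
        lea     rcx, [rbp-4]            ;RCX = address of the local variable
        call    someFunc
        leave
        ret
passLocal endp
        end
```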
Pass by reference is usually less efficient than pass by value. You must dereference all pass-by-reference parameters on each access; this is slower than simply using a value because it typically requires at least two instructions. However, when passing a large data structure, pass by reference is faster because you do not have to copy the large data structure before calling the procedure.
Low Level Parameter Implementation
A parameter-passing mechanism is a contract between the caller and the callee (the procedure). Both parties have to agree on where the parameter data will appear and what form it will take.
If your assembly language procedures are being called only by other assembly language code that you’ve written, you control both sides of the contract negotiation and get to decide where and how you’re going to pass parameters.
However, if external code is calling your procedure, or your procedure is calling external code, your procedure will have to adhere to whatever calling convention that external code uses. On 64-bit Windows systems, that calling convention will, undoubtedly, be the Windows ABI.
Passing Parameters In Registers
Choose a register whose size matches the parameter’s data type.

If you are passing several parameters to a procedure in the x86-64’s registers, you should probably use up the registers in the following order:
First Last
RCX, RDX, R8, R9, R10, R11, RAX, XMM0/YMM0-XMM5/YMM5
In general, you should pass integer and other non-floating-point values in the general-purpose registers, and floating-point values in the XMMx/YMMx registers.
Passing Parameters In The Code Stream
Another place where you can pass parameters is in the code stream immediately after the call instruction.
; Demonstration passing parameters in the code stream.
option casemap:none
nl = 10
stdout = -11
.const
ttlStr byte "Listing 5-11", 0
.data
soHandle qword ?
bWritten dword ?
.code
; Magic equates for Windows API calls:
extrn __imp_GetStdHandle:qword
extrn __imp_WriteFile:qword
; Here's the print procedure.
; It expects a zero-terminated string
; to follow the call to print.
print proc
push rbp
mov rbp, rsp
and rsp, -16 ;Ensure stack 16-byte aligned
sub rsp, 48 ; Set up stack for MS ABI
; Get the pointer to the string immediately following the
; call instruction and scan for the zero-terminating byte.
mov rdx, [rbp+8] ;Return address is here
lea r8, [rdx-1] ;R8 = return address - 1
search4_0: inc r8 ;Move on to next char
cmp byte ptr [R8], 0 ;At end of string?
jne search4_0
; Fix return address and compute length of string:
inc r8 ;Point at new return address
mov [rbp+8], r8 ;Save return address
sub r8, rdx ;Compute string length
dec r8 ;Don't include 0 byte
; Call WriteFile to print the string to the console
;
; WriteFile( fd, bufAdrs, len, &bytesWritten, NULL );
;
; Note: pointer to the buffer (string) is already
; in RDX. The len is already in R8. Just need to
; load the handle, the address of bWritten, and a
; NULL fifth argument (passed on the stack):
mov rcx, soHandle ;Standard output handle
lea r9, bWritten ;Address of "bWritten" in R9
mov qword ptr [rsp+32], 0 ;Fifth argument (lpOverlapped) = NULL
call __imp_WriteFile
leave
ret
print endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbp
mov rbp, rsp
sub rsp, 48 ;Shadow storage; keeps RSP 16-byte aligned at each call
; Call getStdHandle with "stdout" parameter
; in order to get the standard output handle
; we can use to call write. Must set up
; soHandle before first call to print procedure
mov ecx, stdout ;Zero-extends!
call __imp_GetStdHandle
mov soHandle, rax ;Save handle
; Demonstrate passing parameters in code stream
; by calling the print procedure:
call print
byte "Hello, World!", nl, 0
; Clean up, as per Microsoft ABI:
leave
ret ;Returns to caller
asmMain endp
end
The instruction "lea r8, [rdx-1]" isn’t actually loading an address into R8, per se. This is really an arithmetic instruction that is computing R8 = RDX – 1 (with a single instruction rather than two as would normally be required). This is a common usage of the lea instruction in assembly language programs. Therefore, it’s a little programming trick that you should become comfortable with.
We have two easy ways to handle variable-length parameters: either use a special terminating value (like 0) or pass a special length value that tells the subroutine the number of parameters you are passing.
Using a special value to terminate a parameter list requires that you choose a value that never appears in the list. For example, print uses 0 as the terminating value, so it cannot print the NUL character (whose ASCII code is 0).
Despite the convenience afforded by passing parameters in the code stream, passing parameters there has disadvantages. If you fail to provide the exact number of parameters the procedure requires, the subroutine will get confused.
Passing Parameters On The Stack
Most high-level languages use the stack to pass a large number of parameters because this method is fairly efficient. Although passing parameters on the stack is slightly less efficient than passing parameters in registers, the stack allows you to pass a large amount of parameter data without difficulty. This is the reason that most programs pass their parameters on the stack.
The simplest approach is to make all your variables qword objects. Then you can directly push them onto the stack by using the push instruction prior to calling a procedure. However, not all objects fit nicely into 64 bits, so you can also push a smaller variable as though it were a qword:
push qword ptr k
push qword ptr j
push qword ptr i
call CallProc
This sequence pushes the 64-bit values starting at the addresses associated with variables i, j, and k, regardless of the size of these variables.
If the i, j, and k variables are smaller objects (perhaps 32-bit integers), these push instructions will push their values onto the stack along with additional data beyond these variables. As long as CallProc treats these parameter values as their actual size (say, 32 bits) and ignores the HO bits pushed for each argument onto the stack, this will usually work out properly.
You must ensure that such variables do not appear at the very end of a memory page (with the possibility that the next page in memory is inaccessible). The easiest way to do this is to make sure the variables you push on the stack in this fashion are never the last variables you declare in your data sections:
i dword ?
j dword ?
k dword ?
pad qword ? ; Ensures that there are at least 64 bits
; beyond the k variable
call CallProc
Another way to “push” data onto the stack is to drop the RSP register down an appropriate amount in memory and then simply move data onto the stack by using a mov (or similar) instruction.
sub rsp, 24 ; Allocate a multiple of 8 bytes to keep the data aligned
mov eax, k
mov [rsp+16], eax
mov eax, j
mov [rsp+8], eax
mov eax, i
mov [rsp], eax
call CallProc
The mov instructions spread out the data on 8-byte boundaries. The HO dword of each 64-bit entry on the stack will contain garbage (whatever data was in stack memory prior to this sequence). That’s okay; the CallProc procedure (presumably) will ignore that extra data and operate only on the LO 32 bits of each parameter value.

If your procedure includes the standard entry and exit sequences, you may directly access the parameter values in the activation record by indexing off the RBP register.
CallProc proc
push rbp ; This is the standard entry sequence
mov rbp, rsp ; Get base address of activation record into RBP
mov eax, [rbp+32] ; Accesses the k parameter
mov ebx, [rbp+24] ; Accesses the j parameter
mov ecx, [rbp+16] ; Accesses the i parameter
.
.
.
leave
ret 24 ; Return and remove the 24 bytes of parameters
CallProc endp

Accessing Value Parameters On The Stack
Accessing parameters passed by value is no different from accessing a local variable object. One way to accomplish this is by using equates, as was demonstrated for local variables earlier.
option casemap:none
nl = 10
stdout = -11
.const
ttlStr byte "Listing 5-12", 0
fmtStr1 byte "Value of parameter: %d", nl, 0
.data
value1 dword 20
value2 dword 30
.code
externdef printf:proc
theParm equ <[rbp+16]>
ValueParm proc
push rbp
mov rbp, rsp
sub rsp, 32 ;Magic instruction
lea rcx, fmtStr1
mov edx, theParm
call printf
leave
ret
ValueParm endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbp
mov rbp, rsp
sub rsp, 48 ;Parameter space; keeps RSP 16-byte aligned at each call
mov eax, value1
mov [rsp], eax ;Store parameter on stack
call ValueParm
mov eax, value2
mov [rsp], eax
call ValueParm
; Clean up, as per Microsoft ABI:
leave
ret ;Returns to caller
asmMain endp
end
Although you could access the value of theParm by using the anonymous address [RBP+16] within your code, using the equate in this fashion makes your code more readable and maintainable.
Declaring Parameters Using The proc Directive
MASM lets you declare a procedure’s parameters in the operand field of the proc directive:
proc_name proc parameter_list
Each parameter declaration takes the form
parm_name:type
The parameter declarations appearing as proc operands assume that a standard entry sequence is executed and that the program will access parameters off the RBP register, with the saved RBP and return address values at offsets 0 and 8 from the RBP register. Each parameter occupies 8 bytes on the stack, regardless of its data type, so successive parameters appear at offsets 16, 24, 32, and so on.
As per the Microsoft ABI, MASM will allocate storage on the stack for the first four parameters, even though you would normally pass these parameters in RCX, RDX, R8, and R9. These 32 bytes of storage (starting at RBP+16) are called shadow storage in Microsoft ABI nomenclature.
When calling a procedure whose parameters you declare in the operand field of a proc directive, don’t forget that MASM assumes you push the parameters onto the stack in the reverse order they appear in the parameter list, to ensure that the first parameter in the list is at the lowest memory address on the stack.
mov eax, dwordValue
push rax ; Parms are always 64 bits
mov ax, wordValue
push rax
mov al, byteValue
push rax
call procWithParms
A faster alternative is to reserve the stack space once and store the parameters with mov instructions:
sub rsp, 24 ; Reserve storage for parameters
mov eax, dwordValue ; i
mov [rsp+16], eax
mov ax, wordValue
mov [rsp+8], ax ; j
mov al, byteValue
mov [rsp], al ; k
call procWithParms
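The callee for the two call sequences above might look like the following sketch. The parameter list (k:byte, j:word, i:dword) is an assumption inferred from the comments in the calling code, and the sketch assumes, as stated earlier, that MASM emits no prologue or epilogue by default and that the callee removes the parameters:

```asm
; procWithParms - hypothetical declaration matching the calls above.
; MASM maps k, j, and i to [rbp+16], [rbp+24], and [rbp+32].
procWithParms proc k:byte, j:word, i:dword
        push    rbp             ;Standard entry sequence, written by hand
        mov     rbp, rsp
        mov     al, k           ;Byte parameter at the lowest address
        mov     cx, j           ;Word parameter
        mov     edx, i          ;Dword parameter
        ; ... use AL, CX, and EDX here ...
        leave
        ret     24              ;Assumes callee cleanup of three 8-byte slots
procWithParms endp
```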
Accessing Reference Parameters on the Stack
The RefParm procedure has a single pass-by-reference parameter. A pass-by-reference parameter is always a (64-bit) pointer to an object. To access the value associated with the parameter, this code has to load that quad-word address into a 64-bit register and access the data indirectly.
; Accessing a reference parameter on the stack
option casemap:none
nl = 10
stdout = -11
.const
fmtStr1 byte "Value of parameter: %d", nl, 0
.data
value1 dword 20
value2 dword 30
.code
externdef printf:proc
theParm equ <[rbp+16]>
RefParm proc
push rbp
mov rbp, rsp
sub rsp, 32 ;Magic instruction
lea rcx, fmtStr1
mov rax, theParm ;Dereference parameter
mov edx, [rax]
call printf
leave
ret
RefParm endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbp
mov rbp, rsp
sub rsp, 48 ;Parameter space; keeps RSP 16-byte aligned at each call
lea rax, value1
mov [rsp], rax ;Store address on stack
call RefParm
lea rax, value2
mov [rsp], rax
call RefParm
; Clean up, as per Microsoft ABI:
leave
ret ;Returns to caller
asmMain endp
end
Passing large objects, like arrays and records, is where using reference parameters becomes efficient. When passing these objects by value, the calling code has to make a copy of the actual parameter; if it is a large object, the copy process can be inefficient. Because computing the address of a large object is just as efficient as computing the address of a small scalar object, no efficiency is lost when passing large objects by reference.
; Passing a large object by reference
option casemap:none
nl = 10
NumElements = 24
Pt struct
x byte ?
y byte ?
Pt ends
.const
fmtStr1 byte "RefArrayParm[%d].x=%d ", 0
fmtStr2 byte "RefArrayParm[%d].y=%d", nl, 0
.data
index dword ?
Pts Pt NumElements dup ({})
.code
externdef printf:proc
ptArray equ <[rbp+16]>
RefAryParm proc
push rbp
mov rbp, rsp
mov rdx, ptArray
xor rcx, rcx ;RCX = 0
; while ecx < NumElements, initialize each
; array element. x = ecx/8, y=ecx % 8
ForEachEl: cmp ecx, NumElements
jnl LoopDone
mov al, cl
shr al, 3 ;AL = ecx / 8
mov [rdx][rcx*2].Pt.x, al
mov al, cl
and al, 111b ;AL = ecx % 8
mov [rdx][rcx*2].Pt.y, al
inc ecx
jmp ForEachEl
LoopDone: leave
ret
RefAryParm endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbp
mov rbp, rsp
sub rsp, 48 ;Parameter space; keeps RSP 16-byte aligned at each call
; Initialize the array of points:
lea rax, Pts
mov [rsp], rax ;Store address on stack
call RefAryParm
; Display the array:
mov index, 0
dispLp: cmp index, NumElements
jnl dispDone
lea rcx, fmtStr1
mov edx, index ;zero extends!
lea r8, Pts ;Get array base
movzx r8, [r8][rdx*2].Pt.x ;Get x field
call printf
lea rcx, fmtStr2
mov edx, index ;zero extends!
lea r8, Pts ;Get array base
movzx r8, [r8][rdx*2].Pt.y ;Get y field
call printf
inc index
jmp dispLp
; Clean up, as per Microsoft ABI:
dispDone:
leave
ret ;Returns to caller
asmMain endp
end
Functions
Procedures are a sequence of machine instructions that fulfill a task. The result of the execution of a procedure is the accomplishment of that activity. Functions, on the other hand, execute a sequence of machine instructions specifically to compute a value to return to the caller.
The x86-64’s registers are the most common place to return function results. The strlen() routine in the C Standard Library is a good example of a function that returns a value in one of the CPU’s registers. It returns the length of the string (whose address you pass as a parameter) in the RAX register.
You could return function results in any register if it is more convenient to do so. Of course, if you’re calling a Microsoft ABI–compliant function, you have no choice but to expect the function’s return result in the RAX register.
For values slightly larger than 64 bits (for example, 128 bits or maybe even as many as 256 bits), you can split the result into pieces and return those parts in two or more registers. It is common to see functions returning 128-bit values in the RDX:RAX register pair. Of course, the XMM/YMM registers are another good place to return large values. Just remember that these schemes are not Microsoft ABI–compliant, so they’re practical only when calling code you’ve written.
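As an illustration of a two-register result, here is a minimal sketch of a function that returns a 128-bit value in RDX:RAX. The name mul128 and the choice of RCX and RDX as inputs are assumptions made for the example:

```asm
; mul128 - hypothetical function. Multiplies the unsigned values in
; RCX and RDX and returns the 128-bit product in RDX:RAX.
mul128  proc
        mov     rax, rcx        ;RAX = first operand
        mul     rdx             ;RDX:RAX = RAX * RDX (unsigned)
        ret
mul128  endp
```

The caller reads the LO 64 bits of the product from RAX and the HO 64 bits from RDX.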
You can deal with large function return results in two common ways: either pass the return value as a reference parameter or allocate storage on the heap (for example, using the C Standard Library malloc() function) for the object and return a pointer to it in a 64-bit register.
Recursion
Recursion occurs when a procedure calls itself. The following quicksort implementation is an example of recursion:
; Recursive quicksort
option casemap:none
nl = 10
numElements = 10
.const
fmtStr1 byte "Data before sorting: ", nl, 0
fmtStr2 byte "%d " ;Use nl and 0 from fmtStr3
fmtStr3 byte nl, 0
fmtStr4 byte "Data after sorting: ", nl, 0
.data
theArray dword 1,10,2,9,3,8,4,7,5,6
.code
externdef printf:proc
; quicksort-
;
; Sorts an array using the quicksort algorithm.
;
; Here's the algorithm in C, so you can follow along:
;
; void quicksort(int a[], int low, int high)
; {
; int i,j,Middle;
; if( low < high)
; {
; Middle = a[(low+high)/2];
; i = low;
; j = high;
; do
; {
; while(a[i] < Middle) i++;
; while(a[j] > Middle) j--;
; if( i <= j)
; {
; swap(a[i],a[j]);
; i++;
; j--;
; }
; } while( i <= j );
;
; // recursively sort the two sub arrays
;
; if( low < j ) quicksort(a,low,j);
; if( i < high) quicksort(a,i,high);
; }
; }
;
; Args:
; RCX (_a): Pointer to array to sort
; RDX (_lowBnd): Index to low bound of array to sort
; R8 (_highBnd): Index to high bound of array to sort
_a equ [rbp+16] ;Ptr to array
_lowBnd equ [rbp+24] ;Low bounds of array
_highBnd equ [rbp+32] ;High bounds of array
; Local variables (register save area).
; R9 is saved in a local rather than in the
; shadow slot at [rbp+40]: the recursive calls
; below reserve no shadow storage, so on those
; calls [rbp+40] would overlap the caller's saveRSI.
saveRDI equ [rbp-8]
saveRSI equ [rbp-16]
saveRBX equ [rbp-24]
saveRAX equ [rbp-32]
saveR9 equ [rbp-40]
; Within the procedure body, these registers
; have the following meaning:
;
; RCX: Pointer to base address of array to sort
; EDX: Lower bound of array (32-bit index).
; r8d: Higher bound of array (32-bit index).
;
; edi: index (i) into array.
; esi: index (j) into array.
; r9d: Middle element to compare against
quicksort proc
push rbp
mov rbp, rsp
sub rsp, 40 ;Room for five saved registers
; This code doesn't mess with RCX. No
; need to save it. When it does mess
; with RDX and R8, it saves those registers
; at that point.
; Preserve other registers we use:
mov saveRAX, rax
mov saveRBX, rbx
mov saveRSI, rsi
mov saveRDI, rdi
mov saveR9, r9
mov edi, edx ;i=low
mov esi, r8d ;j=high
; Compute a pivotal element by selecting the
; physical middle element of the array.
lea rax, [rsi+rdi*1] ;RAX=i+j
shr rax, 1 ;(i+j)/2
mov r9d, [rcx][rax*4] ;Middle = ary[(i+j)/2]
; Repeat until the edi and esi indexes cross one
; another (edi works from the start towards the end
; of the array, esi works from the end towards the
; start of the array).
rptUntil:
; Scan from the start of the array forward
; looking for the first element greater than or equal
; to the middle element.
dec edi ;to counteract inc, below
while1: inc edi ;i = i + 1
cmp r9d, [rcx][rdi*4] ;While middle > ary[i]
jg while1
; Scan from the end of the array backwards looking
; for the first element that is less than or equal
; to the middle element.
inc esi ;To counteract dec, below
while2: dec esi ;j = j - 1
cmp r9d, [rcx][rsi*4] ;while Middle < ary[j]
jl while2
; If we've stopped before the two pointers have
; passed over one another, then we've got two
; elements that are out of order with respect
; to the middle element, so swap these two elements.
cmp edi, esi ;If i <= j
jnle endif1
mov eax, [rcx][rdi*4] ;Swap ary[i] and ary[j]
mov ebx, [rcx][rsi*4] ;Use EBX as the temporary: R9D holds Middle
mov [rcx][rsi*4], eax
mov [rcx][rdi*4], ebx
inc edi ;i = i + 1
dec esi ;j = j - 1
endif1: cmp edi, esi ;Until i > j
jng rptUntil
; We have just placed all elements in the array in
; their correct positions with respect to the middle
; element of the array. So all elements at indexes
; greater than the middle element are also numerically
; greater than this element. Likewise, elements at
; indexes less than the middle (pivotal) element are
; now less than that element. Unfortunately, the
; two halves of the array on either side of the pivotal
; element are not yet sorted. Call quicksort recursively
; to sort these two halves if they have more than one
; element in them (if they have zero or one elements, then
; they are already sorted).
cmp edx, esi ;if lowBnd < j
jnl endif2
; Note: a is still in RCX,
; Low is still in RDX
; Need to preserve R8 (High)
; Note: quicksort doesn't require stack alignment
push r8
mov r8d, esi
call quicksort ;( a, Low, j )
pop r8
endif2: cmp edi, r8d ;if i < High
jnl endif3
; Note: a is still in RCX,
; High is still in R8d
; Need to preserve RDX (low)
; Note: quicksort doesn't require stack alignment
push rdx
mov edx, edi
call quicksort ;( a, i, High )
pop rdx
; Restore registers and leave:
endif3:
mov rax, saveRAX
mov rbx, saveRBX
mov rsi, saveRSI
mov rdi, saveRDI
mov r9, saveR9
leave
ret
quicksort endp
; Little utility to print the array elements:
printArray proc
push r15
push rbp
mov rbp, rsp
sub rsp, 40 ;Shadow parameters
lea r9, theArray
mov r15d, 0
whileLT10: cmp r15d, numElements
jnl endwhile1
lea rcx, fmtStr2
lea r9, theArray
mov edx, [r9][r15*4]
call printf
inc r15d
jmp whileLT10
endwhile1: lea rcx, fmtStr3
call printf
leave
pop r15
ret
printArray endp
; Here is the "asmMain" function.
public asmMain
asmMain proc
push rbp
mov rbp, rsp
sub rsp, 32 ;Shadow storage
; Display unsorted array:
lea rcx, fmtStr1
call printf
call printArray
; Sort the array
lea rcx, theArray
xor rdx, rdx ;low = 0
mov r8d, numElements-1 ;high= 9
call quicksort ;(theArray, 0, 9)
; Display sorted results:
lea rcx, fmtStr4
call printf
call printArray
leave
ret ;Returns to caller
asmMain endp
end
Apart from its recursive calls to itself, the quicksort function behaves like a leaf function: it doesn’t call any other functions. Therefore, it doesn’t need to align the stack on a 16-byte boundary.
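To check the assembly output against a reference, the algorithm can be written as runnable C. This is a sketch of the same partition scheme (middle element as pivot, two indexes moving toward each other, then recursion on the ranges low..j and i..high), not a transcription of the listing. The NULL check is an addition for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts a[low..high] (inclusive bounds) into ascending order. */
void quicksort(int a[], int low, int high)
{
    assert(a != NULL);
    if (low < high) {
        int middle = a[(low + high) / 2];   /* pivot value */
        int i = low;
        int j = high;
        do {
            while (a[i] < middle) i++;      /* first element >= pivot */
            while (a[j] > middle) j--;      /* last element <= pivot  */
            if (i <= j) {
                int tmp = a[i];             /* swap a[i] and a[j] */
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        } while (i <= j);

        /* Recursively sort the two partitions. */
        if (low < j)  quicksort(a, low, j);
        if (i < high) quicksort(a, i, high);
    }
}
```

Calling quicksort(theArray, 0, 9) on the ten values from the listing's data section should leave them in the order 1 through 10. Note that the pivot value stays fixed for the whole partitioning pass.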
Procedure Pointers
There are three ways to call a procedure:
1. call proc_name ; Direct call to procedure proc_name
2. call reg64 ; Indirect call to procedure whose address
; appears in the reg64
3. call qwordVar ; Indirect call to the procedure whose address
; appears in the qwordVar quad-word var
1.
call proc_name ;Directly calls the procedure
2.
mov rax, offset proc_name
call rax
;; or
lea rax, proc_name
call rax
3.
p proc
.
.
.
p endp
.
.
.
.data
ptrToP qword offset p
.
.
.
call ptrToP ; Calls p if ptrToP has not changed
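;; or, with a pointer variable reloaded at run time
;; (ProcPointer is assumed to be a qword variable declared
;; like ptrToP, and q another procedure; neither is shown)
; Reload ProcPointer with the address of q.
lea rax, q
mov ProcPointer, rax
.
.
.
call ProcPointer ; This invocation calls q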
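For readers who know C, the third form corresponds to calling through a function-pointer variable. The sketch below is an analogy only: p and q are hypothetical stand-ins for the procedures above, and callThroughPointer is an invented helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedures standing in for p and q. */
int p(void) { return 1; }
int q(void) { return 2; }

/* Pointer variable initialized with the address of p
   (compare: ptrToP qword offset p). */
int (*ptrToP)(void) = p;

/* Calls whichever procedure the pointer currently holds. */
int callThroughPointer(void)
{
    assert(ptrToP != NULL);   /* never call through an unset pointer */
    return ptrToP();          /* compare: call ptrToP */
}
```

Assigning `ptrToP = q;` before the next call plays the same role as the lea/mov pair: the call site is unchanged, but it now reaches q.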
Procedure Parameters
One place where procedure pointers are invaluable is in parameter lists. Selecting one of several procedures to call by passing the address of a procedure is a common operation.
When using parameter lists with the MASM proc directive, you can specify a procedure pointer type by using the proc type specifier:
procWithProcParm proc parm1:word, procParm:proc
.
.
.
call procParm
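The same idea in C is a function that takes a function pointer as a parameter. This is a rough analogy rather than a translation: doubleIt and negateIt are hypothetical procedures, and the parameter and return types are chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedures a caller can choose between. */
int doubleIt(int x) { return x * 2; }
int negateIt(int x) { return -x; }

/* Receives a value plus the address of the procedure to run on it,
   in the spirit of the procParm:proc parameter. */
int procWithProcParm(int parm1, int (*procParm)(int))
{
    assert(procParm != NULL);
    return procParm(parm1);   /* compare: call procParm */
}
```

A call such as `procWithProcParm(21, doubleIt)` selects the behavior at the call site, so the body of procWithProcParm never needs to know which procedure it runs.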
Saving the State of the Machine
callsFuncs proc
saveRAX textequ <[rbp-8]>
saveRBX textequ <[rbp-16]>
saveRCX textequ <[rbp-24]>
push rbp
mov rbp, rsp
sub rsp, 48 ; Make room for locals and parms
mov saveRAX, rax ; Preserve registers in
mov saveRBX, rbx ; local variables
mov saveRCX, rcx
.
.
.
mov [rsp], rax ; Store parm1
mov [rsp+8], rbx ; Store parm2
mov [rsp+16], rcx ; Store parm3
call theFunction
.
.
.
mov rcx, saveRCX ; Restore registers
mov rbx, saveRBX
mov rax, saveRAX
leave ; Deallocate locals
ret
callsFuncs endp