Procedures

Chapter 5

A procedure is a set of instructions that compute a value or take an action.

Calling A Procedure

; Simple procedure call example.

         option  casemap:none

nl       =       10

         .const
ttlStr   byte    "Listing 5-1", 0

 
        .data
dwArray dword   256 dup (1)

        
        .code

; Here is the user-written procedure
; that zeros out a buffer.

zeroBytes proc
          mov eax, 0
          mov edx, 256
repeatlp: mov [rcx+rdx*4-4], eax
          dec rdx
          jnz repeatlp
          ret
zeroBytes endp

; Here is the "asmMain" function.

        public  asmMain
asmMain proc

; "Magic" instruction offered without
; explanation at this point:

        sub     rsp, 48

        lea     rcx, dwArray
        call    zeroBytes 

        add     rsp, 48 ;Restore RSP
        ret     ;Returns to caller
asmMain endp
        end

If, for some reason, you don’t want MASM to treat all the statement labels in a procedure as local to that procedure, you can turn scoping on and off with the following statements:

option scoped
option noscoped

Preservation Of Registers

Callee preservation has two advantages: space and maintainability. If the callee (the procedure) preserves all affected registers, only one copy of the push and pop instructions exists—those the procedure contains. If the caller saves the values in the registers, the program needs a set of preservation instructions around every call. This makes your programs not only longer but also harder to maintain. Remembering which registers to save and restore on each procedure call is not easily done.
One big problem with having the caller preserve registers is that your program may change over time. You may modify the calling code or the procedure to use additional registers. Such changes, of course, may change the set of registers that you must preserve. Worse still, if the modification is in the subroutine itself, you will need to locate every call to the routine and verify that the subroutine does not change any registers the calling code uses.
Assembly language programmers use a common convention with respect to register preservation: unless there is a good reason (performance) for doing otherwise, most programmers preserve every register that a procedure modifies, except a register in which the procedure explicitly returns a value. This reduces the likelihood of defects that occur when a procedure modifies a register the caller expects to be preserved.

; Preserving registers (caller) example


               option  casemap:none

nl             =       10

              .const
ttlStr        byte    "Listing 5-4", 0
space         byte    " ", 0
asterisk      byte    '*, %d', nl, 0

              .data
saveRBX       qword   ?
        
              .code
              externdef printf:proc

; print40Spaces-
; 
;  Prints out a sequence of 40 spaces
; to the console display.

print40Spaces proc
              sub  rsp, 48   ;"Magic" instruction
              mov  ebx, 40
printLoop:    lea  rcx, space
              call printf
              dec  ebx
              jnz  printLoop ;Until ebx==0
              add  rsp, 48   ;"Magic" instruction
              ret
print40Spaces endp


; Here is the "asmMain" function.

              public  asmMain
asmMain       proc
              push    rbx
                
; "Magic" instruction offered without
; explanation at this point:

              sub     rsp, 40

              mov     rbx, 20
astLp:        mov     saveRBX, rbx
              call    print40Spaces
              lea     rcx, asterisk
              mov     rdx, saveRBX
              call    printf
              mov     rbx, saveRBX
              dec     rbx
              jnz     astLp

              add     rsp, 40
              pop     rbx
              ret     ;Returns to caller
asmMain       endp
              end

Preserving registers isn’t all there is to preserving the environment. You can also push and pop variables and other values that a subroutine might change. Because the x86-64 allows you to push and pop memory locations, you can easily preserve these values as well.
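For contrast with the caller-preservation listing above, here is a minimal sketch of the same loop using callee preservation. The name spaceStr is an assumption introduced for this sketch, and the code assumes the caller kept the stack 16-byte aligned before the call.

```masm
              option  casemap:none

              .const
spaceStr      byte    " ", 0         ;Hypothetical name for this sketch

              .code
              externdef printf:proc

; print40Spaces (callee-preserving sketch):
; saves and restores RBX itself, so callers
; need no save/restore code around the call.

print40Spaces proc
              push    rbx            ;Preserve caller's RBX
              sub     rsp, 32        ;Shadow storage for printf
              mov     ebx, 40
printLoop:    lea     rcx, spaceStr
              call    printf
              dec     ebx
              jnz     printLoop      ;Until ebx==0
              add     rsp, 32
              pop     rbx            ;Restore caller's RBX
              ret
print40Spaces endp
              end
```

With this version the caller can keep its own loop counter in RBX across the call, and the saveRBX variable from the listing above is no longer needed.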

Procedures And The Stack

Because procedures use the stack to hold the return address, you must exercise caution when pushing and popping data within a procedure.
If a procedure pushes a value onto the stack and then executes a ret instruction without first popping that value, the ret instruction has no way of knowing that the value on the top of the stack is not a valid return address. It simply pops whatever value is on top and jumps to that location.
The program will probably crash or exhibit some other undefined behavior. Therefore, when pushing data onto the stack within a procedure, you must take care to properly pop that data prior to returning from the procedure.

Popping extra data off the stack prior to executing the ret statement can also create havoc in your programs.
Once again, the ret instruction blindly pops whatever data happens to be on the top of the stack and attempts to return to that address. Unlike the previous example, in which the top of the stack was unlikely to contain a valid return address, there is a small possibility that the top of the stack in this example does contain a return address. However, this will not be the proper return address for the calling procedure.
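The two failure modes described above can be sketched as follows. The procedure names are hypothetical, and the first two procedures are deliberately broken, so they should be read rather than called.

```masm
            .code

; forgotPop - deliberately broken: pushes without popping.
forgotPop   proc
            push    rax     ;Value now sits on top of the return address
            ret             ;Pops the pushed RAX value and jumps to it
forgotPop   endp

; extraPop - deliberately broken: pops too much.
extraPop    proc
            pop     rax     ;Removes the return address itself
            ret             ;Pops whatever the caller had on the stack
extraPop    endp

; balanced - every push has a matching pop before ret.
balanced    proc
            push    rax     ;Save RAX
            pop     rax     ;Restore RAX; return address is on top again
            ret             ;Returns to the caller as intended
balanced    endp
            end
```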

Activation Records

Whenever you call a procedure, the program associates certain information with that procedure call, including the return address, parameters, and automatic local variables, using a data structure called an activation record.
Construction of an activation record begins in the code that calls a procedure. The caller makes room for the parameter data (if any) on the stack and copies the data onto the stack. Then the call instruction pushes the return address onto the stack.
At this point, construction of the activation record continues within the procedure itself. The procedure pushes registers and other important state information and then makes room in the activation record for local variables. The procedure might also update the RBP register so that it points at the base address of the activation record.

// code for above activation record example

void ARDemo(unsigned i, int j, unsigned k)
{
 int a;
 float r;
 char c;
 bool b;
 short w;
 .
 .
 .
}

Accessing Data From Activation Records

To access objects in the activation record, you must use offsets from the RBP register to the desired object. The two items of immediate interest to you are the parameters and the local variables. You can access the parameters at positive offsets from the RBP register; you can access the local variables at negative offsets from the RBP register.
Intel specifically reserves the RBP (Base Pointer) register for use as a pointer to the base of the activation record. This is why you should avoid using the RBP register for general calculations. If you arbitrarily change the value in the RBP register, you could lose access to the current procedure’s parameters and local variables.

Assembly Language Entry Sequence

The caller of a procedure is responsible for allocating storage for parameters on the stack and moving the parameter data to its appropriate location. In the simplest case, this just involves pushing the data onto the stack by using push instructions.
The call instruction pushes the return address onto the stack. It is the procedure’s responsibility to construct the rest of the activation record.

push rbp ; Save a copy of the old RBP value
mov rbp, rsp ; Get ptr to activation record into RBP
sub rsp, num_vars ; Allocate local variable storage plus padding

If the number of bytes of local variables in the procedure is not a multiple of 16, you should round up the value to the next higher multiple of 16 before subtracting this constant from RSP. Doing so will slightly increase the amount of storage the procedure uses for local variables but will not otherwise affect the operation of the procedure.
If you cannot ensure that the stack was 16-byte-aligned before the call (that is, RSP mod 16 == 8 upon entry into your procedure, because the call instruction pushed an 8-byte return address), you can always force 16-byte alignment by using the following sequence at the beginning of your procedure:

push rbp
mov rbp, rsp
sub rsp, num_vars ; Make room for local variables
and rsp, -16 ; Force 16-byte stack alignment
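
The rounding rule described above can be left to the assembler. The following is a minimal sketch; localBytes, frameSize, and roundedFrame are hypothetical names, and 20 is an arbitrary example size.

```masm
localBytes  =       20                          ;Hypothetical: 20 bytes of locals
frameSize   =       (localBytes + 15) and -16   ;Rounds 20 up to 32

            .code
roundedFrame proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, frameSize  ;Subtracts 32, a multiple of 16

    ; Code that uses the local variables goes here

            leave
            ret
roundedFrame endp
            end
```

Adding 15 and then clearing the low 4 bits leaves values that are already multiples of 16 unchanged and rounds every other value up to the next multiple.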

Assembly Language Exit Sequence

Before a procedure returns to its caller, it needs to clean up the activation record. Standard MASM procedures and procedure calls, therefore, assume that it is the procedure’s responsibility to clean up the activation record.

mov rsp, rbp ; Deallocate locals and clean up stack
pop rbp ; Restore pointer to caller's activation record
ret ; Return to the caller

In the Microsoft ABI (as opposed to pure assembly procedures), it is the caller’s responsibility to clean up any parameters pushed on the stack. Therefore, if you are writing a function to be called from C/C++, your procedure doesn’t have to do anything at all about the parameters on the stack.
If you are writing procedures that will be called only from your assembly language programs, it is possible to have the callee (the procedure) rather than the caller clean up the parameters on the stack upon returning to the caller.

mov rsp, rbp ; Deallocate locals and clean up stack
pop rbp ; Restore pointer to caller's activation record
ret parm_bytes ; Return to the caller and pop the parameters

The parm_bytes operand of the ret instruction is a constant that specifies the number of bytes of parameter data to remove from the stack after the return instruction pops the return address.

Intel has added a special instruction to the instruction set to shorten the standard exit sequence: leave. This instruction copies RBP into RSP and then pops RBP.

leave
ret optional_const

Local Variables

Local variables (or, more properly, automatic variables) have their storage allocated upon entry into a procedure, and that storage is returned for other use when the procedure returns to its caller. The name automatic refers to the program automatically allocating and deallocating storage for the variable on procedure invocation and return.
Static objects (those you declare in the .data, .const, .data?, and .code sections) have a lifetime equivalent to the total runtime of the application.
A procedure can access any global .data, .data?, or .const object the same way the main program accesses such variables—by referencing the name. Accessing global objects is convenient and easy. Of course, accessing global objects makes your programs harder to read, understand, and maintain, so you should avoid using global variables within procedures.
Your program accesses local variables in a procedure by using negative offsets from the activation record base address (RBP).

; Accessing local variables


               option  casemap:none
               .code

; localVars - Demonstrates local variable access
;
; sdword a is at offset -4 from RBP
; sdword b is at offset -8 from RBP
;
; On entry, ECX and EDX contain values to store
; into the local variables a & b (respectively)

localVars     proc
              push rbp
              mov  rbp, rsp
              sub  rsp, 16  ;Make room for a & b
              
              mov  [rbp-4], ecx  ;a = ecx
              mov  [rbp-8], edx  ;b = edx
              
    ; Additional code here that uses a & b
              
              mov   rsp, rbp
              pop   rbp
              ret
localVars     endp
              end

A slightly better solution is to create equates for your local variable names.

; Accessing local variables #2


            option  casemap:none
            .code

; localVars - Demonstrates local variable access
;
; sdword a is at offset -4 from RBP
; sdword b is at offset -8 from RBP
;
; On entry, ECX and EDX contain values to store
; into the local variables a & b (respectively)

a           equ     [rbp-4]
b           equ     a-4
localVars   proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 16  ;Make room for a & b
              
            mov     a, ecx
            mov     b, edx
              
    ; Additional code here that uses a & b
              
            mov     rsp, rbp
            pop     rbp
            ret
localVars   endp
            end

However, getting too crazy with fancy equates doesn’t pay; MASM provides a high-level-like declaration for local variables (and parameters) you can use if you really want your declarations to be as maintainable as possible.

MASM Local Directive

MASM provides a directive that lets you specify local variables, and MASM automatically fills in the offsets for the locals. That directive, local, uses the following syntax:

local list_of_declarations

The list_of_declarations is a list of local variable declarations, separated by commas. A local variable declaration has two main forms:

identifier:type    ;; for variables
identifier [elements]:type   ;; for array variables

local directives, if they appear in a procedure, must be the first statement(s) after a procedure declaration (the proc directive). A procedure may have more than one local statement; if there is more than one local directive, all must appear together after the proc declaration.

procWithLocals proc
                local var1:byte, local2:word, dVar:dword
                local qArray[4]:qword
                local ptrVar:qword
                local userTypeVar:userType
                 .
                 . ; Other statements in the procedure.
                 .
procWithLocals endp

MASM automatically associates appropriate offsets with each variable you declare via the local directive. MASM assigns offsets to the variables by subtracting the variable’s size from the current offset (starting at zero) and then rounding down to an offset that is a multiple of the object’s size.
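
As a worked example, the following sketch applies the offset rule as stated above to the first five declarations of procWithLocals. The offsets in the comments are derived by hand from that rule, not taken from an assembler listing. The sketch assumes that an array is aligned to its element size, and it turns off automatic entry and exit code so that the manual sequences are the only ones present. The name localsDemo is hypothetical.

```masm
            option  casemap:none
            option  prologue:none       ;No automatic entry code
            option  epilogue:none       ;No automatic exit code

            .code
localsDemo  proc
            local   var1:byte           ; 0 - 1  = -1             -> RBP-1
            local   local2:word         ;-1 - 2  = -3, rounds to  -> RBP-4
            local   dVar:dword          ;-4 - 4  = -8             -> RBP-8
            local   qArray[4]:qword     ;-8 - 32 = -40            -> RBP-40
            local   ptrVar:qword        ;-40 - 8 = -48            -> RBP-48

            push    rbp
            mov     rbp, rsp
            sub     rsp, 48             ;48 bytes of locals (a multiple of 16)

            mov     var1, 1             ;Assembler supplies the RBP offset
            mov     dVar, ecx
            mov     ptrVar, rdx

            leave
            ret
localsDemo  endp
            end
```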
Upon entry into the procedure, you must still allocate storage for the local variables on the stack; that is, you must still provide the code for the standard entry (and standard exit) sequence. MASM does provide a solution (of sorts) for this problem: the option directive. You’ve seen the option casemap:none, option noscoped, and option scoped directives already; the option directive actually supports a wide array of arguments that control MASM’s behavior. Two option operands control procedure code generation when using the local directive: prologue and epilogue.

option prologue:PrologueDef
option prologue:none
option epilogue:EpilogueDef
option epilogue:none

By default, MASM assumes prologue:none and epilogue:none. When you specify none as the prologue and epilogue values, MASM will not generate any extra code to support local variable storage allocation and deallocation in a procedure.
If you insert the option prologue:PrologueDef (default prologue generation) and option epilogue:EpilogueDef (default epilogue generation) into your source file, all following procedures will automatically generate the appropriate standard entry and exit sequences for you.
For MASM’s automatically generated prologue code to work, the procedure must have exactly one entry point. If you define a global statement label as a second entry point, MASM won’t know that it is supposed to generate the prologue code at that point.
MASM deals with the issue of multiple exit points by automatically translating any ret instruction it finds into the standard exit sequence, assuming, of course, that option epilogue:EpilogueDef is active.
In addition to specifying prologue:PrologueDef and epilogue:EpilogueDef, you can also supply a macro identifier after the prologue: or epilogue: options. If you supply a macro identifier, MASM will expand that macro for the standard entry or exit sequence.
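
Assuming the options behave as described above, a procedure that relies on the generated entry and exit sequences might look like the following sketch. The names autoFrame and counter are hypothetical.

```masm
            option  casemap:none
            option  prologue:PrologueDef    ;Generate standard entry code
            option  epilogue:EpilogueDef    ;Expand ret into exit code

            .code
autoFrame   proc
            local   counter:dword

            mov     counter, ecx    ;Store into the local variable
            mov     eax, counter    ;Return its value in EAX
            ret                     ;MASM expands this into the exit sequence
autoFrame   endp
            end
```

Note that the procedure body contains no explicit push rbp, mov rbp, rsp, or leave instructions; with these options active, the assembler is expected to supply them.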

Automatic Allocation

One big advantage to automatic storage allocation is that it efficiently shares a fixed pool of memory among several procedures.
For example, say you call three procedures in a row: ProcA, ProcB, and ProcC. The first procedure, ProcA, allocates its local variables on the stack. Upon return, ProcA deallocates that stack storage. Upon entry into ProcB, the program allocates storage for ProcB’s local variables by using the same memory locations just freed by ProcA. Likewise, when ProcB returns and the program calls ProcC, ProcC uses the same stack space for its local variables that ProcB recently freed up. This memory reuse makes efficient use of the system resources and is probably the greatest advantage to using automatic variables.
You must always assume that a local variable is uninitialized upon entry into a procedure, because its storage may still hold values left behind by an earlier procedure. If you need to maintain the value of a variable between calls to a procedure, you should use one of the static variable declaration types.

Parameters

Pass By Value

A parameter passed by value is just that—the caller passes a value to the procedure. Pass-by-value parameters are input-only parameters. You can pass them to a procedure, but the procedure cannot return values through them.
Example in C:

Procedure_name(Parameter);

Because you must pass a copy of the data to the procedure, you should use this method only for passing small objects like bytes, words, double words, and quad words. Passing large arrays and records by value is inefficient.

; Declare external printf function
        externdef printf:proc

        option casemap:none

        .data
        fmt db "Square of %d is %d", 10, 0

        .code
        
Square proc
    mov eax, ecx        ; ecx contains input number (by value)
    imul eax, eax       ; square it
    ret
Square endp

public asmMain
asmMain proc

    sub rsp, 40          ; shadow space (Windows ABI)

    mov ecx, 7           ; pass value 7 by value (into ecx)
    call Square          ; return in eax

    ; print result
    lea rcx, fmt         ; full 64-bit address of format string
    mov edx, 7           ; original value
    mov r8d, eax         ; result
    call printf

    add rsp, 40
    ret
    
asmMain endp
end

Pass By Reference

To pass a parameter by reference, you must pass the address of a variable rather than its value. In other words, you must pass a pointer to the data. The procedure must dereference this pointer to access the data.

           option  casemap:none

            .data
staticVar   dword   ?

            .code
            externdef someFunc:proc
            
getAddress  proc

            mov     rcx, offset staticVar
            call    someFunc

            ret
getAddress  endp

            end

Using the offset operator raises a couple of issues. First of all, it can compute the address of only a static variable; you cannot obtain the address of an automatic (local) variable or parameter, nor can you compute the address of a memory reference involving a complex memory addressing mode. Another problem is that an instruction like mov rcx, offset staticVar assembles into a large number of bytes.
Another way of passing by reference is to use the lea instruction, which avoids these issues. In particular, lea accepts any memory addressing mode, not just the name of a static variable.

            option  casemap:none

            .data
staticVar   dword   ?

            .code
            externdef someFunc:proc
            
getAddress  proc

            lea     rcx, staticVar
            call    someFunc

            ret
getAddress  endp

            end

Pass by reference is usually less efficient than pass by value. You must dereference all pass-by-reference parameters on each access; this is slower than simply using a value because it typically requires at least two instructions. However, when passing a large data structure, pass by reference is faster because you do not have to copy the large data structure before calling the procedure.
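
The extra dereferencing cost mentioned above can be seen in a minimal sketch of a callee that modifies its pass-by-reference parameter. The names counter, incByRef, and callInc are hypothetical.

```masm
            option  casemap:none

            .data
counter     dword   5

            .code

; incByRef - RCX holds the address of a dword.
; Adds 1 to the dword at that address.

incByRef    proc
            mov     eax, [rcx]      ;Dereference pointer: fetch the value
            inc     eax
            mov     [rcx], eax      ;Store back through the pointer
            ret
incByRef    endp

callInc     proc
            lea     rcx, counter    ;Pass the address, not the value
            call    incByRef        ;counter now holds 6
            ret
callInc     endp
            end
```

Because the callee receives an address, the change is visible to the caller after the call returns, which is not possible with a pass-by-value parameter.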

Low Level Parameter Implementation

A parameter-passing mechanism is a contract between the caller and the callee (the procedure). Both parties have to agree on where the parameter data will appear and what form it will take.
If your assembly language procedures are being called only by other assembly language code that you’ve written, you control both sides of the contract negotiation and get to decide where and how you’re going to pass parameters.
However, if external code is calling your procedure, or your procedure is calling external code, your procedure will have to adhere to whatever calling convention that external code uses. On 64-bit Windows systems, that calling convention will, undoubtedly, be the Windows ABI.

Passing Parameters In Registers

If you are passing several parameters to a procedure in the x86-64’s registers, choose each register according to the parameter’s data type and use up the registers in the following order:

First                                           Last
RCX, RDX, R8, R9, R10, R11, RAX, XMM0/YMM0-XMM5/YMM5

In general, you should pass integer and other non-floating-point values in the general-purpose registers, and floating-point values in the XMMx/YMMx registers.
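
As a minimal sketch of this ordering for assembly-only code, the following passes two integers in RCX and RDX and one floating-point value in XMM0. The names factor, scaledSum, and callScaled are hypothetical. This is a private convention between two assembly procedures, not the Microsoft ABI.

```masm
            option  casemap:none

            .data
factor      real8   2.5

            .code

; scaledSum - RCX = a, RDX = b, XMM0 = factor.
; Returns (a + b) * factor in XMM0.

scaledSum   proc
            add      rcx, rdx       ;Integer sum in RCX
            cvtsi2sd xmm1, rcx      ;Convert the sum to a double
            mulsd    xmm0, xmm1     ;XMM0 = factor * (a + b)
            ret
scaledSum   endp

callScaled  proc
            mov     ecx, 3          ;First integer parameter
            mov     edx, 4          ;Second integer parameter
            movsd   xmm0, factor    ;Floating-point parameter
            call    scaledSum       ;XMM0 = 17.5
            ret
callScaled  endp
            end
```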

Passing Parameters In The Code Stream

Another place where you can pass parameters is in the code stream immediately after the call instruction.

; Demonstration passing parameters in the code stream.

        option  casemap:none

nl          =       10
stdout      =       -11

            .const
ttlStr      byte    "Listing 5-11", 0
        
            .data
soHandle    qword   ?
bWritten    dword   ?
        
            .code
            
            ; Magic equates for Windows API calls:
            
            extrn __imp_GetStdHandle:qword
            extrn __imp_WriteFile:qword

; Here's the print procedure.
; It expects a zero-terminated string
; to follow the call to print.


print       proc
            push    rbp
            mov     rbp, rsp
            and     rsp, -16        ;Ensure stack 16-byte aligned
            sub     rsp, 48         ; Set up stack for MS ABI
            
; Get the pointer to the string immediately following the
; call instruction and scan for the zero-terminating byte.
            
            mov     rdx, [rbp+8]     ;Return address is here
            lea     r8, [rdx-1]      ;R8 = return address - 1
search4_0:  inc     r8               ;Move on to next char
            cmp     byte ptr [R8], 0 ;At end of string?
            jne     search4_0
            
; Fix return address and compute length of string:

            inc     r8               ;Point at new return address
            mov     [rbp+8], r8      ;Save return address
            sub     r8, rdx          ;Compute string length
            dec     r8               ;Don't include 0 byte
            
; Call WriteFile to print the string to the console
;
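; WriteFile( fd, bufAdrs, len, &bytesWritten, NULL );
;
; Note: pointer to the buffer (string) is already
; in RDX. The len is already in R8. Just need to
; load the file descriptor (handle) into RCX, the
; address of bWritten into R9, and NULL into the
; stack slot for the fifth (lpOverlapped) argument:

            mov     rcx, soHandle    ;Handle is a full 64-bit value
            lea     r9, bWritten    ;Address of "bWritten" in R9
            mov     qword ptr [rsp+32], 0 ;lpOverlapped = NULL
            call    __imp_WriteFile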

            leave
            ret
print       endp
 

; Here is the "asmMain" function.

        
            public  asmMain
asmMain     proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40
        
; Call getStdHandle with "stdout" parameter
; in order to get the standard output handle
; we can use to call write. Must set up
; soHandle before first call to print procedure

            mov     ecx, stdout     ;Zero-extends!
            call    __imp_GetStdHandle
            mov     soHandle, rax   ;Save handle    

; Demonstrate passing parameters in code stream
; by calling the print procedure:

            call    print
            byte    "Hello, World!", nl, 0

; Clean up, as per Microsoft ABI:

            leave
            ret     ;Returns to caller
        
asmMain     endp
            end

The instruction "lea r8, [rdx-1]" isn’t actually loading an address into R8, per se. This is really an arithmetic instruction that is computing R8 = RDX – 1 (with a single instruction rather than two as would normally be required). This is a common usage of the lea instruction in assembly language programs. Therefore, it’s a little programming trick that you should become comfortable with.
We have two easy ways to handle variable-length parameters: either use a special terminating value (like 0) or pass a special length value that tells the subroutine the number of parameters you are passing.
Using a special value to terminate a parameter list requires that you choose a value that never appears in the list. For example, print uses 0 as the terminating value, so it cannot print the NUL character (whose ASCII code is 0).
Despite the convenience afforded by passing parameters in the code stream, passing parameters there has disadvantages. If you fail to provide the exact number of parameters the procedure requires, the subroutine will get confused.

Passing Parameters On The Stack

Most high-level languages use the stack to pass a large number of parameters because this method is fairly efficient. Although passing parameters on the stack is slightly less efficient than passing parameters in registers, the stack allows you to pass a large amount of parameter data without difficulty. This is the reason that most programs pass their parameters on the stack.
The simplest approach is to make all your variables qword objects. Then you can directly push them onto the stack by using the push instruction prior to calling a procedure. However, not all objects fit nicely into 64 bits.

push qword ptr k
push qword ptr j
push qword ptr i
call CallProc

This sequence pushes the 64-bit values starting at the addresses associated with variables i, j, and k, regardless of the size of these variables.
If the i, j, and k variables are smaller objects (perhaps 32-bit integers), these push instructions will push their values onto the stack along with additional data beyond these variables. As long as CallProc treats these parameter values as their actual size (say, 32 bits) and ignores the HO bits pushed for each argument onto the stack, this will usually work out properly.
You must ensure that such variables do not appear at the very end of a memory page (with the possibility that the next page in memory is inaccessible). The easiest way to do this is to make sure the variables you push on the stack in this fashion are never the last variables you declare in your data sections:

i dword ?
j dword ?
k dword ?
pad qword ? ; Ensures that there are at least 64 bits
            ; beyond the k variable

Another way to “push” data onto the stack is to drop the RSP register down an appropriate amount in memory and then simply move data onto the stack by using a mov (or similar) instruction.

sub rsp, 24 ; Allocate a multiple of 8 bytes so the data stays aligned
mov eax, k
mov [rsp+16], eax
mov eax, j
mov [rsp+8], eax
mov eax, i
mov [rsp], eax
call CallProc

The mov instructions spread out the data on 8-byte boundaries. The HO dword of each 64-bit entry on the stack will contain garbage (whatever data was in stack memory prior to this sequence). That’s okay; the CallProc procedure (presumably) will ignore that extra data and operate only on the LO 32 bits of each parameter value.

If your procedure includes the standard entry and exit sequences, you may directly access the parameter values in the activation record by indexing off the RBP register.

CallProc proc
         push rbp ; This is the standard entry sequence
         mov rbp, rsp ; Get base address of activation record into RBP
         mov eax, [rbp+32] ; Accesses the k parameter
         mov ebx, [rbp+24] ; Accesses the j parameter
         mov ecx, [rbp+16] ; Accesses the i parameter
          
          .
          .
          .

         leave
         ret 24 ; Return and pop the 24 bytes of parameters
CallProc endp

Accessing Value Parameters On The Stack

Accessing parameters passed by value is no different from accessing a local variable object. One way to accomplish this is by using equates, as was demonstrated for local variables earlier.

        option  casemap:none

nl          =       10
stdout      =       -11

            .const
ttlStr      byte    "Listing 5-12", 0
fmtStr1     byte    "Value of parameter: %d", nl, 0
        
            .data
value1      dword   20
value2      dword   30
        
            .code
            externdef printf:proc
            
theParm     equ     <[rbp+16]> 
ValueParm   proc
            push    rbp
            mov     rbp, rsp
            
            sub     rsp, 32 ;Magic instruction
            
            lea     rcx, fmtStr1
            mov     edx, theParm
            call    printf
            
            leave
            ret
ValueParm   endp


; Here is the "asmMain" function.

        
            public  asmMain
asmMain     proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40
        
            mov     eax, value1
            mov     [rsp], eax      ;Store parameter on stack
            call    ValueParm
            
            mov     eax, value2
            mov     [rsp], eax
            call    ValueParm
            
; Clean up, as per Microsoft ABI:

            leave
            ret     ;Returns to caller
        
asmMain     endp
            end

Although you could access the value of theParm by using the anonymous address [RBP+16] within your code, using the equate in this fashion makes your code more readable and maintainable.

Declaring Parameters Using The proc Directive

proc_name proc parameter_list

Each parameter declaration takes the form

parm_name:type

The parameter declarations appearing as proc operands assume that a standard entry sequence is executed and that the program will access parameters off the RBP register, with the saved RBP and return address values at offsets 0 and 8 from the RBP register. The first parameter therefore appears at offset 16, and each successive parameter is 8 bytes beyond the previous one, regardless of the parameter data type.
As per the Microsoft ABI, MASM will allocate storage on the stack for the first four parameters, even though you would normally pass these parameters in RCX, RDX, R8, and R9. These 32 bytes of storage (starting at RBP+16) are called shadow storage in Microsoft ABI nomenclature.
When calling a procedure whose parameters you declare in the operand field of a proc directive, don’t forget that MASM assumes you push the parameters onto the stack in the reverse order they appear in the parameter list, to ensure that the first parameter in the list is at the lowest memory address on the stack.

mov eax, dwordValue
push rax ; Parms are always 64 bits
mov ax, wordValue
push rax
mov al, byteValue
push rax
call procWithParms

A faster alternative is to reserve the parameter storage once and move the values directly onto the stack:

sub rsp, 24 ; Reserve storage for parameters
mov eax, dwordValue ; i
mov [rsp+16], eax
mov ax, wordValue
mov [rsp+8], ax ; j
mov al, byteValue
mov [rsp], al ; k
call procWithParms
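
The callee side of these two calling sequences might be sketched as follows. The parameter list (k:byte, j:word, i:dword) is inferred from the comments in the sequence above, and the body is a hypothetical example that simply sums the three values. Automatic entry and exit code is turned off so that the manual sequences are the only ones present.

```masm
            option  casemap:none
            option  prologue:none       ;Entry sequence written by hand
            option  epilogue:none       ;Exit sequence written by hand

            .code

; procWithParms - returns k + j + i in EAX.
; The caller stores k at the lowest address.

procWithParms proc  k:byte, j:word, i:dword
            push    rbp
            mov     rbp, rsp

            movzx   eax, k          ;k is at RBP+16; only the LO byte is valid
            movzx   ecx, j          ;j is at RBP+24; only the LO word is valid
            add     eax, ecx
            add     eax, i          ;i is at RBP+32

            leave
            ret     24              ;Callee removes the 24 parameter bytes
procWithParms endp
            end
```

Using movzx for the byte and word parameters matters here because the upper bits of each 8-byte stack slot are not guaranteed to hold zeros.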

Accessing Reference Parameters on the Stack

The RefParm procedure has a single pass-by-reference parameter. A pass-by-reference parameter is always a (64-bit) pointer to an object. To access the value associated with the parameter, this code has to load that quad-word address into a 64-bit register and access the data indirectly.

; Accessing a reference parameter on the stack

        option  casemap:none

nl          =       10
stdout      =       -11

            .const
fmtStr1     byte    "Value of parameter: %d", nl, 0
        
            .data
value1      dword   20
value2      dword   30
        
            .code
            externdef printf:proc
            
theParm     equ     <[rbp+16]> 
RefParm     proc
            push    rbp
            mov     rbp, rsp
            
            sub     rsp, 32 ;Magic instruction
            
            lea     rcx, fmtStr1
            mov     rax, theParm    ;Dereference parameter
            mov     edx, [rax]
            call    printf
            
            leave
            ret
RefParm     endp


; Here is the "asmMain" function.

        
            public  asmMain
asmMain     proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40
        
            lea     rax, value1
            mov     [rsp], rax      ;Store address on stack
            call    RefParm
            
            lea     rax, value2
            mov     [rsp], rax
            call    RefParm
            
; Clean up, as per Microsoft ABI:

            leave
            ret     ;Returns to caller
        
asmMain     endp
            end
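
For comparison, here is a minimal C sketch of the same idea: the callee receives an address and has to dereference it to reach the value. The function name read_ref_parm is hypothetical and does not appear in the listing above.

```c
#include <stdio.h>

/* Hypothetical C counterpart of RefParm: the parameter is the
   address of an int, so the value is reached by dereferencing it. */
static int read_ref_parm(const int *the_parm)
{
    int value = *the_parm;   /* like: mov rax, theParm / mov edx, [rax] */
    printf("Value of parameter: %d\n", value);
    return value;
}
```

Calling read_ref_parm(&value1) passes only the 64-bit address of value1, just as asmMain stores the result of lea on the stack before calling RefParm.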

Passing large objects, like arrays and records, is where using reference parameters becomes efficient. When passing these objects by value, the calling code has to make a copy of the actual parameter; if it is a large object, the copy process can be inefficient. Because computing the address of a large object is just as efficient as computing the address of a small scalar object, no efficiency is lost when passing large objects by reference.

; Passing a large object by reference

        option  casemap:none

nl          =       10
NumElements =       24

Pt          struct
x           byte    ?
y           byte    ?
Pt          ends



            .const
fmtStr1     byte    "RefArrayParm[%d].x=%d ", 0
fmtStr2     byte    "RefArrayParm[%d].y=%d", nl, 0
        
            .data
index       dword   ?
Pts         Pt      NumElements dup ({})
        
            .code
            externdef printf:proc
            
ptArray     equ     <[rbp+16]> 
RefAryParm  proc
            push    rbp
            mov     rbp, rsp
            
            mov     rdx, ptArray
            xor     rcx, rcx        ;RCX = 0
            
; while ecx < NumElements, initialize each
; array element. x = ecx/8, y=ecx % 8

ForEachEl:  cmp     ecx, NumElements
            jnl     LoopDone
            
            mov     al, cl
            shr     al, 3   ;AL = ecx / 8
            mov     [rdx][rcx*2].Pt.x, al
            
            mov     al, cl
            and     al, 111b ;AL = ecx % 8
            mov     [rdx][rcx*2].Pt.y, al
            inc     ecx
            jmp     ForEachEl
                        
LoopDone:   leave
            ret
RefAryParm  endp

; Here is the "asmMain" function.

        
            public  asmMain
asmMain     proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40
        
; Initialize the array of points:

            lea     rax, Pts
            mov     [rsp], rax      ;Store address on stack
            call    RefAryParm

; Display the array:
            
            mov     index, 0
dispLp:     cmp     index, NumElements
            jnl     dispDone
            
            lea     rcx, fmtStr1
            mov     edx, index              ;zero extends!
            lea     r8, Pts                 ;Get array base
            movzx   r8, [r8][rdx*2].Pt.x    ;Get x field
            call    printf
            
            lea     rcx, fmtStr2
            mov     edx, index              ;zero extends!
            lea     r8, Pts                 ;Get array base
            movzx   r8, [r8][rdx*2].Pt.y    ;Get y field
            call    printf
            
            inc     index
            jmp     dispLp
            
            
; Clean up, as per Microsoft ABI:

dispDone:
            leave
            ret     ;Returns to caller
        
asmMain     endp
            end
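
The same technique can be sketched in C, assuming a struct with two one-byte fields like Pt. The names init_points and NUM_ELEMENTS are illustrative only. The callee receives the array's address and writes through it, so the array itself is never copied.

```c
#include <stddef.h>

#define NUM_ELEMENTS 24

/* Two one-byte fields, like the Pt struct in the listing. */
struct Pt {
    unsigned char x;
    unsigned char y;
};

/* Only the array's address is passed; the callee writes through it,
   so no copy of the array is made. */
static void init_points(struct Pt *pts, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        pts[i].x = (unsigned char)(i / 8);   /* like: shr al, 3    */
        pts[i].y = (unsigned char)(i % 8);   /* like: and al, 111b */
    }
}
```

The cost of the call is the same whether the array holds 24 elements or 24 million, because only a pointer is passed.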

Functions

A procedure is a sequence of machine instructions that fulfills a task; the result of executing it is the accomplishment of that task. A function, on the other hand, executes a sequence of machine instructions specifically to compute a value to return to the caller.
The x86-64’s registers are the most common place to return function results. The strlen() routine in the C Standard Library is a good example of a function that returns a value in one of the CPU’s registers. It returns the length of the string (whose address you pass as a parameter) in the RAX register.
You could return function results in any register if it is more convenient to do so. Of course, if you’re calling a Microsoft ABI–compliant function, you have no choice but to expect the function’s return result in the RAX register.
For values slightly larger than 64 bits (for example, 128 bits or maybe even as many as 256 bits), you can split the result into pieces and return those parts in two or more registers. It is common to see functions returning 128-bit values in the RDX:RAX register pair. Of course, the XMM/YMM registers are another good place to return large values. Just remember that these schemes are not Microsoft ABI–compliant, so they’re practical only when calling code you’ve written.
You can deal with large function return results in two common ways: either pass the return value as a reference parameter or allocate storage on the heap (for example, using the C Standard Library malloc() function) for the object and return a pointer to it in a 64-bit register.
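
The two approaches for large results can be sketched in C as follows. The Matrix4 type and both function names are hypothetical, chosen only to give a result that is clearly too large for a register.

```c
#include <stdlib.h>

/* A result too large for a register (hypothetical example type). */
struct Matrix4 {
    double m[4][4];
};

/* Option 1: the caller supplies storage through a reference parameter. */
static void identity_into(struct Matrix4 *result)
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            result->m[r][c] = (r == c) ? 1.0 : 0.0;
}

/* Option 2: allocate on the heap and return the pointer, which fits
   in a 64-bit register. The caller must free() it; NULL means failure. */
static struct Matrix4 *identity_new(void)
{
    struct Matrix4 *result = malloc(sizeof *result);
    if (result != NULL)
        identity_into(result);
    return result;
}
```

The first form leaves the storage decision to the caller, who can use a local or static variable. The second is convenient when the caller cannot know the size in advance, at the price of a heap allocation and an explicit free().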

Recursion

Recursion occurs when a procedure calls itself. The following listing implements the quicksort algorithm recursively:

; Recursive quicksort

        option  casemap:none

nl          =       10
numElements =       10


            .const
fmtStr1     byte    "Data before sorting: ", nl, 0
fmtStr2     byte    "%d "   ;Use nl and 0 from fmtStr3
fmtStr3     byte    nl, 0
fmtStr4     byte    "Data after sorting: ", nl, 0

        
            .data
theArray    dword   1,10,2,9,3,8,4,7,5,6
        
            .code
            externdef printf:proc
            
; quicksort-
;
;  Sorts an array using the quicksort algorithm.
;
; Here's the algorithm in C, so you can follow along:
;
; void quicksort(int a[], int low, int high)
; {
;     int i,j,Middle;
;     if( low < high)
;     {
;         Middle = a[(low+high)/2];
;         i = low;
;         j = high;
;         do
;         {
;             while(a[i] < Middle) i++;
;             while(a[j] > Middle) j--;
;             if( i <= j)
;             {
;                 swap(a[i],a[j]);
;                 i++;
;                 j--;
;             }
;         } while( i <= j );
;  
;         // recursively sort the two sub arrays
;
;         if( low < j ) quicksort(a,low,j);
;         if( i < high) quicksort(a,i,high);
;     }
; }
;
; Args:
;    RCX (_a):      Pointer to array to sort
;    RDX (_lowBnd): Index to low bound of array to sort
;    R8 (_highBnd): Index to high bound of array to sort    

_a          equ     [rbp+16]        ;Ptr to array
_lowBnd     equ     [rbp+24]        ;Low bounds of array
_highBnd    equ     [rbp+32]        ;High bounds of array

; Local variables (register save area).
; A recursive call reserves no shadow storage, so
; every save slot lives in this procedure's own frame.

saveRDI     equ     [rbp-8]
saveRSI     equ     [rbp-16]
saveRBX     equ     [rbp-24]
saveRAX     equ     [rbp-32]
saveR9      equ     [rbp-40]

; Within the procedure body, these registers
; have the following meaning:
;
; RCX: Pointer to base address of array to sort
; EDX: Lower bound of array (32-bit index).
; r8d: Higher bound of array (32-bit index).
;
; edi: index (i) into array.
; esi: index (j) into array.
; r9d: Middle element to compare against

quicksort   proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40         ;Five 8-byte save slots
	            
; This code doesn't mess with RCX. No
; need to save it. When it does mess
; with RDX and R8, it saves those registers
; at that point.

; Preserve other registers we use:

            mov     saveRAX, rax
            mov     saveRBX, rbx
            mov     saveRSI, rsi
            mov     saveRDI, rdi
            mov     saveR9, r9
            
            mov     edi, edx        ;i=low
            mov     esi, r8d        ;j=high

; Compute a pivotal element by selecting the
; physical middle element of the array.
        
            lea     rax, [rsi+rdi*1]  ;RAX=i+j
            shr     rax, 1            ;(i+j)/2
            mov     r9d, [rcx][rax*4] ;Middle = ary[(i+j)/2]
                    

; Repeat until the edi and esi indexes cross one
; another (edi works from the start towards the end
; of the array, esi works from the end towards the
; start of the array).

rptUntil:
        
; Scan from the start of the array forward
; looking for the first element greater than
; or equal to the middle element.
            
            dec     edi     ;to counteract inc, below
while1:     inc     edi     ;i = i + 1
            cmp     r9d, [rcx][rdi*4] ;While middle > ary[i]
            jg      while1

; Scan from the end of the array backwards looking
; for the first element that is less than or equal
; to the middle element.

            inc     esi     ;To counteract dec, below
while2:     dec     esi     ;j = j - 1
            cmp     r9d, [rcx][rsi*4]       ;while Middle < ary[j]
            jl      while2            
            
            
; If we've stopped before the two pointers have
; passed over one another, then we've got two
; elements that are out of order with respect
; to the middle element, so swap these two elements.
            
            cmp     edi, esi ;If i <= j
            jnle    endif1            
           
            mov     eax, [rcx][rdi*4] ;Swap ary[i] and ary[j]
            mov     ebx, [rcx][rsi*4] ;EBX is the temporary, so R9D
            mov     [rcx][rsi*4], eax ; keeps the middle value for
            mov     [rcx][rdi*4], ebx ; the remaining passes
            
            inc     edi     ;i = i + 1
            dec     esi     ;j = j - 1
                
endif1:     cmp     edi, esi ;Until i > j
            jng     rptUntil
        
; We have just placed all elements in the array in
; their correct positions with respect to the middle
; element of the array. So all elements at indexes
; greater than the middle element are also numerically
; greater than this element. Likewise, elements at
; indexes less than the middle (pivotal) element are
; now less than that element. Unfortunately, the
; two halves of the array on either side of the pivotal
; element are not yet sorted. Call quicksort recursively
; to sort these two halves if they have more than one
; element in them (if they have zero or one elements, then
; they are already sorted).
        
            cmp     edx, esi ;if lowBnd < j
            jnl     endif2

            ; Note: a is still in RCX,
            ; Low is still in RDX
            ; Need to preserve R8 (High)
            ; Note: quicksort doesn't require stack alignment
                
            push    r8
            mov     r8d, esi    
            call    quicksort ;( a, Low, j )
            pop     r8
            
endif2:     cmp     edi, r8d ;if i < High
            jnl     endif3

            ; Note: a is still in RCX,
            ; High is still in R8d
            ; Need to preserve RDX (low)
            ; Note: quicksort doesn't require stack alignment
              
            push    rdx
            mov     edx, edi  
            call    quicksort ;( a, i, High )
            pop     rdx

; Restore registers and leave:
            
endif3:
            mov     rax, saveRAX 
            mov     rbx, saveRBX 
            mov     rsi, saveRSI 
            mov     rdi, saveRDI 
            mov     r9, saveR9
            leave
            ret  
quicksort   endp
    


; Little utility to print the array elements:

printArray  proc
            push    r15
            push    rbp
            mov     rbp, rsp
            sub     rsp, 40 ;Shadow parameters
            
            lea     r9, theArray
            mov     r15d, 0
whileLT10:  cmp     r15d, numElements
            jnl     endwhile1
            
            lea     rcx, fmtStr2
            lea     r9, theArray
            mov     edx, [r9][r15*4]
            call    printf
            
            inc     r15d
            jmp     whileLT10

endwhile1:  lea     rcx, fmtStr3
            call    printf
            leave
            pop     r15
            ret
printArray  endp

; Here is the "asmMain" function.

        
            public  asmMain
asmMain     proc
            push    rbp
            mov     rbp, rsp
            sub     rsp, 32   ;Shadow storage
        
; Display unsorted array:

            lea     rcx, fmtStr1
            call    printf
            call    printArray
            

; Sort the array

            lea     rcx, theArray
            xor     rdx, rdx                ;low = 0
            mov     r8d, numElements-1      ;high= 9
            call    quicksort ;(theArray, 0, 9)
            
; Display sorted results:

            lea     rcx, fmtStr4
            call    printf
            call    printArray   
            
            leave
            ret     ;Returns to caller
        
asmMain     endp
            end

The quicksort function calls no procedure other than itself, and nothing it executes depends on stack alignment. Therefore, it doesn’t need to keep the stack aligned on a 16-byte boundary across its recursive calls.
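
As a cross-check, here is a runnable C sketch that follows what the assembly instructions do: strict comparisons in both scans, and recursion on (low, j) and (i, high). The helper swap_ints is hypothetical and stands in for the four mov instructions that exchange the two elements.

```c
/* Swap two ints through pointers. */
static void swap_ints(int *p, int *q)
{
    int tmp = *p;
    *p = *q;
    *q = tmp;
}

/* Sorts a[low..high] (inclusive bounds) in ascending order. */
static void quicksort(int a[], int low, int high)
{
    if (low >= high)
        return;

    int middle = a[low + (high - low) / 2];   /* pivot value */
    int i = low;
    int j = high;

    do {
        while (a[i] < middle) i++;   /* first element >= pivot */
        while (a[j] > middle) j--;   /* last element <= pivot  */
        if (i <= j) {
            swap_ints(&a[i], &a[j]);
            i++;
            j--;
        }
    } while (i <= j);

    /* Each call gets its own low, high, i, and j, just as each
       activation of the assembly version gets its own stack frame. */
    if (low < j)  quicksort(a, low, j);
    if (i < high) quicksort(a, i, high);
}
```

Note that the pivot is held as a value in its own variable for the whole partitioning pass, which is the role R9D plays in the assembly version.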

Procedure Pointers

The x86-64 call instruction takes one of three forms:

1. call proc_name ; Direct call to procedure proc_name

2. call reg64     ; Indirect call to procedure whose address
                  ; appears in the reg64

3. call qwordVar  ; Indirect call to the procedure whose address
                  ; appears in the qwordVar quad-word var

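Each of the following three sequences calls proc_name:

call proc_name    ; Direct call

mov rax, offset proc_name ; Load the address, then call through RAX
call rax

lea rax, proc_name ; Same result, loading the address with lea
call rax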

p proc
 .
 .
 .
p endp
 .
 .
 .
 
                 .data
                ptrToP qword offset p

                 .
                 .
                 .
 
                 call ptrToP ; Calls p if ptrToP has not changed
 
 
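; A procedure pointer can also be changed at run time. The next
; lines assume ProcPointer is a qword variable declared like
; ptrToP, and that q is another procedure.

; Reload ProcPointer with the address of q.
                lea rax, q
                mov ProcPointer, rax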
                
                .
                .
                .
 
                call ProcPointer ; This invocation calls q
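
In C, the same mechanism is a function pointer variable. The sketch below is only an analogy, with hypothetical functions p and q that return different values so the effect of reassigning the pointer is visible.

```c
static int p(void) { return 1; }   /* hypothetical procedure p */
static int q(void) { return 2; }   /* hypothetical procedure q */

/* Like "ptrToP qword offset p": a variable holding p's address. */
static int (*proc_pointer)(void) = p;

/* Like "call ptrToP": an indirect call through the variable. */
static int call_through_pointer(void)
{
    return proc_pointer();
}
```

After the assignment proc_pointer = q, the same call site reaches q instead of p, which mirrors reloading the pointer variable with lea and mov.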

Procedure Parameters

One place where procedure pointers are invaluable is in parameter lists. Selecting one of several procedures to call by passing the address of a procedure is a common operation.
When using parameter lists with the MASM proc directive, you can specify a procedure pointer type by using the proc type specifier:

procWithProcParm proc parm1:word, procParm:proc

.
.
.

call procParm
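
A C sketch of the same pattern, assuming a caller that wants to pick the operation applied to each element of an array. All names here (add_one, negate, apply_to_each, proc_parm) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical operations a caller can choose between. */
static int add_one(int v) { return v + 1; }
static int negate(int v)  { return -v; }

/* proc_parm plays the role of procParm: the caller decides
   which procedure runs on each element. */
static void apply_to_each(int *values, size_t count, int (*proc_parm)(int))
{
    for (size_t i = 0; i < count; i++)
        values[i] = proc_parm(values[i]);   /* like: call procParm */
}
```

The loop is written once, and callers select the behavior by passing add_one, negate, or any other function with a matching signature.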

Saving the State of the Machine

callsFuncs proc
saveRAX textequ <[rbp-8]>
saveRBX textequ <[rbp-16]>
saveRCX textequ <[rbp-24]>
        push rbp
        mov rbp, rsp
        sub rsp, 48 ; Make room for locals and parms
        mov saveRAX, rax ; Preserve registers in
        mov saveRBX, rbx ; local variables
        mov saveRCX, rcx
        
        .
        .
        .
        
        mov [rsp], rax ; Store parm1
        mov [rsp+8], rbx ; Store parm2
        mov [rsp+16], rcx ; Store parm3
        call theFunction
        
        .
        .
        .
        
        mov rcx, saveRCX ; Restore registers
        mov rbx, saveRBX
        mov rax, saveRAX
        leave ; Deallocate locals
        ret
callsFuncs endp

