Some Assembly Required - The initial stack, reading process arguments (and environment variables)

I wrote a 32-bit assembly application a while ago which performed the simple task of printing out the program arguments and then the environment variables. Most people have seen a C-style main function:

int main(int argc, char* argv[]) {
 //...
}

Wikipedia tells me that “Unix (though not POSIX.1) and Windows have a third argument giving the program’s environment”, which takes the following form:

int main(int argc, char *argv[], char *envp[]) {
 //...
}

See libc_start_main.c at lines 58-67 for the mechanics of how the signature is varied between the two forms. Depending on which form is implemented, the stack frame looks different because a third argument is pushed onto it. However, we’re not going to rely on the glibc code to invoke a main function. Instead, we’re going to implement a global function called _start and see what we’re given, which is a level below what you’d see in a C-style main function.
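As a minimal sketch of what the second form gives you, the helpers below walk a pointer vector such as argv or envp. The names count_entries and print_entries are my own (they are not glibc functions), and the only assumption is that the vector ends with a null pointer, as both argv and envp do.

```c
#include <stddef.h>
#include <stdio.h>

/* Count the entries in a NULL-terminated pointer vector (argv or envp). */
size_t count_entries(char *const vec[])
{
    size_t n = 0;
    while (vec[n] != NULL)
        n++;
    return n;
}

/* Print each entry as "label[i]: value" and return how many were printed. */
size_t print_entries(FILE *out, const char *label, char *const vec[])
{
    size_t i;
    for (i = 0; vec[i] != NULL; i++)
        fprintf(out, "%s[%zu]: %s\n", label, i, vec[i]);
    return i;
}
```

Called from a three-argument main, print_entries(stdout, "argv", argv) followed by print_entries(stdout, "envp", envp) lists both vectors, and count_entries(argv) should agree with argc.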

The story with `x86` code

The second form of the main function appears unusual for a programmer who is used to writing software in Java but gives an indication as to how a program is initialised and receives its program arguments. If you do a search for “System V ABI i386” you’ll no doubt find a document whose copyright is asserted by both the “Santa Cruz Operation, Inc” and AT&T. I’ve been looking at the fourth edition, and on page 54, section 3-28, it shows the initial process stack. A helpful diagram shows something a bit like this:

| Stack reference | Contents |
|---|---|
| (High addresses) | Unspecified |
| | Information block, including argument strings, environment strings, auxiliary information ... (size varies) |
| | Unspecified |
| | Null auxiliary vector entry |
| | Auxiliary vector ... (2-word entries) |
| | Null word |
| | Environment pointers ... (one word each) |
| | Null word |
| 4(%esp) | Argument pointers ... (argc words) |
| 0(%esp) | Argument count (argc) |
| (Low addresses) | Undefined |

In the document the term word refers to a 32-bit value, which unhelpfully collides with what you come to accept as a “word” if you do any assembly programming (i.e. you expect it to refer to a 16-bit value).

Just to ensure that everyone’s on the same page, it’s worth mentioning that the stack starts at a high address and “grows” downwards. As you add a stack frame for nested function invocations, the value of the stack pointer in %esp (or %rsp for 64-bit code) decreases.

I’ll quote directly from the document at this point:

“Argument strings, environment strings, and the auxiliary information appear in no specific order within the information block; the system makes no guarantees about their arrangement. The system also may leave an unspecified amount of memory between the null auxiliary vector entry and the beginning of the information block.”

The ABI goes into some detail about the structure of the auxiliary vector entries (each one is an 8-byte structure containing a type and a value or pointer).
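To make the 8-byte entry concrete, here is a rough model of one 32-bit entry and a search over a vector of them. The struct and function names are hypothetical, chosen for illustration. The sketch assumes only what is described above: a 4-byte type followed by a 4-byte value or pointer, with a type of zero marking the null entry that ends the vector.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit auxiliary vector entry: 8 bytes in total. */
struct aux_entry32 {
    uint32_t type;   /* entry type; 0 marks the null (terminating) entry */
    uint32_t value;  /* integer value or 32-bit pointer, depending on type */
};

/* Return the first entry with the given type, or NULL if the null entry
 * is reached first. Searching for type 0 therefore always returns NULL. */
const struct aux_entry32 *find_aux(const struct aux_entry32 *vec, uint32_t type)
{
    for (; vec->type != 0; vec++) {
        if (vec->type == type)
            return vec;
    }
    return NULL;
}
```

The loop never reads past the null entry, which is why the ABI’s terminating entry matters: there is no separate length field to rely on.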

On the basis that a picture is worth a thousand words, I’m going to borrow another diagram from the document (hopefully not so slavishly that SCO sue me):

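| Address | Contents | Region |
|---|---|---|
| 0x8047fff | pad | High addresses |
| 0x8047ff0 | PATH=/usr/bin: (null-terminated) | Information block |
| 0x8047fe1 | HOME=/home/dir (null-terminated) | |
| 0x8047fdd | abi (null-terminated) | |
| 0x8047fd8 | echo (null-terminated) | |
| 0x8047fc8 | Auxiliary vector entries, ending with the null entry | Auxiliary vector |
| 0x8047fc4 | 0 | |
| 0x8047fc0 | 0x8047ff0 | Environment vector |
| 0x8047fbc | 0x8047fe1 | |
| 0x8047fb8 | 0 | |
| 0x8047fb4 | 0x8047fdd | Argument vector |
| 0x8047fb0 | 0x8047fd8 | |
| 0(%esp), 0x8047fac | 2 | Argument count |
| | Undefined | Low addresses |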

I hope this helps, it took ages to type it in! You can see how the two argument pointers point to the first byte of their argument values, which are null-terminated. To reach the environment pointers you have to step over the argument count (one word), the argument pointers (argc words) and the null word between the argument vector and the environment vector. We can write this using indexed addressing, which in 32-bit code (each word being four bytes) is

8(%esp, _argc_, 4)

or slightly more readably

8 + %esp + (_argc_ * 4)

In this example argc is 2, so that makes 8 + 0x8047fac + 8, which is 0x8047fbc.
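The same arithmetic can be checked in a few lines of C. This is only a sketch of the address calculation for the 32-bit layout, with function names of my own choosing; it does not touch a real stack.

```c
#include <stdint.h>

/* Address of the slot holding argv[index] on a 32-bit initial stack:
 * skip the argc word (4 bytes), then step 4 bytes per argument pointer. */
uint32_t arg_slot(uint32_t esp, uint32_t index)
{
    return esp + 4 + index * 4;
}

/* Address of the slot holding the first environment pointer:
 * skip the argc word (4 bytes), argc argument pointers (4 bytes each)
 * and the null word that ends the argument vector (4 bytes). */
uint32_t first_env_slot(uint32_t esp, uint32_t argc)
{
    return esp + 8 + argc * 4;
}
```

With the stack pointer from the example, first_env_slot(0x8047fac, 2) gives 0x8047fbc, matching the hand calculation.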

The program arguments in this example are echo and abi, and the environment variables are HOME=/home/dir and PATH=/usr/bin:. Interestingly, without the benefit of glibc start-up code, it’s necessary to scan through the environment vector until the null word is detected in order to access the auxiliary vector.
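That scan is short enough to sketch in C. The function below treats the initial stack as an array of pointer-sized words, starting at the argument count. The name find_auxv is hypothetical, and the sketch assumes the layout described above: argc, the argument pointers, a null word, the environment pointers, another null word, then the auxiliary vector.

```c
#include <stdint.h>

/* Given a pointer to the argc word of an initial process stack, return a
 * pointer to the first word of the auxiliary vector. */
const uintptr_t *find_auxv(const uintptr_t *sp)
{
    uintptr_t argc = sp[0];
    /* Skip the argc word, argc argument pointers and their null word. */
    const uintptr_t *envp = sp + 1 + argc + 1;

    while (*envp != 0)  /* scan the environment pointers */
        envp++;
    return envp + 1;    /* step over the null word that ends them */
}
```

There is no count for the environment pointers, so unlike the argument vector they cannot be skipped with a single address calculation.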

As an aside, it seems that glibc initialises the hidden variables __libc_argc, __libc_argv and the readable __environ in the _init function. I don’t know from where they’re later accessed, but they’re declared as follows:

/* Remember the command line argument and enviroment contents for  
   later calls of initializers for dynamic libraries. */  
int __libc_argc attribute_hidden;  
char **__libc_argv attribute_hidden;

The story with `x86_64` code

The x86_64 System V ABI defines a different function calling convention, so when I converted a simple 32-bit piece of assembly to do the same job using 64-bit instructions and registers, I was somewhat surprised to find that the arguments to the _start function are not passed in the registers %rdi, %rsi and %rdx, but on the stack, in the same way as for 32-bit code.

The ABI provides the following table in the PDF:

Figure 3.9 Initial Process Stack

| Purpose | Start Address | Length |
|---|---|---|
| Unspecified | High Addresses | |
| Information block, including argument strings, environment strings, auxiliary information … | | varies |
| Unspecified | | |
| Null auxiliary vector entry | | 1 eightbyte |
| Auxiliary vector entries … | | 2 eightbytes each |
| `0` | | eightbyte |
| Environment pointers … | | 1 eightbyte each |
| `0` | `8+8*argc+%rsp` | eightbyte |
| Argument pointers … | `8+%rsp` | `argc` eightbytes |
| Argument count | `%rsp` | eightbyte |
| Undefined | Low Addresses | |

In other words, allowing for the difference in size between a pointer on the two different architectures, the initial process stack is the same on both. It’s no surprise that process arguments, environment variables and auxiliary vector are stored there. It took some thought to realise why argc and the pointer to argv[0] are passed on the stack, rather than in %rdi and %rsi: permanence. If the values were passed in the registers, they would be ephemeral at best.
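To tie the table back to the C-style main signature, here is a sketch of how start-up code might turn the stack pointer into argc, argv and envp. The struct and function names are hypothetical and this is not glibc’s implementation. It assumes that every stack slot is pointer-sized, as the eightbytes are on x86_64, and that the argument count can be read through a pointer-typed slot.

```c
#include <stdint.h>

/* Hypothetical bundle of the three values a C-style main expects. */
struct start_args {
    long   argc;
    char **argv;
    char **envp;
};

/* Split the initial stack (sp points at the argument-count slot). */
struct start_args split_initial_stack(char **sp)
{
    struct start_args a;

    a.argc = (long)(uintptr_t)sp[0];  /* argument count at %rsp */
    a.argv = sp + 1;                  /* argument pointers start at 8+%rsp */
    a.envp = sp + 1 + a.argc + 1;     /* past argv and the null at 8+8*argc+%rsp */
    return a;
}
```

Nothing is copied: argv and envp point straight into the initial stack, which is what keeps them valid for the lifetime of the process.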

Finally, some assembly

Anyway, here’s one I made earlier. It prints the program arguments in order using the standard library’s printf function, before printing each environment variable the same way in pointer-order.

envvars.s

.section .data

argc_str:
     .asciz    "argc: %d\n"
argv_str:
     .asciz    "argv[%d]: %s\n"
env_str:
     .asciz    "env: %s\n"

.section .text
 .globl _start

_start:
 #
 # Application prologue. See page 29 in the System V 64-bit ABI 
 #
 movq       %rsp, %rbp         # Store the stack-pointer in RBP
 
#
# Print the number of command-line arguments
#
 # Function:
 #     printf
 # Parameters:
 #     RDI: address of the format-string $argc_str
 #     RSI: the number of command-line arguments (argc)
 #      AL: the number of vector registers used by the varargs (none here)
 # Returns:
 #     int (ignored)
 movq       $argc_str, %rdi    # Store the address of the format-string in RDI
 movq       (%rbp), %rsi       # Store the cmd-line arg-count in RSI, by dereferencing RBP
 movq       $0, %rax           # No floating-point varargs, so AL is zero
 call       printf             # Invoke the standard library's printf function

#
# Print each command-line argument
#
 movq       (%rbp), %rcx       # Store the argument count in counter register RCX
 movq       %rcx, %r12         # Copy that value to the (protected) R12 register

.Lprintarg:
 movq       %rcx, %rbx         # Copy the count value to protected register RBX
 
 # Call function:
 #     printf
 # Parameters:
 #     RDI: address of the format-string "argv[%d]: %s\\n"
 #     RSI: 1st value for conversion: index-count
 #     RDX: 2nd value for conversion: address of cmd-line arg
 #      AL: number of vector registers used by the varargs (none here)
 # Returns:
 #     int (ignored)
 movq       $argv_str, %rdi    # Store the address of the format-string in RDI
 movq       %r12, %rsi         # Start from the arg-count in RSI
 subq       %rcx, %rsi         # Subtract counter from arg-count to get the index
                               # Calculate the pointer's address and store it in RDX
 leaq       0x8(%rbp, %rsi, 0x8), %rdx
 movq       (%rdx), %rdx       # Dereference that pointer to get the parameter's address
 movq       $0x0, %rax         # No floating-point varargs: 'hidden' AL parameter is zero
 
 call       printf             # Invoke printf

 movq       %rbx, %rcx         # Restore the counter from the protected RBX register

 loop       .Lprintarg         # Decrement RCX and loop again if not zero

#
# Print each environment variable
#
.Largsfinished:
                               # Calculate the base address of the env-vars, which is:
                               # %rbp + (8 * argc) + 16
 leaq       0x10(%rbp, %r12, 0x8), %r12 
 testq      $-0x01, (%r12)     # AND the value at (R12) with -1: ZF is set if it is zero (footnote 1)
 jz         .Lexit

.Lprintenv:

 movq       $env_str, %rdi     # Store address of the format-string in RDI
 movq       (%r12), %rsi       # Store pointer to env-var in RSI
 movq       $0x0, %rax         # No vector registers used: 'hidden' parameter in AL is zero

 call       printf

 addq       $0x8, %r12         # Step to the next pointer 
 testq      $-0x1, (%r12)      # Test to see whether it is zero (footnote 1)
 jne        .Lprintenv         # Jump if not zero to print the next variable

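.Lexit:
 movq       $0x0, %rdi         # NULL argument: flush every open output stream
 call       fflush             # The raw syscall below would skip stdio's buffers
 movq       $0x3C, %rax        # index of sys_exit
 movq       $0x0, %rdi         # exit status
 syscall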
