x86-64 assembly for noobs: Let's say hello world 🗣️


🗣️ Wake up babe, we're gonna say hello world in x86 assembly today. Huh, assembly, seriously? Yes, assembly - the closest human-readable thing to what the hardware understands. Less gooo 🏃🏃🏃…


What is assembly?

In ancient times, machine code was very hard for humans to read; it was terrible. To save us and to simplify programming, Goddess Kathleen Booth, the mother of assembly language, developed a more human-readable representation of machine code, which reduced errors and made coding complex computations more efficient.


[Image: akshually cat]

Uhm akshually, assembly language is not one particular language but a family of languages. The syntax depends on the instruction set architecture (ISA), which is defined by the processor designers. Each ISA has its own assembly language. Some popular ISAs are x86, ARM, and RISC-V. In this blog we are targeting the x86 architecture, so we'll be looking at x86 assembly language.


Terminologies

The title says x86-64 assembly; what does that mean? Well, x86 is the name of the architecture family and 64 means we will be using its 64-bit version. Let's talk about some other words too.

  • Section: Assembly programs are split into sections. Each section tells the assembler and linker what kind of content it holds. Some common sections are .data, .bss, and .text. The .data section holds initialized data, .bss reserves space for uninitialized data, and .text holds the code.
  • Registers: Small storage locations built directly into the CPU. They are tiny but very fast (faster than RAM). x86-64 has 16 general purpose registers named rax, rbx, rcx, rdx, rbp, rsp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15.
  • Syscall: The way a program asks the operating system to do something for it, like read, write, open, exit, delete etc.
  • Assembler: An assembler's job is to translate assembly code into machine code. Some popular x86 assemblers are nasm and fasm.
  • Linker: A linker takes object files generated by a compiler or assembler and combines them with any other required libraries to create a single executable file.
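To make sections and registers a bit more concrete, here is a minimal sketch of a program that uses all three sections. The labels answer and scratch are names I made up for illustration, and the program prints nothing; it just moves a value through a register and exits. It assumes the same nasm + ld setup on Linux used in the rest of this post.

```nasm
section .data                   ; initialized data
        answer  dq 42           ; one 8-byte value, initialized to 42

section .bss                    ; uninitialized data (zero-filled when the program loads)
        scratch resb 64         ; reserve 64 bytes, no initial value given

section .text                   ; code
        global _start

_start:
        mov     rax, [answer]   ; load the 8-byte value stored at answer into rax
        mov     [scratch], rax  ; store it into the first 8 bytes of scratch

        mov     rax, 60         ; syscall number 60 is exit
        mov     rdi, 0          ; exit status 0
        syscall
```

The square brackets mean "the memory at this address"; without them, nasm treats the label as the address itself.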

Environment setup

First of all, we need a Linux machine, whether it's a VM, WSL, or running directly on your PC. We'll be using nasm as the assembler and ld as the linker. To install nasm on Debian/Ubuntu-based distros, run sudo apt install nasm, and for ld, run sudo apt install binutils. If you are using some other distro, you are smart enough to install it on your own.


Navigate to a folder, create a file called hello.asm, and open it with a text editor. (Use nano for assembly, I mean it's so good for asm.)


The code

Let's write the actual code. As we discussed, an asm program has different sections, each with its own role, so we will first define the data section. .data stores initialized data in memory. The second line defines a label named msg and initializes it with the string “hello world!” followed by the ASCII code of the newline character (0Ah), like \n in C. The db directive stands for "define byte".

section .data
        msg db "hello world!",0Ah
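Counting the bytes of the message by hand is easy to get wrong, so here is a minimal sketch of letting nasm compute the length for us. The name msg_len is my own placeholder, not part of the code in this post.

```nasm
section .data
        msg     db "hello world!", 0Ah  ; 12 characters plus a newline = 13 bytes
        msg_len equ $ - msg             ; $ is the current position, so $ - msg is 13
```

Since equ defines an assemble-time constant, you could then write mov rdx, msg_len instead of hard-coding the number, and the length stays correct if you change the message.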

Let's move to the .text section. This section stores the code of the program. With global _start we declare _start as global so that it is visible outside the current assembly module, allowing the linker to find the program entry point.

section .text
        global _start

Here comes the _start label, where execution begins. We are using Linux syscalls provided by the kernel. The C prototype of the write syscall looks like this: ssize_t write(int fd, const void *buf, size_t count);. The parameters are:

  • fd: File descriptor to which we want the data to be written (1 for stdout).
  • buf: Pointer to the buffer containing the data (in our case, msg).
  • count: Number of bytes to write from the buffer (13: the 12 characters of “hello world!” plus the newline).

I found this very neat Linux kernel syscall table on Hacker News; please check it out. Finally, to exit, we call the exit syscall by setting rax to 60 (the syscall number) and rdi to 0 (the exit status).

_start:
        ; syscall to write to the console
        mov     rax, 1   ; syscall number 1 is write
        mov     rdi, 1   ; file descriptor 1 is stdout
        mov     rsi, msg ; load the address of msg into rsi
        mov     rdx, 13  ; buffer length: 12 characters plus the newline
        syscall          ; hand control to the kernel

        ; syscall to exit
        mov     rax, 60  ; syscall number 60 is exit
        mov     rdi, 0   ; exit status 0
        syscall

Here’s the overall code with all sections combined.

section .data
        msg db "hello world!",0Ah ; define message string hello world

section .text           ; indicate the start of the code section
        global _start   ; declare _start label as global so the linker can find it

_start:
        ; syscall to write to the console
        mov     rax, 1   ; syscall number 1 is write
        mov     rdi, 1   ; file descriptor 1 is stdout
        mov     rsi, msg ; load the address of msg into rsi
        mov     rdx, 13  ; buffer length: 12 characters plus the newline
        syscall          ; hand control to the kernel

        ; syscall to exit
        mov     rax, 60  ; syscall number 60 is exit
        mov     rdi, 0   ; exit status 0
        syscall

To assemble, link, and run it, do:

nasm -f elf64 -o hello.o hello.asm
ld -o hello hello.o
./hello

You should see “hello world!” printed on your terminal. Yo we did it 🕺🕺🕺🥳🥳🥳


So, first, nasm assembled the code into an object file (hello.o). Then, ld linked it to create an executable (hello). Finally, ./hello ran the program, which prints “hello world!” to the console.
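If you want to poke at it a bit more, here is a small variation, as a sketch. On x86-64 Linux the syscall number goes in rax and the first three arguments go in rdi, rsi and rdx; the kernel puts the result back in rax. For write, that result is the number of bytes written (or a negative error code). This version passes that return value to exit as the status code. The name msg_len is my own placeholder.

```nasm
section .data
        msg     db "hello world!", 0Ah  ; the message plus a newline
        msg_len equ $ - msg             ; length computed by nasm at assemble time

section .text
        global _start

_start:
        mov     rax, 1          ; syscall number 1 is write
        mov     rdi, 1          ; file descriptor 1 is stdout
        mov     rsi, msg        ; address of the buffer
        mov     rdx, msg_len    ; number of bytes to write
        syscall                 ; on return, rax = bytes written or a negative error code

        mov     rdi, rax        ; use the return value of write as the exit status
        mov     rax, 60         ; syscall number 60 is exit
        syscall
```

Assemble and link it the same way as before, then run echo $? right after ./hello. If the whole message was written, the shell should report the message length as the exit status. Note that the syscall instruction itself overwrites rcx and r11, so don't keep anything important in those two across a syscall.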


Also some useful resources if you wanna learn more about assembly or about how computers work in general.


I hope you liked this article. If you wanna say Hi or anything, here’s my Telegram @ashirbadtele or you can mail me at ashirbadreal@proton.me.