x86-64 assembly for noobs: Let’s say hello world 🗣️
🗣️ Wake up babe, we’re gonna say hello world in x86 assembly today. Huh, assembly, seriously? Yes, assembly: the closest human-readable language to what the hardware actually runs. Less gooo 🏃🏃🏃…
What is assembly?
In ancient times programmers had to write raw machine code, which was terribly hard for humans to read. To save us and simplify programming, Goddess Kathleen Booth, the mother of assembly language, developed a more human-readable representation of machine code, which reduced errors and made coding complex computations more efficient.
Uhm akshually, assembly language is not one particular language but a family of languages. The instructions and registers you can use depend on the instruction set architecture (ISA), which is defined by the processor designers. Each ISA has its own assembly language. Some popular ISAs are x86, ARM and RISC-V. In this blog we are targeting the x86 architecture, so we’ll be looking at x86 assembly language.
Terminology
The title says x86-64 assembly, so what does that mean? Well, x86 is the name of the architecture and 64 means we are using its 64-bit version. Let’s talk about some other terms too.
- Section: Assembly programs are divided into sections, which tell the assembler what kind of content follows. Some common sections are .data, .bss and .text. The .data section is used to declare initialized data and constants, .bss is used to reserve space for uninitialized data, and .text holds the code.
- Registers: Small storage locations built directly into the CPU. They are tiny but very fast (faster than RAM). x86-64 has 16 general purpose registers named rax, rbx, rcx, rdx, rbp, rsp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15.
- Syscall: The way a program asks the operating system to do something for it, like read, write, open, exit, delete etc.
- Assembler: An assembler’s job is to translate assembly code into machine code. Some popular x86 assemblers are nasm and fasm.
- Linker: A linker takes object files generated by a compiler or assembler and combines them with any other required libraries to create a single executable file.
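To make the three sections concrete, here is a minimal skeleton, assuming NASM syntax on 64-bit Linux. The labels greeting and scratch are made up for illustration, and the program does nothing except exit cleanly.

```nasm
section .data               ; initialized data lives here
    greeting db "hi", 0Ah   ; hypothetical label: 3 bytes ('h', 'i', newline)

section .bss                ; uninitialized data: only space is reserved
    scratch resb 64         ; hypothetical label: reserve 64 bytes

section .text               ; the instructions themselves
    global _start           ; make the entry point visible to the linker
_start:
    mov rax, 60             ; exit syscall number on x86-64 Linux
    mov rdi, 0              ; exit status 0
    syscall                 ; hand control to the kernel
```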
Environment setup
First of all we need a Linux machine, whether it is a VM, WSL or Linux running directly on your PC. We’ll be using nasm as the assembler and ld as the linker. To install nasm on Debian/Ubuntu-based distros run sudo apt install nasm, and for ld run sudo apt install binutils. If you are using some other distro, you are smart enough to install them on your own.
Navigate to a folder, create a file called hello.asm and open it with a text editor. (Use nano for assembly, I mean it’s so good for asm.)
The code
Let’s write the actual code. As we discussed, an asm program has different sections and each section has its own role, so we will define the data section first. .data is used to store initialized data in memory. The second line below defines a variable named msg, initialized with the string “hello world!” followed by the ASCII code of the newline character (0Ah), like \n in C.
section .data
msg db "hello world!",0Ah
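Counting the bytes of a string by hand is easy to get wrong. A common NASM idiom, not used in the original program, is to let the assembler work out the length with equ and the $ symbol (the current address). The name msg_len is one I’m introducing here as an illustration.

```nasm
section .data
    msg     db "hello world!", 0Ah  ; 12 characters plus the newline byte
    msg_len equ $ - msg             ; current address minus start of msg = 13

; Later, in _start, the length can be loaded by name
; instead of typing a number:
;     mov rdx, msg_len
```

Since equ defines an assemble-time constant, msg_len takes no space in the final binary and stays correct if you change the text.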
Let’s move to the .text section. This section is used to store the code of the program. With global _start we declare _start as global so that it is accessible outside the current assembly module, allowing the linker to find the program entry point.
section .text
global _start
Here comes the _start label, where execution begins. We are using Linux syscalls provided by the kernel. In C, the write syscall looks like this: ssize_t write(int fd, const void *buf, size_t count);. The parameters are:
- fd: File descriptor we want the data written to (1 for stdout).
- buf: Pointer to the buffer containing the data (in our case, msg).
- count: Number of bytes to write from the buffer (13 here: the 12 characters of “hello world!” plus the newline).
On x86-64 Linux the syscall number goes in rax and the arguments go in rdi, rsi and rdx, in that order. I found this very neat Linux kernel syscall table on Hacker News, check it out please. And finally, to exit we call the exit syscall by setting rax to 60 and rdi to 0 (the exit status).
_start:
; syscall to write to the console
mov rax, 1 ; set rax to 1 as syscall number of write is 1
mov rdi, 1 ; set rdi to 1 as we want file descriptor stdout
mov rsi, msg ; put the address of the buffer "hello world!", 0Ah in rsi
mov rdx, 13 ; set rdx to the buffer length: 12 characters plus the newline
syscall ; syscall to call kernel
; syscall to exit
mov rax, 60 ; set rax to 60 as exit syscall number is 60
mov rdi, 0 ; set rdi to 0, the exit status
syscall
Here’s the overall code with all sections combined.
section .data
msg db "hello world!",0Ah ; define message string hello world
section .text ; indicate the start of the code section
global _start ; declare _start label as global so the linker can find it
_start:
; syscall to write to the console
mov rax, 1 ; set rax to 1 as syscall number of write is 1
mov rdi, 1 ; set rdi to 1 as we want file descriptor stdout
mov rsi, msg ; put the address of the buffer "hello world!", 0Ah in rsi
mov rdx, 13 ; set rdx to the buffer length: 12 characters plus the newline
syscall ; syscall to call kernel
; syscall to exit
mov rax, 60 ; set rax to 60 as exit syscall number is 60
mov rdi, 0 ; set rdi to 0, the exit status
syscall
To assemble it and run it do:
nasm -f elf64 -o hello.o hello.asm
ld -o hello hello.o
./hello
You should see “hello world!” printed on your terminal. Yo we did it 🕺🕺🕺🥳🥳🥳
So, first, nasm assembled the code into an object file (hello.o). Then ld linked it to create an executable (hello). Finally, ./hello ran the program, which printed “hello world!” to the console.
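If you want to poke at it a bit more, the same pattern works for other calls: syscall number in rax, arguments in rdi, rsi, rdx, and the result comes back in rax. As a rough sketch (not part of the original program), this variant writes to stderr (file descriptor 2) and exits with status 1. The labels err and err_len are hypothetical names, and it should assemble and link with the same nasm and ld commands.

```nasm
section .data
    err     db "something went wrong", 0Ah  ; message plus newline
    err_len equ $ - err                     ; length computed by the assembler

section .text
    global _start
_start:
    mov rax, 1          ; syscall number for write
    mov rdi, 2          ; file descriptor 2 = stderr
    mov rsi, err        ; address of the bytes to write
    mov rdx, err_len    ; number of bytes to write
    syscall             ; rax now holds bytes written, or a negative error code

    mov rax, 60         ; syscall number for exit
    mov rdi, 1          ; non-zero exit status signals failure
    syscall
```

After running it, echo $? in your shell should print 1, the status passed in rdi.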
Also some useful resources if you wanna learn more about assembly or about how computers work in general.
- cpu.land (W website believe me)
- 0xAx’s blogs on learning asm (A big thanks to him)
- Kupala’s x86-64 linux assembly YouTube videos (Super helpful)
- asm tutor
I hope you liked this article. If you wanna say Hi or anything, here’s my Telegram @ashirbadtele or you can mail me at ashirbadreal@proton.me.