If you’re trying to compile a kernel written in C for your own toy operating system, you may run into trouble compiling/linking your code. Assuming you’re using GRUB to load your kernel, or you’ve rolled your own boot sector, you’ll now want to compile your kernel code (written in C) to a flat binary. The toolchain provided by MinGW (gcc and ld) is well suited for this, as long as you know a few tricks.
This article is part of a series on toy operating system development.
Let’s start with a very simple kernel.c program just to see if we can get things working:
int main(void)
{
mylabel:
goto mylabel;
}
We’ll compile this with gcc, switching on all warnings (the compiler is our friend):
$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe
This will yield a working program that we can actually execute at the command prompt. It’ll pause indefinitely, as desired. However, there are a number of problems with the resulting binary:
First, the binary includes a PE header, which specifies how Windows must load and execute the program. We’re writing a kernel, so we don’t want any of this header data. We must find a way to remove it.
Second, the program is relocatable. The operating system (i.e. Windows) will load the code into memory where it wants, then use the information contained in the PE header to make sure that all references are correct. The references are stored in relative form, that is, they can be relocated. For our kernel, this is not what we want: we want to load our kernel at a specific address (say 0x20000) and make all references work precisely (statically) there.
This can be illustrated by running objdump:
$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00401160
Objdump’s output shows that a PE header is present (pei-i386 file format) and that a default start address of 0x00401160 has been assigned. Let’s see what we can do about the start address. Since we want our kernel to always run at 0x20000, we can instruct the linker to place the code at that address. Linker options can be passed to gcc:
Hint: do not use gcc only to compile and then run ld separately to do the linking. Strange error messages will ensue. It’s easier to simply pass the linking options to gcc and let gcc call ld for you.
$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -Wl,-Ttext=0x20000
$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00020160
Oh look: our start address is now 0x00020160. The excess 0x160 bytes are not our code: they precede the entry point inside the .text section (the PE header itself is stored separately, at the front of the file), and we want neither. We can try to pass the option --oformat binary to the linker, which will make it link a flat binary for us. Unfortunately (under MinGW), we get this:
c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/bin/ld.exe:
cannot perform PE operations on non PE output file 'kernel.exe'.
collect2: ld returned 1 exit status
This can be resolved though: let the linker create the kernel.exe executable, then pass it through objcopy to create the flat binary:
$ objcopy -O binary -j .text kernel.exe kernel.bin
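As a quick sanity check on the objcopy output, one can test whether a file still begins with the two-byte "MZ" signature that opens every PE executable. The following is a minimal host-side sketch, not part of the toolchain; the name has_mz_signature is made up for illustration, and reading the file into a buffer is left to the caller.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer starts with the "MZ"
 * signature of the DOS/PE header, 0 otherwise. */
int has_mz_signature(const unsigned char *image, size_t length)
{
    if (length < 2) {
        return 0; /* too short to hold the signature */
    }
    return image[0] == 'M' && image[1] == 'Z';
}
```

A flat binary extracted from .text starts directly with machine code, so this would be expected to return 0 for kernel.bin and 1 for kernel.exe.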
Running objcopy finally yields a flat binary. Unfortunately, it’s 3376 bytes in size! About 10 bytes would be closer to the mark. Obviously, code is being included that we didn’t write: references to standard libraries. Since we don’t have any standard libraries in our fledgling operating system, we’ll need to remove this. This can be done by passing the -nostdlib argument to gcc:
$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib -Wl,-Ttext=0x20000
C:\Users\AppData\Local\Temp\cc5nshHf.o:kernel.c:(.text+0x7):
undefined reference to `__main'
collect2: ld returned 1 exit status
Foiled again! Now that we have no standard libraries, ld is looking for startup code that doesn’t exist. We did write a main function, but gcc treats main specially: it inserts a call to __main, an initialization routine normally supplied by the libraries we just removed. Let’s try a different approach: we’ll rename our main function.
int start(void)
{
mylabel:
goto mylabel;
}
Now our code compiles, and we’re down to a flat binary of 2011 bytes. It turns out that we must also pass -nostdlib to the linker:
$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib -Wl,-Ttext=0x20000,-nostdlib
Now the flat binary is down to 24 bytes. In fact, on my system I get:
00000000h: 55 89 e5 eb fe 90 90 90 ff ff ff ff 00 00 00 00
00000010h: ff ff ff ff 00 00 00 00
When disassembled, this yields:
push ebp
mov ebp, esp
jmp .-2
This corresponds exactly to the code we wrote: a stack frame is created for the start function (even though we are not interested in it – a C program must always start with a function), then an infinite loop is entered (which we wrote using a label and a goto statement).
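The jmp .-2 line is encoded as the two bytes eb fe: opcode 0xeb (a short jump) followed by a signed 8-bit displacement that is added to the address of the next instruction. The sketch below works through that arithmetic. The name short_jump_target is illustrative only, and the address 0x20003 assumes the image is loaded at 0x20000, with the jump sitting at offset 3 as in the dump.

```c
#include <stdint.h>

/* Target of an x86 short jump (opcode 0xeb) located at `address`.
 * The displacement byte is signed and is applied relative to the end
 * of the two-byte instruction. */
uint32_t short_jump_target(uint32_t address, uint8_t displacement)
{
    /* Sign-extend the 8-bit displacement by hand. */
    int32_t offset = (displacement < 0x80)
                         ? (int32_t)displacement
                         : (int32_t)displacement - 0x100;

    /* Unsigned arithmetic wraps, so adding a negative offset works. */
    return address + 2u + (uint32_t)offset;
}
```

With the jump at 0x20003 and a displacement of 0xfe (that is, -2), the target is 0x20003 again: the instruction jumps to itself, which is exactly the infinite loop we asked for.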
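Wait… this code only occupies 5 bytes. So why are there 24 bytes in the flat binary image? We can see that the first three unneeded bytes have a value of 0x90, which corresponds to NOP instructions. These are probably padding that brings the code up to an 8-byte boundary. The additional 16 bytes are most likely the constructor and destructor lists (__CTOR_LIST__ and __DTOR_LIST__) that the default MinGW linker script places at the end of the .text section: each list is a 0xffffffff marker followed by a zero terminator, which matches the two identical 8-byte groups in the dump.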
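To make the layout of those 24 bytes easier to see, here is a small sketch that counts the NOP padding after the code and reads the trailing bytes as little-endian 32-bit words. The byte values are copied from the hex dump above; the helper names count_nops and read_le32 are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* The 24 bytes of the flat binary, copied from the hex dump. */
const unsigned char kernel_image[24] = {
    0x55, 0x89, 0xe5, 0xeb, 0xfe, 0x90, 0x90, 0x90,
    0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
    0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
};

/* Count consecutive NOP (0x90) bytes starting at `offset`. */
size_t count_nops(const unsigned char *image, size_t length, size_t offset)
{
    size_t count = 0;
    while (offset + count < length && image[offset + count] == 0x90) {
        count++;
    }
    return count;
}

/* Read a little-endian 32-bit word at `offset`.
 * The caller must make sure offset + 4 <= length of the image. */
uint32_t read_le32(const unsigned char *image, size_t offset)
{
    return (uint32_t)image[offset]
         | ((uint32_t)image[offset + 1] << 8)
         | ((uint32_t)image[offset + 2] << 16)
         | ((uint32_t)image[offset + 3] << 24);
}
```

The code ends at offset 5, and count_nops(kernel_image, sizeof kernel_image, 5) gives 3, so the padding runs up to offset 8. The four words at offsets 8, 12, 16 and 20 read as 0xffffffff, 0, 0xffffffff, 0: two identical pairs.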
Nevertheless, we have now produced a flat binary that can be launched by our boot sector or second stage boot loader. It can be placed at 0x20000 and includes no undesired headers. Just the code, please, ma’am.
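If the second stage boot loader is itself written in C, handing control to the kernel can be as simple as calling the load address through a function pointer. This is only a sketch under the assumptions of this article (the image has been copied to 0x20000 and start is the very first byte of it); KERNEL_LOAD_ADDRESS, kernel_entry_t and kernel_entry_at are illustrative names, not part of any toolchain.

```c
#include <stdint.h>

/* Address the kernel image was linked for (-Ttext=0x20000). */
#define KERNEL_LOAD_ADDRESS 0x20000u

/* start() takes no arguments and returns int, as in kernel.c. */
typedef int (*kernel_entry_t)(void);

/* Turn a raw load address into a callable entry point. The
 * integer-to-pointer conversion is implementation-defined, but it is
 * what gcc does on a flat 32-bit address space. */
kernel_entry_t kernel_entry_at(uintptr_t load_address)
{
    return (kernel_entry_t)load_address;
}
```

A loader would call kernel_entry_at(KERNEL_LOAD_ADDRESS)() once the image is in place. Doing the same in a normal hosted program would simply crash, since nothing is mapped at that address.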