How do computer programs work

Computer programs work according to a series of steps outlined in the source code written by the programmer. This source code is then compiled or interpreted and converted into machine-readable instructions. The computer's processor executes these instructions, manipulates data, and performs calculations to achieve the specified tasks outlined in the program. During execution, the program interacts with the computer's memory, input/output devices, and other components. A computer program is basically a set of instructions that guide the computer through a predefined sequence of operations to achieve a certain goal.

Jan 27, 2024 - 12:11
Jan 27, 2024 - 12:12

Programmers are just people who know how to write code. You probably know how to write a program that calculates the factorial of a number, checks whether a string is a palindrome, or even implements Dijkstra's shortest path algorithm. But do you know how it works? By that I mean: do you know how your program is actually understood by a computer? Ever thought about it?

Computer Gates

Let's start with the most basic thing, which you already know: computers understand only binary data. But what does that mean? It means that computers can only deal with zeros and ones, nothing else. Have you ever wondered why? Let me remind you that computers are made of circuits. These circuits are built from logic gates such as AND, OR, and NOT. You've definitely heard these names, right? And all of these gates are, in turn, made of transistors.

Transistor

Now, a transistor is what we can call the fundamental building block of a computer, just like a cell is the fundamental unit of life. Yes, I just linked computer science with biology. In the simplest words, a transistor is what makes a computer work. But what exactly is a transistor? Well, it is a component that acts like a tiny switch with two states. The first state is on, which we represent as one, and the second state is off, which we represent as zero. Now it should make sense why computers can only deal with zeros and ones, right?

Now, let's look at the big picture. How is a program that you write understood and executed by your computer, which only understands zeros and ones? The most obvious answer is that the code must be converted into a series of zeros and ones so that it makes sense to the computer. Yes, that is exactly what happens. Any program written in any programming language must eventually be converted into machine instructions so that your computer understands it. But how does that happen?

Computer code to Binary

Well, this conversion from source code to machine instructions is made possible by a set of magical tools together known as the compilation system. To know more about how this conversion works, let's actually take a step-by-step approach to how a program is written, compiled, and finally executed.

Python, Java to Binary Code

Let's consider a simple C program that adds two numbers and prints their sum. First, we save this program with a .c extension. Let's say we save it as hello.c. The next step is compilation. We can use the GCC compiler, which is open source, to compile this C program. Open your terminal or command prompt and type: gcc hello.c -o hello. This creates an executable file named hello, which we can then run. When we run it, the computer executes the file and prints the output. This means that at this stage the computer understood our code and computed what is written inside the program: it added the two numbers and printed the result.

ASCII Values

Now, let's break down what just happened. When you save the hello.c file on your computer, it is stored as a sequence of bytes. The program you have written contains only letters, digits, and special characters. That's it. To turn these individual characters into binary, think of it as two steps: each character is first mapped to a number, and that number is then written in binary. Most computers follow the ASCII standard (or an encoding compatible with it) for this mapping. Each character has its own numeric value. For example, the hash character # has the ASCII value 35 and the lowercase letter i has the ASCII value 105, which are 00100011 and 01101001 in binary. Each character is stored using eight bits or, in other words, one byte.

Source Code to Machine Code

So the source file sits on your computer's file system as a sequence of bytes. The next step is where you compile your program with GCC. This is the big step, because here your source code is converted into executable machine instructions. There are usually four stages.

In the first stage, your program is fed to the preprocessor, which processes the lines that start with #. In particular, it replaces each #include directive with the contents of the named header file and saves the result as an intermediate file named hello.i. For our program, since we are using the stdio.h header, the preprocessor inserts the contents of stdio.h into our program.

In the second stage, hello.i is fed to the compiler, which translates the source code into assembly code and saves it as hello.s. Assembly language is just a human-readable form of machine code. You rarely need to look at this file, and some compilers skip writing it out and go straight to machine code. It is useful mainly because a programmer can read it to see, at a low level, what the processor will execute.

In the third stage, the assembly code is passed to the assembler, which converts it into machine instructions, in other words, the binary form. The result is saved as the object file hello.o.

Diagram

Last comes the linking phase. In this phase, a program called the linker resolves the references in your code to functions that are defined elsewhere. Our code uses the printf function, which is defined in the standard C library. The precompiled code for printf (conceptually a printf.o object file; in practice it is part of the C library) is linked with our hello.o file. This finally produces our executable file, hello, which we can run from the command prompt or terminal.

Now, what happens when we run this file? That is a different story, so I will only summarize it. A typical computer is organized roughly as follows. There is your hard disk, where your files are stored, including our executable hello file. Then there is main memory, and of course there is a CPU, which contains the register file, the arithmetic logic unit (ALU), a bus interface, and the program counter. To handle the input/output devices, there is a USB controller and a graphics adapter for the display. All these components are connected to each other by sets of wires called buses.

Initially, our executable file, hello, is stored on the disk. When we run it from our terminal or command prompt, the code and data in the file are copied into main memory. The program counter then holds the address of the first instruction to be executed. As each instruction is executed by the processor, the program counter is updated to hold the address of the next instruction. The addition of the two numbers is computed by the ALU, the arithmetic logic unit. Depending on the architecture of the computer, the data in main memory that belongs to our program is copied into the register file inside the CPU, so that the processor can access it much faster than it could from main memory.

Finally, when the machine instructions that correspond to the printf call in our program are executed by the processor, the output, in other words the sum of the two numbers, is sent to the display device. And there we go: we have the output displayed on our screen. So yeah, this is a general idea of how a program is understood and executed by a computer.

 

Languages like Java compile the source code into an intermediate form known as bytecode instead of converting it directly into machine instructions. The compiled Java file that you run, a .class file, contains this bytecode. A program called the Java Virtual Machine (JVM) then converts the bytecode into machine instructions at run time. This allows portability: the same compiled Java program can run on any computer, whatever its architecture, as long as a JVM is available for it. C lacks this kind of portability, because a compiled C executable contains machine instructions for one specific architecture, so we need to build a separate executable for each architecture we want to support.


Mehedi: I am Mehedi Hasan Siam, a professional web developer and IT professional.