Programs and Processes in Linux - 1
When you execute a command like ls or sudo ls in a Linux shell, it triggers a fascinating transformation. A program, which is simply a static collection of instructions created by developers or code generators, becomes a process, a dynamic entity managed by the operating system. The program exists as a file in the filesystem, written from a human point of view; once it runs as a process, the point of view shifts to the kernel, which manages its lifecycle, resources, and interactions with the system.
The Linux kernel, the heart of the operating system, takes over as soon as a program becomes a process. It handles process creation, assigns resources, manages permissions, schedules execution, and ensures a seamless interaction between the process and system resources. This blog post breaks down the entire journey from program to process, exploring how the kernel orchestrates these operations. Along the way, we’ll delve into the key concepts of fork, exec, and the execution environment, unraveling the inner workings of Linux when commands are executed.
What Happens When You Run ls in the Shell?
The Shell as a Process
The shell (e.g., bash, zsh) is a command-line interface where you interact with the Linux operating system. It is itself a running process managed by the kernel, identified by a unique Process ID (PID).
When you type a command like ls, the shell:
Parses your input.
Searches for the ls program in the directories listed in the $PATH environment variable.
Prepares to execute the program by creating a new process.
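The $PATH lookup in the second step can be sketched in a few lines of Python. This is only an illustration of the idea, not how bash or zsh implement it: the helper name find_in_path is made up for this post, and real shells also handle builtins, aliases, and commands that contain a slash. Python's standard library offers shutil.which for the same job.

```python
import os


def find_in_path(command, path_value):
    """Return the first executable file named `command` found in `path_value`, or None."""
    for directory in path_value.split(os.pathsep):
        if not directory:
            # Real shells treat an empty entry as the current directory;
            # this sketch skips it for simplicity.
            continue
        candidate = os.path.join(directory, command)
        # The candidate must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Prints the full path of ls on this machine, or None if it is not on $PATH.
    print(find_in_path("ls", os.environ.get("PATH", "")))
```

The directories are tried from left to right, so an earlier $PATH entry wins when two directories contain a program with the same name.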
Fork and Exec: Creating a New Process
To execute ls, the shell uses two critical system calls: fork() and execve(). A system call is a mechanism that allows a program to request services or resources from the operating system's kernel.
fork(): Duplicating the Shell Process
What is fork()?
fork() is a system call that creates a new process by duplicating the current (parent) process.
The new process is called the child process, and it inherits the parent’s execution environment.
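What Happens During fork()?
The kernel allocates a new PID for the child process.
The child process receives a copy of the parent’s:
Memory space: Code, data, and stack. On Linux the pages are shared copy-on-write, so they are physically copied only when one of the two processes modifies them.
File descriptors: The child gets its own copies of the parent’s descriptors, which refer to the same open files or sockets.
Environment variables: Variables like $PATH, $HOME, and $USER.
Permissions: The child process inherits the parent’s user ID (UID) and group ID (GID).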
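A small experiment makes these points concrete. The sketch below uses Python's os.fork(), a thin wrapper around the system call, and assumes a Unix-like system. The function name fork_demo and the counter variable are invented for the illustration. The child changes its copy of a variable and reports back through a pipe, while the parent's copy stays untouched.

```python
import os


def fork_demo():
    """Fork once and return what the parent and the child each observed."""
    counter = 100                        # lives in the parent's memory
    read_fd, write_fd = os.pipe()        # lets the child report back to the parent

    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Changing `counter` touches only the child's copy.
        try:
            os.close(read_fd)
            counter += 1
            report = f"{os.getpid()} {os.getppid()} {counter}"
            os.write(write_fd, report.encode())
        finally:
            os._exit(0)                  # exit right away, never fall through to parent code

    # Parent: fork() returned the child's PID.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pid, child_ppid, child_counter = map(int, reader.read().split())
    os.waitpid(pid, 0)                   # collect the child so it does not linger as a zombie
    return {
        "fork_result": pid,
        "child_pid": child_pid,
        "child_ppid": child_ppid,
        "child_counter": child_counter,
        "parent_counter": counter,
    }


if __name__ == "__main__":
    print(fork_demo())
```

The child's PID differs from the parent's, its parent PID is the PID of the process that called fork(), and the two counter values diverge because each process has its own copy of the memory.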
Why Use fork()?
fork() enables the system to efficiently create new processes without starting from scratch. This is faster and preserves the parent's context.
Example: When the shell runs ls, it forks a child process that will execute the command.
execve(): Replacing the Child Process
What is execve()?
execve() is a system call that replaces the current process's memory space with a new program.
The process retains its PID but replaces its code and data with those of the program being executed (ls in this case).
What Happens During execve()?
The ls program's binary is loaded into memory.
The child process starts executing ls, but its environment remains intact: file descriptors stay open unless they are marked close-on-exec, and the environment variables are the ones passed along in the execve() call.
Why Use execve()?
execve() allows the child process to seamlessly transition to executing the desired program, leaving behind the parent's shell context.
Example: After the shell forks a child process, the child replaces its code with the ls program using execve().
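Putting fork() and execve() together gives a rough picture of what a shell does for every external command. The sketch below is a simplified stand-in, not the code of any real shell. The function name run_program is invented here, and the Python interpreter is used as the program to run so that the output is predictable; any absolute path, such as the location of ls on your system, could be used instead.

```python
import os
import sys


def run_program(argv, env=None):
    """Fork, exec argv[0] in the child, and return (child_pid, output, exit_code)."""
    env = dict(os.environ) if env is None else env
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, then replace this process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)          # fd 1 (stdout) now refers to the pipe
            os.close(write_fd)
            os.execve(argv[0], argv, env)  # returns only if the exec fails
        finally:
            os._exit(127)                 # conventional "could not run command" status

    # Parent: read everything the new program prints, then collect its exit status.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        output = reader.read()
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    return pid, output, exit_code


if __name__ == "__main__":
    # The new program reports its own PID, which matches the PID fork() returned.
    code = "import os; print(os.getpid())"
    child_pid, output, exit_code = run_program([sys.executable, "-c", code])
    print(child_pid, output.strip(), exit_code)
```

The PID printed by the new program equals the value fork() returned in the parent, which shows that execve() swaps the program but keeps the process. The redirected stdout survives the exec as well, which is how the parent can read the output.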
Execution Environment
The execution environment of a process is a set of attributes and resources that define its operating context. The child process created by fork() inherits its parent’s environment, which includes:
Environment Variables:
Variables such as $PATH, $HOME, and $USER are inherited from the parent.
These variables provide essential configuration for the program. For example, $PATH determines where the shell searches for executable files.
Open File Descriptors:
File descriptors for any files, sockets, or pipes opened by the parent are inherited by the child.
This allows child processes to reuse or interact with resources already opened by the parent.
Permissions and Privileges:
The child process inherits the user ID (UID) and group ID (GID) of the parent.
For commands like ls, this determines which files and directories the process can access.
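All three kinds of inheritance can be observed from a forked child. In the sketch below, the variable DEMO_GREETING and the function name inherited_environment_demo are made up for the illustration. The child reads an environment variable set by the parent, reports its UID and GID, and sends everything back through a pipe that the parent opened before forking.

```python
import os


def inherited_environment_demo():
    """Fork a child and return what it sees of the parent's environment."""
    os.environ["DEMO_GREETING"] = "hello"   # hypothetical variable, set before forking
    read_fd, write_fd = os.pipe()           # opened by the parent, inherited by the child

    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            lines = [
                os.environ.get("DEMO_GREETING", ""),
                str(os.getuid()),
                str(os.getgid()),
            ]
            # Writing to `write_fd` works only because the descriptor was inherited.
            os.write(write_fd, "\n".join(lines).encode())
        finally:
            os._exit(0)

    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        greeting, uid, gid = reader.read().split("\n")
    os.waitpid(pid, 0)
    return greeting, int(uid), int(gid)


if __name__ == "__main__":
    print(inherited_environment_demo())
```

The child never sets the variable or opens the pipe itself; it simply finds both already in place, with the same UID and GID as its parent.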
Kernel Scheduling
Once the ls process is created, the kernel takes over:
Process Scheduling:
The kernel’s scheduler assigns CPU time to the ls process based on its priority and scheduling policies.
While the ls process is running in the foreground, the parent shell typically blocks in a wait call such as waitpid() until the child finishes.
Completion:
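Once ls finishes execution, it terminates with an exit status. The kernel notifies the parent (shell) with a SIGCHLD signal, and the shell collects the exit status through a wait call such as waitpid().
The kernel reclaims the resources allocated to the process.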
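Both steps can be watched from the parent's side. The sketch below is an illustration only, with an invented function name (wait_for_child) and arbitrary values for the exit code and the nice increment. The child lowers its own scheduling priority by raising its nice value, then exits; the parent blocks in waitpid(), reads the exit status, and confirms that a second wait fails because the child is already gone.

```python
import os


def wait_for_child(exit_code=7, nice_increment=5):
    """Fork a child that lowers its priority and exits; return what the parent observes."""
    parent_nice = os.getpriority(os.PRIO_PROCESS, 0)   # 0 means "the calling process"
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            child_nice = os.nice(nice_increment)       # higher nice value = lower priority
            os.write(write_fd, str(child_nice).encode())
        finally:
            os._exit(exit_code)

    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_nice = int(reader.read())
    _, status = os.waitpid(pid, 0)                     # blocks until the child terminates
    reported = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
    try:
        os.waitpid(pid, 0)                             # the child was already collected
        reaped = False
    except ChildProcessError:
        reaped = True
    return parent_nice, child_nice, reported, reaped


if __name__ == "__main__":
    print(wait_for_child())
```

The nice value is only one input to the scheduler, so the example shows that a priority can be adjusted per process, not how CPU time is actually divided.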