- System Requirements
- Architecture
- Boot Process
- Memory Layout
- Process Management
- Debugging and Development
- Hardware requirements
- AVX2/SSE2 support
- 64-bit support
- PS/2 keyboard and mouse (recommended)
- Minimum 32MB of ram
- IDE/AHCI drive with EXT2/FAT1x partitioning
- PCI device(s) (optional)
- APIC support
- PIT support
- Multiboot2 & GRUB support
- TSC support
- VESA (VBE) support (800x600 minimum, 32-bit color)
- Supported architecture
- x86_64
- Purpose: Switch from protected mode to long mode and initialize the stack for the kernel
- Location:
arch/x86_64/asm/pxs.asm - Key files:
pxs.asm - Note: If the kernel is run with
-debugcon stdio,1KCSWF23Z456789is expected, any missing characters will identify a major issue.
- Purpose: Initialize core kernel subsystems and jump to the higher half
- Location:
kernel/core - Key files:
Kernel.c
-
Bootloader Phase
- GRUB loads /boot/voidframe.krnl @ 0x100000 (for obvious reason(s))
- Request framebuffer and VESA (800x600)
- See: grub.cfg for more up-to-date details.
-
Assembly bootstrap
- File:
arch/x86_64/asm/pxs.asm - Store magic and info in eax and ebx (1)
- Setup stack (K)
- Check and enable features, extensions (FCS)
- Load GDT (2)
- Enable PAE (3)
- Setup and zero page tables (Z)
- Set CR3 (4)
- Map 4G with 2MB pages (5)
- Enable Long mode (6)
- Enable Paging (7)
- Entered 64-bit mode (8)
- Setup stack & Jump to kernel entry point (9)
- Extra: (R!) Kernel returns.
- See: pxs.asm for more up-to-date details.
- File:
-
PXS1
- Entry function:
KernelMain()&PXS1() - Check for magic and info
- Set fallback VGA mode buffer 0xB8000
- Start Serial driver across all ports
- Start VESA driver
- Start console driver
- Parse MB2 info
- Start PMM (Physical Memory Manager)
- Create new PML4
- Identity-map kernel sections
- Jump to Virtual half
- See: Kernel.c for more up-to-date details.
- Entry function:
-
PXS2
- Entry function:
KernelMainHigherHalf()&PXS2() - Check CPU features
- Validate memory layout
- Start VMM
- Start Kernel HEAP
- Start Memory pools
- Initialize GDT & IDT
- Install PIC and PIT
- Start PS/2 Drivers
- Start Shell
- Start IDE driver
- Start ramfs Driver (VFRFS)
- Start VFS driver
- Start Process management (and core subsystems)
- Check huge page support
- Start & detect ISA bus device(s)
- Enumerate PCI device(s)
- Start xHCI controller
- Start RTL8139 driver
- Start LPT driver
- Remap IRQs
- Initrd for RAMFS
- Start stack guard
- Setup memory protections
- Enable interrupts
- See: Kernel.c for more up-to-date details.
- Entry function:
- Load
- The kernel is loaded at physical address
0x100000(1MB).
- The kernel is loaded at physical address
- PMM
- Physical Memory Manager will find the highest physical address using the multiboot2 info.
- It will then create a bitmap to manage physical memory.
- It will reserve the first 1MB of memory for the kernel and other critical structures like the multiboot2 info itself.
- Kernel Sections
- The kernel is divided into several sections:
.text,.rodata,.data,.bss, and the heap. - Each section is aligned to page boundaries (4KB).
- The
.bsssection is zero-initialized. - The heap starts immediately after the
.bsssection and grows upwards.
- The kernel is divided into several sections:
- Preparing for the jump
- The kernel prepares to jump to the higher half by setting up page tables that map the kernel's physical addresses to virtual addresses starting at
0xFFFFFFFF80000000. - This involves creating a new PML4 and setting up the necessary entries in the page tables.
- The kernel prepares to jump to the higher half by setting up page tables that map the kernel's physical addresses to virtual addresses starting at
- Higher Half
- After the jump, the kernel operates in the higher half of the address space.
- VMM
- Virtual Memory Manager (VMM) will get PML4 from the higher half (or more specifically, from CR3 that bootstrap set).
- It will track kernel space and test identity mapping. A faulty test will result in a kernel panic.
- Heap
- The kernel heap is initialized using class-based memory pools.
- It is also started with validation_level = 1 (aka
KHEAP_VALIDATION_BASIC, though, it could be changed at will).
- High-Performance Scheduling: A sophisticated Multi-Level Feedback Queue (MLFQ) scheduler ensures that the system is both responsive for interactive tasks and efficient for long-running, CPU-intensive workloads.
- Proactive Security: A dedicated security process, Astra, constantly monitors system integrity, validating processes and responding to threats in real-time.
- Dynamic Performance Scaling: A smart controller, DynamoX, adaptively adjusts the system's core frequency based on current load to optimize performance and power usage.
The ProcessControlBlock is the fundamental data structure that represents a process in the kernel. It stores all essential information required to manage and schedule the process.
File: MLFQ.h
typedef struct {
char name[64];
uint32_t pid;
ProcessState state;
void* stack;
uint8_t priority;
uint8_t base_priority;
uint8_t privilege_level;
uint32_t cpu_burst_history[CPU_BURST_HISTORY];
uint32_t io_operations;
uint32_t preemption_count;
uint64_t cpu_time_accumulated;
uint64_t last_scheduled_tick;
uint64_t wait_time;
TerminationReason term_reason;
uint32_t exit_code;
uint64_t termination_time;
uint32_t parent_pid;
MLFQSecurityToken token;
MessageQueue ipc_queue;
ProcessContext context;
MLFQSchedulerNode* scheduler_node;
uint64_t creation_time;
char ProcessRuntimePath[256];
} MLFQProcessControlBlock;| Field | Description |
|---|---|
pid |
A unique Process Identifier. PID 0 is reserved for the idle process. |
state |
The current execution state of the process (e.g., Running, Ready, Blocked). |
stack |
A pointer to the process's allocated kernel stack. |
context |
Stores the CPU register state (rip, rsp, etc.) when the process is not running. This is crucial for context switching. |
priority |
The process's current priority level in the scheduler. This value can change dynamically. |
base_priority |
The initial priority assigned at creation, used to reset a process's priority. |
privilege_level |
The security privilege level (PROC_PRIV_SYSTEM or PROC_PRIV_NORM). |
token |
A SecurityToken structure used by the Astra security system to verify the process's integrity. |
scheduler_node |
A pointer to the node that links this PCB into the scheduler's priority queues. |
cpu_time_accumulated |
Total system ticks this process has spent running on the CPU. |
io_operations |
A counter for I/O-related blocking events, used to identify I/O-bound processes. |
A process transitions through several states from its creation to its termination.
File: MLFQ.h
PROC_READY: The process is ready to run and is waiting in a scheduler queue for its turn on the CPU.PROC_RUNNING: The process is currently executing on the CPU.PROC_BLOCKED: The process is waiting for an event (e.g., I/O completion) and cannot proceed until that event occurs.PROC_DYING: The process has been signaled for termination and is in the process of being shut down. It is removed from active scheduling.PROC_ZOMBIE: The process has terminated, but its PCB is kept in a termination queue to be cleaned up. It has released most of its resources.PROC_TERMINATED: The state0. The process slot is free, and all its resources have been reclaimed.
Process creation is handled by the CreateSecureProcess function, which is designed with security as a primary concern.
File: Process.c
The creation workflow is as follows:
- Security Validation: It validates the security token of the creator process. A process with a corrupt token cannot create new processes.
- Privilege Check: It prevents user-level processes from creating system-level processes. Such an attempt is considered a hostile act and results in the termination of the caller by Astra.
- Resource Allocation:
- Finds a free process slot and allocates a unique PID.
- Allocates a kernel stack for the new process.
- PCB Initialization:
- The
ProcessControlBlockis securely zeroed and initialized with default values. - An initial priority is assigned based on its privilege level.
- Security Token Generation: A new
SecurityTokenis created for the process. It contains the creator's PID, its own flags, and a secure checksum (CalculateSecureChecksum) to prevent tampering. - Context Setup: The initial CPU context is configured. The instruction pointer (
rip) is set to the process's entry point, and the stack pointer (rsp) is prepared with theProcessExitStubas the final return address, ensuring clean termination. - Scheduling: The process state is set to
PROC_READYand it is added to the scheduler viaAddToScheduler.
Termination is a robust, multi-stage process designed to be safe and secure.
File: Process.c
- Initiation (
TerminateProcess):
- A termination request is received with a
TerminationReason(e.g.,TERM_NORMAL,TERM_KILLED,TERM_SECURITY). - Security Checks: If one process tries to kill another, the system verifies that the caller has the necessary privileges. User processes cannot kill system processes, and critical/immune processes are protected. The caller's security token is also validated.
- State Transition: The process state is atomically changed to
PROC_DYING. This prevents race conditions. - De-scheduling: The process is immediately removed from the scheduler's queues (
RemoveFromScheduler). - Cleanup Queuing: The process state is set to
PROC_ZOMBIE, and its slot index is added to a lockless termination queue.
- Resource Reclamation (
CleanupTerminatedProcesses):
- This function runs periodically as part of the
DynamoXsystem process. - It safely dequeues
PROC_ZOMBIEprocesses from the termination queue. - For each zombie, it frees its final resources (like the stack), clears the
ProcessControlBlock, and frees the process slot and PID for future use.
The kernel employs a Multi-Level Feedback Queue (MLFQ) scheduler, a highly adaptive algorithm designed to achieve low latency for interactive tasks and high throughput for batch tasks.
File: MLFQ.c, MLFQ.h
- Priority Levels: The scheduler uses
MAX_PRIORITY_LEVELS(5) queues. The levels are divided into two categories: - Real-Time (Levels 0 to
RT_PRIORITY_THRESHOLD-1): These queues have absolute priority. A process in level 0 will always run before a process in level 1. They are intended for critical system tasks and highly responsive user applications. - Regular (Levels
RT_PRIORITY_THRESHOLDto 4): These are for normal user and system processes. - Quantum: Each priority level is assigned a time slice, or "quantum." Higher-priority queues have smaller quantums for faster response times, while lower-priority queues have larger quantums for better CPU throughput.
The FastSchedule function is invoked by the timer interrupt. Its logic is as follows:
- Find Next Process: It scans the priority queues from highest to lowest to find the first non-empty queue (
FindBestQueue). The process at the head of this queue is the best candidate to run next. - Preemption Logic: The currently running process can be preempted (stopped) if:
- Its time quantum expires.
- A process appears in a significantly higher-priority queue. The scheduler has a
PREEMPTION_BIASto prevent excessive "thrashing" between processes of similar priority.
- Context Switch: If a new process is chosen, the CPU state of the old process is saved to its PCB, and the state of the new process is loaded from its PCB. This is a context switch.
- Dynamic Priority Adjustment: This is the "feedback" mechanism.
- Demotion: If a process consistently uses its entire time quantum, it is considered CPU-bound. The scheduler will lower its priority, moving it to a queue with a larger quantum, preventing it from hogging the CPU and starving interactive tasks.
- Promotion: If a process frequently blocks for I/O before its quantum expires (
ProcessBlocked), it is identified as I/O-bound or interactive. The scheduler boosts its priority to a high-level queue to ensure it gets to run quickly when its I/O completes, leading to a snappier user experience.
- Aging and Starvation Prevention (
SmartAging): To prevent processes in low-priority queues from never running (starvation), theSmartAgingalgorithm periodically scans the queues. If a process has been waiting for too long (AGING_THRESHOLD_BASE), its priority is temporarily boosted to a much higher level, giving it a chance to run.
Astra is a dedicated, high-privilege system process that functions as a real-time security manager. Its primary mission is to ensure the integrity of the kernel and all running processes.
File: MLFQ.c
Every process has a SecurityToken that acts as a digital signature.
- It contains the process's flags, privilege level, and a
magicnumber. - Crucially, it includes a
checksumcalculated from the token's contents and the process's PID. - The
ValidateTokenfunction can be called at any time to recalculate the checksum and compare it with the stored value. A mismatch indicates that the process's in-memory state has been corrupted or tampered with.
Astra runs in a continuous loop, performing a series of checks:
- Token Integrity Scan: Periodically iterates through active processes and validates their security tokens. A failed validation results in immediate termination of the suspicious process (
SecurityViolationHandler). - Unauthorized Privilege Detection: This is Astra's most critical check. It looks for processes that are running with
PROC_PRIV_SYSTEMbut do not have thePROC_FLAG_SUPERVISORorPROC_FLAG_CRITICALflags in their token. This indicates a process may have illicitly escalated its own privileges, and it is terminated with extreme prejudice viaASTerminate. - Scheduler Consistency Checks: Astra cross-references the scheduler's process bitmap (
active_process_bitmap) with the globalprocess_countto detect desynchronization, which could be a sign of kernel data structure corruption. - Threat Level Management: Astra maintains a system-wide
threat_level. Each security violation increases this level.
- If the level exceeds a high threshold (
threat_level > 40), Astra triggers DEFCON 2: a system-wide lockdown where all non-critical, non-immune processes are terminated to contain a potential widespread compromise. - If the level becomes critical (
threat_level > 75), Astra assumes the system is unrecoverable and triggers a kernel panic.
Before the scheduler context-switches to a process, it performs a final, lightweight security check. This check validates the process's token and verifies its privilege flags one last time, ensuring that a compromised process is never allowed to run on the CPU.
While Astra ensures the logical integrity of processes, Cerberus acts as the kernel's dedicated memory security guard. Its mission is to detect, classify, and mitigate common and dangerous memory corruption vulnerabilities in real-time. It is enabled via the VF_CONFIG_USE_CERBERUS flag and integrates deeply with the process scheduler and memory managers.
Cerberus operates on multiple defensive fronts:
-
Stack Canary Protection: To protect against stack buffer overflows (a common attack vector), Cerberus automatically installs a "canary"—a known secret value—at the end of a process's stack when it's registered. Before the scheduler switches to a process, Cerberus checks if this canary is intact. If the value has been overwritten, it signifies a buffer overflow, and a
MEM_VIOLATION_CANARY_CORRUPTviolation is logged. -
Allocation Monitoring: Cerberus maintains a record of all active memory allocations for monitored processes (
CerberusTrackAlloc). When memory is freed (CerberusTrackFree), Cerberus validates the operation. If a process attempts to free a memory address that isn't actively allocated to it, Cerberus flags it as a potential "double-free" or an attempt to free invalid memory, logging aMEM_VIOLATION_DOUBLE_FREE.
When a critical memory fault (e.g., a page fault) occurs, Cerberus intercepts the event before it panics the kernel. It analyzes the context of the fault to determine its likely cause, translating a generic CPU exception into a specific security event.
| Violation Type | Heuristic / Cause |
|---|---|
MEM_VIOLATION_USE_AFTER_FREE |
Accessing a NULL or very low memory address (< 0x1000). |
MEM_VIOLATION_BUFFER_OVERFLOW |
Writing to a page that is not present in memory. |
MEM_VIOLATION_STACK_CORRUPTION |
Attempting to execute code from a non-executable page (NX bit violation). |
MEM_VIOLATION_BOUNDS_CHECK |
A user-mode process attempting to access kernel space memory. |
Cerberus's true strength comes from its integration with the rest of the system:
- Pre-emptive Threat Mitigation: The
CerberusPreScheduleCheckfunction is called just before a process is scheduled to run. If a process has been marked asis_compromised(either due to a canary failure or by exceeding its violation threshold), Cerberus will block the process from ever running again. This prevents a corrupted process from doing further damage. - Violation Threshold: Cerberus tracks the number of violations per process. If a process accumulates too many violations (
CERBERUS_VIOLATION_THRESHOLD), it is automatically flagged as compromised, even if the individual violations were not critical. - Threat Escalation: When a significant threat is confirmed,
CerberusReportThreatformats a detailed message and sends it to Astra via a VFS file. This allows Astra to take higher-level action, such as shutting down related processes or increasing the system-wide threat level.
By combining proactive checks with intelligent fault analysis, Cerberus provides a robust defense-in-depth security layer, making the entire system more resilient to memory-based attacks.
DynamoX is a system process that intelligently manages the CPU's operational frequency (simulated via the PIT timer) to deliver performance when needed and conserve power when the system is idle.
File: Process.c
DynamoX runs in a loop, periodically sampling various system metrics:
- Process Load: The number of active and ready processes.
- Queue Pressure: The depth of the scheduler's priority queues. Deep queues indicate high demand for CPU time.
- Context Switch Rate: A high rate of context switches can indicate system thrashing under heavy load.
Based on these metrics, it calculates a target_freq.
- Boosting: If the load is high, queue pressure is mounting, or context switching is rapid, DynamoX increases the frequency towards
max_freq. It includes an "emergency boost" for sudden, extreme spikes in load. - Reducing: If the system is idle (low process count, low context switch rate), it gradually lowers the frequency towards
min_freq. - Adaptive Learning: DynamoX uses momentum and a learning rate to smooth out frequency changes, preventing erratic fluctuations. It also maintains different power states (e.g.,
Balanced,Performance,Power Saving) to make more intelligent, long-term decisions.
The goal is to create a system that feels fast and responsive under load but doesn't waste energy when idle.
The following functions and commands provide interfaces to the process management subsystem.
| Function / Command | Description |
|---|---|
CreateProcess(entry_point) |
Creates a new user-level process. A simplified wrapper around CreateSecureProcess. |
KillProcess(pid) |
Safely requests the termination of a process with the given PID. |
MLFQYield() |
A cooperative function that allows a process to voluntarily give up the remainder of its time slice and trigger the scheduler. |
GetProcessByPid(pid) |
Retrieves a pointer to the PCB for a given PID. |
ListProcesses() |
A shell/debug command that prints a formatted list of all active processes, including their PID, state, priority, and CPU usage. |
DumpSchedulerState() |
A debug command that prints detailed internal statistics about the scheduler, including queue depths and load. |
DumpPerformanceStats() |
A debug command that prints performance counters like total context switches and security violations. |
- Known issues:
- The kernel is not bootable yet, its not an EFI stub, nor does it have a bootloader.
- It is not a raw binary either, it needs to be loaded by a elf-compatible bootloader (in this case grub, but I am thinking of using a custom one, or just use grub for god’s sake).
- No SMP support
- No UEFI support
- The fault handler is mid, any fault with bring down the system (most of the time).
- It only works on qemu, for some reason??. ive tried Bochs, but it hardly ever gets to the shell, VBox is even worse.
- The kernel's real hardware support is unknown, since I dont have a legacy BIOS pc :(.
assembler-0 @ voidframe - 19:02 - 01-10-2025