Refine README and create GPU implementation docs

Co-authored-by: johndoe6345789 <224850594+johndoe6345789@users.noreply.github.com>
copilot-swe-agent[bot]
2025-12-28 19:04:28 +00:00
parent 3684789c36
commit 99b49ae18e
2 changed files with 108 additions and 61 deletions

@@ -19,72 +19,19 @@ This OS exists solely to run **one QT6 application** on **AMD64 + Radeon RX 6600
**Creative freedom** - Not bound by POSIX or tradition
**Precise drivers** - Hardware code follows specs exactly
## GPU Implementation Strategy
1) Reality check: where the bloat really lives (RDNA2)
MetalOS leverages Mesa RADV (userspace Vulkan driver) with a minimal kernel-side GPU API to achieve high performance without excessive complexity. The strategy focuses on implementing only the essential kernel interfaces that RADV requires:
On Navi 23, you will not get good performance without:
• GPU firmware blobs (various dimgrey_cavefish_*.bin files; Navi 23's codename is “dimgrey cavefish”, and Linux systems load firmware files with that prefix).
• A real memory manager (VRAM/GTT, page tables, buffer objects)
• Command submission (rings/queues) + fences/semaphores
• A Vulkan driver implementation (or reuse one)
- **Firmware loading** and ASIC initialization for Navi 23
- **Buffer objects** (VRAM/GTT management)
- **Virtual memory** (GPU page tables)
- **Command submission** (rings/queues) and synchronization primitives
So the “least bloat” strategy is: reuse a Vulkan implementation (Mesa RADV is the obvious candidate), but avoid importing a whole Unix stack by giving it a very small kernel/userspace interface tailored to your OS.
This approach keeps the OS non-POSIX while avoiding the complexity of writing a Vulkan driver from scratch.
RADV is explicitly a userspace Vulkan driver for modern AMD GPUs.
For detailed implementation notes, see [docs/GPU_IMPLEMENTATION.md](docs/GPU_IMPLEMENTATION.md).
2) The best “toy OS but fast” plan: RADV + a tiny amdgpu-shaped shim
Why this is the sweet spot
• You keep your OS non-POSIX.
• You avoid writing a Vulkan driver from scratch (the truly hard part).
• You implement only the kernel-facing parts RADV needs: a buffer object + VM + submit + sync API.
Shape of the stack
MetalOS kernel
• PCIe enumeration, BAR mapping
• interrupts (MSI/MSI-X)
• DMA mapping (or identity-map if you're being reckless)
• a GPU kernel driver that exposes a small ioctl-like API
Userspace
• gpu-service (optional but recommended for structure)
• libradv-metal (a minimal libdrm-like bridge)
• Mesa RADV compiled against your bridge (not Linux libdrm)
This is “Unix-like internally” only in the sense of interfaces, not user experience.
3) Minimal kernel GPU API (the smallest set that still performs)
Think in terms of four pillars:
A) Firmware load + ASIC init
• gpu_load_firmware(name, blob)
• gpu_init() → returns chip info (gfx1032, VRAM size, doorbells, etc.)
You will need the Navi 23 firmware blobs (again, the dimgrey_cavefish_*.bin family is the practical breadcrumb).
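As a rough illustration of pillar A, here is a minimal mock in C. It only shows the shape of the two calls: firmware is registered first, and init refuses to run without it. Everything beyond the two function names is an assumption made for the sketch: the `size` argument, the `gpu_chip_info` fields, the table limit, and the placeholder values returned. No real hardware programming is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chip description returned by gpu_init(); fields are illustrative. */
struct gpu_chip_info {
    char     gfx_target[16];  /* e.g. "gfx1032" for Navi 23 */
    uint64_t vram_bytes;
    uint32_t doorbell_count;
};

#define GPU_MAX_FIRMWARE 16   /* assumed table size, not a hardware limit */

struct gpu_firmware {
    const char *name;
    const void *blob;
    size_t      size;
};

static struct gpu_firmware g_firmware[GPU_MAX_FIRMWARE];
static size_t g_firmware_count;

/* Registers one firmware image. Returns 0 on success, -1 on bad
 * arguments or when the table is full. */
int gpu_load_firmware(const char *name, const void *blob, size_t size)
{
    if (name == NULL || blob == NULL || size == 0)
        return -1;
    if (g_firmware_count == GPU_MAX_FIRMWARE)
        return -1;
    g_firmware[g_firmware_count++] = (struct gpu_firmware){ name, blob, size };
    assert(g_firmware_count <= GPU_MAX_FIRMWARE);
    return 0;
}

/* Mock init: fails if no firmware was loaded, otherwise fills in
 * placeholder chip info. A real driver would probe the hardware here. */
int gpu_init(struct gpu_chip_info *out)
{
    if (out == NULL || g_firmware_count == 0)
        return -1;
    memset(out, 0, sizeof *out);
    strcpy(out->gfx_target, "gfx1032");
    out->vram_bytes = 8ull << 30;  /* placeholder value for the sketch */
    out->doorbell_count = 0;       /* placeholder value for the sketch */
    return 0;
}
```

Keeping firmware registration separate from init lets the loader stay dumb: it hands blobs to the kernel, and the kernel decides at init time whether it has everything it needs.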
B) Buffer objects (BOs)
• bo_create(size, domain=VRAM|GTT, flags)
• bo_map(bo) / bo_unmap(bo) (CPU mapping)
• bo_export_handle(bo) (so Vulkan can bind memory)
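The buffer object calls above could look roughly like this in C. This is a sketch under loud assumptions: host memory from `calloc` stands in for VRAM/GTT pages, the `struct bo` layout and the handle counter are invented for illustration, and `bo_destroy` is an extra helper that the list above does not mention.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum bo_domain { BO_DOMAIN_VRAM, BO_DOMAIN_GTT };

/* Hypothetical buffer object layout. */
struct bo {
    uint64_t       size;
    enum bo_domain domain;
    uint32_t       flags;
    uint32_t       handle;     /* opaque ID handed to userspace */
    void          *backing;    /* mock: host memory standing in for GPU pages */
    int            map_count;  /* outstanding CPU mappings */
};

static uint32_t g_next_handle = 1;

/* Allocates a buffer object; returns NULL on zero size or allocation failure. */
struct bo *bo_create(uint64_t size, enum bo_domain domain, uint32_t flags)
{
    if (size == 0)
        return NULL;
    struct bo *bo = calloc(1, sizeof *bo);
    if (bo == NULL)
        return NULL;
    bo->backing = calloc(1, (size_t)size);
    if (bo->backing == NULL) {
        free(bo);
        return NULL;
    }
    bo->size = size;
    bo->domain = domain;
    bo->flags = flags;
    bo->handle = g_next_handle++;
    return bo;
}

/* CPU mapping: the mock simply returns the backing memory. */
void *bo_map(struct bo *bo)
{
    bo->map_count++;
    return bo->backing;
}

void bo_unmap(struct bo *bo)
{
    assert(bo->map_count > 0);
    bo->map_count--;
}

/* Returns the opaque handle that a userspace driver would bind memory against. */
uint32_t bo_export_handle(const struct bo *bo)
{
    return bo->handle;
}

/* Assumed helper (not in the list above): frees an unmapped buffer object. */
void bo_destroy(struct bo *bo)
{
    assert(bo->map_count == 0);
    free(bo->backing);
    free(bo);
}
```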
C) Virtual memory (GPU page tables)
• vm_create()
• vm_map(vm, bo, gpu_va, size, perms)
• vm_unmap(vm, gpu_va, size)
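A minimal C sketch of the VM calls follows. Instead of real page tables it keeps a small table of mappings, which is enough to show the contract: aligned ranges, no overlaps, and unmap by exact range. The 4 KiB page size, the fixed table size, the permission bits, passing a BO handle rather than a BO pointer, and the `vm_destroy` helper are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define VM_PAGE_SIZE    4096u  /* assumed page granularity */
#define VM_MAX_MAPPINGS 64     /* assumed table size for the mock */

enum vm_perms { VM_PERM_READ = 1, VM_PERM_WRITE = 2, VM_PERM_EXEC = 4 };

struct vm_mapping {
    uint64_t gpu_va;
    uint64_t size;
    uint32_t bo_handle;
    uint32_t perms;
    bool     used;
};

struct vm {
    struct vm_mapping maps[VM_MAX_MAPPINGS];
};

struct vm *vm_create(void)
{
    return calloc(1, sizeof(struct vm));
}

void vm_destroy(struct vm *vm)
{
    free(vm);
}

/* Half-open range overlap test; assumes va + size does not wrap. */
static bool ranges_overlap(uint64_t a, uint64_t a_size, uint64_t b, uint64_t b_size)
{
    return a < b + b_size && b < a + a_size;
}

/* Maps [gpu_va, gpu_va + size) to a buffer object. Returns 0 on success,
 * -1 on misalignment, overlap with an existing mapping, or a full table. */
int vm_map(struct vm *vm, uint32_t bo_handle, uint64_t gpu_va,
           uint64_t size, uint32_t perms)
{
    assert(vm != NULL);
    if (size == 0 || gpu_va % VM_PAGE_SIZE != 0 || size % VM_PAGE_SIZE != 0)
        return -1;

    struct vm_mapping *slot = NULL;
    for (size_t i = 0; i < VM_MAX_MAPPINGS; i++) {
        struct vm_mapping *m = &vm->maps[i];
        if (!m->used) {
            if (slot == NULL)
                slot = m;  /* remember the first free slot */
            continue;
        }
        if (ranges_overlap(gpu_va, size, m->gpu_va, m->size))
            return -1;
    }
    if (slot == NULL)
        return -1;

    *slot = (struct vm_mapping){
        .gpu_va = gpu_va, .size = size,
        .bo_handle = bo_handle, .perms = perms, .used = true,
    };
    return 0;
}

/* Removes a mapping that matches the range exactly. Returns 0 or -1. */
int vm_unmap(struct vm *vm, uint64_t gpu_va, uint64_t size)
{
    assert(vm != NULL);
    for (size_t i = 0; i < VM_MAX_MAPPINGS; i++) {
        struct vm_mapping *m = &vm->maps[i];
        if (m->used && m->gpu_va == gpu_va && m->size == size) {
            m->used = false;
            return 0;
        }
    }
    return -1;
}
```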
D) Submission + synchronization
• queue_create(type=GFX|COMPUTE|DMA)
• queue_submit(queue, cs_buffer, fence_out)
• fence_wait(fence, timeout)
• timeline_semaphore_* (optional, but hugely useful)
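One way to picture the submit and fence contract is a per-queue sequence number: each submit hands out the next number, and a fence is signaled once the queue's completed counter reaches it. The C mock below sketches that idea only. The struct layouts, the `cs_dwords` argument, the poll-count timeout, and `queue_mock_retire` (a stand-in for the interrupt that fires when the GPU finishes work) are assumptions, not part of the API listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum queue_type { QUEUE_GFX, QUEUE_COMPUTE, QUEUE_DMA };

/* Hypothetical queue state: two monotonically increasing counters. */
struct queue {
    enum queue_type type;
    uint64_t submitted;  /* last sequence number handed out */
    uint64_t completed;  /* last sequence number the mock GPU finished */
};

struct fence {
    const struct queue *queue;
    uint64_t            seqno;
};

struct queue queue_create(enum queue_type type)
{
    return (struct queue){ .type = type, .submitted = 0, .completed = 0 };
}

/* Accepts a command stream and returns a fence for it.
 * The mock does not execute anything; it only assigns a sequence number. */
int queue_submit(struct queue *q, const uint32_t *cs_buffer, size_t cs_dwords,
                 struct fence *fence_out)
{
    assert(q != NULL);
    if (cs_buffer == NULL || cs_dwords == 0)
        return -1;
    q->submitted++;
    if (fence_out != NULL)
        *fence_out = (struct fence){ .queue = q, .seqno = q->submitted };
    return 0;
}

/* Stand-in for the completion interrupt: marks work up to `upto` as done.
 * Never moves backwards and never passes what was submitted. */
void queue_mock_retire(struct queue *q, uint64_t upto)
{
    if (upto > q->submitted)
        upto = q->submitted;
    if (upto > q->completed)
        q->completed = upto;
}

bool fence_signaled(const struct fence *f)
{
    return f->queue->completed >= f->seqno;
}

/* Polls the fence up to timeout_polls extra times.
 * Returns 0 once signaled, -1 on timeout. */
int fence_wait(const struct fence *f, uint32_t timeout_polls)
{
    for (uint32_t i = 0; ; i++) {
        if (fence_signaled(f))
            return 0;
        if (i == timeout_polls)
            return -1;
    }
}
```

A counter-based fence like this is also a natural starting point for timeline semaphores, since waiting on "value N" is the same comparison against a monotonically increasing counter.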
If you implement these correctly, you get real GPU throughput.
## What We Cut