GPU

Introduction

Put two identical GPUs in the same machine, run the same workload on both, and the second one will often lag. Same model, same driver, same data, yet different throughput. It is not thermals or a bad BIOS profile. The second GPU is being starved at the bus level, and the reason has nothing to do with the card itself. Most of us live on top of drivers and kernel modules and never need to look down at how x86 systems actually move bytes between the CPU, RAM, and PCIe devices. But the moment you start debugging throughput asymmetry, tuning interrupt affinity, or wondering why irqaffinity matters, hardware topology stops being an abstraction. ...


Why My Second GPU Is Lazy: From PCIe to NVLink, Understanding x86 I/O Bottlenecks