High-Throughput Servers in C++: What Actually Scales
Shreyansh Jain
Low latency is often the goal when designing servers, yet low latency doesn't automatically mean high throughput. In this talk, we'll look at how common server designs hit scaling limits even when individual requests are fast.
Using real-world examples and performance benchmarks, we'll build intuition for how different execution models behave under load, from blocking to event-driven. We'll explore the trade-offs involved, how workload characteristics influence scalability, and why certain designs collapse before latency becomes a problem.
Along the way, we'll touch on asynchronous I/O with Boost.Asio, Linux io_uring, and modern C++ features such as coroutines and std::execution, focussing on how they change execution and what that means for performance.
Shreyansh Jain
System developer in finance working on low-latency stuff. Extensive experience in cybersecurity - reverse engineering & pwning. Enthusiastic about kernel internals, low latency programming, blockchain & distributed consensus, security & C++.