Does $NBIS now have the fastest inference in the world on $NVDA devices?


Nebius acquired Eigen AI in exchange for cash and shares, bringing Eigen's inference and post-training optimizations directly into Nebius Token Factory.
In the keynote at NVIDIA's GTC 2026, Eigen AI ranked #1 in output speed for Kimi K2.5 Reasoning, with Nebius Fast close to matching it.
Nebius Fast also ranks #1 in inference speed on $NVDA devices for OpenAI's open-weight model, gpt-oss-120B.
Moreover, Eigen topped the GPU-based provider rankings across 25 open-source models on Artificial Analysis, excluding ASIC providers, under default 10K input settings. It is also the fastest provider for Qwen3 Coder 480B, with a speed of 255.8 T/s, beating Google Vertex at 169.2 T/s and Amazon at 121.3 T/s.
This means Eigen is approximately 51% faster than Google Vertex and more than twice as fast as Amazon on this metric.
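The comparison above is simple ratio arithmetic. As a quick check, here is a minimal Python sketch that recomputes it from the three speeds quoted for Qwen3 Coder 480B; the function names are illustrative and not from any provider's tooling.

```python
# Output speeds in tokens per second, as quoted above for Qwen3 Coder 480B.
speeds_tps = {
    "Eigen": 255.8,
    "Google Vertex": 169.2,
    "Amazon": 121.3,
}


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow` (a plain ratio)."""
    return fast / slow


def percent_faster(fast: float, slow: float) -> float:
    """Return the relative speed advantage of `fast` over `slow`, in percent."""
    return (fast / slow - 1.0) * 100.0


for rival in ("Google Vertex", "Amazon"):
    ratio = speedup(speeds_tps["Eigen"], speeds_tps[rival])
    pct = percent_faster(speeds_tps["Eigen"], speeds_tps[rival])
    print(f"Eigen vs {rival}: {ratio:.2f}x ({pct:.0f}% faster)")
```

This gives roughly 1.51x against Google Vertex and 2.11x against Amazon, consistent with the "51% faster" and "more than twice as fast" figures.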
━━━━━━━━━━━━━━━━━━━━
The acquisition cost looks high, but if Eigen can truly improve $NVDA inference performance, even by a small amount, the gain compounds across profits and long-term competitive positioning and is likely to cover the cost many times over.
━━━━━━━━━━━━━━━━━━━━
Nebius has a GPU cloud, while Eigen improves how efficiently those GPUs generate tokens. On the same NVIDIA hardware, performance isn't just about capital expenditures. It's about GPU utilization, model optimization, batching, latency, memory management, and custom kernels.
Eigen's stack focuses on areas such as quantization, KV cache optimization, distillation, speculative decoding, custom CUDA and Triton kernels, continuous batching, and runtime optimization.
If Nebius can generate more tokens through higher inference throughput on the same NVIDIA hardware, it improves revenue capacity, cost per token, and gross margin without needing to increase capital expenditures proportionally.
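To make the link between throughput, cost per token, and margin concrete, here is a rough Python sketch. Every number in it (GPU hourly cost, baseline throughput, selling price, the 10% uplift) is a hypothetical placeholder chosen for illustration, not a Nebius or Eigen figure, and the model ignores utilization gaps, networking, and other overheads.

```python
def cost_per_million_tokens(gpu_hour_cost_usd: float, tokens_per_second: float) -> float:
    """Serving cost per one million tokens on a GPU with a fixed hourly cost."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_cost_usd / tokens_per_hour * 1_000_000


def gross_margin(price_per_million_usd: float, cost_per_million_usd: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return (price_per_million_usd - cost_per_million_usd) / price_per_million_usd


# Hypothetical inputs, for illustration only.
GPU_HOUR_COST_USD = 2.00      # assumed all-in cost of one GPU-hour
BASELINE_TPS = 1000.0         # assumed aggregate tokens/second per GPU
PRICE_PER_MILLION_USD = 1.00  # assumed selling price per million tokens
UPLIFT = 0.10                 # assumed 10% throughput gain from optimization

for label, tps in (("baseline", BASELINE_TPS), ("optimized", BASELINE_TPS * (1 + UPLIFT))):
    cost = cost_per_million_tokens(GPU_HOUR_COST_USD, tps)
    margin = gross_margin(PRICE_PER_MILLION_USD, cost)
    print(f"{label}: cost ${cost:.3f} per 1M tokens, gross margin {margin:.1%}")
```

Because the hardware cost per hour is fixed in this sketch, any throughput gain lowers cost per token by the same factor, and at a constant selling price the whole saving shows up as margin.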
$NBIS is on track to become a company with multi-billion-dollar annual revenue, meaning even a few percentage points of inference improvement could translate into hundreds of millions of dollars in savings.
━━━━━━━━━━━━━━━━━━━━
Open-source models are advancing rapidly. Kimi, Qwen, DeepSeek, GLM, Llama, Nemotron, MiniMax, and other models require continuous improvements to stay competitive.
By integrating Eigen, Nebius can also ship optimized versions of new models faster and make Token Factory more attractive to developers and enterprise customers.