Intel’s Xeon 6 and Arc Pro GPUs Showcase Strong AI Inference Performance in MLPerf v6.0 Benchmarks
Intel has demonstrated competitive AI inference performance in the latest MLPerf Inference v6.0 benchmarks, highlighting the capabilities of its Xeon 6 CPUs and Arc Pro B-Series GPUs across workstations, data centers, and edge systems. The results underscore Intel’s push to deliver scalable, low-latency AI solutions that balance performance, cost, and accessibility for developers and graphics professionals.
Key Performance Highlights
In Intel’s submission, a four-GPU Intel Arc Pro B70/B65 system provides 128GB of VRAM, enough to run 120-billion-parameter models with high concurrency. The Arc Pro B70 delivers up to 1.8x higher inference performance than its predecessor, the Arc Pro B60. Separately, software optimizations in Intel’s open, containerized stack have improved performance by up to 1.18x on the same Arc Pro B60 hardware since MLPerf Inference v5.1.
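The capacity figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a rough estimate, not Intel’s methodology: it computes the weight memory of a 120-billion-parameter model at a few precisions and compares it with the 128GB pool. The bytes-per-parameter values and the function names are illustrative assumptions; the article does not say which precision the submission used.

```python
# Rough estimate only: weight memory for a 120B-parameter model versus a
# 128 GB VRAM pool. The precisions are illustrative assumptions, not details
# taken from Intel's submission.

TOTAL_VRAM_GB = 128   # four-GPU system, per the article
NUM_GPUS = 4
PARAMS = 120e9        # 120 billion parameters, per the article

# Assumed storage cost per parameter for common formats (hypothetical choices).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight memory in GB (1 GB = 1e9 bytes), ignoring KV cache and activations."""
    return params * bytes_per_param / 1e9


def summarize() -> dict:
    """Map each precision to (weights_gb, headroom_gb, fits_in_pool)."""
    out = {}
    for name, bpp in BYTES_PER_PARAM.items():
        weights = weight_memory_gb(PARAMS, bpp)
        headroom = TOTAL_VRAM_GB - weights
        out[name] = (weights, headroom, headroom > 0)
    return out


if __name__ == "__main__":
    print(f"VRAM per GPU: {TOTAL_VRAM_GB / NUM_GPUS:.0f} GB")
    for name, (weights, headroom, fits) in summarize().items():
        print(f"{name}: weights {weights:.0f} GB, headroom {headroom:.0f} GB, fits={fits}")
```

Under these assumptions, 16-bit weights (240 GB) exceed the pool, while 8-bit (120 GB) and 4-bit (60 GB) weights fit, leaving the remainder for KV cache and activations.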
Intel’s submission also emphasizes the role of Xeon 6 processors, which powered over half of all MLPerf Inference v6.0 submissions. The CPUs include built-in AI acceleration technologies such as AMX and AVX-512 and achieved up to 1.9x generational performance gains in prior MLPerf tests, reinforcing their role in memory management, task orchestration, and workload distribution in AI systems.
Addressing Market Demands
The benchmarks arrive amid rising demand for high-performance, cost-effective AI inference solutions that do not compromise data privacy or incur heavy subscription costs tied to proprietary models. Intel’s Arc Pro B70/B65 GPUs are positioned as all-in-one platforms, combining validated hardware and software with features like ECC memory, SR-IOV, and remote firmware updates to meet enterprise needs.
A standout advantage is the Arc Pro B70’s ability to handle larger models and context windows in multi-GPU setups, offering up to 1.6x more KV cache capacity than comparable competitor solutions. The results highlight Intel’s broader strategy to integrate CPUs and GPUs seamlessly, ensuring scalability from single-node to multi-GPU enterprise deployments while maintaining reliability and security.
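To see why KV cache capacity matters for context length, the sketch below uses the standard per-token KV cache formula (keys plus values, across all layers and KV heads) to convert a memory budget into a token count. The model dimensions, the 8 GB budget, and the function names are hypothetical placeholders chosen for illustration; none of them come from Intel’s submission or from any competitor’s specifications.

```python
# Illustrative sketch: how a KV cache memory budget translates into cached tokens.
# All model dimensions and budgets below are hypothetical, not measured values.


def kv_bytes_per_token(num_layers: int, num_kv_heads: int, head_dim: int,
                       bytes_per_elem: int) -> int:
    """Bytes of KV cache one token occupies: a key and a value vector per layer per KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem


def max_cached_tokens(budget_bytes: int, per_token_bytes: int) -> int:
    """Whole tokens that fit in the budget (summed over all concurrent requests)."""
    return budget_bytes // per_token_bytes


if __name__ == "__main__":
    # Hypothetical transformer configuration with 16-bit cache entries.
    per_token = kv_bytes_per_token(num_layers=36, num_kv_heads=8,
                                   head_dim=64, bytes_per_elem=2)

    baseline_budget = 8 * 10**9                  # assumed 8 GB left for KV cache
    larger_budget = baseline_budget * 16 // 10   # 1.6x the capacity, as in the article's claim

    print("bytes per token:", per_token)
    print("tokens at baseline:", max_cached_tokens(baseline_budget, per_token))
    print("tokens at 1.6x:", max_cached_tokens(larger_budget, per_token))
```

Because the per-token cost is fixed for a given model and cache precision, a 1.6x larger KV budget supports roughly 1.6x more cached tokens, which can be spent on longer context windows or on more concurrent requests.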
Industry Implications
Intel remains the only server processor vendor to submit standalone CPU results for MLPerf benchmarks, underscoring its leadership in AI infrastructure. The company’s dominance is further evidenced by the Xeon 6’s role as the host CPU in systems like NVIDIA’s DGX Rubin NVL8, where it orchestrates and secures modern AI workloads.
The benchmarks also reflect a broader industry shift toward open, scalable AI solutions, with Intel’s containerized software stack enabling efficient multi-GPU scaling and PCIe P2P data transfers. For developers and enterprises, this translates to lower total cost of ownership and greater flexibility in deploying AI models without vendor lock-in.
Technical Specifications and Disclaimers
Performance claims are based on specific configurations, including Xeon 698X CPUs paired with Arc Pro GPUs and DDR5 memory. Results may vary depending on use case, configuration, and updates. For full details, visit Intel’s Performance Index or MLCommons. Intel technologies require enabled hardware, software, or service activation, and no product is entirely secure.
Read more: newsroom.intel.com