Intel and SambaNova Systems have introduced a new AI inference architecture that combines different processors to handle growing agentic AI workloads.
The design uses Intel Xeon 6 processors as the primary platform. SambaNova’s RDU chips handle the high-throughput decode phase, in which output tokens are generated one at a time, while GPUs from Intel handle the prefill phase, in which the input prompt is processed. This split aims to improve efficiency for agentic AI applications, which need fast response times and must process large contexts.
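To make the division of labor concrete, here is a minimal Python sketch of how a host might hand one request through the two phases. It is an illustration only, not Intel's or SambaNova's software: the names `Request`, `prefill`, `decode_step`, and `run_inference` are hypothetical, the "model" is a toy arithmetic rule, and the device labels in the trace simply mirror the roles described above.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """A single inference request (hypothetical structure)."""
    prompt_tokens: list[int]
    max_new_tokens: int
    output_tokens: list[int] = field(default_factory=list)


def prefill(prompt_tokens: list[int]) -> list[int]:
    """Stand-in for the prefill phase (the GPU role in the described design).

    Consumes the whole prompt in one pass and returns a toy "cache".
    """
    return list(prompt_tokens)


def decode_step(cache: list[int]) -> int:
    """Stand-in for one decode step (the RDU role in the described design).

    Emits one token from the cache using a toy rule, not a real model,
    and appends it so the next step depends on it.
    """
    next_token = (sum(cache) + len(cache)) % 1000
    cache.append(next_token)
    return next_token


def run_inference(req: Request) -> list[str]:
    """Host-side orchestration (the Xeon role in the described design).

    Returns a trace of which stage ran, in order.
    """
    trace = ["host:schedule", "gpu:prefill"]
    cache = prefill(req.prompt_tokens)
    # Decode is inherently sequential: each token depends on the previous ones.
    for _ in range(req.max_new_tokens):
        req.output_tokens.append(decode_step(cache))
        trace.append("rdu:decode")
    trace.append("host:return")
    return trace


if __name__ == "__main__":
    request = Request(prompt_tokens=[1, 2, 3], max_new_tokens=4)
    for stage in run_inference(request):
        print(stage)
```

The sketch shows why the two phases suit different hardware: prefill runs once over the whole prompt, while decode repeats once per generated token and cannot be parallelized across tokens of the same response.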
The companies signed a formal agreement to develop this architecture. It targets production environments where AI agents must process complex prompts and generate responses in real time. The collaboration follows a broader industry shift away from GPU-only inference systems, which struggle with the memory and latency demands of agentic workloads.
Intel confirmed the Xeon 6 processors will serve as the backbone. SambaNova’s Reconfigurable Dataflow Unit (RDU) accelerates decoding operations. The combined system is expected to reduce bottlenecks in agentic AI pipelines, where sequential processing of tokens limits performance.
The announcement comes as enterprises deploy more AI agents for customer service, coding assistance, and decision-making tools. The new architecture is designed to scale across data centers and edge devices without requiring wholesale GPU replacements.
Intel and SambaNova did not disclose specific performance benchmarks or commercial timelines. They did state, however, that the system is ready for integration with existing AI infrastructure.
Source: newsroom.intel.com