
  • The Qualcomm AI200 and AI250 solutions offer rack-scale performance and superior memory capacity for fast generative AI inference in data centers, with the best total cost of ownership (TCO) in the industry. The Qualcomm AI250 features an innovative memory architecture, delivering a generational leap in effective memory bandwidth and efficiency for AI workloads.
  • Both solutions feature a rich software stack and seamless compatibility with leading AI frameworks, empowering businesses and developers to deploy secure and scalable generative AI in data centers.
  • The products are part of a multi-generational AI inference roadmap for data centers with an annual cadence.

On October 28th, Qualcomm Technologies, Inc. announced the launch of its next-generation AI inference-optimized solutions for data centers: accelerator cards based on the Qualcomm® AI200 and AI250 chips, and racks built on them. Building on the company's leadership in NPU technology, these solutions offer rack-scale performance and superior memory capacity for fast generative AI inference at high performance per dollar per watt, marking a major advance in enabling scalable, efficient, and flexible generative AI across industries.

The Qualcomm AI200 introduces a purpose-built rack-level AI inference solution designed to deliver low total cost of ownership (TCO) and optimized performance for large language model (LLM) and large multimodal model (LMM) inference, as well as other AI workloads. It supports 768 GB of LPDDR per card for higher memory capacity at lower cost, enabling exceptional scale and flexibility for AI inference.

The Qualcomm AI250 solution will debut an innovative memory architecture based on near-memory computing, delivering a generational leap in efficiency and performance for AI inference workloads: more than 10 times greater effective memory bandwidth at significantly lower power consumption. This enables disaggregated AI inference, using hardware efficiently while meeting customer performance and cost requirements.

Both rack solutions feature direct liquid cooling for thermal efficiency, PCIe for scale-up, Ethernet for scale-out, confidential computing for secure AI workloads, and a rack-level power envelope of 160 kW.

“With the Qualcomm AI200 and AI250, we are redefining what’s possible for rack-scale AI inference. These innovative new AI infrastructure solutions enable customers to deploy generative AI with unprecedented TCO while maintaining the flexibility and security required by modern data centers,” said Durga Malladi, senior vice president and general manager of Technology Planning, Edge Solutions and Data Center at Qualcomm Technologies, Inc. “Our rich software stack and support for open ecosystems make it easier than ever for developers and enterprises to integrate, manage, and scale already-trained AI models on our optimized AI inference solutions. With seamless compatibility with leading AI frameworks and one-click model deployment, the Qualcomm AI200 and AI250 are designed for frictionless adoption and rapid innovation.”

Our hyper-scalable AI software stack, spanning end-to-end from the application layer to the system software layer, is optimized for AI inference. The stack supports leading machine learning (ML) frameworks, inference engines, generative AI frameworks, and LLM/LMM inference optimization techniques, including disaggregated serving. Developers benefit from seamless model integration and one-click deployment of Hugging Face models via the Efficient Transformers Library and Qualcomm Technologies' Qualcomm AI Inference Suite. Our software provides ready-to-use AI applications and agents, comprehensive tools, libraries, APIs, and services to operationalize AI.
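The "one-click" flow described above (fetch a pretrained checkpoint, compile it for the accelerator, and serve generation requests) can be sketched as follows. Note that every name in this sketch is hypothetical and illustrative only; it is not Qualcomm's actual API, and the real Efficient Transformers Library and Qualcomm AI Inference Suite should be consulted for the documented interfaces.

```python
from dataclasses import dataclass

# Hypothetical sketch of a load -> compile -> serve deployment flow, the
# pattern common to inference-accelerator software stacks. All class and
# function names here are illustrative assumptions, not Qualcomm's API.

@dataclass
class DeployedModel:
    model_id: str   # e.g. a Hugging Face model identifier
    device: str     # target accelerator (illustrative label)
    status: str = "ready"

    def generate(self, prompt: str) -> str:
        # A real deployment would run the compiled model on the
        # accelerator; this stub just echoes to stay self-contained.
        return f"[{self.model_id}@{self.device}] {prompt}"

def deploy(model_id: str, device: str = "ai200") -> DeployedModel:
    """One call standing in for the whole pipeline: fetch the checkpoint,
    compile it for the target device, and expose a generate() endpoint.
    Every step is stubbed in this sketch."""
    return DeployedModel(model_id=model_id, device=device)

if __name__ == "__main__":
    model = deploy("example-org/example-7b")  # hypothetical model id
    print(model.status)
```

The single-entry-point shape is the point of the sketch: the developer supplies only a model identifier, and the stack hides checkpoint retrieval, compilation, and serving behind one call.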

The Qualcomm AI200 and AI250 are expected to be commercially available in 2026 and 2027, respectively. Qualcomm Technologies is committed to an annual-cadence data center roadmap focused on industry-leading AI inference performance, energy efficiency, and TCO. For more information, please visit our website.

About Qualcomm

Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world address some of its most pressing challenges. Building on 40 years of technological leadership and pioneering advancements, we offer a broad portfolio of solutions powered by our cutting-edge AI, high-performance, low-power computing, and unparalleled connectivity. Our Snapdragon® platforms deliver extraordinary consumer experiences, and our Qualcomm Dragonwing™ products empower businesses and industries to reach new heights. Together with our ecosystem partners, we enable the next generation of digital transformation to enrich lives, improve businesses, and advance societies. At Qualcomm, we are engineering human progress.

Qualcomm Incorporated includes our licensing business, QTL, and the vast majority of our patent portfolio. Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of our engineering, R&D, and product and service businesses, including our QCT semiconductor business. Products bearing the Snapdragon and Qualcomm brands are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm patents are licensed by Qualcomm Incorporated.


Qualcomm, Snapdragon, Snapdragon Elite Gaming, Hexagon, and Adreno are trademarks or registered trademarks of Qualcomm Incorporated.
