Jamba revolutionizes the LLM landscape by addressing the limitations of both pure SSM models and traditional Transformer architectures. With a 256K-token context window, Jamba outperforms other state-of-the-art models in its size class across a wide range of benchmarks, setting a new standard for efficiency and performance.
Jamba features a hybrid architecture that interleaves Transformer, Mamba, and mixture-of-experts (MoE) layers, optimizing memory use, throughput, and performance simultaneously. On long contexts, Jamba delivers three times the throughput of Transformer-based models of comparable size, enabling faster processing of the large-scale language tasks at the core of enterprise use cases.
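As a rough illustration of how such a hybrid stack can be organized, the sketch below interleaves attention, state-space-style, and MoE blocks in a repeating pattern. It is a minimal structural sketch only, not AI21's implementation: the layer pattern, dimensions, placeholder mixers, and top-1 routing are illustrative assumptions; the whitepaper describes the actual configuration.

```python
# Structural sketch of a hybrid Transformer / SSM / MoE layer stack.
# NOT AI21's implementation: the layer pattern, dimensions, and the
# placeholder mixers below are illustrative assumptions only.
import torch
import torch.nn as nn


class AttentionBlock(nn.Module):
    """Pre-norm self-attention block standing in for a Transformer layer."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out


class SSMBlock(nn.Module):
    """Placeholder for a Mamba-style selective state-space mixer (a GRU here)."""
    def __init__(self, d_model):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mixer = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x):
        out, _ = self.mixer(self.norm(x))
        return x + out


class MoEBlock(nn.Module):
    """Toy mixture-of-experts feed-forward layer with top-1 token routing."""
    def __init__(self, d_model, n_experts=4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):
        h = self.norm(x)
        choice = self.router(h).argmax(dim=-1)          # (batch, seq): one expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = (choice == i).unsqueeze(-1)          # tokens routed to expert i
            out = out + expert(h) * mask
        return x + out


class HybridStack(nn.Module):
    """Repeat an SSM-heavy block pattern with periodic attention and MoE layers."""
    def __init__(self, d_model=256, n_groups=2):
        super().__init__()
        pattern = lambda: [SSMBlock(d_model), MoEBlock(d_model),
                           SSMBlock(d_model), MoEBlock(d_model),
                           AttentionBlock(d_model), MoEBlock(d_model)]
        self.layers = nn.ModuleList([blk for _ in range(n_groups) for blk in pattern()])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    model = HybridStack()
    tokens = torch.randn(1, 16, 256)   # (batch, sequence length, d_model)
    print(model(tokens).shape)         # torch.Size([1, 16, 256])
```

The design intuition this illustrates is that most token mixing is handled by the memory-efficient state-space layers, while occasional attention layers preserve quality and MoE layers add capacity without growing the per-token compute.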
Scalability is another key feature: Jamba fits contexts of up to 140K tokens on a single GPU, making deployment more accessible and encouraging experimentation within the AI community.
Jamba's release marks two significant milestones in LLM innovation: successfully incorporating Mamba alongside the Transformer architecture, and advancing the hybrid SSM-Transformer design to deliver a smaller memory footprint and faster throughput on long contexts.
"We are excited to introduce Jamba, a groundbreaking hybrid architecture that combines the best of Mamba and Transformer technologies," said Or Dagan, VP of Product, at AI21. "This allows Jamba to offer unprecedented efficiency, throughput, and scalability, empowering developers and businesses to deploy critical use cases in production at record speed in the most cost-effective way."
Jamba is released with open weights under the Apache 2.0 license, inviting collaboration, experimentation, and further discoveries from the open source community. Its availability in the NVIDIA API catalog as a NIM inference microservice also streamlines access for enterprise applications, supporting seamless deployment and integration.
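For developers who want to experiment with the open weights directly, one common path is loading them through the Hugging Face transformers library. The snippet below is a minimal sketch: the ai21labs/Jamba-v0.1 model ID, precision, and device settings are assumptions for illustration rather than details from this announcement, so consult AI21's model card before running it.

```python
# Minimal sketch of loading the open Jamba weights with Hugging Face transformers.
# The model ID "ai21labs/Jamba-v0.1", precision, and device settings are assumed
# for illustration; check AI21's model card for the exact, supported configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face repo for the open weights

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread layers across the available GPUs
)

prompt = "Long-context retrieval over a 256K-token window lets a model"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```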
To learn more about Jamba, read the whitepaper and blog post available on AI21's website.
About AI21
AI21 is a leader in generative AI and Large Language Models, with the mission to empower businesses with state-of-the-art LLMs and AI systems. Founded in 2017, AI21 has raised a total of $336 million, with the most recent Series C round closing at $208 million in November 2023. It was among the first companies to bring generative AI to the masses and, to date, offers enterprise solutions and consumer applications. With the developer platform AI21 Studio, businesses can build their own generative AI-driven applications and services. AI21's consumer product, Wordtune, is the first AI-based reading and writing assistant to understand context and meaning, enabling its millions of users to refine and generate text or read and summarize text within seconds. For more information, please visit www.AI21.com.
SOURCE AI21