NVIDIA Nemotron 3 Super 120B A12B

NVIDIA Nemotron 3 Super 120B A12B is NVIDIA's hybrid Mamba-Transformer MoE model with 120B total and 12B active parameters, built for complex multi-agent applications. It features latent MoE and multi-token prediction.

index.ts
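import { streamText } from 'ai'

const result = streamText({
  model: 'nvidia/nemotron-3-super-120b-a12b',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}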

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
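Setting a provider preference could look roughly like the sketch below. It only builds the request options object and makes no network call. The `providerOptions.gateway.order` shape, the helper name `withProviderOrder`, and the slugs `bedrock` and `baseten` are assumptions made for illustration; check the docs and copy the real slugs from the provider list before relying on any of them.

```typescript
// Assumed shape of a request that carries a provider preference.
// `providerOptions.gateway.order` is an assumption, not confirmed by this page.
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

// Hypothetical helper: attaches an ordered list of provider slugs to a request.
function withProviderOrder(prompt: string, order: string[]): GatewayRequest {
  if (order.length === 0) {
    throw new Error('order must contain at least one provider slug')
  }
  return {
    model: 'nvidia/nemotron-3-super-120b-a12b',
    prompt,
    // Copy the array so later changes by the caller do not leak into the request.
    providerOptions: { gateway: { order: [...order] } },
  }
}

// Placeholder slugs; replace them with the slugs copied from the provider list.
const request = withProviderOrder('Why is the sky blue?', ['bedrock', 'baseten'])
console.log(JSON.stringify(request, null, 2))
```

If the assumed shape matches the gateway's API, the resulting object could be spread into a `streamText` call in place of the literal options shown in index.ts.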

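| Provider | Context | Latency | Throughput | Input | Output | Cache | Web Search (per query) | Capabilities | ZDR | No Training | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Amazon Bedrock | 256K | 1.6s | 103tps | $0.15/M | $0.65/M | | | | | | 03/18/2026 |
| Baseten | 256K | 0.2s | | $0.30/M | $0.75/M | Read: (none shown); Write: $0.06/M | | +1 | | | 03/18/2026 |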
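As a rough illustration of how the per-million-token prices translate into request cost, here is a minimal sketch. The input and output prices are copied from the provider listing above; the helper name `estimateCostUsd` and the token counts are hypothetical, and cache pricing is ignored.

```typescript
// Per-million-token prices in USD, copied from the provider listing.
type Pricing = { inputPerM: number; outputPerM: number }

const pricing: Record<string, Pricing> = {
  'Amazon Bedrock': { inputPerM: 0.15, outputPerM: 0.65 },
  Baseten: { inputPerM: 0.3, outputPerM: 0.75 },
}

// Hypothetical helper: cost in USD for one request, ignoring cache pricing.
function estimateCostUsd(p: Pricing, inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * p.inputPerM
  const outputCost = (outputTokens / 1_000_000) * p.outputPerM
  return inputCost + outputCost
}

// Example workload (made-up numbers): 200K input tokens and 50K output tokens.
for (const [provider, p] of Object.entries(pricing)) {
  const cost = estimateCostUsd(p, 200_000, 50_000)
  console.log(`${provider}: $${cost.toFixed(4)}`)
}
```

Price is only one axis. The listing also shows different latency figures for the two providers, so a cheaper provider is not automatically the better choice for interactive use.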