Nemotron 3 Nano 30B A3B

Nemotron 3 Nano 30B A3B is a sparse hybrid Mamba-Transformer mixture-of-experts (MoE) model with 30B total parameters, of which only about 3B are active per token. It supports a context window of 262.1K tokens, and because per-token compute scales with the active parameters rather than the total, its throughput is closer to that of a 3B dense model than a 30B one.

Reasoning
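index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'nvidia/nemotron-3-nano-30b-a3b',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}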

Providers

Requests can be routed across multiple providers. To set a preference, specify a provider slug. Using a provider means you agree to its terms, listed under Legal.
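A minimal sketch of how a provider preference might be expressed in code. It assumes the request goes through the Vercel AI Gateway, whose `providerOptions.gateway.order` option lists provider slugs in preferred order; the page does not name the gateway, so treat that as an assumption. The slug `'deepinfra'` and the helper `withProviderPreference` are hypothetical names used for illustration only.

```typescript
// Shape of the options object this sketch builds (a local type, not an SDK type).
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

// Hypothetical helper: attaches an ordered provider preference to a request.
function withProviderPreference(prompt: string, providerSlugs: string[]): GatewayRequest {
  return {
    model: 'nvidia/nemotron-3-nano-30b-a3b',
    prompt,
    // Assumption: providers are tried in the order given here.
    providerOptions: { gateway: { order: [...providerSlugs] } },
  }
}

// 'deepinfra' is an assumed slug; use the one listed for the provider.
const request = withProviderPreference('Why is the sky blue?', ['deepinfra'])
```

The resulting object could then be passed to `streamText` from the `ai` package, for example `streamText(request)`.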

Provider   Context  Latency  Throughput  Input    Output   Release Date
DeepInfra  262K     0.2s     98 tps      $0.05/M  $0.24/M  12/01/2024

Other columns (no values listed): Cache, Web Search (per query), Capabilities, ZDR, No Training
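
The listed prices and speeds can be turned into a rough per-request estimate. The sketch below assumes that "Latency" means time to first token, that "Throughput" means output tokens per second, and that prices are per million tokens. The page does not define these terms, and the figures are snapshots that may change. The function names are illustrative.

```typescript
// Figures copied from the DeepInfra listing; treat them as approximate.
const INPUT_USD_PER_MILLION = 0.05
const OUTPUT_USD_PER_MILLION = 0.24
const LATENCY_SECONDS = 0.2 // assumed: time to first token
const THROUGHPUT_TOKENS_PER_SECOND = 98 // assumed: output tokens per second

// Estimated cost in USD for one request.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const micros =
    inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION
  return micros / 1_000_000
}

// Estimated wall-clock seconds to receive the full response.
function estimateSeconds(outputTokens: number): number {
  return LATENCY_SECONDS + outputTokens / THROUGHPUT_TOKENS_PER_SECOND
}
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens would cost roughly $0.0055, and a 980-token response would take about 10.2 seconds to stream.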