Supported Models

Run the latest generative AI models on your Jetson device

Featured Models

Latest releases with day-0 support on Jetson

Qwen3.5 35B-A3B (MoE)

Alibaba's latest Mixture-of-Experts model with 35B total / 3B active parameters, featuring native tool calling and multi-token prediction (MTP) speculative decoding

Qwen3.5 9B

Alibaba's dense Qwen3.5 9B vision-language model with Jetson-specific checkpoints for Orin and Thor

Cosmos Reason 2 8B

NVIDIA's 8B-parameter vision-language model with chain-of-thought reasoning

All Models

Compact directory, grouped by model family

Google Gemma 3

FunctionGemma
Gemma 3 270M
Gemma 3 1B
Gemma 3 4B
Gemma 3 12B
Gemma 3 27B

NVIDIA Nemotron

Nemotron3 Nano 30B-A3B
Nemotron Nano 9B v2
Nemotron Nano 12B VL

NVIDIA Cosmos Reason

Cosmos Reason 1 7B
Cosmos Reason 2 2B
Cosmos Reason 2 8B (New)

Alibaba Qwen3.5

Qwen3.5 35B-A3B (MoE) (New)
Qwen3.5 27B
Qwen3.5 9B (New)
Qwen3.5 4B
Qwen3.5 0.8B

OpenAI GPT OSS

GPT OSS 20B
GPT OSS 120B

Alibaba Qwen3

Qwen3 4B
Qwen3 8B
Qwen3 30B-A3B (MoE)
Qwen3 32B
Qwen3 VL 4B
Qwen3 VL 8B

Mistral AI Ministral 3

Ministral 3 3B Instruct
Ministral 3 8B Instruct
Ministral 3 14B Instruct
Ministral 3 3B Reasoning
Ministral 3 8B Reasoning
Ministral 3 14B Reasoning

Meta Llama 3

Llama 3.2 3B
Llama 3.1 8B
Llama 3.1 70B

Performance Comparison

Benchmarks on Jetson running vLLM

[Interactive benchmark chart, filterable by platform and concurrency]

* Input/output sequence length (ISL/OSL) for all benchmarks: 2048/128 tokens

* Unless otherwise specified, all models use W4A16 quantization on Orin and NVFP4 on Thor.
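
To get a rough sense of what 4-bit weight quantization means for memory, here is a minimal back-of-the-envelope sketch. It assumes that W4A16 and NVFP4 both store roughly 4 bits per weight, and it ignores quantization scale factors, any layers kept in higher precision, the KV cache, and runtime overhead, so real usage will be higher. The helper name `approx_weight_gib` is hypothetical and is not part of vLLM or any Jetson tooling. For a Mixture-of-Experts model the total parameter count is used, since all experts generally have to be resident in memory even though only a few are active per token.

```python
def approx_weight_gib(total_params_billion: float, weight_bits: int = 4) -> float:
    """Rough size of the model weights alone, in GiB.

    Assumption: every parameter is stored at `weight_bits` bits.
    Scale factors, KV cache, and activations are not counted.
    """
    total_bytes = total_params_billion * 1e9 * weight_bits / 8
    return total_bytes / 2**30


# Parameter counts (in billions) taken from the model names on this page.
models = {
    "Qwen3.5 35B-A3B (MoE)": 35,  # total parameters, not the 3B active
    "Qwen3.5 9B": 9,
    "Cosmos Reason 2 8B": 8,
}

for name, params_b in models.items():
    gib = approx_weight_gib(params_b)
    print(f"{name}: roughly {gib:.1f} GiB of weights at 4 bits per weight")
```

Treat the result as a lower bound when checking whether a model is likely to fit on a given Jetson module. The 2048-token input and 128-token output used in the benchmarks add KV-cache memory on top of this, and that grows with concurrency.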