Multimodal

Qwen3.5 9B

Alibaba's dense Qwen3.5 9B vision-language model with Jetson-specific checkpoints for Orin and Thor

Parameters: 9B

Modalities: Text, Image

Context Length: 256K

License: Apache 2.0

Precision: NVFP4 (Jetson Thor), W4A16 (Jetson Orin)

Serve the model

Start server

The serve command depends on the Jetson module (Orin or Thor), the inference engine, and any optional parameters you select. Choose these in the command builder, then copy the generated serve command.

Call the model over Web API

Copy a client command below and paste it into your terminal to make a Web API request to the model you just served.

curl -s http://${JETSON_HOST}:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3.5-9B",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

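import os

from openai import OpenAI

# Python does not expand shell-style ${JETSON_HOST}, so read the variable explicitly.
# The "localhost" fallback is an assumption for running the client on the Jetson itself.
jetson_host = os.environ.get("JETSON_HOST", "localhost")

client = OpenAI(
    base_url=f"http://{jetson_host}:8000/v1",
    api_key="not-needed",  # vLLM / llama.cpp typically do not enforce a key
)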

completion = client.chat.completions.create(
    model="Qwen/Qwen3.5-9B",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
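
Because this entry uses different checkpoints on Orin and Thor, the model name the server registers may not always be "Qwen/Qwen3.5-9B". The sketch below asks the OpenAI-compatible endpoint which models it serves and falls back to the first one it reports. `resolve_model_id` is a hypothetical helper name, and the fallback rule is an assumption rather than documented behavior of this page.

```python
def resolve_model_id(client, preferred: str = "Qwen/Qwen3.5-9B") -> str:
    """Return `preferred` if the server lists it, else the first model id it reports."""
    # client.models.list() queries GET /v1/models on the configured base_url.
    served = [model.id for model in client.models.list().data]
    if not served:
        raise RuntimeError("the server reports no models")
    # Assumption: a single-model server exposes the checkpoint we want first.
    return preferred if preferred in served else served[0]
```

Pass the `client` created above, then use the returned string as the `model` argument of `client.chat.completions.create`.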

Model Details

Qwen3.5 9B is a dense vision-language model in the Qwen3.5 family aimed at stronger reasoning, visual understanding, and agentic behavior on Jetson. This entry uses a W4A16 checkpoint on Jetson Orin and an NVFP4 checkpoint on Jetson Thor.

Inputs and Outputs

Input: Text and images

Output: Text
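
Since the model accepts images as well as text, a request can combine both in one user turn. The following is a minimal sketch, assuming the server accepts the OpenAI-style `image_url` content part with a base64 data URL. The helper names (`image_to_data_url`, `build_image_messages`, `ask_about_image`), the JPEG fallback, and the `localhost` default are illustrative assumptions, not part of the original page.

```python
import base64
import mimetypes
import os


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "image/jpeg"  # assumption: fall back to JPEG when the type is unknown
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_image_messages(image_url: str, question: str) -> list[dict]:
    """Build a single user turn that pairs one image with a text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": question},
            ],
        }
    ]


def ask_about_image(path: str, question: str) -> str:
    """Send the image and question to the server and return the text reply."""
    from openai import OpenAI  # imported lazily so the helpers above work without it

    host = os.environ.get("JETSON_HOST", "localhost")  # assumed default
    client = OpenAI(base_url=f"http://{host}:8000/v1", api_key="not-needed")
    completion = client.chat.completions.create(
        model="Qwen/Qwen3.5-9B",
        messages=build_image_messages(image_to_data_url(path), question),
    )
    return completion.choices[0].message.content
```

With a server running, `ask_about_image("photo.jpg", "Describe this scene.")` would return the model's text answer; `photo.jpg` is a placeholder path.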

Intended Use Cases

Visual reasoning: Multimodal reasoning over combined image and text inputs
Image understanding: Detailed captioning, scene description, and analysis
Tool calling: Native Qwen tool-call parsing in vLLM
Agents: Local assistants and workflow automation
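
The tool-calling use case above can be sketched as a two-step exchange: the model requests a function call, the client runs it locally, and the result is sent back for a final answer. This assumes the server was started with tool calling enabled and that `client` is an OpenAI-compatible client like the one shown earlier. `get_weather`, its canned reading, and the helper names are hypothetical and exist only for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns a canned reading for illustration."""
    return json.dumps({"city": city, "temperature_c": 21})


# JSON Schema description of the tool, in the OpenAI "tools" format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch one tool call requested by the model to a local function."""
    registry = {"get_weather": get_weather}
    if name not in registry:
        raise KeyError(f"unknown tool: {name}")
    return registry[name](**json.loads(arguments_json))


def chat_with_tools(client, question: str, model: str = "Qwen/Qwen3.5-9B") -> str:
    """Ask a question, execute any requested tool calls, and return the final text."""
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(
        model=model, messages=messages, tools=[WEATHER_TOOL]
    )
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # the model answered directly
    messages.append(reply)  # keep the assistant turn that requested the tools
    for call in reply.tool_calls:
        result = run_tool_call(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    second = client.chat.completions.create(
        model=model, messages=messages, tools=[WEATHER_TOOL]
    )
    return second.choices[0].message.content
```

Keeping the dispatch in a small registry makes it easy to add tools without touching the request loop. A production agent would also validate arguments and cap the number of tool rounds.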

Additional Resources

Original Model - Base Qwen3.5 9B checkpoint
W4A16 Checkpoint - Jetson Orin checkpoint
NVFP4 Checkpoint - Jetson Thor checkpoint