New Multimodal
Qwen3.5 9B
Alibaba's dense Qwen3.5 9B vision-language model with Jetson-specific checkpoints for Orin and Thor
Memory Requirement 8GB RAM
Precision NVFP4 / W4A16
Size 5GB
Jetson Inference - Supported Inference Engines
🚀
Container # Run Command
sudo docker run -it --rm --pull always --runtime=nvidia --network host ghcr.io/nvidia-ai-iot/vllm:latest-jetson-orin vllm serve Kbenkhaled/Qwen3.5-9B-quantized.w4a16 --gpu-memory-utilization 0.8 --enable-prefix-caching --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser qwen3_coder Model Details
Qwen3.5 9B is a dense vision-language model in the Qwen3.5 family aimed at stronger reasoning, visual understanding, and agentic behavior on Jetson. This entry uses a W4A16 checkpoint on Jetson Orin and an NVFP4 checkpoint on Jetson Thor.
Inputs and Outputs
Input: Text and images
Output: Text
Intended Use Cases
- Visual reasoning: Stronger multimodal reasoning over image and text inputs
- Image understanding: Detailed captioning, scene description, and analysis
- Tool calling: Native Qwen tool-call parsing in vLLM
- Agents: Local assistants and workflow automation
Additional Resources
- Original Model - Base Qwen3.5 9B checkpoint
- W4A16 Checkpoint - Jetson Orin checkpoint
- NVFP4 Checkpoint - Jetson Thor checkpoint