Tutorial - Introduction


Our tutorials are divided into categories roughly based on model modality, the type of data to be processed or generated.

Text (LLM)

text-generation-webui Interact with a local AI assistant by running an LLM with oobabooga's text-generation-webui
llamaspeak Talk live with Llama using Riva ASR/TTS, and chat about images with LLaVA!

Text + Vision (VLM)

Give your locally running LLM access to vision!

Mini-GPT4 Mini-GPT4, an open-source model that demonstrates vision-language capabilities.
LLaVA Large Language and Vision Assistant, a multimodal model that combines a vision encoder and the Vicuna LLM for general-purpose visual and language understanding.

Image Generation

Stable Diffusion Run AUTOMATIC1111's stable-diffusion-webui to generate images from prompts
Stable Diffusion XL A newer ensemble pipeline, consisting of a base model and a refiner, that generates significantly enhanced and more detailed images.

Vision Transformers (ViT)

EfficientViT MIT Han Lab's EfficientViT, Multi-Scale Linear Attention for High-Resolution Dense Prediction
NanoSAM NanoSAM, a SAM model variant capable of running in real time on Jetson
NanoOWL OWL-ViT optimized to run in real time on Jetson with NVIDIA TensorRT
SAM Meta's SAM, Segment Anything model
TAM TAM, Track-Anything model, is an interactive tool for video object tracking and segmentation

Vector Database

NanoDB Interactive demo showing the impact of a vector database that handles multimodal data


Audio

AudioCraft Meta's AudioCraft, for producing high-quality audio and music
Whisper OpenAI's Whisper, a pre-trained model for automatic speech recognition (ASR)


Knowledge Distillation
SSD + Docker
Memory optimization

About NVIDIA Jetson


We are mainly targeting Jetson Orin generation devices for deploying the latest LLMs and generative AI models.

Jetson AGX Orin 64GB Developer Kit
GPU: 2048-core NVIDIA Ampere architecture GPU with 64 Tensor Cores
RAM: 64GB
Storage: 64GB eMMC (+ NVMe SSD)

Jetson AGX Orin Developer Kit
GPU: 2048-core NVIDIA Ampere architecture GPU with 64 Tensor Cores
RAM: 32GB
Storage: 64GB eMMC (+ NVMe SSD)

Jetson Orin Nano Developer Kit
GPU: 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores
RAM: 8GB
Storage: microSD card (+ NVMe SSD)
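
As a rough guide to which kit suits a given model, the memory sizes listed above can be compared against the approximate size of the model weights (parameter count times bits per weight, divided by 8). The sketch below is illustrative only: the function names and the 70% headroom budget are assumptions made for this example, not figures from the tutorials, and it ignores activations, the KV cache, and the operating system, all of which share the same memory on Jetson.

```python
# Memory per developer kit in GB, as listed above.
DEVICE_RAM_GB = {
    "Jetson AGX Orin 64GB Developer Kit": 64,
    "Jetson AGX Orin Developer Kit": 32,
    "Jetson Orin Nano Developer Kit": 8,
}


def weight_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def kits_that_fit(params_billions: float, bits_per_weight: int,
                  headroom: float = 0.7) -> list[str]:
    """Return the kits whose memory can hold the weights.

    `headroom` is a hypothetical budget: the fraction of total memory
    allowed for weights, leaving the rest for everything else.
    """
    size = weight_size_gb(params_billions, bits_per_weight)
    return [name for name, ram in DEVICE_RAM_GB.items()
            if size <= headroom * ram]


if __name__ == "__main__":
    # (parameters in billions, bits per weight)
    for params, bits in [(7, 16), (7, 4), (70, 4)]:
        size = weight_size_gb(params, bits)
        print(f"{params}B @ {bits}-bit ~ {size:.1f} GB:",
              kits_that_fit(params, bits))
```

Under these assumptions, lowering the bits per weight (quantization) is what brings a larger model within reach of the smaller kits.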