Benchmarks

Large Language Models (LLM)

For running LLM benchmarks, see the MLC container documentation.

Small Language Models (SLM)

Small language models are generally defined as having fewer than 7B parameters (Llama-7B is shown for reference).
For more data and info about running these models, see the SLM tutorial and MLC container documentation.

Vision Language Models (VLM)

This benchmark measures end-to-end pipeline performance for continuous streaming, as with Live Llava.
For more data and info about running these models, see the NanoVLM tutorial.
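
As a rough illustration of what an end-to-end streaming measurement involves, the sketch below times a whole per-frame pipeline rather than the model call alone. It is not the method used to produce the published numbers. The names `measure_streaming_fps` and `process_frame` are hypothetical, and `process_frame` stands in for whatever capture, inference, and output steps a real pipeline performs.

```python
import time
from typing import Any, Callable, Iterable


def measure_streaming_fps(
    frames: Iterable[Any],
    process_frame: Callable[[Any], Any],
    warmup: int = 2,
) -> float:
    """Return frames per second across the full pipeline, skipping warmup frames.

    `process_frame` is a hypothetical stand-in for the complete per-frame
    pipeline (capture, preprocessing, model inference, output).
    """
    count = 0
    start = None
    for i, frame in enumerate(frames):
        if i == warmup:
            # Start timing only after the warmup frames have been processed.
            start = time.perf_counter()
        process_frame(frame)
        if i >= warmup:
            count += 1
    if start is None or count == 0:
        raise ValueError("need more frames than warmup iterations")
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    # Dummy pipeline that sleeps for 10 ms per frame (an assumption for the demo).
    fps = measure_streaming_fps(range(20), lambda frame: time.sleep(0.01))
    print(f"{fps:.1f} frames/sec")
```

Skipping the first few frames keeps one-time costs such as model loading and cache warmup out of the steady-state figure, which is usually the number of interest for continuous streaming.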

Vision Transformers (ViT)

ViT performance data from [1] [2] [3].

Stable Diffusion

Riva

For running Riva benchmarks, see ASR Performance and TTS Performance.

Vector Database

For running vector database benchmarks, see the NanoDB container documentation.