Tutorial - llamaspeak

Talk live with Llama using Riva ASR/TTS, and chat about images with LLaVA!

  • llamaspeak:v1 - uses text-generation-webui loaders for LLMs (llama.cpp, exllama, AutoGPTQ, Transformers)
  • llamaspeak:v2 - uses AWQ/MLC quantization from the local_llm package, and runs as a web-based voice chat agent

llamaspeak v2 has multimodal support for chatting about images with quantized LLaVA-1.5:

*Video: Multimodal Voice Chat with LLaVA-1.5 13B on NVIDIA Jetson AGX Orin (container: local_llm)*

See the Voice Chat section of the local_llm documentation to run llamaspeak v2.