Gemma 4 E2B
Google's compact frontier Gemma 4 model for efficient multimodal and agentic workloads
Quick Start Runner
Target Environment
Authentication
This model requires a Hugging Face access token. The token is inserted into the command and never stored.
vLLM Configuration
Configure vLLM server parameters. Leave empty to use defaults.
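Max model length (--max-model-len): maximum context length, in tokens, that the model can handle
GPU memory utilization (--gpu-memory-utilization): fraction of GPU memory to use (0.1 to 1.0)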
Commands are auto-generated based on your configuration settings.
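As a rough illustration of how such a command could be assembled from the two settings above, here is a minimal Python sketch. It is not the page's actual generator. The function name build_vllm_command and the model ID google/gemma-4-E2B-it are hypothetical placeholders. The sketch assumes the access token is exported separately as the HF_TOKEN environment variable rather than written into the command string. Settings left empty are omitted so that vLLM falls back to its own defaults.

```python
import shlex
from typing import Optional

# Hypothetical model ID, used only for illustration; substitute the
# repository name shown on the model page.
MODEL_ID = "google/gemma-4-E2B-it"


def build_vllm_command(
    model_id: str,
    max_model_len: Optional[int] = None,
    gpu_memory_utilization: Optional[float] = None,
) -> str:
    """Return a `vllm serve` command line for the given settings.

    A setting left as None is omitted, so vLLM uses its default.
    The Hugging Face token is assumed to be in the HF_TOKEN
    environment variable and is deliberately not embedded here.
    """
    args = ["vllm", "serve", model_id]

    if max_model_len is not None:
        if max_model_len <= 0:
            raise ValueError("max_model_len must be a positive integer")
        args += ["--max-model-len", str(max_model_len)]

    if gpu_memory_utilization is not None:
        # Range taken from the configuration help text (0.1 to 1.0).
        if not 0.1 <= gpu_memory_utilization <= 1.0:
            raise ValueError("gpu_memory_utilization must be in [0.1, 1.0]")
        args += ["--gpu-memory-utilization", str(gpu_memory_utilization)]

    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_vllm_command(MODEL_ID, max_model_len=8192,
                             gpu_memory_utilization=0.9))
```

Returning a single shell-quoted string keeps the result easy to copy and paste, while building it from a list of arguments avoids quoting mistakes if a model ID ever contains unusual characters.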