Real-time Streaming
Watch tokens arrive as they're generated. SSE streaming renders at 60fps for a desktop-class chat experience on mobile.
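Under the hood, streaming like this typically means parsing Server-Sent Events from an OpenAI-compatible `/v1/chat/completions` endpoint with `stream=true` (the shape LM Studio, vLLM, and Ollama's `/v1` endpoint all emit). A minimal sketch of that parsing step, using a captured stream fragment rather than a live connection:

```python
import json

def parse_sse_chunks(body: str):
    """Yield content deltas from an OpenAI-style SSE stream body.

    Assumes `data: {json}` lines ending with a `data: [DONE]`
    sentinel, as OpenAI-compatible servers emit.
    """
    for line in body.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        event = json.loads(payload)
        delta = event["choices"][0]["delta"].get("content")
        if delta:
            yield delta

# Example with a captured two-chunk fragment:
sample = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n'
    'data: [DONE]\n'
)
print("".join(parse_sse_chunks(sample)))  # -> Hello
```

In a real client each delta would be appended to the visible message as it arrives, which is what makes tokens appear one by one.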
One app for every model. Connect to LM Studio, Ollama, vLLM, or Anthropic — chat with any LLM from anywhere on your network.
A focused client that does one thing exceptionally well: connect your phone to your models.
Connect to any LLM server on your network and to Anthropic in the cloud. Switch between models mid-conversation.
Your conversations never leave your device. API keys stored in hardware-backed secure storage. No telemetry, no cloud sync.
Temperature, top_p, max tokens, repeat penalty — adjust per-conversation or set global defaults.
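One way the defaults-plus-overrides behavior can work, as a sketch: merge global defaults into each request body and let per-conversation values win. The field names below follow the OpenAI-style request shape plus the Ollama/llama.cpp-style `repeat_penalty`; the helper and default values are illustrative, not the app's actual internals.

```python
# Hypothetical global defaults; per-conversation overrides take precedence.
GLOBAL_DEFAULTS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 1024,
    "repeat_penalty": 1.1,  # Ollama/llama.cpp-style parameter name
}

def build_request(model: str, messages: list, **overrides) -> dict:
    """Build a chat request body, overrides winning over defaults."""
    params = {**GLOBAL_DEFAULTS, **overrides}
    return {"model": model, "messages": messages, "stream": True, **params}

req = build_request("llama3.1:8b",
                    [{"role": "user", "content": "Hi"}],
                    temperature=0.2)
print(req["temperature"], req["top_p"])  # -> 0.2 0.9
```

The dict-merge order (`{**defaults, **overrides}`) is the whole trick: later keys overwrite earlier ones, so a conversation only stores the parameters it changes.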
Automatically detects every model loaded on your servers. See parameter count, quantization, context window, and VRAM at a glance.
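Model discovery like this usually means querying the server's listing endpoint; for Ollama that is `GET /api/tags`, whose response includes per-model `details` such as `parameter_size` and `quantization_level`. A sketch of summarizing such a response (context window and VRAM come from other calls, so this covers name, parameter count, and quantization only; the sample data is made up):

```python
def summarize_models(tags_response: dict) -> list[str]:
    """Render one summary line per model from an Ollama /api/tags response."""
    lines = []
    for m in tags_response.get("models", []):
        d = m.get("details", {})
        lines.append(f"{m['name']}  {d.get('parameter_size', '?')}  "
                     f"{d.get('quantization_level', '?')}")
    return lines

# Illustrative response fragment in the /api/tags shape:
sample = {"models": [{"name": "llama3.1:8b",
                      "details": {"parameter_size": "8.0B",
                                  "quantization_level": "Q4_K_M"}}]}
print(summarize_models(sample))  # -> ['llama3.1:8b  8.0B  Q4_K_M']
```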
Code blocks with syntax highlighting, bold, italic, lists — rendered natively with no WebView overhead.
Point the app at any LLM server on your network or paste your Anthropic API key.
LocalLM auto-discovers every model on your server. Tap to select — see quant, context, and VRAM instantly.
Streaming responses arrive in real time. Your entire conversation history is saved locally on device.
One purchase, yours forever. Your models run on your hardware — we don't charge you monthly for that.
Join the waitlist — be the first to know when we launch.