Coming soon to iOS & Android

Your LLMs.
Your Phone.

One app for every model. Connect to LM Studio, Ollama, vLLM, or Anthropic — chat with any LLM from anywhere on your network.

Connects to everything

Everything you need.
Nothing you don't.

A focused client that does one thing exceptionally well: connect your phone to your models.

Real-time Streaming

Watch tokens arrive as they're generated. Server-sent events stream each token delta into a native UI that renders at 60 fps, for a desktop-class chat experience on mobile.
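
Under the hood, streaming like this is typically plain server-sent events (SSE) over an OpenAI-compatible chat endpoint. A minimal Kotlin sketch with OkHttp; the URL, model name, and parsing below are illustrative assumptions, not LocalLM's actual internals:

// Minimal sketch: consume an SSE token stream from an OpenAI-compatible
// server. URL and model name are placeholders.
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody

fun main() {
    val body = """
        {"model": "llama-3.3-70b", "stream": true,
         "messages": [{"role": "user", "content": "Hello"}]}
    """.trimIndent().toRequestBody("application/json".toMediaType())

    val request = Request.Builder()
        .url("http://192.168.1.42:1234/v1/chat/completions")
        .post(body)
        .build()

    OkHttpClient().newCall(request).execute().use { response ->
        val source = response.body?.source() ?: return
        // SSE frames arrive as "data: {json}" lines; "data: [DONE]" closes the stream.
        while (!source.exhausted()) {
            val line = source.readUtf8Line() ?: break
            if (!line.startsWith("data: ")) continue
            val payload = line.removePrefix("data: ")
            if (payload == "[DONE]") break
            println(payload) // each chunk carries one token delta to render
        }
    }
}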

Multi-Endpoint

Connect to every server on your network and to cloud providers like Anthropic. Switch between models mid-conversation.

Privacy First

Your conversations go only to the endpoints you choose; they're stored on your device and nowhere else. API keys live in hardware-backed secure storage. No telemetry, no cloud sync.
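
On Android, hardware-backed storage generally means a key held in the Android Keystore (the iOS analogue is the Keychain). A minimal Kotlin sketch using androidx.security.crypto, as one plausible approach; the file and key names are illustrative:

// Minimal sketch: persist an API key encrypted under a Keystore-backed
// master key. File and key names are illustrative, not LocalLM's layout.
import android.content.Context
import androidx.security.crypto.EncryptedSharedPreferences
import androidx.security.crypto.MasterKey

fun saveApiKey(context: Context, apiKey: String) {
    val masterKey = MasterKey.Builder(context)
        .setKeyScheme(MasterKey.KeyScheme.AES256_GCM) // key material stays in the Android Keystore
        .build()

    val prefs = EncryptedSharedPreferences.create(
        context,
        "locallm_secure_prefs",
        masterKey,
        EncryptedSharedPreferences.PrefKeyEncryptionScheme.AES256_SIV,
        EncryptedSharedPreferences.PrefValueEncryptionScheme.AES256_GCM
    )
    prefs.edit().putString("anthropic_api_key", apiKey).apply()
}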

Fine-tune Parameters

Temperature, top_p, max tokens, repeat penalty: adjust them per conversation or set global defaults.
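
These knobs map directly onto the request body of an OpenAI-compatible chat call. A sketch of such a payload with example values; note that parameter names vary by backend (llama.cpp-style servers accept repeat_penalty, while the strict OpenAI schema uses frequency_penalty):

// Illustrative request payload; values are examples only.
fun main() {
    val payload = """
        {
          "model": "llama-3.3-70b",
          "temperature": 0.7,
          "top_p": 0.9,
          "max_tokens": 1024,
          "repeat_penalty": 1.1,
          "messages": [{"role": "user", "content": "Summarize this thread."}]
        }
    """.trimIndent()
    println(payload)
}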

Model Discovery

Automatically detects every model loaded on your servers. See parameter count, quantization, context window, and VRAM usage at a glance.
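
Discovery like this typically rides on the server's own listing route, such as GET /v1/models on OpenAI-compatible servers (Ollama also exposes /api/tags). A minimal Kotlin sketch; the base URL is an example, and metadata beyond the model id varies by backend:

// Minimal sketch: list the models a server has loaded.
import okhttp3.OkHttpClient
import okhttp3.Request

fun listModels(baseUrl: String): String {
    val request = Request.Builder().url("$baseUrl/v1/models").build()
    OkHttpClient().newCall(request).execute().use { response ->
        // The response is a JSON object whose "data" array holds one entry per model.
        return response.body?.string() ?: ""
    }
}

fun main() {
    println(listModels("http://192.168.1.42:1234"))
}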

Markdown Rendering

Code blocks with syntax highlighting, bold, italic, lists — rendered natively with no WebView overhead.
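
On Android, WebView-free rendering usually means converting Markdown into native text spans. A sketch using the Markwon library, as one plausible approach rather than necessarily the renderer LocalLM ships:

// Minimal sketch: render Markdown as native Spannable text, no WebView.
import android.widget.TextView
import io.noties.markwon.Markwon

fun renderMarkdown(view: TextView, markdown: String) {
    val markwon = Markwon.create(view.context)
    // Parses the Markdown and applies it to the TextView as styled spans.
    markwon.setMarkdown(view, markdown)
}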

Three steps. Under a minute.

Add Your Endpoint

Point the app at any LLM server on your network or paste your Anthropic API key.

http://192.168.1.42:1234

Pick a Model

LocalLM auto-discovers every model on your server. Tap to select — see quant, context, and VRAM instantly.

Llama 3.3 70B · Q4_K_M · 8K ctx

Start Chatting

Streaming responses arrive in real time. Your entire conversation history is saved locally on your device.

Streaming at 42 tok/s

No subscriptions. No tricks.

One purchase, yours forever. Your models run on your hardware — we don't charge you monthly for that.

Free
$0
Free forever
  • 1 endpoint connection
  • Unlimited conversations
  • Full conversation history
  • SSE streaming
  • Markdown rendering
  • OpenAI & Anthropic compatible
Join Waitlist

Your models deserve
a better client.

Join the waitlist — be the first to know when we launch.