A fully native macOS application for running large language models locally. Dual inference engine architecture (Ollama + llama.cpp), one-click model downloads from Hugging Face, streaming chat with markdown rendering, and real-time system monitoring.
Real-time streaming chat with markdown rendering, multi-turn conversations, code blocks, and a system-monitoring overlay.
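A minimal sketch of how a client can consume that stream, assuming Ollama's newline-delimited JSON streaming API on its default port; the model name is a placeholder and error handling is elided:

```swift
import Foundation

// One streamed chunk from Ollama's /api/generate endpoint.
struct GenerateChunk: Decodable {
    let response: String
    let done: Bool
}

// Sketch: stream tokens from a local Ollama server and print them
// as they arrive. In a chat UI these would be appended to the view.
func streamCompletion(prompt: String) async throws {
    var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
    request.httpMethod = "POST"
    request.httpBody = try JSONSerialization.data(withJSONObject: [
        "model": "llama3.2",   // placeholder model name
        "prompt": prompt,
        "stream": true
    ])

    // Ollama emits one JSON object per line; decode each as it lands.
    let (bytes, _) = try await URLSession.shared.bytes(for: request)
    for try await line in bytes.lines {
        let chunk = try JSONDecoder().decode(GenerateChunk.self, from: Data(line.utf8))
        print(chunk.response, terminator: "")
        if chunk.done { break }
    }
}
```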
Switch between Ollama and a built-in llama.cpp server for maximum compatibility with any GGUF model.
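As a sketch of what that switch can reduce to: both engines speak HTTP on localhost, and llama.cpp's built-in server exposes an OpenAI-compatible chat route, so choosing an engine is largely choosing a base endpoint. The ports below assume each server's defaults (11434 for Ollama, 8080 for llama-server):

```swift
import Foundation

// Hypothetical dual-engine abstraction: each case maps to the chat
// endpoint of one locally running server.
enum InferenceEngine {
    case ollama
    case llamaCpp

    var chatEndpoint: URL {
        switch self {
        case .ollama:   return URL(string: "http://localhost:11434/api/chat")!
        case .llamaCpp: return URL(string: "http://localhost:8080/v1/chat/completions")!
        }
    }
}
```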
Browse and download GGUF models directly from the Hugging Face catalog. Multi-file download manager with progress tracking.
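For context, Hugging Face serves raw repository files at a predictable resolve URL, so a downloader only needs a repo ID and a filename. A small sketch (the example repo and file are illustrative, not taken from CoreLM's catalog):

```swift
import Foundation

// Build the direct-download URL for one file in a Hugging Face repo.
// Pattern: https://huggingface.co/{repo}/resolve/{revision}/{path}
func ggufURL(repo: String, file: String, revision: String = "main") -> URL {
    URL(string: "https://huggingface.co/\(repo)/resolve/\(revision)/\(file)")!
}

// Example (illustrative repo and quantization):
// ggufURL(repo: "TheBloke/Llama-2-7B-GGUF", file: "llama-2-7b.Q4_K_M.gguf")
```

A multi-file manager would wrap URLs like these in URLSession download tasks and surface per-file progress through URLSessionDownloadDelegate.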
Full local model management: import GGUF files, view model details, manage installed models, and track disk space.
Leverages Apple Silicon's unified memory architecture and Metal GPU acceleration for maximum inference performance.
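For a rough idea of what that looks like at the llama.cpp layer: llama-server's -ngl (--n-gpu-layers) flag controls how many layers are offloaded to the GPU, and Metal is the default backend in Apple Silicon builds. A sketch of launching the server, with the binary path as a placeholder:

```swift
import Foundation

// Sketch: start llama-server with full GPU offload. The executable
// path is hypothetical; a real app would point at its bundled binary.
func launchLlamaServer(modelPath: String) throws -> Process {
    let server = Process()
    server.executableURL = URL(fileURLWithPath: "/usr/local/bin/llama-server")
    server.arguments = [
        "-m", modelPath,   // GGUF model to serve
        "-ngl", "99",      // offload all layers to the Metal GPU
        "--port", "8080"   // match the endpoint the app talks to
    ]
    try server.run()
    return server
}
```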
Real-time CPU, memory, and GPU usage during inference. See exactly how your Mac handles model workloads.
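As one example of where such numbers can come from: macOS exposes system-wide memory counters through the Mach host-statistics API, which an overlay can poll on a timer. This sketch reads free memory only; CPU and GPU figures come from other host and IOKit interfaces:

```swift
import Foundation

// Sketch: sample system-wide free memory via Mach host statistics.
func freeMemoryBytes() -> UInt64? {
    var stats = vm_statistics64()
    var count = mach_msg_type_number_t(
        MemoryLayout<vm_statistics64_data_t>.size / MemoryLayout<integer_t>.size)
    let result = withUnsafeMutablePointer(to: &stats) {
        $0.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            host_statistics64(mach_host_self(), HOST_VM_INFO64, $0, &count)
        }
    }
    guard result == KERN_SUCCESS else { return nil }
    // Page counts multiplied by the kernel page size give bytes.
    return UInt64(stats.free_count) * UInt64(vm_kernel_page_size)
}
```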
Get CoreLM.dmg from GitHub Releases
Double-click the downloaded file to mount the disk image
Drag CoreLM.app to the Applications folder
Open CoreLM and start chatting locally