Double-click the DMG to mount it. The app is code-signed and notarized by Apple, so Gatekeeper opens it without warnings.
Drag MLX Studio to the Applications folder. Close the DMG window and eject the disk image.
Open MLX Studio from Applications or Spotlight. On first launch, it installs vMLX Engine automatically with one click.
Search for and download any MLX model from HuggingFace directly in the app, or use models you already have. We publish optimized models at huggingface.co/JANGQ-AI.
Create a session, hit Start. Chat with AI, use agentic tools, or connect via the OpenAI-compatible API at localhost:8000.
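Since the server speaks the standard OpenAI chat-completions format, any OpenAI-style client can talk to it. A minimal sketch using only the Python standard library (the `/v1/chat/completions` path follows the OpenAI convention, and the model id is a hypothetical example; substitute a model you have downloaded):

```python
import json
from urllib import request

# Assumed base URL: MLX Studio's OpenAI-compatible API on the default port.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completions request for the local server."""
    payload = {
        "model": model,  # hypothetical model id; use one you downloaded
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("mlx-community/Qwen2.5-7B-Instruct-4bit", "Hello!")
# To actually send it (requires a running session in the app):
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Official OpenAI SDKs also work: point their `base_url` at `http://localhost:8000/v1`.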
More unified memory = larger models. 16 GB handles up to ~20B parameters, 32 GB handles ~35B, 64 GB handles ~70B, and 192 GB handles 400B+ MoE models. vMLX Engine's KV cache quantization (q4/q8) lets you push these limits further.
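The rule of thumb behind those numbers is weight memory ≈ parameter count × bytes per parameter, plus headroom for the KV cache, activations, and macOS itself. A sketch of that arithmetic (the +10% quantization-overhead factor is an assumption for illustration, not a number from vMLX Engine):

```python
def estimated_weight_gb(params_b: float, bits: int = 4) -> float:
    """Approximate in-memory size of quantized model weights in GB.

    params_b: parameter count in billions.
    bits: quantization width (4 for q4, 8 for q8, 16 for fp16).
    """
    # Quantized formats also store scales/zero-points, so effective bits
    # run slightly above the nominal width; +10% is a rough assumption.
    bytes_per_param = (bits / 8) * 1.10
    return params_b * bytes_per_param

# A ~20B model at 4-bit needs roughly 11 GB of weights, leaving room on a
# 16 GB Mac for the KV cache, the app, and the OS.
print(round(estimated_weight_gb(20), 1))
```

Quantizing the KV cache (q4/q8) shrinks the per-token cache cost the same way, which is why it extends how much context, or how large a model, fits in a given memory budget.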
MLX Studio ships as a single self-contained app. The DMG includes everything; no Python, pip, Docker, or command-line setup is required.
Features: beautiful streaming chat UI, 20+ agentic coding tools (file, shell, git, web search, browser), voice chat, vision/multimodal, collapsible reasoning blocks, inline tool call pills, HuggingFace model browser, remote endpoint support, and an OpenAI-compatible API.
MLX Studio is powered by vMLX Engine, the fastest local AI inference engine for Mac: 5-layer caching (prefix, paged KV, q4/q8 quantization, continuous batching, disk), speculative decoding, 50+ supported architectures, and Mamba/SSM support.
The engine installs automatically on first launch; no configuration is required.