MLX Studio for macOS

Version 1.2.1 · Apple Silicon (arm64) · 361 MB
Download from GitHub
Code-signed & notarized · Developer ID: ShieldStack LLC · macOS 26+ (Tahoe)

Installation

Open the DMG

Double-click the DMG to mount it. The app is code-signed and notarized by Apple — no Gatekeeper warnings.

Drag to Applications

Drag MLX Studio to the Applications folder. Close the DMG window and eject the disk image.

Launch MLX Studio

Open MLX Studio from Applications or Spotlight. On first launch, a one-click prompt installs vMLX Engine for you.

Pick a model

Search and download any MLX model from HuggingFace directly in the app, or use models you already have. We publish optimized models at huggingface.co/JANGQ-AI.

Start chatting

Create a session, hit Start. Chat with AI, use agentic tools, or connect via the OpenAI-compatible API at localhost:8000.
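Because the local server speaks the OpenAI chat-completions wire format, any OpenAI-style client should work. Here is a minimal stdlib-only Python sketch; the `/v1/chat/completions` path, the `"default"` model name, and the response shape are assumptions based on the OpenAI convention, not confirmed by this page:

```python
import json
import urllib.request

# Assumed default endpoint for MLX Studio's local server
API_URL = "http://localhost:8000/v1/chat/completions"

def build_payload(prompt: str, model: str = "default", stream: bool = False) -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def chat(prompt: str, model: str = "default") -> str:
    """Send one chat turn to the local server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-convention response: first choice's message content
    return body["choices"][0]["message"]["content"]
```

With MLX Studio running, `chat("Hello!")` should return the model's reply as a string; official OpenAI SDKs pointed at `http://localhost:8000/v1` should behave the same way.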

Requirements

Platform: macOS 26+ (Tahoe); remote endpoints available on macOS 14+
Chip: Apple Silicon (M1 or later)
RAM (minimum): 8 GB unified memory
RAM (recommended): 16 GB+ (for 7B–20B models)

The more unified memory you have, the larger the models you can run: 16 GB handles up to ~20B parameters, 32 GB ~35B, 64 GB ~70B, and 192 GB 400B+ MoE models. vMLX Engine's KV-cache quantization (q4/q8) lets you push these limits further.
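These figures follow roughly from weight size alone: a model's weights occupy about parameters × (bits per weight) / 8 bytes. A back-of-envelope sketch (my own illustration, ignoring KV cache, activations, and runtime overhead, which all add on top):

```python
def est_weights_gb(params_billion: float, bits: int = 4) -> float:
    """Rough weight footprint in GB: billions of params * bits per weight / 8.
    Billions of bytes are treated as GB; KV cache and overhead are excluded."""
    return params_billion * bits / 8

# A 20B model at 4-bit quantization needs about 10 GB for weights,
# which fits a 16 GB machine with headroom for the KV cache and macOS.
```

For example, `est_weights_gb(70)` gives 35 GB at 4-bit, which is why ~70B models call for 64 GB of unified memory once cache and system overhead are included.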

What's included

MLX Studio is a single self-contained app. The DMG includes everything — no Python, pip, Docker, or command-line setup. On first launch, it installs vMLX Engine automatically.

Features: beautiful streaming chat UI, 20+ agentic coding tools (file, shell, git, web search, browser), voice chat, vision/multimodal, collapsible reasoning blocks, inline tool call pills, HuggingFace model browser, remote endpoint support, and an OpenAI-compatible API.

About the Engine

MLX Studio is powered by vMLX Engine — the fastest local AI inference engine for Mac. 5-layer caching (prefix + paged KV + q4/q8 quantization + continuous batching + disk), speculative decoding, 50+ architectures, and Mamba/SSM support.

The engine installs automatically on first launch — no configuration required.

Report a Bug