Run LLMs on your M-series Mac with Apple’s new MLX machine learning framework


Apple has released MLX, an efficient machine learning framework tailored for Apple silicon, and MLX Data, a flexible data loading package.

Both come from Apple’s machine learning research team. MLX’s Python API closely follows NumPy, with a few differences:

1. Composable function transformations: MLX provides automatic differentiation, automatic vectorization, and computational graph optimization through composable function transformations.
2. Lazy computation: Computations in MLX are lazy, meaning that arrays are materialized only when needed.
3. Multi-device support: Operations can be executed on any supported device, such as CPU and GPU.
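
As a rough illustration of the first two points, here is a minimal sketch using MLX's Python API. It assumes MLX is installed (`pip install mlx`) on an Apple silicon Mac. The function `cube` and the input value are our own illustrative choices, not something from Apple's announcement.

```python
# Minimal sketch, not taken from Apple's announcement. Assumes MLX is
# installed (pip install mlx) on an Apple silicon Mac; the guard only
# keeps the file importable on machines without MLX.
try:
    import mlx.core as mx
except ImportError:
    mx = None


def cube(x):
    # Illustrative function f(x) = x^3 (our choice, not from the article).
    return x * x * x


def first_and_second_derivative(value):
    x = mx.array(value)
    # mx.grad returns a new function, so transformations can be composed.
    d1 = mx.grad(cube)(x)
    d2 = mx.grad(mx.grad(cube))(x)
    # Lazy computation: d1 and d2 are graph nodes until they are evaluated.
    mx.eval(d1, d2)
    return d1.item(), d2.item()


derivatives = first_and_second_derivative(3.0) if mx is not None else None
print(derivatives)
```

Mathematically, the first and second derivatives of x^3 at x = 3 are 27 and 18. Calling `mx.eval` makes the point at which the arrays are materialized explicit; reading a value with `.item()` would also force the computation.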

The design of MLX is inspired by frameworks such as PyTorch, JAX, and ArrayFire. A notable difference between these frameworks and MLX is the unified memory model, Apple writes.

Arrays in MLX live in shared memory, so operations on them can run on any supported device type without copying data. MLX Data, also available on GitHub, is a framework-agnostic and flexible data-loading package.
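
The unified memory model can be sketched as follows: the same two arrays are used by one operation on the CPU and one on the GPU, with no explicit transfer step in between. This is a minimal example that assumes MLX is installed on an Apple silicon Mac with a usable GPU. The array values and the helper name `add_on_both_devices` are hypothetical.

```python
# Minimal sketch of MLX's unified memory model. Assumes MLX is installed
# (pip install mlx) on an Apple silicon Mac; the guard only keeps the
# file importable on machines without MLX.
try:
    import mlx.core as mx
except ImportError:
    mx = None


def add_on_both_devices():
    # The arrays are created once and are not tied to a device.
    a = mx.array([1.0, 2.0, 3.0])
    b = mx.array([10.0, 20.0, 30.0])
    # The device is chosen per operation via the stream argument,
    # without copying a or b between CPU and GPU memory.
    on_cpu = mx.add(a, b, stream=mx.cpu)
    on_gpu = mx.add(a, b, stream=mx.gpu)
    mx.eval(on_cpu, on_gpu)
    return on_cpu.tolist(), on_gpu.tolist()


sums = add_on_both_devices() if mx is not None else None
print(sums)
```

In frameworks with separate device memory, the equivalent code would typically need an explicit move of each array to the target device before the operation runs.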



Run Mistral and Llama on your M2 Ultra

With MLX and MLX Data, users can perform tasks such as training a Transformer language model or fine-tuning with LoRA, text generation with Mistral, image generation with Stable Diffusion, and speech recognition with Whisper. For an example of how to get started with MLX and Mistral, see this tutorial.

The following video shows the performance of a Llama v1 7B model implemented in MLX and running on an M2 Ultra, highlighting the capabilities of MLX on Apple Silicon devices.

Video: Awni Hannun via

For details, see the MLX GitHub repository and Apple’s documentation.

So far, Apple has mostly talked publicly about “machine learning” and how it’s implementing ML features in its products, such as better word prediction for its iPhone keyboard.

