Releases: ml-explore/mlx

v0.14.1

31 May 19:34
0798824

🚀

v0.14.0

24 May 01:33
9f9cb7a

Highlights

  • Small-size build that JIT-compiles kernels and omits the CPU backend, resulting in a binary under 4 MB
    • Series of PRs 1, 2, 3, 4, 5
  • mx.gather_qmm, a quantized equivalent of mx.gather_mm, which speeds up MoE inference by ~2x (see the sketch after this list)
  • Grouped 2D convolutions
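
A minimal sketch of routing tokens through quantized expert weights with mx.gather_qmm. The argument names (rhs_indices, transpose, group_size, bits) are assumptions here, mirroring mx.gather_mm and mx.quantized_matmul:

```python
import mlx.core as mx

num_experts, d_out, d_in = 8, 128, 64
experts = [mx.random.normal((d_out, d_in)) for _ in range(num_experts)]

# Quantize each expert matrix, then stack into (num_experts, ...) arrays
w_q, scales, biases = (
    mx.stack(t)
    for t in zip(*(mx.quantize(w, group_size=64, bits=4) for w in experts))
)

x = mx.random.normal((4, 1, d_in))    # 4 tokens, one row each
rhs_indices = mx.array([0, 3, 3, 7])  # the expert chosen for each token

# One quantized matmul per token against its routed expert
# (argument names are assumptions, as noted above)
y = mx.gather_qmm(x, w_q, scales, biases, rhs_indices=rhs_indices,
                  transpose=True, group_size=64, bits=4)
mx.eval(y)
```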

Core

  • mx.conjugate
  • mx.conv3d and nn.Conv3d
  • List-based indexing
  • Started mx.distributed, which uses MPI (if installed) for communication across machines (see the sketch after this list)
    • mx.distributed.init
    • mx.distributed.all_gather
    • mx.distributed.all_reduce_sum
  • Support for conversion to and from DLPack
  • mx.linalg.cholesky on CPU
  • mx.quantized_matmul sped up for vector-matrix products
  • mx.trace
  • mx.block_masked_mm now supports floating point masks!
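
A minimal sketch of the new mx.distributed API over MPI; it is assumed that the group returned by mx.distributed.init exposes rank() and size():

```python
# Run with: mpirun -np 2 python example.py
import mlx.core as mx

world = mx.distributed.init()  # falls back to a single-process group without MPI
x = mx.ones((4,)) * world.rank()

# Sum the array across all processes
total = mx.distributed.all_reduce_sum(x)
mx.eval(total)
print(world.rank(), total)
```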

Fixes

  • Improved error messaging in eval
  • Added some missing docs
  • Fixed a scatter index bug
  • The extensions example now compiles and runs
  • Fixed a CPU copy bug with many dimensions

v0.13.1

17 May 03:52
6a9b584

🚀

v0.13.0

10 May 01:21
8bd6bfa

Highlights

  • Block sparse matrix multiply speeds up MoEs by >2x
  • Improved quantization algorithm should work well for all networks
  • Improved GPU command submission speeds up training and inference

Core

  • Bitwise ops added (see the sketch after this list):
    • mx.bitwise_[or|and|xor], mx.[left|right]_shift, and operator overloads
  • Groups added to Conv1d
  • Added mx.metal.device_info to better inform memory limits
  • Added resettable memory stats
  • mlx.optimizers.clip_grad_norm and mlx.utils.tree_reduce added
  • Add mx.arctan2
  • Unary ops now accept array-like inputs, i.e. one can do mx.sqrt(2)
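
A quick sketch of the new bitwise ops and scalar-friendly unary ops:

```python
import mlx.core as mx

a = mx.array([0b1100, 0b1010])
b = mx.array([0b0110, 0b0101])

mx.bitwise_and(a, b)  # equivalently a & b
mx.bitwise_or(a, b)   # equivalently a | b
mx.bitwise_xor(a, b)  # equivalently a ^ b
mx.left_shift(a, 1)   # equivalently a << 1

mx.arctan2(mx.array(1.0), mx.array(1.0))  # pi / 4
mx.sqrt(2)  # unary ops now accept array-like inputs
```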

Bugfixes

  • Fixed shape for slice update
  • Fixed a bug in quantize that used slightly wrong scales/biases
  • Fixed memory leak for multi-output primitives encountered with gradient checkpointing
  • Fixed conversion from other frameworks for all datatypes
  • Fixed index overflow for matmul with large batch size
  • Fixed initialization ordering that occasionally caused segfaults

v0.12.2

02 May 23:38
02a9fc7
Patch bump (#1067)

  • version
  • use 0.12.2

v0.12.0

25 Apr 21:31
82463e9

Highlights

  • Faster quantized matmul (sketched below)
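
A minimal sketch of a quantized vector-matrix product; the keyword names follow mx.quantize and mx.quantized_matmul, but treat the exact defaults as assumptions:

```python
import mlx.core as mx

x = mx.random.normal((1, 64))    # activation row vector
w = mx.random.normal((128, 64))  # weight matrix to quantize

# mx.quantize returns the packed weights plus per-group scales and biases
w_q, scales, biases = mx.quantize(w, group_size=64, bits=4)

y = mx.quantized_matmul(x, w_q, scales, biases,
                        transpose=True, group_size=64, bits=4)
mx.eval(y)
```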

Core

  • mx.synchronize to wait for computation dispatched with mx.async_eval (see the sketch after this list)
  • mx.radians and mx.degrees
  • mx.metal.clear_cache to return to the OS the memory held by MLX as a cache for future allocations
  • Change quantization to always represent 0 exactly (relevant issue)
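
A small sketch of asynchronous evaluation and the new cache control:

```python
import mlx.core as mx

x = mx.random.normal((1024, 1024))
y = (x @ x.T).sum()

mx.async_eval(y)  # dispatch the computation without blocking
# ... overlap other Python-side work here ...
mx.synchronize()  # wait for everything dispatched so far

mx.metal.clear_cache()  # return MLX's cached buffers to the OS
```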

Bugfixes

  • Fixed quantization of a block with all 0s that produced NaNs
  • Fixed the len field in the buffer protocol implementation

v0.11.0

18 Apr 20:25
090ff65

Core

  • mx.block_masked_mm for block-level sparse matrix multiplication (see the sketch after this list)
  • Shared events for synchronization and asynchronous evaluation
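
A minimal sketch of block-masked matrix multiplication; mask_out has one entry per block_size × block_size tile of the output, and the keyword names are assumptions about the mx.block_masked_mm signature:

```python
import mlx.core as mx

block = 32
a = mx.random.normal((64, 64))
b = mx.random.normal((64, 64))

# Keep only the diagonal blocks of the 2x2 block grid of the output
mask_out = mx.eye(64 // block).astype(mx.bool_)

y = mx.block_masked_mm(a, b, block_size=block, mask_out=mask_out)
mx.eval(y)
```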

NN

  • nn.QuantizedEmbedding layer
  • nn.quantize for quantizing modules (see the sketch after this list)
  • gelu_approx now uses tanh for consistency with PyTorch
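
A minimal sketch of module quantization with nn.quantize; it is assumed to swap supported layers (e.g. Linear, Embedding) for quantized versions in place:

```python
import mlx.nn as nn

model = nn.Sequential(
    nn.Embedding(1024, 64),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Quantize supported submodules (dims must divide by group_size)
nn.quantize(model, group_size=64, bits=4)
```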

v0.10.0

11 Apr 19:53
d07e295

Highlights

  • Improvements for LLM generation
    • Reshapeless quant matmul/matvec
    • mx.async_eval
    • Async command encoding

Core

  • Slightly faster reshapeless quantized gemms
  • Option for precise softmax
  • mx.metal.start_capture and mx.metal.stop_capture for GPU debugging/profiling (see the sketch after this list)
  • mx.expm1
  • mx.std
  • mx.meshgrid
  • CPU-only mx.random.multivariate_normal
  • mx.cumsum (and other scans) for bfloat
  • Async command encoder with explicit barriers / dependency management
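
A short sketch of GPU capture for debugging/profiling; the trace path is illustrative, and capture typically needs to be enabled in the environment (e.g. MTL_CAPTURE_ENABLED=1):

```python
import mlx.core as mx

a = mx.random.normal((1024, 1024))
mx.eval(a)

mx.metal.start_capture("mlx_trace.gputrace")  # writes a trace openable in Xcode
mx.eval(mx.softmax(a @ a.T, axis=-1))
mx.metal.stop_capture()
```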

NN

  • nn.Upsample now supports bicubic interpolation

Misc

  • Updated MLX Extension to work with nanobind

Bugfixes

  • Fix buffer donation in softmax and fast ops
  • Bug in layer norm vjp
  • Bug when initializing from lists with scalars
  • Bug in indexing
  • CPU compilation bug
  • Multi-output compilation bug
  • Fix stack overflow issues in eval and array destruction

v0.9.0

28 Mar 23:19
d8cb312

Highlights

  • Fast partial RoPE (used by Phi-2; see the sketch after this list)
  • Fast gradients for RoPE, RMSNorm, and LayerNorm
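
A minimal sketch of partial RoPE via mx.fast.rope, rotating only part of the head dimension as in Phi-2; the keyword arguments shown are assumptions about the mx.fast.rope signature:

```python
import mlx.core as mx

x = mx.random.normal((1, 8, 16, 64))  # (batch, heads, seq, head_dim)

# Rotate only the first 32 of 64 dimensions
y = mx.fast.rope(x, 32, traditional=False, base=10000.0, scale=1.0, offset=0)
mx.eval(y)
```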

Core

  • More overhead reductions
  • Partial fast RoPE (fast Phi-2)
  • Better buffer donation for copy
  • Type hierarchy and issubdtype
  • Fast VJPs for RoPE, RMSNorm, and LayerNorm

NN

  • Module.set_dtype
  • Chaining in nn.Module (model.freeze().update(…); see the sketch after this list)
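
A quick sketch of the new Module conveniences; mutating methods are assumed to return the module so calls chain:

```python
import mlx.core as mx
import mlx.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 4))

model.set_dtype(mx.float16)  # cast the floating point parameters

# Chaining: freeze() returns the module, so update() can follow directly
model.freeze().update(model.parameters())
```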

Bugfixes

  • Fix set item bugs
  • Fix scatter vjp
  • Check shape integer overflow on array construction
  • Fix bug with module attributes
  • Fix two bugs for odd shaped QMV
  • Fix GPU sort for large sizes
  • Fix bug in negative padding for convolutions
  • Fix bug in multi-stream race condition for graph evaluation
  • Fix random normal generation for half precision

v0.8.0

21 Mar 21:00
44390bd

Optimizers

  • Set minimum value in cosine decay scheduler (sketched below)
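
A minimal sketch of the cosine decay floor; the third argument is assumed to be the new minimum value of the schedule:

```python
import mlx.optimizers as optim

# Decay from 1e-2 over 1000 steps, floored at 1e-5
lr_schedule = optim.cosine_decay(1e-2, 1000, 1e-5)
optimizer = optim.SGD(learning_rate=lr_schedule)
```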

Bugfixes

  • Fix bug in multi-dimensional reduction