[LFX Contribution]: Integrating MLX framework as a WasmEdge NN Backend #3330
+321
−3
#3266
Thanks for giving me such an important feature to work on. I have been working on the issue for a while, and I feel it is ready for an initial review.
My work is currently divided into two parts:
Implementing a C++ inference example for MLX using the current C++ API:
I have implemented a simple NN with the current MLX C++ API, keeping in mind the needs of the plugin and using other projects, such as the PyTorch C++ Frontend API, as a general reference.
The API currently supports creating layers such as BatchNorm and transformers, among others, and can perform basic inference. I am working on more classes and models and will try to complete them as soon as possible.
Current features:
For the implementation, see: https://github.com/guptaaryan16/mlx/blob/Cpp_api/examples/cpp/nn_inference_example.cpp
https://github.com/guptaaryan16/mlx/blob/Cpp_api/examples/cpp/attention.cpp
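To illustrate the kind of layer-based inference described above, here is a minimal sketch in plain standard C++. It does not use MLX, and it is not the API from the linked examples: the `Linear` struct, its fields, and the `relu` helper are hypothetical names chosen only for illustration, loosely following the module-with-`forward` pattern of the PyTorch C++ Frontend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fully connected layer; NOT the MLX C++ API.
// Weights are stored row-major as an out_features x in_features matrix.
struct Linear {
  std::size_t in_features;
  std::size_t out_features;
  std::vector<float> weight;  // size: out_features * in_features
  std::vector<float> bias;    // size: out_features

  Linear(std::size_t in, std::size_t out)
      : in_features(in),
        out_features(out),
        weight(in * out, 0.0f),
        bias(out, 0.0f) {}

  // Computes y = W x + b for a single input vector.
  std::vector<float> forward(const std::vector<float>& x) const {
    assert(x.size() == in_features);
    std::vector<float> y(bias);  // start from the bias term
    for (std::size_t o = 0; o < out_features; ++o) {
      for (std::size_t i = 0; i < in_features; ++i) {
        y[o] += weight[o * in_features + i] * x[i];
      }
    }
    return y;
  }
};

// Element-wise ReLU activation (hypothetical helper).
inline std::vector<float> relu(std::vector<float> v) {
  for (float& e : v) {
    if (e < 0.0f) e = 0.0f;
  }
  return v;
}
```

Under these assumptions, a forward pass is a chain of calls such as `relu(layer.forward(input))`. A real implementation would operate on MLX arrays rather than `std::vector<float>`.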
Implementing mlx.cpp and other files for the wasi-nn plugin:
I have been studying the implementation of other plugins, especially ggml and pytorch, which has helped me implement some functionality, such as loading weights from the SafeTensors and GGUF formats.
Most of the remaining functionality depends on the implementation of the C++ frontend for the MLX library, which I now plan to complete as soon as possible.
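As a rough illustration of the first step in loading weights from either format, the sketch below tells the two apart by their leading bytes. It relies only on the published file layouts: a GGUF file starts with the ASCII magic `GGUF`, and a SafeTensors file starts with an 8-byte little-endian header length followed by a JSON header that begins with `{`. The names `WeightFormat`, `read_u64_le`, and `detect_weight_format` are hypothetical and are not taken from this PR or from MLX.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical format tag; not part of the plugin or of MLX.
enum class WeightFormat { GGUF, SafeTensors, Unknown };

// Decodes a little-endian unsigned 64-bit integer from the first 8 bytes.
inline std::uint64_t read_u64_le(const std::string& bytes) {
  assert(bytes.size() >= 8);
  std::uint64_t value = 0;
  for (int i = 7; i >= 0; --i) {
    value = (value << 8) | static_cast<unsigned char>(bytes[i]);
  }
  return value;
}

// Guesses the format from a prefix of the file contents.
// The caller is assumed to pass at least the first 9 bytes of the file.
inline WeightFormat detect_weight_format(const std::string& bytes) {
  // GGUF: 4-byte ASCII magic at offset 0.
  if (bytes.size() >= 4 && bytes.compare(0, 4, "GGUF") == 0) {
    return WeightFormat::GGUF;
  }
  // SafeTensors: u64 header length, then a JSON object starting with '{'.
  if (bytes.size() >= 9) {
    const std::uint64_t header_len = read_u64_le(bytes);
    if (header_len >= 2 && bytes[8] == '{') {
      return WeightFormat::SafeTensors;
    }
  }
  return WeightFormat::Unknown;
}
```

This is only a heuristic check; an actual loader would go on to parse the GGUF metadata or the SafeTensors JSON header and validate tensor offsets before reading any data.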
My Notes and Info About Completed Milestones
I have run into multiple problems due to my current level of knowledge of C++ and of low-level design for a C++-based NN (for loading LLMs), so I have not been able to move as fast as I listed in my milestones. Still, I believe that after a solid baseline implementation of an NN class (which is almost done) and then a LLaMA model, I will be able to complete the plugin as soon as possible.
From now on, I will use the NN example implemented so far to create something similar to llama.cpp for MLX (I will try to keep it within the library itself to simplify model loading), and so I expect to complete more parts of the plugin within a few weeks.
Thank you for your help so far. I hope to complete the project as soon as possible.
cc @hydai @awni