Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][WASI-NN] whisper.cpp backend #3331

Closed
wants to merge 5 commits into from

Conversation

rum1887
Copy link

@rum1887 rum1887 commented Apr 8, 2024

No description provided.

Signed-off-by: Ramya <ramyapgk1887@gmail.com>
@rum1887 rum1887 requested a review from ibmibmibm as a code owner April 8, 2024 18:19
Copy link
Member

juntao commented Apr 8, 2024

Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.


Considering all the changes in this work-in-progress Pull Request (PR), it seems the developer is focusing on introducing a new "whisper" backend for WASI-NN, adding support for it in different parts of the system, and enhancing the code readability by improving the formatting. Here are the key findings:

Potential Issues and Errors

  1. A discrepancy was detected in the initial "BackendMap" string view mapping. It seems the "whisper" was mapped to "GGML" instead of "WHISPER". However, this has been corrected in later commits.
  2. The early versions of the introduced "whisper"-related functions were doing nothing or returning success arbitrarily. Although it seems that further improvements were made (particularly to the load function), scrutinizing the actual implementation is necessary.
  3. Changes in the order of #include directives could potentially lead to bugs, especially if there are dependencies or conditions in those files.
  4. Hard-coded strings are being used for model paths, which could lead to problems with different setups or platforms.
  5. Memory leaks could potentially occur as there are ambiguous scenarios around handling errors or failures.
  6. The unmuted debug logging could lead to performance issues or expose unnecessary details.

Key Findings

  1. The Whisper backend is still under development and the pull request is not complete yet. Several placeholders and stubs are present, indicating future detailed implementation.
  2. There has been an extensive improvement in code formatting, which increases its readability. However, more descriptive commit messages should be encouraged.
  3. As new file additions ("whispercpp.cpp", "whispercpp.h") and modifications in existing files are reported, it highlights the corrective measures taken for the integration of the "whisper" backend.
  4. Adjustments have also been made in handling error conditions and reportBackendNotSupported function, reflecting consideration for robustness.

Given these findings, an integral next step would be to execute a thorough testing phase. It is also recommended to see this PR in the larger scope of where the project is currently in its development loop, considering it is still a work in progress.

Details

Commit 95b52c857b4f7a9e9c50070effbe0a3828bbc81f

This pull request contains work for adding the "whisper" backend for WASI-NN (WasmEdge Systems Interface for Neural Network). The key changes revolve around creating support for the new backend. Here're some of the changes:

  1. "whispercpp.cpp" and "whispercpp.h" files have been added.
  2. Backend support for "whisper" has been added in the CMake scripts and also in wasinnenv.cpp, wasinnenv.h, and types.h files.
  3. In the CMakeLists.txt file, it has been modified to download and set up the whisper.cpp source code.
  4. An additional condition has been added to handle the whisper backend in the else-if ladder in "WASINNDeps.cmake".
  5. Enums have been added to types.h for the "whisper" backend and the execution of this backend has been hooked with the corresponding functions.
  6. If the "whisper" backend is not built then the corresponding functions simply return "BackendNotSupported" error.
    Overall, it can be seen as adding basic support for a new backend and adjusting the different parts of the systems to handle this new piece of functionality.

However, I do see two potential problems or areas to investigate:

  1. While a "whisper" const string view has been added to the "BackendMap" in "wasinnenv.cpp", the value doesn't seem correct. It is currently mapped to "GGML" while it should be probably mapped to "WHISPER".

  2. All whisper-related function calls in "whispercpp.cpp" either do nothing or return success for now. The proper implementation for these functions seems missing.

This should be a detailed initial step to integrating the whisper backend, further developments are required to complete the integration. This includes actually implementing the whispered backend functions and ensuring that all parts of the system are aware of this new backend, alongside thorough testing.

Commit c30c2e57acec23bdc3f52a9784ff742300c7d131

Key changes in this pull request:

  1. In types.h, the Backend enum class has had its formatting changed, with each backend type listed on a separate line. This likely improves readability but doesn't change any functionality.
  2. Formatting changes have been made to whispercpp.cpp. This includes changes to indentation, line breaks, and the use of spaces between function arguments. These changes improve the readability of the code but are unlikely to change any functionality. There have also been changes to the order of header file inclusions.
  3. Similarly, whispercpp.h has had minor formatting changes, including changes to indentation, line breaks, and the use of spaces. This file also had changes to the order of its includes.

Potential Problems:

  1. While it's mostly formatting changes and there don't seem to be any issues that would alter functionality, there has been a change in the order of #include directives in the whispercpp.cpp and whispercpp.h files. Depending on the content of the included files, and if there are any dependencies or conditions in them, this could potentially introduce bugs.
  2. Given that this is a "[WIP] pull request, we should potentially be cautious. The functionality for the whisper backend still appears to be stubbed out. The developer might be planning on adding more to this pull request before it's ready for review.
  3. Lastly, not necessarily a problem, but more a note that additional work may be needed. Despite the title, this PR doesn't seem to implement the whisper backend. This might be coming in a subsequent PR or is still in development.

Commit 841daf76a7c51009c5b6926659748c7166ff60a3

Summary of Changes:

  1. Updated the backend mapping for "whisper" in wasinnenv.cpp to reflect WHISPER instead of GGML.
  2. Significantly expanded the load function in whispercpp.cpp. These changes include:
    • Enabling the whisper backend.
    • Dealing with different types of model paths.
    • Loading the "whisper" model with default parameters.
    • Handling error conditions such as failing to create a temporary file or initializing the model.
  3. Modified the reportBackendNotSupported function to reflect a change in command line flags.

Potential Problems:

  1. The usage of hard-coded strings like "models/ggml-base.en.bin"sv is potentially problematic as they may not be platform-independent or flexible for different setups. The model file path should ideally be parameterized or configurable.
  2. It isn't clear whether or not the whisper_init_from_file_with_params function properly dispatches cleanup code in case of error. This could potentially lead to memory leaks.
  3. There's no check for whether whisper_context_default_params() was successfully executed. If it fails, it might lead to unexpected behavior.
  4. The patch assumes the input model is in binary format (BinModel). If the input model format differs, this could lead to issues.
  5. The debug logging is enabled with GraphRef.EnableDebugLog = false; If this is intended to be used for debugging, the flag should probably be true.

Commit 25552881f4a0a8276cbcdca371260f8edc6b02aa

Summary of Key Changes

  1. Corrected the syntax error in whispercpp.cpp by adding a missing "s" to the keyword "struct". The change was made on the variable declaration of ContextDefault.

  2. Un-commented the line in whispercpp.h; whisper_model *WhisperModel is now initialized to nullptr.

Potential Problems

  1. The change in whispercpp.h may cause an issue if there are any parts of the code that don't correctly handle WhisperModel being nullptr. We need to make sure that this change does not result in null pointer exceptions.

  2. The commit message does not mention why the whisper_model initialization was uncommented. This could be a potential communication issue leading to confusion among the team members.

  3. The commit's title is not really descriptive of what changes are made, it just mentions that it's for formatting. A more descriptive title would be helpful for team members to understand the changes easily.

@github-actions github-actions bot added c-Plugin An issue related to WasmEdge Plugin c-WASI-NN c-CMake labels Apr 8, 2024
rum1887 and others added 4 commits April 8, 2024 23:56
Signed-off-by: Ramya <ramyapgk1887@gmail.com>
Signed-off-by: Ramya <ramyapgk1887@gmail.com>
Signed-off-by: Ramya <ramyapgk1887@gmail.com>
@hydai
Copy link
Member

hydai commented May 22, 2024

No updates from the mentee, closing it.

@hydai hydai closed this May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c-CMake c-Plugin An issue related to WasmEdge Plugin c-WASI-NN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants