Huginn Hears 🐦‍⬛

Huginn Hears is a Python application designed to transcribe speech and summarize it in Norwegian. It is meant to be used locally and ran on a single machine. The application is built using the Streamlit framework for the user interface, Faster-Whisper for speech-to-text transcription, llmlingua-2 for compressing the transcribed text and llama-ccp-python for summarization. The main goal is to allow useres with little technical knowledge to test and try out STOA models locally on their computer. Taking advatage of the amazing open source projects out there and bundel it all into a simple installer.

Features

Transcribes speech into text.
Summarizes the transcribed text.
Supports both English and Norwegian languages.

Demo.HuginnHears.mp4

Installation

This project uses Poetry for dependency management. Before installing the project dependencies, it's essential to set up certain environment variables required by llama-ccp-python.

Setting Environment Variables for llama-cpp-python

llama-cpp-python requires specific environment variables to be set up in your system to function correctly. Follow the instructions in their repo to get the correct variables for your system. https://github.com/abetlen/llama-cpp-python

Examples

# Linux and Mac
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"

# Windows
$env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"

Commen errors 🤯

Soulutions from llama-cpp-python

Windows Notes

Error: Can't find 'nmake' or 'CMAKE_C_COMPILER'

If you run into issues where it complains it can't find 'nmake' '?' or CMAKE_C_COMPILER, you can extract w64devkit as mentioned in llama.cpp repo and add those manually to CMAKE_ARGS before running pip install:

$env:CMAKE_GENERATOR = "MinGW Makefiles"
$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"

See the above instructions and set CMAKE_ARGS to the BLAS backend you want to use.

MacOS Notes

Detailed MacOS Metal GPU install documentation is available at docs/install/macos.md

M1 Mac Performance Issue

Note: If you are using Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports arm64 architecture. For example:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh

Otherwise, while installing it will build the llama.cpp x86 version which will be 10x slower on Apple Silicon (M1) Mac.

M Series Mac Error: `(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))`

Try installing with

CMAKE_ARGS="-DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_APPLE_SILICON_PROCESSOR=arm64 -DLLAMA_METAL=on" pip install --upgrade --verbose --force-reinstall --no-cache-dir llama-cpp-python

Installing Project Dependencies

After setting up the required environment variables, you can proceed to install the project. Ensure you have Poetry installed on your system. Then, run the following command in the project root directory:

poetry install

This will install all the necessary dependencies as defined in the pyproject.toml file.

Usage

To run the application, use the following command:

streamlit run streamlit_app/app.py

This will start the Streamlit server and the application will be accessible at localhost:8501.

Building

To build the project into an executable, use the setup.py script with cx_Freeze:
NB: Make sure you installed llama-cpp-python with static linking.

python setup.py build

This will create an executable in the build directory.

Acknowledgements

This project build on a lot of great work done by others. The following projects were used:

Faster-Whisper for speech-to-text transcription.
llmlingua-2 for prompt compression.
llama-ccp-python to run LLMs locally and on CPUs.
- llama-ccp
cx_Freeze for building executables.
Langchain for controlling the prompt-response flows.
Streamlit for building the UI.

Big thanks to all the contributors to these open-source projects!

In addition, the following models were used:

Nasjonalbiblioteket AI Lab NB-Whisper.
Microsoft LLMLingua-2.
TheBloke for all sorts quantisized models.

Papers

You can read more about these models in these papers:

License

This project is licensed under the Appache 2.0 License. See the LICENSE file for more information.

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
huginn_hears		huginn_hears
streamlit_app		streamlit_app
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
cuda.Dockerfile		cuda.Dockerfile
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
setup.py		setup.py

License

MagnusS0/HuginnHears

Folders and files

Latest commit

History

Repository files navigation

Huginn Hears 🐦‍⬛

Features

Installation

Setting Environment Variables for llama-cpp-python

Commen errors 🤯

Windows Notes

MacOS Notes

Installing Project Dependencies

Usage

Building

Acknowledgements

Papers

License

About

Topics

Resources

License

Stars

Watchers

Forks

Languages