LlamaTerm

LlamaTerm is a simple CLI utility that lets you use local LLM models easily, with some additional features.

⚠️ Currently this project supports models that use the ChatML prompt format or something similar, for example Phi-3-mini and Llama 3 GGUFs.

Preview

Screenshots in the repository show basic usage and injecting file content.

Features

  • Give local files to the model using square brackets (see the sketch after this list)
    User: Can you explain the code in [helloworld.c] please?
  • More coming soon
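
A rough sketch of how the file injection presumably works: before the prompt is sent to the model, the bracketed filename is replaced by that file's contents. The wrapping below is an assumption for illustration, not the project's verbatim template:

    User: Can you explain the code in [helloworld.c] please?

    Prompt actually sent to the model (approximate):
    Can you explain the code in helloworld.c please?
    <contents of helloworld.c>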

Setup

You can set up LlamaTerm as follows:

  1. Rename example.env to .env
  2. Edit .env so that the model path points to your GGUF file (you may also need to edit EOS and PREFIX_TEMPLATE); see the sketch after this list
  3. If you want syntax highlighting for code and markdown, set REAL_TIME=0 in the .env. Note that you will lose real-time output generation.
  4. Install the Python dependencies with pip install -r requirements.txt
  5. If you have an NVIDIA (CUDA) GPU, install with cuBLAS acceleration: CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install --upgrade --force-reinstall --no-cache llama-cpp-python
  6. If you have an AMD GPU, install with HIP acceleration: CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DAMDGPU_TARGETS=insert gpu arch or compatible arch" FORCE_CMAKE=1 CXX=/opt/rocm/bin/hipcc pip install llama-cpp-python --upgrade --force-reinstall --no-cache. For more info see llama-cpp-python
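
As a reference for step 2, here is a minimal sketch of a .env. Only REAL_TIME, EOS and PREFIX_TEMPLATE are named in this README; the model-path key and all values below are assumptions, so copy the real key names from the bundled example.env:

    # Minimal .env sketch -- illustrative values, not defaults
    MODEL_PATH=models/Phi-3-mini-4k-instruct-q4.gguf   # key name is an assumption
    EOS=<|end|>                                        # end-of-sequence token of your model
    PREFIX_TEMPLATE=<|{role}|>                         # hypothetical ChatML-style role prefix
    REAL_TIME=1                                        # set to 0 to enable syntax highlighting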

Run

Run LlamaTerm by adding the project directory to your PATH and then running llamaterm.
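
For example, in a POSIX shell (the clone location below is illustrative):

    # Make the llamaterm entry point reachable, then start a session
    export PATH="$PATH:$HOME/LlamaTerm"
    llamaterm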

Models supported out of the box

For models that ship with a matching example-*.env file, you just need to rename the corresponding example-*.env file to .env.

All other models with a prompt template similar to ChatML are supported as well, but you will need to customize fields like PREFIX_TEMPLATE and EOS in the .env.
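
A sketch of that renaming step, where example-<model>.env is a placeholder for whichever bundled file matches your model:

    # Activate the bundled config for your model (placeholder filename)
    mv example-<model>.env .env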