ASRTK is a comprehensive Python toolkit for building and deploying end-to-end Automatic Speech Recognition (ASR) systems.
This project is a rewrite and open-source release of the previously closed-source toolkit known as "YTK".
It provides a streamlined workflow and a collection of utilities to simplify the process of creating, training, and evaluating ASR models.
- Data Preparation: Easily prepare and preprocess speech datasets, including data cleaning, normalization, and feature extraction.
- Model Training: Train state-of-the-art ASR models using popular architectures and techniques. (Coming soon)
- Evaluation and Testing: Evaluate the performance of trained models using various metrics and perform inference on new audio data.
- Deployment: Deploy trained ASR models in real-world applications with support for different platforms and environments.
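As an illustration of the kind of metric used when evaluating ASR output, the sketch below computes word error rate (WER) with only the standard library. It is not ASRTK's own API: the function names `normalize` and `word_error_rate` are hypothetical, and the normalization rules (lowercasing, stripping punctuation except apostrophes) are assumptions chosen for the example.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation (except apostrophes) with spaces, split into words."""
    text = re.sub(r"[^\w\s']", " ", text.lower())
    return text.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # reference word missing from hypothesis
                    curr[j - 1] + 1,     # extra word in hypothesis
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Because WER divides by the reference length, it can exceed 1.0 when the hypothesis contains many extra words.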
- Python 3.9 or higher
- PyTorch 1.7 or higher
- NumPy
- SciPy
- librosa
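A quick way to check an environment against this list is sketched below. It is a standalone helper, not something shipped with ASRTK; the function names are hypothetical, and it only checks the Python version and whether each package is importable, not the installed PyTorch version.

```python
import importlib.util
import sys

# Import names for the packages listed above (PyTorch is imported as "torch").
REQUIRED_MODULES = ["torch", "numpy", "scipy", "librosa"]


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True if the interpreter's major.minor version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.9 or higher is required")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found")
```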
To get started with ASRTK, follow these steps:
- Clone the repository: `git clone https://github.com/yourusername/asrtk.git`
- Navigate to the project directory: `cd asrtk`
- Install the required dependencies: `pip install -r requirements.txt`
Running `asrtk --help` or `python -m asrtk --help` shows a list of all available options and commands.
We welcome contributions from the community! Whether it's adding new features, improving documentation, or reporting bugs, please feel free to make a pull request or open an issue.
ASRTK is released under the MIT license. Contributions must adhere to this license.