A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Updated May 22, 2024 - Python
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
End-to-End Speech Processing Toolkit
Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video streaming then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE!
HTML Web template that can RECOGNIZE any live audio/video streaming (using Chrome webkitSpeechRecognition API) then TRANSLATE it for FREE (using unofficial online Google Translate API) then display it as LIVE CAPTION / LIVE SUBTITLE
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by Whisper models!
Offline voice input panel & keyboard with punctuation for Android.
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
This repository contains code and instructions to implement single word speech recognition on any board running CircuitPython
VietGPT VoiceBot: Chatbot automatically recognizes Vietnamese voice and uses the ChatGPT API for natural language interaction.
Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.
🧠 Leon is your open-source personal assistant.
Segment speech sequences based on speaker transitions, using ML and DSP.
Port of OpenAI's Whisper model in C/C++
Personal Desktop Assistant, Jarvis, built using Python.
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR