Help on improving transcription quality #1948

Answered by zubbyy
i4lina asked this question in Q&A

Ran on Arch Linux 6.7.5, Hyprland, i7-8565U, NVIDIA MX130, 8 GB RAM

Hey.
I'm no expert in this field, but I wanted to get a bit more into whisper.cpp since I'm going to need it for a future project, so I took a look at your issue.
My approach was noise reduction, so I downloaded your mp4.

I then converted it to a .wav file:
ffmpeg -i input.mp4 -ar 16000 -ac 1 -c:a pcm_s16le -t 100 output_dirty.wav

For the noise reduction I found ffmpeg's afftdn filter:
ffmpeg -i output_dirty.wav -af "afftdn=nr=20:nf=-20:tn=1" output.wav

I then ran the result through my large model:
./main -m models/ggml-large-v3.bin -l auto samples/output.wav

The output was the following:

w…
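
For anyone who wants to repeat the three steps above in one go, here is a rough sketch of them chained into a single script. The ffmpeg and ./main arguments are copied from the commands in this post. The `run` helper, the `DRY_RUN` switch, the `MODEL` variable and the function name are my own illustrative additions, and the sketch assumes all files live in the current directory, so adjust the paths to your setup. By default it only prints the commands so you can check them before running anything.

```shell
#!/bin/sh
# Sketch only: chains the convert -> denoise -> transcribe steps from the post.
# DRY_RUN, MODEL and the helper names are hypothetical, not part of whisper.cpp.
set -eu

MODEL="${MODEL:-models/ggml-large-v3.bin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = actually execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

transcribe_pipeline() {
  # Step 1: 16 kHz, mono, 16-bit PCM WAV; -t 100 keeps only the first 100 seconds.
  run ffmpeg -i "$1" -ar 16000 -ac 1 -c:a pcm_s16le -t 100 output_dirty.wav
  # Step 2: afftdn denoiser; nr = noise reduction in dB, nf = noise floor in dB,
  # tn=1 turns on noise tracking.
  run ffmpeg -i output_dirty.wav -af "afftdn=nr=20:nf=-20:tn=1" output.wav
  # Step 3: transcribe with whisper.cpp, letting it auto-detect the language.
  run ./main -m "$MODEL" -l auto output.wav
}

transcribe_pipeline "${1:-input.mp4}"
```

Saved as, say, denoise_and_transcribe.sh (a made-up name), `sh denoise_and_transcribe.sh input.mp4` prints the three commands, and `DRY_RUN=0 sh denoise_and_transcribe.sh input.mp4` runs them. The nr and nf values are just the ones I tried here, so they are worth tuning for other recordings.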

Answer selected by i4lina