
Had a terrible week testing AI transcription tools for my podcast

I spent 3 days trying to get Whisper to accurately transcribe my guest's thick Scottish accent, and it came out as gibberish about 40% of the time. Has anyone else found a model that actually handles regional accents well?
3 comments

troyreed 1mo ago
Oh man, Scottish accents are a nightmare for these things. I had a buddy who recorded a whole interview with a guy from Glasgow and the AI turned "the weather's been dreich" into "the weather's been dream cheese" or something crazy like that. It took him two hours to manually fix the transcript because the machine kept guessing words that made no sense together. I swear these models need way more training data from actual pubs and not just BBC news readers.
the_andrew 23d ago
I actually read a study last month that tested 8 different transcription tools against Glaswegian speakers. Whisper scored the worst, at around 62% accuracy, while Deepgram hit 78%. @wyatt_green31 is totally right that pub slang throws these models off because they're trained on clean BBC data. What surprised me most was that the best results came from a newer open-source model called Distil-Whisper when it was fine-tuned specifically on Scottish dialect audio. These tools need to get out of the studio and into the real world if they want to actually work for podcasters.
wyatt_green31
@troyreed nailed it. Whisper just can't handle pub slang, period.