DEEPSPEECH ARCHITECTURE BASED NON-NATIVE ENGLISH SPEAKER ASR USING FINE-TUNING METHOD

ABRAR, FADIYAH NAHDAH and Hendy, Santosa and Ika, Novia Anggraini (2024) DEEPSPEECH ARCHITECTURE BASED NON-NATIVE ENGLISH SPEAKER ASR USING FINE-TUNING METHOD. Other thesis, Universitas Bengkulu.

Archive (Thesis)
SKRIPSI_FADIYAH NAHDAH ABRAR - nada yay.pdf - Bibliography
Restricted to Repository staff only
Available under License Creative Commons GNU GPL (Software).

Abstract

This research focuses on optimizing automatic speech recognition (ASR) for non-native English speakers, specifically Mandarin L1 speakers. The aim is to identify the best dataset, hyper-parameters, and evaluation methods. The study uses secondary datasets and evaluates models using Word Error Rate (WER), Character Error Rate (CER), and learning curves. It applies both fine-tuning and training from scratch to improve ASR performance. The fine-tuned model achieved a WER of 0.45, a CER of 0.19, and a loss of 35.29, outperforming the model trained from scratch on both WER and loss. A batch size of 8 for training and 2 for testing and validation gave the most efficient GPU resource usage. The fine-tuned model performed best over 150 epochs, showing stable learning and minimized loss. The best models achieved a 0% WER/CER, ensuring accurate speech transcription.

Keywords: ASR, Non-native Speaker, Transfer Learning
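
The abstract reports results as WER and CER. As a minimal sketch of how these metrics are conventionally computed (assuming the standard edit-distance definition; the thesis's own evaluation code is not shown in this record), the Python below divides the Levenshtein distance between a reference and a hypothesis transcript by the reference length. The function names and the example sentences are hypothetical, and the CER here counts spaces as characters, which is one common convention but may differ from the one used in the thesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under this definition, a WER of 0.45 would mean roughly 45 word-level errors per 100 reference words, and a value of 0 means the hypothesis matches the reference exactly.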

Item Type: Thesis (Other)
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions: Faculty of Engineering > Department of Electrical Engineering
Depositing User: 58 lili haryanti
Date Deposited: 04 Sep 2025 03:11
Last Modified: 04 Sep 2025 03:11
URI: https://repository.unib.ac.id/id/eprint/24295
