Application of Long Short-Term Memory Recurrent Neural Networks in Automatic Speech Recognition
In recent years, deep learning methods have outperformed traditional machine learning methods in automatic speech recognition (ASR), and methods based on long short-term memory (LSTM) in particular have improved ASR performance. However, conventional LSTM is limited in its handling of continuous input streams and requires substantial memory bandwidth and computational resources. This study proposes an enhanced deep learning LSTM recurrent neural network (RNN) model in which the RNN acts as a “forget gate” integrated into the memory block. The gate resets the cell state at the start of each subsequence, so continuous input streams can be processed efficiently. In addition, the standard LSTM architecture is modified to make more effective use of the model parameters. On a power grid dispatch speech dataset, the LSTM-RNN model achieves an accuracy of 99.36%, outperforming the other deep learning models compared, including CNN-based and sequential-model-based methods.
automatic speech recognition / deep supervised learning / recurrent neural network
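The abstract describes a forget gate inside the memory block that resets the cell state at the start of a subsequence. The sketch below illustrates that idea with the textbook single-unit LSTM update, not the enhanced architecture proposed in the paper, whose details are not given here. The function names, the parameter dictionary layout, and the bias values are hypothetical and chosen only to make the gate's effect visible.

```python
import math


def sigmoid(z):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM memory block (textbook form).

    x, h_prev, c_prev are floats; p is a dict of scalar weights and biases.
    Returns the new hidden state, the new cell state, and the forget gate value.
    """
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # a forget gate near 0 discards the old cell state
    h = o * math.tanh(c)
    return h, c, f


def make_params(forget_bias):
    """Hypothetical parameters: every weight and bias is zero except the forget bias."""
    keys = ["wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg"]
    p = {k: 0.0 for k in keys}
    p["bf"] = forget_bias
    return p


if __name__ == "__main__":
    c_prev = 5.0  # cell state carried over from a previous subsequence
    _, c_keep, f_keep = lstm_step(0.0, 0.0, c_prev, make_params(20.0))
    _, c_reset, f_reset = lstm_step(0.0, 0.0, c_prev, make_params(-20.0))
    print("gate open:  ", f_keep, c_keep)
    print("gate closed:", f_reset, c_reset)
```

With the gate open the previous cell state passes through almost unchanged, and with the gate closed it is driven toward zero. In a trained network the gate value would be learned from the input and hidden state rather than fixed through a bias as it is here.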