24.1.15.1 Speech Recognition, Neural Networks

Speech. Neural Networks.

Wu, J.X.[Jian-Xiong], Chan, C.[Chorkin],
Isolated word recognition by neural network models with cross-correlation coefficients for speech dynamics,
PAMI(15), No. 11, November 1993, pp. 1174-1185.
IEEE DOI 0401
BibRef

Chen, W.Y.[Wen-Yuan], Liao, Y.F.[Yuan-Fu], Chen, S.H.[Sin-Horng],
Speech recognition with hierarchical recurrent neural networks,
PR(28), No. 6, June 1995, pp. 795-805.
Elsevier DOI 0401
BibRef

Lee, T.[Tan], Ching, P.C., Chan, L.W.[Lai-Wan],
Isolated word recognition using modular recurrent neural networks,
PR(31), No. 6, June 1998, pp. 751-760.
Elsevier DOI 0401
BibRef

Stavrakoudis, D.G., Theocharis, J.B.,
Pipelined Recurrent Fuzzy Neural Networks for Nonlinear Adaptive Speech Prediction,
SMC-B(37), No. 5, October 2007, pp. 1305-1320.
IEEE DOI 0711
BibRef

Kay, S.,
A New Approach to Fourier Synthesis With Application to Neural Encoding and Speech Classification,
SPLetters(17), No. 10, October 2010, pp. 855-858.
IEEE DOI 1008
BibRef

Kay, S.,
A New Proof of the Neyman-Pearson Theorem Using the EEF and the Vindication of Sir R. Fisher,
SPLetters(19), No. 8, August 2012, pp. 451-454.
IEEE DOI 1208
BibRef

Scanzio, S.[Stefano], Cumani, S.[Sandro], Gemello, R.[Roberto], Mana, F.[Franco], Laface, P.,
Parallel implementation of Artificial Neural Network training for speech recognition,
PRL(31), No. 11, 1 August 2010, pp. 1302-1309.
Elsevier DOI 1008
Artificial Neural Network; Block Back-propagation; Focused Attention Back-Propagation; GPU; CUDA; Fast Training BibRef

Siniscalchi, S.M., Yu, D.[Dong], Deng, L.[Li], Lee, C.H.[Chin-Hui],
Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model,
SPLetters(20), No. 3, March 2013, pp. 201-204.
IEEE DOI 1303
BibRef

Hutchinson, B.[Brian], Deng, L.[Li], Yu, D.[Dong],
Tensor Deep Stacking Networks,
PAMI(35), No. 8, 2013, pp. 1944-1957.
IEEE DOI 1307
Closed-form solutions; Deep learning; handwriting image classification BibRef

Bengio, Y.[Yoshua], Courville, A.[Aaron], Vincent, P.[Pascal],
Representation Learning: A Review and New Perspectives,
PAMI(35), No. 8, 2013, pp. 1798-1828.
IEEE DOI Survey, Learning. 1307
Neural networks; Speech recognition; Boltzmann machine; Deep learning; representation learning; unsupervised learning BibRef

Swietojanski, P., Ghoshal, A., Renals, S.,
Convolutional Neural Networks for Distant Speech Recognition,
SPLetters(21), No. 9, September 2014, pp. 1120-1124.
IEEE DOI 1406
Acoustics BibRef

Espi, M.[Miquel], Fujimoto, M.[Masakiyo], Nakatani, T.[Tomohiro],
Acoustic Event Detection in Speech Overlapping Scenarios Based on High-Resolution Spectral Input and Deep Learning,
IEICE(E98-D), No. 10, October 2015, pp. 1799-1807.
WWW Link. 1511
BibRef

Richardson, F., Reynolds, D., Dehak, N.,
Deep Neural Network Approaches to Speaker and Language Recognition,
SPLetters(22), No. 10, October 2015, pp. 1671-1675.
IEEE DOI 1506
feature extraction BibRef

Trentin, E.[Edmondo],
Maximum-likelihood normalization of features increases the robustness of neural-based spoken human-computer interaction,
PRL(66), No. 1, 2015, pp. 71-80.
Elsevier DOI 1511
Feature normalization BibRef

Sangeetha, J., Jothilakshmi, S.,
Automatic continuous speech recogniser for Dravidian languages using the auto associative neural network,
IJCVR(6), No. 1-2, 2016, pp. 113-126.
DOI Link 1601
BibRef

Fredes, J., Novoa, J., King, S., Stern, R.M., Yoma, N.B.,
Locally Normalized Filter Banks Applied to Deep Neural-Network-Based Robust Speech Recognition,
SPLetters(24), No. 4, April 2017, pp. 377-381.
IEEE DOI 1704
cepstral analysis BibRef

Shahnawazuddin, S., Sinha, R., Pradhan, G.,
Pitch-Normalized Acoustic Features for Robust Children's Speech Recognition,
SPLetters(24), No. 8, August 2017, pp. 1128-1132.
IEEE DOI 1708
feature extraction, spectral analysis, speech recognition, time-frequency analysis, SMAC features, adaptive-cepstral truncation, additive noise, spectral smoothening approach, Hidden Markov models, Mel frequency cepstral coefficient, Robustness, Speech, Automatic speech recognition (ASR) BibRef

Gosztolya, G.[Gábor], Tóth, L.[László],
DNN-Based Feature Extraction for Conflict Intensity Estimation From Speech,
SPLetters(24), No. 12, December 2017, pp. 1837-1841.
IEEE DOI 1712
estimation theory, feature extraction, greedy algorithms, neural nets, speech processing BibRef

Gosztolya, G.[Gábor], Bánhalmi, A.[András], Tóth, L.[László],
Using One-Class Classification Techniques in the Anti-phoneme Problem,
IbPRIA09(433-440).
Springer DOI 0906
BibRef

Kim, M.[Minkyoung], Kim, H.[Harksoo],
Integrated neural network model for identifying speech acts, predicators, and sentiments of dialogue utterances,
PRL(101), No. 1, 2018, pp. 1-5.
Elsevier DOI 1801
Integrated intention identification model BibRef

Affonso, E.T., Rosa, R.L., Rodríguez, D.Z.,
Speech Quality Assessment Over Lossy Transmission Channels Using Deep Belief Networks,
SPLetters(25), No. 1, January 2018, pp. 70-74.
IEEE DOI 1801
IP networks, belief networks, feature extraction, radial basis function networks, speech coding, speech processing, speech quality assessment BibRef

Kim, H.G., Lee, H., Kim, G., Oh, S.H., Lee, S.Y.,
Rescoring of N-Best Hypotheses Using Top-Down Selective Attention for Automatic Speech Recognition,
SPLetters(25), No. 2, February 2018, pp. 199-203.
IEEE DOI 1802
neural nets, speech recognition, Aurora4 speech recognition tasks, top-down selective attention BibRef

Kaushik, L., Sangwan, A., Hansen, J.H.L.,
Speech Activity Detection in Naturalistic Audio Environments: Fearless Steps Apollo Corpus,
SPLetters(25), No. 9, September 2018, pp. 1290-1294.
IEEE DOI 1809
acoustic noise, acoustic signal detection, audio recording, feedforward neural nets, learning (artificial intelligence), speech activity detection (SAD) BibRef

Heracleous, P.[Panikos], Even, J.[Jani], Sugaya, F.[Fumiaki], Hashimoto, M.[Masayuki], Yoneyama, A.[Akio],
Exploiting alternative acoustic sensors for improved noise robustness in speech communication,
PRL(112), 2018, pp. 191-197.
Elsevier DOI 1809
Body-conducted sensors, Hidden Markov models (HMMs), Automatic speech recognition, Speech intelligibility, Fusion, Noise robustness BibRef


Zhang, S., Liu, W.[Wen], Qin, Y.,
Wake-up-word spotting using end-to-end deep neural network system,
ICPR16(2878-2883)
IEEE DOI 1705
Computational modeling, Computer architecture, Hidden Markov models, Logic gates, Neural networks, Speech recognition, Training, CTC, LSTM, RNN, Wake-up-Word system, speech, recognition BibRef

Zhang, S.[Shilei], Qin, Y.,
Rapid feature space MLLR speaker adaptation for deep neural network acoustic modeling,
ICPR16(2889-2894)
IEEE DOI 1705
Acoustics, Adaptation models, Data models, Hidden Markov models, Standards, Training, Transforms, Deep Neural Networks, FMLLR, bilinear models, rapid, speaker, adaptation BibRef

Zheng, H.[Huadi], Cai, W., Zhou, T.[Tianyan], Zhang, S.[Shilei], Li, M.,
Text-independent voice conversion using deep neural network based phonetic level features,
ICPR16(2872-2877)
IEEE DOI 1705
Covariance matrices, Data mining, Data models, Feature extraction, Speech, Training, Training data, Gaussian mixture model, deep neural network, phoneme posterior probability, voice, conversion BibRef

Zhang, B.[Bo], Gan, Y.Q.[Yu-Qin], Song, Y.[Yan], Tang, B.L.[Ben-Lai],
Application of pronunciation knowledge on phoneme recognition by LSTM neural network,
ICPR16(2906-2911)
IEEE DOI 1705
Automata, Dictionaries, Hidden Markov models, Linear programming, Neural networks, Speech, Training, connectionist temporal classification, phoneme recognition, pronunciation, knowledge BibRef

García, F.[Fernando], Sanchis, E.[Emilio], Hurtado, L.F.[Lluís F.], Segarra, E.[Encarna],
Adaptive Training for Robust Spoken Language Understanding,
CIARP15(519-526).
Springer DOI 1511
BibRef

Pastor, J.[Joan], Hurtado, L.F.[Lluís F.], Segarra, E.[Encarna], Sanchis, E.[Emilio],
Language Modelization and Categorization for Voice-Activated QA,
CIARP11(475-482).
Springer DOI 1111
BibRef

García, F.[Fernando], Hurtado, L.F.[Lluís F.], Sanchis, E.[Emilio], Segarra, E.[Encarna],
An Active Learning Approach for Statistical Spoken Language Understanding,
CIARP11(565-572).
Springer DOI 1111
BibRef

Hurtado, L.F.[Lluís F.], Griol, D.[David], Sanchis, E.[Emilio], Segarra, E.[Encarna],
A Statistical User Simulation Technique for the Improvement of a Spoken Dialog System,
CIARP07(743-752).
Springer DOI 0711
BibRef
Earlier: A2, A1, A4, A3:
A Dialog Management Methodology Based on Neural Networks and Its Application to Different Domains,
CIARP08(643-650).
Springer DOI 0809
BibRef

He, H.Y.[Hai-Yan], Wen, C.Y.[Cheng-Yi],
ART2-based multiple MLPs neural network for speaker-independent recognition of isolated words,
ICPR92(II:590-593).
IEEE DOI 9208
BibRef

Chapter on New Unsorted Entries, and Other Miscellaneous Papers continues in
Speech Analysis, other than Recognition.


Last update: Nov 12, 2018 at 11:26:54